* [Qemu-devel] [PATCH 0/4] migration: Make multifd not experimental
@ 2019-02-06 13:23 Juan Quintela
  2019-02-06 13:23 ` [Qemu-devel] [PATCH 1/4] multifd: Change page count default to 128 Juan Quintela
                   ` (3 more replies)
  0 siblings, 4 replies; 18+ messages in thread
From: Juan Quintela @ 2019-02-06 13:23 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Markus Armbruster, Thomas Huth, Paolo Bonzini,
	Laurent Vivier, Eric Blake, Dr. David Alan Gilbert

This series:

- Change the page count default to 128.

  16 was really small, and bigger packets mean much less contention
  on the mutexes.

- Drop the x-multifd-page-count parameter

  This parameter was useful for testing, but in all my testing 128 is
  good enough; there is no need for an extra knob.  Libvirt doesn't
  want to expose this parameter because it is difficult to explain.

- Drop experimental "x-" from multifd

  The code is stable; nothing big is happening here.

- Multifd test

  And a test for multifd.  This test has already been posted in other
  patch series, but now it uses the parameter names without the "x-"
  prefix.

Please review.

Juan Quintela (4):
  multifd: Change page count default to 128
  multifd: Drop x-multifd-page-count parameter
  multifd: Drop x-
  tests: Add migration multifd test

 hmp.c                  | 17 ++++---------
 migration/migration.c  | 56 ++++++++++--------------------------------
 migration/migration.h  |  1 -
 migration/ram.c        | 13 ++++++----
 qapi/migration.json    | 45 +++++++++++++--------------------
 tests/migration-test.c | 48 ++++++++++++++++++++++++++++++++++++
 6 files changed, 91 insertions(+), 89 deletions(-)

-- 
2.20.1


* [Qemu-devel] [PATCH 1/4] multifd: Change page count default to 128
  2019-02-06 13:23 [Qemu-devel] [PATCH 0/4] migration: Make multifd not experimental Juan Quintela
@ 2019-02-06 13:23 ` Juan Quintela
  2019-02-07 11:33   ` Daniel P. Berrangé
  2019-02-06 13:23 ` [Qemu-devel] [PATCH 2/4] multifd: Drop x-multifd-page-count parameter Juan Quintela
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 18+ messages in thread
From: Juan Quintela @ 2019-02-06 13:23 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Markus Armbruster, Thomas Huth, Paolo Bonzini,
	Laurent Vivier, Eric Blake, Dr. David Alan Gilbert

I haven't seen any problem with using 64 or 128.  And it makes much
less contention on the locks.  Just change it.
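
To see why, a back-of-the-envelope sketch (illustrative numbers, not
measurements from this series): each packet costs roughly one lock
round-trip per channel, so bigger packets mean fewer lock operations
for the same amount of data.

    /*
     * Rough arithmetic, assuming 4 KiB target pages and one lock
     * round-trip per packet:
     *
     *   1 GiB of guest RAM = 262144 pages
     *      16 pages/packet -> 16384 packets (lock acquisitions)
     *     128 pages/packet ->  2048 packets (lock acquisitions)
     *
     * i.e. 8x fewer lock operations per channel for the same data.
     */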

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration/migration.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/migration/migration.c b/migration/migration.c
index ef1d53cde2..f673486679 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -81,7 +81,7 @@
 /* The delay time (in ms) between two COLO checkpoints */
 #define DEFAULT_MIGRATE_X_CHECKPOINT_DELAY (200 * 100)
 #define DEFAULT_MIGRATE_MULTIFD_CHANNELS 2
-#define DEFAULT_MIGRATE_MULTIFD_PAGE_COUNT 16
+#define DEFAULT_MIGRATE_MULTIFD_PAGE_COUNT 128
 
 /* Background transfer rate for postcopy, 0 means unlimited, note
  * that page requests can still exceed this limit.
-- 
2.20.1


* [Qemu-devel] [PATCH 2/4] multifd: Drop x-multifd-page-count parameter
  2019-02-06 13:23 [Qemu-devel] [PATCH 0/4] migration: Make multifd not experimental Juan Quintela
  2019-02-06 13:23 ` [Qemu-devel] [PATCH 1/4] multifd: Change page count default to 128 Juan Quintela
@ 2019-02-06 13:23 ` Juan Quintela
  2019-02-06 14:20   ` Laurent Vivier
  2019-02-07 12:33   ` Daniel P. Berrangé
  2019-02-06 13:23 ` [Qemu-devel] [PATCH 3/4] multifd: Drop x- Juan Quintela
  2019-02-06 13:23 ` [Qemu-devel] [PATCH 4/4] tests: Add migration multifd test Juan Quintela
  3 siblings, 2 replies; 18+ messages in thread
From: Juan Quintela @ 2019-02-06 13:23 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Markus Armbruster, Thomas Huth, Paolo Bonzini,
	Laurent Vivier, Eric Blake, Dr. David Alan Gilbert

Libvirt doesn't want to expose (and explain) it.  And testing shows that
128 is good for all use cases, so just drop it.

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 hmp.c                 |  7 -------
 migration/migration.c | 30 ------------------------------
 migration/migration.h |  1 -
 migration/ram.c       | 13 ++++++++-----
 qapi/migration.json   | 13 +------------
 5 files changed, 9 insertions(+), 55 deletions(-)

diff --git a/hmp.c b/hmp.c
index 63019729ed..73b8443a8e 100644
--- a/hmp.c
+++ b/hmp.c
@@ -426,9 +426,6 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
         monitor_printf(mon, "%s: %u\n",
             MigrationParameter_str(MIGRATION_PARAMETER_X_MULTIFD_CHANNELS),
             params->x_multifd_channels);
-        monitor_printf(mon, "%s: %u\n",
-            MigrationParameter_str(MIGRATION_PARAMETER_X_MULTIFD_PAGE_COUNT),
-            params->x_multifd_page_count);
         monitor_printf(mon, "%s: %" PRIu64 "\n",
             MigrationParameter_str(MIGRATION_PARAMETER_XBZRLE_CACHE_SIZE),
             params->xbzrle_cache_size);
@@ -1776,10 +1773,6 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
         p->has_x_multifd_channels = true;
         visit_type_int(v, param, &p->x_multifd_channels, &err);
         break;
-    case MIGRATION_PARAMETER_X_MULTIFD_PAGE_COUNT:
-        p->has_x_multifd_page_count = true;
-        visit_type_int(v, param, &p->x_multifd_page_count, &err);
-        break;
     case MIGRATION_PARAMETER_XBZRLE_CACHE_SIZE:
         p->has_xbzrle_cache_size = true;
         visit_type_size(v, param, &cache_size, &err);
diff --git a/migration/migration.c b/migration/migration.c
index f673486679..65df9b566e 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -81,7 +81,6 @@
 /* The delay time (in ms) between two COLO checkpoints */
 #define DEFAULT_MIGRATE_X_CHECKPOINT_DELAY (200 * 100)
 #define DEFAULT_MIGRATE_MULTIFD_CHANNELS 2
-#define DEFAULT_MIGRATE_MULTIFD_PAGE_COUNT 128
 
 /* Background transfer rate for postcopy, 0 means unlimited, note
  * that page requests can still exceed this limit.
@@ -749,8 +748,6 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp)
     params->block_incremental = s->parameters.block_incremental;
     params->has_x_multifd_channels = true;
     params->x_multifd_channels = s->parameters.x_multifd_channels;
-    params->has_x_multifd_page_count = true;
-    params->x_multifd_page_count = s->parameters.x_multifd_page_count;
     params->has_xbzrle_cache_size = true;
     params->xbzrle_cache_size = s->parameters.xbzrle_cache_size;
     params->has_max_postcopy_bandwidth = true;
@@ -1112,14 +1109,6 @@ static bool migrate_params_check(MigrationParameters *params, Error **errp)
                    "is invalid, it should be in the range of 1 to 255");
         return false;
     }
-    if (params->has_x_multifd_page_count &&
-        (params->x_multifd_page_count < 1 ||
-         params->x_multifd_page_count > 10000)) {
-        error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
-                   "multifd_page_count",
-                   "is invalid, it should be in the range of 1 to 10000");
-        return false;
-    }
 
     if (params->has_xbzrle_cache_size &&
         (params->xbzrle_cache_size < qemu_target_page_size() ||
@@ -1202,9 +1191,6 @@ static void migrate_params_test_apply(MigrateSetParameters *params,
     if (params->has_x_multifd_channels) {
         dest->x_multifd_channels = params->x_multifd_channels;
     }
-    if (params->has_x_multifd_page_count) {
-        dest->x_multifd_page_count = params->x_multifd_page_count;
-    }
     if (params->has_xbzrle_cache_size) {
         dest->xbzrle_cache_size = params->xbzrle_cache_size;
     }
@@ -1283,9 +1269,6 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp)
     if (params->has_x_multifd_channels) {
         s->parameters.x_multifd_channels = params->x_multifd_channels;
     }
-    if (params->has_x_multifd_page_count) {
-        s->parameters.x_multifd_page_count = params->x_multifd_page_count;
-    }
     if (params->has_xbzrle_cache_size) {
         s->parameters.xbzrle_cache_size = params->xbzrle_cache_size;
         xbzrle_cache_resize(params->xbzrle_cache_size, errp);
@@ -2044,15 +2027,6 @@ int migrate_multifd_channels(void)
     return s->parameters.x_multifd_channels;
 }
 
-int migrate_multifd_page_count(void)
-{
-    MigrationState *s;
-
-    s = migrate_get_current();
-
-    return s->parameters.x_multifd_page_count;
-}
-
 int migrate_use_xbzrle(void)
 {
     MigrationState *s;
@@ -3286,9 +3260,6 @@ static Property migration_properties[] = {
     DEFINE_PROP_UINT8("x-multifd-channels", MigrationState,
                       parameters.x_multifd_channels,
                       DEFAULT_MIGRATE_MULTIFD_CHANNELS),
-    DEFINE_PROP_UINT32("x-multifd-page-count", MigrationState,
-                      parameters.x_multifd_page_count,
-                      DEFAULT_MIGRATE_MULTIFD_PAGE_COUNT),
     DEFINE_PROP_SIZE("xbzrle-cache-size", MigrationState,
                       parameters.xbzrle_cache_size,
                       DEFAULT_MIGRATE_XBZRLE_CACHE_SIZE),
@@ -3366,7 +3337,6 @@ static void migration_instance_init(Object *obj)
     params->has_x_checkpoint_delay = true;
     params->has_block_incremental = true;
     params->has_x_multifd_channels = true;
-    params->has_x_multifd_page_count = true;
     params->has_xbzrle_cache_size = true;
     params->has_max_postcopy_bandwidth = true;
     params->has_max_cpu_throttle = true;
diff --git a/migration/migration.h b/migration/migration.h
index bd41b57af9..5e2b004a6c 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -269,7 +269,6 @@ bool migrate_auto_converge(void);
 bool migrate_use_multifd(void);
 bool migrate_pause_before_switchover(void);
 int migrate_multifd_channels(void);
-int migrate_multifd_page_count(void);
 
 int migrate_use_xbzrle(void);
 int64_t migrate_xbzrle_cache_size(void);
diff --git a/migration/ram.c b/migration/ram.c
index 59191c1ed2..ebe893e356 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -520,6 +520,9 @@ exit:
 
 #define MULTIFD_FLAG_SYNC (1 << 0)
 
+#define MULTIFD_PAGE_COUNT 128
+
+
 typedef struct {
     uint32_t magic;
     uint32_t version;
@@ -718,7 +721,7 @@ static void multifd_send_fill_packet(MultiFDSendParams *p)
     packet->magic = cpu_to_be32(MULTIFD_MAGIC);
     packet->version = cpu_to_be32(MULTIFD_VERSION);
     packet->flags = cpu_to_be32(p->flags);
-    packet->size = cpu_to_be32(migrate_multifd_page_count());
+    packet->size = cpu_to_be32(MULTIFD_PAGE_COUNT);
     packet->used = cpu_to_be32(p->pages->used);
     packet->packet_num = cpu_to_be64(p->packet_num);
 
@@ -756,10 +759,10 @@ static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp)
     p->flags = be32_to_cpu(packet->flags);
 
     packet->size = be32_to_cpu(packet->size);
-    if (packet->size > migrate_multifd_page_count()) {
+    if (packet->size > MULTIFD_PAGE_COUNT) {
         error_setg(errp, "multifd: received packet "
                    "with size %d and expected maximum size %d",
-                   packet->size, migrate_multifd_page_count()) ;
+                   packet->size, MULTIFD_PAGE_COUNT) ;
         return -1;
     }
 
@@ -1085,7 +1088,7 @@ static void multifd_new_send_channel_async(QIOTask *task, gpointer opaque)
 int multifd_save_setup(void)
 {
     int thread_count;
-    uint32_t page_count = migrate_multifd_page_count();
+    uint32_t page_count = MULTIFD_PAGE_COUNT;
     uint8_t i;
 
     if (!migrate_use_multifd()) {
@@ -1281,7 +1284,7 @@ static void *multifd_recv_thread(void *opaque)
 int multifd_load_setup(void)
 {
     int thread_count;
-    uint32_t page_count = migrate_multifd_page_count();
+    uint32_t page_count = MULTIFD_PAGE_COUNT;
     uint8_t i;
 
     if (!migrate_use_multifd()) {
diff --git a/qapi/migration.json b/qapi/migration.json
index b62947791f..8c5db60406 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -547,9 +547,6 @@
 #                     number of sockets used for migration.  The
 #                     default value is 2 (since 2.11)
 #
-# @x-multifd-page-count: Number of pages sent together to a thread.
-#                        The default value is 16 (since 2.11)
-#
 # @xbzrle-cache-size: cache size to be used by XBZRLE migration.  It
 #                     needs to be a multiple of the target page size
 #                     and a power of 2
@@ -569,7 +566,7 @@
            'cpu-throttle-initial', 'cpu-throttle-increment',
            'tls-creds', 'tls-hostname', 'max-bandwidth',
            'downtime-limit', 'x-checkpoint-delay', 'block-incremental',
-           'x-multifd-channels', 'x-multifd-page-count',
+           'x-multifd-channels',
            'xbzrle-cache-size', 'max-postcopy-bandwidth',
            'max-cpu-throttle' ] }
 
@@ -637,9 +634,6 @@
 #                     number of sockets used for migration.  The
 #                     default value is 2 (since 2.11)
 #
-# @x-multifd-page-count: Number of pages sent together to a thread.
-#                        The default value is 16 (since 2.11)
-#
 # @xbzrle-cache-size: cache size to be used by XBZRLE migration.  It
 #                     needs to be a multiple of the target page size
 #                     and a power of 2
@@ -670,7 +664,6 @@
             '*x-checkpoint-delay': 'int',
             '*block-incremental': 'bool',
             '*x-multifd-channels': 'int',
-            '*x-multifd-page-count': 'int',
             '*xbzrle-cache-size': 'size',
             '*max-postcopy-bandwidth': 'size',
 	    '*max-cpu-throttle': 'int' } }
@@ -754,9 +747,6 @@
 #                     number of sockets used for migration.
 #                     The default value is 2 (since 2.11)
 #
-# @x-multifd-page-count: Number of pages sent together to a thread.
-#                        The default value is 16 (since 2.11)
-#
 # @xbzrle-cache-size: cache size to be used by XBZRLE migration.  It
 #                     needs to be a multiple of the target page size
 #                     and a power of 2
@@ -786,7 +776,6 @@
             '*x-checkpoint-delay': 'uint32',
             '*block-incremental': 'bool' ,
             '*x-multifd-channels': 'uint8',
-            '*x-multifd-page-count': 'uint32',
             '*xbzrle-cache-size': 'size',
 	    '*max-postcopy-bandwidth': 'size',
             '*max-cpu-throttle':'uint8'} }
-- 
2.20.1


* [Qemu-devel] [PATCH 3/4] multifd: Drop x-
  2019-02-06 13:23 [Qemu-devel] [PATCH 0/4] migration: Make multifd not experimental Juan Quintela
  2019-02-06 13:23 ` [Qemu-devel] [PATCH 1/4] multifd: Change page count default to 128 Juan Quintela
  2019-02-06 13:23 ` [Qemu-devel] [PATCH 2/4] multifd: Drop x-multifd-page-count parameter Juan Quintela
@ 2019-02-06 13:23 ` Juan Quintela
  2019-02-07 11:23   ` Dr. David Alan Gilbert
  2019-02-06 13:23 ` [Qemu-devel] [PATCH 4/4] tests: Add migration multifd test Juan Quintela
  3 siblings, 1 reply; 18+ messages in thread
From: Juan Quintela @ 2019-02-06 13:23 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Markus Armbruster, Thomas Huth, Paolo Bonzini,
	Laurent Vivier, Eric Blake, Dr. David Alan Gilbert

Multifd is a supported feature from now on.

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 hmp.c                 | 10 +++++-----
 migration/migration.c | 26 +++++++++++++-------------
 qapi/migration.json   | 34 +++++++++++++++++-----------------
 3 files changed, 35 insertions(+), 35 deletions(-)

diff --git a/hmp.c b/hmp.c
index 73b8443a8e..0cafce9713 100644
--- a/hmp.c
+++ b/hmp.c
@@ -424,8 +424,8 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
             MigrationParameter_str(MIGRATION_PARAMETER_BLOCK_INCREMENTAL),
             params->block_incremental ? "on" : "off");
         monitor_printf(mon, "%s: %u\n",
-            MigrationParameter_str(MIGRATION_PARAMETER_X_MULTIFD_CHANNELS),
-            params->x_multifd_channels);
+            MigrationParameter_str(MIGRATION_PARAMETER_MULTIFD_CHANNELS),
+            params->multifd_channels);
         monitor_printf(mon, "%s: %" PRIu64 "\n",
             MigrationParameter_str(MIGRATION_PARAMETER_XBZRLE_CACHE_SIZE),
             params->xbzrle_cache_size);
@@ -1769,9 +1769,9 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
         p->has_block_incremental = true;
         visit_type_bool(v, param, &p->block_incremental, &err);
         break;
-    case MIGRATION_PARAMETER_X_MULTIFD_CHANNELS:
-        p->has_x_multifd_channels = true;
-        visit_type_int(v, param, &p->x_multifd_channels, &err);
+    case MIGRATION_PARAMETER_MULTIFD_CHANNELS:
+        p->has_multifd_channels = true;
+        visit_type_int(v, param, &p->multifd_channels, &err);
         break;
     case MIGRATION_PARAMETER_XBZRLE_CACHE_SIZE:
         p->has_xbzrle_cache_size = true;
diff --git a/migration/migration.c b/migration/migration.c
index 65df9b566e..c7dbc44562 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -746,8 +746,8 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp)
     params->x_checkpoint_delay = s->parameters.x_checkpoint_delay;
     params->has_block_incremental = true;
     params->block_incremental = s->parameters.block_incremental;
-    params->has_x_multifd_channels = true;
-    params->x_multifd_channels = s->parameters.x_multifd_channels;
+    params->has_multifd_channels = true;
+    params->multifd_channels = s->parameters.multifd_channels;
     params->has_xbzrle_cache_size = true;
     params->xbzrle_cache_size = s->parameters.xbzrle_cache_size;
     params->has_max_postcopy_bandwidth = true;
@@ -1103,7 +1103,7 @@ static bool migrate_params_check(MigrationParameters *params, Error **errp)
 
     /* x_checkpoint_delay is now always positive */
 
-    if (params->has_x_multifd_channels && (params->x_multifd_channels < 1)) {
+    if (params->has_multifd_channels && (params->multifd_channels < 1)) {
         error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
                    "multifd_channels",
                    "is invalid, it should be in the range of 1 to 255");
@@ -1188,8 +1188,8 @@ static void migrate_params_test_apply(MigrateSetParameters *params,
     if (params->has_block_incremental) {
         dest->block_incremental = params->block_incremental;
     }
-    if (params->has_x_multifd_channels) {
-        dest->x_multifd_channels = params->x_multifd_channels;
+    if (params->has_multifd_channels) {
+        dest->multifd_channels = params->multifd_channels;
     }
     if (params->has_xbzrle_cache_size) {
         dest->xbzrle_cache_size = params->xbzrle_cache_size;
@@ -1266,8 +1266,8 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp)
     if (params->has_block_incremental) {
         s->parameters.block_incremental = params->block_incremental;
     }
-    if (params->has_x_multifd_channels) {
-        s->parameters.x_multifd_channels = params->x_multifd_channels;
+    if (params->has_multifd_channels) {
+        s->parameters.multifd_channels = params->multifd_channels;
     }
     if (params->has_xbzrle_cache_size) {
         s->parameters.xbzrle_cache_size = params->xbzrle_cache_size;
@@ -2005,7 +2005,7 @@ bool migrate_use_multifd(void)
 
     s = migrate_get_current();
 
-    return s->enabled_capabilities[MIGRATION_CAPABILITY_X_MULTIFD];
+    return s->enabled_capabilities[MIGRATION_CAPABILITY_MULTIFD];
 }
 
 bool migrate_pause_before_switchover(void)
@@ -2024,7 +2024,7 @@ int migrate_multifd_channels(void)
 
     s = migrate_get_current();
 
-    return s->parameters.x_multifd_channels;
+    return s->parameters.multifd_channels;
 }
 
 int migrate_use_xbzrle(void)
@@ -3257,8 +3257,8 @@ static Property migration_properties[] = {
     DEFINE_PROP_UINT32("x-checkpoint-delay", MigrationState,
                       parameters.x_checkpoint_delay,
                       DEFAULT_MIGRATE_X_CHECKPOINT_DELAY),
-    DEFINE_PROP_UINT8("x-multifd-channels", MigrationState,
-                      parameters.x_multifd_channels,
+    DEFINE_PROP_UINT8("multifd-channels", MigrationState,
+                      parameters.multifd_channels,
                       DEFAULT_MIGRATE_MULTIFD_CHANNELS),
     DEFINE_PROP_SIZE("xbzrle-cache-size", MigrationState,
                       parameters.xbzrle_cache_size,
@@ -3282,7 +3282,7 @@ static Property migration_properties[] = {
     DEFINE_PROP_MIG_CAP("x-release-ram", MIGRATION_CAPABILITY_RELEASE_RAM),
     DEFINE_PROP_MIG_CAP("x-block", MIGRATION_CAPABILITY_BLOCK),
     DEFINE_PROP_MIG_CAP("x-return-path", MIGRATION_CAPABILITY_RETURN_PATH),
-    DEFINE_PROP_MIG_CAP("x-multifd", MIGRATION_CAPABILITY_X_MULTIFD),
+    DEFINE_PROP_MIG_CAP("x-multifd", MIGRATION_CAPABILITY_MULTIFD),
 
     DEFINE_PROP_END_OF_LIST(),
 };
@@ -3336,7 +3336,7 @@ static void migration_instance_init(Object *obj)
     params->has_downtime_limit = true;
     params->has_x_checkpoint_delay = true;
     params->has_block_incremental = true;
-    params->has_x_multifd_channels = true;
+    params->has_multifd_channels = true;
     params->has_xbzrle_cache_size = true;
     params->has_max_postcopy_bandwidth = true;
     params->has_max_cpu_throttle = true;
diff --git a/qapi/migration.json b/qapi/migration.json
index 8c5db60406..c202703889 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -401,7 +401,7 @@
 # @pause-before-switchover: Pause outgoing migration before serialising device
 #          state and before disabling block IO (since 2.11)
 #
-# @x-multifd: Use more than one fd for migration (since 2.11)
+# @multifd: Use more than one fd for migration (since 4.0)
 #
 # @dirty-bitmaps: If enabled, QEMU will migrate named dirty bitmaps.
 #                 (since 2.12)
@@ -418,7 +418,7 @@
 { 'enum': 'MigrationCapability',
   'data': ['xbzrle', 'rdma-pin-all', 'auto-converge', 'zero-blocks',
            'compress', 'events', 'postcopy-ram', 'x-colo', 'release-ram',
-           'block', 'return-path', 'pause-before-switchover', 'x-multifd',
+           'block', 'return-path', 'pause-before-switchover', 'multifd',
            'dirty-bitmaps', 'postcopy-blocktime', 'late-block-activate' ] }
 
 ##
@@ -542,10 +542,10 @@
 # 	migrated and the destination must already have access to the
 # 	same backing chain as was used on the source.  (since 2.10)
 #
-# @x-multifd-channels: Number of channels used to migrate data in
-#                     parallel. This is the same number that the
-#                     number of sockets used for migration.  The
-#                     default value is 2 (since 2.11)
+# @multifd-channels: Number of channels used to migrate data in
+#                    parallel. This is the same number that the
+#                    number of sockets used for migration.  The
+#                    default value is 2 (since 4.0)
 #
 # @xbzrle-cache-size: cache size to be used by XBZRLE migration.  It
 #                     needs to be a multiple of the target page size
@@ -566,7 +566,7 @@
            'cpu-throttle-initial', 'cpu-throttle-increment',
            'tls-creds', 'tls-hostname', 'max-bandwidth',
            'downtime-limit', 'x-checkpoint-delay', 'block-incremental',
-           'x-multifd-channels',
+           'multifd-channels',
            'xbzrle-cache-size', 'max-postcopy-bandwidth',
            'max-cpu-throttle' ] }
 
@@ -629,10 +629,10 @@
 # 	migrated and the destination must already have access to the
 # 	same backing chain as was used on the source.  (since 2.10)
 #
-# @x-multifd-channels: Number of channels used to migrate data in
-#                     parallel. This is the same number that the
-#                     number of sockets used for migration.  The
-#                     default value is 2 (since 2.11)
+# @multifd-channels: Number of channels used to migrate data in
+#                    parallel. This is the same number that the
+#                    number of sockets used for migration.  The
+#                    default value is 2 (since 4.0)
 #
 # @xbzrle-cache-size: cache size to be used by XBZRLE migration.  It
 #                     needs to be a multiple of the target page size
@@ -663,7 +663,7 @@
             '*downtime-limit': 'int',
             '*x-checkpoint-delay': 'int',
             '*block-incremental': 'bool',
-            '*x-multifd-channels': 'int',
+            '*multifd-channels': 'int',
             '*xbzrle-cache-size': 'size',
             '*max-postcopy-bandwidth': 'size',
 	    '*max-cpu-throttle': 'int' } }
@@ -742,10 +742,10 @@
 # 	migrated and the destination must already have access to the
 # 	same backing chain as was used on the source.  (since 2.10)
 #
-# @x-multifd-channels: Number of channels used to migrate data in
-#                     parallel. This is the same number that the
-#                     number of sockets used for migration.
-#                     The default value is 2 (since 2.11)
+# @multifd-channels: Number of channels used to migrate data in
+#                    parallel. This is the same number that the
+#                    number of sockets used for migration.
+#                    The default value is 2 (since 4.0)
 #
 # @xbzrle-cache-size: cache size to be used by XBZRLE migration.  It
 #                     needs to be a multiple of the target page size
@@ -775,7 +775,7 @@
             '*downtime-limit': 'uint64',
             '*x-checkpoint-delay': 'uint32',
             '*block-incremental': 'bool' ,
-            '*x-multifd-channels': 'uint8',
+            '*multifd-channels': 'uint8',
             '*xbzrle-cache-size': 'size',
 	    '*max-postcopy-bandwidth': 'size',
             '*max-cpu-throttle':'uint8'} }
-- 
2.20.1


* [Qemu-devel] [PATCH 4/4] tests: Add migration multifd test
  2019-02-06 13:23 [Qemu-devel] [PATCH 0/4] migration: Make multifd not experimental Juan Quintela
                   ` (2 preceding siblings ...)
  2019-02-06 13:23 ` [Qemu-devel] [PATCH 3/4] multifd: Drop x- Juan Quintela
@ 2019-02-06 13:23 ` Juan Quintela
  2019-02-06 15:49   ` Thomas Huth
  3 siblings, 1 reply; 18+ messages in thread
From: Juan Quintela @ 2019-02-06 13:23 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, Markus Armbruster, Thomas Huth, Paolo Bonzini,
	Laurent Vivier, Eric Blake, Dr. David Alan Gilbert

We set multifd-channels to 2 and enable the multifd capability.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 tests/migration-test.c | 48 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 48 insertions(+)

diff --git a/tests/migration-test.c b/tests/migration-test.c
index a27f095b28..3e0e447f95 100644
--- a/tests/migration-test.c
+++ b/tests/migration-test.c
@@ -942,6 +942,53 @@ static void test_precopy_tcp(void)
     g_free(uri);
 }
 
+static void test_multifd_tcp(void)
+{
+    char *uri;
+    QTestState *from, *to;
+
+    if (test_migrate_start(&from, &to, "tcp:127.0.0.1:0", false)) {
+        return;
+    }
+
+    /*
+     * We want to pick a speed slow enough that the test completes
+     * quickly, but that it doesn't complete precopy even on a slow
+     * machine, so also set the downtime.
+     */
+    /* 1 ms should make it not converge*/
+    migrate_set_parameter(from, "downtime-limit", 1);
+    /* 1GB/s */
+    migrate_set_parameter(from, "max-bandwidth", 1000000000);
+
+    migrate_set_parameter(from, "multifd-channels", 2);
+    migrate_set_parameter(to, "multifd-channels", 2);
+
+    migrate_set_capability(from, "multifd", "true");
+    migrate_set_capability(to, "multifd", "true");
+    /* Wait for the first serial output from the source */
+    wait_for_serial("src_serial");
+
+    uri = migrate_get_socket_address(to, "socket-address");
+
+    migrate(from, uri, "{}");
+
+    wait_for_migration_pass(from);
+
+    /* 300ms it should converge */
+    migrate_set_parameter(from, "downtime-limit", 300);
+
+    if (!got_stop) {
+        qtest_qmp_eventwait(from, "STOP");
+    }
+    qtest_qmp_eventwait(to, "RESUME");
+
+    wait_for_serial("dest_serial");
+    wait_for_migration_complete(from);
+
+    test_migrate_end(from, to, true);
+}
+
 int main(int argc, char **argv)
 {
     char template[] = "/tmp/migration-test-XXXXXX";
@@ -995,6 +1042,7 @@ int main(int argc, char **argv)
     qtest_add_func("/migration/precopy/unix", test_precopy_unix);
     qtest_add_func("/migration/precopy/tcp", test_precopy_tcp);
     qtest_add_func("/migration/xbzrle/unix", test_xbzrle_unix);
+    qtest_add_func("/migration/multifd/tcp", test_multifd_tcp);
 
     ret = g_test_run();
 
-- 
2.20.1


* Re: [Qemu-devel] [PATCH 2/4] multifd: Drop x-multifd-page-count parameter
  2019-02-06 13:23 ` [Qemu-devel] [PATCH 2/4] multifd: Drop x-multifd-page-count parameter Juan Quintela
@ 2019-02-06 14:20   ` Laurent Vivier
  2019-02-06 17:58     ` Juan Quintela
  2019-02-07 12:33   ` Daniel P. Berrangé
  1 sibling, 1 reply; 18+ messages in thread
From: Laurent Vivier @ 2019-02-06 14:20 UTC (permalink / raw)
  To: Juan Quintela, qemu-devel
  Cc: Markus Armbruster, Thomas Huth, Paolo Bonzini, Eric Blake,
	Dr. David Alan Gilbert

On 06/02/2019 14:23, Juan Quintela wrote:
> Libvirt doesn't want to expose (and explain) it.  And testing shows that
> 128 is good for all use cases, so just drop it.
> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  hmp.c                 |  7 -------
>  migration/migration.c | 30 ------------------------------
>  migration/migration.h |  1 -
>  migration/ram.c       | 13 ++++++++-----
>  qapi/migration.json   | 13 +------------
>  5 files changed, 9 insertions(+), 55 deletions(-)
> 
...
> diff --git a/migration/migration.c b/migration/migration.c
> index f673486679..65df9b566e 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -81,7 +81,6 @@
>  /* The delay time (in ms) between two COLO checkpoints */
>  #define DEFAULT_MIGRATE_X_CHECKPOINT_DELAY (200 * 100)
>  #define DEFAULT_MIGRATE_MULTIFD_CHANNELS 2
> -#define DEFAULT_MIGRATE_MULTIFD_PAGE_COUNT 128

Why do you update it in the previous patch to remove it in this one?

Thanks,
Laurent


* Re: [Qemu-devel] [PATCH 4/4] tests: Add migration multifd test
  2019-02-06 13:23 ` [Qemu-devel] [PATCH 4/4] tests: Add migration multifd test Juan Quintela
@ 2019-02-06 15:49   ` Thomas Huth
  0 siblings, 0 replies; 18+ messages in thread
From: Thomas Huth @ 2019-02-06 15:49 UTC (permalink / raw)
  To: Juan Quintela, qemu-devel
  Cc: Markus Armbruster, Paolo Bonzini, Laurent Vivier, Eric Blake,
	Dr. David Alan Gilbert

On 2019-02-06 14:23, Juan Quintela wrote:
> We set multifd-channels to 2 and enable the multifd capability.
> 
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  tests/migration-test.c | 48 ++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 48 insertions(+)
> 
> diff --git a/tests/migration-test.c b/tests/migration-test.c
> index a27f095b28..3e0e447f95 100644
> --- a/tests/migration-test.c
> +++ b/tests/migration-test.c
> @@ -942,6 +942,53 @@ static void test_precopy_tcp(void)
>      g_free(uri);
>  }
>  
> +static void test_multifd_tcp(void)
> +{
> +    char *uri;
> +    QTestState *from, *to;
> +
> +    if (test_migrate_start(&from, &to, "tcp:127.0.0.1:0", false)) {
> +        return;
> +    }
> +
> +    /*
> +     * We want to pick a speed slow enough that the test completes
> +     * quickly, but that it doesn't complete precopy even on a slow
> +     * machine, so also set the downtime.
> +     */
> +    /* 1 ms should make it not converge*/
> +    migrate_set_parameter(from, "downtime-limit", 1);
> +    /* 1GB/s */
> +    migrate_set_parameter(from, "max-bandwidth", 1000000000);
> +
> +    migrate_set_parameter(from, "multifd-channels", 2);
> +    migrate_set_parameter(to, "multifd-channels", 2);
> +
> +    migrate_set_capability(from, "multifd", "true");
> +    migrate_set_capability(to, "multifd", "true");
> +    /* Wait for the first serial output from the source */
> +    wait_for_serial("src_serial");
> +
> +    uri = migrate_get_socket_address(to, "socket-address");
> +
> +    migrate(from, uri, "{}");
> +
> +    wait_for_migration_pass(from);
> +
> +    /* 300ms it should converge */
> +    migrate_set_parameter(from, "downtime-limit", 300);
> +
> +    if (!got_stop) {
> +        qtest_qmp_eventwait(from, "STOP");
> +    }
> +    qtest_qmp_eventwait(to, "RESUME");
> +
> +    wait_for_serial("dest_serial");
> +    wait_for_migration_complete(from);
> +
> +    test_migrate_end(from, to, true);
> +}
> +
>  int main(int argc, char **argv)
>  {
>      char template[] = "/tmp/migration-test-XXXXXX";
> @@ -995,6 +1042,7 @@ int main(int argc, char **argv)
>      qtest_add_func("/migration/precopy/unix", test_precopy_unix);
>      qtest_add_func("/migration/precopy/tcp", test_precopy_tcp);
>      qtest_add_func("/migration/xbzrle/unix", test_xbzrle_unix);
> +    qtest_add_func("/migration/multifd/tcp", test_multifd_tcp);
>  
>      ret = g_test_run();

Acked-by: Thomas Huth <thuth@redhat.com>


* Re: [Qemu-devel] [PATCH 2/4] multifd: Drop x-multifd-page-count parameter
  2019-02-06 14:20   ` Laurent Vivier
@ 2019-02-06 17:58     ` Juan Quintela
  2019-02-06 19:00       ` Laurent Vivier
  0 siblings, 1 reply; 18+ messages in thread
From: Juan Quintela @ 2019-02-06 17:58 UTC (permalink / raw)
  To: Laurent Vivier
  Cc: qemu-devel, Markus Armbruster, Thomas Huth, Paolo Bonzini,
	Eric Blake, Dr. David Alan Gilbert

Laurent Vivier <lvivier@redhat.com> wrote:
> On 06/02/2019 14:23, Juan Quintela wrote:
>> Libvirt doesn't want to expose (and explain) it.  And testing shows that
>> 128 is good for all use cases, so just drop it.
>> 
>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>> ---
>>  hmp.c                 |  7 -------
>>  migration/migration.c | 30 ------------------------------
>>  migration/migration.h |  1 -
>>  migration/ram.c       | 13 ++++++++-----
>>  qapi/migration.json   | 13 +------------
>>  5 files changed, 9 insertions(+), 55 deletions(-)
>> 
> ...
>> diff --git a/migration/migration.c b/migration/migration.c
>> index f673486679..65df9b566e 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -81,7 +81,6 @@
>>  /* The delay time (in ms) between two COLO checkpoints */
>>  #define DEFAULT_MIGRATE_X_CHECKPOINT_DELAY (200 * 100)
>>  #define DEFAULT_MIGRATE_MULTIFD_CHANNELS 2
>> -#define DEFAULT_MIGRATE_MULTIFD_PAGE_COUNT 128
>
> Why do you update it in the previous patch to remove it in this one?

To make it clear that I changed the default.  Otherwise it gets hidden in
the whole patch.  If you prefer, I could have done it the other way around.

Later, Juan.


* Re: [Qemu-devel] [PATCH 2/4] multifd: Drop x-multifd-page-count parameter
  2019-02-06 17:58     ` Juan Quintela
@ 2019-02-06 19:00       ` Laurent Vivier
  2019-02-07 12:15         ` Juan Quintela
  0 siblings, 1 reply; 18+ messages in thread
From: Laurent Vivier @ 2019-02-06 19:00 UTC (permalink / raw)
  To: quintela
  Cc: qemu-devel, Markus Armbruster, Thomas Huth, Paolo Bonzini,
	Eric Blake, Dr. David Alan Gilbert

On 06/02/2019 18:58, Juan Quintela wrote:
> Laurent Vivier <lvivier@redhat.com> wrote:
>> On 06/02/2019 14:23, Juan Quintela wrote:
>>> Libvirt doesn't want to expose (and explain) it.  And testing shows that
>>> 128 is good for all use cases, so just drop it.
>>>
>>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>>> ---
>>>  hmp.c                 |  7 -------
>>>  migration/migration.c | 30 ------------------------------
>>>  migration/migration.h |  1 -
>>>  migration/ram.c       | 13 ++++++++-----
>>>  qapi/migration.json   | 13 +------------
>>>  5 files changed, 9 insertions(+), 55 deletions(-)
>>>
>> ...
>>> diff --git a/migration/migration.c b/migration/migration.c
>>> index f673486679..65df9b566e 100644
>>> --- a/migration/migration.c
>>> +++ b/migration/migration.c
>>> @@ -81,7 +81,6 @@
>>>  /* The delay time (in ms) between two COLO checkpoints */
>>>  #define DEFAULT_MIGRATE_X_CHECKPOINT_DELAY (200 * 100)
>>>  #define DEFAULT_MIGRATE_MULTIFD_CHANNELS 2
>>> -#define DEFAULT_MIGRATE_MULTIFD_PAGE_COUNT 128
>>
>> Why do you update it in the previous patch to remove it in this one?
> 
>> To make it clear that I changed the default.  Otherwise it gets hidden in
>> the whole patch.  If you prefer, I could have done it the other way around.

OK, I understand. It's not really clear because the new default
(MULTIFD_PAGE_COUNT) is hidden in the patch.

Moreover, in the first patch you update the value, but you don't update
the comments in qapi/migration.json (I noticed that because you remove
them in this patch).

Perhaps you can proceed in the reverse order: remove the parameter and
then set the new default... or merge the two patches and say in the
commit message that you change the default value.

Thanks,
Laurent


* Re: [Qemu-devel] [PATCH 3/4] multifd: Drop x-
  2019-02-06 13:23 ` [Qemu-devel] [PATCH 3/4] multifd: Drop x- Juan Quintela
@ 2019-02-07 11:23   ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 18+ messages in thread
From: Dr. David Alan Gilbert @ 2019-02-07 11:23 UTC (permalink / raw)
  To: Juan Quintela
  Cc: qemu-devel, Markus Armbruster, Thomas Huth, Paolo Bonzini,
	Laurent Vivier, Eric Blake

* Juan Quintela (quintela@redhat.com) wrote:
> Multifd is a supported feature from now on.
> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

> ---
>  hmp.c                 | 10 +++++-----
>  migration/migration.c | 26 +++++++++++++-------------
>  qapi/migration.json   | 34 +++++++++++++++++-----------------
>  3 files changed, 35 insertions(+), 35 deletions(-)
> 
> diff --git a/hmp.c b/hmp.c
> index 73b8443a8e..0cafce9713 100644
> --- a/hmp.c
> +++ b/hmp.c
> @@ -424,8 +424,8 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
>              MigrationParameter_str(MIGRATION_PARAMETER_BLOCK_INCREMENTAL),
>              params->block_incremental ? "on" : "off");
>          monitor_printf(mon, "%s: %u\n",
> -            MigrationParameter_str(MIGRATION_PARAMETER_X_MULTIFD_CHANNELS),
> -            params->x_multifd_channels);
> +            MigrationParameter_str(MIGRATION_PARAMETER_MULTIFD_CHANNELS),
> +            params->multifd_channels);
>          monitor_printf(mon, "%s: %" PRIu64 "\n",
>              MigrationParameter_str(MIGRATION_PARAMETER_XBZRLE_CACHE_SIZE),
>              params->xbzrle_cache_size);
> @@ -1769,9 +1769,9 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
>          p->has_block_incremental = true;
>          visit_type_bool(v, param, &p->block_incremental, &err);
>          break;
> -    case MIGRATION_PARAMETER_X_MULTIFD_CHANNELS:
> -        p->has_x_multifd_channels = true;
> -        visit_type_int(v, param, &p->x_multifd_channels, &err);
> +    case MIGRATION_PARAMETER_MULTIFD_CHANNELS:
> +        p->has_multifd_channels = true;
> +        visit_type_int(v, param, &p->multifd_channels, &err);
>          break;
>      case MIGRATION_PARAMETER_XBZRLE_CACHE_SIZE:
>          p->has_xbzrle_cache_size = true;
> diff --git a/migration/migration.c b/migration/migration.c
> index 65df9b566e..c7dbc44562 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -746,8 +746,8 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp)
>      params->x_checkpoint_delay = s->parameters.x_checkpoint_delay;
>      params->has_block_incremental = true;
>      params->block_incremental = s->parameters.block_incremental;
> -    params->has_x_multifd_channels = true;
> -    params->x_multifd_channels = s->parameters.x_multifd_channels;
> +    params->has_multifd_channels = true;
> +    params->multifd_channels = s->parameters.multifd_channels;
>      params->has_xbzrle_cache_size = true;
>      params->xbzrle_cache_size = s->parameters.xbzrle_cache_size;
>      params->has_max_postcopy_bandwidth = true;
> @@ -1103,7 +1103,7 @@ static bool migrate_params_check(MigrationParameters *params, Error **errp)
>  
>      /* x_checkpoint_delay is now always positive */
>  
> -    if (params->has_x_multifd_channels && (params->x_multifd_channels < 1)) {
> +    if (params->has_multifd_channels && (params->multifd_channels < 1)) {
>          error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
>                     "multifd_channels",
>                     "is invalid, it should be in the range of 1 to 255");
> @@ -1188,8 +1188,8 @@ static void migrate_params_test_apply(MigrateSetParameters *params,
>      if (params->has_block_incremental) {
>          dest->block_incremental = params->block_incremental;
>      }
> -    if (params->has_x_multifd_channels) {
> -        dest->x_multifd_channels = params->x_multifd_channels;
> +    if (params->has_multifd_channels) {
> +        dest->multifd_channels = params->multifd_channels;
>      }
>      if (params->has_xbzrle_cache_size) {
>          dest->xbzrle_cache_size = params->xbzrle_cache_size;
> @@ -1266,8 +1266,8 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp)
>      if (params->has_block_incremental) {
>          s->parameters.block_incremental = params->block_incremental;
>      }
> -    if (params->has_x_multifd_channels) {
> -        s->parameters.x_multifd_channels = params->x_multifd_channels;
> +    if (params->has_multifd_channels) {
> +        s->parameters.multifd_channels = params->multifd_channels;
>      }
>      if (params->has_xbzrle_cache_size) {
>          s->parameters.xbzrle_cache_size = params->xbzrle_cache_size;
> @@ -2005,7 +2005,7 @@ bool migrate_use_multifd(void)
>  
>      s = migrate_get_current();
>  
> -    return s->enabled_capabilities[MIGRATION_CAPABILITY_X_MULTIFD];
> +    return s->enabled_capabilities[MIGRATION_CAPABILITY_MULTIFD];
>  }
>  
>  bool migrate_pause_before_switchover(void)
> @@ -2024,7 +2024,7 @@ int migrate_multifd_channels(void)
>  
>      s = migrate_get_current();
>  
> -    return s->parameters.x_multifd_channels;
> +    return s->parameters.multifd_channels;
>  }
>  
>  int migrate_use_xbzrle(void)
> @@ -3257,8 +3257,8 @@ static Property migration_properties[] = {
>      DEFINE_PROP_UINT32("x-checkpoint-delay", MigrationState,
>                        parameters.x_checkpoint_delay,
>                        DEFAULT_MIGRATE_X_CHECKPOINT_DELAY),
> -    DEFINE_PROP_UINT8("x-multifd-channels", MigrationState,
> -                      parameters.x_multifd_channels,
> +    DEFINE_PROP_UINT8("multifd-channels", MigrationState,
> +                      parameters.multifd_channels,
>                        DEFAULT_MIGRATE_MULTIFD_CHANNELS),
>      DEFINE_PROP_SIZE("xbzrle-cache-size", MigrationState,
>                        parameters.xbzrle_cache_size,
> @@ -3282,7 +3282,7 @@ static Property migration_properties[] = {
>      DEFINE_PROP_MIG_CAP("x-release-ram", MIGRATION_CAPABILITY_RELEASE_RAM),
>      DEFINE_PROP_MIG_CAP("x-block", MIGRATION_CAPABILITY_BLOCK),
>      DEFINE_PROP_MIG_CAP("x-return-path", MIGRATION_CAPABILITY_RETURN_PATH),
> -    DEFINE_PROP_MIG_CAP("x-multifd", MIGRATION_CAPABILITY_X_MULTIFD),
> +    DEFINE_PROP_MIG_CAP("x-multifd", MIGRATION_CAPABILITY_MULTIFD),
>  
>      DEFINE_PROP_END_OF_LIST(),
>  };
> @@ -3336,7 +3336,7 @@ static void migration_instance_init(Object *obj)
>      params->has_downtime_limit = true;
>      params->has_x_checkpoint_delay = true;
>      params->has_block_incremental = true;
> -    params->has_x_multifd_channels = true;
> +    params->has_multifd_channels = true;
>      params->has_xbzrle_cache_size = true;
>      params->has_max_postcopy_bandwidth = true;
>      params->has_max_cpu_throttle = true;
> diff --git a/qapi/migration.json b/qapi/migration.json
> index 8c5db60406..c202703889 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -401,7 +401,7 @@
>  # @pause-before-switchover: Pause outgoing migration before serialising device
>  #          state and before disabling block IO (since 2.11)
>  #
> -# @x-multifd: Use more than one fd for migration (since 2.11)
> +# @multifd: Use more than one fd for migration (since 4.0)
>  #
>  # @dirty-bitmaps: If enabled, QEMU will migrate named dirty bitmaps.
>  #                 (since 2.12)
> @@ -418,7 +418,7 @@
>  { 'enum': 'MigrationCapability',
>    'data': ['xbzrle', 'rdma-pin-all', 'auto-converge', 'zero-blocks',
>             'compress', 'events', 'postcopy-ram', 'x-colo', 'release-ram',
> -           'block', 'return-path', 'pause-before-switchover', 'x-multifd',
> +           'block', 'return-path', 'pause-before-switchover', 'multifd',
>             'dirty-bitmaps', 'postcopy-blocktime', 'late-block-activate' ] }
>  
>  ##
> @@ -542,10 +542,10 @@
>  # 	migrated and the destination must already have access to the
>  # 	same backing chain as was used on the source.  (since 2.10)
>  #
> -# @x-multifd-channels: Number of channels used to migrate data in
> -#                     parallel. This is the same number that the
> -#                     number of sockets used for migration.  The
> -#                     default value is 2 (since 2.11)
> +# @multifd-channels: Number of channels used to migrate data in
> +#                    parallel. This is the same number that the
> +#                    number of sockets used for migration.  The
> +#                    default value is 2 (since 4.0)
>  #
>  # @xbzrle-cache-size: cache size to be used by XBZRLE migration.  It
>  #                     needs to be a multiple of the target page size
> @@ -566,7 +566,7 @@
>             'cpu-throttle-initial', 'cpu-throttle-increment',
>             'tls-creds', 'tls-hostname', 'max-bandwidth',
>             'downtime-limit', 'x-checkpoint-delay', 'block-incremental',
> -           'x-multifd-channels',
> +           'multifd-channels',
>             'xbzrle-cache-size', 'max-postcopy-bandwidth',
>             'max-cpu-throttle' ] }
>  
> @@ -629,10 +629,10 @@
>  # 	migrated and the destination must already have access to the
>  # 	same backing chain as was used on the source.  (since 2.10)
>  #
> -# @x-multifd-channels: Number of channels used to migrate data in
> -#                     parallel. This is the same number that the
> -#                     number of sockets used for migration.  The
> -#                     default value is 2 (since 2.11)
> +# @multifd-channels: Number of channels used to migrate data in
> +#                    parallel. This is the same number that the
> +#                    number of sockets used for migration.  The
> +#                    default value is 2 (since 4.0)
>  #
>  # @xbzrle-cache-size: cache size to be used by XBZRLE migration.  It
>  #                     needs to be a multiple of the target page size
> @@ -663,7 +663,7 @@
>              '*downtime-limit': 'int',
>              '*x-checkpoint-delay': 'int',
>              '*block-incremental': 'bool',
> -            '*x-multifd-channels': 'int',
> +            '*multifd-channels': 'int',
>              '*xbzrle-cache-size': 'size',
>              '*max-postcopy-bandwidth': 'size',
>  	    '*max-cpu-throttle': 'int' } }
> @@ -742,10 +742,10 @@
>  # 	migrated and the destination must already have access to the
>  # 	same backing chain as was used on the source.  (since 2.10)
>  #
> -# @x-multifd-channels: Number of channels used to migrate data in
> -#                     parallel. This is the same number that the
> -#                     number of sockets used for migration.
> -#                     The default value is 2 (since 2.11)
> +# @multifd-channels: Number of channels used to migrate data in
> +#                    parallel. This is the same number that the
> +#                    number of sockets used for migration.
> +#                    The default value is 2 (since 4.0)
>  #
>  # @xbzrle-cache-size: cache size to be used by XBZRLE migration.  It
>  #                     needs to be a multiple of the target page size
> @@ -775,7 +775,7 @@
>              '*downtime-limit': 'uint64',
>              '*x-checkpoint-delay': 'uint32',
>              '*block-incremental': 'bool' ,
> -            '*x-multifd-channels': 'uint8',
> +            '*multifd-channels': 'uint8',
>              '*xbzrle-cache-size': 'size',
>  	    '*max-postcopy-bandwidth': 'size',
>              '*max-cpu-throttle':'uint8'} }
> -- 
> 2.20.1
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] [PATCH 1/4] multifd: Change page count default to 128
  2019-02-06 13:23 ` [Qemu-devel] [PATCH 1/4] multifd: Change page count default to 128 Juan Quintela
@ 2019-02-07 11:33   ` Daniel P. Berrangé
  2019-02-07 12:13     ` Juan Quintela
  0 siblings, 1 reply; 18+ messages in thread
From: Daniel P. Berrangé @ 2019-02-07 11:33 UTC (permalink / raw)
  To: Juan Quintela
  Cc: qemu-devel, Laurent Vivier, Thomas Huth, Markus Armbruster,
	Dr. David Alan Gilbert, Paolo Bonzini

On Wed, Feb 06, 2019 at 02:23:28PM +0100, Juan Quintela wrote:
> I haven't seen any problem with using 64 or 128.  And it makes much
> less contention on the locks.  Just change it.

Isn't there an issue with having a fixed page count, given the
very different default page sizes across architectures?

x86 uses 4kb pages, while ppc64 uses 64kb pages IIUC.

This would mean the current value of 64 pages would correspond
to 1/4 MB on x86 and 4 MB on ppc64.  The new value would
be 1/2 MB on x86 and 8 MB on ppc64.
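
Spelling out the arithmetic (page sizes as in the examples above):

    /*
     * Packet payload = pages-per-packet * target page size:
     *
     *   x86_64 (4 KiB pages):  128 * 4 KiB  = 512 KiB per packet
     *   ppc64 (64 KiB pages):  128 * 64 KiB =   8 MiB per packet
     */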

Should we instead measure this tunable in units that
are independent of page size? E.g. measure in KB, with a
requirement that the value is a multiple of the page size,
and then set the default to 512 KB?
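
A sketch of what that validation could look like (the parameter name
multifd_packet_size is hypothetical, just to illustrate the idea):

    /* Hypothetical check for a size-based tunable, in bytes. */
    if (params->multifd_packet_size % qemu_target_page_size() != 0) {
        error_setg(errp, "multifd packet size must be a multiple "
                   "of the target page size");
        return false;
    }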

> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  migration/migration.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index ef1d53cde2..f673486679 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -81,7 +81,7 @@
>  /* The delay time (in ms) between two COLO checkpoints */
>  #define DEFAULT_MIGRATE_X_CHECKPOINT_DELAY (200 * 100)
>  #define DEFAULT_MIGRATE_MULTIFD_CHANNELS 2
> -#define DEFAULT_MIGRATE_MULTIFD_PAGE_COUNT 16
> +#define DEFAULT_MIGRATE_MULTIFD_PAGE_COUNT 128
>  
>  /* Background transfer rate for postcopy, 0 means unlimited, note
>   * that page requests can still exceed this limit.
> -- 
> 2.20.1
> 
> 

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


* Re: [Qemu-devel] [PATCH 1/4] multifd: Change page count default to 128
  2019-02-07 11:33   ` Daniel P. Berrangé
@ 2019-02-07 12:13     ` Juan Quintela
  2019-02-07 12:41       ` Daniel P. Berrangé
  0 siblings, 1 reply; 18+ messages in thread
From: Juan Quintela @ 2019-02-07 12:13 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: qemu-devel, Laurent Vivier, Thomas Huth, Markus Armbruster,
	Dr. David Alan Gilbert, Paolo Bonzini

Daniel P. Berrangé <berrange@redhat.com> wrote:
> On Wed, Feb 06, 2019 at 02:23:28PM +0100, Juan Quintela wrote:
>> I haven't seen any problem with using 64 or 128.  And it makes much
>> less contention on the locks.  Just change it.
>
> Isn't there an issue with having a fixed page count, given the
> very different default page sizes across architectures?
>
> x86 uses 4kb pages, while ppc64 uses 64kb pages IIUC.
>
> This would mean the current value of 64 pages would correspond
> to 1/4 MB on x86 and 4 MB on ppc64.  The new value would
> be 1/2 MB on x86 and 8 MB on ppc64.

I saw no difference on x86 between 64 and 128 pages.  Bigger packets
mean half the contention on the locks and are better for compression
(see the next series).


> Should we instead measure this tunable in units that
> are independent of page size? E.g. measure in KB, with a
> requirement that the value is a multiple of the page size,
> and then set the default to 512 KB?

See the next patch: I just dropped the tunable altogether.  Libvirt
doesn't want to support it (it is difficult to explain), and in the past
you asked me to choose a sane value and live with it O:-)
It was good for testing, though.

While we are at it, is there a good value for a network packet?
I put it in pages because it simplifies the coding, but doing a
CONSTANT / qemu_target_page_size() is not going to complicate anything
either.
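
For illustration, a minimal sketch of that approach (MULTIFD_PACKET_SIZE
is a hypothetical name, nothing in this series defines it):

    /* Fixed payload size in bytes, independent of the architecture. */
    #define MULTIFD_PACKET_SIZE (512 * 1024)

    static uint32_t multifd_page_count(void)
    {
        /* Same packet payload whatever the target page size is. */
        return MULTIFD_PACKET_SIZE / qemu_target_page_size();
    }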

Later, Juan.


* Re: [Qemu-devel] [PATCH 2/4] multifd: Drop x-multifd-page-count parameter
  2019-02-06 19:00       ` Laurent Vivier
@ 2019-02-07 12:15         ` Juan Quintela
  0 siblings, 0 replies; 18+ messages in thread
From: Juan Quintela @ 2019-02-07 12:15 UTC (permalink / raw)
  To: Laurent Vivier
  Cc: qemu-devel, Markus Armbruster, Thomas Huth, Paolo Bonzini,
	Eric Blake, Dr. David Alan Gilbert

Laurent Vivier <lvivier@redhat.com> wrote:
> On 06/02/2019 18:58, Juan Quintela wrote:
>> Laurent Vivier <lvivier@redhat.com> wrote:
>>> On 06/02/2019 14:23, Juan Quintela wrote:
>>>> Libvirt doesn't want to expose (and explain) it.  And testing shows that
>>>> 128 is good for all use cases, so just drop it.
>>>>
>>>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>>>> ---
>>>>  hmp.c                 |  7 -------
>>>>  migration/migration.c | 30 ------------------------------
>>>>  migration/migration.h |  1 -
>>>>  migration/ram.c       | 13 ++++++++-----
>>>>  qapi/migration.json   | 13 +------------
>>>>  5 files changed, 9 insertions(+), 55 deletions(-)
>>>>
>>> ...
>>>> diff --git a/migration/migration.c b/migration/migration.c
>>>> index f673486679..65df9b566e 100644
>>>> --- a/migration/migration.c
>>>> +++ b/migration/migration.c
>>>> @@ -81,7 +81,6 @@
>>>>  /* The delay time (in ms) between two COLO checkpoints */
>>>>  #define DEFAULT_MIGRATE_X_CHECKPOINT_DELAY (200 * 100)
>>>>  #define DEFAULT_MIGRATE_MULTIFD_CHANNELS 2
>>>> -#define DEFAULT_MIGRATE_MULTIFD_PAGE_COUNT 128
>>>
>>> Why do you update it in the previous patch to remove it in this one?
>> 
>> To make it clear that I changed the default.  Otherwise it gets hidden in
>> the whole patch.  If you prefer, I could have done it the other way around.
>
> OK, I understand. It's not really clear because the new default
> (MULTIFD_PAGE_COUNT) is hidden in the patch.
>
> Moreover, in the first patch you update the value, but you don't update
> the comments in qapi/migration.json (I've seen that because you remove
> them in this patch).

Aha, I knew I was forgetting something.

> Perhaps you can proceed in the reverse order: remove the parameter and
> then set the new default... or merge the two patches and saying in the
> commit message you change the default value.

Ok.

> Thanks,
> Laurent

Later, Juan.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [PATCH 2/4] multifd: Drop x-multifd-page-count parameter
  2019-02-06 13:23 ` [Qemu-devel] [PATCH 2/4] multifd: Drop x-multifd-page-count parameter Juan Quintela
  2019-02-06 14:20   ` Laurent Vivier
@ 2019-02-07 12:33   ` Daniel P. Berrangé
  2019-02-12  9:34     ` Juan Quintela
  1 sibling, 1 reply; 18+ messages in thread
From: Daniel P. Berrangé @ 2019-02-07 12:33 UTC (permalink / raw)
  To: Juan Quintela
  Cc: qemu-devel, Laurent Vivier, Thomas Huth, Markus Armbruster,
	Dr. David Alan Gilbert, Paolo Bonzini

On Wed, Feb 06, 2019 at 02:23:29PM +0100, Juan Quintela wrote:
> Libvirt don't want to expose (and explain it).  And testing looks like
> 128 is good for all use cases, so just drop it.

One significant concern inline...

> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  hmp.c                 |  7 -------
>  migration/migration.c | 30 ------------------------------
>  migration/migration.h |  1 -
>  migration/ram.c       | 13 ++++++++-----
>  qapi/migration.json   | 13 +------------
>  5 files changed, 9 insertions(+), 55 deletions(-)


> @@ -718,7 +721,7 @@ static void multifd_send_fill_packet(MultiFDSendParams *p)
>      packet->magic = cpu_to_be32(MULTIFD_MAGIC);
>      packet->version = cpu_to_be32(MULTIFD_VERSION);
>      packet->flags = cpu_to_be32(p->flags);
> -    packet->size = cpu_to_be32(migrate_multifd_page_count());
> +    packet->size = cpu_to_be32(MULTIFD_PAGE_COUNT);
>      packet->used = cpu_to_be32(p->pages->used);
>      packet->packet_num = cpu_to_be64(p->packet_num);
>

Here the source QEMU sends the packet's maximum page count - which is
now a hardcoded constant - to the target QEMU.

> @@ -756,10 +759,10 @@ static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp)
>      p->flags = be32_to_cpu(packet->flags);
>  
>      packet->size = be32_to_cpu(packet->size);
> -    if (packet->size > migrate_multifd_page_count()) {
> +    if (packet->size > MULTIFD_PAGE_COUNT) {
>          error_setg(errp, "multifd: received packet "
>                     "with size %d and expected maximum size %d",
> -                   packet->size, migrate_multifd_page_count()) ;
> +                   packet->size, MULTIFD_PAGE_COUNT) ;
>          return -1;
>      }
>

Here the dest QEMU receives the page count that the source QEMU used,
and checks that it is not larger than its own constant.
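
For reference, the wire header these hunks fill in looks roughly like
this (reconstructed from the fields visible above; exact layout is
assumed from the cpu_to_be32()/cpu_to_be64() calls, trailing fields
omitted):

    typedef struct {
        uint32_t magic;      /* MULTIFD_MAGIC */
        uint32_t version;    /* MULTIFD_VERSION */
        uint32_t flags;
        uint32_t size;       /* max pages per packet - the constant */
        uint32_t used;       /* pages actually carried in this packet */
        uint64_t packet_num;
        /* ... ramblock name and page offsets follow ... */
    } MultiFDPacket_t;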

IIUC, the implication here is that if we ever increase this constant
in a future QEMU, we will break live migration from new to old QEMU
due to this check.  In fact your previous patch in this series has
done exactly that, so this appears to mean QEMU 4.0 -> QEMU 3.2
multifd migration is broken now.
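(Concretely: a destination still using the old 16-page default would
reject the very first packet from a source advertising size 128 at
the check above.)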

Alternatively, if we decrease the constant in a future QEMU, we will
break live migration from old QEMU to new QEMU, which is even worse.

This problem existed before this patch, if the management app was
not explicitly using migrate-set-parameters to set the page count
on both sides of QEMU. So we're already broken, but at least the
feature was marked experimental.

What is the purpose of this packet size check?  Is it something
we can safely remove, so that we can increase or decrease the
constant at will without breaking migration compatibility?

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [PATCH 1/4] multifd: Change page count default to 128
  2019-02-07 12:13     ` Juan Quintela
@ 2019-02-07 12:41       ` Daniel P. Berrangé
  0 siblings, 0 replies; 18+ messages in thread
From: Daniel P. Berrangé @ 2019-02-07 12:41 UTC (permalink / raw)
  To: Juan Quintela
  Cc: qemu-devel, Laurent Vivier, Thomas Huth, Markus Armbruster,
	Dr. David Alan Gilbert, Paolo Bonzini

On Thu, Feb 07, 2019 at 01:13:51PM +0100, Juan Quintela wrote:
> Daniel P. Berrangé <berrange@redhat.com> wrote:
> > On Wed, Feb 06, 2019 at 02:23:28PM +0100, Juan Quintela wrote:
> >> I haven't seend any problem about using 64 or 128.  And it make much
> >> less contention on the locks.  Just change it.
> >
> > Isn't there a issue with having a fixed page count given the
> > very different default page sizes across architectures ?
> >
> > x86 is 4kb pages, while ppc64 uses 64kb pages IIUC.
> >
> > This would mean current value of 64 pages, would correspond
> > to 1/4 MB on x86, and 4 MB on ppc64.  The new value would
> > be 1/2 MB on x86 and 8 MB on ppc64.
> 
> I saw no difference (on x86 between 64 and 128 pages).  Bigger pages
> means half the contention on the locks and better for compression (see
> next series).

1/4 MB -> 1/2 MB is not all that significant a change, but 1/2 MB
vs 8 MB is very significant.

I wouldn't be surprised if this difference in values results in
rather different performance characteristics for multifd migration
between x86 and ppc64.
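
For concreteness: 128 pages * 4 KB = 512 KB per packet on x86, while
128 pages * 64 KB = 8 MB per packet on ppc64, a 16x difference from
the same default.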

> > Should we instead be measuring this tunable in units that
> > are independant of page size ? eg meansure in KB, with a
> > requirement that the value is a multiple of the page size.
> > Then set the default to 512 KB ?
> 
> See next patch, I just dropped the tunable altogether.  Libvirt don't
> want to support it (difficult to explain), and in the past you asked me
> to choose a sane value and live with it O:-)
> It was good for testing, though.

Yep, I think it's good if QEMU chooses a sane value.  I'm just
wondering whether the value chosen is actually suitable for non-x86
architectures.

> Once there, is there a good value for a network packet?

I don't have any particular suggestion here.  We would probably have
to look at real performance measurements of migration vs guest
workload to understand whether we've got a good size.

> I put it in pages because it facilitates the coding, but doing a:
> CONSTANT/qemu_target_page_size() is not going to complicate anything
> either.



Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [PATCH 2/4] multifd: Drop x-multifd-page-count parameter
  2019-02-07 12:33   ` Daniel P. Berrangé
@ 2019-02-12  9:34     ` Juan Quintela
  2019-02-12 10:29       ` Daniel P. Berrangé
  0 siblings, 1 reply; 18+ messages in thread
From: Juan Quintela @ 2019-02-12  9:34 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: qemu-devel, Laurent Vivier, Thomas Huth, Markus Armbruster,
	Dr. David Alan Gilbert, Paolo Bonzini

Daniel P. Berrangé <berrange@redhat.com> wrote:
> On Wed, Feb 06, 2019 at 02:23:29PM +0100, Juan Quintela wrote:
>> Libvirt don't want to expose (and explain it).  And testing looks like
>> 128 is good for all use cases, so just drop it.
>
> One significant concern inline...
>
>> 
>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>> ---
>>  hmp.c                 |  7 -------
>>  migration/migration.c | 30 ------------------------------
>>  migration/migration.h |  1 -
>>  migration/ram.c       | 13 ++++++++-----
>>  qapi/migration.json   | 13 +------------
>>  5 files changed, 9 insertions(+), 55 deletions(-)
>
>
>> @@ -718,7 +721,7 @@ static void multifd_send_fill_packet(MultiFDSendParams *p)
>>      packet->magic = cpu_to_be32(MULTIFD_MAGIC);
>>      packet->version = cpu_to_be32(MULTIFD_VERSION);
>>      packet->flags = cpu_to_be32(p->flags);
>> -    packet->size = cpu_to_be32(migrate_multifd_page_count());
>> +    packet->size = cpu_to_be32(MULTIFD_PAGE_COUNT);
>>      packet->used = cpu_to_be32(p->pages->used);
>>      packet->packet_num = cpu_to_be64(p->packet_num);
>>
>
> Here the source QEMU sends the page size - which is now
> a hardcoded constant - to the target QEMU.
>
>> @@ -756,10 +759,10 @@ static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp)
>>      p->flags = be32_to_cpu(packet->flags);
>>  
>>      packet->size = be32_to_cpu(packet->size);
>> -    if (packet->size > migrate_multifd_page_count()) {
>> +    if (packet->size > MULTIFD_PAGE_COUNT) {
>>          error_setg(errp, "multifd: received packet "
>>                     "with size %d and expected maximum size %d",
>> -                   packet->size, migrate_multifd_page_count()) ;
>> +                   packet->size, MULTIFD_PAGE_COUNT) ;
>>          return -1;
>>      }
>>
>
> Here the dest QEMU receives the page size that the source QEMU used, and
> checks that it is not larger than its constant.
>
> IIUC, the implication here is that if we ever increase the size of this
> constant in future QEMU, we will break live migration from new to old
> QEMU due to this check.  In fact your previous patch in this series has
> done exactly that, so this appears to mean QEMU 4.0 -> QEMU 3.2
> multifd migration is broken now.
>
> Alternatively if we decrease the size of the constant in future
> QEMU, we will break live migration from old QEMU to new QEMU which
> is even worse.
>
> This problem existed before this patch, if the management app was
> not explicitly using migrate-set-parameters to set the page count
> on both sides of QEMU. So we're already broken, but at least the
> feature was marked experimental.
>
> What is the purpose of this packet size check ?  Is it something
> we can safely remove, so that we can increase or decrease the
> size at will without breaking migration compat.

We have a "dinamyc" array of pages of that size.  What we check is that
the array fits into the part that we have assigned.

We "could" wait until this moment to create the arrays, I need to look
into that.  Notice that what the check does is making sure that whatewer
we receive is not bigger than the space that we have allocated.
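
Something along these lines on the receive side (untested sketch;
MULTIFD_MAX_PAGE_COUNT is a made-up sanity bound, multifd_pages_init()
is the existing allocator):

    packet->size = be32_to_cpu(packet->size);
    /* Sanity-cap untrusted input instead of comparing against the
     * sender's compile-time constant. */
    if (packet->size > MULTIFD_MAX_PAGE_COUNT) {
        error_setg(errp, "multifd: packet size %d too large",
                   packet->size);
        return -1;
    }
    /* Size the receive array from the advertised count on first use. */
    if (!p->pages) {
        p->pages = multifd_pages_init(packet->size);
    }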

At this point, that check can only fail if we are being attacked and
have received a malformed packet.  We check during negotiation that
this value is OK.

We should *also* check this in the initial packet, and then this
check should never trigger.

From a management point of view, what do you prefer here?

Later, Juan.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [PATCH 2/4] multifd: Drop x-multifd-page-count parameter
  2019-02-12  9:34     ` Juan Quintela
@ 2019-02-12 10:29       ` Daniel P. Berrangé
  0 siblings, 0 replies; 18+ messages in thread
From: Daniel P. Berrangé @ 2019-02-12 10:29 UTC (permalink / raw)
  To: Juan Quintela
  Cc: qemu-devel, Laurent Vivier, Thomas Huth, Markus Armbruster,
	Dr. David Alan Gilbert, Paolo Bonzini

On Tue, Feb 12, 2019 at 10:34:35AM +0100, Juan Quintela wrote:
> Daniel P. Berrangé <berrange@redhat.com> wrote:
> > On Wed, Feb 06, 2019 at 02:23:29PM +0100, Juan Quintela wrote:
> >> Libvirt don't want to expose (and explain it).  And testing looks like
> >> 128 is good for all use cases, so just drop it.
> >
> > One significant concern inline...
> >
> >> 
> >> Signed-off-by: Juan Quintela <quintela@redhat.com>
> >> ---
> >>  hmp.c                 |  7 -------
> >>  migration/migration.c | 30 ------------------------------
> >>  migration/migration.h |  1 -
> >>  migration/ram.c       | 13 ++++++++-----
> >>  qapi/migration.json   | 13 +------------
> >>  5 files changed, 9 insertions(+), 55 deletions(-)
> >
> >
> >> @@ -718,7 +721,7 @@ static void multifd_send_fill_packet(MultiFDSendParams *p)
> >>      packet->magic = cpu_to_be32(MULTIFD_MAGIC);
> >>      packet->version = cpu_to_be32(MULTIFD_VERSION);
> >>      packet->flags = cpu_to_be32(p->flags);
> >> -    packet->size = cpu_to_be32(migrate_multifd_page_count());
> >> +    packet->size = cpu_to_be32(MULTIFD_PAGE_COUNT);
> >>      packet->used = cpu_to_be32(p->pages->used);
> >>      packet->packet_num = cpu_to_be64(p->packet_num);
> >>
> >
> > Here the source QEMU sends the page size - which is now
> > a hardcoded constant - to the target QEMU.
> >
> >> @@ -756,10 +759,10 @@ static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp)
> >>      p->flags = be32_to_cpu(packet->flags);
> >>  
> >>      packet->size = be32_to_cpu(packet->size);
> >> -    if (packet->size > migrate_multifd_page_count()) {
> >> +    if (packet->size > MULTIFD_PAGE_COUNT) {
> >>          error_setg(errp, "multifd: received packet "
> >>                     "with size %d and expected maximum size %d",
> >> -                   packet->size, migrate_multifd_page_count()) ;
> >> +                   packet->size, MULTIFD_PAGE_COUNT) ;
> >>          return -1;
> >>      }
> >>
> >
> > Here the dest QEMU receives the page size that the source QEMU used, and
> > checks that it is not larger than its constant.
> >
> > IIUC, the implication here is that if we ever increase the size of this
> > constant in future QEMU, we will break live migration from new to old
> > QEMU due to this check.  In fact your previous patch in this series has
> > done exactly that, so this appears to mean QEMU 4.0 -> QEMU 3.2
> > multifd migration is broken now.
> >
> > Alternatively if we decrease the size of the constant in future
> > QEMU, we will break live migration from old QEMU to new QEMU which
> > is even worse.
> >
> > This problem existed before this patch, if the management app was
> > not explicitly using migrate-set-parameters to set the page count
> > on both sides of QEMU. So we're already broken, but at least the
> > feature was marked experimental.
> >
> > What is the purpose of this packet size check ?  Is it something
> > we can safely remove, so that we can increase or decrease the
> > size at will without breaking migration compat.
> 
> We have a "dinamyc" array of pages of that size.  What we check is that
> the array fits into the part that we have assigned.
> 
> We "could" wait until this moment to create the arrays, I need to look
> into that.  Notice that what the check does is making sure that whatewer
> we receive is not bigger than the space that we have allocated.
> 
> At this point, that check can only fail if we are "being" attacked and
> we have a malformed string.  We check during negotiation that this value
> is ok.
> 
> We should check this *also* in the initial packet, and then this check
> should never be true.

Right, but checking earlier will still have the same
backward-compatibility problem I described: if we ever change the page
count in a future QEMU, the limit checked by one QEMU may be smaller
than what the other QEMU is intentionally sending.  I don't see where
we have a bi-directional channel that would allow the two QEMUs to
negotiate a mutually acceptable page count :-(

> From a management point of view, what do you prefer here?

The earlier we check requirements the better, as it means the error
gets reported sooner, before wasting time on data transmission.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread

Thread overview: 18+ messages
2019-02-06 13:23 [Qemu-devel] [PATCH 0/4] migration: Make multifd not experimental Juan Quintela
2019-02-06 13:23 ` [Qemu-devel] [PATCH 1/4] multifd: Change page count default to 128 Juan Quintela
2019-02-07 11:33   ` Daniel P. Berrangé
2019-02-07 12:13     ` Juan Quintela
2019-02-07 12:41       ` Daniel P. Berrangé
2019-02-06 13:23 ` [Qemu-devel] [PATCH 2/4] multifd: Drop x-multifd-page-count parameter Juan Quintela
2019-02-06 14:20   ` Laurent Vivier
2019-02-06 17:58     ` Juan Quintela
2019-02-06 19:00       ` Laurent Vivier
2019-02-07 12:15         ` Juan Quintela
2019-02-07 12:33   ` Daniel P. Berrangé
2019-02-12  9:34     ` Juan Quintela
2019-02-12 10:29       ` Daniel P. Berrangé
2019-02-06 13:23 ` [Qemu-devel] [PATCH 3/4] multifd: Drop x- Juan Quintela
2019-02-07 11:23   ` Dr. David Alan Gilbert
2019-02-06 13:23 ` [Qemu-devel] [PATCH 4/4] tests: Add migration multifd test Juan Quintela
2019-02-06 15:49   ` Thomas Huth
