* [Qemu-devel] [PATCH 0/3] Postcopy bandwidth limiting
@ 2018-06-13 10:26 Dr. David Alan Gilbert (git)
  2018-06-13 10:26 ` [Qemu-devel] [PATCH 1/3] migration/postcopy: Add max-postcopy-bandwidth parameter Dr. David Alan Gilbert (git)
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2018-06-13 10:26 UTC (permalink / raw)
  To: qemu-devel, quintela, peterx; +Cc: jdenemar

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

Hi,
  Postcopy currently turns off bandwidth limits during the postcopy
phase to make sure the urgent postcopy requests aren't delayed.
  This causes problems for larger clusters which share networking
between the migration stream and other critical services.

  This series restricts the background postcopy bandwidth but does
it in a way that lets the urgent postcopy requests get through.

  Testing on a 10Gbps link with a 400MByte/s limit shows
very little difference in postcopy request latency, while
a network graph shows the bandwidth usage staying very close
to the set limit, even with a very heavy memory load running
in the guest.
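
With the series applied, the limit used in the test above could be set
from the HMP monitor before migration starts; a hypothetical session
(the 400M size suffix is accepted because the parameter is parsed with
visit_type_size()):

```
(qemu) migrate_set_parameter max-postcopy-bandwidth 400M
(qemu) info migrate_parameters
...
max-postcopy-bandwidth: 419430400
```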

Dr. David Alan Gilbert (3):
  migration/postcopy: Add max-postcopy-bandwidth parameter
  migration: Wake rate limiting for urgent requests
  migration/postcopy: Wake rate limit sleep on postcopy request

 hmp.c                  |  7 +++++
 migration/migration.c  | 70 +++++++++++++++++++++++++++++++++++++++---
 migration/migration.h  |  8 +++++
 migration/ram.c        |  9 +++++-
 migration/trace-events |  2 ++
 qapi/migration.json    | 19 ++++++++++--
 6 files changed, 106 insertions(+), 9 deletions(-)

-- 
2.17.1

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Qemu-devel] [PATCH 1/3] migration/postcopy: Add max-postcopy-bandwidth parameter
  2018-06-13 10:26 [Qemu-devel] [PATCH 0/3] Postcopy bandwidth limiting Dr. David Alan Gilbert (git)
@ 2018-06-13 10:26 ` Dr. David Alan Gilbert (git)
  2018-06-14  2:18   ` Peter Xu
  2018-06-13 10:26 ` [Qemu-devel] [PATCH 2/3] migration: Wake rate limiting for urgent requests Dr. David Alan Gilbert (git)
  2018-06-13 10:26 ` [Qemu-devel] [PATCH 3/3] migration/postcopy: Wake rate limit sleep on postcopy request Dr. David Alan Gilbert (git)
  2 siblings, 1 reply; 9+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2018-06-13 10:26 UTC (permalink / raw)
  To: qemu-devel, quintela, peterx; +Cc: jdenemar

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

Limit the background transfer bandwidth during the postcopy
phase to the value set on this new parameter.  The default, 0,
corresponds to the existing behaviour which is unlimited bandwidth.
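
For illustration, the parameter also maps onto QMP via the
migrate-set-parameters command extended in qapi/migration.json below; a
sketch of a client request (100 MiB/s shown, value in bytes per second):

```json
{ "execute": "migrate-set-parameters",
  "arguments": { "max-postcopy-bandwidth": 104857600 } }
```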

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 hmp.c                 |  7 +++++++
 migration/migration.c | 35 ++++++++++++++++++++++++++++++++++-
 qapi/migration.json   | 19 ++++++++++++++++---
 3 files changed, 57 insertions(+), 4 deletions(-)

diff --git a/hmp.c b/hmp.c
index 983e3be1a1..f601099f90 100644
--- a/hmp.c
+++ b/hmp.c
@@ -370,6 +370,9 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
         monitor_printf(mon, "%s: %" PRIu64 "\n",
             MigrationParameter_str(MIGRATION_PARAMETER_XBZRLE_CACHE_SIZE),
             params->xbzrle_cache_size);
+        monitor_printf(mon, "%s: %" PRIu64 "\n",
+            MigrationParameter_str(MIGRATION_PARAMETER_MAX_POSTCOPY_BANDWIDTH),
+            params->max_postcopy_bandwidth);
     }
 
     qapi_free_MigrationParameters(params);
@@ -1684,6 +1687,10 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
         }
         p->xbzrle_cache_size = cache_size;
         break;
+    case MIGRATION_PARAMETER_MAX_POSTCOPY_BANDWIDTH:
+        p->has_max_postcopy_bandwidth = true;
+        visit_type_size(v, param, &p->max_postcopy_bandwidth, &err);
+        break;
     default:
         assert(0);
     }
diff --git a/migration/migration.c b/migration/migration.c
index 1e99ec9b7e..3a50d4c35c 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -82,6 +82,11 @@
 #define DEFAULT_MIGRATE_MULTIFD_CHANNELS 2
 #define DEFAULT_MIGRATE_MULTIFD_PAGE_COUNT 16
 
+/* Background transfer rate for postcopy, 0 means unlimited, note
+ * that page requests can still exceed this limit.
+ */
+#define DEFAULT_MIGRATE_MAX_POSTCOPY_BANDWIDTH 0
+
 static NotifierList migration_state_notifiers =
     NOTIFIER_LIST_INITIALIZER(migration_state_notifiers);
 
@@ -659,6 +664,8 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp)
     params->x_multifd_page_count = s->parameters.x_multifd_page_count;
     params->has_xbzrle_cache_size = true;
     params->xbzrle_cache_size = s->parameters.xbzrle_cache_size;
+    params->has_max_postcopy_bandwidth = true;
+    params->max_postcopy_bandwidth = s->parameters.max_postcopy_bandwidth;
 
     return params;
 }
@@ -1066,6 +1073,9 @@ static void migrate_params_test_apply(MigrateSetParameters *params,
     if (params->has_xbzrle_cache_size) {
         dest->xbzrle_cache_size = params->xbzrle_cache_size;
     }
+    if (params->has_max_postcopy_bandwidth) {
+        dest->max_postcopy_bandwidth = params->max_postcopy_bandwidth;
+    }
 }
 
 static void migrate_params_apply(MigrateSetParameters *params, Error **errp)
@@ -1138,6 +1148,9 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp)
         s->parameters.xbzrle_cache_size = params->xbzrle_cache_size;
         xbzrle_cache_resize(params->xbzrle_cache_size, errp);
     }
+    if (params->has_max_postcopy_bandwidth) {
+        s->parameters.max_postcopy_bandwidth = params->max_postcopy_bandwidth;
+    }
 }
 
 void qmp_migrate_set_parameters(MigrateSetParameters *params, Error **errp)
@@ -1887,6 +1900,16 @@ int64_t migrate_xbzrle_cache_size(void)
     return s->parameters.xbzrle_cache_size;
 }
 
+static int64_t migrate_max_postcopy_bandwidth(void)
+{
+    MigrationState *s;
+
+    s = migrate_get_current();
+
+    return s->parameters.max_postcopy_bandwidth;
+}
+
+
 bool migrate_use_block(void)
 {
     MigrationState *s;
@@ -2226,6 +2249,7 @@ static int postcopy_start(MigrationState *ms)
     QIOChannelBuffer *bioc;
     QEMUFile *fb;
     int64_t time_at_stop = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+    int64_t bandwidth = migrate_max_postcopy_bandwidth();
     bool restart_block = false;
     int cur_state = MIGRATION_STATUS_ACTIVE;
     if (!migrate_pause_before_switchover()) {
@@ -2280,7 +2304,12 @@ static int postcopy_start(MigrationState *ms)
      * will notice we're in POSTCOPY_ACTIVE and not actually
      * wrap their state up here
      */
-    qemu_file_set_rate_limit(ms->to_dst_file, INT64_MAX);
+    /* 0 max-postcopy-bandwidth means unlimited */
+    if (!bandwidth) {
+        qemu_file_set_rate_limit(ms->to_dst_file, INT64_MAX);
+    } else {
+        qemu_file_set_rate_limit(ms->to_dst_file, bandwidth / XFER_LIMIT_RATIO);
+    }
     if (migrate_postcopy_ram()) {
         /* Ping just for debugging, helps line traces up */
         qemu_savevm_send_ping(ms->to_dst_file, 2);
@@ -3042,6 +3071,9 @@ static Property migration_properties[] = {
     DEFINE_PROP_SIZE("xbzrle-cache-size", MigrationState,
                       parameters.xbzrle_cache_size,
                       DEFAULT_MIGRATE_XBZRLE_CACHE_SIZE),
+    DEFINE_PROP_SIZE("max-postcopy-bandwidth", MigrationState,
+                      parameters.max_postcopy_bandwidth,
+                      DEFAULT_MIGRATE_MAX_POSTCOPY_BANDWIDTH),
 
     /* Migration capabilities */
     DEFINE_PROP_MIG_CAP("x-xbzrle", MIGRATION_CAPABILITY_XBZRLE),
@@ -3110,6 +3142,7 @@ static void migration_instance_init(Object *obj)
     params->has_x_multifd_channels = true;
     params->has_x_multifd_page_count = true;
     params->has_xbzrle_cache_size = true;
+    params->has_max_postcopy_bandwidth = true;
 
     qemu_sem_init(&ms->postcopy_pause_sem, 0);
     qemu_sem_init(&ms->postcopy_pause_rp_sem, 0);
diff --git a/qapi/migration.json b/qapi/migration.json
index f7e10ee90f..1b4c1db670 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -517,6 +517,9 @@
 #                     and a power of 2
 #                     (Since 2.11)
 #
+# @max-postcopy-bandwidth: Background transfer bandwidth during postcopy.
+#                     Defaults to 0 (unlimited).  In bytes per second.
+#                     (Since 3.0)
 # Since: 2.4
 ##
 { 'enum': 'MigrationParameter',
@@ -525,7 +528,7 @@
            'tls-creds', 'tls-hostname', 'max-bandwidth',
            'downtime-limit', 'x-checkpoint-delay', 'block-incremental',
            'x-multifd-channels', 'x-multifd-page-count',
-           'xbzrle-cache-size' ] }
+           'xbzrle-cache-size', 'max-postcopy-bandwidth' ] }
 
 ##
 # @MigrateSetParameters:
@@ -593,6 +596,10 @@
 #                     needs to be a multiple of the target page size
 #                     and a power of 2
 #                     (Since 2.11)
+#
+# @max-postcopy-bandwidth: Background transfer bandwidth during postcopy.
+#                     Defaults to 0 (unlimited).  In bytes per second.
+#                     (Since 3.0)
 # Since: 2.4
 ##
 # TODO either fuse back into MigrationParameters, or make
@@ -611,7 +618,8 @@
             '*block-incremental': 'bool',
             '*x-multifd-channels': 'int',
             '*x-multifd-page-count': 'int',
-            '*xbzrle-cache-size': 'size' } }
+            '*xbzrle-cache-size': 'size',
+            '*max-postcopy-bandwidth': 'size' } }
 
 ##
 # @migrate-set-parameters:
@@ -694,6 +702,10 @@
 #                     needs to be a multiple of the target page size
 #                     and a power of 2
 #                     (Since 2.11)
+#
+# @max-postcopy-bandwidth: Background transfer bandwidth during postcopy.
+#                     Defaults to 0 (unlimited).  In bytes per second.
+#                     (Since 3.0)
 # Since: 2.4
 ##
 { 'struct': 'MigrationParameters',
@@ -710,7 +722,8 @@
             '*block-incremental': 'bool' ,
             '*x-multifd-channels': 'uint8',
             '*x-multifd-page-count': 'uint32',
-            '*xbzrle-cache-size': 'size' } }
+            '*xbzrle-cache-size': 'size',
+            '*max-postcopy-bandwidth': 'size'  } }
 
 ##
 # @query-migrate-parameters:
-- 
2.17.1


* [Qemu-devel] [PATCH 2/3] migration: Wake rate limiting for urgent requests
  2018-06-13 10:26 [Qemu-devel] [PATCH 0/3] Postcopy bandwidth limiting Dr. David Alan Gilbert (git)
  2018-06-13 10:26 ` [Qemu-devel] [PATCH 1/3] migration/postcopy: Add max-postcopy-bandwidth parameter Dr. David Alan Gilbert (git)
@ 2018-06-13 10:26 ` Dr. David Alan Gilbert (git)
  2018-06-14  2:21   ` Peter Xu
  2018-06-13 10:26 ` [Qemu-devel] [PATCH 3/3] migration/postcopy: Wake rate limit sleep on postcopy request Dr. David Alan Gilbert (git)
  2 siblings, 1 reply; 9+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2018-06-13 10:26 UTC (permalink / raw)
  To: qemu-devel, quintela, peterx; +Cc: jdenemar

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

Rate limiting sleeps the migration thread for a while when it runs
out of bandwidth; but sometimes we want to wake up to get on with
something more urgent (like a postcopy request).  Here we use
a semaphore with a timedwait instead of a simple sleep; incrementing
the semaphore will wake it up sooner.  Anything that consumes
these urgent events must decrement the semaphore.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 migration/migration.c  | 35 +++++++++++++++++++++++++++++++----
 migration/migration.h  |  8 ++++++++
 migration/trace-events |  2 ++
 3 files changed, 41 insertions(+), 4 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 3a50d4c35c..108c3d7142 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2852,6 +2852,16 @@ static void migration_iteration_finish(MigrationState *s)
     qemu_mutex_unlock_iothread();
 }
 
+void migration_make_urgent_request(void)
+{
+    qemu_sem_post(&migrate_get_current()->rate_limit_sem);
+}
+
+void migration_consume_urgent_request(void)
+{
+    qemu_sem_wait(&migrate_get_current()->rate_limit_sem);
+}
+
 /*
  * Master migration thread on the source VM.
  * It drives the migration and pumps the data down the outgoing channel.
@@ -2861,6 +2871,7 @@ static void *migration_thread(void *opaque)
     MigrationState *s = opaque;
     int64_t setup_start = qemu_clock_get_ms(QEMU_CLOCK_HOST);
     MigThrError thr_error;
+    bool urgent = false;
 
     rcu_register_thread();
 
@@ -2901,7 +2912,7 @@ static void *migration_thread(void *opaque)
            s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
         int64_t current_time;
 
-        if (!qemu_file_rate_limit(s->to_dst_file)) {
+        if (urgent || !qemu_file_rate_limit(s->to_dst_file)) {
             MigIterateState iter_state = migration_iteration_run(s);
             if (iter_state == MIG_ITERATE_SKIP) {
                 continue;
@@ -2932,10 +2943,24 @@ static void *migration_thread(void *opaque)
 
         migration_update_counters(s, current_time);
 
+        urgent = false;
         if (qemu_file_rate_limit(s->to_dst_file)) {
-            /* usleep expects microseconds */
-            g_usleep((s->iteration_start_time + BUFFER_DELAY -
-                      current_time) * 1000);
+            /* Wait for a delay to do rate limiting OR
+             * something urgent to post the semaphore.
+             */
+            int ms = s->iteration_start_time + BUFFER_DELAY - current_time;
+            trace_migration_thread_ratelimit_pre(ms);
+            if (qemu_sem_timedwait(&s->rate_limit_sem, ms) == 0) {
+                /* We were woken by one or more urgent things, but
+                 * the timedwait will have consumed one of them.
+                 * The service routine for the urgent wake will
+                 * decrement the semaphore itself for each item it
+                 * consumes, so re-post the one we just ate.
+                 */
+                qemu_sem_post(&s->rate_limit_sem);
+                urgent = true;
+            }
+            trace_migration_thread_ratelimit_post(urgent);
         }
     }
 
@@ -3109,6 +3134,7 @@ static void migration_instance_finalize(Object *obj)
     qemu_mutex_destroy(&ms->qemu_file_lock);
     g_free(params->tls_hostname);
     g_free(params->tls_creds);
+    qemu_sem_destroy(&ms->rate_limit_sem);
     qemu_sem_destroy(&ms->pause_sem);
     qemu_sem_destroy(&ms->postcopy_pause_sem);
     qemu_sem_destroy(&ms->postcopy_pause_rp_sem);
@@ -3147,6 +3173,7 @@ static void migration_instance_init(Object *obj)
     qemu_sem_init(&ms->postcopy_pause_sem, 0);
     qemu_sem_init(&ms->postcopy_pause_rp_sem, 0);
     qemu_sem_init(&ms->rp_state.rp_sem, 0);
+    qemu_sem_init(&ms->rate_limit_sem, 0);
     qemu_mutex_init(&ms->qemu_file_lock);
 }
 
diff --git a/migration/migration.h b/migration/migration.h
index 31d3ed12dc..64a7b33735 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -121,6 +121,11 @@ struct MigrationState
      */
     QemuMutex qemu_file_lock;
 
+    /*
+     * Used to allow urgent requests to override rate limiting.
+     */
+    QemuSemaphore rate_limit_sem;
+
     /* bytes already send at the beggining of current interation */
     uint64_t iteration_initial_bytes;
     /* time at the start of current iteration */
@@ -287,4 +292,7 @@ void init_dirty_bitmap_incoming_migration(void);
 #define qemu_ram_foreach_block \
   #warning "Use qemu_ram_foreach_block_migratable in migration code"
 
+void migration_make_urgent_request(void);
+void migration_consume_urgent_request(void);
+
 #endif
diff --git a/migration/trace-events b/migration/trace-events
index 4a768eaaeb..3f67758893 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -108,6 +108,8 @@ migration_return_path_end_before(void) ""
 migration_return_path_end_after(int rp_error) "%d"
 migration_thread_after_loop(void) ""
 migration_thread_file_err(void) ""
+migration_thread_ratelimit_pre(int ms) "%d ms"
+migration_thread_ratelimit_post(int urgent) "urgent: %d"
 migration_thread_setup_complete(void) ""
 open_return_path_on_source(void) ""
 open_return_path_on_source_continue(void) ""
-- 
2.17.1


* [Qemu-devel] [PATCH 3/3] migration/postcopy: Wake rate limit sleep on postcopy request
  2018-06-13 10:26 [Qemu-devel] [PATCH 0/3] Postcopy bandwidth limiting Dr. David Alan Gilbert (git)
  2018-06-13 10:26 ` [Qemu-devel] [PATCH 1/3] migration/postcopy: Add max-postcopy-bandwidth parameter Dr. David Alan Gilbert (git)
  2018-06-13 10:26 ` [Qemu-devel] [PATCH 2/3] migration: Wake rate limiting for urgent requests Dr. David Alan Gilbert (git)
@ 2018-06-13 10:26 ` Dr. David Alan Gilbert (git)
  2018-06-14 10:56   ` Peter Xu
  2 siblings, 1 reply; 9+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2018-06-13 10:26 UTC (permalink / raw)
  To: qemu-devel, quintela, peterx; +Cc: jdenemar

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

Use the 'urgent request' mechanism added in the previous patch
for entries added to the postcopy request queue for RAM.  Ignore
the rate limiting while we have requests.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 migration/ram.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/migration/ram.c b/migration/ram.c
index e0d19305ee..d88400d99f 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1538,6 +1538,7 @@ static RAMBlock *unqueue_page(RAMState *rs, ram_addr_t *offset)
             memory_region_unref(block->mr);
             QSIMPLEQ_REMOVE_HEAD(&rs->src_page_requests, next_req);
             g_free(entry);
+            migration_consume_urgent_request();
         }
     }
     qemu_mutex_unlock(&rs->src_page_req_mutex);
@@ -1686,6 +1687,7 @@ int ram_save_queue_pages(const char *rbname, ram_addr_t start, ram_addr_t len)
     memory_region_ref(ramblock->mr);
     qemu_mutex_lock(&rs->src_page_req_mutex);
     QSIMPLEQ_INSERT_TAIL(&rs->src_page_requests, new_entry, next_req);
+    migration_make_urgent_request();
     qemu_mutex_unlock(&rs->src_page_req_mutex);
     rcu_read_unlock();
 
@@ -2634,9 +2636,14 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
 
     t0 = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
     i = 0;
-    while ((ret = qemu_file_rate_limit(f)) == 0) {
+    while ((ret = qemu_file_rate_limit(f)) == 0 ||
+            !QSIMPLEQ_EMPTY(&rs->src_page_requests)) {
         int pages;
 
+        if (qemu_file_get_error(f)) {
+            break;
+        }
+
         pages = ram_find_and_save_block(rs, false);
         /* no more pages to sent */
         if (pages == 0) {
-- 
2.17.1


* Re: [Qemu-devel] [PATCH 1/3] migration/postcopy: Add max-postcopy-bandwidth parameter
  2018-06-13 10:26 ` [Qemu-devel] [PATCH 1/3] migration/postcopy: Add max-postcopy-bandwidth parameter Dr. David Alan Gilbert (git)
@ 2018-06-14  2:18   ` Peter Xu
  0 siblings, 0 replies; 9+ messages in thread
From: Peter Xu @ 2018-06-14  2:18 UTC (permalink / raw)
  To: Dr. David Alan Gilbert (git); +Cc: qemu-devel, quintela, jdenemar

On Wed, Jun 13, 2018 at 11:26:40AM +0100, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> 
> Limit the background transfer bandwidth during the postcopy
> phase to the value set on this new parameter.  The default, 0,
> corresponds to the existing behaviour which is unlimited bandwidth.
> 
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

Reviewed-by: Peter Xu <peterx@redhat.com>

-- 
Peter Xu


* Re: [Qemu-devel] [PATCH 2/3] migration: Wake rate limiting for urgent requests
  2018-06-13 10:26 ` [Qemu-devel] [PATCH 2/3] migration: Wake rate limiting for urgent requests Dr. David Alan Gilbert (git)
@ 2018-06-14  2:21   ` Peter Xu
  2018-06-14  8:50     ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Xu @ 2018-06-14  2:21 UTC (permalink / raw)
  To: Dr. David Alan Gilbert (git); +Cc: qemu-devel, quintela, jdenemar

On Wed, Jun 13, 2018 at 11:26:41AM +0100, Dr. David Alan Gilbert (git) wrote:

[...]

> @@ -2932,10 +2943,24 @@ static void *migration_thread(void *opaque)
>  
>          migration_update_counters(s, current_time);
>  
> +        urgent = false;
>          if (qemu_file_rate_limit(s->to_dst_file)) {
> -            /* usleep expects microseconds */
> -            g_usleep((s->iteration_start_time + BUFFER_DELAY -
> -                      current_time) * 1000);
> +            /* Wait for a delay to do rate limiting OR
> +             * something urgent to post the semaphore.
> +             */
> +            int ms = s->iteration_start_time + BUFFER_DELAY - current_time;
> +            trace_migration_thread_ratelimit_pre(ms);
> +            if (qemu_sem_timedwait(&s->rate_limit_sem, ms) == 0) {
> +                /* We were worken by one or more urgent things but
> +                 * the timedwait will have consumed one of them.
> +                 * The service routine for the urgent wake will dec
> +                 * the semaphore itself for each item it consumes,
> +                 * so add this one we just eat back.
> +                 */
> +                qemu_sem_post(&s->rate_limit_sem);

Is it possible that we just avoid eating that in the next patch?  Then
we only provide a helper to "trigger an urgent request", and we only
consume the post here?

-- 
Peter Xu


* Re: [Qemu-devel] [PATCH 2/3] migration: Wake rate limiting for urgent requests
  2018-06-14  2:21   ` Peter Xu
@ 2018-06-14  8:50     ` Dr. David Alan Gilbert
  2018-06-14 10:56       ` Peter Xu
  0 siblings, 1 reply; 9+ messages in thread
From: Dr. David Alan Gilbert @ 2018-06-14  8:50 UTC (permalink / raw)
  To: Peter Xu; +Cc: qemu-devel, quintela, jdenemar

* Peter Xu (peterx@redhat.com) wrote:
> On Wed, Jun 13, 2018 at 11:26:41AM +0100, Dr. David Alan Gilbert (git) wrote:
> 
> [...]
> 
> > @@ -2932,10 +2943,24 @@ static void *migration_thread(void *opaque)
> >  
> >          migration_update_counters(s, current_time);
> >  
> > +        urgent = false;
> >          if (qemu_file_rate_limit(s->to_dst_file)) {
> > -            /* usleep expects microseconds */
> > -            g_usleep((s->iteration_start_time + BUFFER_DELAY -
> > -                      current_time) * 1000);
> > +            /* Wait for a delay to do rate limiting OR
> > +             * something urgent to post the semaphore.
> > +             */
> > +            int ms = s->iteration_start_time + BUFFER_DELAY - current_time;
> > +            trace_migration_thread_ratelimit_pre(ms);
> > +            if (qemu_sem_timedwait(&s->rate_limit_sem, ms) == 0) {
> > +                /* We were worken by one or more urgent things but
> > +                 * the timedwait will have consumed one of them.
> > +                 * The service routine for the urgent wake will dec
> > +                 * the semaphore itself for each item it consumes,
> > +                 * so add this one we just eat back.
> > +                 */
> > +                qemu_sem_post(&s->rate_limit_sem);
> 
> Is it possible that we just avoid eating that in the next patch?  Then
> we only provide a helper to "trigger an urgent request" but we only
> consume the point here?

I think it's harder;
this code is generic in migration.c, whereas the next patch is all
specific to ram.c; so we'd have to push a flag all the way down.
Also, the code later is very simple - for every request it adds/removes,
it posts/waits on the semaphore - it's nice to keep that simple.

Dave

> -- 
> Peter Xu
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] [PATCH 2/3] migration: Wake rate limiting for urgent requests
  2018-06-14  8:50     ` Dr. David Alan Gilbert
@ 2018-06-14 10:56       ` Peter Xu
  0 siblings, 0 replies; 9+ messages in thread
From: Peter Xu @ 2018-06-14 10:56 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: qemu-devel, quintela, jdenemar

On Thu, Jun 14, 2018 at 09:50:35AM +0100, Dr. David Alan Gilbert wrote:
> * Peter Xu (peterx@redhat.com) wrote:
> > On Wed, Jun 13, 2018 at 11:26:41AM +0100, Dr. David Alan Gilbert (git) wrote:
> > 
> > [...]
> > 
> > > @@ -2932,10 +2943,24 @@ static void *migration_thread(void *opaque)
> > >  
> > >          migration_update_counters(s, current_time);
> > >  
> > > +        urgent = false;
> > >          if (qemu_file_rate_limit(s->to_dst_file)) {
> > > -            /* usleep expects microseconds */
> > > -            g_usleep((s->iteration_start_time + BUFFER_DELAY -
> > > -                      current_time) * 1000);
> > > +            /* Wait for a delay to do rate limiting OR
> > > +             * something urgent to post the semaphore.
> > > +             */
> > > +            int ms = s->iteration_start_time + BUFFER_DELAY - current_time;
> > > +            trace_migration_thread_ratelimit_pre(ms);
> > > +            if (qemu_sem_timedwait(&s->rate_limit_sem, ms) == 0) {
> > > +                /* We were worken by one or more urgent things but
> > > +                 * the timedwait will have consumed one of them.
> > > +                 * The service routine for the urgent wake will dec
> > > +                 * the semaphore itself for each item it consumes,
> > > +                 * so add this one we just eat back.
> > > +                 */
> > > +                qemu_sem_post(&s->rate_limit_sem);
> > 
> > Is it possible that we just avoid eating that in the next patch?  Then
> > we only provide a helper to "trigger an urgent request" but we only
> > consume the point here?
> 
> I think it's harder;
> This code is generic in migration.c where as the next patch is all
> specific in ram.c; so we'd have to push a flag all the way down.
> Also, the code later is very simple - every request it adds/removes
> it posts/waits the semaphore - it's nice to keep that simple.

So what I meant above was to remove this post() and meanwhile remove
the wait() in unqueue_page() in the next patch.  Then it'll be a
single-direction message telling that "we have some urgent work to do"
without any ACK.  That would possibly work, but we might need to run
the loop even when we don't really have postcopy pages to send (e.g.,
when 2 consecutive postcopy requests come, we'll handle the two pages
in the first loop iteration, but we'll run the whole iteration twice,
and in the 2nd one we'll see that the postcopy queue is empty).

The ideal solution I can think of is a semaphore whose maximum value
is 1.  I suppose that could be emulated by an eventfd, but I think it's
fine for now with the current approach; we just need to make sure the
post() and wait() calls are carefully paired up when calling the
new APIs.

As a summary:

Reviewed-by: Peter Xu <peterx@redhat.com>

(One important reason for the r-b is that I see Dave's graph when
 testing the series; it's very beautiful!)

Thanks,

-- 
Peter Xu


* Re: [Qemu-devel] [PATCH 3/3] migration/postcopy: Wake rate limit sleep on postcopy request
  2018-06-13 10:26 ` [Qemu-devel] [PATCH 3/3] migration/postcopy: Wake rate limit sleep on postcopy request Dr. David Alan Gilbert (git)
@ 2018-06-14 10:56   ` Peter Xu
  0 siblings, 0 replies; 9+ messages in thread
From: Peter Xu @ 2018-06-14 10:56 UTC (permalink / raw)
  To: Dr. David Alan Gilbert (git); +Cc: qemu-devel, quintela, jdenemar

On Wed, Jun 13, 2018 at 11:26:42AM +0100, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> 
> Use the 'urgent request' mechanism added in the previous patch
> for entries added to the postcopy request queue for RAM.  Ignore
> the rate limiting while we have requests.
> 
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

Reviewed-by: Peter Xu <peterx@redhat.com>

-- 
Peter Xu

