* [Qemu-devel] [PATCH 0/5] several migration-related patches
From: Peter Xu @ 2017-03-27  7:21 UTC (permalink / raw)
  To: qemu-devel; +Cc: Juan Quintela, "Dr. David Alan Gilbert", peterx

Mostly small touch-ups; the last one looks like a cpu throttle bug in
the time slice calculation (or did I miss anything?). Anyway, please
review, thanks.

Peter Xu (5):
  migration: set current_active_state once
  migration: rename max_size to threshold_size
  hmp: info migrate_capability format tunes
  hmp: info migrate_parameters format tunes
  cpu: throttle: fix throttle time slice

 cpus.c                      |  6 ++----
 hmp.c                       | 26 +++++++++++---------------
 include/migration/vmstate.h |  3 ++-
 migration/migration.c       | 18 +++++++++---------
 migration/savevm.c          |  4 ++--
 5 files changed, 26 insertions(+), 31 deletions(-)

-- 
2.7.4


* [Qemu-devel] [PATCH 1/5] migration: set current_active_state once
From: Peter Xu @ 2017-03-27  7:21 UTC (permalink / raw)
  To: qemu-devel; +Cc: Juan Quintela, "Dr. David Alan Gilbert", peterx

We set it right above this one. No need to set it twice.

CC: Juan Quintela <quintela@redhat.com>
CC: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/migration.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/migration/migration.c b/migration/migration.c
index 54060f7..f9f4d98 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1938,7 +1938,6 @@ static void *migration_thread(void *opaque)
     qemu_savevm_state_begin(s->to_dst_file, &s->params);
 
     s->setup_time = qemu_clock_get_ms(QEMU_CLOCK_HOST) - setup_start;
-    current_active_state = MIGRATION_STATUS_ACTIVE;
     migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
                       MIGRATION_STATUS_ACTIVE);
 
-- 
2.7.4


* [Qemu-devel] [PATCH 2/5] migration: rename max_size to threshold_size
From: Peter Xu @ 2017-03-27  7:21 UTC (permalink / raw)
  To: qemu-devel; +Cc: Juan Quintela, "Dr. David Alan Gilbert", peterx

In the migration code (especially in migration_thread()), max_size is
used in many places for the threshold value at which we start the final
flush and jump to the next stage, dumping everything that remains to
the destination. However, its name is confusing to first-time readers.
Let's rename it to "threshold_size" where appropriate and add a comment
for it. No functional change is made.

CC: Juan Quintela <quintela@redhat.com>
CC: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 include/migration/vmstate.h |  3 ++-
 migration/migration.c       | 17 +++++++++--------
 migration/savevm.c          |  4 ++--
 3 files changed, 13 insertions(+), 11 deletions(-)

diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
index f2dbf84..dad3984 100644
--- a/include/migration/vmstate.h
+++ b/include/migration/vmstate.h
@@ -56,7 +56,8 @@ typedef struct SaveVMHandlers {
 
     /* This runs outside the iothread lock!  */
     int (*save_live_setup)(QEMUFile *f, void *opaque);
-    void (*save_live_pending)(QEMUFile *f, void *opaque, uint64_t max_size,
+    void (*save_live_pending)(QEMUFile *f, void *opaque,
+                              uint64_t threshold_size,
                               uint64_t *non_postcopiable_pending,
                               uint64_t *postcopiable_pending);
     LoadStateHandler *load_state;
diff --git a/migration/migration.c b/migration/migration.c
index f9f4d98..b065fe4 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1907,7 +1907,8 @@ static void *migration_thread(void *opaque)
     int64_t initial_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
     int64_t setup_start = qemu_clock_get_ms(QEMU_CLOCK_HOST);
     int64_t initial_bytes = 0;
-    int64_t max_size = 0;
+    /* We'll do the final flush when reachs threshold_size */
+    int64_t threshold_size = 0;
     int64_t start_time = initial_time;
     int64_t end_time;
     bool old_vm_running = false;
@@ -1951,17 +1952,17 @@ static void *migration_thread(void *opaque)
         if (!qemu_file_rate_limit(s->to_dst_file)) {
             uint64_t pend_post, pend_nonpost;
 
-            qemu_savevm_state_pending(s->to_dst_file, max_size, &pend_nonpost,
-                                      &pend_post);
+            qemu_savevm_state_pending(s->to_dst_file, threshold_size,
+                                      &pend_nonpost, &pend_post);
             pending_size = pend_nonpost + pend_post;
-            trace_migrate_pending(pending_size, max_size,
+            trace_migrate_pending(pending_size, threshold_size,
                                   pend_post, pend_nonpost);
-            if (pending_size && pending_size >= max_size) {
+            if (pending_size && pending_size >= threshold_size) {
                 /* Still a significant amount to transfer */
 
                 if (migrate_postcopy_ram() &&
                     s->state != MIGRATION_STATUS_POSTCOPY_ACTIVE &&
-                    pend_nonpost <= max_size &&
+                    pend_nonpost <= threshold_size &&
                     atomic_read(&s->start_postcopy)) {
 
                     if (!postcopy_start(s, &old_vm_running)) {
@@ -1993,13 +1994,13 @@ static void *migration_thread(void *opaque)
                                          initial_bytes;
             uint64_t time_spent = current_time - initial_time;
             double bandwidth = (double)transferred_bytes / time_spent;
-            max_size = bandwidth * s->parameters.downtime_limit;
+            threshold_size = bandwidth * s->parameters.downtime_limit;
 
             s->mbps = (((double) transferred_bytes * 8.0) /
                     ((double) time_spent / 1000.0)) / 1000.0 / 1000.0;
 
             trace_migrate_transferred(transferred_bytes, time_spent,
-                                      bandwidth, max_size);
+                                      bandwidth, threshold_size);
             /* if we haven't sent anything, we don't want to recalculate
                10000 is a small enough number for our purposes */
             if (s->dirty_bytes_rate && transferred_bytes > 10000) {
diff --git a/migration/savevm.c b/migration/savevm.c
index 3b19a4a..59c04eb 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1197,7 +1197,7 @@ void qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only)
  * the result is split into the amount for units that can and
  * for units that can't do postcopy.
  */
-void qemu_savevm_state_pending(QEMUFile *f, uint64_t max_size,
+void qemu_savevm_state_pending(QEMUFile *f, uint64_t threshold_size,
                                uint64_t *res_non_postcopiable,
                                uint64_t *res_postcopiable)
 {
@@ -1216,7 +1216,7 @@ void qemu_savevm_state_pending(QEMUFile *f, uint64_t max_size,
                 continue;
             }
         }
-        se->ops->save_live_pending(f, se->opaque, max_size,
+        se->ops->save_live_pending(f, se->opaque, threshold_size,
                                    res_non_postcopiable, res_postcopiable);
     }
 }
-- 
2.7.4


* [Qemu-devel] [PATCH 3/5] hmp: info migrate_capability format tunes
From: Peter Xu @ 2017-03-27  7:21 UTC (permalink / raw)
  To: qemu-devel; +Cc: Juan Quintela, "Dr. David Alan Gilbert", peterx

Dumping the info in a single line is hard to read. Print one capability
per line. Also, the leading "capabilities:" prefix didn't help much.
Let's remove it.
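
E.g. (sample output; capability list abbreviated), instead of:

  capabilities: xbzrle: off rdma-pin-all: off auto-converge: off ...

we would get:

  xbzrle: off
  rdma-pin-all: off
  auto-converge: off
  ...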

CC: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 hmp.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/hmp.c b/hmp.c
index edb8970..95eef8c 100644
--- a/hmp.c
+++ b/hmp.c
@@ -265,13 +265,11 @@ void hmp_info_migrate_capabilities(Monitor *mon, const QDict *qdict)
     caps = qmp_query_migrate_capabilities(NULL);
 
     if (caps) {
-        monitor_printf(mon, "capabilities: ");
         for (cap = caps; cap; cap = cap->next) {
-            monitor_printf(mon, "%s: %s ",
+            monitor_printf(mon, "%s: %s\n",
                            MigrationCapability_lookup[cap->value->capability],
                            cap->value->state ? "on" : "off");
         }
-        monitor_printf(mon, "\n");
     }
 
     qapi_free_MigrationCapabilityStatusList(caps);
-- 
2.7.4


* [Qemu-devel] [PATCH 4/5] hmp: info migrate_parameters format tunes
From: Peter Xu @ 2017-03-27  7:21 UTC (permalink / raw)
  To: qemu-devel; +Cc: Juan Quintela, "Dr. David Alan Gilbert", peterx

Do the same (one per line) for the parameter list.

CC: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 hmp.c | 22 ++++++++++------------
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/hmp.c b/hmp.c
index 95eef8c..b33e39e 100644
--- a/hmp.c
+++ b/hmp.c
@@ -282,46 +282,44 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
     params = qmp_query_migrate_parameters(NULL);
 
     if (params) {
-        monitor_printf(mon, "parameters:");
         assert(params->has_compress_level);
-        monitor_printf(mon, " %s: %" PRId64,
+        monitor_printf(mon, "%s: %" PRId64 "\n",
             MigrationParameter_lookup[MIGRATION_PARAMETER_COMPRESS_LEVEL],
             params->compress_level);
         assert(params->has_compress_threads);
-        monitor_printf(mon, " %s: %" PRId64,
+        monitor_printf(mon, "%s: %" PRId64 "\n",
             MigrationParameter_lookup[MIGRATION_PARAMETER_COMPRESS_THREADS],
             params->compress_threads);
         assert(params->has_decompress_threads);
-        monitor_printf(mon, " %s: %" PRId64,
+        monitor_printf(mon, "%s: %" PRId64 "\n",
             MigrationParameter_lookup[MIGRATION_PARAMETER_DECOMPRESS_THREADS],
             params->decompress_threads);
         assert(params->has_cpu_throttle_initial);
-        monitor_printf(mon, " %s: %" PRId64,
+        monitor_printf(mon, "%s: %" PRId64 "\n",
             MigrationParameter_lookup[MIGRATION_PARAMETER_CPU_THROTTLE_INITIAL],
             params->cpu_throttle_initial);
         assert(params->has_cpu_throttle_increment);
-        monitor_printf(mon, " %s: %" PRId64,
+        monitor_printf(mon, "%s: %" PRId64 "\n",
             MigrationParameter_lookup[MIGRATION_PARAMETER_CPU_THROTTLE_INCREMENT],
             params->cpu_throttle_increment);
-        monitor_printf(mon, " %s: '%s'",
+        monitor_printf(mon, "%s: '%s'\n",
             MigrationParameter_lookup[MIGRATION_PARAMETER_TLS_CREDS],
             params->has_tls_creds ? params->tls_creds : "");
-        monitor_printf(mon, " %s: '%s'",
+        monitor_printf(mon, "%s: '%s'\n",
             MigrationParameter_lookup[MIGRATION_PARAMETER_TLS_HOSTNAME],
             params->has_tls_hostname ? params->tls_hostname : "");
         assert(params->has_max_bandwidth);
-        monitor_printf(mon, " %s: %" PRId64 " bytes/second",
+        monitor_printf(mon, "%s: %" PRId64 " bytes/second\n",
             MigrationParameter_lookup[MIGRATION_PARAMETER_MAX_BANDWIDTH],
             params->max_bandwidth);
         assert(params->has_downtime_limit);
-        monitor_printf(mon, " %s: %" PRId64 " milliseconds",
+        monitor_printf(mon, "%s: %" PRId64 " milliseconds\n",
             MigrationParameter_lookup[MIGRATION_PARAMETER_DOWNTIME_LIMIT],
             params->downtime_limit);
         assert(params->has_x_checkpoint_delay);
-        monitor_printf(mon, " %s: %" PRId64,
+        monitor_printf(mon, "%s: %" PRId64 "\n",
             MigrationParameter_lookup[MIGRATION_PARAMETER_X_CHECKPOINT_DELAY],
             params->x_checkpoint_delay);
-        monitor_printf(mon, "\n");
     }
 
     qapi_free_MigrationParameters(params);
-- 
2.7.4


* [Qemu-devel] [PATCH 5/5] cpu: throttle: fix throttle time slice
From: Peter Xu @ 2017-03-27  7:21 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, "Dr. David Alan Gilbert",
	peterx, Paolo Bonzini, Richard Henderson, Jason J. Herne

IIUC the throttle idea is that we split a CPU_THROTTLE_TIMESLICE_NS
time slice into two parts - one for the vcpu, one for the throttle
thread (which suspends the vcpu thread with a sleep). However, the
current algorithm for calculating the working piece and the sleeping
piece is strange.

Assume a 99% throttle: what we want is to mostly stop the vcpu from
running, but the old logic first lets the vcpu run for a very long
time (which is "CPU_THROTTLE_TIMESLICE_NS / (1-pct)" = 1 second) before
doing anything else.

Fix it up to the simplest but imho accurate logic.

CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Richard Henderson <rth@twiddle.net>
CC: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 cpus.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/cpus.c b/cpus.c
index 167d961..7976ce4 100644
--- a/cpus.c
+++ b/cpus.c
@@ -633,7 +633,6 @@ static const VMStateDescription vmstate_timers = {
 static void cpu_throttle_thread(CPUState *cpu, run_on_cpu_data opaque)
 {
     double pct;
-    double throttle_ratio;
     long sleeptime_ns;
 
     if (!cpu_throttle_get_percentage()) {
@@ -641,8 +640,7 @@ static void cpu_throttle_thread(CPUState *cpu, run_on_cpu_data opaque)
     }
 
     pct = (double)cpu_throttle_get_percentage()/100;
-    throttle_ratio = pct / (1 - pct);
-    sleeptime_ns = (long)(throttle_ratio * CPU_THROTTLE_TIMESLICE_NS);
+    sleeptime_ns = (long)((1 - pct) * CPU_THROTTLE_TIMESLICE_NS);
 
     qemu_mutex_unlock_iothread();
     atomic_set(&cpu->throttle_thread_scheduled, 0);
@@ -668,7 +666,7 @@ static void cpu_throttle_timer_tick(void *opaque)
 
     pct = (double)cpu_throttle_get_percentage()/100;
     timer_mod(throttle_timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL_RT) +
-                                   CPU_THROTTLE_TIMESLICE_NS / (1-pct));
+                                   CPU_THROTTLE_TIMESLICE_NS * pct);
 }
 
 void cpu_throttle_set(int new_throttle_pct)
-- 
2.7.4


* Re: [Qemu-devel] [PATCH 5/5] cpu: throttle: fix throttle time slice
From: Peter Xu @ 2017-03-27  7:40 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juan Quintela, "Dr. David Alan Gilbert",
	Paolo Bonzini, Richard Henderson, Jason J. Herne

On Mon, Mar 27, 2017 at 03:21:28PM +0800, Peter Xu wrote:
> IIUC the throttle idea is that we split a CPU_THROTTLE_TIMESLICE_NS
> time slice into two parts - one for the vcpu, one for the throttle
> thread (which suspends the vcpu thread with a sleep). However, the
> current algorithm for calculating the working piece and the sleeping
> piece is strange.
> 
> Assume a 99% throttle: what we want is to mostly stop the vcpu from
> running, but the old logic first lets the vcpu run for a very long
> time (which is "CPU_THROTTLE_TIMESLICE_NS / (1-pct)" = 1 second) before
> doing anything else.
> 
> Fix it up to the simplest but imho accurate logic.

Oh, looks like I need to switch the two pct below... :)

> 
> CC: Paolo Bonzini <pbonzini@redhat.com>
> CC: Richard Henderson <rth@twiddle.net>
> CC: Jason J. Herne <jjherne@linux.vnet.ibm.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  cpus.c | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/cpus.c b/cpus.c
> index 167d961..7976ce4 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -633,7 +633,6 @@ static const VMStateDescription vmstate_timers = {
>  static void cpu_throttle_thread(CPUState *cpu, run_on_cpu_data opaque)
>  {
>      double pct;
> -    double throttle_ratio;
>      long sleeptime_ns;
>  
>      if (!cpu_throttle_get_percentage()) {
> @@ -641,8 +640,7 @@ static void cpu_throttle_thread(CPUState *cpu, run_on_cpu_data opaque)
>      }
>  
>      pct = (double)cpu_throttle_get_percentage()/100;
> -    throttle_ratio = pct / (1 - pct);
> -    sleeptime_ns = (long)(throttle_ratio * CPU_THROTTLE_TIMESLICE_NS);
> +    sleeptime_ns = (long)((1 - pct) * CPU_THROTTLE_TIMESLICE_NS);
                             ^^^^^^^^^
                             here it should be "pct", while...

>  
>      qemu_mutex_unlock_iothread();
>      atomic_set(&cpu->throttle_thread_scheduled, 0);
> @@ -668,7 +666,7 @@ static void cpu_throttle_timer_tick(void *opaque)
>  
>      pct = (double)cpu_throttle_get_percentage()/100;
>      timer_mod(throttle_timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL_RT) +
> -                                   CPU_THROTTLE_TIMESLICE_NS / (1-pct));
> +                                   CPU_THROTTLE_TIMESLICE_NS * pct);
                                                                  ^^^
                                                here it should be (1 - pct)
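
In other words, the intended version (my consolidation of the two
fixups above; untested) would read:

    sleeptime_ns = (long)(pct * CPU_THROTTLE_TIMESLICE_NS);

and:

    timer_mod(throttle_timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL_RT) +
                                   CPU_THROTTLE_TIMESLICE_NS * (1 - pct));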

I'll wait for review comments on the raw idea, to see whether I'll need
a repost. Sorry for the confusion.

-- peterx


* Re: [Qemu-devel] [PATCH 5/5] cpu: throttle: fix throttle time slice
From: Paolo Bonzini @ 2017-03-31 16:38 UTC (permalink / raw)
  To: Peter Xu, qemu-devel
  Cc: Juan Quintela, Dr. David Alan Gilbert, Richard Henderson, jjherne



On 27/03/2017 09:21, Peter Xu wrote:
> @@ -641,8 +640,7 @@ static void cpu_throttle_thread(CPUState *cpu, run_on_cpu_data opaque)
>      }
>  
>      pct = (double)cpu_throttle_get_percentage()/100;
> -    throttle_ratio = pct / (1 - pct);
> -    sleeptime_ns = (long)(throttle_ratio * CPU_THROTTLE_TIMESLICE_NS);

Say pct = 0.25, then throttle_ratio = 0.25/0.75 = 1/3.

> +    sleeptime_ns = (long)((1 - pct) * CPU_THROTTLE_TIMESLICE_NS);
>  
>      qemu_mutex_unlock_iothread();
>      atomic_set(&cpu->throttle_thread_scheduled, 0);
> @@ -668,7 +666,7 @@ static void cpu_throttle_timer_tick(void *opaque)
>  
>      pct = (double)cpu_throttle_get_percentage()/100;
>      timer_mod(throttle_timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL_RT) +
> -                                   CPU_THROTTLE_TIMESLICE_NS / (1-pct));

And the timer is running every 1/0.75 = 4/3 * CPU_THROTTLE_TIMESLICE_NS.

Of these, 1/3 of a slice is spent sleeping (3.33 ms), while one full
slice (10 ms) is spent not sleeping.

When pct = 0.75, throttle_ratio = 3 and the timer is running every 4 *
CPU_THROTTLE_TIMESLICE_NS (40 ms).  Of these, 3 slices (30 ms) are spent
sleeping, while 10 ms are spent not sleeping.

The rationale _could_ be (I don't remember) that a CPU with a very high
throttling frequency leaves little time for the migration thread to do
any work.  So QEMU keeps the "on" phase always the same and lengthens
the "off" phase, which as you found out can be unsatisfactory.

However, I think your patch has the opposite problem: the frequency is
constant, but with high throttling all time reserved for the CPU will be
lost in overhead.  For example, at 99% throttling you only have 100
microseconds to wake up, do work and go back to sleep.

So I'm inclined _not_ to take your patch.  One possibility could be to
do the following:

- for throttling between 0% and 80%, use the current algorithm.  At 66%,
the CPU will work for 10 ms and sleep for 40 ms.

- for throttling above 80% adapt your algorithm to have a variable
timeslice, going from 50 ms at 66% to 100 ms at 100%.  This way, the CPU
time will shrink below 10 ms and the sleep time will grow.

It looks like this: http://i.imgur.com/lyFie04.png

So at 99% the timeslice will be 97.5 ms; the CPU will work for 975 us
and sleep for the rest (10x more than with just your patch).  But I'm
not sure it's really worth it.
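
As a rough sketch (untested; the 80% threshold and the 50-100 ms
endpoints are the numbers above), the split could be computed like:

    static void throttle_times(double pct, int64_t *run_ns, int64_t *sleep_ns)
    {
        if (pct <= 0.8) {
            /* current algorithm: fixed 10 ms running piece */
            *run_ns = CPU_THROTTLE_TIMESLICE_NS;
            *sleep_ns = (int64_t)(CPU_THROTTLE_TIMESLICE_NS * pct / (1 - pct));
        } else {
            /* timeslice grows linearly: 50 ms at 80% -> 100 ms at 100% */
            double slice = 50e6 + 250e6 * (pct - 0.8);
            *run_ns = (int64_t)((1 - pct) * slice);   /* 975 us at 99% */
            *sleep_ns = (int64_t)(pct * slice);
        }
    }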

Paolo

> +                                   CPU_THROTTLE_TIMESLICE_NS * pct);
>  }


* Re: [Qemu-devel] [PATCH 1/5] migration: set current_active_state once
From: Dr. David Alan Gilbert @ 2017-03-31 18:50 UTC (permalink / raw)
  To: Peter Xu; +Cc: qemu-devel, Juan Quintela

* Peter Xu (peterx@redhat.com) wrote:
> We set it right above this one. No need to set it twice.

Ah, good spot - yes, it's already set in the declaration of current_active_state.

> CC: Juan Quintela <quintela@redhat.com>
> CC: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  migration/migration.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index 54060f7..f9f4d98 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1938,7 +1938,6 @@ static void *migration_thread(void *opaque)
>      qemu_savevm_state_begin(s->to_dst_file, &s->params);
>  
>      s->setup_time = qemu_clock_get_ms(QEMU_CLOCK_HOST) - setup_start;
> -    current_active_state = MIGRATION_STATUS_ACTIVE;
>      migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
>                        MIGRATION_STATUS_ACTIVE);


Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

>  
> -- 
> 2.7.4
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] [PATCH 2/5] migration: rename max_size to threshold_size
From: Dr. David Alan Gilbert @ 2017-03-31 18:59 UTC (permalink / raw)
  To: Peter Xu; +Cc: qemu-devel, Juan Quintela

* Peter Xu (peterx@redhat.com) wrote:
> In the migration code (especially in migration_thread()), max_size is
> used in many places for the threshold value at which we start the final
> flush and jump to the next stage, dumping everything that remains to
> the destination. However, its name is confusing to first-time readers.
> Let's rename it to "threshold_size" where appropriate and add a comment
> for it. No functional change is made.
> 
> CC: Juan Quintela <quintela@redhat.com>
> CC: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  include/migration/vmstate.h |  3 ++-
>  migration/migration.c       | 17 +++++++++--------
>  migration/savevm.c          |  4 ++--
>  3 files changed, 13 insertions(+), 11 deletions(-)
> 
> diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
> index f2dbf84..dad3984 100644
> --- a/include/migration/vmstate.h
> +++ b/include/migration/vmstate.h
> @@ -56,7 +56,8 @@ typedef struct SaveVMHandlers {
>  
>      /* This runs outside the iothread lock!  */
>      int (*save_live_setup)(QEMUFile *f, void *opaque);
> -    void (*save_live_pending)(QEMUFile *f, void *opaque, uint64_t max_size,
> +    void (*save_live_pending)(QEMUFile *f, void *opaque,
> +                              uint64_t threshold_size,
>                                uint64_t *non_postcopiable_pending,
>                                uint64_t *postcopiable_pending);
>      LoadStateHandler *load_state;
> diff --git a/migration/migration.c b/migration/migration.c
> index f9f4d98..b065fe4 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1907,7 +1907,8 @@ static void *migration_thread(void *opaque)
>      int64_t initial_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
>      int64_t setup_start = qemu_clock_get_ms(QEMU_CLOCK_HOST);
>      int64_t initial_bytes = 0;
> -    int64_t max_size = 0;
> +    /* We'll do the final flush when reachs threshold_size */

I think that's 'reaches' - however perhaps we should make a more
explicit comment:
   'The final stage happens when the remaining data is smaller than
    this threshold; it's calculated from the requested downtime
    and measured bandwidth'
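
(For example - numbers made up - with a measured bandwidth of ~1 GB/s,
i.e. ~10^6 bytes/ms, and a 300 ms downtime limit, the threshold works
out to ~300 MB of remaining data.)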

other than that:
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

Dave

> +    int64_t threshold_size = 0;
>      int64_t start_time = initial_time;
>      int64_t end_time;
>      bool old_vm_running = false;
> @@ -1951,17 +1952,17 @@ static void *migration_thread(void *opaque)
>          if (!qemu_file_rate_limit(s->to_dst_file)) {
>              uint64_t pend_post, pend_nonpost;
>  
> -            qemu_savevm_state_pending(s->to_dst_file, max_size, &pend_nonpost,
> -                                      &pend_post);
> +            qemu_savevm_state_pending(s->to_dst_file, threshold_size,
> +                                      &pend_nonpost, &pend_post);
>              pending_size = pend_nonpost + pend_post;
> -            trace_migrate_pending(pending_size, max_size,
> +            trace_migrate_pending(pending_size, threshold_size,
>                                    pend_post, pend_nonpost);
> -            if (pending_size && pending_size >= max_size) {
> +            if (pending_size && pending_size >= threshold_size) {
>                  /* Still a significant amount to transfer */
>  
>                  if (migrate_postcopy_ram() &&
>                      s->state != MIGRATION_STATUS_POSTCOPY_ACTIVE &&
> -                    pend_nonpost <= max_size &&
> +                    pend_nonpost <= threshold_size &&
>                      atomic_read(&s->start_postcopy)) {
>  
>                      if (!postcopy_start(s, &old_vm_running)) {
> @@ -1993,13 +1994,13 @@ static void *migration_thread(void *opaque)
>                                           initial_bytes;
>              uint64_t time_spent = current_time - initial_time;
>              double bandwidth = (double)transferred_bytes / time_spent;
> -            max_size = bandwidth * s->parameters.downtime_limit;
> +            threshold_size = bandwidth * s->parameters.downtime_limit;
>  
>              s->mbps = (((double) transferred_bytes * 8.0) /
>                      ((double) time_spent / 1000.0)) / 1000.0 / 1000.0;
>  
>              trace_migrate_transferred(transferred_bytes, time_spent,
> -                                      bandwidth, max_size);
> +                                      bandwidth, threshold_size);
>              /* if we haven't sent anything, we don't want to recalculate
>                 10000 is a small enough number for our purposes */
>              if (s->dirty_bytes_rate && transferred_bytes > 10000) {
> diff --git a/migration/savevm.c b/migration/savevm.c
> index 3b19a4a..59c04eb 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -1197,7 +1197,7 @@ void qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only)
>   * the result is split into the amount for units that can and
>   * for units that can't do postcopy.
>   */
> -void qemu_savevm_state_pending(QEMUFile *f, uint64_t max_size,
> +void qemu_savevm_state_pending(QEMUFile *f, uint64_t threshold_size,
>                                 uint64_t *res_non_postcopiable,
>                                 uint64_t *res_postcopiable)
>  {
> @@ -1216,7 +1216,7 @@ void qemu_savevm_state_pending(QEMUFile *f, uint64_t max_size,
>                  continue;
>              }
>          }
> -        se->ops->save_live_pending(f, se->opaque, max_size,
> +        se->ops->save_live_pending(f, se->opaque, threshold_size,
>                                     res_non_postcopiable, res_postcopiable);
>      }
>  }
> -- 
> 2.7.4
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] [PATCH 3/5] hmp: info migrate_capability format tunes
From: Dr. David Alan Gilbert @ 2017-03-31 19:01 UTC (permalink / raw)
  To: Peter Xu; +Cc: qemu-devel, Juan Quintela

* Peter Xu (peterx@redhat.com) wrote:
> Dumping the info in a single line is hard to read. Print one capability
> per line. Also, the leading "capabilities:" prefix didn't help much.
> Let's remove it.
> 
> CC: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>

Agreed; you may find it breaks some people's scripts, but we're allowed
to do that in HMP and it should be more readable, so:

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

> ---
>  hmp.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/hmp.c b/hmp.c
> index edb8970..95eef8c 100644
> --- a/hmp.c
> +++ b/hmp.c
> @@ -265,13 +265,11 @@ void hmp_info_migrate_capabilities(Monitor *mon, const QDict *qdict)
>      caps = qmp_query_migrate_capabilities(NULL);
>  
>      if (caps) {
> -        monitor_printf(mon, "capabilities: ");
>          for (cap = caps; cap; cap = cap->next) {
> -            monitor_printf(mon, "%s: %s ",
> +            monitor_printf(mon, "%s: %s\n",
>                             MigrationCapability_lookup[cap->value->capability],
>                             cap->value->state ? "on" : "off");
>          }
> -        monitor_printf(mon, "\n");
>      }
>  
>      qapi_free_MigrationCapabilityStatusList(caps);
> -- 
> 2.7.4
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] [PATCH 4/5] hmp: info migrate_parameters format tunes
From: Dr. David Alan Gilbert @ 2017-03-31 19:02 UTC (permalink / raw)
  To: Peter Xu; +Cc: qemu-devel, Juan Quintela

* Peter Xu (peterx@redhat.com) wrote:
> Do the same (one per line) for the parameter list.
> 
> CC: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

> ---
>  hmp.c | 22 ++++++++++------------
>  1 file changed, 10 insertions(+), 12 deletions(-)
> 
> diff --git a/hmp.c b/hmp.c
> index 95eef8c..b33e39e 100644
> --- a/hmp.c
> +++ b/hmp.c
> @@ -282,46 +282,44 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
>      params = qmp_query_migrate_parameters(NULL);
>  
>      if (params) {
> -        monitor_printf(mon, "parameters:");
>          assert(params->has_compress_level);
> -        monitor_printf(mon, " %s: %" PRId64,
> +        monitor_printf(mon, "%s: %" PRId64 "\n",
>              MigrationParameter_lookup[MIGRATION_PARAMETER_COMPRESS_LEVEL],
>              params->compress_level);
>          assert(params->has_compress_threads);
> -        monitor_printf(mon, " %s: %" PRId64,
> +        monitor_printf(mon, "%s: %" PRId64 "\n",
>              MigrationParameter_lookup[MIGRATION_PARAMETER_COMPRESS_THREADS],
>              params->compress_threads);
>          assert(params->has_decompress_threads);
> -        monitor_printf(mon, " %s: %" PRId64,
> +        monitor_printf(mon, "%s: %" PRId64 "\n",
>              MigrationParameter_lookup[MIGRATION_PARAMETER_DECOMPRESS_THREADS],
>              params->decompress_threads);
>          assert(params->has_cpu_throttle_initial);
> -        monitor_printf(mon, " %s: %" PRId64,
> +        monitor_printf(mon, "%s: %" PRId64 "\n",
>              MigrationParameter_lookup[MIGRATION_PARAMETER_CPU_THROTTLE_INITIAL],
>              params->cpu_throttle_initial);
>          assert(params->has_cpu_throttle_increment);
> -        monitor_printf(mon, " %s: %" PRId64,
> +        monitor_printf(mon, "%s: %" PRId64 "\n",
>              MigrationParameter_lookup[MIGRATION_PARAMETER_CPU_THROTTLE_INCREMENT],
>              params->cpu_throttle_increment);
> -        monitor_printf(mon, " %s: '%s'",
> +        monitor_printf(mon, "%s: '%s'\n",
>              MigrationParameter_lookup[MIGRATION_PARAMETER_TLS_CREDS],
>              params->has_tls_creds ? params->tls_creds : "");
> -        monitor_printf(mon, " %s: '%s'",
> +        monitor_printf(mon, "%s: '%s'\n",
>              MigrationParameter_lookup[MIGRATION_PARAMETER_TLS_HOSTNAME],
>              params->has_tls_hostname ? params->tls_hostname : "");
>          assert(params->has_max_bandwidth);
> -        monitor_printf(mon, " %s: %" PRId64 " bytes/second",
> +        monitor_printf(mon, "%s: %" PRId64 " bytes/second\n",
>              MigrationParameter_lookup[MIGRATION_PARAMETER_MAX_BANDWIDTH],
>              params->max_bandwidth);
>          assert(params->has_downtime_limit);
> -        monitor_printf(mon, " %s: %" PRId64 " milliseconds",
> +        monitor_printf(mon, "%s: %" PRId64 " milliseconds\n",
>              MigrationParameter_lookup[MIGRATION_PARAMETER_DOWNTIME_LIMIT],
>              params->downtime_limit);
>          assert(params->has_x_checkpoint_delay);
> -        monitor_printf(mon, " %s: %" PRId64,
> +        monitor_printf(mon, "%s: %" PRId64 "\n",
>              MigrationParameter_lookup[MIGRATION_PARAMETER_X_CHECKPOINT_DELAY],
>              params->x_checkpoint_delay);
> -        monitor_printf(mon, "\n");
>      }
>  
>      qapi_free_MigrationParameters(params);
> -- 
> 2.7.4
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] [PATCH 5/5] cpu: throttle: fix throttle time slice
From: Dr. David Alan Gilbert @ 2017-03-31 19:13 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Peter Xu, qemu-devel, Juan Quintela, Richard Henderson, jjherne

Ignoring the details below for a minute, this patch belongs in a separate
series; all the rest of the patches in this set are nice simple ones.

* Paolo Bonzini (pbonzini@redhat.com) wrote:
> 
> 
> On 27/03/2017 09:21, Peter Xu wrote:
> > @@ -641,8 +640,7 @@ static void cpu_throttle_thread(CPUState *cpu, run_on_cpu_data opaque)
> >      }
> >  
> >      pct = (double)cpu_throttle_get_percentage()/100;
> > -    throttle_ratio = pct / (1 - pct);
> > -    sleeptime_ns = (long)(throttle_ratio * CPU_THROTTLE_TIMESLICE_NS);
> 
> Say pct = 0.25, then throttle_ratio = 0.25/0.75 = 1/3.
> 
> > +    sleeptime_ns = (long)((1 - pct) * CPU_THROTTLE_TIMESLICE_NS);
> >  
> >      qemu_mutex_unlock_iothread();
> >      atomic_set(&cpu->throttle_thread_scheduled, 0);
> > @@ -668,7 +666,7 @@ static void cpu_throttle_timer_tick(void *opaque)
> >  
> >      pct = (double)cpu_throttle_get_percentage()/100;
> >      timer_mod(throttle_timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL_RT) +
> > -                                   CPU_THROTTLE_TIMESLICE_NS / (1-pct));
> 
> And the timer is running every 1/0.75 = 4/3 * CPU_THROTTLE_TIMESLICE_NS.
> 
> Of these, 1/3 is spent sleeping (3.33 ms), while 1 (10 ms) is spent not
> sleeping.
> 
> When pct = 0.75, throttle_ratio = 3 and the timer is running every 4 *
> CPU_THROTTLE_TIMESLICE_NS (40 ms).  Of these, 3 slices (30 ms) are spent
> sleeping, while 10 ms are spent not sleeping.
> 
> The rationale _could_ be (I don't remember) that a CPU with a very high
> throttling frequency leaves little time for the migration thread to do
> any work.  So QEMU keeps the "on" phase always the same and lengthens
> the "off" phase, which as you found out can be unsatisfactory.
> 
> However, I think your patch has the opposite problem: the frequency is
> constant, but with high throttling all time reserved for the CPU will be
> lost in overhead.  For example, at 99% throttling you only have 100
> microseconds to wake up, do work and go back to sleep.

Yes, and I'm worried that with the 10ms timeslice there is a lot of
overhead, especially if the timer that wakes you doesn't have resolution
that much finer than that.

> So I'm inclined _not_ to take your patch.  One possibility could be to
> do the following:
> 
> - for throttling between 0% and 80%, use the current algorithm.  At 66%,
> the CPU will work for 10 ms and sleep for 40 ms.
> 
> - for throttling above 80% adapt your algorithm to have a variable
> timeslice, going from 50 ms at 66% to 100 ms at 100%.  This way, the CPU
> time will shrink below 10 ms and the sleep time will grow.

It seems odd to have a threshold like that on something that's supposedly
a linear scale.

> It looks like this: http://i.imgur.com/lyFie04.png
> 
> So at 99% the timeslice will be 97.5 ms; the CPU will work for 975 us
> and sleep for the rest (10x more than with just your patch).  But I'm
> not sure it's really worth it.

Can you really run a CPU for 975us ?

Dave

> Paolo
> 
> > +                                   CPU_THROTTLE_TIMESLICE_NS * pct);
> >  }
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] [PATCH 5/5] cpu: throttle: fix throttle time slice
From: Paolo Bonzini @ 2017-03-31 19:46 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Richard Henderson, jjherne, qemu-devel, Peter Xu, Juan Quintela



> > So I'm inclined _not_ to take your patch.  One possibility could be to
> > do the following:
> > 
> > - for throttling between 0% and 80%, use the current algorithm.  At 66%,
> > the CPU will work for 10 ms and sleep for 40 ms.
> > 
> > - for throttling above 80% adapt your algorithm to have a variable
> > timeslice, going from 50 ms at 66% to 100 ms at 100%.  This way, the CPU
> > time will shrink below 10 ms and the sleep time will grow.

Oops, all the 66% should be 80%.

> It seems odd to have a threshold like that on something that's supposedly
> a linear scale.

I futzed a bit with the threshold until the first derivative of the CPU
time was zero at the threshold, and the result was 80%.  That is, if you
switch before 80%, the CPU time slice can grow to more than 10 ms right
after the threshold, and then start shrinking.

> > It looks like this: http://i.imgur.com/lyFie04.png
> > 
> > So at 99% the timeslice will be 97.5 ms; the CPU will work for 975 us
> > and sleep for the rest (10x more than with just your patch).  But I'm
> > not sure it's really worth it.
> 
> Can you really run a CPU for 975us ?

It's 2-3 million clock cycles, should be doable.

Paolo


* Re: [Qemu-devel] [PATCH 2/5] migration: rename max_size to threshold_size
From: Peter Xu @ 2017-04-01  7:16 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: qemu-devel, Juan Quintela

On Fri, Mar 31, 2017 at 07:59:19PM +0100, Dr. David Alan Gilbert wrote:
> * Peter Xu (peterx@redhat.com) wrote:
> > In the migration code (especially in migration_thread()), max_size is
> > used in many places for the threshold value at which we start the final
> > flush and jump to the next stage, dumping everything that remains to
> > the destination. However, its name is confusing to first-time readers.
> > Let's rename it to "threshold_size" where appropriate and add a comment
> > for it. No functional change is made.
> > 
> > CC: Juan Quintela <quintela@redhat.com>
> > CC: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> >  include/migration/vmstate.h |  3 ++-
> >  migration/migration.c       | 17 +++++++++--------
> >  migration/savevm.c          |  4 ++--
> >  3 files changed, 13 insertions(+), 11 deletions(-)
> > 
> > diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
> > index f2dbf84..dad3984 100644
> > --- a/include/migration/vmstate.h
> > +++ b/include/migration/vmstate.h
> > @@ -56,7 +56,8 @@ typedef struct SaveVMHandlers {
> >  
> >      /* This runs outside the iothread lock!  */
> >      int (*save_live_setup)(QEMUFile *f, void *opaque);
> > -    void (*save_live_pending)(QEMUFile *f, void *opaque, uint64_t max_size,
> > +    void (*save_live_pending)(QEMUFile *f, void *opaque,
> > +                              uint64_t threshold_size,
> >                                uint64_t *non_postcopiable_pending,
> >                                uint64_t *postcopiable_pending);
> >      LoadStateHandler *load_state;
> > diff --git a/migration/migration.c b/migration/migration.c
> > index f9f4d98..b065fe4 100644
> > --- a/migration/migration.c
> > +++ b/migration/migration.c
> > @@ -1907,7 +1907,8 @@ static void *migration_thread(void *opaque)
> >      int64_t initial_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
> >      int64_t setup_start = qemu_clock_get_ms(QEMU_CLOCK_HOST);
> >      int64_t initial_bytes = 0;
> > -    int64_t max_size = 0;
> > +    /* We'll do the final flush when reachs threshold_size */
> 
> I think that's 'reaches' - however perhaps we should make a more
> explicit comment:
>    'The final stage happens when the remaining data is smaller than
>     this threshold; it's calculated from the requested downtime
>     and measured bandwidth'

Yes it looks better. Will "steal" that. :)

> 
> other than that:
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

Thanks!

-- peterx


* Re: [Qemu-devel] [PATCH 5/5] cpu: throttle: fix throttle time slice
From: Peter Xu @ 2017-04-01  7:52 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: qemu-devel, Juan Quintela, Dr. David Alan Gilbert,
	Richard Henderson, jjherne

On Fri, Mar 31, 2017 at 06:38:50PM +0200, Paolo Bonzini wrote:
> 
> 
> On 27/03/2017 09:21, Peter Xu wrote:
> > @@ -641,8 +640,7 @@ static void cpu_throttle_thread(CPUState *cpu, run_on_cpu_data opaque)
> >      }
> >  
> >      pct = (double)cpu_throttle_get_percentage()/100;
> > -    throttle_ratio = pct / (1 - pct);
> > -    sleeptime_ns = (long)(throttle_ratio * CPU_THROTTLE_TIMESLICE_NS);
> 
> Say pct = 0.25, then throttle_ratio = 0.25/0.75 = 1/3.
> 
> > +    sleeptime_ns = (long)((1 - pct) * CPU_THROTTLE_TIMESLICE_NS);
> >  
> >      qemu_mutex_unlock_iothread();
> >      atomic_set(&cpu->throttle_thread_scheduled, 0);
> > @@ -668,7 +666,7 @@ static void cpu_throttle_timer_tick(void *opaque)
> >  
> >      pct = (double)cpu_throttle_get_percentage()/100;
> >      timer_mod(throttle_timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL_RT) +
> > -                                   CPU_THROTTLE_TIMESLICE_NS / (1-pct));
> 
> And the timer is running every 1/0.75 = 4/3 * CPU_THROTTLE_TIMESLICE_NS.
> 
> Of these, 1/3 is spent sleeping (3.33 ms), while 1 (10 ms) is spent not
> sleeping.
> 
> When pct = 0.75, throttle_ratio = 3 and the timer is running every 4 *
> CPU_THROTTLE_TIMESLICE_NS (40 ms).  Of these, 3 slices (30 ms) are spent
> sleeping, while 10 ms are spent not sleeping.
> 
> The rationale _could_ be (I don't remember) that a CPU with a very high
> throttling frequency leaves little time for the migration thread to do
> any work.  So QEMU keeps the "on" phase always the same and lengthens
> the "off" phase, which as you found out can be unsatisfactory.

Yes. Sorry, I must have gotten the math wrong that day. The current
algorithm is correct; it just assumes that the running time is constant
at CPU_THROTTLE_TIMESLICE_NS (10ms). I just didn't realize it at first
glance.
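
(Writing it out: each period is T = CPU_THROTTLE_TIMESLICE_NS / (1 - pct),
of which pct / (1 - pct) * CPU_THROTTLE_TIMESLICE_NS is spent sleeping;
the remainder is exactly CPU_THROTTLE_TIMESLICE_NS, i.e. a (1 - pct)
running share per period, which is the requested throttle.)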

> 
> However, I think your patch has the opposite problem: the frequency is
> constant, but with high throttling all time reserved for the CPU will be
> lost in overhead.  For example, at 99% throttling you only have 100
> microseconds to wake up, do work and go back to sleep.
> 
> So I'm inclined _not_ to take your patch.  One possibility could be to
> do the following:
> 
> - for throttling between 0% and 80%, use the current algorithm.  At 66%,
> the CPU will work for 10 ms and sleep for 40 ms.
> 
> - for throttling above 80% adapt your algorithm to have a variable
> timeslice, going from 50 ms at 66% to 100 ms at 100%.  This way, the CPU
> time will shrink below 10 ms and the sleep time will grow.
> 
> It looks like this: http://i.imgur.com/lyFie04.png
> 
> So at 99% the timeslice will be 97.5 ms; the CPU will work for 975 us
> and sleep for the rest (10x more than with just your patch).  But I'm
> not sure it's really worth it.

Yeah, it may not be that worth it. Thanks for the analysis. :-)

Will drop this patch in next post.

Thanks,

-- peterx


* Re: [Qemu-devel] [PATCH 5/5] cpu: throttle: fix throttle time slice
From: Dr. David Alan Gilbert @ 2017-04-04 15:44 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Richard Henderson, jjherne, qemu-devel, Peter Xu, Juan Quintela

* Paolo Bonzini (pbonzini@redhat.com) wrote:
> 
> 
> > > So I'm inclined _not_ to take your patch.  One possibility could be to
> > > do the following:
> > > 
> > > - for throttling between 0% and 80%, use the current algorithm.  At 66%,
> > > the CPU will work for 10 ms and sleep for 40 ms.
> > > 
> > > - for throttling above 80% adapt your algorithm to have a variable
> > > timeslice, going from 50 ms at 66% to 100 ms at 100%.  This way, the CPU
> > > time will shrink below 10 ms and the sleep time will grow.
> 
> Oops, all the 66% should be 80%.
> 
> > It seems odd to have a threshold like that on something that's supposedly
> > a linear scale.
> 
> I futzed a bit with the threshold until the first derivative of the CPU
> time was zero at the threshold, and the result was 80%.  That is, if you
> switch before 80%, the CPU time slice can grow to more than 10 ms right
> after the threshold, and then start shrinking.
> 
> > > It looks like this: http://i.imgur.com/lyFie04.png
> > > 
> > > So at 99% the timeslice will be 97.5 ms; the CPU will work for 975 us
> > > and sleep for the rest (10x more than with just your patch).  But I'm
> > > not sure it's really worth it.
> > 
> > Can you really run a CPU for 975us ?
> 
> It's 2-3 million clock cycles, should be doable.

I wasn't really worried about the CPU running; I was more worried
about timer resolution in stopping it. If your timer isn't that accurate
in starting/stopping the CPU, then the errors might be so large as to
make that a bit odd.

Dave

> Paolo
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

