All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH  0/3] migration/colo: Optimize COLO framework code
@ 2020-05-15  4:28 Zhang Chen
  2020-05-15  4:28 ` [PATCH 1/3] migration/colo: Optimize COLO boot code path Zhang Chen
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Zhang Chen @ 2020-05-15  4:28 UTC (permalink / raw)
  To: Dr . David Alan Gilbert, Juan Quintela, Zhanghailiang, qemu-dev
  Cc: Zhang Chen, Jason Wang, Zhang Chen

From: Zhang Chen <chen.zhang@intel.com>

This series optimize some code of COLO, please review.

Zhang Chen (3):
  migration/colo: Optimize COLO boot code path
  migration/colo: Update checkpoint time lately
  migration/colo: Merge multi checkpoint request into one.

 migration/colo.c      | 29 ++++++++++++++++++-----------
 migration/migration.c | 17 ++++++++++-------
 2 files changed, 28 insertions(+), 18 deletions(-)

-- 
2.17.1



^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH  1/3] migration/colo: Optimize COLO boot code path
  2020-05-15  4:28 [PATCH 0/3] migration/colo: Optimize COLO framework code Zhang Chen
@ 2020-05-15  4:28 ` Zhang Chen
  2020-06-02  3:29   ` Zhanghailiang
  2020-05-15  4:28 ` [PATCH 2/3] migration/colo: Update checkpoint time lately Zhang Chen
  2020-05-15  4:28 ` [PATCH 3/3] migration/colo: Merge multi checkpoint request into one Zhang Chen
  2 siblings, 1 reply; 10+ messages in thread
From: Zhang Chen @ 2020-05-15  4:28 UTC (permalink / raw)
  To: Dr . David Alan Gilbert, Juan Quintela, Zhanghailiang, qemu-dev
  Cc: Zhang Chen, Jason Wang, Zhang Chen

From: Zhang Chen <chen.zhang@intel.com>

No need to reuse MIGRATION_STATUS_ACTIVE boot COLO.

Signed-off-by: Zhang Chen <chen.zhang@intel.com>
---
 migration/colo.c      |  2 --
 migration/migration.c | 17 ++++++++++-------
 2 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index d015d4f84e..5ef69b885d 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -669,8 +669,6 @@ void migrate_start_colo_process(MigrationState *s)
                                 colo_checkpoint_notify, s);
 
     qemu_sem_init(&s->colo_exit_sem, 0);
-    migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
-                      MIGRATION_STATUS_COLO);
     colo_process_checkpoint(s);
     qemu_mutex_lock_iothread();
 }
diff --git a/migration/migration.c b/migration/migration.c
index 0bb042a0f7..c889ef6eb7 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2972,7 +2972,10 @@ static void migration_completion(MigrationState *s)
         goto fail_invalidate;
     }
 
-    if (!migrate_colo_enabled()) {
+    if (migrate_colo_enabled()) {
+        migrate_set_state(&s->state, current_active_state,
+                          MIGRATION_STATUS_COLO);
+    } else {
         migrate_set_state(&s->state, current_active_state,
                           MIGRATION_STATUS_COMPLETED);
     }
@@ -3304,12 +3307,7 @@ static void migration_iteration_finish(MigrationState *s)
         migration_calculate_complete(s);
         runstate_set(RUN_STATE_POSTMIGRATE);
         break;
-
-    case MIGRATION_STATUS_ACTIVE:
-        /*
-         * We should really assert here, but since it's during
-         * migration, let's try to reduce the usage of assertions.
-         */
+    case MIGRATION_STATUS_COLO:
         if (!migrate_colo_enabled()) {
             error_report("%s: critical error: calling COLO code without "
                          "COLO enabled", __func__);
@@ -3319,6 +3317,11 @@ static void migration_iteration_finish(MigrationState *s)
          * Fixme: we will run VM in COLO no matter its old running state.
          * After exited COLO, we will keep running.
          */
+    case MIGRATION_STATUS_ACTIVE:
+        /*
+         * We should really assert here, but since it's during
+         * migration, let's try to reduce the usage of assertions.
+         */
         s->vm_was_running = true;
         /* Fallthrough */
     case MIGRATION_STATUS_FAILED:
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH  2/3] migration/colo: Update checkpoint time lately
  2020-05-15  4:28 [PATCH 0/3] migration/colo: Optimize COLO framework code Zhang Chen
  2020-05-15  4:28 ` [PATCH 1/3] migration/colo: Optimize COLO boot code path Zhang Chen
@ 2020-05-15  4:28 ` Zhang Chen
  2020-06-02  6:20   ` Zhanghailiang
  2020-05-15  4:28 ` [PATCH 3/3] migration/colo: Merge multi checkpoint request into one Zhang Chen
  2 siblings, 1 reply; 10+ messages in thread
From: Zhang Chen @ 2020-05-15  4:28 UTC (permalink / raw)
  To: Dr . David Alan Gilbert, Juan Quintela, Zhanghailiang, qemu-dev
  Cc: Zhang Chen, Jason Wang, Zhang Chen

From: Zhang Chen <chen.zhang@intel.com>

Previous operation(like vm_start and replication_start_all) will consume
extra time for first forced synchronization, so reduce it in this patch.

Signed-off-by: Zhang Chen <chen.zhang@intel.com>
---
 migration/colo.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 5ef69b885d..d5bced22cb 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -531,7 +531,6 @@ static void colo_process_checkpoint(MigrationState *s)
 {
     QIOChannelBuffer *bioc;
     QEMUFile *fb = NULL;
-    int64_t current_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
     Error *local_err = NULL;
     int ret;
 
@@ -580,8 +579,8 @@ static void colo_process_checkpoint(MigrationState *s)
     qemu_mutex_unlock_iothread();
     trace_colo_vm_state_change("stop", "run");
 
-    timer_mod(s->colo_delay_timer,
-            current_time + s->parameters.x_checkpoint_delay);
+    timer_mod(s->colo_delay_timer, qemu_clock_get_ms(QEMU_CLOCK_HOST) +
+              s->parameters.x_checkpoint_delay);
 
     while (s->state == MIGRATION_STATUS_COLO) {
         if (failover_get_state() != FAILOVER_STATUS_NONE) {
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 3/3] migration/colo: Merge multi checkpoint request into one.
  2020-05-15  4:28 [PATCH 0/3] migration/colo: Optimize COLO framework code Zhang Chen
  2020-05-15  4:28 ` [PATCH 1/3] migration/colo: Optimize COLO boot code path Zhang Chen
  2020-05-15  4:28 ` [PATCH 2/3] migration/colo: Update checkpoint time lately Zhang Chen
@ 2020-05-15  4:28 ` Zhang Chen
  2020-06-02  6:59   ` Zhanghailiang
  2 siblings, 1 reply; 10+ messages in thread
From: Zhang Chen @ 2020-05-15  4:28 UTC (permalink / raw)
  To: Dr . David Alan Gilbert, Juan Quintela, Zhanghailiang, qemu-dev
  Cc: Zhang Chen, Jason Wang, Zhang Chen

From: Zhang Chen <chen.zhang@intel.com>

When COLO guest occur issues, COLO-compare will catch lots of
different network packet and trigger notification multi times,
force periodic may happen at the same time. So this can be
efficient merge checkpoint request within COLO_CHECKPOINT_INTERVAL.

Signed-off-by: Zhang Chen <chen.zhang@intel.com>
---
 migration/colo.c | 22 ++++++++++++++++------
 1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index d5bced22cb..e6a7d8c6e2 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -47,6 +47,9 @@ static COLOMode last_colo_mode;
 
 #define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024)
 
+/* Default COLO_CHECKPOINT_INTERVAL is 1000 ms */
+#define COLO_CHECKPOINT_INTERVAL 1000
+
 bool migration_in_colo_state(void)
 {
     MigrationState *s = migrate_get_current();
@@ -651,13 +654,20 @@ out:
 void colo_checkpoint_notify(void *opaque)
 {
     MigrationState *s = opaque;
-    int64_t next_notify_time;
+    int64_t now = qemu_clock_get_ms(QEMU_CLOCK_HOST);
 
-    qemu_sem_post(&s->colo_checkpoint_sem);
-    s->colo_checkpoint_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
-    next_notify_time = s->colo_checkpoint_time +
-                    s->parameters.x_checkpoint_delay;
-    timer_mod(s->colo_delay_timer, next_notify_time);
+    /*
+     * When COLO guest occur issues, COLO-compare will catch lots of
+     * different network packet and trigger notification multi times,
+     * force periodic may happen at the same time. So this can be
+     * efficient merge checkpoint request within COLO_CHECKPOINT_INTERVAL.
+     */
+    if (now > s->colo_checkpoint_time + COLO_CHECKPOINT_INTERVAL) {
+        qemu_sem_post(&s->colo_checkpoint_sem);
+        timer_mod(s->colo_delay_timer, now +
+                  s->parameters.x_checkpoint_delay);
+        s->colo_checkpoint_time = now;
+    }
 }
 
 void migrate_start_colo_process(MigrationState *s)
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 10+ messages in thread

* RE: [PATCH  1/3] migration/colo: Optimize COLO boot code path
  2020-05-15  4:28 ` [PATCH 1/3] migration/colo: Optimize COLO boot code path Zhang Chen
@ 2020-06-02  3:29   ` Zhanghailiang
  0 siblings, 0 replies; 10+ messages in thread
From: Zhanghailiang @ 2020-06-02  3:29 UTC (permalink / raw)
  To: Zhang Chen, Dr . David Alan Gilbert, Juan Quintela, qemu-dev
  Cc: Jason Wang, Zhang Chen

Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>

> -----Original Message-----
> From: Zhang Chen [mailto:chen.zhang@intel.com]
> Sent: Friday, May 15, 2020 12:28 PM
> To: Dr . David Alan Gilbert <dgilbert@redhat.com>; Juan Quintela
> <quintela@redhat.com>; Zhanghailiang <zhang.zhanghailiang@huawei.com>;
> qemu-dev <qemu-devel@nongnu.org>
> Cc: Zhang Chen <zhangckid@gmail.com>; Jason Wang
> <jasowang@redhat.com>; Zhang Chen <chen.zhang@intel.com>
> Subject: [PATCH 1/3] migration/colo: Optimize COLO boot code path
> 
> From: Zhang Chen <chen.zhang@intel.com>
> 
> No need to reuse MIGRATION_STATUS_ACTIVE boot COLO.
> 
> Signed-off-by: Zhang Chen <chen.zhang@intel.com>
> ---
>  migration/colo.c      |  2 --
>  migration/migration.c | 17 ++++++++++-------
>  2 files changed, 10 insertions(+), 9 deletions(-)
> 
> diff --git a/migration/colo.c b/migration/colo.c index
> d015d4f84e..5ef69b885d 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -669,8 +669,6 @@ void migrate_start_colo_process(MigrationState *s)
>                                  colo_checkpoint_notify, s);
> 
>      qemu_sem_init(&s->colo_exit_sem, 0);
> -    migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
> -                      MIGRATION_STATUS_COLO);
>      colo_process_checkpoint(s);
>      qemu_mutex_lock_iothread();
>  }
> diff --git a/migration/migration.c b/migration/migration.c index
> 0bb042a0f7..c889ef6eb7 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -2972,7 +2972,10 @@ static void
> migration_completion(MigrationState *s)
>          goto fail_invalidate;
>      }
> 
> -    if (!migrate_colo_enabled()) {
> +    if (migrate_colo_enabled()) {
> +        migrate_set_state(&s->state, current_active_state,
> +                          MIGRATION_STATUS_COLO);
> +    } else {
>          migrate_set_state(&s->state, current_active_state,
>                            MIGRATION_STATUS_COMPLETED);
>      }
> @@ -3304,12 +3307,7 @@ static void
> migration_iteration_finish(MigrationState *s)
>          migration_calculate_complete(s);
>          runstate_set(RUN_STATE_POSTMIGRATE);
>          break;
> -
> -    case MIGRATION_STATUS_ACTIVE:
> -        /*
> -         * We should really assert here, but since it's during
> -         * migration, let's try to reduce the usage of assertions.
> -         */
> +    case MIGRATION_STATUS_COLO:
>          if (!migrate_colo_enabled()) {
>              error_report("%s: critical error: calling COLO code without "
>                           "COLO enabled", __func__); @@ -3319,6
> +3317,11 @@ static void migration_iteration_finish(MigrationState *s)
>           * Fixme: we will run VM in COLO no matter its old running state.
>           * After exited COLO, we will keep running.
>           */
> +    case MIGRATION_STATUS_ACTIVE:
> +        /*
> +         * We should really assert here, but since it's during
> +         * migration, let's try to reduce the usage of assertions.
> +         */
>          s->vm_was_running = true;
>          /* Fallthrough */
>      case MIGRATION_STATUS_FAILED:
> --
> 2.17.1



^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH  2/3] migration/colo: Update checkpoint time lately
  2020-05-15  4:28 ` [PATCH 2/3] migration/colo: Update checkpoint time lately Zhang Chen
@ 2020-06-02  6:20   ` Zhanghailiang
  0 siblings, 0 replies; 10+ messages in thread
From: Zhanghailiang @ 2020-06-02  6:20 UTC (permalink / raw)
  To: Zhang Chen, Dr . David Alan Gilbert, Juan Quintela, qemu-dev
  Cc: Jason Wang, Zhang Chen

Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>

Hmm, How much time it spends on preparing before COLO process ?

> -----Original Message-----
> From: Zhang Chen [mailto:chen.zhang@intel.com]
> Sent: Friday, May 15, 2020 12:28 PM
> To: Dr . David Alan Gilbert <dgilbert@redhat.com>; Juan Quintela
> <quintela@redhat.com>; Zhanghailiang <zhang.zhanghailiang@huawei.com>;
> qemu-dev <qemu-devel@nongnu.org>
> Cc: Zhang Chen <zhangckid@gmail.com>; Jason Wang
> <jasowang@redhat.com>; Zhang Chen <chen.zhang@intel.com>
> Subject: [PATCH 2/3] migration/colo: Update checkpoint time lately
> 
> From: Zhang Chen <chen.zhang@intel.com>
> 
> Previous operation(like vm_start and replication_start_all) will consume
> extra time for first forced synchronization, so reduce it in this patch.
> 
> Signed-off-by: Zhang Chen <chen.zhang@intel.com>
> ---
>  migration/colo.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/migration/colo.c b/migration/colo.c index
> 5ef69b885d..d5bced22cb 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -531,7 +531,6 @@ static void colo_process_checkpoint(MigrationState
> *s)  {
>      QIOChannelBuffer *bioc;
>      QEMUFile *fb = NULL;
> -    int64_t current_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
>      Error *local_err = NULL;
>      int ret;
> 
> @@ -580,8 +579,8 @@ static void colo_process_checkpoint(MigrationState
> *s)
>      qemu_mutex_unlock_iothread();
>      trace_colo_vm_state_change("stop", "run");
> 
> -    timer_mod(s->colo_delay_timer,
> -            current_time + s->parameters.x_checkpoint_delay);
> +    timer_mod(s->colo_delay_timer,
> qemu_clock_get_ms(QEMU_CLOCK_HOST) +
> +              s->parameters.x_checkpoint_delay);
> 
>      while (s->state == MIGRATION_STATUS_COLO) {
>          if (failover_get_state() != FAILOVER_STATUS_NONE) {
> --
> 2.17.1



^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH 3/3] migration/colo: Merge multi checkpoint request into one.
  2020-05-15  4:28 ` [PATCH 3/3] migration/colo: Merge multi checkpoint request into one Zhang Chen
@ 2020-06-02  6:59   ` Zhanghailiang
  2020-06-03  9:11     ` Zhang, Chen
  0 siblings, 1 reply; 10+ messages in thread
From: Zhanghailiang @ 2020-06-02  6:59 UTC (permalink / raw)
  To: Zhang Chen, Dr . David Alan Gilbert, Juan Quintela, qemu-dev
  Cc: Jason Wang, Zhang Chen



> -----Original Message-----
> From: Zhang Chen [mailto:chen.zhang@intel.com]
> Sent: Friday, May 15, 2020 12:28 PM
> To: Dr . David Alan Gilbert <dgilbert@redhat.com>; Juan Quintela
> <quintela@redhat.com>; Zhanghailiang <zhang.zhanghailiang@huawei.com>;
> qemu-dev <qemu-devel@nongnu.org>
> Cc: Zhang Chen <zhangckid@gmail.com>; Jason Wang
> <jasowang@redhat.com>; Zhang Chen <chen.zhang@intel.com>
> Subject: [PATCH 3/3] migration/colo: Merge multi checkpoint request into
> one.
> 
> From: Zhang Chen <chen.zhang@intel.com>
> 
> When COLO guest occur issues, COLO-compare will catch lots of different
> network packet and trigger notification multi times, force periodic may
> happen at the same time. So this can be efficient merge checkpoint request
> within COLO_CHECKPOINT_INTERVAL.
> 
> Signed-off-by: Zhang Chen <chen.zhang@intel.com>
> ---
>  migration/colo.c | 22 ++++++++++++++++------
>  1 file changed, 16 insertions(+), 6 deletions(-)
> 
> diff --git a/migration/colo.c b/migration/colo.c index
> d5bced22cb..e6a7d8c6e2 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -47,6 +47,9 @@ static COLOMode last_colo_mode;
> 
>  #define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024)
> 
> +/* Default COLO_CHECKPOINT_INTERVAL is 1000 ms */ #define
> +COLO_CHECKPOINT_INTERVAL 1000
> +
>  bool migration_in_colo_state(void)
>  {
>      MigrationState *s = migrate_get_current(); @@ -651,13 +654,20 @@
> out:
>  void colo_checkpoint_notify(void *opaque)  {
>      MigrationState *s = opaque;
> -    int64_t next_notify_time;
> +    int64_t now = qemu_clock_get_ms(QEMU_CLOCK_HOST);
> 
> -    qemu_sem_post(&s->colo_checkpoint_sem);
> -    s->colo_checkpoint_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
> -    next_notify_time = s->colo_checkpoint_time +
> -                    s->parameters.x_checkpoint_delay;
> -    timer_mod(s->colo_delay_timer, next_notify_time);
> +    /*
> +     * When COLO guest occur issues, COLO-compare will catch lots of
> +     * different network packet and trigger notification multi times,
> +     * force periodic may happen at the same time. So this can be
> +     * efficient merge checkpoint request within
> COLO_CHECKPOINT_INTERVAL.
> +     */
> +    if (now > s->colo_checkpoint_time + COLO_CHECKPOINT_INTERVAL) {
> +        qemu_sem_post(&s->colo_checkpoint_sem);

It is not right here, this notification should not be controlled by the interval time,
I got what happened here, when multiple checkpoint requires come, this
Colo_delay_time will be added every time and it will be a big value which is not what we want.

Besides, please update this patch based on [PATCH 0/6] colo: migration related bugfixes series which
Has modified the same place.



> +        timer_mod(s->colo_delay_timer, now +
> +                  s->parameters.x_checkpoint_delay);
> +        s->colo_checkpoint_time = now;
> +    }
>  }
> 
>  void migrate_start_colo_process(MigrationState *s)
> --
> 2.17.1



^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH 3/3] migration/colo: Merge multi checkpoint request into one.
  2020-06-02  6:59   ` Zhanghailiang
@ 2020-06-03  9:11     ` Zhang, Chen
  2020-06-03  9:38       ` Zhanghailiang
  0 siblings, 1 reply; 10+ messages in thread
From: Zhang, Chen @ 2020-06-03  9:11 UTC (permalink / raw)
  To: Zhanghailiang, Dr . David Alan Gilbert, Juan Quintela, qemu-dev
  Cc: Jason Wang, Zhang Chen



> -----Original Message-----
> From: Zhanghailiang <zhang.zhanghailiang@huawei.com>
> Sent: Tuesday, June 2, 2020 2:59 PM
> To: Zhang, Chen <chen.zhang@intel.com>; Dr . David Alan Gilbert
> <dgilbert@redhat.com>; Juan Quintela <quintela@redhat.com>; qemu-dev
> <qemu-devel@nongnu.org>
> Cc: Zhang Chen <zhangckid@gmail.com>; Jason Wang
> <jasowang@redhat.com>
> Subject: RE: [PATCH 3/3] migration/colo: Merge multi checkpoint request
> into one.
> 
> 
> 
> > -----Original Message-----
> > From: Zhang Chen [mailto:chen.zhang@intel.com]
> > Sent: Friday, May 15, 2020 12:28 PM
> > To: Dr . David Alan Gilbert <dgilbert@redhat.com>; Juan Quintela
> > <quintela@redhat.com>; Zhanghailiang
> <zhang.zhanghailiang@huawei.com>;
> > qemu-dev <qemu-devel@nongnu.org>
> > Cc: Zhang Chen <zhangckid@gmail.com>; Jason Wang
> > <jasowang@redhat.com>; Zhang Chen <chen.zhang@intel.com>
> > Subject: [PATCH 3/3] migration/colo: Merge multi checkpoint request
> > into one.
> >
> > From: Zhang Chen <chen.zhang@intel.com>
> >
> > When COLO guest occur issues, COLO-compare will catch lots of
> > different network packet and trigger notification multi times, force
> > periodic may happen at the same time. So this can be efficient merge
> > checkpoint request within COLO_CHECKPOINT_INTERVAL.
> >
> > Signed-off-by: Zhang Chen <chen.zhang@intel.com>
> > ---
> >  migration/colo.c | 22 ++++++++++++++++------
> >  1 file changed, 16 insertions(+), 6 deletions(-)
> >
> > diff --git a/migration/colo.c b/migration/colo.c index
> > d5bced22cb..e6a7d8c6e2 100644
> > --- a/migration/colo.c
> > +++ b/migration/colo.c
> > @@ -47,6 +47,9 @@ static COLOMode last_colo_mode;
> >
> >  #define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024)
> >
> > +/* Default COLO_CHECKPOINT_INTERVAL is 1000 ms */ #define
> > +COLO_CHECKPOINT_INTERVAL 1000
> > +
> >  bool migration_in_colo_state(void)
> >  {
> >      MigrationState *s = migrate_get_current(); @@ -651,13 +654,20 @@
> > out:
> >  void colo_checkpoint_notify(void *opaque)  {
> >      MigrationState *s = opaque;
> > -    int64_t next_notify_time;
> > +    int64_t now = qemu_clock_get_ms(QEMU_CLOCK_HOST);
> >
> > -    qemu_sem_post(&s->colo_checkpoint_sem);
> > -    s->colo_checkpoint_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
> > -    next_notify_time = s->colo_checkpoint_time +
> > -                    s->parameters.x_checkpoint_delay;
> > -    timer_mod(s->colo_delay_timer, next_notify_time);
> > +    /*
> > +     * When COLO guest occur issues, COLO-compare will catch lots of
> > +     * different network packet and trigger notification multi times,
> > +     * force periodic may happen at the same time. So this can be
> > +     * efficient merge checkpoint request within
> > COLO_CHECKPOINT_INTERVAL.
> > +     */
> > +    if (now > s->colo_checkpoint_time + COLO_CHECKPOINT_INTERVAL) {
> > +        qemu_sem_post(&s->colo_checkpoint_sem);
> 
> It is not right here, this notification should not be controlled by the interval
> time, I got what happened here, when multiple checkpoint requires come,
> this Colo_delay_time will be added every time and it will be a big value which
> is not what we want.

Not just this, multi checkpoint will spend lots of resource to sync memory from PVM to SVM,
It will make VM stop/start multi times, but for the results are same with one checkpoint. 
So in short time just need one checkpoint, because do checkpoint still need some time...

Thanks
Zhang Chen

> 
> Besides, please update this patch based on [PATCH 0/6] colo: migration
> related bugfixes series which Has modified the same place.
> 
> 
> 
> > +        timer_mod(s->colo_delay_timer, now +
> > +                  s->parameters.x_checkpoint_delay);
> > +        s->colo_checkpoint_time = now;
> > +    }
> >  }
> >
> >  void migrate_start_colo_process(MigrationState *s)
> > --
> > 2.17.1



^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH 3/3] migration/colo: Merge multi checkpoint request into one.
  2020-06-03  9:11     ` Zhang, Chen
@ 2020-06-03  9:38       ` Zhanghailiang
  2020-06-04  6:35         ` Zhang, Chen
  0 siblings, 1 reply; 10+ messages in thread
From: Zhanghailiang @ 2020-06-03  9:38 UTC (permalink / raw)
  To: Zhang, Chen, Dr . David Alan Gilbert, Juan Quintela, qemu-dev
  Cc: Jason Wang, Zhang Chen

> -----Original Message-----
> From: Zhang, Chen [mailto:chen.zhang@intel.com]
> Sent: Wednesday, June 3, 2020 5:11 PM
> To: Zhanghailiang <zhang.zhanghailiang@huawei.com>; Dr . David Alan
> Gilbert <dgilbert@redhat.com>; Juan Quintela <quintela@redhat.com>;
> qemu-dev <qemu-devel@nongnu.org>
> Cc: Zhang Chen <zhangckid@gmail.com>; Jason Wang
> <jasowang@redhat.com>
> Subject: RE: [PATCH 3/3] migration/colo: Merge multi checkpoint request
> into one.
> 
> 
> 
> > -----Original Message-----
> > From: Zhanghailiang <zhang.zhanghailiang@huawei.com>
> > Sent: Tuesday, June 2, 2020 2:59 PM
> > To: Zhang, Chen <chen.zhang@intel.com>; Dr . David Alan Gilbert
> > <dgilbert@redhat.com>; Juan Quintela <quintela@redhat.com>; qemu-dev
> > <qemu-devel@nongnu.org>
> > Cc: Zhang Chen <zhangckid@gmail.com>; Jason Wang
> <jasowang@redhat.com>
> > Subject: RE: [PATCH 3/3] migration/colo: Merge multi checkpoint
> > request into one.
> >
> >
> >
> > > -----Original Message-----
> > > From: Zhang Chen [mailto:chen.zhang@intel.com]
> > > Sent: Friday, May 15, 2020 12:28 PM
> > > To: Dr . David Alan Gilbert <dgilbert@redhat.com>; Juan Quintela
> > > <quintela@redhat.com>; Zhanghailiang
> > <zhang.zhanghailiang@huawei.com>;
> > > qemu-dev <qemu-devel@nongnu.org>
> > > Cc: Zhang Chen <zhangckid@gmail.com>; Jason Wang
> > > <jasowang@redhat.com>; Zhang Chen <chen.zhang@intel.com>
> > > Subject: [PATCH 3/3] migration/colo: Merge multi checkpoint request
> > > into one.
> > >
> > > From: Zhang Chen <chen.zhang@intel.com>
> > >
> > > When COLO guest occur issues, COLO-compare will catch lots of
> > > different network packet and trigger notification multi times, force
> > > periodic may happen at the same time. So this can be efficient merge
> > > checkpoint request within COLO_CHECKPOINT_INTERVAL.
> > >
> > > Signed-off-by: Zhang Chen <chen.zhang@intel.com>
> > > ---
> > >  migration/colo.c | 22 ++++++++++++++++------
> > >  1 file changed, 16 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/migration/colo.c b/migration/colo.c index
> > > d5bced22cb..e6a7d8c6e2 100644
> > > --- a/migration/colo.c
> > > +++ b/migration/colo.c
> > > @@ -47,6 +47,9 @@ static COLOMode last_colo_mode;
> > >
> > >  #define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024)
> > >
> > > +/* Default COLO_CHECKPOINT_INTERVAL is 1000 ms */ #define
> > > +COLO_CHECKPOINT_INTERVAL 1000
> > > +
> > >  bool migration_in_colo_state(void)
> > >  {
> > >      MigrationState *s = migrate_get_current(); @@ -651,13 +654,20
> > > @@
> > > out:
> > >  void colo_checkpoint_notify(void *opaque)  {
> > >      MigrationState *s = opaque;
> > > -    int64_t next_notify_time;
> > > +    int64_t now = qemu_clock_get_ms(QEMU_CLOCK_HOST);
> > >
> > > -    qemu_sem_post(&s->colo_checkpoint_sem);
> > > -    s->colo_checkpoint_time =
> qemu_clock_get_ms(QEMU_CLOCK_HOST);
> > > -    next_notify_time = s->colo_checkpoint_time +
> > > -                    s->parameters.x_checkpoint_delay;
> > > -    timer_mod(s->colo_delay_timer, next_notify_time);
> > > +    /*
> > > +     * When COLO guest occur issues, COLO-compare will catch lots of
> > > +     * different network packet and trigger notification multi times,
> > > +     * force periodic may happen at the same time. So this can be
> > > +     * efficient merge checkpoint request within
> > > COLO_CHECKPOINT_INTERVAL.
> > > +     */
> > > +    if (now > s->colo_checkpoint_time + COLO_CHECKPOINT_INTERVAL)
> {
> > > +        qemu_sem_post(&s->colo_checkpoint_sem);
> >
> > It is not right here, this notification should not be controlled by
> > the interval time, I got what happened here, when multiple checkpoint
> > requires come, this Colo_delay_time will be added every time and it
> > will be a big value which is not what we want.
> 
> Not just this, multi checkpoint will spend lots of resource to sync memory
> from PVM to SVM, It will make VM stop/start multi times, but for the results
> are same with one checkpoint.
> So in short time just need one checkpoint, because do checkpoint still need
> some time...
> 

Yes, this because we use semaphore here, it will be increased multiple times,
And I think Lukas's patch 'migration/colo.c: Use event instead of semaphore' has fixed this problem.
Did you try the qemu upstream which has merged this patch ?

> Thanks
> Zhang Chen
> 
> >
> > Besides, please update this patch based on [PATCH 0/6] colo: migration
> > related bugfixes series which Has modified the same place.
> >
> >
> >
> > > +        timer_mod(s->colo_delay_timer, now +
> > > +                  s->parameters.x_checkpoint_delay);
> > > +        s->colo_checkpoint_time = now;
> > > +    }
> > >  }
> > >
> > >  void migrate_start_colo_process(MigrationState *s)
> > > --
> > > 2.17.1



^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH 3/3] migration/colo: Merge multi checkpoint request into one.
  2020-06-03  9:38       ` Zhanghailiang
@ 2020-06-04  6:35         ` Zhang, Chen
  0 siblings, 0 replies; 10+ messages in thread
From: Zhang, Chen @ 2020-06-04  6:35 UTC (permalink / raw)
  To: Zhanghailiang, Dr . David Alan Gilbert, Juan Quintela, qemu-dev
  Cc: Jason Wang, Zhang Chen



> -----Original Message-----
> From: Zhanghailiang <zhang.zhanghailiang@huawei.com>
> Sent: Wednesday, June 3, 2020 5:39 PM
> To: Zhang, Chen <chen.zhang@intel.com>; Dr . David Alan Gilbert
> <dgilbert@redhat.com>; Juan Quintela <quintela@redhat.com>; qemu-dev
> <qemu-devel@nongnu.org>
> Cc: Zhang Chen <zhangckid@gmail.com>; Jason Wang
> <jasowang@redhat.com>
> Subject: RE: [PATCH 3/3] migration/colo: Merge multi checkpoint request
> into one.
> 
> > -----Original Message-----
> > From: Zhang, Chen [mailto:chen.zhang@intel.com]
> > Sent: Wednesday, June 3, 2020 5:11 PM
> > To: Zhanghailiang <zhang.zhanghailiang@huawei.com>; Dr . David Alan
> > Gilbert <dgilbert@redhat.com>; Juan Quintela <quintela@redhat.com>;
> > qemu-dev <qemu-devel@nongnu.org>
> > Cc: Zhang Chen <zhangckid@gmail.com>; Jason Wang
> <jasowang@redhat.com>
> > Subject: RE: [PATCH 3/3] migration/colo: Merge multi checkpoint
> > request into one.
> >
> >
> >
> > > -----Original Message-----
> > > From: Zhanghailiang <zhang.zhanghailiang@huawei.com>
> > > Sent: Tuesday, June 2, 2020 2:59 PM
> > > To: Zhang, Chen <chen.zhang@intel.com>; Dr . David Alan Gilbert
> > > <dgilbert@redhat.com>; Juan Quintela <quintela@redhat.com>; qemu-
> dev
> > > <qemu-devel@nongnu.org>
> > > Cc: Zhang Chen <zhangckid@gmail.com>; Jason Wang
> > <jasowang@redhat.com>
> > > Subject: RE: [PATCH 3/3] migration/colo: Merge multi checkpoint
> > > request into one.
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: Zhang Chen [mailto:chen.zhang@intel.com]
> > > > Sent: Friday, May 15, 2020 12:28 PM
> > > > To: Dr . David Alan Gilbert <dgilbert@redhat.com>; Juan Quintela
> > > > <quintela@redhat.com>; Zhanghailiang
> > > <zhang.zhanghailiang@huawei.com>;
> > > > qemu-dev <qemu-devel@nongnu.org>
> > > > Cc: Zhang Chen <zhangckid@gmail.com>; Jason Wang
> > > > <jasowang@redhat.com>; Zhang Chen <chen.zhang@intel.com>
> > > > Subject: [PATCH 3/3] migration/colo: Merge multi checkpoint
> > > > request into one.
> > > >
> > > > From: Zhang Chen <chen.zhang@intel.com>
> > > >
> > > > When COLO guest occur issues, COLO-compare will catch lots of
> > > > different network packet and trigger notification multi times,
> > > > force periodic may happen at the same time. So this can be
> > > > efficient merge checkpoint request within
> COLO_CHECKPOINT_INTERVAL.
> > > >
> > > > Signed-off-by: Zhang Chen <chen.zhang@intel.com>
> > > > ---
> > > >  migration/colo.c | 22 ++++++++++++++++------
> > > >  1 file changed, 16 insertions(+), 6 deletions(-)
> > > >
> > > > diff --git a/migration/colo.c b/migration/colo.c index
> > > > d5bced22cb..e6a7d8c6e2 100644
> > > > --- a/migration/colo.c
> > > > +++ b/migration/colo.c
> > > > @@ -47,6 +47,9 @@ static COLOMode last_colo_mode;
> > > >
> > > >  #define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024)
> > > >
> > > > +/* Default COLO_CHECKPOINT_INTERVAL is 1000 ms */ #define
> > > > +COLO_CHECKPOINT_INTERVAL 1000
> > > > +
> > > >  bool migration_in_colo_state(void)  {
> > > >      MigrationState *s = migrate_get_current(); @@ -651,13 +654,20
> > > > @@
> > > > out:
> > > >  void colo_checkpoint_notify(void *opaque)  {
> > > >      MigrationState *s = opaque;
> > > > -    int64_t next_notify_time;
> > > > +    int64_t now = qemu_clock_get_ms(QEMU_CLOCK_HOST);
> > > >
> > > > -    qemu_sem_post(&s->colo_checkpoint_sem);
> > > > -    s->colo_checkpoint_time =
> > qemu_clock_get_ms(QEMU_CLOCK_HOST);
> > > > -    next_notify_time = s->colo_checkpoint_time +
> > > > -                    s->parameters.x_checkpoint_delay;
> > > > -    timer_mod(s->colo_delay_timer, next_notify_time);
> > > > +    /*
> > > > +     * When COLO guest occur issues, COLO-compare will catch lots of
> > > > +     * different network packet and trigger notification multi times,
> > > > +     * force periodic may happen at the same time. So this can be
> > > > +     * efficient merge checkpoint request within
> > > > COLO_CHECKPOINT_INTERVAL.
> > > > +     */
> > > > +    if (now > s->colo_checkpoint_time + COLO_CHECKPOINT_INTERVAL)
> > {
> > > > +        qemu_sem_post(&s->colo_checkpoint_sem);
> > >
> > > It is not right here, this notification should not be controlled by
> > > the interval time, I got what happened here, when multiple
> > > checkpoint requires come, this Colo_delay_time will be added every
> > > time and it will be a big value which is not what we want.
> >
> > Not just this, multi checkpoint will spend lots of resource to sync
> > memory from PVM to SVM, It will make VM stop/start multi times, but
> > for the results are same with one checkpoint.
> > So in short time just need one checkpoint, because do checkpoint still
> > need some time...
> >
> 
> Yes, this because we use semaphore here, it will be increased multiple times,
> And I think Lukas's patch 'migration/colo.c: Use event instead of semaphore'
> has fixed this problem.
> Did you try the qemu upstream which has merged this patch ?

Oh, Thanks reminder, I will drop this patch and rebase on upstream for V2.

Thank
Zhang Chen

> 
> > Thanks
> > Zhang Chen
> >
> > >
> > > Besides, please update this patch based on [PATCH 0/6] colo:
> > > migration related bugfixes series which Has modified the same place.
> > >
> > >
> > >
> > > > +        timer_mod(s->colo_delay_timer, now +
> > > > +                  s->parameters.x_checkpoint_delay);
> > > > +        s->colo_checkpoint_time = now;
> > > > +    }
> > > >  }
> > > >
> > > >  void migrate_start_colo_process(MigrationState *s)
> > > > --
> > > > 2.17.1



^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2020-06-04  6:36 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-15  4:28 [PATCH 0/3] migration/colo: Optimize COLO framework code Zhang Chen
2020-05-15  4:28 ` [PATCH 1/3] migration/colo: Optimize COLO boot code path Zhang Chen
2020-06-02  3:29   ` Zhanghailiang
2020-05-15  4:28 ` [PATCH 2/3] migration/colo: Update checkpoint time lately Zhang Chen
2020-06-02  6:20   ` Zhanghailiang
2020-05-15  4:28 ` [PATCH 3/3] migration/colo: Merge multi checkpoint request into one Zhang Chen
2020-06-02  6:59   ` Zhanghailiang
2020-06-03  9:11     ` Zhang, Chen
2020-06-03  9:38       ` Zhanghailiang
2020-06-04  6:35         ` Zhang, Chen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.