From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:46050)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <dgilbert@redhat.com>) id 1ZxJBU-0006zB-NG
	for qemu-devel@nongnu.org; Fri, 13 Nov 2015 13:34:53 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <dgilbert@redhat.com>) id 1ZxJBO-0003vp-T6
	for qemu-devel@nongnu.org; Fri, 13 Nov 2015 13:34:52 -0500
Received: from mx1.redhat.com ([209.132.183.28]:35445)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <dgilbert@redhat.com>) id 1ZxJBO-0003vb-M3
	for qemu-devel@nongnu.org; Fri, 13 Nov 2015 13:34:46 -0500
Date: Fri, 13 Nov 2015 18:34:40 +0000
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Message-ID: <20151113183440.GO2456@work-vm>
References: <1446551816-15768-1-git-send-email-zhang.zhanghailiang@huawei.com>
	<1446551816-15768-18-git-send-email-zhang.zhanghailiang@huawei.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1446551816-15768-18-git-send-email-zhang.zhanghailiang@huawei.com>
Subject: Re: [Qemu-devel] [PATCH COLO-Frame v10 17/38] COLO: synchronize
 PVM's state to SVM periodically
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: zhanghailiang <zhang.zhanghailiang@huawei.com>
Cc: lizhijian@cn.fujitsu.com, quintela@redhat.com, yunhong.jiang@intel.com, eddie.dong@intel.com, peter.huangpeng@huawei.com, qemu-devel@nongnu.org, arei.gonglei@huawei.com, stefanha@redhat.com, amit.shah@redhat.com

* zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> Do checkpoint periodically, the default interval is 200ms.
> 
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> ---
>  migration/colo.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/migration/colo.c b/migration/colo.c
> index 0efab21..a6791f4 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -11,12 +11,19 @@
>   */
>  
>  #include <unistd.h>
> +#include "qemu/timer.h"
>  #include "sysemu/sysemu.h"
>  #include "migration/colo.h"
>  #include "trace.h"
>  #include "qemu/error-report.h"
>  #include "qemu/sockets.h"
>  
> +/*
> + * checkpoint interval: unit ms
> + * Note: Please change this default value to 10000 when we support hybrid mode.
> + */
> +#define CHECKPOINT_MAX_PEROID 200

Why not put the patch that makes this a configurable parameter before this,
and then we can use it straight away?

>  /* colo buffer */
>  #define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024)
>  
> @@ -183,6 +190,7 @@ out:
>  static void colo_process_checkpoint(MigrationState *s)
>  {
>      QEMUSizedBuffer *buffer = NULL;
> +    int64_t current_time, checkpoint_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
>      int fd, ret = 0;
>  
>      /* Dup the fd of to_dst_file */
> @@ -220,11 +228,17 @@ static void colo_process_checkpoint(MigrationState *s)
>      trace_colo_vm_state_change("stop", "run");
>  
>      while (s->state == MIGRATION_STATUS_COLO) {
> +        current_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
> +        if (current_time - checkpoint_time < CHECKPOINT_MAX_PEROID) {
> +            g_usleep(100000);
> +            continue;
> +        }

I'm a bit concerned about the 100ms wait when the period is 200ms;
depending on how the times work out, couldn't we end up waiting for just
under 300ms? That's a big error, and it gets even weirder when
we make it configurable later.

I don't think we've got a sleep-until, which is a shame; but how
about something like:

   if (current_time - checkpoint_time < CHECKPOINT_MAX_PEROID) {
       int64_t delay_ms;
       /* sleep only for what is left of the period */
       delay_ms = CHECKPOINT_MAX_PEROID - (current_time - checkpoint_time);
       g_usleep(delay_ms * 1000);
   }
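
To put numbers on that, here is a rough standalone sketch (not tested
against the tree). Both helper names are hypothetical and not part of the
patch; they only model the arithmetic and do no real sleeping. The first
computes the remaining delay as in the snippet above, and the clamp for a
host clock that steps backwards is my own assumption. The second models
the fixed 100ms polling from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: how long to sleep (in ms)
 * so that the next checkpoint starts one period after the last one.
 */
static int64_t checkpoint_delay_ms(int64_t now_ms, int64_t last_ms,
                                   int64_t period_ms)
{
    int64_t elapsed_ms = now_ms - last_ms;

    assert(period_ms > 0);
    if (elapsed_ms < 0) {
        /* Assumption: the host clock stepped backwards; wait one period. */
        return period_ms;
    }
    if (elapsed_ms >= period_ms) {
        /* Already late, checkpoint immediately. */
        return 0;
    }
    return period_ms - elapsed_ms;
}

/*
 * Model of the fixed-step polling in the patch: the time at which the loop
 * falls through to the checkpoint, given that it first reads the clock at
 * start_ms and sleeps step_ms on every iteration that is still too early.
 */
static int64_t polled_wakeup_ms(int64_t start_ms, int64_t last_ms,
                                int64_t period_ms, int64_t step_ms)
{
    int64_t now_ms = start_ms;

    assert(step_ms > 0);
    while (now_ms - last_ms < period_ms) {
        now_ms += step_ms;
    }
    return now_ms;
}
```

With a 200ms period and the first check landing 199ms after the last
checkpoint, the polled version falls through at 299ms, while sleeping for
the computed delay gets there at 200ms.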

Dave

>          /* start a colo checkpoint */
>          ret = colo_do_checkpoint_transaction(s, buffer);
>          if (ret < 0) {
>              goto out;
>          }
> +        checkpoint_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
>      }
>  
>  out:
> -- 
> 1.8.3.1
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK