From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:34587) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YuaiP-0002tk-3o for qemu-devel@nongnu.org; Tue, 19 May 2015 02:09:22 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YuaiK-0004pY-Oj for qemu-devel@nongnu.org; Tue, 19 May 2015 02:09:20 -0400 Received: from szxga01-in.huawei.com ([58.251.152.64]:56946) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YuaiK-0004p3-0X for qemu-devel@nongnu.org; Tue, 19 May 2015 02:09:16 -0400 Message-ID: <555AD367.3060308@huawei.com> Date: Tue, 19 May 2015 14:08:39 +0800 From: zhanghailiang MIME-Version: 1.0 References: <1427347774-8960-1-git-send-email-zhang.zhanghailiang@huawei.com> <1427347774-8960-24-git-send-email-zhang.zhanghailiang@huawei.com> <20150518164859.GC14465@work-vm> In-Reply-To: <20150518164859.GC14465@work-vm> Content-Type: text/plain; charset="windows-1252"; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [RFC PATCH v4 23/28] COLO: Improve checkpoint efficiency by do additional periodic checkpoint List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: "Dr. David Alan Gilbert" Cc: lizhijian@cn.fujitsu.com, quintela@redhat.com, yunhong.jiang@intel.com, eddie.dong@intel.com, peter.huangpeng@huawei.com, qemu-devel@nongnu.org, arei.gonglei@huawei.com, amit.shah@redhat.com, Yang Hongyang , david@gibson.dropbear.id.au On 2015/5/19 0:48, Dr. David Alan Gilbert wrote: > * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote: >> Besides normal checkpoint which according to the result of net packets >> comparing, We do additional checkpoint periodically, it will reduce the number >> of dirty pages when do one checkpoint, if we don't do checkpoint for a long >> time (This is a special case when the net packets is always consistent). >> >> Signed-off-by: zhanghailiang >> Signed-off-by: Yang Hongyang >> --- >> migration/colo.c | 29 +++++++++++++++++++++-------- >> 1 file changed, 21 insertions(+), 8 deletions(-) >> >> diff --git a/migration/colo.c b/migration/colo.c >> index 9ef4554..da5bc5e 100644 >> --- a/migration/colo.c >> +++ b/migration/colo.c >> @@ -10,6 +10,7 @@ >> * later. See the COPYING file in the top-level directory. >> */ >> >> +#include "qemu/timer.h" >> #include "sysemu/sysemu.h" >> #include "migration/migration-colo.h" >> #include "qemu/error-report.h" >> @@ -32,6 +33,13 @@ >> */ >> #define CHECKPOINT_MIN_PERIOD 100 /* unit: ms */ >> >> +/* >> + * force checkpoint timer: unit ms >> + * this is large because COLO checkpoint will mostly depend on >> + * COLO compare module. >> + */ >> +#define CHECKPOINT_MAX_PEROID 10000 >> + >> enum { >> COLO_READY = 0x46, >> >> @@ -340,14 +348,7 @@ static void *colo_thread(void *opaque) >> proxy_checkpoint_req = colo_proxy_compare(); >> if (proxy_checkpoint_req < 0) { >> goto out; >> - } else if (!proxy_checkpoint_req) { >> - /* >> - * No checkpoint is needed, wait for 1ms and then >> - * check if we need checkpoint again >> - */ >> - g_usleep(1000); >> - continue; >> - } else { >> + } else if (proxy_checkpoint_req) { >> int64_t interval; >> >> current_time = qemu_clock_get_ms(QEMU_CLOCK_HOST); >> @@ -357,8 +358,20 @@ static void *colo_thread(void *opaque) >> g_usleep((1000*(CHECKPOINT_MIN_PERIOD - interval))); >> } >> DPRINTF("Net packets is not consistent!!!\n"); >> + goto do_checkpoint; >> + } >> + >> + /* >> + * No proxy checkpoint is request, wait for 100ms >> + * and then check if we need checkpoint again. >> + */ >> + current_time = qemu_clock_get_ms(QEMU_CLOCK_HOST); >> + if (current_time - checkpoint_time < CHECKPOINT_MAX_PEROID) { >> + g_usleep(100000); >> + continue; > > This 100ms sleep is interesting - can you explain it's purpose; is it > just to save CPU time in the colo thread? It used to be 1ms (above > and in the previous version). > You are right, it is just to save CPU time. I'm not sure but if '1ms' is a little too short ? In our latest patch, we actually will send some dirty pages to slave side during this sleep time if there are dirty pages. > The MIN_PERIOD already stops the checkpoints being too close together, > so this is a separate sleep from that. >> } >> >> +do_checkpoint: >> /* start a colo checkpoint */ >> if (colo_do_checkpoint_transaction(s, colo_control)) { >> goto out; >> -- >> 1.7.12.4 >> >> > -- > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK > > . >