From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:45374)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1ZybLb-0002hV-GX for qemu-devel@nongnu.org;
	Tue, 17 Nov 2015 03:10:40 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1ZybLY-0005J3-9e for qemu-devel@nongnu.org;
	Tue, 17 Nov 2015 03:10:39 -0500
Received: from szxga02-in.huawei.com ([119.145.14.65]:55550)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1ZybLX-000590-FY for qemu-devel@nongnu.org;
	Tue, 17 Nov 2015 03:10:36 -0500
References: <1446551816-15768-1-git-send-email-zhang.zhanghailiang@huawei.com>
	<1446551816-15768-19-git-send-email-zhang.zhanghailiang@huawei.com>
	<5646170D.7090301@redhat.com>
From: zhanghailiang
Message-ID: <564ADF41.5080600@huawei.com>
Date: Tue, 17 Nov 2015 16:03:13 +0800
MIME-Version: 1.0
In-Reply-To: <5646170D.7090301@redhat.com>
Content-Type: text/plain; charset="utf-8"; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH COLO-Frame v10 18/38] COLO failover: Introduce a new command to trigger a failover
To: Eric Blake , qemu-devel@nongnu.org
Cc: lizhijian@cn.fujitsu.com, quintela@redhat.com, Markus Armbruster ,
	yunhong.jiang@intel.com, eddie.dong@intel.com,
	peter.huangpeng@huawei.com, dgilbert@redhat.com,
	arei.gonglei@huawei.com, stefanha@redhat.com, amit.shah@redhat.com,
	Luiz Capitulino

On 2015/11/14 0:59, Eric Blake wrote:
> On 11/03/2015 04:56 AM, zhanghailiang wrote:
>> We leave users to choose whatever heartbeat solution they want, if the heartbeat
>> is lost, or other errors they detect, they can use experimental command
>> 'x_colo_lost_heartbeat' to tell COLO to do failover, COLO will do operations
>> accordingly.
>>
>> For example, if the command is sent to the PVM, the Primary side will
>> exit COLO mode and take over operation.
>> If sent to the Secondary, the
>> secondary will run failover work, then take over server operation to
>> become the new Primary.
>>
>> Cc: Luiz Capitulino
>> Cc: Eric Blake
>> Cc: Markus Armbruster
>> Signed-off-by: zhanghailiang
>> Signed-off-by: Li Zhijian
>> ---
>> v10: Rename command colo_lost_hearbeat to experimental 'x_colo_lost_heartbeat'
>> ---
>
>> @@ -29,6 +30,9 @@ bool migration_incoming_enable_colo(void);
>>  void migration_incoming_exit_colo(void);
>>  void *colo_process_incoming_thread(void *opaque);
>>  bool migration_incoming_in_colo_state(void);
>> +
>> +int get_colo_mode(void);
>
> Should this return an enum type instead of an int?
>
>
>> +++ b/migration/colo-comm.c
>> @@ -20,6 +20,17 @@ typedef struct {
>>
>>  static COLOInfo colo_info;
>>
>> +int get_colo_mode(void)
>> +{
>> +    if (migration_in_colo_state()) {
>> +        return COLO_MODE_PRIMARY;
>> +    } else if (migration_incoming_in_colo_state()) {
>> +        return COLO_MODE_SECONDARY;
>> +    } else {
>> +        return COLO_MODE_UNKNOWN;
>> +    }
>> +}
>
> Particularly since it is always returning values of the same enum.
>
> Not fatal to the patch, so much as a style issue.
>

Seems reasonable. I will fix it in the next version.

>
>> +void qmp_x_colo_lost_heartbeat(Error **errp)
>> +{
>> +    if (get_colo_mode() == COLO_MODE_UNKNOWN) {
>> +        error_setg(errp, QERR_FEATURE_DISABLED, "colo");
>> +        return;
>
> We've slowly been trying to get rid of QERR_ usage. But you aren't the
> first user, and a global cleanup may be better. So I can overlook it for
> now.
>

Yes, there are still several places in QEMU that use 'QERR_FEATURE_DISABLED'.
How should they be cleaned up? Should I change this one to
error_setg(errp, "COLO feature is not enabled")?
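
Combined with the enum return type suggested above, the change could look roughly like the sketch below. It is only an illustration that compiles on its own: the Error struct and demo_error_setg() are simplified stand-ins I made up for QEMU's real error API, the two state flags are hypothetical placeholders for the real migration state checks, and the C enum is written out by hand rather than generated from the schema.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hand-written C view of the COLOMode enum from the patch's schema. */
typedef enum COLOMode {
    COLO_MODE_UNKNOWN,
    COLO_MODE_PRIMARY,
    COLO_MODE_SECONDARY,
} COLOMode;

/* Hypothetical placeholders for the real migration state queries. */
static bool outgoing_in_colo;
static bool incoming_in_colo;

static bool migration_in_colo_state(void) { return outgoing_in_colo; }
static bool migration_incoming_in_colo_state(void) { return incoming_in_colo; }

/* Simplified stand-in for QEMU's Error object: just a message buffer. */
typedef struct Error {
    char msg[64];
} Error;

/* Stand-in for error_setg(): store a plain message if the caller wants one. */
static void demo_error_setg(Error **errp, const char *msg)
{
    static Error err;

    if (errp) {
        snprintf(err.msg, sizeof(err.msg), "%s", msg);
        *errp = &err;
    }
}

/* Returns the enum type rather than int, as suggested in review. */
COLOMode get_colo_mode(void)
{
    if (migration_in_colo_state()) {
        return COLO_MODE_PRIMARY;
    } else if (migration_incoming_in_colo_state()) {
        return COLO_MODE_SECONDARY;
    } else {
        return COLO_MODE_UNKNOWN;
    }
}

void qmp_x_colo_lost_heartbeat(Error **errp)
{
    if (get_colo_mode() == COLO_MODE_UNKNOWN) {
        /* Plain message instead of the QERR_FEATURE_DISABLED macro. */
        demo_error_setg(errp, "COLO feature is not enabled");
        return;
    }
    /* The actual failover request would be issued here. */
}
```

With neither flag set, the command reports the plain error message; once one side is in COLO state, no error is set.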
>> +++ b/qapi-schema.json
>> @@ -734,6 +734,32 @@
>>             'checkpoint-reply', 'vmstate-send', 'vmstate-size',
>>             'vmstate-received', 'vmstate-loaded' ] }
>>
>> +##
>> +# @COLOMode
>> +#
>> +# The colo mode
>> +#
>> +# @unknown: unknown mode
>> +#
>> +# @primary: master side
>> +#
>> +# @secondary: slave side
>> +#
>> +# Since: 2.5
>> +##
>> +{ 'enum': 'COLOMode',
>> +  'data': [ 'unknown', 'primary', 'secondary'] }
>> +
>> +##
>> +# @x-colo-lost-heartbeat
>> +#
>> +# Tell qemu that heartbeat is lost, request it to do takeover procedures.
>> +#
>
> The docs here are rather short, compared to your commit message (in
> particular, the fact that it causes a different action depending on
> whether it is sent to primary [takeover] or secondary [failover]).
>

OK, I will add more comments here. Thanks.

>> +# Since: 2.5
>
> 2.6 in both places.
>
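
On the documentation point above, the mode-dependent behaviour from the commit message can be summarised roughly as in the sketch below. This is only an illustration of what the expanded docs need to say: colo_lost_heartbeat_action() is a hypothetical helper that is not part of the patch, and the enum is written out by hand so the snippet stands alone.

```c
#include <stddef.h>

/* Hand-written C view of the COLOMode enum from the schema above. */
typedef enum COLOMode {
    COLO_MODE_UNKNOWN,
    COLO_MODE_PRIMARY,
    COLO_MODE_SECONDARY,
} COLOMode;

/*
 * Hypothetical helper: describe what x-colo-lost-heartbeat does,
 * depending on which side receives the command.
 */
const char *colo_lost_heartbeat_action(COLOMode mode)
{
    switch (mode) {
    case COLO_MODE_PRIMARY:
        /* Sent to the PVM: the primary leaves COLO mode and carries on. */
        return "exit COLO mode and take over operation";
    case COLO_MODE_SECONDARY:
        /* Sent to the secondary: it does failover work, then takes over. */
        return "run failover work, then take over as the new primary";
    case COLO_MODE_UNKNOWN:
    default:
        /* Not in COLO mode: the command only reports an error. */
        return NULL;
    }
}
```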