From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrew Cooper
Subject: Re: [PATCH v4 --for 4.6 COLOPre 20/25] tools/libx{l, c}: add back channel to libxc
Date: Wed, 15 Jul 2015 14:21:51 +0100
Message-ID: <55A65E6F.4090400@citrix.com>
References: <1436946351-21118-1-git-send-email-yanghy@cn.fujitsu.com> <1436946351-21118-21-git-send-email-yanghy@cn.fujitsu.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <1436946351-21118-21-git-send-email-yanghy@cn.fujitsu.com>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Yang Hongyang , xen-devel@lists.xen.org
Cc: wei.liu2@citrix.com, ian.campbell@citrix.com, wency@cn.fujitsu.com, ian.jackson@eu.citrix.com, yunhong.jiang@intel.com, eddie.dong@intel.com, guijianfeng@cn.fujitsu.com, rshriram@cs.ubc.ca
List-Id: xen-devel@lists.xenproject.org

On 15/07/15 08:45, Yang Hongyang wrote:
> In COLO mode, both VMs are running, and are considered in sync if the
> visible network traffic is identical. After some time, they fall out of
> sync.
>
> At this point, the two VMs have definitely diverged. Let's call the
> primary dirty bitmap set A, while the secondary dirty bitmap set B.
>
> Sets A and B are different.
>
> Under normal migration, the page data for set A will be sent from the
> primary to the secondary.
>
> However, the set difference B - A (let's call this C) is out-of-date on
> the secondary (with respect to the primary) and will not be sent by the
> primary, as it was not memory dirtied by the primary. The secondary
> needs the page data for C to reconstruct an exact copy of the primary at
> the checkpoint.
>
> The secondary cannot calculate C as it doesn't know A. Instead, the
> secondary must send B to the primary, at which point the primary
> calculates the union of A and B (let's call this D) which is all the
> pages dirtied by both the primary and the secondary, and sends all page
> data covered by D.
>
> In the general case, D is a superset of both A and B. Without the
> backchannel dirty bitmap, a COLO checkpoint can't reconstruct a valid
> copy of the primary.
>
> We transfer the dirty bitmap on libxc side, so we need to introduce back
> channel to libxc.
>
> Signed-off-by: Yang Hongyang
> commit message:
> Signed-off-by: Andrew Cooper
> CC: Ian Campbell
> CC: Ian Jackson
> CC: Wei Liu
> ---
>  tools/libxc/include/xenguest.h   |  8 ++++----
>  tools/libxc/xc_domain_restore.c  |  4 ++--
>  tools/libxc/xc_domain_save.c     |  4 ++--
>  tools/libxc/xc_sr_restore.c      |  2 +-
>  tools/libxc/xc_sr_save.c         |  2 +-
>  tools/libxl/libxl_save_callout.c | 39 ++++++++++++++++++++++++++-------------
>  tools/libxl/libxl_save_helper.c  |  8 ++++++--
>  7 files changed, 42 insertions(+), 25 deletions(-)

You have not patched xc_nomigrate.c, which means this will break the ARM
build. (I fell into the same trap, requiring c/s f50fe3a5 as a fixup).

Having said that, I plan to throw together some cleanup patches removing
files like xc_domain_{save,restore}.c and dropping most of the
parameters from the parameter list, as they are superfluous.

I will try to get my cleanup done shortly, which should make this prereq
series easier, although I am focusing on some hypervisor side fixes
right at the moment.

~Andrew
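
For reference, the set reasoning in the quoted commit message can be
sketched in a few lines of C. This is only an illustration under stated
assumptions: the helper names (bitmap_union, bitmap_difference,
bitmap_is_subset) are hypothetical and are not libxc functions, and the
bitmaps are assumed to be plain arrays of unsigned long with one bit per
guest page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helpers, NOT part of libxc. Each bitmap is an array of
 * nr_words unsigned longs, one bit per guest page (an assumption made
 * for this sketch only).
 */

/* out = a | b: the set D the primary must send (union of A and B). */
void bitmap_union(unsigned long *out, const unsigned long *a,
                  const unsigned long *b, size_t nr_words)
{
    assert(out && a && b);
    for (size_t i = 0; i < nr_words; i++)
        out[i] = a[i] | b[i];
}

/* out = b & ~a: the set C = B - A, dirtied only on the secondary. */
void bitmap_difference(unsigned long *out, const unsigned long *b,
                       const unsigned long *a, size_t nr_words)
{
    assert(out && a && b);
    for (size_t i = 0; i < nr_words; i++)
        out[i] = b[i] & ~a[i];
}

/* Returns true if every bit set in sub is also set in super. */
bool bitmap_is_subset(const unsigned long *sub, const unsigned long *super,
                      size_t nr_words)
{
    assert(sub && super);
    for (size_t i = 0; i < nr_words; i++)
        if (sub[i] & ~super[i])
            return false;
    return true;
}
```

With A covering pages {0, 1, 3} and B covering pages {1, 2, 4}, C comes
out as pages {2, 4}, which the primary would never send based on A alone,
and D covers pages 0 to 4. Only the primary can compute D, since it is
the only side that holds both A and the B received over the back channel.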