From mboxrd@z Thu Jan 1 00:00:00 1970 From: Lai Jiangshan Subject: Re: [PATCH 6/7] remus: implement remus replicated checkpointing disk Date: Fri, 4 Apr 2014 11:04:57 +0800 Message-ID: <533E2159.6080606@cn.fujitsu.com> References: <1396436677-10872-9-git-send-email-yanghy@cn.fujitsu.com> <1396527759-10186-1-git-send-email-laijs@cn.fujitsu.com> <1396527759-10186-6-git-send-email-laijs@cn.fujitsu.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: rshriram@cs.ubc.ca Cc: Ian Campbell , FNST-Wen Congyang , Stefano Stabellini , Andrew Cooper , Jiang Yunhong , Ian Jackson , "" , Dong Eddie , FNST-Yang Hongyang , Roger Pau Monne List-Id: xen-devel@lists.xenproject.org On 04/04/2014 12:41 AM, Shriram Rajagopalan wrote: >> @@ -1463,7 +1468,10 @@ static int libxl__remus_domain_resume_callback(void *data) >> if (libxl__domain_resume(gc, dss->domid, /* Fast Suspend */1)) >> return 0; >> >> - /* REMUS TODO: Deal with disk. */ >> + /* Deal with disk. */ >> + if (libxl__remus_disk_preresume(dss->remus_state)) >> + return 0; >> + >> return 1; >> } >> > > Bug. I think I mentioned this last time. Disk needs to be resumed before the > domain is resumed. Just move the domain resume call below the above > code snippet. > > >> +typedef struct libxl__remus_disk_type { >> + /* checkpointing */ >> + int (*postsuspend)(libxl__remus_disk *remus_disk); >> + int (*preresume)(libxl__remus_disk *remus_disk); >> + int (*commit)(libxl__remus_disk *remus_disk); >> + >> + /* >> + * Return value: >> + * 1: the disk is not this type or the script is still running >> + * 0: the disk is this type >> + * -1: error >> + */ >> + int (*match)(libxl__domain_suspend_state *dss, >> + const libxl_device_disk *disk, >> + libxl_async_exec *async_exec, >> + void *disk_state); >> + >> + /* >> + * This is synchronous callback. Return value: >> + * 0: setup is done >> + * -1: error >> + * >> + */ >> + int (*setup)(libxl__remus_disk *remus_disk); >> + >> + /* >> + * Return value: >> + * 1: the script is still running >> + * 0: the script is done >> + * -1: error >> + */ >> + int (*teardown)(libxl__remus_disk *remus_disk, >> + libxl_async_exec *async_exec); >> + >> + /* the size of the private data */ >> + int size; >> +} libxl__remus_disk_type; >> + > > > This vtable approach is neat. I am fine with the current disk > checkpoint approach you have taken. > > Something that might be worth thinking about: > The old remus code used this approach for both the disk and network buffering. > Given that this code is going in a similar direction, I suggest > hoisting this structure > up to an abstract buffer type, with setup, teardown, postsuspend, preresume and > commit callbacks. > > For disks, semantically, > setup [..] > teardown [..] > postsuspend [start flushing buffered writes to backup host] > preresume [wait until all writes have been flushed to backup host] > commit [no-op] > > For network devices, semantically, > setup [..] > teardown [..] > postsuspend [no-op] > preresume [start_new_epoch - libnl call] > commit [release_prev_epoch - libnl call] > > This way, in domain_suspend_done, the only thing we need to do is > foreach remus buffer > buffer.postsuspend() > > Similarly, in resume_callback() > > foreach remus buffer > buffer.preresume() > domain_resume() > > > in remus_checkpoint_dm_saved() > foreach remus buffer > buffer.commit() > > Lai, I can take an crack at it if you would like. > Your idea is great, I look forward to your work on it. > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel > . >