From mboxrd@z Thu Jan  1 00:00:00 1970
From: Lai Jiangshan <laijs@cn.fujitsu.com>
Subject: Re: [PATCH 6/7] remus: implement remus
 replicated	checkpointing disk
Date: Fri, 4 Apr 2014 11:04:57 +0800
Message-ID: <533E2159.6080606@cn.fujitsu.com>
References: <1396436677-10872-9-git-send-email-yanghy@cn.fujitsu.com>	<1396527759-10186-1-git-send-email-laijs@cn.fujitsu.com>	<1396527759-10186-6-git-send-email-laijs@cn.fujitsu.com>
	<CAP8mzPOYNNqywDgWevr7JNycuCEPsLmS6zjuY9UuPcDHyyop1A@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
In-Reply-To: <CAP8mzPOYNNqywDgWevr7JNycuCEPsLmS6zjuY9UuPcDHyyop1A@mail.gmail.com>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: rshriram@cs.ubc.ca
Cc: Ian Campbell <ian.campbell@citrix.com>, FNST-Wen Congyang <wency@cn.fujitsu.com>, Stefano Stabellini <stefano.stabellini@eu.citrix.com>, Andrew Cooper <andrew.cooper3@citrix.com>, Jiang Yunhong <yunhong.jiang@intel.com>, Ian Jackson <ian.jackson@eu.citrix.com>, "<xen-devel@lists.xen.org>" <xen-devel@lists.xen.org>, Dong Eddie <eddie.dong@intel.com>, FNST-Yang Hongyang <yanghy@cn.fujitsu.com>, Roger Pau Monne <roger.pau@citrix.com>
List-Id: xen-devel@lists.xenproject.org

On 04/04/2014 12:41 AM, Shriram Rajagopalan wrote:
>> @@ -1463,7 +1468,10 @@ static int libxl__remus_domain_resume_callback(void *data)
>>     if (libxl__domain_resume(gc, dss->domid, /* Fast Suspend */1))
>>         return 0;
>>
>> -    /* REMUS TODO: Deal with disk. */
>> +    /* Deal with disk. */
>> +    if (libxl__remus_disk_preresume(dss->remus_state))
>> +        return 0;
>> +
>>     return 1;
>> }
>>
> 
> Bug. I think I mentioned this last time. Disk needs to be resumed before the
> domain is resumed. Just move the domain resume call below the above
> code snippet.
> 
> 
>> +typedef struct libxl__remus_disk_type {
>> +    /* checkpointing */
>> +    int (*postsuspend)(libxl__remus_disk *remus_disk);
>> +    int (*preresume)(libxl__remus_disk *remus_disk);
>> +    int (*commit)(libxl__remus_disk *remus_disk);
>> +
>> +    /*
>> +     * Return value:
>> +     *   1: the disk is not this type or the script is still running
>> +     *   0: the disk is this type
>> +     *  -1: error
>> +     */
>> +    int (*match)(libxl__domain_suspend_state *dss,
>> +                 const libxl_device_disk *disk,
>> +                 libxl_async_exec *async_exec,
>> +                 void *disk_state);
>> +
>> +    /*
>> +     * This is synchronous callback. Return value:
>> +     *  0: setup is done
>> +     * -1: error
>> +     *
>> +     */
>> +    int (*setup)(libxl__remus_disk *remus_disk);
>> +
>> +    /*
>> +     * Return value:
>> +     *   1: the script is still running
>> +     *   0: the script is done
>> +     *  -1: error
>> +     */
>> +    int (*teardown)(libxl__remus_disk *remus_disk,
>> +                    libxl_async_exec *async_exec);
>> +
>> +    /* the size of the private data */
>> +    int size;
>> +} libxl__remus_disk_type;
>> +
> 
> 
> This vtable approach is neat. I am fine with the current disk
> checkpoint approach you have taken.
> 
> Something that might be worth thinking about:
> The old remus code used this approach for both the disk and network buffering.
> Given that this code is going in a similar direction, I suggest
> hoisting this structure
> up to an abstract buffer type, with setup, teardown, postsuspend, preresume and
> commit callbacks.
> 
> For disks, semantically,
> setup [..]
> teardown [..]
> postsuspend [start flushing buffered writes to backup host]
> preresume [wait until all writes have been flushed to backup host]
> commit  [no-op]
> 
> For network devices, semantically,
> setup [..]
> teardown [..]
> postsuspend [no-op]
> preresume [start_new_epoch - libnl call]
> commit [release_prev_epoch - libnl call]
> 
> This way, in domain_suspend_done, the only thing we need to do is
> foreach remus buffer
>  buffer.postsuspend()
> 
> Similarly, in resume_callback()
> 
> foreach remus buffer
>  buffer.preresume()
> domain_resume()
> 
> 
> in remus_checkpoint_dm_saved()
>  foreach remus buffer
>   buffer.commit()
> 
> Lai, I can take an crack at it if you would like.
> 

Your idea is great, I look forward to your work on it.

> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
> .
>