From mboxrd@z Thu Jan  1 00:00:00 1970
From: Dor Laor <dlaor@redhat.com>
Subject: Re: [Qemu-devel] KVM call agenda for June 28
Date: Tue, 05 Jul 2011 18:04:34 +0300
Message-ID: <4E132802.8080300@redhat.com>
References: <BANLkTim1R+n7V3oUBqnsGD97rJNfN9nRdw@mail.gmail.com> <20110629154134.GA6631@amt.cnet> <BANLkTin-7hkUnMHJN9jUY87m8Y=fHS_GYA@mail.gmail.com> <20110630143620.GA4366@amt.cnet> <4E0C8D90.8050305@redhat.com> <20110630183829.GA8752@amt.cnet> <4E12C4F5.9000100@redhat.com> <CAJSP0QXLJBZn_3RfCWJCBE8-6LMd2jn4gJ3DqM8VHG3gg+85iQ@mail.gmail.com> <20110705125858.GA21254@amt.cnet> <4E1313FA.1060905@redhat.com> <20110705143230.GA22955@amt.cnet>
Reply-To: dlaor@redhat.com
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Stefan Hajnoczi <stefanha@gmail.com>,
	Kevin Wolf <kwolf@redhat.com>,
	Chris Wright <chrisw@redhat.com>,
	KVM devel mailing list <kvm@vger.kernel.org>,
	quintela@redhat.com, jes sorensen <jes.sorensen@redhat.com>,
	qemu-devel@nongnu.org, Avi Kivity <avi@redhat.com>
To: Marcelo Tosatti <mtosatti@redhat.com>
Return-path: <kvm-owner@vger.kernel.org>
Received: from mx1.redhat.com ([209.132.183.28]:5527 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1755068Ab1GEPEl (ORCPT <rfc822;kvm@vger.kernel.org>);
	Tue, 5 Jul 2011 11:04:41 -0400
In-Reply-To: <20110705143230.GA22955@amt.cnet>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

On 07/05/2011 05:32 PM, Marcelo Tosatti wrote:
> On Tue, Jul 05, 2011 at 04:39:06PM +0300, Dor Laor wrote:
>> On 07/05/2011 03:58 PM, Marcelo Tosatti wrote:
>>> On Tue, Jul 05, 2011 at 01:40:08PM +0100, Stefan Hajnoczi wrote:
>>>> On Tue, Jul 5, 2011 at 9:01 AM, Dor Laor<dlaor@redhat.com>   wrote:
>>>>> I tried to re-arrange all of the requirements and use cases using this wiki
>>>>> page: http://wiki.qemu.org/Features/LiveBlockMigration
>>>>>
>>>>> It would be the best to agree upon the most interesting use cases (while we
>>>>> make sure we cover future ones) and agree to them.
>>>>> The next step is to set the interface for all the various verbs since the
>>>>> implementation seems to be converging.
>>>>
>>>> Live block copy was supposed to support snapshot merge.  I think the
>>>> current favored approach is to make the source image a backing file to
>>>> the destination image and essentially do image streaming.
>>>>
>>>> Using this mechanism for snapshot merge is tricky.  The COW file
>>>> already uses the read-only snapshot base image.  So now we cannot
>>>> trivially copy the COW file contents back into the snapshot base image
>>>> using live block copy.
>>>
>>> It never did. Live copy creates a new image where both snapshot and
>>> "current" are copied to.
>>>
>>> This is similar to image streaming.
>>
>> Not sure I see what's bad about doing an in-place merge:
>>
>> Let's suppose we have this COW chain:
>>
>>    base<-- s1<-- s2
>>
>> Now a live snapshot is created over s2, s2 becomes RO and s3 is RW:
>>
>>    base<-- s1<-- s2<-- s3
>>
>> Now we're done with s2 (post backup) and would like to merge s3 into s2.
>>
>> With your approach we use live copy of s3 into newSnap:
>>
>>    base<-- s1<-- s2<-- s3
>>    base<-- s1<-- newSnap
>>
>> When it is over, s2 and s3 can be erased.
>> The downside is the IO for copying the s2 data and the temporary
>> storage. I guess temp storage is cheap but excessive IO is
>> expensive.
>>
>> My approach was to collapse s3 into s2 and erase s3 eventually:
>>
>> before: base<-- s1<-- s2<-- s3
>> after:  base<-- s1<-- s2
>>
>> If we use live block copy using mirror driver it should be safe as
>> long as we keep the ordering of new writes into s3 during the
>> execution.
>> Even a failure in the middle won't cause harm since the
>> management will keep using s3 until it gets success event.
>
> Well, it is more complicated than simply streaming into a new
> image. I'm not entirely sure it is necessary. The common case is:
>
> base ->  sn-1 ->  sn-2 ->  ... ->  sn-n
>
> When n reaches a limit, you do:
>
> base ->  merge-1
>
> You're potentially copying a similar amount of data when merging back into
> a single image (and you can't easily merge multiple snapshots).
>
> If the amount of data that's not in 'base' is large, you
> leave a new external file around:
>
> base ->  merge-1 ->  sn-1 ->  sn-2 ... ->  sn-n
> to
> base ->  merge-1 ->  merge-2

Sometimes one will want to merge the snapshot immediately after the base
has been backed up.
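
To illustrate why that case matters, here is a rough sketch under a toy
model where an image layer is just a map from cluster index to data. None
of the names below are QEMU code; they are made up for the example. It
compares the number of clusters written when s2 and s3 are streamed into a
fresh newSnap against the number written when s3 is collapsed into s2
in place.

```python
# Toy model: an image layer is a dict mapping cluster index -> data.
# All names here are hypothetical; this is not QEMU code.

def copy_into_new_image(s2, s3):
    """Stream s2 and s3 into a fresh newSnap; s3 wins where both are allocated."""
    new_snap = dict(s2)
    new_snap.update(s3)
    return new_snap, len(new_snap)  # (resulting layer, clusters written)


def merge_in_place(s2, s3):
    """Collapse s3 into s2; only clusters allocated in s3 are written."""
    s2.update(s3)
    return s2, len(s3)  # (resulting layer, clusters written)


s2 = {i: b"old" for i in range(1000)}  # large snapshot, already backed up
s3 = {7: b"new", 42: b"new"}           # few guest writes since the live snapshot

new_snap, copied_new = copy_into_new_image(s2, s3)
merged, copied_in_place = merge_in_place(dict(s2), s3)

# Both approaches end with the same guest-visible content.
assert new_snap == merged
print(copied_new, copied_in_place)
```

In this model the copy into newSnap writes 1000 clusters while the in-place
merge writes 2, which is the excessive IO I was referring to. The real
numbers obviously depend on how much of s2 is allocated.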

>
>>>
>>>> It seems like snapshot merge will require dedicated code that reads
>>>> the allocated clusters from the COW file and writes them back into the
>>>> base image.
>>>>
>>>> A very inefficient alternative would be to create a third image, the
>>>> "merge" image file, which has the COW file as its backing file:
>>>> snapshot (base) ->   cow ->   merge
>
> Remember there is a 'base' before snapshot, you don't copy the entire
> image.

Not always; the image might be a raw file/device:

1. raw image
2. live snapshot it and use COW above it
    raw <- s1
3. back up the raw image using a 3rd-party mechanism
4. live merge (copy) s1 into raw
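
The four steps above can be sketched roughly as follows. This is only a toy
model and assumes a made-up 4-byte cluster size, a bytearray standing in for
the raw image and a dict standing in for the COW file s1. It ignores
concurrent guest writes during the merge, which a real implementation would
have to order correctly.

```python
CLUSTER = 4  # hypothetical tiny cluster size in bytes


def read(raw, s1, idx):
    """Guest-visible read: take the cluster from s1 if allocated there, else from raw."""
    if idx in s1:
        return s1[idx]
    return bytes(raw[idx * CLUSTER:(idx + 1) * CLUSTER])


def live_merge(raw, s1):
    """Step 4: write every cluster allocated in s1 back into raw."""
    for idx in sorted(s1):
        raw[idx * CLUSTER:(idx + 1) * CLUSTER] = s1[idx]


raw = bytearray(b"AAAABBBBCCCC")  # step 1: raw image with 3 clusters
s1 = {1: b"bbbb"}                 # step 2: guest write lands in s1, raw stays untouched
before = [read(raw, s1, i) for i in range(3)]

backup = bytes(raw)               # step 3: back up the read-only raw image
live_merge(raw, s1)               # step 4: merge s1 into raw

after = [read(raw, {}, i) for i in range(3)]  # s1 can now be dropped

assert before == after               # guest sees the same data without s1
assert backup == b"AAAABBBBCCCC"     # the backup holds the pre-snapshot content
```

Only the single cluster allocated in s1 is written, and there is no 'base'
below raw that could be left out of the copy.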

>
>>>>
>>>> All data from snapshot and cow is copied into merge and then snapshot
>>>> and cow can be deleted.  But this approach results in full data
>>>> copying and uses potentially 3x space if cow is close to the size of
>>>> snapshot.
>>>
>>> Management can set a higher limit on the size of data that is merged,
>>> and create a new base once exceeded. This avoids copying excessive
>>> amounts of data.
>>>
>>>> Any other ideas that reuse live block copy for snapshot merge?
>>>>
>>>> Stefan
>>>
>>>

