* KVM call agenda for June 28
@ 2011-06-27 14:32 ` Juan Quintela
  0 siblings, 0 replies; 55+ messages in thread
From: Juan Quintela @ 2011-06-27 14:32 UTC (permalink / raw)
  To: KVM devel mailing list, qemu-devel

Hi

Please send in any agenda items you are interested in covering.

Later, Juan.


* Re: KVM call agenda for June 28
  2011-06-27 14:32 ` [Qemu-devel] " Juan Quintela
@ 2011-06-28 13:38   ` Stefan Hajnoczi
  -1 siblings, 0 replies; 55+ messages in thread
From: Stefan Hajnoczi @ 2011-06-28 13:38 UTC (permalink / raw)
  To: quintela
  Cc: KVM devel mailing list, qemu-devel, Chris Wright,
	Marcelo Tosatti, Kevin Wolf

On Mon, Jun 27, 2011 at 3:32 PM, Juan Quintela <quintela@redhat.com> wrote:
> Please send in any agenda items you are interested in covering.

Live block copy and image streaming:
 * The differences between Marcelo and Kevin's approaches
 * Which approach to choose and who can help implement it


* Re: [Qemu-devel] KVM call agenda for June 28
  2011-06-27 14:32 ` [Qemu-devel] " Juan Quintela
@ 2011-06-28 13:43   ` Anthony Liguori
  -1 siblings, 0 replies; 55+ messages in thread
From: Anthony Liguori @ 2011-06-28 13:43 UTC (permalink / raw)
  To: quintela; +Cc: KVM devel mailing list, qemu-devel

On 06/27/2011 09:32 AM, Juan Quintela wrote:
> Hi
>
> Please send in any agenda items you are interested in covering.

FYI, I'm in an all-day meeting so I can't attend.

Regards,

Anthony Liguori

>
> Later, Juan.
>



* Re: [Qemu-devel] KVM call agenda for June 28
  2011-06-28 13:43   ` Anthony Liguori
@ 2011-06-28 13:48     ` Avi Kivity
  -1 siblings, 0 replies; 55+ messages in thread
From: Avi Kivity @ 2011-06-28 13:48 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: quintela, KVM devel mailing list, qemu-devel

On 06/28/2011 04:43 PM, Anthony Liguori wrote:
> FYI, I'm in an all-day meeting so I can't attend.

Did you do something really bad?

-- 
error compiling committee.c: too many arguments to function



* Re: KVM call agenda for June 28
  2011-06-28 13:38   ` [Qemu-devel] " Stefan Hajnoczi
@ 2011-06-28 19:41     ` Marcelo Tosatti
  -1 siblings, 0 replies; 55+ messages in thread
From: Marcelo Tosatti @ 2011-06-28 19:41 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: quintela, KVM devel mailing list, qemu-devel, Chris Wright, Kevin Wolf

On Tue, Jun 28, 2011 at 02:38:15PM +0100, Stefan Hajnoczi wrote:
> On Mon, Jun 27, 2011 at 3:32 PM, Juan Quintela <quintela@redhat.com> wrote:
> > Please send in any agenda items you are interested in covering.
> 
> Live block copy and image streaming:
>  * The differences between Marcelo and Kevin's approaches
>  * Which approach to choose and who can help implement it

After more thinking, I dislike the image metadata approach. Management
must carry the information anyway, so it's pointless to duplicate it
inside an image format.

After the discussion today, I think the internal mechanism and interface
should be different for copy and stream:

block copy
----------

With backing files:

1) base <- sn1 <- sn2
2) base <- copy

Without:

1) source
2) destination

Copy is only valid after switch has been performed. Same interface and
crash recovery characteristics for all image formats.

If management wants to support continuation, it must specify
blkcopy:sn2:copy on startup.

stream
------

1) base <- remote
2) base <- remote <- local
3) base <- local

"local" image is always valid. Requires backing file support.




* Re: KVM call agenda for June 28
  2011-06-28 19:41     ` [Qemu-devel] " Marcelo Tosatti
@ 2011-06-29  5:32       ` Stefan Hajnoczi
  -1 siblings, 0 replies; 55+ messages in thread
From: Stefan Hajnoczi @ 2011-06-29  5:32 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: quintela, KVM devel mailing list, qemu-devel, Chris Wright, Kevin Wolf

On Tue, Jun 28, 2011 at 8:41 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> On Tue, Jun 28, 2011 at 02:38:15PM +0100, Stefan Hajnoczi wrote:
>> On Mon, Jun 27, 2011 at 3:32 PM, Juan Quintela <quintela@redhat.com> wrote:
>> > Please send in any agenda items you are interested in covering.
>>
>> Live block copy and image streaming:
>>  * The differences between Marcelo and Kevin's approaches
>>  * Which approach to choose and who can help implement it
>
> After more thinking, i dislike the image metadata approach. Management
> must carry the information anyway, so its pointless to duplicate it
> inside an image format.

I agree with you.  It would be a significant change for QEMU users to
deal with block state files just in case they want to use live block
copy/image streaming.  Not only would existing management layers need
to be updated but also custom management or provisioning scripts.

> After the discussion today, i think the internal mechanism and interface
> should be different for copy and stream:
>
> block copy
> ----------
>
> With backing files:
>
> 1) base <- sn1 <- sn2
> 2) base <- copy
>
> Without:
>
> 1) source
> 2) destination
>
> Copy is only valid after switch has been performed. Same interface and
> crash recovery characteristics for all image formats.
>
> If management wants to support continuation, it must specify
> blkcopy:sn2:copy on startup.
>
> stream
> ------
>
> 1) base <- remote
> 2) base <- remote <- local
> 3) base <- local
>
> "local" image is always valid. Requires backing file support.

I agree that the modes of operation are different and we should
provide different HMP/QMP APIs for them.  Internally I still think
they can share code for the source -> destination copy operation.

Stefan


* Re: KVM call agenda for June 28
  2011-06-28 19:41     ` [Qemu-devel] " Marcelo Tosatti
@ 2011-06-29  7:57       ` Kevin Wolf
  -1 siblings, 0 replies; 55+ messages in thread
From: Kevin Wolf @ 2011-06-29  7:57 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: Stefan Hajnoczi, quintela, KVM devel mailing list, qemu-devel,
	Chris Wright, Dor Laor, Avi Kivity

On 28.06.2011 21:41, Marcelo Tosatti wrote:
> On Tue, Jun 28, 2011 at 02:38:15PM +0100, Stefan Hajnoczi wrote:
>> On Mon, Jun 27, 2011 at 3:32 PM, Juan Quintela <quintela@redhat.com> wrote:
>>> Please send in any agenda items you are interested in covering.
>>
>> Live block copy and image streaming:
>>  * The differences between Marcelo and Kevin's approaches
>>  * Which approach to choose and who can help implement it
> 
> After more thinking, i dislike the image metadata approach. Management
> must carry the information anyway, so its pointless to duplicate it
> inside an image format.
> 
> After the discussion today, i think the internal mechanism and interface
> should be different for copy and stream:
> 
> block copy
> ----------
> 
> With backing files:
> 
> 1) base <- sn1 <- sn2
> 2) base <- copy
> 
> Without:
> 
> 1) source
> 2) destination
> 
> Copy is only valid after switch has been performed. Same interface and
> crash recovery characteristics for all image formats.
> 
> If management wants to support continuation, it must specify
> blkcopy:sn2:copy on startup.

We can use almost the same interface and still have an image that is
always valid (assuming that you provide the right format on the command
line, which is already a requirement today).

base <- sn1 <- sn2 <- copy.raw

You just add the file name for an external COW file, like
blkcopy:sn2:copy.raw:copy.cow (we can even have a default filename for
HMP instead of requiring it to be specified, like $IMAGE.cow), and if the
destination doesn't support backing files by itself, blkcopy creates the
COW overlay BlockDriverState that uses this file.

No difference for management at all, except that it needs to allow
access to another file.
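
To make that concrete, here is a minimal standalone sketch of the
bookkeeping such a COW overlay would do (plain C over in-memory buffers,
purely illustrative; this is not QEMU's block driver API and all names
here are invented for the example):

#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define CLUSTER_SIZE 8
#define NB_CLUSTERS  4

/* Toy external COW overlay: the bitmap (which would live in the external
 * .cow file) records which clusters have already been written to the
 * destination; reads of untouched clusters are served from the backing
 * (source) image. */
typedef struct {
    uint8_t dest[NB_CLUSTERS * CLUSTER_SIZE];    /* e.g. a raw destination */
    uint8_t backing[NB_CLUSTERS * CLUSTER_SIZE]; /* source acting as backing file */
    uint8_t cow_bitmap[(NB_CLUSTERS + 7) / 8];   /* contents of the external COW file */
} CowOverlay;

static int cow_is_copied(const CowOverlay *o, unsigned cluster)
{
    return (o->cow_bitmap[cluster / 8] >> (cluster % 8)) & 1;
}

/* Writes go to the destination and mark the cluster in the COW bitmap. */
static void cow_write(CowOverlay *o, unsigned cluster, const uint8_t *buf)
{
    memcpy(o->dest + cluster * CLUSTER_SIZE, buf, CLUSTER_SIZE);
    o->cow_bitmap[cluster / 8] |= 1u << (cluster % 8);
}

/* Reads come from the destination if the cluster was already written there,
 * otherwise from the backing image. */
static void cow_read(const CowOverlay *o, unsigned cluster, uint8_t *buf)
{
    const uint8_t *src = cow_is_copied(o, cluster) ? o->dest : o->backing;
    memcpy(buf, src + cluster * CLUSTER_SIZE, CLUSTER_SIZE);
}

int main(void)
{
    CowOverlay o = {0};
    uint8_t buf[CLUSTER_SIZE];

    memcpy(o.backing, "old0old0old1old1old2old2old3old3", sizeof(o.backing));
    cow_read(&o, 1, buf);                          /* served by the backing image */
    cow_write(&o, 1, (const uint8_t *)"new1new1");
    cow_read(&o, 1, buf);                          /* now served by the destination */
    printf("%.8s\n", (const char *)buf);           /* prints "new1new1" */
    return 0;
}

A real implementation would be a BlockDriverState-based driver working on
files, but the read/write decision above is all that the overlay has to
add on top of a raw destination.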

> stream
> ------
> 
> 1) base <- remote
> 2) base <- remote <- local
> 3) base <- local
> 
> "local" image is always valid. Requires backing file support.

With the above, this restriction wouldn't apply any more.

Also I don't think we should mix approaches. Either both block copy and
image streaming use backing files, or none of them do. Mixing means
duplicating more code, and even worse, that you can't stop a block copy
in the middle and continue with streaming (which I believe is a really
valuable feature to have).

Kevin


* Re: KVM call agenda for June 28
  2011-06-29  7:57       ` [Qemu-devel] " Kevin Wolf
@ 2011-06-29 10:08         ` Stefan Hajnoczi
  -1 siblings, 0 replies; 55+ messages in thread
From: Stefan Hajnoczi @ 2011-06-29 10:08 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Marcelo Tosatti, quintela, KVM devel mailing list, qemu-devel,
	Chris Wright, Dor Laor, Avi Kivity

On Wed, Jun 29, 2011 at 8:57 AM, Kevin Wolf <kwolf@redhat.com> wrote:
> Am 28.06.2011 21:41, schrieb Marcelo Tosatti:
>> stream
>> ------
>>
>> 1) base <- remote
>> 2) base <- remote <- local
>> 3) base <- local
>>
>> "local" image is always valid. Requires backing file support.
>
> With the above, this restriction wouldn't apply any more.
>
> Also I don't think we should mix approaches. Either both block copy and
> image streaming use backing files, or none of them do. Mixing means
> duplicating more code, and even worse, that you can't stop a block copy
> in the middle and continue with streaming (which I believe is a really
> valuable feature to have).

Here is how the image streaming feature is used from HMP/QMP:

The guest is running from an image file with a backing file.  The aim
is to pull the data from the backing file and populate the image file
so that the dependency on the backing file can be eliminated.

1. Start a background streaming operation:

(qemu) block_stream -a ide0-hd

2. Check the status of the operation:

(qemu) info block-stream
Streaming device ide0-hd: Completed 512 of 34359738368 bytes

3. The status changes when the operation completes:

(qemu) info block-stream
No active stream

On completion the image file no longer has a backing file dependency.
When streaming completes QEMU updates the image file metadata to
indicate that no backing file is used.

The QMP interface is similar but provides QMP events to signal
streaming completion and failure.  Polling to query the streaming
status is only used when the management application wishes to refresh
progress information.

If guest execution is interrupted by a power failure or QEMU crash,
then the image file is still valid but streaming may be incomplete.
When QEMU is launched again the block_stream command can be issued to
resume streaming.

In the future we could add a 'base' argument to block_stream.  If base
is specified then data contained in the base image will not be copied.
This can be used to merge data from an intermediate image without
merging the base image.  When streaming completes the backing file
will be set to the base image.  The backing file relationship would
typically look like this:

1. Before block_stream -a -b base.img ide0-hd completion:

base.img <- sn1 <- ... <- ide0-hd.qed

2. After streaming completes:

base.img <- ide0-hd.qed

This describes the image streaming use cases that I, Adam, and Anthony
propose to support.  In the course of the discussion we've sometimes
been distracted with the internals of what a unified live block
copy/image streaming implementation should do.  I wanted to post this
summary of image streaming to refocus us on the use case and the APIs
that users will see.

Stefan


* Re: KVM call agenda for June 28
  2011-06-29 10:08         ` [Qemu-devel] " Stefan Hajnoczi
@ 2011-06-29 15:41           ` Marcelo Tosatti
  -1 siblings, 0 replies; 55+ messages in thread
From: Marcelo Tosatti @ 2011-06-29 15:41 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, quintela, KVM devel mailing list, qemu-devel,
	Chris Wright, Dor Laor, Avi Kivity

On Wed, Jun 29, 2011 at 11:08:23AM +0100, Stefan Hajnoczi wrote:
> On Wed, Jun 29, 2011 at 8:57 AM, Kevin Wolf <kwolf@redhat.com> wrote:
> > Am 28.06.2011 21:41, schrieb Marcelo Tosatti:
> >> stream
> >> ------
> >>
> >> 1) base <- remote
> >> 2) base <- remote <- local
> >> 3) base <- local
> >>
> >> "local" image is always valid. Requires backing file support.
> >
> > With the above, this restriction wouldn't apply any more.
> >
> > Also I don't think we should mix approaches. Either both block copy and
> > image streaming use backing files, or none of them do. Mixing means
> > duplicating more code, and even worse, that you can't stop a block copy
> > in the middle and continue with streaming (which I believe is a really
> > valuable feature to have).
> 
> Here is how the image streaming feature is used from HMP/QMP:
> 
> The guest is running from an image file with a backing file.  The aim
> is to pull the data from the backing file and populate the image file
> so that the dependency on the backing file can be eliminated.
> 
> 1. Start a background streaming operation:
> 
> (qemu) block_stream -a ide0-hd
> 
> 2. Check the status of the operation:
> 
> (qemu) info block-stream
> Streaming device ide0-hd: Completed 512 of 34359738368 bytes
> 
> 3. The status changes when the operation completes:
> 
> (qemu) info block-stream
> No active stream
> 
> On completion the image file no longer has a backing file dependency.
> When streaming completes QEMU updates the image file metadata to
> indicate that no backing file is used.
> 
> The QMP interface is similar but provides QMP events to signal
> streaming completion and failure.  Polling to query the streaming
> status is only used when the management application wishes to refresh
> progress information.
> 
> If guest execution is interrupted by a power failure or QEMU crash,
> then the image file is still valid but streaming may be incomplete.
> When QEMU is launched again the block_stream command can be issued to
> resume streaming.
> 
> In the future we could add a 'base' argument to block_stream.  If base
> is specified then data contained in the base image will not be copied.

This is a present requirement.

>  This can be used to merge data from an intermediate image without
> merging the base image.  When streaming completes the backing file
> will be set to the base image.  The backing file relationship would
> typically look like this:
> 
> 1. Before block_stream -a -b base.img ide0-hd completion:
> 
> base.img <- sn1 <- ... <- ide0-hd.qed
> 
> 2. After streaming completes:
> 
> base.img <- ide0-hd.qed
> 
> This describes the image streaming use cases that I, Adam, and Anthony
> propose to support.  In the course of the discussion we've sometimes
> been distracted with the internals of what a unified live block
> copy/image streaming implementation should do.  I wanted to post this
> summary of image streaming to refocus us on the use case and the APIs
> that users will see.
> 
> Stefan

OK, with an external COW file for formats that do not support it, the
interface can be similar. Also, there is no need to mirror writes and no
switch operation; always use the destination image.



* Re: KVM call agenda for June 28
  2011-06-29 15:41           ` [Qemu-devel] " Marcelo Tosatti
@ 2011-06-30 11:48             ` Stefan Hajnoczi
  -1 siblings, 0 replies; 55+ messages in thread
From: Stefan Hajnoczi @ 2011-06-30 11:48 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: Kevin Wolf, quintela, KVM devel mailing list, qemu-devel,
	Chris Wright, Dor Laor, Avi Kivity

On Wed, Jun 29, 2011 at 4:41 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> On Wed, Jun 29, 2011 at 11:08:23AM +0100, Stefan Hajnoczi wrote:
>> In the future we could add a 'base' argument to block_stream.  If base
>> is specified then data contained in the base image will not be copied.
>
> This is a present requirement.

It's not one that I have had in the past but it is a reasonable requirement.

One interesting thing about this requirement is that it makes
copy-on-read seem like the wrong primitive for image streaming.  If
there is a base image which should not be streamed then a plain loop
that calls bdrv_is_allocated_chain(bs, base, sector, &pnum) and copies
sectors into bs is more straightforward than passing base to a
copy-on-read operation somehow (through a variable that stashes the
base away somewhere?).
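
As a rough illustration of that plain loop, here is a standalone toy
version in ordinary C (in-memory buffers instead of BlockDriverStates;
the where[] array stands in for what the proposed
bdrv_is_allocated_chain() lookup would report, and every name below is
invented for the example):

#include <stdio.h>
#include <string.h>

#define SECTOR_SIZE 8
#define NB_SECTORS  4

/* Where a sector's data currently lives, as the proposed
 * bdrv_is_allocated_chain(bs, base, sector, &pnum) would report it. */
typedef enum {
    IN_IMAGE,   /* already allocated in the image file - nothing to do      */
    IN_CHAIN,   /* allocated in a backing file above base - must be copied  */
    IN_BASE     /* only in the base image - left shared via the backing file */
} AllocWhere;

typedef struct {
    unsigned char image[NB_SECTORS * SECTOR_SIZE]; /* the image being populated   */
    unsigned char chain[NB_SECTORS * SECTOR_SIZE]; /* flattened chain above base  */
    AllocWhere where[NB_SECTORS];
} ToyDisk;

/* The plain streaming loop: pull every sector that lives in the chain above
 * base into the image; skip sectors already local and sectors owned by base. */
static void stream_all(ToyDisk *d)
{
    for (unsigned s = 0; s < NB_SECTORS; s++) {
        if (d->where[s] != IN_CHAIN) {
            continue;
        }
        memcpy(d->image + s * SECTOR_SIZE,
               d->chain + s * SECTOR_SIZE, SECTOR_SIZE);
        d->where[s] = IN_IMAGE;
    }
}

int main(void)
{
    ToyDisk d = { .where = { IN_IMAGE, IN_CHAIN, IN_BASE, IN_CHAIN } };

    memcpy(d.chain, "sn1-datasn1-datasn1-datasn1-data", sizeof(d.chain));
    stream_all(&d);
    /* Sectors 1 and 3 were copied; sector 2 stays with the base image. */
    printf("sector 1: %.8s\n", (const char *)(d.image + SECTOR_SIZE));
    return 0;
}

The real loop would of course work on runs of sectors (the &pnum output)
and go through the block layer's read/write functions, but the control
flow is just this.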

>>  This can be used to merge data from an intermediate image without
>> merging the base image.  When streaming completes the backing file
>> will be set to the base image.  The backing file relationship would
>> typically look like this:
>>
>> 1. Before block_stream -a -b base.img ide0-hd completion:
>>
>> base.img <- sn1 <- ... <- ide0-hd.qed
>>
>> 2. After streaming completes:
>>
>> base.img <- ide0-hd.qed
>>
>> This describes the image streaming use cases that I, Adam, and Anthony
>> propose to support.  In the course of the discussion we've sometimes
>> been distracted with the internals of what a unified live block
>> copy/image streaming implementation should do.  I wanted to post this
>> summary of image streaming to refocus us on the use case and the APIs
>> that users will see.
>>
>> Stefan
>
> OK, with an external COW file for formats that do not support it the
> interface can be similar. Also there is no need to mirror writes,
> no switch operation, always use destination image.

Yep.

Stefan


* Re: KVM call agenda for June 28
  2011-06-30 11:48             ` [Qemu-devel] " Stefan Hajnoczi
@ 2011-06-30 12:39               ` Kevin Wolf
  -1 siblings, 0 replies; 55+ messages in thread
From: Kevin Wolf @ 2011-06-30 12:39 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Marcelo Tosatti, quintela, KVM devel mailing list, qemu-devel,
	Chris Wright, Dor Laor, Avi Kivity

On 30.06.2011 13:48, Stefan Hajnoczi wrote:
> On Wed, Jun 29, 2011 at 4:41 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
>> On Wed, Jun 29, 2011 at 11:08:23AM +0100, Stefan Hajnoczi wrote:
>>> In the future we could add a 'base' argument to block_stream.  If base
>>> is specified then data contained in the base image will not be copied.
>>
>> This is a present requirement.
> 
> It's not one that I have had in the past but it is a reasonable requirement.
> 
> One interesting thing about this requirement is that it makes
> copy-on-read seem like the wrong primitive for image streaming.  If
> there is a base image which should not be streamed then a plain loop
> that calls bdrv_is_allocated_chain(bs, base, sector, &pnum) and copies
> sectors into bs is more straightforward than passing base to a
> copy-on-read operation somehow (through a variable that stashes the
> base away somewhere?).

You don't even have to look at the implementation to say that COR is a
useful optimisation. It basically means that you reuse data read by the
guest instead of reading it a second time in your loop. (And this is
equally true for block copy and image streaming)

If this means adding a new field in BlockDriverState, so be it.
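
For comparison with the plain loop discussed earlier in the thread, here
is a minimal standalone sketch of the copy-on-read path (again toy C, not
QEMU code; the names are made up):

#include <stdio.h>
#include <string.h>

#define CLUSTER_SIZE 8
#define NB_CLUSTERS  4

typedef struct {
    unsigned char image[NB_CLUSTERS * CLUSTER_SIZE];   /* local image file    */
    unsigned char backing[NB_CLUSTERS * CLUSTER_SIZE]; /* remote/backing file */
    int allocated[NB_CLUSTERS];                        /* cluster present locally? */
} CorDisk;

/* Copy-on-read: a guest read of an unallocated cluster fetches it from the
 * backing file, hands it to the guest *and* writes it into the image, so a
 * background streaming pass never has to read that cluster again. */
static void cor_read(CorDisk *d, unsigned cluster, unsigned char *buf)
{
    if (!d->allocated[cluster]) {
        memcpy(d->image + cluster * CLUSTER_SIZE,
               d->backing + cluster * CLUSTER_SIZE, CLUSTER_SIZE);
        d->allocated[cluster] = 1;
    }
    memcpy(buf, d->image + cluster * CLUSTER_SIZE, CLUSTER_SIZE);
}

int main(void)
{
    CorDisk d = {0};
    unsigned char buf[CLUSTER_SIZE];

    memcpy(d.backing, "data0000data1111data2222data3333", sizeof(d.backing));
    cor_read(&d, 2, buf);          /* populates cluster 2 as a side effect */
    printf("%.8s allocated=%d\n", (const char *)buf, d.allocated[2]);
    return 0;
}

The background pass and the guest's own reads can then share the same
populate-one-cluster code, which is the reuse being pointed out here.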

Kevin


* Re: KVM call agenda for June 28
  2011-06-29 15:41           ` [Qemu-devel] " Marcelo Tosatti
@ 2011-06-30 12:54             ` Stefan Hajnoczi
  -1 siblings, 0 replies; 55+ messages in thread
From: Stefan Hajnoczi @ 2011-06-30 12:54 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: Kevin Wolf, quintela, KVM devel mailing list, qemu-devel,
	Chris Wright, Dor Laor, Avi Kivity

On Wed, Jun 29, 2011 at 4:41 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> On Wed, Jun 29, 2011 at 11:08:23AM +0100, Stefan Hajnoczi wrote:
>>  This can be used to merge data from an intermediate image without
>> merging the base image.  When streaming completes the backing file
>> will be set to the base image.  The backing file relationship would
>> typically look like this:
>>
>> 1. Before block_stream -a -b base.img ide0-hd completion:
>>
>> base.img <- sn1 <- ... <- ide0-hd.qed
>>
>> 2. After streaming completes:
>>
>> base.img <- ide0-hd.qed
>>
>> This describes the image streaming use cases that I, Adam, and Anthony
>> propose to support.  In the course of the discussion we've sometimes
>> been distracted with the internals of what a unified live block
>> copy/image streaming implementation should do.  I wanted to post this
>> summary of image streaming to refocus us on the use case and the APIs
>> that users will see.
>>
>> Stefan
>
> OK, with an external COW file for formats that do not support it the
> interface can be similar. Also there is no need to mirror writes,
> no switch operation, always use destination image.

Marcelo, does this mean you are happy with how management deals with
power failure/crash during streaming?

Are we settled on the approach where the destination file always has
the source file as its backing file?

Here are the components that I can identify:

1. blkmirror - used by live block copy to keep source and destination
in sync.  Already implemented as a block driver by Marcelo.

2. External COW overlay - can be used to add backing file (COW)
support on top of any image, including raw.  Currently unimplemented,
needs to be a block driver.  Kevin, do you want to write this?

3. Unified background copy - image format-independent mechanism for
copying the contents of a backing file chain into the image file (with
the exception of backing files chained below base).  Needs to play nice
with blkmirror.  Stefan can write this.

4. Live block copy API and high-level control - the main code that
adds the live block copy feature.  Existing patches by Marcelo, can be
restructured to use common core by Marcelo.

5. Image streaming API and high-level control - the main code that
adds the image streaming feature.  Existing patches by Stefan, Adam,
Anthony, can be restructured to use common core by Stefan.

I previously posted a proposed API for the unified background copy
mechanism.  I'm thinking that background copy is not the best name
since it is limited to copying the backing file into the image file.

/**
 * Start a background copy operation
 *
 * Unallocated clusters in the image will be populated with data
 * from its backing file.  This operation runs in the background and a
 * completion function is invoked when it is finished.
 */
BackgroundCopy *background_copy_start(
   BlockDriverState *bs,

   /**
    * Note: Kevin suggests we migrate this into BlockDriverState
    *       in order to enable copy-on-read.
    *
    * Base image that both source and destination have as a
    * backing file ancestor.  Data will not be copied from base
    * since both source and destination will have access to base
    * image.  This may be NULL to copy all data.
    */
   BlockDriverState *base,

   BlockDriverCompletionFunc *cb, void *opaque);

/**
 * Cancel a background copy operation
 *
 * This function marks the background copy operation for cancellation and the
 * completion function is invoked once the operation has been cancelled.
 */
void background_copy_cancel(BackgroundCopy *bgc,
                            BlockDriverCompletionFunc *cb, void *opaque);

/**
 * Get progress of a running background copy operation
 */
void background_copy_get_status(BackgroundCopy *bgc,
                                BackgroundCopyStatus *status);
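
A short sketch of how a caller might drive this proposed API
(illustrative only - it cannot compile against any existing tree;
stream_done() and start_streaming() are invented names, and
BlockDriverCompletionFunc is QEMU's usual void (*)(void *opaque, int ret)
callback type):

static void stream_done(void *opaque, int ret)
{
    /* opaque is whatever start_streaming() passed in, e.g. the device. */
    if (ret == 0) {
        /* Copy finished: the backing file dependency can now be dropped
         * by updating the image metadata (e.g. a rebase -u style update). */
    } else {
        /* Error or cancellation: the image is still valid and the
         * operation can simply be restarted later. */
    }
}

static BackgroundCopy *start_streaming(BlockDriverState *bs,
                                       BlockDriverState *base)
{
    /* base may be NULL to copy everything; passing a base keeps the data
     * below it shared through the backing file instead of copying it. */
    return background_copy_start(bs, base, stream_done, bs);
}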

Stefan


* Re: [Qemu-devel] KVM call agenda for June 28
  2011-06-28 13:48     ` Avi Kivity
  (?)
@ 2011-06-30 14:10     ` Anthony Liguori
  -1 siblings, 0 replies; 55+ messages in thread
From: Anthony Liguori @ 2011-06-30 14:10 UTC (permalink / raw)
  To: Avi Kivity; +Cc: qemu-devel, KVM devel mailing list, quintela

On 06/28/2011 08:48 AM, Avi Kivity wrote:
> On 06/28/2011 04:43 PM, Anthony Liguori wrote:
>> FYI, I'm in an all-day meeting so I can't attend.
>
> Did you do something really bad?

I named some variables with a leading underscore and now have to be 
re-educated.

Regards,

Anthony Liguori




* Re: KVM call agenda for June 28
  2011-06-30 12:54             ` [Qemu-devel] " Stefan Hajnoczi
@ 2011-06-30 14:36               ` Marcelo Tosatti
  -1 siblings, 0 replies; 55+ messages in thread
From: Marcelo Tosatti @ 2011-06-30 14:36 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, quintela, KVM devel mailing list, qemu-devel,
	Chris Wright, Dor Laor, Avi Kivity

On Thu, Jun 30, 2011 at 01:54:09PM +0100, Stefan Hajnoczi wrote:
> On Wed, Jun 29, 2011 at 4:41 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> > On Wed, Jun 29, 2011 at 11:08:23AM +0100, Stefan Hajnoczi wrote:
> >>  This can be used to merge data from an intermediate image without
> >> merging the base image.  When streaming completes the backing file
> >> will be set to the base image.  The backing file relationship would
> >> typically look like this:
> >>
> >> 1. Before block_stream -a -b base.img ide0-hd completion:
> >>
> >> base.img <- sn1 <- ... <- ide0-hd.qed
> >>
> >> 2. After streaming completes:
> >>
> >> base.img <- ide0-hd.qed
> >>
> >> This describes the image streaming use cases that I, Adam, and Anthony
> >> propose to support.  In the course of the discussion we've sometimes
> >> been distracted with the internals of what a unified live block
> >> copy/image streaming implementation should do.  I wanted to post this
> >> summary of image streaming to refocus us on the use case and the APIs
> >> that users will see.
> >>
> >> Stefan
> >
> > OK, with an external COW file for formats that do not support it the
> > interface can be similar. Also there is no need to mirror writes,
> > no switch operation, always use destination image.
> 
> Marcelo, does this mean you are happy with how management deals with
> power failure/crash during streaming?

Yep.

> Are we settled on the approach where the destination file always has
> the source file as its backing file?

Yep.

> Here are the components that I can identify:
> 
> 1. blkmirror - used by live block copy to keep source and destination
> in sync.  Already implemented as a block driver by Marcelo.

No need for it anymore; now you switch to the destination before
the operation starts and always use the destination from there on.

> 2. External COW overlay - can be used to add backing file (COW)
> support on top of any image, including raw.  Currently unimplemented,
> needs to be a block driver.  Kevin, do you want to write this?
> 
> 3. Unified background copy - image format-independent mechanism for
> copy contents of a backing file chain into the image file (with
> exception of backing files chained below base).  Needs to play nice
> with blkmirror.  Stefan can write this.

Note that the background copy itself simply reads from 0...END; the bulk
of the work is in the block driver.

> 4. Live block copy API and high-level control - the main code that
> adds the live block copy feature.  Existing patches by Marcelo, can be
> restructured to use common core by Marcelo.

Can use your proposed block_stream interface, with a "block_switch"
command on top, so:

1) management creates copy.img with backing file current.img, allows
access
2) management issues "block_switch dev copy.img"
3) management issues "block_stream dev base"

> 5. Image streaming API and high-level control - the main code that
> adds the image streaming feature.  Existing patches by Stefan, Adam,
> Anthony, can be restructured to use common core by Stefan.
> 
> I previously posted a proposed API for the unified background copy
> mechanism.  I'm thinking that background copy is not the best name
> since it is limited to copying the backing file into the image file.
> 
> /**
>  * Start a background copy operation
>  *
>  * Unallocated clusters in the image will be populated with data
>  * from its backing file.  This operation runs in the background and a
>  * completion function is invoked when it is finished.
>  */
> BackgroundCopy *background_copy_start(
>    BlockDriverState *bs,
> 
>    /**
>     * Note: Kevin suggests we migrate this into BlockDriverState
>     *       in order to enable copy-on-read.
>     *
>     * Base image that both source and destination have as a
>     * backing file ancestor.  Data will not be copied from base
>     * since both source and destination will have access to base
>     * image.  This may be NULL to copy all data.
>     */
>    BlockDriverState *base,
> 
>    BlockDriverCompletionFunc *cb, void *opaque);
> 
> /**
>  * Cancel a background copy operation
>  *
>  * This function marks the background copy operation for cancellation and the
>  * completion function is invoked once the operation has been cancelled.
>  */
> void background_copy_cancel(BackgroundCopy *bgc,
>                             BlockDriverCompletionFunc *cb, void *opaque);
> 
> /**
>  * Get progress of a running background copy operation
>  */
> void background_copy_get_status(BackgroundCopy *bgc,
>                                 BackgroundCopyStatus *status);
> 
> Stefan

I thought of implementing the "block_stream" command by reopening the
device with

blkstream:imagename.img

Then:

AIO_READ:
- for each cluster in request:
    - if allocated-or-in-final-base, read.
    - check write queue, if present wait on it, if not, add "copy"
      entry to write queue.
    - issue cluster sized read from source.
    - on completion:
        - copy data to original read buffer, complete it.
        - if not cancelled, write cluster to destination.

AIO_WRITE
for each cluster in request:
    - check write queue, cancel/wait for "copy" entry.
    - add "guest" entry to write queue.
    - issue write to destination.
    - on completion:
        - remove write queue entry.


With the 0...END background read, once it completes, write the final base
file for the image.

So the block_stream/block_stream_cancel/block_stream_status commands, the
background read, and the rebase -u update can be separate from the block
driver.
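
To make the write-queue handling concrete, here is a tiny standalone C
model of the conflict rule sketched above (toy code only: the state array
stands in for the per-cluster write queue, the actual I/O and AIO plumbing
are omitted, and all names are invented for the illustration):

#include <stdio.h>

#define NB_CLUSTERS 4

/* Per-cluster state of the background copy.  NONE means no copy is in
 * flight for the cluster; COPY_PENDING means a copy read has been issued;
 * COPY_CANCELLED means a guest write overtook that copy. */
typedef enum { NONE, COPY_PENDING, COPY_CANCELLED } CopyState;

static CopyState copy_state[NB_CLUSTERS];

/* Streaming side ("AIO_READ" above): claim the cluster before reading it
 * from the source.  Returns 0 if something is already in flight for it. */
static int copy_start(unsigned cluster)
{
    if (copy_state[cluster] != NONE) {
        return 0;
    }
    copy_state[cluster] = COPY_PENDING;
    return 1;   /* caller now issues the cluster-sized read from the source */
}

/* Completion of the streaming read: write the cluster to the destination
 * only if no guest write overtook it in the meantime. */
static void copy_complete(unsigned cluster)
{
    if (copy_state[cluster] == COPY_PENDING) {
        printf("cluster %u: copied to destination\n", cluster);
    } else {
        printf("cluster %u: copy dropped, guest data already written\n", cluster);
    }
    copy_state[cluster] = NONE;
}

/* Guest write path ("AIO_WRITE" above): cancel any pending copy for the
 * cluster, then write the guest data to the destination.  (The real design
 * also queues the guest write itself so further copies wait for it.) */
static void guest_write(unsigned cluster)
{
    if (copy_state[cluster] == COPY_PENDING) {
        copy_state[cluster] = COPY_CANCELLED;
    }
    printf("cluster %u: guest write to destination\n", cluster);
}

int main(void)
{
    copy_start(1);     /* streaming starts reading cluster 1 from the source  */
    guest_write(1);    /* guest writes the cluster before that read completes */
    copy_complete(1);  /* the stale copy is dropped instead of overwriting it */
    return 0;
}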


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [Qemu-devel] KVM call agenda for June 28
@ 2011-06-30 14:36               ` Marcelo Tosatti
  0 siblings, 0 replies; 55+ messages in thread
From: Marcelo Tosatti @ 2011-06-30 14:36 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Chris Wright, KVM devel mailing list, quintela,
	Dor Laor, qemu-devel, Avi Kivity

On Thu, Jun 30, 2011 at 01:54:09PM +0100, Stefan Hajnoczi wrote:
> On Wed, Jun 29, 2011 at 4:41 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> > On Wed, Jun 29, 2011 at 11:08:23AM +0100, Stefan Hajnoczi wrote:
> >>  This can be used to merge data from an intermediate image without
> >> merging the base image.  When streaming completes the backing file
> >> will be set to the base image.  The backing file relationship would
> >> typically look like this:
> >>
> >> 1. Before block_stream -a -b base.img ide0-hd completion:
> >>
> >> base.img <- sn1 <- ... <- ide0-hd.qed
> >>
> >> 2. After streaming completes:
> >>
> >> base.img <- ide0-hd.qed
> >>
> >> This describes the image streaming use cases that I, Adam, and Anthony
> >> propose to support.  In the course of the discussion we've sometimes
> >> been distracted with the internals of what a unified live block
> >> copy/image streaming implementation should do.  I wanted to post this
> >> summary of image streaming to refocus us on the use case and the APIs
> >> that users will see.
> >>
> >> Stefan
> >
> > OK, with an external COW file for formats that do not support it the
> > interface can be similar. Also there is no need to mirror writes,
> > no switch operation, always use destination image.
> 
> Marcelo, does this mean you are happy with how management deals with
> power failure/crash during streaming?

Yep.

> Are we settled on the approach where the destination file always has
> the source file as its backing file?

Yep.

> Here are the components that I can identify:
> 
> 1. blkmirror - used by live block copy to keep source and destination
> in sync.  Already implemented as a block driver by Marcelo.

No need for it anymore, now you switch to the destination before
the operation starts. And always use destination from there on.

> 2. External COW overlay - can be used to add backing file (COW)
> support on top of any image, including raw.  Currently unimplemented,
> needs to be a block driver.  Kevin, do you want to write this?
> 
> 3. Unified background copy - image format-independent mechanism for
> copy contents of a backing file chain into the image file (with
> exception of backing files chained below base).  Needs to play nice
> with blkmirror.  Stefan can write this.

Note the background copy itself is to simply read from 0...END. The bulk
is in the block driver.

> 4. Live block copy API and high-level control - the main code that
> adds the live block copy feature.  Existing patches by Marcelo, can be
> restructured to use common core by Marcelo.

Can use your proposed block_stream interface, with a "block_switch"
command on top, so:

1) management creates copy.img with backing file current.img, allows
access
2) management issues "block_switch dev copy.img"
3) management issues "block_stream dev base"

> 5. Image streaming API and high-level control - the main code that
> adds the image streaming feature.  Existing patches by Stefan, Adam,
> Anthony, can be restructured to use common core by Stefan.
> 
> I previously posted a proposed API for the unified background copy
> mechanism.  I'm thinking that background copy is not the best name
> since it is limited to copying the backing file into the image file.
> 
> /**
>  * Start a background copy operation
>  *
>  * Unallocated clusters in the image will be populated with data
>  * from its backing file.  This operation runs in the background and a
>  * completion function is invoked when it is finished.
>  */
> BackgroundCopy *background_copy_start(
>    BlockDriverState *bs,
> 
>    /**
>     * Note: Kevin suggests we migrate this into BlockDriverState
>     *       in order to enable copy-on-read.
>     *
>     * Base image that both source and destination have as a
>     * backing file ancestor.  Data will not be copied from base
>     * since both source and destination will have access to base
>     * image.  This may be NULL to copy all data.
>     */
>    BlockDriverState *base,
> 
>    BlockDriverCompletionFunc *cb, void *opaque);
> 
> /**
>  * Cancel a background copy operation
>  *
>  * This function marks the background copy operation for cancellation and the
>  * completion function is invoked once the operation has been cancelled.
>  */
> void background_copy_cancel(BackgroundCopy *bgc,
>                             BlockDriverCompletionFunc *cb, void *opaque);
> 
> /**
>  * Get progress of a running background copy operation
>  */
> void background_copy_get_status(BackgroundCopy *bgc,
>                                 BackgroundCopyStatus *status);
> 
> Stefan

I thought of implementing the "block_stream" command by reopening the device with

blkstream:imagename.img

Then:

AIO_READ:
- for each cluster in request:
    - if allocated-or-in-final-base, read.
    - check write queue, if present wait on it, if not, add "copy"
      entry to write queue.
    - issue cluster sized read from source.
    - on completion:
        - copy data to original read buffer, complete it.
        - if not cancelled, write cluster to destination.

AIO_WRITE
for each cluster in request:
    - check write queue, cancel/wait for "copy" entry.
    - add "guest" entry to write queue.
    - issue write to destination.
    - on completion:
        - remove write queue entry.


With the 0...END background read, once it completes, set the final base
file for the image.

So block_stream/block_stream_cancel/block_stream_status commands, the
background read and the rebase -u update can be separate from the block
driver.
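
As a rough illustration of the queueing rules above, a self-contained,
single-threaded C sketch; the names (read_submit, write_submit, wq,
alloc_map) are invented for the example, and it only models the
copy-vs-guest ordering per cluster, not the real AIO paths:

#include <stdio.h>

#define CLUSTERS 8

enum wq_tag { WQ_NONE, WQ_COPY, WQ_GUEST };

static int alloc_map[CLUSTERS];     /* cluster present in destination? */
static enum wq_tag wq[CLUSTERS];    /* pending write per cluster       */

/* Submit a read: decide where the data comes from and whether a
 * copy-on-read write gets queued for the destination. */
static void read_submit(int c)
{
    if (alloc_map[c]) {
        printf("read  %d: destination\n", c);
    } else if (wq[c] != WQ_NONE) {
        printf("read  %d: wait on pending write\n", c);
    } else {
        wq[c] = WQ_COPY;
        printf("read  %d: source, copy queued\n", c);
    }
}

/* Guest write: a queued copy for the same cluster is cancelled so it
 * cannot overwrite the newer guest data with stale source data. */
static void write_submit(int c)
{
    if (wq[c] == WQ_COPY)
        printf("write %d: cancel queued copy\n", c);
    wq[c] = WQ_GUEST;
    printf("write %d: destination\n", c);
}

static void write_complete(int c)
{
    alloc_map[c] = 1;
    wq[c] = WQ_NONE;
}

int main(void)
{
    read_submit(0);     /* unallocated: copy-on-read gets queued */
    write_submit(0);    /* guest write cancels the queued copy   */
    write_complete(0);
    read_submit(0);     /* now served from the destination image */
    return 0;
}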

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: KVM call agenda for June 28
  2011-06-30 14:36               ` [Qemu-devel] " Marcelo Tosatti
@ 2011-06-30 14:52                 ` Kevin Wolf
  -1 siblings, 0 replies; 55+ messages in thread
From: Kevin Wolf @ 2011-06-30 14:52 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: Stefan Hajnoczi, quintela, KVM devel mailing list, qemu-devel,
	Chris Wright, Dor Laor, Avi Kivity

Am 30.06.2011 16:36, schrieb Marcelo Tosatti:
>> 4. Live block copy API and high-level control - the main code that
>> adds the live block copy feature.  Existing patches by Marcelo, can be
>> restructured to use common core by Marcelo.
> 
> Can use your proposed block_stream interface, with a "block_switch"
> command on top, so:
> 
> 1) management creates copy.img with backing file current.img, allows
> access
> 2) management issues "block_switch dev copy.img"
> 3) management issues "block_stream dev base"

Isn't this block_switch command the same as the existing snapshot_blkdev?

> Thought of implementing "block_stream" command by reopening device with
> 
> blkstream:imagename.img
> 
> Then:
> 
> AIO_READ:
> - for each cluster in request:
>     - if allocated-or-in-final-base, read.
>     - check write queue, if present wait on it, if not, add "copy"
>       entry to write queue.
>     - issue cluster sized read from source.
>     - on completion:
>         - copy data to original read buffer, complete it.
>         - if not cancelled, write cluster to destination.
> 
> AIO_WRITE
> for each cluster in request:
>     - check write queue, cancel/wait for "copy" entry.
>     - add "guest" entry to write queue.
>     - issue write to destination.
>     - on completion:
>         - remove write queue entry.
> 
> 
> With the 0...END background read, once it completes write final base
> file for image.
> 
> So block_stream/block_stream_cancel/block_stream_status commands, the
> background read and the rebase -u update can be separate from the block
> driver.

The way it works looks good to me; I'm just not entirely sure about
the right place to implement it.
read could be useful outside blkstream, too.

Kevin

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [Qemu-devel] KVM call agenda for June 28
  2011-06-30 14:52                 ` [Qemu-devel] " Kevin Wolf
  (?)
@ 2011-06-30 18:38                 ` Marcelo Tosatti
  2011-07-05  8:01                   ` Dor Laor
  -1 siblings, 1 reply; 55+ messages in thread
From: Marcelo Tosatti @ 2011-06-30 18:38 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Chris Wright, KVM devel mailing list, quintela, Stefan Hajnoczi,
	Dor Laor, qemu-devel, Avi Kivity

On Thu, Jun 30, 2011 at 04:52:00PM +0200, Kevin Wolf wrote:
> Am 30.06.2011 16:36, schrieb Marcelo Tosatti:
> >> 4. Live block copy API and high-level control - the main code that
> >> adds the live block copy feature.  Existing patches by Marcelo, can be
> >> restructured to use common core by Marcelo.
> > 
> > Can use your proposed block_stream interface, with a "block_switch"
> > command on top, so:
> > 
> > 1) management creates copy.img with backing file current.img, allows
> > access
> > 2) management issues "block_switch dev copy.img"
> > 3) management issues "block_stream dev base"
> 
> Isn't this block_switch command the same as the existing snapshot_blkdev?

Yep.

> > Thought of implementing "block_stream" command by reopening device with
> > 
> > blkstream:imagename.img
> > 
> > Then:
> > 
> > AIO_READ:
> > - for each cluster in request:
> >     - if allocated-or-in-final-base, read.
> >     - check write queue, if present wait on it, if not, add "copy"
> >       entry to write queue.
> >     - issue cluster sized read from source.
> >     - on completion:
> >         - copy data to original read buffer, complete it.
> >         - if not cancelled, write cluster to destination.
> > 
> > AIO_WRITE
> > for each cluster in request:
> >     - check write queue, cancel/wait for "copy" entry.
> >     - add "guest" entry to write queue.
> >     - issue write to destination.
> >     - on completion:
> >         - remove write queue entry.
> > 
> > 
> > With the 0...END background read, once it completes write final base
> > file for image.
> > 
> > So block_stream/block_stream_cancel/block_stream_status commands, the
> > background read and the rebase -u update can be separate from the block
> > driver.
> 
> The way how it works looks good to me, I'm just not entirely sure about
> the right place to implement it. I think request queueing and copy on
> read could be useful outside blkstream, too.

They could be lifted later, when there are other users.


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [Qemu-devel] KVM call agenda for June 28
  2011-06-30 18:38                 ` Marcelo Tosatti
@ 2011-07-05  8:01                   ` Dor Laor
  2011-07-05 12:40                       ` Stefan Hajnoczi
  0 siblings, 1 reply; 55+ messages in thread
From: Dor Laor @ 2011-07-05  8:01 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: Kevin Wolf, Chris Wright, KVM devel mailing list, quintela,
	Stefan Hajnoczi, qemu-devel, Avi Kivity, jes sorensen

I tried to re-arrange all of the requirements and use cases using this 
wiki page: http://wiki.qemu.org/Features/LiveBlockMigration

It would be best to agree upon the most interesting use cases (while 
making sure we cover future ones).
The next step is to set the interface for all the various verbs since 
the implementation seems to be converging.

Cheers,
Dor

On 06/30/2011 09:38 PM, Marcelo Tosatti wrote:
> On Thu, Jun 30, 2011 at 04:52:00PM +0200, Kevin Wolf wrote:
>> Am 30.06.2011 16:36, schrieb Marcelo Tosatti:
>>>> 4. Live block copy API and high-level control - the main code that
>>>> adds the live block copy feature.  Existing patches by Marcelo, can be
>>>> restructured to use common core by Marcelo.
>>>
>>> Can use your proposed block_stream interface, with a "block_switch"
>>> command on top, so:
>>>
>>> 1) management creates copy.img with backing file current.img, allows
>>> access
>>> 2) management issues "block_switch dev copy.img"
>>> 3) management issues "block_stream dev base"
>>
>> Isn't this block_switch command the same as the existing snapshot_blkdev?
>
> Yep.
>
>>> Thought of implementing "block_stream" command by reopening device with
>>>
>>> blkstream:imagename.img
>>>
>>> Then:
>>>
>>> AIO_READ:
>>> - for each cluster in request:
>>>      - if allocated-or-in-final-base, read.
>>>      - check write queue, if present wait on it, if not, add "copy"
>>>        entry to write queue.
>>>      - issue cluster sized read from source.
>>>      - on completion:
>>>          - copy data to original read buffer, complete it.
>>>          - if not cancelled, write cluster to destination.
>>>
>>> AIO_WRITE
>>> for each cluster in request:
>>>      - check write queue, cancel/wait for "copy" entry.
>>>      - add "guest" entry to write queue.
>>>      - issue write to destination.
>>>      - on completion:
>>>          - remove write queue entry.
>>>
>>>
>>> With the 0...END background read, once it completes write final base
>>> file for image.
>>>
>>> So block_stream/block_stream_cancel/block_stream_status commands, the
>>> background read and the rebase -u update can be separate from the block
>>> driver.
>>
>> The way how it works looks good to me, I'm just not entirely sure about
>> the right place to implement it. I think request queueing and copy on
>> read could be useful outside blkstream, too.
>
> They could be lifted later, when there are other users.
>


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [Qemu-devel] KVM call agenda for June 28
  2011-07-05  8:01                   ` Dor Laor
@ 2011-07-05 12:40                       ` Stefan Hajnoczi
  0 siblings, 0 replies; 55+ messages in thread
From: Stefan Hajnoczi @ 2011-07-05 12:40 UTC (permalink / raw)
  To: dlaor
  Cc: Marcelo Tosatti, Kevin Wolf, Chris Wright,
	KVM devel mailing list, quintela, qemu-devel, Avi Kivity,
	jes sorensen

On Tue, Jul 5, 2011 at 9:01 AM, Dor Laor <dlaor@redhat.com> wrote:
> I tried to re-arrange all of the requirements and use cases using this wiki
> page: http://wiki.qemu.org/Features/LiveBlockMigration
>
> It would be the best to agree upon the most interesting use cases (while we
> make sure we cover future ones) and agree to them.
> The next step is to set the interface for all the various verbs since the
> implementation seems to be converging.

Live block copy was supposed to support snapshot merge.  I think the
current favored approach is to make the source image a backing file to
the destination image and essentially do image streaming.

Using this mechanism for snapshot merge is tricky.  The COW file
already uses the read-only snapshot base image.  So now we cannot
trivially copy the COW file contents back into the snapshot base image
using live block copy.

It seems like snapshot merge will require dedicated code that reads
the allocated clusters from the COW file and writes them back into the
base image.
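
A minimal offline C sketch of such dedicated merge code; as a big
simplification it treats any non-zero cluster in the overlay as
allocated (real code would ask the format driver for the allocation
map), and the 64k cluster size and file arguments are assumptions:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>

#define CLUSTER_SIZE (64 * 1024)

int main(int argc, char **argv)
{
    if (argc != 3) {
        fprintf(stderr, "usage: %s overlay base\n", argv[0]);
        return 1;
    }

    int cow = open(argv[1], O_RDONLY);
    int base = open(argv[2], O_WRONLY);
    char *buf = malloc(CLUSTER_SIZE);
    char *zero = calloc(1, CLUSTER_SIZE);
    off_t off = 0;
    ssize_t n;

    if (cow < 0 || base < 0 || !buf || !zero)
        return 1;

    /* Copy every "allocated" cluster of the overlay back into the
     * base image at the same offset. */
    while ((n = pread(cow, buf, CLUSTER_SIZE, off)) > 0) {
        if (memcmp(buf, zero, n) != 0) {
            if (pwrite(base, buf, n, off) != n) {
                perror("pwrite");
                return 1;
            }
        }
        off += n;
    }

    free(buf);
    free(zero);
    close(cow);
    close(base);
    return 0;
}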

A very inefficient alternative would be to create a third image, the
"merge" image file, which has the COW file as its backing file:
snapshot (base) -> cow -> merge

All data from snapshot and cow is copied into merge and then snapshot
and cow can be deleted.  But this approach results in full data
copying and uses potentially 3x space if cow is close to the size of
snapshot.

Any other ideas that reuse live block copy for snapshot merge?

Stefan

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [Qemu-devel] KVM call agenda for June 28
  2011-07-05 12:40                       ` Stefan Hajnoczi
@ 2011-07-05 12:58                         ` Marcelo Tosatti
  -1 siblings, 0 replies; 55+ messages in thread
From: Marcelo Tosatti @ 2011-07-05 12:58 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: dlaor, Kevin Wolf, Chris Wright, KVM devel mailing list,
	quintela, qemu-devel, Avi Kivity, jes sorensen

On Tue, Jul 05, 2011 at 01:40:08PM +0100, Stefan Hajnoczi wrote:
> On Tue, Jul 5, 2011 at 9:01 AM, Dor Laor <dlaor@redhat.com> wrote:
> > I tried to re-arrange all of the requirements and use cases using this wiki
> > page: http://wiki.qemu.org/Features/LiveBlockMigration
> >
> > It would be the best to agree upon the most interesting use cases (while we
> > make sure we cover future ones) and agree to them.
> > The next step is to set the interface for all the various verbs since the
> > implementation seems to be converging.
> 
> Live block copy was supposed to support snapshot merge.  I think the
> current favored approach is to make the source image a backing file to
> the destination image and essentially do image streaming.
> 
> Using this mechanism for snapshot merge is tricky.  The COW file
> already uses the read-only snapshot base image.  So now we cannot
> trivally copy the COW file contents back into the snapshot base image
> using live block copy.

It never did. Live copy creates a new image where both snapshot and
"current" are copied to.

This is similar to image streaming.

> It seems like snapshot merge will require dedicated code that reads
> the allocated clusters from the COW file and writes them back into the
> base image.
> 
> A very inefficient alternative would be to create a third image, the
> "merge" image file, which has the COW file as its backing file:
> snapshot (base) -> cow -> merge
> 
> All data from snapshot and cow is copied into merge and then snapshot
> and cow can be deleted.  But this approach is results in full data
> copying and uses potentially 3x space if cow is close to the size of
> snapshot.

Management can set an upper limit on the size of data that is merged,
and create a new base once that limit is exceeded. This avoids copying
excessive amounts of data.

> Any other ideas that reuse live block copy for snapshot merge?
> 
> Stefan

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [Qemu-devel] KVM call agenda for June 28
  2011-07-05 12:58                         ` Marcelo Tosatti
@ 2011-07-05 13:39                           ` Dor Laor
  -1 siblings, 0 replies; 55+ messages in thread
From: Dor Laor @ 2011-07-05 13:39 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: Stefan Hajnoczi, Kevin Wolf, Chris Wright,
	KVM devel mailing list, quintela, jes sorensen, qemu-devel,
	Avi Kivity

On 07/05/2011 03:58 PM, Marcelo Tosatti wrote:
> On Tue, Jul 05, 2011 at 01:40:08PM +0100, Stefan Hajnoczi wrote:
>> On Tue, Jul 5, 2011 at 9:01 AM, Dor Laor<dlaor@redhat.com>  wrote:
>>> I tried to re-arrange all of the requirements and use cases using this wiki
>>> page: http://wiki.qemu.org/Features/LiveBlockMigration
>>>
>>> It would be the best to agree upon the most interesting use cases (while we
>>> make sure we cover future ones) and agree to them.
>>> The next step is to set the interface for all the various verbs since the
>>> implementation seems to be converging.
>>
>> Live block copy was supposed to support snapshot merge.  I think the
>> current favored approach is to make the source image a backing file to
>> the destination image and essentially do image streaming.
>>
>> Using this mechanism for snapshot merge is tricky.  The COW file
>> already uses the read-only snapshot base image.  So now we cannot
>> trivally copy the COW file contents back into the snapshot base image
>> using live block copy.
>
> It never did. Live copy creates a new image were both snapshot and
> "current" are copied to.
>
> This is similar with image streaming.

I'm not sure I see what's bad about doing an in-place merge:

Let's suppose we have this COW chain:

   base <-- s1 <-- s2

Now a live snapshot is created over s2, s2 becomes RO and s3 is RW:

   base <-- s1 <-- s2 <-- s3

Now we're done with s2 (post backup) and would like to merge s3 into s2.

With your approach we use live copy of s3 into newSnap:

   base <-- s1 <-- s2 <-- s3
   base <-- s1 <-- newSnap

When it is over, s2 and s3 can be erased.
The downside is the I/O for copying s2's data and the temporary storage.
I guess temp storage is cheap but excessive I/O is expensive.

My approach was to collapse s3 into s2 and erase s3 eventually:

before: base <-- s1 <-- s2 <-- s3
after:  base <-- s1 <-- s2

If we use live block copy using the mirror driver it should be safe as
long as we keep the ordering of new writes into s3 during the execution.
Even a failure in the middle won't cause harm since the management
will keep using s3 until it gets a success event.

>
>> It seems like snapshot merge will require dedicated code that reads
>> the allocated clusters from the COW file and writes them back into the
>> base image.
>>
>> A very inefficient alternative would be to create a third image, the
>> "merge" image file, which has the COW file as its backing file:
>> snapshot (base) ->  cow ->  merge
>>
>> All data from snapshot and cow is copied into merge and then snapshot
>> and cow can be deleted.  But this approach is results in full data
>> copying and uses potentially 3x space if cow is close to the size of
>> snapshot.
>
> Management can set a higher limit on the size of data that is merged,
> and create a new base once exceeded. This avoids copying excessive
> amounts of data.
>
>> Any other ideas that reuse live block copy for snapshot merge?
>>
>> Stefan
>
>


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: KVM call agenda for June 28
  2011-07-05 13:39                           ` Dor Laor
@ 2011-07-05 14:29                             ` Marcelo Tosatti
  -1 siblings, 0 replies; 55+ messages in thread
From: Marcelo Tosatti @ 2011-07-05 14:29 UTC (permalink / raw)
  To: Dor Laor
  Cc: Kevin Wolf, Chris Wright, KVM devel mailing list, quintela,
	Stefan Hajnoczi, qemu-devel, Avi Kivity, jes sorensen

On Tue, Jul 05, 2011 at 04:39:06PM +0300, Dor Laor wrote:
> On 07/05/2011 03:58 PM, Marcelo Tosatti wrote:
> >On Tue, Jul 05, 2011 at 01:40:08PM +0100, Stefan Hajnoczi wrote:
> >>On Tue, Jul 5, 2011 at 9:01 AM, Dor Laor<dlaor@redhat.com>  wrote:
> >>>I tried to re-arrange all of the requirements and use cases using this wiki
> >>>page: http://wiki.qemu.org/Features/LiveBlockMigration
> >>>
> >>>It would be the best to agree upon the most interesting use cases (while we
> >>>make sure we cover future ones) and agree to them.
> >>>The next step is to set the interface for all the various verbs since the
> >>>implementation seems to be converging.
> >>
> >>Live block copy was supposed to support snapshot merge.  I think the
> >>current favored approach is to make the source image a backing file to
> >>the destination image and essentially do image streaming.
> >>
> >>Using this mechanism for snapshot merge is tricky.  The COW file
> >>already uses the read-only snapshot base image.  So now we cannot
> >>trivally copy the COW file contents back into the snapshot base image
> >>using live block copy.
> >
> >It never did. Live copy creates a new image were both snapshot and
> >"current" are copied to.
> >
> >This is similar with image streaming.
> 
> Not sure I realize what's bad to do in-place merge:
> 
> Let's suppose we have this COW chain:
> 
>   base <-- s1 <-- s2
> 
> Now a live snapshot is created over s2, s2 becomes RO and s3 is RW:
> 
>   base <-- s1 <-- s2 <-- s3
> 
> Now we've done with s2 (post backup) and like to merge s3 into s2.
> 
> With your approach we use live copy of s3 into newSnap:
> 
>   base <-- s1 <-- s2 <-- s3
>   base <-- s1 <-- newSnap
> 
> When it is over s2 and s3 can be erased.
> The down side is the IOs for copying s2 data and the temporary
> storage. I guess temp storage is cheap but excessive IO are
> expensive.
> 
> My approach was to collapse s3 into s2 and erase s3 eventually:
> 
> before: base <-- s1 <-- s2 <-- s3
> after:  base <-- s1 <-- s2
> 
> If we use live block copy using mirror driver it should be safe as
> long as we keep the ordering of new writes into s3 during the
> execution.
> Even a failure in the the middle won't cause harm since the
> management will keep using s3 until it gets success event.

Well, it is more complicated than simply streaming into a new 
image. I'm not entirely sure it is necessary. The common case is:

base -> sn-1 -> sn-2 -> ... -> sn-n

When n reaches a limit, you do:

base -> merge-1

You're potentially copying a similar amount of data when merging back
into a single image (and worse, you can't easily merge multiple
snapshots).

If the amount of data that's not in 'base' is large, you leave a new
external file around:

base -> merge-1 -> sn-1 -> sn-2 ... -> sn-n
to
base -> merge-1 -> merge-2
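
A toy C sketch of that policy, with made-up per-snapshot sizes and an
assumed 4 GB limit on the data copied per merge target, just to show
how the chain above 'base' would be cut into merge-1, merge-2, ...:

#include <stdio.h>

#define LIMIT (4ULL * 1024 * 1024 * 1024)   /* max data per merge target */

int main(void)
{
    /* data unique to each sn-i, oldest first (made-up numbers) */
    unsigned long long sn_size[] = {
        1ULL << 30, 2ULL << 30, 3ULL << 30, 512ULL << 20, 1ULL << 30,
    };
    int n = sizeof(sn_size) / sizeof(sn_size[0]);
    unsigned long long acc = 0;
    int targets = 1;
    int i;

    for (i = 0; i < n; i++) {
        if (acc && acc + sn_size[i] > LIMIT) {
            acc = 0;            /* limit reached: start a new merge file */
            targets++;
        }
        acc += sn_size[i];
    }

    printf("base");
    for (i = 1; i <= targets; i++)
        printf(" -> merge-%d", i);
    printf("\n");
    return 0;
}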

> >>It seems like snapshot merge will require dedicated code that reads
> >>the allocated clusters from the COW file and writes them back into the
> >>base image.
> >>
> >>A very inefficient alternative would be to create a third image, the
> >>"merge" image file, which has the COW file as its backing file:
> >>snapshot (base) ->  cow ->  merge
> >>
> >>All data from snapshot and cow is copied into merge and then snapshot
> >>and cow can be deleted.  But this approach is results in full data
> >>copying and uses potentially 3x space if cow is close to the size of
> >>snapshot.

Remember there is a 'base' before the snapshot; you don't copy the
entire OS installation.

> >
> >Management can set a higher limit on the size of data that is merged,
> >and create a new base once exceeded. This avoids copying excessive
> >amounts of data.
> >
> >>Any other ideas that reuse live block copy for snapshot merge?
> >>
> >>Stefan
> >
> >

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [Qemu-devel] KVM call agenda for June 28
  2011-07-05 13:39                           ` Dor Laor
@ 2011-07-05 14:32                             ` Marcelo Tosatti
  -1 siblings, 0 replies; 55+ messages in thread
From: Marcelo Tosatti @ 2011-07-05 14:32 UTC (permalink / raw)
  To: Dor Laor
  Cc: Stefan Hajnoczi, Kevin Wolf, Chris Wright,
	KVM devel mailing list, quintela, jes sorensen, qemu-devel,
	Avi Kivity

On Tue, Jul 05, 2011 at 04:39:06PM +0300, Dor Laor wrote:
> On 07/05/2011 03:58 PM, Marcelo Tosatti wrote:
> >On Tue, Jul 05, 2011 at 01:40:08PM +0100, Stefan Hajnoczi wrote:
> >>On Tue, Jul 5, 2011 at 9:01 AM, Dor Laor<dlaor@redhat.com>  wrote:
> >>>I tried to re-arrange all of the requirements and use cases using this wiki
> >>>page: http://wiki.qemu.org/Features/LiveBlockMigration
> >>>
> >>>It would be the best to agree upon the most interesting use cases (while we
> >>>make sure we cover future ones) and agree to them.
> >>>The next step is to set the interface for all the various verbs since the
> >>>implementation seems to be converging.
> >>
> >>Live block copy was supposed to support snapshot merge.  I think the
> >>current favored approach is to make the source image a backing file to
> >>the destination image and essentially do image streaming.
> >>
> >>Using this mechanism for snapshot merge is tricky.  The COW file
> >>already uses the read-only snapshot base image.  So now we cannot
> >>trivally copy the COW file contents back into the snapshot base image
> >>using live block copy.
> >
> >It never did. Live copy creates a new image were both snapshot and
> >"current" are copied to.
> >
> >This is similar with image streaming.
> 
> Not sure I realize what's bad to do in-place merge:
> 
> Let's suppose we have this COW chain:
> 
>   base <-- s1 <-- s2
> 
> Now a live snapshot is created over s2, s2 becomes RO and s3 is RW:
> 
>   base <-- s1 <-- s2 <-- s3
> 
> Now we've done with s2 (post backup) and like to merge s3 into s2.
> 
> With your approach we use live copy of s3 into newSnap:
> 
>   base <-- s1 <-- s2 <-- s3
>   base <-- s1 <-- newSnap
> 
> When it is over s2 and s3 can be erased.
> The down side is the IOs for copying s2 data and the temporary
> storage. I guess temp storage is cheap but excessive IO are
> expensive.
> 
> My approach was to collapse s3 into s2 and erase s3 eventually:
> 
> before: base <-- s1 <-- s2 <-- s3
> after:  base <-- s1 <-- s2
> 
> If we use live block copy using mirror driver it should be safe as
> long as we keep the ordering of new writes into s3 during the
> execution.
> Even a failure in the the middle won't cause harm since the
> management will keep using s3 until it gets success event.

Well, it is more complicated than simply streaming into a new
image. I'm not entirely sure it is necessary. The common case is:

base -> sn-1 -> sn-2 -> ... -> sn-n

When n reaches a limit, you do:

base -> merge-1

You're potentially copying a similar amount of data when merging back into
a single image (and you can't easily merge multiple snapshots).

If the amount of data that's not in 'base' is large, you leave a new
external file around:

base -> merge-1 -> sn-1 -> sn-2 ... -> sn-n
to
base -> merge-1 -> merge-2

> >
> >>It seems like snapshot merge will require dedicated code that reads
> >>the allocated clusters from the COW file and writes them back into the
> >>base image.
> >>
> >>A very inefficient alternative would be to create a third image, the
> >>"merge" image file, which has the COW file as its backing file:
> >>snapshot (base) ->  cow ->  merge

Remember there is a 'base' before the snapshot; you don't copy the entire
image.

> >>
> >>All data from snapshot and cow is copied into merge and then snapshot
> >>and cow can be deleted.  But this approach is results in full data
> >>copying and uses potentially 3x space if cow is close to the size of
> >>snapshot.
> >
> >Management can set a higher limit on the size of data that is merged,
> >and create a new base once exceeded. This avoids copying excessive
> >amounts of data.
> >
> >>Any other ideas that reuse live block copy for snapshot merge?
> >>
> >>Stefan
> >
> >

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [Qemu-devel] KVM call agenda for June 28
  2011-07-05 14:32                             ` Marcelo Tosatti
@ 2011-07-05 14:46                               ` Kevin Wolf
  -1 siblings, 0 replies; 55+ messages in thread
From: Kevin Wolf @ 2011-07-05 14:46 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: Dor Laor, Stefan Hajnoczi, Chris Wright, KVM devel mailing list,
	quintela, jes sorensen, qemu-devel, Avi Kivity

Am 05.07.2011 16:32, schrieb Marcelo Tosatti:
> On Tue, Jul 05, 2011 at 04:39:06PM +0300, Dor Laor wrote:
>> On 07/05/2011 03:58 PM, Marcelo Tosatti wrote:
>>> On Tue, Jul 05, 2011 at 01:40:08PM +0100, Stefan Hajnoczi wrote:
>>>> On Tue, Jul 5, 2011 at 9:01 AM, Dor Laor<dlaor@redhat.com>  wrote:
>>>>> I tried to re-arrange all of the requirements and use cases using this wiki
>>>>> page: http://wiki.qemu.org/Features/LiveBlockMigration
>>>>>
>>>>> It would be the best to agree upon the most interesting use cases (while we
>>>>> make sure we cover future ones) and agree to them.
>>>>> The next step is to set the interface for all the various verbs since the
>>>>> implementation seems to be converging.
>>>>
>>>> Live block copy was supposed to support snapshot merge.  I think the
>>>> current favored approach is to make the source image a backing file to
>>>> the destination image and essentially do image streaming.
>>>>
>>>> Using this mechanism for snapshot merge is tricky.  The COW file
>>>> already uses the read-only snapshot base image.  So now we cannot
>>>> trivally copy the COW file contents back into the snapshot base image
>>>> using live block copy.
>>>
>>> It never did. Live copy creates a new image were both snapshot and
>>> "current" are copied to.
>>>
>>> This is similar with image streaming.
>>
>> Not sure I realize what's bad to do in-place merge:
>>
>> Let's suppose we have this COW chain:
>>
>>   base <-- s1 <-- s2
>>
>> Now a live snapshot is created over s2, s2 becomes RO and s3 is RW:
>>
>>   base <-- s1 <-- s2 <-- s3
>>
>> Now we've done with s2 (post backup) and like to merge s3 into s2.
>>
>> With your approach we use live copy of s3 into newSnap:
>>
>>   base <-- s1 <-- s2 <-- s3
>>   base <-- s1 <-- newSnap
>>
>> When it is over s2 and s3 can be erased.
>> The down side is the IOs for copying s2 data and the temporary
>> storage. I guess temp storage is cheap but excessive IO are
>> expensive.
>>
>> My approach was to collapse s3 into s2 and erase s3 eventually:
>>
>> before: base <-- s1 <-- s2 <-- s3
>> after:  base <-- s1 <-- s2
>>
>> If we use live block copy using mirror driver it should be safe as
>> long as we keep the ordering of new writes into s3 during the
>> execution.
>> Even a failure in the the middle won't cause harm since the
>> management will keep using s3 until it gets success event.
> 
> Well, it is more complicated than simply streaming into a new
> image. I'm not entirely sure it is necessary. The common case is:
> 
> base -> sn-1 -> sn-2 -> ... -> sn-n
> 
> When n reaches a limit, you do:
> 
> base -> merge-1

Hm, I would expect that a case like this is important, too:

base <- sn-1 <- ... <- sn-n-1 <- sn-n <- ... <- sn-m

Which should be merged so that we get the following (i.e. deleting older
snapshots but retaining more recent ones):

base <- sn-merged <- sn-n <- ... <- sn-m
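
A small C sketch of that chain bookkeeping (no data is moved), using an
invented node/collapse_below representation of the backing chain to
show the older snapshots being collapsed into sn-merged while the more
recent ones are kept:

#include <stdio.h>
#include <stdlib.h>

struct node {
    char name[32];
    struct node *backing;           /* towards base */
};

static struct node *mknode(const char *name, struct node *backing)
{
    struct node *n = calloc(1, sizeof(*n));

    if (!n)
        exit(1);
    snprintf(n->name, sizeof(n->name), "%s", name);
    n->backing = backing;
    return n;
}

/* Replace everything strictly between base and keep_from with a single
 * placeholder node called sn-merged. */
static void collapse_below(struct node *keep_from, struct node *base)
{
    struct node *cur = keep_from->backing;

    keep_from->backing = mknode("sn-merged", base);
    while (cur && cur != base) {
        struct node *next = cur->backing;
        free(cur);
        cur = next;
    }
}

static void print_chain(struct node *n)
{
    if (n->backing) {
        print_chain(n->backing);
        printf(" <- ");
    }
    printf("%s", n->name);
}

int main(void)
{
    struct node *base = mknode("base", NULL);
    struct node *sn1  = mknode("sn-1", base);
    struct node *sn2  = mknode("sn-2", sn1);
    struct node *sn3  = mknode("sn-3", sn2);    /* keep sn-3 and above */
    struct node *top  = mknode("sn-4", sn3);

    collapse_below(sn3, base);
    print_chain(top);
    printf("\n");       /* prints: base <- sn-merged <- sn-3 <- sn-4 */
    return 0;
}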

Kevin

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [Qemu-devel] KVM call agenda for June 28
  2011-07-05 14:32                             ` Marcelo Tosatti
@ 2011-07-05 15:04                               ` Dor Laor
  -1 siblings, 0 replies; 55+ messages in thread
From: Dor Laor @ 2011-07-05 15:04 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: Stefan Hajnoczi, Kevin Wolf, Chris Wright,
	KVM devel mailing list, quintela, jes sorensen, qemu-devel,
	Avi Kivity

On 07/05/2011 05:32 PM, Marcelo Tosatti wrote:
> On Tue, Jul 05, 2011 at 04:39:06PM +0300, Dor Laor wrote:
>> On 07/05/2011 03:58 PM, Marcelo Tosatti wrote:
>>> On Tue, Jul 05, 2011 at 01:40:08PM +0100, Stefan Hajnoczi wrote:
>>>> On Tue, Jul 5, 2011 at 9:01 AM, Dor Laor<dlaor@redhat.com>   wrote:
>>>>> I tried to re-arrange all of the requirements and use cases using this wiki
>>>>> page: http://wiki.qemu.org/Features/LiveBlockMigration
>>>>>
>>>>> It would be the best to agree upon the most interesting use cases (while we
>>>>> make sure we cover future ones) and agree to them.
>>>>> The next step is to set the interface for all the various verbs since the
>>>>> implementation seems to be converging.
>>>>
>>>> Live block copy was supposed to support snapshot merge.  I think the
>>>> current favored approach is to make the source image a backing file to
>>>> the destination image and essentially do image streaming.
>>>>
>>>> Using this mechanism for snapshot merge is tricky.  The COW file
>>>> already uses the read-only snapshot base image.  So now we cannot
>>>> trivally copy the COW file contents back into the snapshot base image
>>>> using live block copy.
>>>
>>> It never did. Live copy creates a new image were both snapshot and
>>> "current" are copied to.
>>>
>>> This is similar with image streaming.
>>
>> Not sure I realize what's bad to do in-place merge:
>>
>> Let's suppose we have this COW chain:
>>
>>    base<-- s1<-- s2
>>
>> Now a live snapshot is created over s2, s2 becomes RO and s3 is RW:
>>
>>    base<-- s1<-- s2<-- s3
>>
>> Now we've done with s2 (post backup) and like to merge s3 into s2.
>>
>> With your approach we use live copy of s3 into newSnap:
>>
>>    base<-- s1<-- s2<-- s3
>>    base<-- s1<-- newSnap
>>
>> When it is over s2 and s3 can be erased.
>> The down side is the IOs for copying s2 data and the temporary
>> storage. I guess temp storage is cheap but excessive IO are
>> expensive.
>>
>> My approach was to collapse s3 into s2 and erase s3 eventually:
>>
>> before: base<-- s1<-- s2<-- s3
>> after:  base<-- s1<-- s2
>>
>> If we use live block copy using mirror driver it should be safe as
>> long as we keep the ordering of new writes into s3 during the
>> execution.
>> Even a failure in the the middle won't cause harm since the
>> management will keep using s3 until it gets success event.
>
> Well, it is more complicated than simply streaming into a new
> image. I'm not entirely sure it is necessary. The common case is:
>
> base ->  sn-1 ->  sn-2 ->  ... ->  sn-n
>
> When n reaches a limit, you do:
>
> base ->  merge-1
>
> You're potentially copying similar amount of data when merging back into
> a single image (and you can't easily merge multiple snapshots).
>
> If the amount of data thats not in 'base' is large, you create
> leave a new external file around:
>
> base ->  merge-1 ->  sn-1 ->  sn-2 ... ->  sn-n
> to
> base ->  merge-1 ->  merge-2

Sometimes one will want to merge the snapshot immediately after the base
was backed up.

>
>>>
>>>> It seems like snapshot merge will require dedicated code that reads
>>>> the allocated clusters from the COW file and writes them back into the
>>>> base image.
>>>>
>>>> A very inefficient alternative would be to create a third image, the
>>>> "merge" image file, which has the COW file as its backing file:
>>>> snapshot (base) ->   cow ->   merge
>
> Remember there is a 'base' before snapshot, you don't copy the entire
> image.

Not always; the image might be a raw file/device:

1. raw image
2. live snapshot it and use COW above it
    raw <- s1
3. back up the raw image using a 3rd-party mechanism
4. live merge (copy) s1 into raw

>
>>>>
>>>> All data from snapshot and cow is copied into merge and then snapshot
>>>> and cow can be deleted.  But this approach is results in full data
>>>> copying and uses potentially 3x space if cow is close to the size of
>>>> snapshot.
>>>
>>> Management can set a higher limit on the size of data that is merged,
>>> and create a new base once exceeded. This avoids copying excessive
>>> amounts of data.
>>>
>>>> Any other ideas that reuse live block copy for snapshot merge?
>>>>
>>>> Stefan
>>>
>>>


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [Qemu-devel] KVM call agenda for June 28
  2011-07-05 15:04                               ` Dor Laor
@ 2011-07-05 15:29                                 ` Marcelo Tosatti
  -1 siblings, 0 replies; 55+ messages in thread
From: Marcelo Tosatti @ 2011-07-05 15:29 UTC (permalink / raw)
  To: Dor Laor
  Cc: Stefan Hajnoczi, Kevin Wolf, Chris Wright,
	KVM devel mailing list, quintela, jes sorensen, qemu-devel,
	Avi Kivity

On Tue, Jul 05, 2011 at 06:04:34PM +0300, Dor Laor wrote:
> On 07/05/2011 05:32 PM, Marcelo Tosatti wrote:
> >On Tue, Jul 05, 2011 at 04:39:06PM +0300, Dor Laor wrote:
> >>On 07/05/2011 03:58 PM, Marcelo Tosatti wrote:
> >>>On Tue, Jul 05, 2011 at 01:40:08PM +0100, Stefan Hajnoczi wrote:
> >>>>On Tue, Jul 5, 2011 at 9:01 AM, Dor Laor<dlaor@redhat.com>   wrote:
> >>>>>I tried to re-arrange all of the requirements and use cases using this wiki
> >>>>>page: http://wiki.qemu.org/Features/LiveBlockMigration
> >>>>>
> >>>>>It would be the best to agree upon the most interesting use cases (while we
> >>>>>make sure we cover future ones) and agree to them.
> >>>>>The next step is to set the interface for all the various verbs since the
> >>>>>implementation seems to be converging.
> >>>>
> >>>>Live block copy was supposed to support snapshot merge.  I think the
> >>>>current favored approach is to make the source image a backing file to
> >>>>the destination image and essentially do image streaming.
> >>>>
> >>>>Using this mechanism for snapshot merge is tricky.  The COW file
> >>>>already uses the read-only snapshot base image.  So now we cannot
> >>>>trivially copy the COW file contents back into the snapshot base image
> >>>>using live block copy.
> >>>
> >>>It never did. Live copy creates a new image where both snapshot and
> >>>"current" are copied to.
> >>>
> >>>This is similar to image streaming.
> >>
> >>Not sure I see what's bad about doing an in-place merge:
> >>
> >>Let's suppose we have this COW chain:
> >>
> >>   base<-- s1<-- s2
> >>
> >>Now a live snapshot is created over s2, s2 becomes RO and s3 is RW:
> >>
> >>   base<-- s1<-- s2<-- s3
> >>
> >>Now we're done with s2 (post backup) and would like to merge s3 into s2.
> >>
> >>With your approach we use live copy of s3 into newSnap:
> >>
> >>   base<-- s1<-- s2<-- s3
> >>   base<-- s1<-- newSnap
> >>
> >>When it is over, s2 and s3 can be erased.
> >>The downside is the I/O for copying s2's data and the temporary
> >>storage. I guess temp storage is cheap but excessive I/O is
> >>expensive.
> >>
> >>My approach was to collapse s3 into s2 and erase s3 eventually:
> >>
> >>before: base<-- s1<-- s2<-- s3
> >>after:  base<-- s1<-- s2
> >>
> >>If we use live block copy using mirror driver it should be safe as
> >>long as we keep the ordering of new writes into s3 during the
> >>execution.
> >>Even a failure in the middle won't cause harm since the
> >>management will keep using s3 until it gets a success event.
> >
> >Well, it is more complicated than simply streaming into a new
> >image. I'm not entirely sure it is necessary. The common case is:
> >
> >base ->  sn-1 ->  sn-2 ->  ... ->  sn-n
> >
> >When n reaches a limit, you do:
> >
> >base ->  merge-1
> >
> >You're potentially copying a similar amount of data when merging back into
> >a single image (and you can't easily merge multiple snapshots).
> >
> >If the amount of data that's not in 'base' is large, you leave
> >a new external file around:
> >
> >base ->  merge-1 ->  sn-1 ->  sn-2 ... ->  sn-n
> >to
> >base ->  merge-1 ->  merge-2
> 
> Sometimes one will want to merge the snapshot immediately after the
> base has been backed up.

Well, ok, this needs a separate interface for management, needs write
mirroring, and must take crash handling into account.
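
(A rough sketch of the write-mirroring idea in plain Python, just to pin
down the ordering rule. This is not QEMU's mirror driver and all names
are made up: while s3 is collapsed into s2, new guest writes go to s3 and
are mirrored to s2, so s2 never ends up with data older than s3, and on
failure management simply keeps using s3.)

    s2 = {0: "old-0", 3: "old-3"}      # overlay being merged into (was RO)
    s3 = {0: "v1-0", 7: "v1-7"}        # active RW overlay being collapsed

    def guest_write(cluster, data):
        """Write path while the merge is in flight: s3 first, then mirror."""
        s3[cluster] = data
        s2[cluster] = data             # mirrored copy keeps s2 current

    def merge_pass():
        """Background copy of every cluster currently allocated in s3."""
        for cluster in list(s3):
            s2[cluster] = s3[cluster]

    guest_write(7, "v2-7")             # a write arriving mid-merge
    merge_pass()
    assert s2[7] == "v2-7" and s2[0] == "v1-0"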

> >>>>It seems like snapshot merge will require dedicated code that reads
> >>>>the allocated clusters from the COW file and writes them back into the
> >>>>base image.
> >>>>
> >>>>A very inefficient alternative would be to create a third image, the
> >>>>"merge" image file, which has the COW file as its backing file:
> >>>>snapshot (base) ->   cow ->   merge
> >
> >Remember there is a 'base' before snapshot, you don't copy the entire
> >image.
> 
> Not always; the image might be a raw file/device:
> 
> 1. raw image
> 2. live snapshot it and use COW above it
>    raw <- s1
> 3. back up the raw image using a 3rd-party mechanism
> 4. live merge (copy) s1 into raw
> 
> >
> >>>>
> >>>>All data from snapshot and cow is copied into merge and then snapshot
> >>>>and cow can be deleted.  But this approach results in full data
> >>>>copying and uses potentially 3x space if cow is close to the size of
> >>>>snapshot.
> >>>
> >>>Management can set a higher limit on the size of data that is merged,
> >>>and create a new base once exceeded. This avoids copying excessive
> >>>amounts of data.
> >>>
> >>>>Any other ideas that reuse live block copy for snapshot merge?
> >>>>
> >>>>Stefan
> >>>
> >>>

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [Qemu-devel] KVM call agenda for June 28
  2011-07-05 14:32                             ` Marcelo Tosatti
@ 2011-07-05 15:37                               ` Stefan Hajnoczi
  -1 siblings, 0 replies; 55+ messages in thread
From: Stefan Hajnoczi @ 2011-07-05 15:37 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: Dor Laor, Kevin Wolf, Chris Wright, KVM devel mailing list,
	quintela, jes sorensen, qemu-devel, Avi Kivity

On Tue, Jul 5, 2011 at 3:32 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> On Tue, Jul 05, 2011 at 04:39:06PM +0300, Dor Laor wrote:
>> On 07/05/2011 03:58 PM, Marcelo Tosatti wrote:
>> >On Tue, Jul 05, 2011 at 01:40:08PM +0100, Stefan Hajnoczi wrote:
>> >>On Tue, Jul 5, 2011 at 9:01 AM, Dor Laor<dlaor@redhat.com>  wrote:
>> >>>I tried to re-arrange all of the requirements and use cases using this wiki
>> >>>page: http://wiki.qemu.org/Features/LiveBlockMigration
>> >>>
>> >>>It would be the best to agree upon the most interesting use cases (while we
>> >>>make sure we cover future ones) and agree to them.
>> >>>The next step is to set the interface for all the various verbs since the
>> >>>implementation seems to be converging.
>> >>
>> >>Live block copy was supposed to support snapshot merge.  I think the
>> >>current favored approach is to make the source image a backing file to
>> >>the destination image and essentially do image streaming.
>> >>
>> >>Using this mechanism for snapshot merge is tricky.  The COW file
>> >>already uses the read-only snapshot base image.  So now we cannot
>> >>trivially copy the COW file contents back into the snapshot base image
>> >>using live block copy.
>> >
>> >It never did. Live copy creates a new image where both snapshot and
>> >"current" are copied to.
>> >
>> >This is similar to image streaming.
>>
>> Not sure I see what's bad about doing an in-place merge:
>>
>> Let's suppose we have this COW chain:
>>
>>   base <-- s1 <-- s2
>>
>> Now a live snapshot is created over s2, s2 becomes RO and s3 is RW:
>>
>>   base <-- s1 <-- s2 <-- s3
>>
>> Now we're done with s2 (post backup) and would like to merge s3 into s2.
>>
>> With your approach we use live copy of s3 into newSnap:
>>
>>   base <-- s1 <-- s2 <-- s3
>>   base <-- s1 <-- newSnap
>>
>> When it is over, s2 and s3 can be erased.
>> The downside is the I/O for copying s2's data and the temporary
>> storage. I guess temp storage is cheap but excessive I/O is
>> expensive.
>>
>> My approach was to collapse s3 into s2 and erase s3 eventually:
>>
>> before: base <-- s1 <-- s2 <-- s3
>> after:  base <-- s1 <-- s2
>>
>> If we use live block copy using mirror driver it should be safe as
>> long as we keep the ordering of new writes into s3 during the
>> execution.
>> Even a failure in the middle won't cause harm since the
>> management will keep using s3 until it gets a success event.
>
> Well, it is more complicated than simply streaming into a new
> image. I'm not entirely sure it is necessary. The common case is:
>
> base -> sn-1 -> sn-2 -> ... -> sn-n
>
> When n reaches a limit, you do:
>
> base -> merge-1
>
> You're potentially copying a similar amount of data when merging back into
> a single image (and you can't easily merge multiple snapshots).
>
> If the amount of data that's not in 'base' is large, you leave
> a new external file around:
>
> base -> merge-1 -> sn-1 -> sn-2 ... -> sn-n
> to
> base -> merge-1 -> merge-2
>
>> >
>> >>It seems like snapshot merge will require dedicated code that reads
>> >>the allocated clusters from the COW file and writes them back into the
>> >>base image.
>> >>
>> >>A very inefficient alternative would be to create a third image, the
>> >>"merge" image file, which has the COW file as its backing file:
>> >>snapshot (base) ->  cow ->  merge
>
> Remember there is a 'base' before snapshot, you don't copy the entire
> image.

One use case I have in mind is the Live Backup approach that Jagane
has been developing.  Here the backup solution only creates a snapshot
for the period of time needed to read out the dirty blocks.  Then the
snapshot is deleted again and probably contains very little new data
relative to the base image.  The backup solution does this operation
every day.

This is the pathological case for any approach that copies the entire
base into a new file.  We could have avoided a lot of I/O by doing an
in-place update.
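
(Back-of-the-envelope numbers for that daily cycle; the sizes below are
assumptions for illustration, not measurements.)

    base_size_gb   = 100.0   # assumed size of the base image
    daily_dirty_gb = 0.5     # assumed data written while the snapshot exists

    # copying base + overlay into a brand new image every day:
    full_copy_gb = base_size_gb + daily_dirty_gb
    # committing only the overlay's allocated clusters back in place:
    in_place_gb = daily_dirty_gb

    print(f"full copy per day : {full_copy_gb:.1f} GB of I/O")
    print(f"in-place commit   : {in_place_gb:.1f} GB of I/O")
    print(f"ratio             : {full_copy_gb / in_place_gb:.0f}x")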

I want to make sure this works well.

Stefan

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [Qemu-devel] KVM call agenda for June 28
  2011-07-05 15:37                               ` Stefan Hajnoczi
@ 2011-07-05 18:18                                 ` Marcelo Tosatti
  -1 siblings, 0 replies; 55+ messages in thread
From: Marcelo Tosatti @ 2011-07-05 18:18 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Dor Laor, Kevin Wolf, Chris Wright, KVM devel mailing list,
	quintela, jes sorensen, qemu-devel, Avi Kivity

On Tue, Jul 05, 2011 at 04:37:08PM +0100, Stefan Hajnoczi wrote:
> On Tue, Jul 5, 2011 at 3:32 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> > On Tue, Jul 05, 2011 at 04:39:06PM +0300, Dor Laor wrote:
> >> On 07/05/2011 03:58 PM, Marcelo Tosatti wrote:
> >> >On Tue, Jul 05, 2011 at 01:40:08PM +0100, Stefan Hajnoczi wrote:
> >> >>On Tue, Jul 5, 2011 at 9:01 AM, Dor Laor<dlaor@redhat.com>  wrote:
> >> >>>I tried to re-arrange all of the requirements and use cases using this wiki
> >> >>>page: http://wiki.qemu.org/Features/LiveBlockMigration
> >> >>>
> >> >>>It would be the best to agree upon the most interesting use cases (while we
> >> >>>make sure we cover future ones) and agree to them.
> >> >>>The next step is to set the interface for all the various verbs since the
> >> >>>implementation seems to be converging.
> >> >>
> >> >>Live block copy was supposed to support snapshot merge.  I think the
> >> >>current favored approach is to make the source image a backing file to
> >> >>the destination image and essentially do image streaming.
> >> >>
> >> >>Using this mechanism for snapshot merge is tricky.  The COW file
> >> >>already uses the read-only snapshot base image.  So now we cannot
> >> >>trivially copy the COW file contents back into the snapshot base image
> >> >>using live block copy.
> >> >
> >> >It never did. Live copy creates a new image where both snapshot and
> >> >"current" are copied to.
> >> >
> >> >This is similar to image streaming.
> >>
> >> Not sure I see what's bad about doing an in-place merge:
> >>
> >> Let's suppose we have this COW chain:
> >>
> >>   base <-- s1 <-- s2
> >>
> >> Now a live snapshot is created over s2, s2 becomes RO and s3 is RW:
> >>
> >>   base <-- s1 <-- s2 <-- s3
> >>
> >> Now we're done with s2 (post backup) and would like to merge s3 into s2.
> >>
> >> With your approach we use live copy of s3 into newSnap:
> >>
> >>   base <-- s1 <-- s2 <-- s3
> >>   base <-- s1 <-- newSnap
> >>
> >> When it is over, s2 and s3 can be erased.
> >> The downside is the I/O for copying s2's data and the temporary
> >> storage. I guess temp storage is cheap but excessive I/O is
> >> expensive.
> >>
> >> My approach was to collapse s3 into s2 and erase s3 eventually:
> >>
> >> before: base <-- s1 <-- s2 <-- s3
> >> after:  base <-- s1 <-- s2
> >>
> >> If we use live block copy using mirror driver it should be safe as
> >> long as we keep the ordering of new writes into s3 during the
> >> execution.
> >> Even a failure in the middle won't cause harm since the
> >> management will keep using s3 until it gets a success event.
> >
> > Well, it is more complicated than simply streaming into a new
> > image. I'm not entirely sure it is necessary. The common case is:
> >
> > base -> sn-1 -> sn-2 -> ... -> sn-n
> >
> > When n reaches a limit, you do:
> >
> > base -> merge-1
> >
> > You're potentially copying a similar amount of data when merging back into
> > a single image (and you can't easily merge multiple snapshots).
> >
> > If the amount of data that's not in 'base' is large, you leave
> > a new external file around:
> >
> > base -> merge-1 -> sn-1 -> sn-2 ... -> sn-n
> > to
> > base -> merge-1 -> merge-2
> >
> >> >
> >> >>It seems like snapshot merge will require dedicated code that reads
> >> >>the allocated clusters from the COW file and writes them back into the
> >> >>base image.
> >> >>
> >> >>A very inefficient alternative would be to create a third image, the
> >> >>"merge" image file, which has the COW file as its backing file:
> >> >>snapshot (base) ->  cow ->  merge
> >
> > Remember there is a 'base' before snapshot, you don't copy the entire
> > image.
> 
> One use case I have in mind is the Live Backup approach that Jagane
> has been developing.  Here the backup solution only creates a snapshot
> for the period of time needed to read out the dirty blocks.  Then the
> snapshot is deleted again and probably contains very little new data
> relative to the base image.  The backup solution does this operation
> every day.
> 
> This is the pathological case for any approach that copies the entire
> base into a new file.  We could have avoided a lot of I/O by doing an
> in-place update.
> 
> I want to make sure this works well.

This use case does not fit the streaming scheme that has come up. It's a
completely different operation.

IMO it should be implemented separately.

> Stefan

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [Qemu-devel] KVM call agenda for June 28
  2011-07-05 18:18                                 ` Marcelo Tosatti
@ 2011-07-06  7:48                                   ` Kevin Wolf
  -1 siblings, 0 replies; 55+ messages in thread
From: Kevin Wolf @ 2011-07-06  7:48 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: Stefan Hajnoczi, Dor Laor, Chris Wright, KVM devel mailing list,
	quintela, jes sorensen, qemu-devel, Avi Kivity

Am 05.07.2011 20:18, schrieb Marcelo Tosatti:
> On Tue, Jul 05, 2011 at 04:37:08PM +0100, Stefan Hajnoczi wrote:
>> On Tue, Jul 5, 2011 at 3:32 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
>>> On Tue, Jul 05, 2011 at 04:39:06PM +0300, Dor Laor wrote:
>>>> On 07/05/2011 03:58 PM, Marcelo Tosatti wrote:
>>>>> On Tue, Jul 05, 2011 at 01:40:08PM +0100, Stefan Hajnoczi wrote:
>>>>>> On Tue, Jul 5, 2011 at 9:01 AM, Dor Laor<dlaor@redhat.com>  wrote:
>>>>>>> I tried to re-arrange all of the requirements and use cases using this wiki
>>>>>>> page: http://wiki.qemu.org/Features/LiveBlockMigration
>>>>>>>
>>>>>>> It would be the best to agree upon the most interesting use cases (while we
>>>>>>> make sure we cover future ones) and agree to them.
>>>>>>> The next step is to set the interface for all the various verbs since the
>>>>>>> implementation seems to be converging.
>>>>>>
>>>>>> Live block copy was supposed to support snapshot merge.  I think the
>>>>>> current favored approach is to make the source image a backing file to
>>>>>> the destination image and essentially do image streaming.
>>>>>>
>>>>>> Using this mechanism for snapshot merge is tricky.  The COW file
>>>>>> already uses the read-only snapshot base image.  So now we cannot
>>>>>> trivially copy the COW file contents back into the snapshot base image
>>>>>> using live block copy.
>>>>>
>>>>> It never did. Live copy creates a new image where both snapshot and
>>>>> "current" are copied to.
>>>>>
>>>>> This is similar to image streaming.
>>>>
>>>> Not sure I see what's bad about doing an in-place merge:
>>>>
>>>> Let's suppose we have this COW chain:
>>>>
>>>>   base <-- s1 <-- s2
>>>>
>>>> Now a live snapshot is created over s2, s2 becomes RO and s3 is RW:
>>>>
>>>>   base <-- s1 <-- s2 <-- s3
>>>>
>>>> Now we're done with s2 (post backup) and would like to merge s3 into s2.
>>>>
>>>> With your approach we use live copy of s3 into newSnap:
>>>>
>>>>   base <-- s1 <-- s2 <-- s3
>>>>   base <-- s1 <-- newSnap
>>>>
>>>> When it is over, s2 and s3 can be erased.
>>>> The downside is the I/O for copying s2's data and the temporary
>>>> storage. I guess temp storage is cheap but excessive I/O is
>>>> expensive.
>>>>
>>>> My approach was to collapse s3 into s2 and erase s3 eventually:
>>>>
>>>> before: base <-- s1 <-- s2 <-- s3
>>>> after:  base <-- s1 <-- s2
>>>>
>>>> If we use live block copy using mirror driver it should be safe as
>>>> long as we keep the ordering of new writes into s3 during the
>>>> execution.
>>>> Even a failure in the middle won't cause harm since the
>>>> management will keep using s3 until it gets a success event.
>>>
>>> Well, it is more complicated than simply streaming into a new
>>> image. I'm not entirely sure it is necessary. The common case is:
>>>
>>> base -> sn-1 -> sn-2 -> ... -> sn-n
>>>
>>> When n reaches a limit, you do:
>>>
>>> base -> merge-1
>>>
>>> You're potentially copying a similar amount of data when merging back into
>>> a single image (and you can't easily merge multiple snapshots).
>>>
>>> If the amount of data that's not in 'base' is large, you leave
>>> a new external file around:
>>>
>>> base -> merge-1 -> sn-1 -> sn-2 ... -> sn-n
>>> to
>>> base -> merge-1 -> merge-2
>>>
>>>>>
>>>>>> It seems like snapshot merge will require dedicated code that reads
>>>>>> the allocated clusters from the COW file and writes them back into the
>>>>>> base image.
>>>>>>
>>>>>> A very inefficient alternative would be to create a third image, the
>>>>>> "merge" image file, which has the COW file as its backing file:
>>>>>> snapshot (base) ->  cow ->  merge
>>>
>>> Remember there is a 'base' before snapshot, you don't copy the entire
>>> image.
>>
>> One use case I have in mind is the Live Backup approach that Jagane
>> has been developing.  Here the backup solution only creates a snapshot
>> for the period of time needed to read out the dirty blocks.  Then the
>> snapshot is deleted again and probably contains very little new data
>> relative to the base image.  The backup solution does this operation
>> every day.
>>
>> This is the pathological case for any approach that copies the entire
>> base into a new file.  We could have avoided a lot of I/O by doing an
>> in-place update.
>>
>> I want to make sure this works well.
> 
> This use case does not fit the streaming scheme that has come up. It's a
> completely different operation.
> 
> IMO it should be implemented separately.

I agree, this is a case for a live commit operation.
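
(To make "live commit" concrete, a plain Python sketch of the shape such
a job could take. This is hypothetical, not an existing QEMU interface:
allocated clusters are copied from the COW overlay into its backing file,
and clusters the guest dirties during a pass are re-copied until the job
converges, after which the overlay can be dropped.)

    backing = {n: f"base-{n}" for n in range(4)}
    overlay = {1: "cow-1", 3: "cow-3"}        # clusters allocated in the COW file
    dirty = set(overlay)                      # clusters still to be committed

    def guest_write(cluster, data):
        overlay[cluster] = data
        dirty.add(cluster)                    # picked up by the next pass

    def live_commit():
        while dirty:                          # repeat until no writes are pending
            for cluster in list(dirty):
                backing[cluster] = overlay[cluster]
                dirty.discard(cluster)
        # backing now matches backing+overlay; the overlay can be deleted

    guest_write(2, "cow-2")
    live_commit()
    assert backing[1] == "cow-1" and backing[2] == "cow-2"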

Kevin

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [Qemu-devel] KVM call agenda for June 28
  2011-07-05 18:18                                 ` Marcelo Tosatti
@ 2011-07-07 15:25                                   ` Stefan Hajnoczi
  -1 siblings, 0 replies; 55+ messages in thread
From: Stefan Hajnoczi @ 2011-07-07 15:25 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: Dor Laor, Kevin Wolf, Chris Wright, KVM devel mailing list,
	quintela, jes sorensen, qemu-devel, Avi Kivity

On Tue, Jul 5, 2011 at 7:18 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> On Tue, Jul 05, 2011 at 04:37:08PM +0100, Stefan Hajnoczi wrote:
>> On Tue, Jul 5, 2011 at 3:32 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
>> > On Tue, Jul 05, 2011 at 04:39:06PM +0300, Dor Laor wrote:
>> >> On 07/05/2011 03:58 PM, Marcelo Tosatti wrote:
>> >> >On Tue, Jul 05, 2011 at 01:40:08PM +0100, Stefan Hajnoczi wrote:
>> >> >>On Tue, Jul 5, 2011 at 9:01 AM, Dor Laor<dlaor@redhat.com>  wrote:
>> >> >>>I tried to re-arrange all of the requirements and use cases using this wiki
>> >> >>>page: http://wiki.qemu.org/Features/LiveBlockMigration
>> >> >>>
>> >> >>>It would be the best to agree upon the most interesting use cases (while we
>> >> >>>make sure we cover future ones) and agree to them.
>> >> >>>The next step is to set the interface for all the various verbs since the
>> >> >>>implementation seems to be converging.
>> >> >>
>> >> >>Live block copy was supposed to support snapshot merge.  I think the
>> >> >>current favored approach is to make the source image a backing file to
>> >> >>the destination image and essentially do image streaming.
>> >> >>
>> >> >>Using this mechanism for snapshot merge is tricky.  The COW file
>> >> >>already uses the read-only snapshot base image.  So now we cannot
>> >> >>trivially copy the COW file contents back into the snapshot base image
>> >> >>using live block copy.
>> >> >
>> >> >It never did. Live copy creates a new image where both snapshot and
>> >> >"current" are copied to.
>> >> >
>> >> >This is similar to image streaming.
>> >>
>> >> Not sure I see what's bad about doing an in-place merge:
>> >>
>> >> Let's suppose we have this COW chain:
>> >>
>> >>   base <-- s1 <-- s2
>> >>
>> >> Now a live snapshot is created over s2, s2 becomes RO and s3 is RW:
>> >>
>> >>   base <-- s1 <-- s2 <-- s3
>> >>
>> >> Now we're done with s2 (post backup) and would like to merge s3 into s2.
>> >>
>> >> With your approach we use live copy of s3 into newSnap:
>> >>
>> >>   base <-- s1 <-- s2 <-- s3
>> >>   base <-- s1 <-- newSnap
>> >>
>> >> When it is over, s2 and s3 can be erased.
>> >> The downside is the I/O for copying s2's data and the temporary
>> >> storage. I guess temp storage is cheap but excessive I/O is
>> >> expensive.
>> >>
>> >> My approach was to collapse s3 into s2 and erase s3 eventually:
>> >>
>> >> before: base <-- s1 <-- s2 <-- s3
>> >> after:  base <-- s1 <-- s2
>> >>
>> >> If we use live block copy using mirror driver it should be safe as
>> >> long as we keep the ordering of new writes into s3 during the
>> >> execution.
>> >> Even a failure in the middle won't cause harm since the
>> >> management will keep using s3 until it gets a success event.
>> >
>> > Well, it is more complicated than simply streaming into a new
>> > image. I'm not entirely sure it is necessary. The common case is:
>> >
>> > base -> sn-1 -> sn-2 -> ... -> sn-n
>> >
>> > When n reaches a limit, you do:
>> >
>> > base -> merge-1
>> >
>> > You're potentially copying a similar amount of data when merging back into
>> > a single image (and you can't easily merge multiple snapshots).
>> >
>> > If the amount of data that's not in 'base' is large, you leave
>> > a new external file around:
>> >
>> > base -> merge-1 -> sn-1 -> sn-2 ... -> sn-n
>> > to
>> > base -> merge-1 -> merge-2
>> >
>> >> >
>> >> >>It seems like snapshot merge will require dedicated code that reads
>> >> >>the allocated clusters from the COW file and writes them back into the
>> >> >>base image.
>> >> >>
>> >> >>A very inefficient alternative would be to create a third image, the
>> >> >>"merge" image file, which has the COW file as its backing file:
>> >> >>snapshot (base) ->  cow ->  merge
>> >
>> > Remember there is a 'base' before snapshot, you don't copy the entire
>> > image.
>>
>> One use case I have in mind is the Live Backup approach that Jagane
>> has been developing.  Here the backup solution only creates a snapshot
>> for the period of time needed to read out the dirty blocks.  Then the
>> snapshot is deleted again and probably contains very little new data
>> relative to the base image.  The backup solution does this operation
>> every day.
>>
>> This is the pathological case for any approach that copies the entire
>> base into a new file.  We could have avoided a lot of I/O by doing an
>> in-place update.
>>
>> I want to make sure this works well.
>
> This use case does not fit the streaming scheme that has come up. It's a
> completely different operation.
>
> IMO it should be implemented separately.

Okay, not everything can fit into this one grand unified block
copy/image streaming mechanism :).

Stefan

^ permalink raw reply	[flat|nested] 55+ messages in thread

Thread overview: 55+ messages
2011-06-27 14:32 KVM call agenda for June 28 Juan Quintela
2011-06-27 14:32 ` [Qemu-devel] " Juan Quintela
2011-06-28 13:38 ` Stefan Hajnoczi
2011-06-28 13:38   ` [Qemu-devel] " Stefan Hajnoczi
2011-06-28 19:41   ` Marcelo Tosatti
2011-06-28 19:41     ` [Qemu-devel] " Marcelo Tosatti
2011-06-29  5:32     ` Stefan Hajnoczi
2011-06-29  5:32       ` [Qemu-devel] " Stefan Hajnoczi
2011-06-29  7:57     ` Kevin Wolf
2011-06-29  7:57       ` [Qemu-devel] " Kevin Wolf
2011-06-29 10:08       ` Stefan Hajnoczi
2011-06-29 10:08         ` [Qemu-devel] " Stefan Hajnoczi
2011-06-29 15:41         ` Marcelo Tosatti
2011-06-29 15:41           ` [Qemu-devel] " Marcelo Tosatti
2011-06-30 11:48           ` Stefan Hajnoczi
2011-06-30 11:48             ` [Qemu-devel] " Stefan Hajnoczi
2011-06-30 12:39             ` Kevin Wolf
2011-06-30 12:39               ` [Qemu-devel] " Kevin Wolf
2011-06-30 12:54           ` Stefan Hajnoczi
2011-06-30 12:54             ` [Qemu-devel] " Stefan Hajnoczi
2011-06-30 14:36             ` Marcelo Tosatti
2011-06-30 14:36               ` [Qemu-devel] " Marcelo Tosatti
2011-06-30 14:52               ` Kevin Wolf
2011-06-30 14:52                 ` [Qemu-devel] " Kevin Wolf
2011-06-30 18:38                 ` Marcelo Tosatti
2011-07-05  8:01                   ` Dor Laor
2011-07-05 12:40                     ` Stefan Hajnoczi
2011-07-05 12:40                       ` Stefan Hajnoczi
2011-07-05 12:58                       ` Marcelo Tosatti
2011-07-05 12:58                         ` Marcelo Tosatti
2011-07-05 13:39                         ` Dor Laor
2011-07-05 13:39                           ` Dor Laor
2011-07-05 14:29                           ` Marcelo Tosatti
2011-07-05 14:29                             ` [Qemu-devel] " Marcelo Tosatti
2011-07-05 14:32                           ` Marcelo Tosatti
2011-07-05 14:32                             ` Marcelo Tosatti
2011-07-05 14:46                             ` Kevin Wolf
2011-07-05 14:46                               ` Kevin Wolf
2011-07-05 15:04                             ` Dor Laor
2011-07-05 15:04                               ` Dor Laor
2011-07-05 15:29                               ` Marcelo Tosatti
2011-07-05 15:29                                 ` Marcelo Tosatti
2011-07-05 15:37                             ` Stefan Hajnoczi
2011-07-05 15:37                               ` Stefan Hajnoczi
2011-07-05 18:18                               ` Marcelo Tosatti
2011-07-05 18:18                                 ` Marcelo Tosatti
2011-07-06  7:48                                 ` Kevin Wolf
2011-07-06  7:48                                   ` Kevin Wolf
2011-07-07 15:25                                 ` Stefan Hajnoczi
2011-07-07 15:25                                   ` Stefan Hajnoczi
2011-06-28 13:43 ` Anthony Liguori
2011-06-28 13:43   ` Anthony Liguori
2011-06-28 13:48   ` Avi Kivity
2011-06-28 13:48     ` Avi Kivity
2011-06-30 14:10     ` Anthony Liguori
