* [Qemu-devel] storing machine data in qcow images?
@ 2018-05-18 15:30 Michael S. Tsirkin
  2018-05-18 16:49 ` Eduardo Habkost
                   ` (3 more replies)
  0 siblings, 4 replies; 157+ messages in thread
From: Michael S. Tsirkin @ 2018-05-18 15:30 UTC (permalink / raw)
  To: ehabkost, stefanha, kwolf, mreitz, qemu-devel, qemu-block

Hi!
Right now, QEMU supports multiple machine types within
a given architecture. This has been the case for many architectures
(like ARM) for a while; more recently it is also the case
for x86, with the I440FX and Q35 options.

Unfortunately this means that it's no longer possible
to more or less reliably boot a VM just given a disk image,
even if you select the correct QEMU binary:
you must supply the correct machine type.

Some guests go even further and require specific devices to be present.

Would it be reasonable to support storing this information in the qcow
image itself?  For example, I can see it immediately following the
backing file path within the image.

As Eduardo pointed out off-list, the format could be a set of key-value
pairs. Initially qemu-img could gain the ability to retrieve and
manipulate these. Down the road we could teach qemu to use them
automatically. We could also conceivably warn the user, or drop the
image from the boot order.

Reasonable (IMO) things we could store in such a section:
- qemu architecture to use with the image
- machine type

more possibilities:
- required cpu flags
- expected frontend devices
- kernel flags for device tree based guests
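
To make the key-value idea concrete, here is a minimal sketch in Python.
The key names (`arch`, `machine-type`) and the newline-separated
`key=value` encoding are assumptions made up for illustration; nothing
here is an existing qcow2 or qemu-img format.

```python
def encode_hints(hints):
    """Serialize a dict of string hints as sorted 'key=value' lines (UTF-8)."""
    for key, value in hints.items():
        # Reject characters that would make the encoding ambiguous.
        if "=" in key or "\n" in key or "\n" in value:
            raise ValueError(f"unsupported character in hint {key!r}")
    lines = (f"{key}={value}\n" for key, value in sorted(hints.items()))
    return "".join(lines).encode("utf-8")


def decode_hints(blob):
    """Parse the bytes produced by encode_hints() back into a dict."""
    hints = {}
    for line in blob.decode("utf-8").splitlines():
        # Split on the first '=' only, so values may contain '='.
        key, _, value = line.partition("=")
        hints[key] = value
    return hints


if __name__ == "__main__":
    blob = encode_hints({"machine-type": "q35", "arch": "x86_64"})
    print(decode_hints(blob))
```

A tool in the role of qemu-img would only need to read and write such a
blob; whether anything acts on the hints is a separate decision.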

Security considerations
- If there is a machine type specific security issue,
  this makes it easier to trick a user into hitting it.
  Not sure how common this is.
- We most likely shouldn't get backend parameters from the image

Thoughts?

-- 
MST

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-18 15:30 [Qemu-devel] storing machine data in qcow images? Michael S. Tsirkin
@ 2018-05-18 16:49 ` Eduardo Habkost
  2018-05-18 17:09 ` Daniel P. Berrangé
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 157+ messages in thread
From: Eduardo Habkost @ 2018-05-18 16:49 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: stefanha, kwolf, mreitz, qemu-devel, qemu-block,
	Daniel P. Berrange, Dr. David Alan Gilbert

On Fri, May 18, 2018 at 06:30:38PM +0300, Michael S. Tsirkin wrote:
> Hi!
> Right now, QEMU supports multiple machine types within
> a given architecture. This was the case for many architectures
> (like ARM) for a while, somewhat more recently this is the case
> for x86 with I440FX and Q35 options.
> 
> Unfortunately this means that it's no longer possible
> to more or less reliably boot a VM just given a disk image,
> even if you select the correct QEMU binary:
> you must supply the correct machine type.
> 
> Some guests go even further and require specific devices to be present.
> 
> Would it be reasonable to support storing this information in the qcow
> image itself?  For example, I can see it following immediately the
> backing file path within the image.
> 
> As Eduardo pointed out off-list, the format could be a set of key-value
> pairs. Initially qemu-img could gain ability to retrieve and manipulate
> these. Down the road we could teach qemu to use them automatically.
> We could also thinkably warn the user, or drop the image from the boot
> order.

Some additional context:

Currently OpenStack and other management stacks support importing
"guest images", which are often just qcow2 disk images.  Today all
management stacks assume that all x86 guest images work with
pc-i440fx, but this is likely to change with newer guest OS
versions.

Right now it's very convenient for users to simply create disk
images using whatever VM management tools they have (e.g.
virt-image, virt-install, virsh) to install and configure a
guest, and all they need to do is to upload the resulting disk
image.

If information about the machine-type and disk type used to
create the VM is saved in the disk image, OpenStack and other
management stacks can use this information as hints to choose the
right machine-type for a given guest image.  This would also help
the system detect mistakes like using an image for the wrong
architecture.

I don't think QEMU necessarily needs to use this information
automatically.  I think the first step is simply to make QEMU save
this information in the disk image, and to make qemu-img able to
read and write this information.

> 
> Reasonable (IMO) things we could store in such a section:
> - qemu architecture to use with the image
> - machine type

Maybe just the machine-type family would be enough?

> 
> more possibilities:
> - required cpu flags
> - expected frontend devices
> - kernel flags for device tree based guests

All these might be useful in some cases.  I think it's important
to highlight that these would be just hints for systems importing
the disk image, and not mandatory.

> 
> Security considerations
> - If there is a machine type specific security issue,
>   this makes it easier to trick user to hitting it.
>   Not sure how common this is.

Yeah, we need to keep this in mind for every hint we add to this
system.

I would prefer a system with a very limited set of
possible input values, to avoid transforming this into a new
attack vector.

For example, I think the hint needs to specify: only the
machine-type family instead of the full machine-type version;
only expected NIC model instead of NIC model + mac address + PCI
address; only the CPU architecture instead of CPU model name +
flags.

(But "guest kernel flags" seems acceptable, because it's parsed
only by guest code.)
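
A minimal sketch of the "very limited set of possible input values"
idea, in Python. The key names and the allowed values below are
assumptions chosen for illustration, not a proposed list.

```python
# Hypothetical allowlist: each hint key maps to the only values accepted.
ALLOWED_HINTS = {
    "arch": {"x86_64", "aarch64", "ppc64"},
    "machine-family": {"i440fx", "q35"},
    "nic-model": {"virtio-net", "e1000e"},
}


def filter_hints(hints):
    """Return only the hints whose key and value are both on the allowlist."""
    accepted = {}
    for key, value in hints.items():
        # Unknown keys fall back to an empty tuple, so they never match.
        if value in ALLOWED_HINTS.get(key, ()):
            accepted[key] = value
    return accepted


if __name__ == "__main__":
    untrusted = {
        "machine-family": "q35",
        "nic-model": "e1000e,mac=52:54:00:12:34:56",  # too specific
        "backend-file": "/etc/passwd",                # unknown key
    }
    print(filter_hints(untrusted))
```

Anything that is not an exact match is dropped rather than passed on,
so a crafted image cannot smuggle extra parameters through a hint.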

If any management stack requires a more detailed VM description,
they won't be covered by this system, and they can't expect qcow2
disk images to carry all the information they need.


> - We most likely shouldn't get backend parameters from the image

Agreed.

-- 
Eduardo


* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-18 15:30 [Qemu-devel] storing machine data in qcow images? Michael S. Tsirkin
  2018-05-18 16:49 ` Eduardo Habkost
@ 2018-05-18 17:09 ` Daniel P. Berrangé
  2018-05-18 17:41   ` Eduardo Habkost
                     ` (2 more replies)
  2018-05-22  8:50 ` Philipp Hahn
  2018-05-24 11:32 ` Richard W.M. Jones
  3 siblings, 3 replies; 157+ messages in thread
From: Daniel P. Berrangé @ 2018-05-18 17:09 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: ehabkost, stefanha, kwolf, mreitz, qemu-devel, qemu-block

On Fri, May 18, 2018 at 06:30:38PM +0300, Michael S. Tsirkin wrote:
> Hi!
> Right now, QEMU supports multiple machine types within
> a given architecture. This was the case for many architectures
> (like ARM) for a while, somewhat more recently this is the case
> for x86 with I440FX and Q35 options.
> 
> Unfortunately this means that it's no longer possible
> to more or less reliably boot a VM just given a disk image,
> even if you select the correct QEMU binary:
> you must supply the correct machine type.

You must /sometimes/ supply the correct machine type.

It is quite dependent on the guest OS you have installed, and even
just how the guest OS is configured.  In general Linux is very
flexible and can adapt to a wide range of hardware, automatically
detecting things as needed. It is possible for a sysadmin to build
a Linux image in a way that would only work with I440FX, but I
don't think it would be common to see that. Many distros build
and distribute disk images that can work across VMWare, KVM,
and VirtualBox, which all have quite different hardware.
Non-x86 archs may be more fussy, but I don't have personal
experience with them.

Windows is probably where things get more tricky, as it is not
happy with disks moving between different controller types
for example, and you might trigger license activation again.


> Some guests go even further and require specific devices to be present.
> 
> Would it be reasonable to support storing this information in the qcow
> image itself?  For example, I can see it following immediately the
> backing file path within the image.

The backing file string needs to go in the space between the end of the
headers and the start of the first cluster, and the spec explicitly says
nothing else may be stored there. Also, we can already hit the length
limit on the backing file.

There would need to be an explicit header extension defined with its
own clusters allocated instead.
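
For reference, a rough Python sketch of how a qcow2 header extension is
framed: a big-endian 32-bit type, a 32-bit data length, the data, then
padding to a multiple of 8 bytes. The type value used here is made up
for illustration and is not an assigned extension type; a real design
would presumably keep only an offset and size in the extension and put
the data itself in separately allocated clusters.

```python
import struct

# Hypothetical extension type, NOT assigned in the qcow2 spec.
HYPOTHETICAL_HINTS_EXT_TYPE = 0x48494E54


def pack_extension(ext_type, payload):
    """Frame payload as a header extension: type, length, data, padding."""
    header = struct.pack(">II", ext_type, len(payload))
    # Pad so the whole extension ends on an 8-byte boundary.
    padding = b"\0" * (-len(payload) % 8)
    return header + payload + padding


def unpack_extension(buf):
    """Return (ext_type, payload) for the extension at the start of buf."""
    ext_type, length = struct.unpack_from(">II", buf, 0)
    if len(buf) < 8 + length:
        raise ValueError("truncated header extension")
    return ext_type, buf[8:8 + length]


if __name__ == "__main__":
    blob = pack_extension(HYPOTHETICAL_HINTS_EXT_TYPE, b"machine=q35")
    print(len(blob), unpack_extension(blob))
```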

That said I'm not really convinced that using the qcow2 headers is
a good plan. We have many disk image formats in common use, qcow2
is just one. Even if the user provides the image in qcow2 format,
that doesn't mean that mgmt apps actually store the qcow2 file.

For example in some deployments OpenStack will immediately
convert the image to raw for storage in an RBD volume as it is
uploaded to Glance. So the glance image store would need to
have a way to extract & save the info at time of upload. OpenStack
targets multiple hypervisors though, so I'm not sure they would
welcome something that is specific to just qcow2 in this area.

The closest to a cross-hypervisor standard is OVF which can store
metadata about required hardware for a VM. I'm pretty sure it does
not have the concept of machine types, but maybe it has a way for
people to define metadata extensions. Since it is just XML at the
end of the day, even if there was nothing official in OVF, it would
be possible to just define a custom XML namespace and declare a
schema for that to follow.


> As Eduardo pointed out off-list, the format could be a set of key-value
> pairs. Initially qemu-img could gain ability to retrieve and manipulate
> these. Down the road we could teach qemu to use them automatically.
> We could also thinkably warn the user, or drop the image from the boot
> order.
> 
> Reasonable (IMO) things we could store in such a section:
> - qemu architecture to use with the image
> - machine type

A concern is about what you actually put here. We could easily create a
situation where we make images /less/ portable. E.g. take a Linux image
which is capable of running on both i440fx and q35: if that was built
on i440fx and that gets recorded, a mgmt app which honours this info
is needlessly restricting how the image can be run.

Or consider that LTS distros typically create custom machine types,
so you can have an image with machine type pc-rhel-7.4.0 which is
now unable to be used on an Ubuntu distro which lacks the RHEL
machine types.

IOW, there's a distinction between what's recommended, vs what's
required, vs what's forbidden. Whitelisting valid machine types
is too restrictive, but blacklisting is not broad enough.
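
A small Python sketch of how a mgmt app might treat the two kinds of
information differently. The function name and the shape of the inputs
are assumptions for illustration only.

```python
def choose_machine(host_supported, recommended=(), forbidden=()):
    """Pick a machine type: prefer recommended ones, never a forbidden one.

    host_supported is ordered by the host's own preference; the first
    usable entry is the fallback when no recommendation applies.
    """
    usable = [m for m in host_supported if m not in forbidden]
    for machine in recommended:
        if machine in usable:
            return machine
    return usable[0] if usable else None


if __name__ == "__main__":
    # The image recommends a RHEL machine type this host does not have,
    # so the recommendation is skipped instead of blocking the boot.
    print(choose_machine(["pc", "q35"], recommended=["pc-rhel-7.4.0", "q35"]))
```

A recommendation that the host cannot satisfy is simply ignored, while
a forbidden entry removes an option outright.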

> more possibilities:
> - required cpu flags

Again this is not so black & white - there's a distinction between
what is absolutely required vs what is merely recommended

> - expected frontend devices
> - kernel flags for device tree based guests
> 
> Security considerations
> - If there is a machine type specific security issue,
>   this makes it easier to trick user to hitting it.
>   Not sure how common this is.

This would imply setting very specific versioned machine type
choice, but that kills any kind of platform portability.

> - We most likely shouldn't get backend parameters from the image
> 
> Thoughts?

I tend to think we'd be better looking at what we can do in the context
of an existing standard like OVF rather than inventing something that
only works with qcow2. I think it would need to be more expressive than
just a single list of key,value pairs for each item.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-18 17:09 ` Daniel P. Berrangé
@ 2018-05-18 17:41   ` Eduardo Habkost
  2018-05-19  6:05     ` Markus Armbruster
  2018-05-21 20:18     ` Daniel P. Berrangé
  2018-05-22  7:35   ` Gerd Hoffmann
  2018-05-24 11:17   ` Richard W.M. Jones
  2 siblings, 2 replies; 157+ messages in thread
From: Eduardo Habkost @ 2018-05-18 17:41 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Michael S. Tsirkin, stefanha, kwolf, mreitz, qemu-devel, qemu-block

On Fri, May 18, 2018 at 06:09:56PM +0100, Daniel P. Berrangé wrote:
> On Fri, May 18, 2018 at 06:30:38PM +0300, Michael S. Tsirkin wrote:
> > Hi!
> > Right now, QEMU supports multiple machine types within
> > a given architecture. This was the case for many architectures
> > (like ARM) for a while, somewhat more recently this is the case
> > for x86 with I440FX and Q35 options.
> > 
> > Unfortunately this means that it's no longer possible
> > to more or less reliably boot a VM just given a disk image,
> > even if you select the correct QEMU binary:
> > you must supply the correct machine type.
> 
> You must /sometimes/ supply the correct machine type.
> 
> It is quite dependent on the guest OS you have installed, and even
> just how the guest OS is configured.  In general Linux is very
> flexible and can adapt to a wide range of hardware, automatically
> detecting things as needed. It is possible for a sysadmin to build
> a Linux image in a way that would only work with I440FX, but I
> don't think it would be common to see that. Many distros build
> and distribute disk images that can work across VMWare, KVM,
> > and VirtualBox, which all have quite different hardware.
> > Non-x86 archs may be more fussy, but I don't have personal
> > experience with them.
> 
> Windows is probably where things get more tricky, as it is not
> happy with disks moving between different controller types
> for example, and you might trigger license activation again.

All I'm suggesting here is just adding extra hints that OpenStack
can use.

I have a very specific goal here: the goal is to make it less
painful to users when OpenStack+libvirt+QEMU switch to using a
different machine-type by default (q35), and/or when guest OSes
stop supporting pc-i440fx.  I assume this is a goal for OpenStack
as well.

We can make the solution more extensible and solve other
problems as well, but my original goal is the one above.

> 
> 
> > Some guests go even further and require specific devices to be present.
> > 
> > Would it be reasonable to support storing this information in the qcow
> > image itself?  For example, I can see it following immediately the
> > backing file path within the image.
> 
> The backing file string needs to go in space between the end of headers
> and start of first cluster, and the spec explicitly says nothing else
> must be stored there. Also we can already hit the length limit on the
> backing file.
> 
> There would need to be an explicit header extension defined with its
> own clusters allocated instead.

This sounds correct.


> 
> That said I'm not really convinced that using the qcow2 headers is
> a good plan. We have many disk image formats in common use, qcow2
> is just one. Even if the user provides the image in qcow2 format,
> that doesn't mean that mgmt apps actually store the qcow2 file.
> 

Why does this OpenStack implementation detail matter?  Once the hints
are included in the input, it's up to OpenStack to choose how to
deal with it.


> For example in some deployments OpenStack will immediately
> convert the image to raw for storage in an RBD volume as it is
> uploaded to Glance. So the glance image store would need to
> have a way to extract & save the info at time of upload. OpenStack
> targets multiple hypervisors though, so I'm not sure they would
> welcome something that is specific to just qcow2 in this area.
> 

I don't get the "something that is specific to just qcow2" part.
Adding extra info to qcow2 doesn't prevent other file formats
from carrying the same information as well.


> The closest to a cross-hypervisor standard is OVF which can store
> metadata about required hardware for a VM. I'm pretty sure it does
> not have the concept of machine types, but maybe it has a way for
> people to define metadata extensions. Since it is just XML at the
> end of the day, even if there was nothing official in OVF, it would
> be possible to just define a custom XML namespace and declare a
> schema for that to follow.

There's nothing preventing OVF from supporting the same kind of
hints.

I just don't think we should require people to migrate to OVF if
all they need is to tell OpenStack what's the recommended
machine-type for a guest image.

Requiring a different image format seems very likely to not
fulfill the goal I stated above: it will require using different
tools to create the guest images, and we can't force everybody
publishing guest images to stop using qcow2.

> 
> 
> > As Eduardo pointed out off-list, the format could be a set of key-value
> > pairs. Initially qemu-img could gain ability to retrieve and manipulate
> > these. Down the road we could teach qemu to use them automatically.
> > We could also thinkably warn the user, or drop the image from the boot
> > order.
> > 
> > Reasonable (IMO) things we could store in such a section:
> > - qemu architecture to use with the image
> > - machine type
> 
> A concern is about what you actually put here. We could easily create a
> situation where we make images /less/ portable. eg take a Linux image
> which is capable of running on both i440fx and q35, if that was built
> > on i440fx and that gets recorded, a mgmt app which honours this info
> > is needlessly restricting how the image can be run.

That's why it should be just a hint, not a requirement.

> 
> Or consider that LTS distros typically create custom machine types,
> > so you can have an image with machine type pc-rhel-7.4.0 which is
> now unable to be used on an Ubuntu distro which lacks the RHEL
> machine types.

That's why recording the machine-type family is more useful than
recording the full versioned machine-type name.

> 
> IOW, there's a distinction between what's recommended, vs what's
> required, vs what's forbidden. Whitelisting valid machine types
> is too restrictive, but blacklisting is not broad enough.
> 
> > more possibilities:
> > - required cpu flags
> 
> Again this is not so black & white - there's a distinction between
> what is absolutely required vs what is merely recommended
> 
> > - expected frontend devices
> > - kernel flags for device tree based guests
> > 
> > Security considerations
> > - If there is a machine type specific security issue,
> >   this makes it easier to trick user to hitting it.
> >   Not sure how common this is.
> 
> This would imply setting very specific versioned machine type
> choice, but that kills any kind of platform portability.

True.

> 
> > - We most likely shouldn't get backend parameters from the image
> > 
> > Thoughts?
> 
> I tend to think we'd be better looking at what we can do in the context
> of an existing standard like OVF rather than inventing something that
> only works with qcow2. I think it would need to be more expressive than
> just a single list of key,value pairs for each item.

Why do you claim we are inventing something that only works with
qcow2?

About being more expressive than just a single list of key,value
pairs, I don't see any evidence of that being necessary for the
problems we're trying to address.

-- 
Eduardo


* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-18 17:41   ` Eduardo Habkost
@ 2018-05-19  6:05     ` Markus Armbruster
  2018-05-21 18:29       ` Eduardo Habkost
  2018-05-21 20:18     ` Daniel P. Berrangé
  1 sibling, 1 reply; 157+ messages in thread
From: Markus Armbruster @ 2018-05-19  6:05 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Daniel P. Berrangé,
	kwolf, qemu-block, Michael S. Tsirkin, qemu-devel, mreitz,
	stefanha

Eduardo Habkost <ehabkost@redhat.com> writes:

[...]
> About being more expressive than just a single list of key,value
> pairs, I don't see any evidence of that being necessary for the
> problems we're trying to address.

Short history of a configuration format you might have encountered:

1. A couple of (key, value) is all we need for the problems we're
trying to address.  (v0.4, 2003)

2.1. I got this one special snowflake problem where I actually need a few
related values.  Fortunately, this little ad hoc parser can take apart
the key's single value easily.  (ca. v0.8, 2005)

...

2.n. Snowflakes are surprisingly common, but fortunately one more little
ad hoc parser can't hurt.

3. Umm, this is getting messy.  Let's have proper infrastructure for
two-level keys.  Surely two levels are all we need for the problems
we're trying to address.  Fortunately, we can bolt them on without too
much trouble.  (v0.12, 2009)

4. Err, trees, I'm afraid we actually need trees.  Fortunately, we can
hack them into the existing two-level infrastructure without too much
trouble.  (v1.3, 2013)

5. You are in a maze of twisting little passages, all different.
(today)


How confident are we a single list of (key, value) is really all we're
going to need?

Even if we think it is, would it be possible to provide for a future
extension to trees at next to no cost?


* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-19  6:05     ` Markus Armbruster
@ 2018-05-21 18:29       ` Eduardo Habkost
  2018-05-21 18:44         ` Daniel P. Berrangé
  0 siblings, 1 reply; 157+ messages in thread
From: Eduardo Habkost @ 2018-05-21 18:29 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Daniel P. Berrangé,
	kwolf, qemu-block, Michael S. Tsirkin, qemu-devel, mreitz,
	stefanha

On Sat, May 19, 2018 at 08:05:06AM +0200, Markus Armbruster wrote:
> Eduardo Habkost <ehabkost@redhat.com> writes:
> 
> [...]
> > About being more expressive than just a single list of key,value
> > pairs, I don't see any evidence of that being necessary for the
> > problems we're trying to address.
> 
> Short history of a configuration format you might have encountered:
> 
> 1. A couple of (key, value) is all we need for the problems we're
> trying to address.  (v0.4, 2003)
> 
> 2.1. I got this one special snowflake problem where I actually need a few
> related values.  Fortunately, this little ad hoc parser can take apart
> the key's single value easily.  (ca. v0.8, 2005)
> 
> ...
> 
> 2.n. Snowflakes are surprisingly common, but fortunately one more little
> ad hoc parser can't hurt.
> 
> 3. Umm, this is getting messy.  Let's have proper infrastructure for
> two-level keys.  Surely two levels are all we need for the problems
> we're trying to address.  Fortunately, we can bolt them on without too
> much trouble.  (v0.12, 2009)
> 
> 4. Err, trees, I'm afraid we actually need trees.  Fortunately, we can
> hack them into the existing two-level infrastructure without too much
> trouble.  (v1.3, 2013)
> 
> 5. You are in a maze of twisting little passages, all different.
> (today)
> 
> 
> How confident are we a single list of (key, value) is really all we're
> going to need?
> 
> Even if we think it is, would it be possible to provide for a future
> extension to trees at next to no cost?

I'm confident that a list of key,values is all we need for the
current problem.

I also agree that making it possible to represent trees is a good
idea, and it would probably have next to no cost.

But I disagree if the point here is "we will eventually need much
more complex data in the future, so let's require users to move
to OVF instead".

The point here is to allow users to simply copy an existing disk
image, and it will contain enough hints for a cloud stack to
choose reasonable defaults for machine-type and disk type
automatically.  Requiring the user to perform a separate step to
encapsulate the disk image in another file format defeats the
whole purpose of the proposal.

-- 
Eduardo


* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-21 18:29       ` Eduardo Habkost
@ 2018-05-21 18:44         ` Daniel P. Berrangé
  2018-05-21 19:01           ` Eduardo Habkost
  0 siblings, 1 reply; 157+ messages in thread
From: Daniel P. Berrangé @ 2018-05-21 18:44 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Markus Armbruster, kwolf, qemu-block, Michael S. Tsirkin,
	qemu-devel, mreitz, stefanha

On Mon, May 21, 2018 at 03:29:28PM -0300, Eduardo Habkost wrote:
> On Sat, May 19, 2018 at 08:05:06AM +0200, Markus Armbruster wrote:
> > Eduardo Habkost <ehabkost@redhat.com> writes:
> > 
> > [...]
> > > About being more expressive than just a single list of key,value
> > > pairs, I don't see any evidence of that being necessary for the
> > > problems we're trying to address.
> > 
> > Short history of a configuration format you might have encountered:

[snip]

> > How confident are we a single list of (key, value) is really all we're
> > going to need?
> > 
> > Even if we think it is, would it be possible to provide for a future
> > extension to trees at next to no cost?
> 
> I'm confident that a list of key,values is all we need for the
> current problem.

I'm not convinced. A disk image may work with Q35 or i440fx, or
work with any of virtio, ide or sata disk. So that already means
values have to be arrays, not scalars. You could do that with a
simple key,value list, but only by defining a mapping of arrays
into a flattened form, e.g. do we allow repeated keys, or do we
allow array indexes on keys?
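
To illustrate the two flattening options, a short Python sketch. The
key names and the `key[index]` spelling are assumptions picked for the
example, not a proposal.

```python
def flatten_repeated(hints):
    """Option 1: repeat the key once per value."""
    return [(key, value) for key, values in hints.items() for value in values]


def flatten_indexed(hints):
    """Option 2: append an array index to the key."""
    return [
        (f"{key}[{index}]", value)
        for key, values in hints.items()
        for index, value in enumerate(values)
    ]


if __name__ == "__main__":
    hints = {"machine": ["q35", "i440fx"], "disk": ["virtio", "ide", "sata"]}
    print(flatten_repeated(hints))
    print(flatten_indexed(hints))
```

Either way the reader of the image has to know the convention in order
to rebuild the arrays, which is the mapping that would need defining.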

> The point here is to allow users to simply copy an existing disk
> image, and it will contain enough hints for a cloud stack to
> choose reasonable defaults for machine-type and disk type
> automatically.  Requiring the user to perform a separate step to
> encapsulate the disk image in another file format defeats the
> whole purpose of the proposal.

It doesn't have to mean more work for the user - the application
that is used to create the image can do that on their behalf.
oVirt for example can import/export OVA files, containing OVF
metadata. I could imagine virt-manager and other tools adding an
export ability without much trouble if this was deemed a desirable
thing. Bundling gives the ability to have multiple disk images in one
archive, which is something OVF does.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-21 18:44         ` Daniel P. Berrangé
@ 2018-05-21 19:01           ` Eduardo Habkost
  2018-05-23 11:19             ` Markus Armbruster
  0 siblings, 1 reply; 157+ messages in thread
From: Eduardo Habkost @ 2018-05-21 19:01 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Markus Armbruster, kwolf, qemu-block, Michael S. Tsirkin,
	qemu-devel, mreitz, stefanha

On Mon, May 21, 2018 at 07:44:40PM +0100, Daniel P. Berrangé wrote:
> On Mon, May 21, 2018 at 03:29:28PM -0300, Eduardo Habkost wrote:
> > On Sat, May 19, 2018 at 08:05:06AM +0200, Markus Armbruster wrote:
> > > Eduardo Habkost <ehabkost@redhat.com> writes:
> > > 
> > > [...]
> > > > About being more expressive than just a single list of key,value
> > > > pairs, I don't see any evidence of that being necessary for the
> > > > problems we're trying to address.
> > > 
> > > Short history of a configuration format you might have encountered:
> 
> [snip]
> 
> > > How confident are we a single list of (key, value) is really all we're
> > > going to need?
> > > 
> > > Even if we think it is, would it be possible to provide for a future
> > > extension to trees at next to no cost?
> > 
> > I'm confident that a list of key,values is all we need for the
> > current problem.
> 
> I'm not convinced. A disk image may work with Q35 or i440fx,  or
> work with any of virtio, ide or sata disk. So that already means
> values have to be arrays, not scalars. You could do that with a
> simple key,value list, but only by defining a mapping of arrays
> into a flattened form. eg do we allow repeated keys, or do we
> allow array indexes on keys. 

No problem, we can support trees if it's necessary.


> > The point here is to allow users to simply copy an existing disk
> > image, and it will contain enough hints for a cloud stack to
> > choose reasonable defaults for machine-type and disk type
> > automatically.  Requiring the user to perform a separate step to
> > encapsulate the disk image in another file format defeats the
> > whole purpose of the proposal.
> 
> It doesn't have to mean more work for the user - the application
> that is used to create the image can do that on their behalf.
> oVirt for example can import/export OVA files, containing OVF
> metadata. I could imagine virt-manager, and other tools adding
> export ability without much trouble if this was deemed a desirable
> thing. Bundling gives ability to have multiple disk images in one
> archive, which is something OVF does.

I have the impression that "the application that is used to
create the image" is a very large set.  It can be virt-manager,
virt-install, or even QEMU itself.

Today people can simply create a VM on virt-manager, or run QEMU
manually, and upload the qcow2 image directly from its original
location (they don't need to copy/export it).  Don't we want the
same procedure to keep working instead of requiring users to use
another tool?

-- 
Eduardo


* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-18 17:41   ` Eduardo Habkost
  2018-05-19  6:05     ` Markus Armbruster
@ 2018-05-21 20:18     ` Daniel P. Berrangé
  2018-05-21 20:33       ` Eduardo Habkost
  1 sibling, 1 reply; 157+ messages in thread
From: Daniel P. Berrangé @ 2018-05-21 20:18 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Michael S. Tsirkin, stefanha, kwolf, mreitz, qemu-devel, qemu-block

On Fri, May 18, 2018 at 02:41:33PM -0300, Eduardo Habkost wrote:
> On Fri, May 18, 2018 at 06:09:56PM +0100, Daniel P. Berrangé wrote:
> > On Fri, May 18, 2018 at 06:30:38PM +0300, Michael S. Tsirkin wrote:
> > > Hi!
> > > Right now, QEMU supports multiple machine types within
> > > a given architecture. This was the case for many architectures
> > > (like ARM) for a while, somewhat more recently this is the case
> > > for x86 with I440FX and Q35 options.
> > > 
> > > Unfortunately this means that it's no longer possible
> > > to more or less reliably boot a VM just given a disk image,
> > > even if you select the correct QEMU binary:
> > > you must supply the correct machine type.
> > 
> > You must /sometimes/ supply the correct machine type.
> > 
> > It is quite dependent on the guest OS you have installed, and even
> > just how the guest OS is configured.  In general Linux is very
> > flexible and can adapt to a wide range of hardware, automatically
> > detecting things as needed. It is possible for a sysadmin to build
> > a Linux image in a way that would only work with I440FX, but I
> > don't think it would be common to see that. Many distros build
> > and distribute disk images that can work across VMWare, KVM,
> > > and VirtualBox, which all have quite different hardware.
> > > Non-x86 archs may be more fussy, but I don't have personal
> > > experience with them.
> > 
> > Windows is probably where things get more tricky, as it is not
> > happy with disks moving between different controller types
> > for example, and you might trigger license activation again.
> 
> All I'm suggesting here is just adding extra hints that OpenStack
> can use.
> 
> I have very specific goal here: the goal is to make it less
> painful to users when OpenStack+libvirt+QEMU switch to using a
> different machine-type by default (q35), and/or when guest OSes
> stop supporting pc-i440fx.  I assume this is a goal for OpenStack
> as well.
> 
> We can make the solution to be more extensible and solve other
> problems as well, but my original goal is the one above.

Configuring the machine type is just one thing that users
would do with OpenStack though.  A simple example might be

    openstack image set \
         --property hw_disk_bus=scsi \
         --property hw_vif_model=e1000e

Or if they're using libosinfo to set preferred devices 

    openstack image set \
         --property os_distro=fedora26

which will identify virtio-blk & virtio-net as disk+nic
respectively. Using libosinfo is more flexible than setting
the hw_disk_bus & hw_vif_model  explicitly, because libosinfo
will report multiple devices that can be used, and the virt
driver can then pick one which best suits the particular
host or hypervisor.
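
As a rough illustration of why a list of candidate devices is more
flexible than one fixed value, here is a minimal Python sketch. The
function name pick_device, the candidate list and the host capability
sets are all made up for this example; this is not the libosinfo or
Nova API.

```python
def pick_device(candidates, supported):
    """Return the first candidate (in preference order) the host supports."""
    for dev in candidates:
        if dev in supported:
            return dev
    return None  # nothing usable on this host


# Hypothetical preference list that a guest OS database might report.
disk_candidates = ["virtio-blk", "scsi", "ide"]

# Hypothetical capability sets of two different hosts/hypervisors.
kvm_host = {"virtio-blk", "scsi", "ide"}
other_host = {"scsi", "ide"}

print(pick_device(disk_candidates, kvm_host))
print(pick_device(disk_candidates, other_host))
```

With a single fixed hw_disk_bus value the second host would have no
fallback; with a list it can drop down to the next candidate.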

Setting a non-default machine type is one extra property:

    openstack image set \
         --property hw_machine_type=q35 \
         --property os_distro=fedora26

So while your immediate motivation only considers the
machine type, from the OpenStack POV that's only one property
out of many that users might be setting.


> > That said I'm not really convinced that using the qcow2 headers is
> > a good plan. We have many disk image formats in common use, qcow2
> > is just one. Even if the user provides the image in qcow2 format,
> > that doesn't mean that mgmt apps actually store the qcow2 file.
> > 
> 
> Why this OpenStack implementation detail matters?  Once the hints
> are included in the input, it's up to OpenStack to choose how to
> deal with it.

Well, OpenStack aims to support multiple hypervisors - if there's a
choice between implementing something that is a cross-vendor standard
like OVF, or implementing something that only works with qcow2, the
latter is not very appealing to support.

> > The closest to a cross-hypervisor standard is OVF which can store
> > metadata about required hardware for a VM. I'm pretty sure it does
> > not have the concept of machine types, but maybe it has a way for
> > people to define metadata extensions. Since it is just XML at the
> > end of the day, even if there was nothing official in OVF, it would
> > be possible to just define a custom XML namespace and declare a
> > schema for that to follow.
> 
> There's nothing preventing OVF from supporting the same kind of
> hints.
> 
> I just don't think we should require people to migrate to OVF if
> all they need is to tell OpenStack what's the recommended
> machine-type for a guest image.
> 
> Requiring a different image format seems very likely to not
> fulfill the goal I stated above: it will require using different
> tools to create the guest images, and we can't force everybody
> publishing guest images to stop using qcow2.

It doesn't have to require different tools - existing tools could
create an OVF/OVA file for the disk image as part of an "export"
process.


> > > - We most likely shouldn't get backend parameters from the image
> > > 
> > > Thoughts?
> > 
> > I tend to think we'd be better looking at what we can do in the context
> > of an existing standard like OVF rather than inventing something that
> > only works with qcow2. I think it would need to be more expressive than
> > just a single list of key,value pairs for each item.
> 
> Why you claim we are inventing something that only works with
> qcow2?

It works with a disk image format that has the ability to record extra
metadata. With raw files you would have to have a separate file to
record it, likewise for any other vendor disk formats that are
not extended.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-21 20:18     ` Daniel P. Berrangé
@ 2018-05-21 20:33       ` Eduardo Habkost
  2018-05-24  9:58         ` Kashyap Chamarthy
  0 siblings, 1 reply; 157+ messages in thread
From: Eduardo Habkost @ 2018-05-21 20:33 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Michael S. Tsirkin, stefanha, kwolf, mreitz, qemu-devel, qemu-block

On Mon, May 21, 2018 at 09:18:17PM +0100, Daniel P. Berrangé wrote:
> On Fri, May 18, 2018 at 02:41:33PM -0300, Eduardo Habkost wrote:
> > On Fri, May 18, 2018 at 06:09:56PM +0100, Daniel P. Berrangé wrote:
> > > On Fri, May 18, 2018 at 06:30:38PM +0300, Michael S. Tsirkin wrote:
> > > > Hi!
> > > > Right now, QEMU supports multiple machine types within
> > > > a given architecture. This was the case for many architectures
> > > > (like ARM) for a while, somewhat more recently this is the case
> > > > for x86 with I440FX and Q35 options.
> > > > 
> > > > Unfortunately this means that it's no longer possible
> > > > to more or less reliably boot a VM just given a disk image,
> > > > even if you select the correct QEMU binary:
> > > > you must supply the correct machine type.
> > > 
> > > You must /sometimes/ supply the correct machine type.
> > > 
> > > It is quite dependent on the guest OS you have installed, and even
> > > just how the guest OS is configured.  In general Linux is very
> > > flexible and can adapt to a wide range of hardware, automatically
> > > detecting things as needed. It is possible for a sysadmin to build
> > > a Linux image in a way that would only work with I440FX, but I
> > > don't think it would be common to see that. Many distros build
> > > and distribute disk images that can work across VMWare, KVM,
> > > and VirtualBox, which all have quite different hardware.
> > > Non-x86 archs may be more fussy but I don't have personal
> > > experience with them.
> > > 
> > > Windows is probably where things get more tricky, as it is not
> > > happy with disks moving between different controller types
> > > for example, and you might trigger license activation again.
> > 
> > All I'm suggesting here is just adding extra hints that OpenStack
> > can use.
> > 
> > I have very specific goal here: the goal is to make it less
> > painful to users when OpenStack+libvirt+QEMU switch to using a
> > different machine-type by default (q35), and/or when guest OSes
> > stop supporting pc-i440fx.  I assume this is a goal for OpenStack
> > as well.
> > 
> > We can make the solution to be more extensible and solve other
> > problems as well, but my original goal is the one above.
> 
> Configuring the machine type is just one thing that users
> would do with OpenStack though.  A simple example might be
> 
>     openstack image set \
>          --property hw_disk_bus=scsi \
> 	 --property hw_vif_model=e1000e
> 
> Or if they're using libosinfo to set preferred devices 
> 
>     openstack image set \
>          --property os_distro=fedora26
> 
> which will identify virtio-blk & virtio-net as disk+nic
> respectively. Using libosinfo is more flexible than setting
> the hw_disk_bus & hw_vif_model  explicitly, because libosinfo
> will report multiple devices that can be used, and the virt
> driver can then pick one which best suits the particular
> host or hypervisor.
> 
> Setting a non-default machine type is one extra property:
> 
>     openstack image set \
>          --property hw_machine_type=q35 \
>          --property os_distro=fedora26

Nice.  Are these just hypothetical examples, or something that
already works?


> 
> So while your immediate motivation only considers the
> machine type, from the OpenStack POV that's only one property
> out of many that users might be setting.

Agreed.


> > > That said I'm not really convinced that using the qcow2 headers is
> > > a good plan. We have many disk image formats in common use, qcow2
> > > is just one. Even if the user provides the image in qcow2 format,
> > > that doesn't mean that mgmt apps actually store the qcow2 file.
> > > 
> > 
> > Why this OpenStack implementation detail matters?  Once the hints
> > are included in the input, it's up to OpenStack to choose how to
> > deal with it.
> 
> Well, OpenStack aims to support multiple hypervisors - if there's a
> choice between implementing something that is a cross-vendor standard
> like OVF, or implementing something that only works with qcow2, the
> latter is not very appealing to support.

I still don't understand why you claim this would only work with
qcow2.  If somebody wants to implement the same functionality in
OVF, it's also possible.


> > > The closest to a cross-hypervisor standard is OVF which can store
> > > metadata about required hardware for a VM. I'm pretty sure it does
> > > not have the concept of machine types, but maybe it has a way for
> > > people to define metadata extensions. Since it is just XML at the
> > > end of the day, even if there was nothing official in OVF, it would
> > > be possible to just define a custom XML namespace and declare a
> > > schema for that to follow.
> > 
> > There's nothing preventing OVF from supporting the same kind of
> > hints.
> > 
> > I just don't think we should require people to migrate to OVF if
> > all they need is to tell OpenStack what's the recommended
> > machine-type for a guest image.
> > 
> > Requiring a different image format seems very likely to not
> > fulfill the goal I stated above: it will require using different
> > tools to create the guest images, and we can't force everybody
> > publishing guest images to stop using qcow2.
> 
> It doesn't have to require different tools - existing tools could
> create an OVF/OVA file for the disk image as part of an "export"
> process.

Requiring a new "export" step that wasn't required before is
requiring a different tool, isn't it?


> > > > - We most likely shouldn't get backend parameters from the image
> > > > 
> > > > Thoughts?
> > > 
> > > I tend to think we'd be better looking at what we can do in the context
> > > of an existing standard like OVF rather than inventing something that
> > > only works with qcow2. I think it would need to be more expressive than
> > > just a single list of key,value pairs for each item.
> > 
> > Why you claim we are inventing something that only works with
> > qcow2?
> 
> It works with a disk image format that has the ability to record extra
> metadata. With raw files you would have to have a separate file to
> record it, likewise for any other vendor disk formats that are
> not extended.

So this could work with both qcow2 and OVF (and maybe other
formats if others want to extend them), couldn't it?

-- 
Eduardo


* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-18 17:09 ` Daniel P. Berrangé
  2018-05-18 17:41   ` Eduardo Habkost
@ 2018-05-22  7:35   ` Gerd Hoffmann
  2018-05-22 10:53     ` Eduardo Habkost
  2018-05-22 14:19     ` Michael S. Tsirkin
  2018-05-24 11:17   ` Richard W.M. Jones
  2 siblings, 2 replies; 157+ messages in thread
From: Gerd Hoffmann @ 2018-05-22  7:35 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Michael S. Tsirkin, kwolf, ehabkost, qemu-block, qemu-devel,
	mreitz, stefanha

  Hi,

> You must /sometimes/ supply the correct machine type.
> 
> It is quite dependent on the guest OS you have installed, and even
> just how the guest OS is configured.  In general Linux is very
> flexible and can adapt to a wide range of hardware, automatically
> detecting things as needed. It is possible for a sysadmin to build
> a Linux image in a way that would only work with I440FX, but I
> don't think it would be common to see that.

I think it would be pretty hard to actually build such an image.

The more critical thing for Linux guests is the storage driver, which
must be included in the initrd so the image can mount the root
filesystem.  And the firmware: BIOS vs. UEFI is more critical than
pc vs. q35.

> That said I'm not really convinced that using the qcow2 headers is
> a good plan. We have many disk image formats in common use, qcow2
> is just one. Even if the user provides the image in qcow2 format,
> that doesn't mean that mgmt apps actually store the qcow2 file.

> I tend to think we'd be better looking at what we can do in the context
> of an existing standard like OVF rather than inventing something that
> only works with qcow2. I think it would need to be more expressive than
> just a single list of key,value pairs for each item.

Embed OVF metadata in the qcow2 image?

cheers,
  Gerd


* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-18 15:30 [Qemu-devel] storing machine data in qcow images? Michael S. Tsirkin
  2018-05-18 16:49 ` Eduardo Habkost
  2018-05-18 17:09 ` Daniel P. Berrangé
@ 2018-05-22  8:50 ` Philipp Hahn
  2018-05-24 11:32 ` Richard W.M. Jones
  3 siblings, 0 replies; 157+ messages in thread
From: Philipp Hahn @ 2018-05-22  8:50 UTC (permalink / raw)
  To: qemu-devel

Hi,

Am 18.05.2018 um 17:30 schrieb Michael S. Tsirkin:
> Unfortunately this means that it's no longer possible
> to more or less reliably boot a VM just given a disk image,
> even if you select the correct QEMU binary:
...
> Would it be reasonable to support storing this information in the qcow
> image itself?  For example, I can see it following immediately the
> backing file path within the image.

- This looks like a layering violation
- What happens when you have multiple (conflicting) images, like a VM
with 2 image files?
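
To make the second point concrete, here is a minimal Python sketch of
what a consumer would have to do with per-image hints. The function
name merge_hints, the file names and the hint keys are invented for
illustration only. It does not decide which image should win; it only
makes the conflict visible.

```python
def merge_hints(per_image):
    """Combine per-image hints; keys with differing values become conflicts."""
    seen = {}
    for image, hints in per_image.items():
        for key, value in hints.items():
            seen.setdefault(key, {})[image] = value

    agreed, conflicts = {}, {}
    for key, by_image in seen.items():
        values = set(by_image.values())
        if len(values) == 1:
            agreed[key] = values.pop()   # every image that set it agrees
        else:
            conflicts[key] = by_image    # keep who said what
    return agreed, conflicts


# Hypothetical VM with two disks whose embedded hints disagree.
per_image = {
    "root.qcow2": {"machine": "q35", "arch": "x86_64"},
    "data.qcow2": {"machine": "pc", "arch": "x86_64"},
}
agreed, conflicts = merge_hints(per_image)
print(agreed)
print(conflicts)
```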

Philipp

PS: this is even more of an issue for restoring snapshots, as you then
must launch a new QEMU process with the exact layout of the saving QEMU
process - otherwise LoadVM will just fail.
PPS: I'm afraid of someone suggesting an abomination like those
self-extracting archives with a shell script at the beginning and a
compressed archive BLOB at the end of the same file.


* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-22  7:35   ` Gerd Hoffmann
@ 2018-05-22 10:53     ` Eduardo Habkost
  2018-05-22 14:19     ` Michael S. Tsirkin
  1 sibling, 0 replies; 157+ messages in thread
From: Eduardo Habkost @ 2018-05-22 10:53 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Daniel P. Berrangé,
	Michael S. Tsirkin, kwolf, qemu-block, qemu-devel, mreitz,
	stefanha

On Tue, May 22, 2018 at 09:35:55AM +0200, Gerd Hoffmann wrote:
>   Hi,
> 
> > You must /sometimes/ supply the correct machine type.
> > 
> > It is quite dependent on the guest OS you have installed, and even
> > just how the guest OS is configured.  In general Linux is very
> > flexible and can adapt to a wide range of hardware, automatically
> > detecting things as needed. It is possible for a sysadmin to build
> > a Linux image in a way that would only work with I440FX, but I
> > don't think it would be common to see that.
> 
> I think it would be pretty hard to actually build such an image.
> 
> The more critical thing for linux guests is the storage driver which
> must be included into the initrd so the image can mount the root
> filesystem.  And the firmware, bios vs. uefi is more critical than
> pc vs. q35.
> 
> > That said I'm not really convinced that using the qcow2 headers is
> > a good plan. We have many disk image formats in common use, qcow2
> > is just one. Even if the user provides the image in qcow2 format,
> > that doesn't mean that mgmt apps actually store the qcow2 file.
> 
> > I tend to think we'd be better looking at what we can do in the context
> > of an existing standard like OVF rather than inventing something that
> > only works with qcow2. I think it would need to be more expressive than
> > just a single list of key,value pairs for each item.
> 
> Embed OVF metadata in the qcow2 image?

I'm all for using the same standard for specifying machine hints
in both cases.

Now, is there an existing mechanism for virtual hardware hints
(not requirements) in OVF, or do we have to invent one?

-- 
Eduardo


* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-22  7:35   ` Gerd Hoffmann
  2018-05-22 10:53     ` Eduardo Habkost
@ 2018-05-22 14:19     ` Michael S. Tsirkin
  2018-05-22 15:02       ` Kevin Wolf
  1 sibling, 1 reply; 157+ messages in thread
From: Michael S. Tsirkin @ 2018-05-22 14:19 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Daniel P. Berrangé,
	kwolf, ehabkost, qemu-block, qemu-devel, mreitz, stefanha

On Tue, May 22, 2018 at 09:35:55AM +0200, Gerd Hoffmann wrote:
>   Hi,
> 
> > You must /sometimes/ supply the correct machine type.
> > 
> > It is quite dependent on the guest OS you have installed, and even
> > just how the guest OS is configured.  In general Linux is very
> > flexible and can adapt to a wide range of hardware, automatically
> > detecting things as needed. It is possible for a sysadmin to build
> > a Linux image in a way that would only work with I440FX, but I
> > don't think it would be common to see that.
> 
> I think it would be pretty hard to actually build such an image.
> 
> The more critical thing for linux guests is the storage driver which
> must be included into the initrd so the image can mount the root
> filesystem.  And the firmware, bios vs. uefi is more critical than
> pc vs. q35.

I think we can start by finding a location to embed a string in a qcow
image, and adding the ability for qemu-img to set and get this string.
We can discuss how it's formatted separately.
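
One place such a string could live is a qcow2 header extension, which
the format defines as a 32-bit big-endian type, a 32-bit big-endian
length, the data, and padding up to a multiple of 8 bytes. A minimal
Python sketch of that framing follows. The type value and the
"machine=q35" payload are made up for illustration; nothing here is an
assigned extension or an existing qemu-img feature, and the code does
not touch a real image.

```python
import struct

# Made-up type value for illustration; NOT an assigned qcow2 extension type.
HYPOTHETICAL_HINT_EXT = 0x4D484E54


def pack_ext(ext_type, data):
    """Frame data as: type (u32 BE), length (u32 BE), data, pad to 8 bytes."""
    pad = (-len(data)) % 8
    return struct.pack(">II", ext_type, len(data)) + data + b"\0" * pad


def unpack_ext(buf):
    """Return (type, data) for a single framed extension, dropping padding."""
    ext_type, length = struct.unpack_from(">II", buf, 0)
    return ext_type, buf[8:8 + length]


blob = pack_ext(HYPOTHETICAL_HINT_EXT, b"machine=q35")
print(len(blob), unpack_ext(blob))
```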

> > That said I'm not really convinced that using the qcow2 headers is
> > a good plan. We have many disk image formats in common use, qcow2
> > is just one. Even if the user provides the image in qcow2 format,
> > that doesn't mean that mgmt apps actually store the qcow2 file.
> 
> > I tend to think we'd be better looking at what we can do in the context
> > of an existing standard like OVF rather than inventing something that
> > only works with qcow2. I think it would need to be more expressive than
> > just a single list of key,value pairs for each item.
> 
> Embed OVF metadata in the qcow2 image?
> 
> cheers,
>   Gerd

What would be helpful is if we could tell the user who wonders
how to run an image "hey you probably want flags X,Y and Z".
I can see how we could have an option to either stick
a bit of XML there, or just some QEMU flags.

-- 
MST


* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-22 14:19     ` Michael S. Tsirkin
@ 2018-05-22 15:02       ` Kevin Wolf
  2018-05-22 15:14         ` Eduardo Habkost
  2018-05-23  2:12         ` Fam Zheng
  0 siblings, 2 replies; 157+ messages in thread
From: Kevin Wolf @ 2018-05-22 15:02 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Gerd Hoffmann, Daniel P. Berrangé,
	ehabkost, qemu-block, qemu-devel, mreitz, stefanha

Am 22.05.2018 um 16:19 hat Michael S. Tsirkin geschrieben:
> On Tue, May 22, 2018 at 09:35:55AM +0200, Gerd Hoffmann wrote:
> >   Hi,
> > 
> > > You must /sometimes/ supply the correct machine type.
> > > 
> > > It is quite dependent on the guest OS you have installed, and even
> > > just how the guest OS is configured.  In general Linux is very
> > > flexible and can adapt to a wide range of hardware, automatically
> > > detecting things as needed. It is possible for a sysadmin to build
> > > a Linux image in a way that would only work with I440FX, but I
> > > don't think it would be common to see that.
> > 
> > I think it would be pretty hard to actually build such an image.
> > 
> > The more critical thing for linux guests is the storage driver which
> > must be included into the initrd so the image can mount the root
> > filesystem.  And the firmware, bios vs. uefi is more critical than
> > pc vs. q35.
> 
> I think we can start by finding a location to embed a string in a qcow
> image, add ability for qemu-img to set and get this string.  We can
> discuss how it's formatted separately.

If we want it, we'll find a place to store it.

But the first thing we need is a spec for what's actually in it. Just
storing a machine type hint would be a one-off hack that wouldn't last
very long before we want to add the next thing.

Essentially, what we need is a description of the virtual machine that
we suggest to use with this image. We can try to reuse something
existing there, like libvirt XML or OVF, or invent something new (a JSON
array describing runtime options?). One difference to existing formats
is probably that we want only frontends and no backends in the
description.
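
A minimal sketch of the "frontends only" idea, assuming a JSON object
rather than an array. The key names below are hypothetical and no such
schema has been agreed on; the point is only that a loader can refuse
anything that is not a known frontend hint.

```python
import json

# Hypothetical key names; no such schema exists or has been agreed on.
FRONTEND_KEYS = {"arch", "machine", "disk-frontend", "net-frontend"}


def load_hints(text):
    """Parse a JSON object of hints and reject any non-frontend key."""
    hints = json.loads(text)
    unknown = set(hints) - FRONTEND_KEYS
    if unknown:
        raise ValueError("not a frontend hint: " + ", ".join(sorted(unknown)))
    return hints


doc = ('{"arch": "x86_64", "machine": ["q35", "pc"], '
       '"disk-frontend": ["virtio-blk", "ide"]}')
hints = load_hints(doc)
print(hints["machine"])
```

Something like {"file": "/some/host/path"} would be rejected here, which
is the behaviour we want for backend options coming from an image.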

Kevin


* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-22 15:02       ` Kevin Wolf
@ 2018-05-22 15:14         ` Eduardo Habkost
  2018-05-23  2:12         ` Fam Zheng
  1 sibling, 0 replies; 157+ messages in thread
From: Eduardo Habkost @ 2018-05-22 15:14 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Michael S. Tsirkin, Gerd Hoffmann, Daniel P. Berrangé,
	qemu-block, qemu-devel, mreitz, stefanha

On Tue, May 22, 2018 at 05:02:21PM +0200, Kevin Wolf wrote:
> Am 22.05.2018 um 16:19 hat Michael S. Tsirkin geschrieben:
> > On Tue, May 22, 2018 at 09:35:55AM +0200, Gerd Hoffmann wrote:
> > >   Hi,
> > > 
> > > > You must /sometimes/ supply the correct machine type.
> > > > 
> > > > It is quite dependent on the guest OS you have installed, and even
> > > > just how the guest OS is configured.  In general Linux is very
> > > > flexible and can adapt to a wide range of hardware, automatically
> > > > detecting things as needed. It is possible for a sysadmin to build
> > > > a Linux image in a way that would only work with I440FX, but I
> > > > don't think it would be common to see that.
> > > 
> > > I think it would be pretty hard to actually build such an image.
> > > 
> > > The more critical thing for linux guests is the storage driver which
> > > must be included into the initrd so the image can mount the root
> > > filesystem.  And the firmware, bios vs. uefi is more critical than
> > > pc vs. q35.
> > 
> > I think we can start by finding a location to embed a string in a qcow
> > image, add ability for qemu-img to set and get this string.  We can
> > discuss how it's formatted separately.
> 
> If we want it, we'll find a place to store it.
> 
> But the first thing we need is a spec for what's actually in it. Just
> storing a machine type hint would be a one-off hack that wouldn't last
> very long before we want to add the next thing.
> 
> Essentially, what we need is a description of the virtual machine that
> we suggest to use with this image. We can try to reuse something
> existing there, like libvirt XML or OVF, or invent something new (a JSON
> array describing runtime options?). One difference to existing formats
> is probably that we want only frontends and no backends in the
> description.

The OVF virtual hardware description might be appropriate to
define what's required vs what's recommended, to support multiple
machine-types, and ranges of valid values for variables.

Pro: management software can reuse exactly the same logic for
qcow2 machine descriptions and OVF machine descriptions.

Con: OVF is a pretty complex specification.

-- 
Eduardo


* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-22 15:02       ` Kevin Wolf
  2018-05-22 15:14         ` Eduardo Habkost
@ 2018-05-23  2:12         ` Fam Zheng
  2018-05-23  9:16           ` Kevin Wolf
  1 sibling, 1 reply; 157+ messages in thread
From: Fam Zheng @ 2018-05-23  2:12 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Michael S. Tsirkin, ehabkost, qemu-block, qemu-devel, mreitz,
	Gerd Hoffmann, stefanha, berrange

On Tue, 05/22 17:02, Kevin Wolf wrote:
> Am 22.05.2018 um 16:19 hat Michael S. Tsirkin geschrieben:
> > On Tue, May 22, 2018 at 09:35:55AM +0200, Gerd Hoffmann wrote:
> > >   Hi,
> > > 
> > > > You must /sometimes/ supply the correct machine type.
> > > > 
> > > > It is quite dependent on the guest OS you have installed, and even
> > > > just how the guest OS is configured.  In general Linux is very
> > > > flexible and can adapt to a wide range of hardware, automatically
> > > > detecting things as needed. It is possible for a sysadmin to build
> > > > a Linux image in a way that would only work with I440FX, but I
> > > > don't think it would be common to see that.
> > > 
> > > I think it would be pretty hard to actually build such an image.
> > > 
> > > The more critical thing for linux guests is the storage driver which
> > > must be included into the initrd so the image can mount the root
> > > filesystem.  And the firmware, bios vs. uefi is more critical than
> > > pc vs. q35.
> > 
> > I think we can start by finding a location to embed a string in a qcow
> > image, add ability for qemu-img to set and get this string.  We can
> > discuss how it's formatted separately.
> 
> If we want it, we'll find a place to store it.
> 
> But the first thing we need is a spec for what's actually in it. Just
> storing a machine type hint would be a one-off hack that wouldn't last
> very long before we want to add the next thing.
> 
> Essentially, what we need is a description of the virtual machine that
> we suggest to use with this image. We can try to reuse something
> existing there, like libvirt XML or OVF, or invent something new (a JSON
> array describing runtime options?). One difference to existing formats
> is probably that we want only frontends and no backends in the
> description.
> 

Do we really need a uniform way and require compliance to the standard we
choose, and implement verification in the block driver, or can we get away with
a description field that accepts any text and leave it to the user to decide
what to put there? In the header we could assign a Content-type field that
defaults to 'text/plain' to the description, that way apps can mark the data as
"application/ovf" if they want, or whatever the upper layer decides.
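
A minimal sketch of that idea in Python, assuming the description is
stored as an opaque payload plus a content-type tag. The field names
and helper functions are invented for illustration; the block driver
would only carry the bytes and the upper layer would interpret them.

```python
def make_description(payload, content_type="text/plain"):
    """Wrap an opaque payload with a content-type tag."""
    return {"content-type": content_type, "data": payload}


def describe(desc):
    """Show plain text directly; for anything else only report what it is."""
    if desc["content-type"] == "text/plain":
        return desc["data"].decode("utf-8")
    return "<%d bytes of %s>" % (len(desc["data"]), desc["content-type"])


note = make_description(b"use -M q35")
ovf = make_description(b"<Envelope/>", "application/ovf")
print(describe(note))
print(describe(ovf))
```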

Fam


* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-23  2:12         ` Fam Zheng
@ 2018-05-23  9:16           ` Kevin Wolf
  2018-05-23 14:46             ` Michael S. Tsirkin
  0 siblings, 1 reply; 157+ messages in thread
From: Kevin Wolf @ 2018-05-23  9:16 UTC (permalink / raw)
  To: Fam Zheng
  Cc: Michael S. Tsirkin, ehabkost, qemu-block, qemu-devel, mreitz,
	Gerd Hoffmann, stefanha, berrange

Am 23.05.2018 um 04:12 hat Fam Zheng geschrieben:
> On Tue, 05/22 17:02, Kevin Wolf wrote:
> > Am 22.05.2018 um 16:19 hat Michael S. Tsirkin geschrieben:
> > > On Tue, May 22, 2018 at 09:35:55AM +0200, Gerd Hoffmann wrote:
> > > >   Hi,
> > > > 
> > > > > You must /sometimes/ supply the correct machine type.
> > > > > 
> > > > > It is quite dependent on the guest OS you have installed, and even
> > > > > just how the guest OS is configured.  In general Linux is very
> > > > > flexible and can adapt to a wide range of hardware, automatically
> > > > > detecting things as needed. It is possible for a sysadmin to build
> > > > > a Linux image in a way that would only work with I440FX, but I
> > > > > don't think it would be common to see that.
> > > > 
> > > > I think it would be pretty hard to actually build such an image.
> > > > 
> > > > The more critical thing for linux guests is the storage driver which
> > > > must be included into the initrd so the image can mount the root
> > > > filesystem.  And the firmware, bios vs. uefi is more critical than
> > > > pc vs. q35.
> > > 
> > > I think we can start by finding a location to embed a string in a qcow
> > > image, add ability for qemu-img to set and get this string.  We can
> > > discuss how it's formatted separately.
> > 
> > If we want it, we'll find a place to store it.
> > 
> > But the first thing we need is a spec for what's actually in it. Just
> > storing a machine type hint would be a one-off hack that wouldn't last
> > very long before we want to add the next thing.
> > 
> > Essentially, what we need is a description of the virtual machine that
> > we suggest to use with this image. We can try to reuse something
> > existing there, like libvirt XML or OVF, or invent something new (a JSON
> > array describing runtime options?). One difference to existing formats
> > is probably that we want only frontends and no backends in the
> > description.
> > 
> 
> Do we really need a uniform way and require compliance to the standard we
> choose, and implement verification in the block driver, or can we get away with
> a description field that accepts any text and leave it to the user to decide
> what to put there? In the header we could assign a Content-type field that
> defaults to 'text/plain' to the description, that way apps can mark the data as
> "application/ovf" if they want, or whatever the upper layer decides.

Yes, we can. But I'm not sure I want to. Providing low-level features
without telling users how they are supposed to be used usually results
in a big surprise for both sides eventually.

Kevin


* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-21 19:01           ` Eduardo Habkost
@ 2018-05-23 11:19             ` Markus Armbruster
  2018-05-23 12:13               ` Eduardo Habkost
  0 siblings, 1 reply; 157+ messages in thread
From: Markus Armbruster @ 2018-05-23 11:19 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Daniel P. Berrangé,
	kwolf, qemu-block, Michael S. Tsirkin, qemu-devel, mreitz,
	stefanha

Eduardo Habkost <ehabkost@redhat.com> writes:

> On Mon, May 21, 2018 at 07:44:40PM +0100, Daniel P. Berrangé wrote:
>> On Mon, May 21, 2018 at 03:29:28PM -0300, Eduardo Habkost wrote:
>> > On Sat, May 19, 2018 at 08:05:06AM +0200, Markus Armbruster wrote:
>> > > Eduardo Habkost <ehabkost@redhat.com> writes:
>> > > 
>> > > [...]
>> > > > About being more expressive than just a single list of key,value
>> > > > pairs, I don't see any evidence of that being necessary for the
>> > > > problems we're trying to address.
>> > > 
>> > > Short history of a configuration format you might have encountered:
>> 
>> [snip]
>> 
>> > > How confident are we a single list of (key, value) is really all we're
>> > > going to need?
>> > > 
>> > > Even if we think it is, would it be possible to provide for a future
>> > > extension to trees at next to no cost?
>> > 
>> > I'm confident that a list of key,values is all we need for the
>> > current problem.
>> 
>> I'm not convinced. A disk image may work with Q35 or i440fx,  or
>> work with any of virtio, ide or sata disk. So that already means
>> values have to be arrays, not scalars. You could do that with a
>> simple key,value list, but only by defining a mapping of arrays
>> into a flattened form. eg do we allow repeated keys, or do we
>> allow array indexes on keys. 
>
> No problem, we can support trees if it's necessary.
>
>
>> > The point here is to allow users to simply copy an existing disk
>> > image, and it will contain enough hints for a cloud stack to
>> > choose reasonable defaults for machine-type and disk type
>> > automatically.  Requiring the user to perform a separate step to
>> > encapsulate the disk image in another file format defeats the
>> > whole purpose of the proposal.
>> 
>> It doesn't have to mean more work for the user - the application
>> that is used to create the image can do that on their behalf.
>> oVirt for example can import/export OVA files, containing OVF
>> metadata. I could imagine virt-manager, and other tools adding
>> export ability without much trouble if this was deemed a desirable
>> thing. Bundling gives ability to have multiple disk images in one
>> archive, which is something OVF does.
>
> I have the impression that "the application that is used to
> create the image" is a very large set.  It can be virt-manager,
> virt-install, virt-manager, or even QEMU itself.
>
> Today people can simply create a VM on virt-manager, or run QEMU
> manually, and upload the qcow2 image directly from its original
> location (they don't need to copy/export it).  Don't we want the
> same procedure to keep working instead of requiring users to use
> another tool?

Today, I can take the disk out of my old computer, put it into my new
computer, and it just works.  Don't we want the same procedure to keep
working forever?

Sadly, wanting something badly enough doesn't make actual solutions any
easier :)

My point is: disk images (real or virtual) keep working in different
hardware contexts by a mixture of flexibility built into system software
on the image, disciplined evolution of hardware (real or virtual), and
dumb luck.  It works until it doesn't.  And then you get to tinker.

With OVF, you solve the problem further up the stack: you do virtual
appliances instead of disk images.

How much space that leaves for useful solutions at the level of QEMU I
can't say.

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-23 11:19             ` Markus Armbruster
@ 2018-05-23 12:13               ` Eduardo Habkost
  2018-05-23 16:35                 ` Markus Armbruster
  0 siblings, 1 reply; 157+ messages in thread
From: Eduardo Habkost @ 2018-05-23 12:13 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Daniel P. Berrangé,
	kwolf, qemu-block, Michael S. Tsirkin, qemu-devel, mreitz,
	stefanha

On Wed, May 23, 2018 at 01:19:46PM +0200, Markus Armbruster wrote:
> Eduardo Habkost <ehabkost@redhat.com> writes:
> 
> > On Mon, May 21, 2018 at 07:44:40PM +0100, Daniel P. Berrangé wrote:
> >> On Mon, May 21, 2018 at 03:29:28PM -0300, Eduardo Habkost wrote:
> >> > On Sat, May 19, 2018 at 08:05:06AM +0200, Markus Armbruster wrote:
> >> > > Eduardo Habkost <ehabkost@redhat.com> writes:
> >> > > 
> >> > > [...]
> >> > > > About being more expressive than just a single list of key,value
> >> > > > pairs, I don't see any evidence of that being necessary for the
> >> > > > problems we're trying to address.
> >> > > 
> >> > > Short history of a configuration format you might have encountered:
> >> 
> >> [snip]
> >> 
> >> > > How confident are we a single list of (key, value) is really all we're
> >> > > going to need?
> >> > > 
> >> > > Even if we think it is, would it be possible to provide for a future
> >> > > extension to trees at next to no cost?
> >> > 
> >> > I'm confident that a list of key,values is all we need for the
> >> > current problem.
> >> 
> >> I'm not convinced. A disk image may work with Q35 or i440fx,  or
> >> work with any of virtio, ide or sata disk. So that already means
> >> values have to be arrays, not scalars. You could do that with a
> >> simple key,value list, but only by defining a mapping of arrays
> >> into a flattened form. eg do we allow repeated keys, or do we
> >> allow array indexes on keys. 
> >
> > No problem, we can support trees if it's necessary.
> >
> >
> >> > The point here is to allow users to simply copy an existing disk
> >> > image, and it will contain enough hints for a cloud stack to
> >> > choose reasonable defaults for machine-type and disk type
> >> > automatically.  Requiring the user to perform a separate step to
> >> > encapsulate the disk image in another file format defeats the
> >> > whole purpose of the proposal.
> >> 
> >> It doesn't have to mean more work for the user - the application
> >> that is used to create the image can do that on their behalf.
> >> oVirt for example can import/export OVA files, containing OVF
> >> metadata. I could imagine virt-manager, and other tools adding
> >> export ability without much trouble if this was deemed a desirable
> >> thing. Bundling gives ability to have multiple disk images in one
> >> archive, which is something OVF does.
> >
> > I have the impression that "the application that is used to
> > create the image" is a very large set.  It can be virt-manager,
> > virt-install, virt-manager, or even QEMU itself.
> >
> > Today people can simply create a VM on virt-manager, or run QEMU
> > manually, and upload the qcow2 image directly from its original
> > location (they don't need to copy/export it).  Don't we want the
> > same procedure to keep working instead of requiring users to use
> > another tool?
> 
> Today, I can take the disk out of my old computer, put it into my new
> computer, and it just works.  Don't we want the same procedure to keep
> working forever?

I don't think the comparison is fair: downloading hard disk
images for bare metal hardware is not as common as downloading
guest disk images for cloud infrastructure.

> 
> Sadly, wanting something badly enough doesn't make actual solutions any
> easier :)

This part is true.  :)

> 
> My point is: disk images (real or virtual) keep working in different
> hardware contexts by a mixture of flexibility built into system software
> on the image, disciplined evolution of hardware (real or virtual), and
> dumb luck.  It works until it doesn't.  And then you get to tinker.

Personally, I believe that tools for running and managing virtual
hardware can and should be smarter than real hardware.

> 
> With OVF, you solve the problem further up the stack: you do virtual
> appliances instead of disk images.
> 

I guess the main problem is that people are already using disk
images as if they were virtual appliances.

We can tell people to stop doing that and use OVF, but then we
won't make anybody's life any easier: publishers of images might
need to generate both qcow2 and OVF images if they want it to
work with older hosts; consumers will need to find out if they
need qcow2 or OVF.

But I work too deep down the stack to tell if it's really
important to avoid these problems or not.  And as you said, this
doesn't make actual solutions any easier.


> How much space that leaves for useful solutions at the level of QEMU I
> can't say.

I have no doubt about "useful", but I'm not sure about
"important".

I guess the question is if we have people with time and resources
to work on solutions (whether using qcow2 or OVF).

-- 
Eduardo

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-23  9:16           ` Kevin Wolf
@ 2018-05-23 14:46             ` Michael S. Tsirkin
  0 siblings, 0 replies; 157+ messages in thread
From: Michael S. Tsirkin @ 2018-05-23 14:46 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Fam Zheng, ehabkost, qemu-block, qemu-devel, mreitz,
	Gerd Hoffmann, stefanha, berrange

On Wed, May 23, 2018 at 11:16:04AM +0200, Kevin Wolf wrote:
> Am 23.05.2018 um 04:12 hat Fam Zheng geschrieben:
> > On Tue, 05/22 17:02, Kevin Wolf wrote:
> > > Am 22.05.2018 um 16:19 hat Michael S. Tsirkin geschrieben:
> > > > On Tue, May 22, 2018 at 09:35:55AM +0200, Gerd Hoffmann wrote:
> > > > >   Hi,
> > > > > 
> > > > > > You must /sometimes/ supply the correct machine type.
> > > > > > 
> > > > > > It is quite dependent on the guest OS you have installed, and even
> > > > > > just how the guest OS is configured.  In general Linux is very
> > > > > > flexible and can adapt to a wide range of hardware, automatically
> > > > > > detecting things as needed. It is possible for a sysadmin to build
> > > > > > a Linux image in a way that would only work with I440FX, but I
> > > > > > don't think it would be common to see that.
> > > > > 
> > > > > I think it would be pretty hard to actually build such an image.
> > > > > 
> > > > > The more critical thing for linux guests is the storage driver which
> > > > > must be included into the initrd so the image can mount the root
> > > > > filesystem.  And the firmware, bios vs. uefi is more critical than
> > > > > pc vs. q35.
> > > > 
> > > > I think we can start by finding a location to embed a string in a qcow
> > > > image, add ability for qemu-img to set and get this string.  We can
> > > > discuss how it's formatted separately.
> > > 
> > > If we want it, we'll find a place to store it.
> > > 
> > > But the first thing we need is a spec for what's actually in it. Just
> > > storing a machine type hint would be a one-off hack that wouldn't last
> > > very long before we want to add the next thing.
> > > 
> > > Essentially, what we need is a description of the virtual machine that
> > > we suggest to use with this image. We can try to reuse something
> > > existing there, like libvirt XML or OVF, or invent something new (a JSON
> > > array describing runtime options?). One difference to existing formats
> > > is probably that we want only frontends and no backends in the
> > > description.
> > > 
> > 
> > Do we really need a uniform way and require compliance to the standard we
> > choose, and implement verification in the block driver, or can we get away with
> > a description field that accepts any text and leave it to the user to decide
> > what to put there? In the header we could assign a Content-type field that
> > defaults to 'text/plain' to the description, that way apps can mark the data as
> > "application/ovf" if they want, or whatever the upper layer decides.
> 
> Yes, we can. But I'm not sure if I want. Providing low-level features
> without telling users how they are supposed to be used usually results
> in a big surprise for both sides eventually.
> 
> Kevin

The idea to include a format in there sounds very reasonable to me
though. We can then start with a simple text format just showing the
QEMU command line, and others can reuse it for OVF format, etc.

-- 
MST

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-23 12:13               ` Eduardo Habkost
@ 2018-05-23 16:35                 ` Markus Armbruster
  2018-05-29 14:06                   ` Dr. David Alan Gilbert
  2018-06-05 21:58                   ` Michal Suchánek
  0 siblings, 2 replies; 157+ messages in thread
From: Markus Armbruster @ 2018-05-23 16:35 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: kwolf, qemu-block, Michael S. Tsirkin, qemu-devel, mreitz, stefanha

Eduardo Habkost <ehabkost@redhat.com> writes:

> On Wed, May 23, 2018 at 01:19:46PM +0200, Markus Armbruster wrote:
>> Eduardo Habkost <ehabkost@redhat.com> writes:
>> 
>> > On Mon, May 21, 2018 at 07:44:40PM +0100, Daniel P. Berrangé wrote:
>> >> On Mon, May 21, 2018 at 03:29:28PM -0300, Eduardo Habkost wrote:
>> >> > On Sat, May 19, 2018 at 08:05:06AM +0200, Markus Armbruster wrote:
>> >> > > Eduardo Habkost <ehabkost@redhat.com> writes:
>> >> > > 
>> >> > > [...]
>> >> > > > About being more expressive than just a single list of key,value
>> >> > > > pairs, I don't see any evidence of that being necessary for the
>> >> > > > problems we're trying to address.
>> >> > > 
>> >> > > Short history of a configuration format you might have encountered:
>> >> 
>> >> [snip]
>> >> 
>> >> > > How confident are we a single list of (key, value) is really all we're
>> >> > > going to need?
>> >> > > 
>> >> > > Even if we think it is, would it be possible to provide for a future
>> >> > > extension to trees at next to no cost?
>> >> > 
>> >> > I'm confident that a list of key,values is all we need for the
>> >> > current problem.
>> >> 
>> >> I'm not convinced. A disk image may work with Q35 or i440fx,  or
>> >> work with any of virtio, ide or sata disk. So that already means
>> >> values have to be arrays, not scalars. You could do that with a
>> >> simple key,value list, but only by defining a mapping of arrays
>> >> into a flattened form. eg do we allow repeated keys, or do we
>> >> allow array indexes on keys. 
>> >
>> > No problem, we can support trees if it's necessary.
>> >
>> >
>> >> > The point here is to allow users to simply copy an existing disk
>> >> > image, and it will contain enough hints for a cloud stack to
>> >> > choose reasonable defaults for machine-type and disk type
>> >> > automatically.  Requiring the user to perform a separate step to
>> >> > encapsulate the disk image in another file format defeats the
>> >> > whole purpose of the proposal.
>> >> 
>> >> It doesn't have to mean more work for the user - the application
>> >> that is used to create the image can do that on their behalf.
>> >> oVirt for example can import/export OVA files, containing OVF
>> >> metadata. I could imagine virt-manager, and other tools adding
>> >> export ability without much trouble if this was deemed a desirable
>> >> thing. Bundling gives ability to have multiple disk images in one
>> >> archive, which is something OVF does.
>> >
>> > I have the impression that "the application that is used to
>> > create the image" is a very large set.  It can be virt-manager,
>> > virt-install, virt-manager, or even QEMU itself.
>> >
>> > Today people can simply create a VM on virt-manager, or run QEMU
>> > manually, and upload the qcow2 image directly from its original
>> > location (they don't need to copy/export it).  Don't we want the
>> > same procedure to keep working instead of requiring users to use
>> > another tool?
>> 
>> Today, I can take the disk out of my old computer, put it into my new
>> computer, and it just works.  Don't we want the same procedure to keep
>> working forever?
>
> I don't think the comparison is fair: downloading hard disk
> images for bare metal hardware is not as common as downloading
> guest disk images for cloud infrastructure.

Can't let "fair" get in the way of a witticism!

Seriously, though: are disk images a sane way to package software?

>> Sadly, wanting something badly enough doesn't make actual solutions any
>> easier :)
>
> This part is true.  :)
>
>> 
>> My point is: disk images (real or virtual) keep working in different
>> hardware contexts by a mixture of flexibility built into system software
>> on the image, disciplined evolution of hardware (real or virtual), and
>> dumb luck.  It works until it doesn't.  And then you get to tinker.
>
> Personally, I believe that tools for running and managing virtual
> hardware can and should be smarter than real hardware.

Point taken.

Adding hints to disk images (which is how I understand your proposal)
could make them (with a bit of luck) work more often.  Luck, because the
receiving end needs to interpret the hints in a way that makes the image
work.  In other words, guesswork and duct tape.  Both guesswork and duct
tape are incredibly useful when no better tools are at hand.  Is that
the case here?

>> With OVF, you solve the problem further up the stack: you do virtual
>> appliances instead of disk images.
>> 
>
> I guess the main problem is that people are already using disk
> images as if they were virtual appliances.
>
> We can tell people to stop doing that and use OVF, but then we
> won't make anybody's life any easier: publishers of images might
> need to generate both qcow2 and OVF images if they want it to
> work with older hosts; consumers will need to find out if they
> need qcow2 or OVF.

I'm afraid providing for "hints" in QCOW2 could only add problems.  To
pick the right hints, publishers need to predict how future software
consuming the image will interpret them.  Consumers may have to
configure their software to interpret hints in various ways.

I figure my belly-aching is due to the general fuzziness of "hints", at
least in my mind.  Would it even be possible to do anything remotely
similar to a specification for them?

> But I work too deep down the stack to tell if it's really
> important to avoid these problems or not.  And as you said, this
> doesn't make actual solutions any easier.
>
>
>> How much space that leaves for useful solutions at the level of QEMU I
>> can't say.
>
> I have no doubt about "useful", but I'm not sure about
> "important".

I'm in doubt on "feasible", but that might be due to me not fully
grasping the intended use of hints.

> I guess the question is if we have people with time and resources
> to work on solutions (whether using qcow2 or OVF).

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-21 20:33       ` Eduardo Habkost
@ 2018-05-24  9:58         ` Kashyap Chamarthy
  0 siblings, 0 replies; 157+ messages in thread
From: Kashyap Chamarthy @ 2018-05-24  9:58 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Daniel P. Berrangé,
	kwolf, qemu-block, Michael S. Tsirkin, qemu-devel, mreitz,
	stefanha

On Mon, May 21, 2018 at 05:33:22PM -0300, Eduardo Habkost wrote:
> On Mon, May 21, 2018 at 09:18:17PM +0100, Daniel P. Berrangé wrote:

[...]

(Just catching up with this thread.)

[...]

> > > I have very specific goal here: the goal is to make it less
> > > painful to users when OpenStack+libvirt+QEMU switch to using a
> > > different machine-type by default (q35), and/or when guest OSes
> > > stop supporting pc-i440fx.  I assume this is a goal for OpenStack
> > > as well.
> > > 
> > > We can make the solution to be more extensible and solve other
> > > problems as well, but my original goal is the one above.
> > 
> > Configuring the machine type is just one thing that users
> > would do with OpenStack though.  A simple example might be
> > 
> >     openstack image set \
> >          --property hw_disk_bus=scsi \
> >          --property hw_vif_model=e1000e

[...]

> > Setting a non-default machine type is one extra prop
> > 
> >     openstack image set \
> >          --property hw_machine_type=q35 \
> >          --property os_distro=fedora26
> 
> Nice.  Are these just hypothetical examples, or something that
> already works?

No, not hypothetical -- they actually work _today_, and customers
actively use them in production as we speak.  The machine type can be
set in two ways in OpenStack.  One is as Dan noted above, which is *per*
disk image.  The other is per Compute node (where QEMU instances are
launched), by setting a config attribute in a file
(/etc/nova/nova.conf):

        [libvirt]
        ...
        hw_machine_type=x86_64=pc-q35-2.9

The above means _all_ QEMU instances launched on that Compute node will
get the 'pc-q35-2.9' machine type.

[...]

-- 
/kashyap

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-18 17:09 ` Daniel P. Berrangé
  2018-05-18 17:41   ` Eduardo Habkost
  2018-05-22  7:35   ` Gerd Hoffmann
@ 2018-05-24 11:17   ` Richard W.M. Jones
  2018-05-29 14:03     ` Dr. David Alan Gilbert
  2 siblings, 1 reply; 157+ messages in thread
From: Richard W.M. Jones @ 2018-05-24 11:17 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Michael S. Tsirkin, kwolf, ehabkost, qemu-block, qemu-devel,
	mreitz, stefanha

On Fri, May 18, 2018 at 06:09:56PM +0100, Daniel P. Berrangé wrote:
> The closest to a cross-hypervisor standard is OVF which can store
> metadata about required hardware for a VM. I'm pretty sure it does
> not have the concept of machine types, but maybe it has a way for
> people to define metadata extensions. Since it is just XML at the
> end of the day, even if there was nothing official in OVF, it would
> be possible to just define a custom XML namespace and declare a
> schema for that to follow.

I have a great deal of experience with the OVF "standard".
TL;DR: DO NOT USE IT.

Long answer copied from a rant I wrote on an internal mailing list a
while back:

  Don't make the mistake of confusing OVF for a format.  It's not,
  there are at least 4 non-interoperable OVF "format"s around:

   - 2 x oVirt OVF
   - VMware's OVF used in exported OVA files
   - VirtualBox's OVF used in their exported OVA files

  These are all different and do not interoperate *at all*.  So before
  you decide "let's parse OVF", be precise about which format(s) you
  actually want to parse.

  Also OVF is a hideous format.  Many fields are obviously internal data
  dumps of VMware structures, complete with internal VMware IDs instead
  of descriptive names.  Where there are descriptive names, they use
  English strings instead of keywords, like: <rasd:AllocationUnits>MetaBytes</>
  or my particular WTF favourite, a meaningful field which references
  English (only) Wikipedia:

    <Disk ovf:format="http://en.wikipedia.org/wiki/Byte">

  File references are split over two places, and there are other
  examples where data is needlessly duplicated or it's unclear what data
  is supposed to be.

  Of course VMware Inc. are not stupid enough to use this format for
  their own purposes.  They use a completely different format (VMX)
  which is a lot like YAML.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-18 15:30 [Qemu-devel] storing machine data in qcow images? Michael S. Tsirkin
                   ` (2 preceding siblings ...)
  2018-05-22  8:50 ` Philipp Hahn
@ 2018-05-24 11:32 ` Richard W.M. Jones
  2018-05-24 14:56   ` Michael S. Tsirkin
  2018-05-28 18:10   ` Max Reitz
  3 siblings, 2 replies; 157+ messages in thread
From: Richard W.M. Jones @ 2018-05-24 11:32 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: ehabkost, stefanha, kwolf, mreitz, qemu-devel, qemu-block

I read the whole thread and the fundamental problem is that you're
mixing layers.  Let qcow2 be a disk image format, and let management
layers deal with metadata and how to run qemu.

What's going to happen when you have (eg) an OVA file containing qcow2
files, and the qcow2 files all have different metadata from each other
and from the actual metadata in the OVA?  Even the case where you've
got ‘-hda file1.qcow2 -hdb file2.qcow2’ is not properly defined.  What
happens if someone uses ‘-M mach1 -hda file.qcow2’ and the machine
type in the qcow2 file conflicts with the command line?

BTW we have tooling (libguestfs) which can tell you what devices are
supported by the guest.  virt-v2v already uses libguestfs to find out
the full list of devices supported by guests, and uses that to drive
conversion.  At some point we're going to extend virt-inspector to
make this a bit easier (patches and other contributions welcome,
there's a huge list of work to do on libguestfs and not enough
developers to get through it).

There is however a seed of a good idea in the thread:

> I don't think QEMU needs to use this information automatically,
> necessarily.  I think the first step is to simply make QEMU save
> this information in the disk image, and making qemu-img able to
> read and write this information.

It would be nice if qcow2 added arbitrary data sections (which would
always be ignored by qemu) for storing additional data.  This could be
used to create a compact qcow2 + metadata format to rival OVA for
management layers to use, and there are various other uses too.

Rich.


^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-24 11:32 ` Richard W.M. Jones
@ 2018-05-24 14:56   ` Michael S. Tsirkin
  2018-05-24 15:08     ` Kevin Wolf
  2018-05-28 18:10   ` Max Reitz
  1 sibling, 1 reply; 157+ messages in thread
From: Michael S. Tsirkin @ 2018-05-24 14:56 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: ehabkost, stefanha, kwolf, mreitz, qemu-devel, qemu-block

On Thu, May 24, 2018 at 12:32:51PM +0100, Richard W.M. Jones wrote:
> There is however a seed of a good idea in the thread:
> 
> > I don't think QEMU needs to use this information automatically,
> > necessarily.  I think the first step is to simply make QEMU save
> > this information in the disk image, and making qemu-img able to
> > read and write this information.
> 
> It would be nice if qcow2 added arbitrary data sections (which would
> always be ignored by qemu) for storing additional data.  This could be
> used to create a compact qcow2 + metadata format to rival OVA for
> management layers to use, and there are various other uses too.
> 
> Rich.

I think this part is pretty uncontroversial.

But can we add data without changing the version?

typedef struct QCowHeader {
    uint32_t magic;
    uint32_t version;
    uint64_t backing_file_offset;
    uint32_t backing_file_size;
    uint32_t mtime;
    uint64_t size; /* in bytes */
    uint8_t cluster_bits;
    uint8_t l2_bits;
    uint16_t padding;
    uint32_t crypt_method;
    uint64_t l1_table_offset;
} QEMU_PACKED QCowHeader;


How about changing mtime to a flags bitmap?
E.g. 0x1 would mean there's an extended header.

And then:

struct QCowExtHeader {
    uint64_t meta_data_offset;
    uint32_t meta_data_size;
    uint32_t meta_data_format; /* 0x0 - UTF-8 NULL-terminated string */
};


Thanks!

-- 
MST

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-24 14:56   ` Michael S. Tsirkin
@ 2018-05-24 15:08     ` Kevin Wolf
  2018-05-24 15:19       ` Michael S. Tsirkin
  2018-05-24 15:20       ` Richard W.M. Jones
  0 siblings, 2 replies; 157+ messages in thread
From: Kevin Wolf @ 2018-05-24 15:08 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Richard W.M. Jones, ehabkost, stefanha, mreitz, qemu-devel, qemu-block

Am 24.05.2018 um 16:56 hat Michael S. Tsirkin geschrieben:
> On Thu, May 24, 2018 at 12:32:51PM +0100, Richard W.M. Jones wrote:
> > There is however a seed of a good idea in the thread:
> > 
> > > I don't think QEMU needs to use this information automatically,
> > > necessarily.  I think the first step is to simply make QEMU save
> > > this information in the disk image, and making qemu-img able to
> > > read and write this information.
> > 
> > It would be nice if qcow2 added arbitrary data sections (which would
> > always be ignored by qemu) for storing additional data.  This could be
> > used to create a compact qcow2 + metadata format to rival OVA for
> > management layers to use, and there are various other uses too.
> > 
> > Rich.
> 
> I think this part is pretty uncontroversial.
> 
> But can we add data without changing the version?

Yes. Don't worry about where to store it, we'll solve this. Do worry
about what to store.

> typedef struct QCowHeader {
>     uint32_t magic;
>     uint32_t version;
>     uint64_t backing_file_offset;
>     uint32_t backing_file_size;
>     uint32_t mtime;
>     uint64_t size; /* in bytes */
>     uint8_t cluster_bits;
>     uint8_t l2_bits;
>     uint16_t padding;
>     uint32_t crypt_method;
>     uint64_t l1_table_offset;
> } QEMU_PACKED QCowHeader;
> 
> How about changing mtime to a flags bitmap?
> E.g. 0x1 would mean there's an extended header.

You're looking at the qcow1 header. qcow2 has mechanisms to extend the
metadata, including compatible and incompatible feature flags and a
header_length field.
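
To illustrate the extension mechanism mentioned above, here is a minimal
sketch in C.  It assumes the layout described in the qcow2 specification:
header extensions follow the fixed header (at offset header_length), each
one is a big-endian 32-bit type and a big-endian 32-bit length followed by
the data padded to a multiple of 8 bytes, and type 0 ends the list.  The
function name qcow2_find_ext and the helper be32 are made up for this
example and are not QEMU code; the caller is assumed to have read the
extension area into memory already.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Type 0 marks the end of the header extension area. */
#define QCOW2_EXT_END 0x00000000u

/* Read a big-endian 32-bit value from an unaligned buffer. */
static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/*
 * Hypothetical helper: search the extension area [buf, buf + len) for an
 * extension of type `want`.  Returns a pointer to its payload and stores
 * the payload length in *out_len, or returns NULL if the extension is
 * absent or the area is truncated.
 */
static const uint8_t *qcow2_find_ext(const uint8_t *buf, size_t len,
                                     uint32_t want, uint32_t *out_len)
{
    size_t off = 0;

    assert(buf != NULL && out_len != NULL);

    while (len - off >= 8) {
        uint32_t type = be32(buf + off);
        uint32_t elen = be32(buf + off + 4);
        size_t padded;

        off += 8;
        if (type == QCOW2_EXT_END) {
            return NULL;            /* end marker reached, not found */
        }
        if (elen > len - off) {
            return NULL;            /* payload runs past the buffer */
        }
        if (type == want) {
            *out_len = elen;
            return buf + off;
        }
        /* Skip the payload, rounded up to the next multiple of 8. */
        padded = ((size_t)elen + 7) & ~(size_t)7;
        if (padded > len - off) {
            return NULL;
        }
        off += padded;
    }
    return NULL;
}
```

A reader that does not recognise an extension type simply skips it, which
is why a new extension can be added without bumping the version field.
Which type value a metadata extension would use is not decided anywhere
in this thread.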

Kevin

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-24 15:08     ` Kevin Wolf
@ 2018-05-24 15:19       ` Michael S. Tsirkin
  2018-05-24 15:20       ` Richard W.M. Jones
  1 sibling, 0 replies; 157+ messages in thread
From: Michael S. Tsirkin @ 2018-05-24 15:19 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Richard W.M. Jones, ehabkost, stefanha, mreitz, qemu-devel, qemu-block

On Thu, May 24, 2018 at 05:08:17PM +0200, Kevin Wolf wrote:
> Am 24.05.2018 um 16:56 hat Michael S. Tsirkin geschrieben:
> > On Thu, May 24, 2018 at 12:32:51PM +0100, Richard W.M. Jones wrote:
> > > There is however a seed of a good idea in the thread:
> > > 
> > > > I don't think QEMU needs to use this information automatically,
> > > > necessarily.  I think the first step is to simply make QEMU save
> > > > this information in the disk image, and making qemu-img able to
> > > > read and write this information.
> > > 
> > > It would be nice if qcow2 added arbitrary data sections (which would
> > > always be ignored by qemu) for storing additional data.  This could be
> > > used to create a compact qcow2 + metadata format to rival OVA for
> > > management layers to use, and there are various other uses too.
> > > 
> > > Rich.
> > 
> > I think this part is pretty uncontroversial.
> > 
> > > But can we add data without changing the version?
> 
> Yes. Don't worry about where to store it, we'll solve this. Do worry
> about what to store.

Let's start with a UTF-8 string.

> > typedef struct QCowHeader {
> >     uint32_t magic;
> >     uint32_t version;
> >     uint64_t backing_file_offset;
> >     uint32_t backing_file_size;
> >     uint32_t mtime;
> >     uint64_t size; /* in bytes */
> >     uint8_t cluster_bits;
> >     uint8_t l2_bits;
> >     uint16_t padding;
> >     uint32_t crypt_method;
> >     uint64_t l1_table_offset;
> > } QEMU_PACKED QCowHeader;
> > 
> > How about changing mtime to a flags bitmap?
> > E.g. 0x1 would mean there's an extended header.
> 
> You're looking at the qcow1 header. qcow2 has mechanisms to extend the
> metadata, including compatible and incompatible feature flags and a
> header_length field.
> 
> Kevin

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-24 15:08     ` Kevin Wolf
  2018-05-24 15:19       ` Michael S. Tsirkin
@ 2018-05-24 15:20       ` Richard W.M. Jones
  2018-05-24 16:25         ` Markus Armbruster
  1 sibling, 1 reply; 157+ messages in thread
From: Richard W.M. Jones @ 2018-05-24 15:20 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Michael S. Tsirkin, ehabkost, stefanha, mreitz, qemu-devel, qemu-block

On Thu, May 24, 2018 at 05:08:17PM +0200, Kevin Wolf wrote:
> Am 24.05.2018 um 16:56 hat Michael S. Tsirkin geschrieben:
> > On Thu, May 24, 2018 at 12:32:51PM +0100, Richard W.M. Jones wrote:
> > > There is however a seed of a good idea in the thread:
> > > 
> > > > I don't think QEMU needs to use this information automatically,
> > > > necessarily.  I think the first step is to simply make QEMU save
> > > > this information in the disk image, and making qemu-img able to
> > > > read and write this information.
> > > 
> > > It would be nice if qcow2 added arbitrary data sections (which would
> > > always be ignored by qemu) for storing additional data.  This could be
> > > used to create a compact qcow2 + metadata format to rival OVA for
> > > management layers to use, and there are various other uses too.
> > > 
> > > Rich.
> > 
> > I think this part is pretty uncontroversial.
> > 
> > > But can we add data without changing the version?
> 
> Yes. Don't worry about where to store it, we'll solve this. Do worry
> about what to store.

Ideally from my point of view: named blobs.  More formally, any number
of (key, value) pairs where the key is a simple string, and the value
is a binary blob.  The binary blobs might really be XML or YAML or an
icon or whatever, but qemu would not need to look inside them.

We could then attach metadata (in some to-be-decided format) to qcow2
files and create a compact rival to OVA without needing to encode any
knowledge of the metadata into qemu at all.

Another use for this is allowing qcow2 files to contain names, titles,
descriptions, creation date, OS icons, etc. which could be displayed
in file managers.
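
As a rough sketch of what "named blobs" could look like on disk, the C
below looks up a key in a flat list of records.  The record layout is
purely an assumption made for illustration (big-endian 32-bit key length,
big-endian 32-bit value length, key bytes, value bytes, no padding);
nothing like it exists in qcow2 today, and the names blob_be32 and
blob_lookup are hypothetical.  The point is only that the reader compares
keys and never interprets the value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a big-endian 32-bit value from an unaligned buffer. */
static uint32_t blob_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/*
 * Hypothetical lookup over records laid out as
 *   key_len (be32) | value_len (be32) | key bytes | value bytes
 * Returns a pointer to the value and stores its length in *val_len, or
 * returns NULL if the key is missing or a record is truncated.
 */
static const uint8_t *blob_lookup(const uint8_t *buf, size_t len,
                                  const char *key, uint32_t *val_len)
{
    size_t want;
    size_t off = 0;

    assert(buf != NULL && key != NULL && val_len != NULL);
    want = strlen(key);

    while (len - off >= 8) {
        uint32_t klen = blob_be32(buf + off);
        uint32_t vlen = blob_be32(buf + off + 4);

        off += 8;
        if (klen > len - off || vlen > len - off - klen) {
            return NULL;            /* record runs past the buffer */
        }
        if (klen == want && memcmp(buf + off, key, klen) == 0) {
            *val_len = vlen;        /* value is opaque to the reader */
            return buf + off + klen;
        }
        off += (size_t)klen + vlen; /* skip to the next record */
    }
    return NULL;
}
```

With a layout like this, a management tool could store an XML document
under one key and an icon under another, and qemu would only need to
carry the bytes around.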

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-24 15:20       ` Richard W.M. Jones
@ 2018-05-24 16:25         ` Markus Armbruster
  0 siblings, 0 replies; 157+ messages in thread
From: Markus Armbruster @ 2018-05-24 16:25 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: Kevin Wolf, ehabkost, qemu-block, Michael S. Tsirkin, qemu-devel,
	mreitz, stefanha

"Richard W.M. Jones" <rjones@redhat.com> writes:

> On Thu, May 24, 2018 at 05:08:17PM +0200, Kevin Wolf wrote:
>> On 24.05.2018 at 16:56, Michael S. Tsirkin wrote:
>> > On Thu, May 24, 2018 at 12:32:51PM +0100, Richard W.M. Jones wrote:
>> > > There is however a seed of a good idea in the thread:
>> > > 
>> > > > I don't think QEMU needs to use this information automatically,
>> > > > necessarily.  I think the first step is to simply make QEMU save
>> > > > this information in the disk image, and making qemu-img able to
>> > > > read and write this information.
>> > > 
>> > > It would be nice if qcow2 added arbitrary data sections (which would
>> > > always be ignored by qemu) for storing additional data.  This could be
>> > > used to create a compact qcow2 + metadata format to rival OVA for
>> > > management layers to use, and there are various other uses too.
>> > > 
>> > > Rich.
>> > 
>> > I think this part is pretty uncontroversial.
>> > 
>> > But can we add data without changing the version?
>> 
>> Yes. Don't worry about where to store it, we'll solve this. Do worry
>> about what to store.
>
> Ideally from my point of view: named blobs.  More formally, any number
> of (key, value) pairs where the key is a simple string, and the value
> is a binary blob.  The binary blobs might really be XML or YAML or an
> icon or whatever, but qemu would not need to look inside them.
>
> We could then attach metadata (in some to-be-decided format) to qcow2
> files and create a compact rival to OVA without needing to encode any
> knowledge of the metadata into qemu at all.
>
> Another use for this is allowing qcow2 files to contain names, titles,
> descriptions, creation date, OS icons, etc. which could be displayed
> in file managers.

Have we just reinvented resource forks?

SCNR ;)

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-24 11:32 ` Richard W.M. Jones
  2018-05-24 14:56   ` Michael S. Tsirkin
@ 2018-05-28 18:10   ` Max Reitz
  2018-05-28 18:30     ` Richard W.M. Jones
  1 sibling, 1 reply; 157+ messages in thread
From: Max Reitz @ 2018-05-28 18:10 UTC (permalink / raw)
  To: Richard W.M. Jones, Michael S. Tsirkin
  Cc: ehabkost, stefanha, kwolf, qemu-devel, qemu-block

On 2018-05-24 13:32, Richard W.M. Jones wrote:
> I read the whole thread and the fundamental problem is that you're
> mixing layers.  Let qcow2 be a disk image format, and let management
> layers deal with metadata and how to run qemu.
> 
> What's going to happen when you have (eg) an OVA file containing qcow2
> files, and the qcow2 files all have different metadata from each other
> and from the actual metadata in the OVA?  Even the case where you've
> got ‘-hda file1.qcow2 -hdb file2.qcow2’ is not properly defined.  What
> happens if someone uses ‘-M mach1 -hda file.qcow2’ and the machine
> type in the qcow2 file conflicts with the command line?
> 
> BTW we have tooling (libguestfs) which can tell you what devices are
> supported by the guest.  virt-v2v already uses libguestfs to find out
> the full list of devices supported by guests, and uses that to drive
> conversion.  At some point we're going to extend virt-inspector to
> make this a bit easier (patches and other contributions welcome,
> there's a huge list of work to do on libguestfs and not enough
> developers to get through it).
> 
> There is however a seed of a good idea in the thread:
> 
>> I don't think QEMU needs to use this information automatically,
>> necessarily.  I think the first step is to simply make QEMU save
>> this information in the disk image, and making qemu-img able to
>> read and write this information.
> 
> It would be nice if qcow2 added arbitrary data sections (which would
> always be ignored by qemu) for storing additional data.  This could be
> used to create a compact qcow2 + metadata format to rival OVA for
> management layers to use, and there are various other uses too.

As an extremist on the "qcow2 is an image data format and as such should
only store data that is relevant to the virtual disk" front, I don't see
the appeal.

As someone who is just naive and doesn't see the big picture, I don't
see what's wrong with using a tar file that contains the image and
additional data.  I shudder to imagine integrating the qcow2 driver so
deeply into qemu that various parts all over qemu just use it to store
some data.  I can't help but feel reminded of the HMP savevm command
that just randomly chooses some qcow2 image to store the VM state in.

At least you're talking about just storing data.  I imagine opening a
qcow2 image, then reading some VM configuration, and spreading the
gospel through the rest of the qemu process to initialize everything
would be anything but nice.

I personally don't see why it is so bad to split the information between
two files.  Honestly, if you want to put disk images and VM
configuration into a single file, I'd do it backwards: Put the disk
image into the configuration file.

Max


^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-28 18:10   ` Max Reitz
@ 2018-05-28 18:30     ` Richard W.M. Jones
  2018-05-28 18:38       ` Kevin Wolf
  0 siblings, 1 reply; 157+ messages in thread
From: Richard W.M. Jones @ 2018-05-28 18:30 UTC (permalink / raw)
  To: Max Reitz
  Cc: Michael S. Tsirkin, ehabkost, stefanha, kwolf, qemu-devel, qemu-block

On Mon, May 28, 2018 at 08:10:32PM +0200, Max Reitz wrote:
> As someone who is just naive and doesn't see the big picture, I don't
> see what's wrong with using a tar file that contains the image and
> additional data.

FWIW an OVA file is exactly this: an uncompressed tar file containing
disk image(s) and metadata.

Rich.

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-28 18:30     ` Richard W.M. Jones
@ 2018-05-28 18:38       ` Kevin Wolf
  2018-05-28 18:44         ` Max Reitz
  2018-05-28 21:20         ` Richard W.M. Jones
  0 siblings, 2 replies; 157+ messages in thread
From: Kevin Wolf @ 2018-05-28 18:38 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: Max Reitz, Michael S. Tsirkin, ehabkost, stefanha, qemu-devel,
	qemu-block

On 28.05.2018 at 20:30, Richard W.M. Jones wrote:
> On Mon, May 28, 2018 at 08:10:32PM +0200, Max Reitz wrote:
> > As someone who is just naive and doesn't see the big picture, I don't
> > see what's wrong with using a tar file that contains the image and
> > additional data.
> 
> FWIW an OVA file is exactly this: an uncompressed tar file containing
> disk image(s) and metadata.

If we combine VM configuration and the disk image this way, I would
still want to directly use that combined thing without having to extract
its components first.

Just accessing the image file within a tar archive is possible and we
could write a block driver for that (I actually think we should do
this), but it restricts you because certain operations like resizing
aren't really possible in tar. Unfortunately, resizing is a really
common operation for non-raw image formats.

And if I think of a file format that can contain several different
things that can be individually resized etc., I end up with qcow2 in the
simple case or a full file system in the more complex case.

Kevin

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-28 18:38       ` Kevin Wolf
@ 2018-05-28 18:44         ` Max Reitz
  2018-05-28 19:09           ` Kevin Wolf
  2018-05-28 21:20         ` Richard W.M. Jones
  1 sibling, 1 reply; 157+ messages in thread
From: Max Reitz @ 2018-05-28 18:44 UTC (permalink / raw)
  To: Kevin Wolf, Richard W.M. Jones
  Cc: Michael S. Tsirkin, ehabkost, stefanha, qemu-devel, qemu-block

On 2018-05-28 20:38, Kevin Wolf wrote:
> On 28.05.2018 at 20:30, Richard W.M. Jones wrote:
>> On Mon, May 28, 2018 at 08:10:32PM +0200, Max Reitz wrote:
>>> As someone who is just naive and doesn't see the big picture, I don't
>>> see what's wrong with using a tar file that contains the image and
>>> additional data.
>>
>> FWIW an OVA file is exactly this: an uncompressed tar file containing
>> disk image(s) and metadata.
> 
> If we combine VM configuration and the disk image this way, I would
> still want to directly use that combined thing without having to extract
> its components first.
> 
> Just accessing the image file within a tar archive is possible and we
> could write a block driver for that (I actually think we should do
> this), but it restricts you because certain operations like resizing
> aren't really possible in tar. Unfortunately, resizing is a really
> common operation for non-raw image formats.
> 
> And if I think of a file format that can contain several different
> things that can be individually resized etc., I end up with qcow2 in the
> simple case or a full file system in the more complex case.

Well, you end up with VMDK.

Max


^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-28 18:44         ` Max Reitz
@ 2018-05-28 19:09           ` Kevin Wolf
  2018-05-29  9:23             ` Max Reitz
  0 siblings, 1 reply; 157+ messages in thread
From: Kevin Wolf @ 2018-05-28 19:09 UTC (permalink / raw)
  To: Max Reitz
  Cc: Richard W.M. Jones, Michael S. Tsirkin, ehabkost, stefanha,
	qemu-devel, qemu-block

On 28.05.2018 at 20:44, Max Reitz wrote:
> On 2018-05-28 20:38, Kevin Wolf wrote:
> > On 28.05.2018 at 20:30, Richard W.M. Jones wrote:
> >> On Mon, May 28, 2018 at 08:10:32PM +0200, Max Reitz wrote:
> >>> As someone who is just naive and doesn't see the big picture, I don't
> >>> see what's wrong with using a tar file that contains the image and
> >>> additional data.
> >>
> >> FWIW an OVA file is exactly this: an uncompressed tar file containing
> >> disk image(s) and metadata.
> > 
> > If we combine VM configuration and the disk image this way, I would
> > still want to directly use that combined thing without having to extract
> > its components first.
> > 
> > Just accessing the image file within a tar archive is possible and we
> > could write a block driver for that (I actually think we should do
> > this), but it restricts you because certain operations like resizing
> > aren't really possible in tar. Unfortunately, resizing is a really
> > common operation for non-raw image formats.
> > 
> > And if I think of a file format that can contain several different
> > things that can be individually resized etc., I end up with qcow2 in the
> > simple case or a full file system in the more complex case.
> 
> Well, you end up with VMDK.

I don't think VMDK can save several different objects? It can have some
metadata in the descriptor, and it can spread the contents of a single
object across multiple files (with extents), but I don't think it has
something comparable to e.g. qcow2 snapshots, which are separate objects
with an individual size that can dynamically change.

Kevin

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-28 18:38       ` Kevin Wolf
  2018-05-28 18:44         ` Max Reitz
@ 2018-05-28 21:20         ` Richard W.M. Jones
  2018-05-28 21:25           ` Richard W.M. Jones
  1 sibling, 1 reply; 157+ messages in thread
From: Richard W.M. Jones @ 2018-05-28 21:20 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Max Reitz, Michael S. Tsirkin, ehabkost, stefanha, qemu-devel,
	qemu-block

On Mon, May 28, 2018 at 08:38:33PM +0200, Kevin Wolf wrote:
> Just accessing the image file within a tar archive is possible and we
> could write a block driver for that (I actually think we should do
> this), but it restricts you because certain operations like resizing
> aren't really possible in tar. Unfortunately, resizing is a really
> common operation for non-raw image formats.

We do this already in virt-v2v (using file.offset and file.size
parameters in the raw driver).
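
As a rough sketch of that approach (not virt-v2v's actual code): find
where a member's data starts inside an uncompressed tar, then hand the
offset and size to the raw driver.  The helper names are hypothetical,
and the "offset"/"size" keys in the JSON reflect my assumption about
how the raw driver's options are spelled, so check them against your
qemu version.

```python
import json
import tarfile

def raw_slice_of_member(tar_path, member_name):
    """Return (offset, size) in bytes of a member's data inside the tar."""
    # "r:" opens the archive without transparent decompression, so the
    # offsets refer to positions in the file on disk.
    with tarfile.open(tar_path, "r:") as tar:
        info = tar.getmember(member_name)
        return info.offset_data, info.size

def qemu_json_filename(tar_path, offset, size):
    """Build a json: pseudo-filename exposing only that slice (assumed keys)."""
    spec = {
        "driver": "raw",
        "offset": offset,
        "size": size,
        "file": {"driver": "file", "filename": tar_path},
    }
    return "json:" + json.dumps(spec)
```

This only works because the tar is uncompressed and stores each member
contiguously, which is also why growing the image in place is not
possible.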

For virt-v2v we only need to read the source so resizing isn't an
issue.  For most of the cases we're talking about the downloaded image
would also be a template / base image, so I suppose only reading would
be required too.

I also wrote an nbdkit tar file driver (supports writes, but not
resizing).
https://manpages.debian.org/testing/nbdkit-plugin-perl/nbdkit-tar-plugin.1.en.html

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-28 21:20         ` Richard W.M. Jones
@ 2018-05-28 21:25           ` Richard W.M. Jones
  2018-05-29  6:44             ` Kevin Wolf
  0 siblings, 1 reply; 157+ messages in thread
From: Richard W.M. Jones @ 2018-05-28 21:25 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: ehabkost, qemu-block, Michael S. Tsirkin, qemu-devel, Max Reitz,
	stefanha

On Mon, May 28, 2018 at 10:20:54PM +0100, Richard W.M. Jones wrote:
> On Mon, May 28, 2018 at 08:38:33PM +0200, Kevin Wolf wrote:
> > Just accessing the image file within a tar archive is possible and we
> > could write a block driver for that (I actually think we should do
> > this), but it restricts you because certain operations like resizing
> > aren't really possible in tar. Unfortunately, resizing is a really
> > common operation for non-raw image formats.
> 
> We do this already in virt-v2v (using file.offset and file.size
> parameters in the raw driver).
> 
> For virt-v2v we only need to read the source so resizing isn't an
> issue.  For most of the cases we're talking about the downloaded image
> would also be a template / base image, so I suppose only reading would
> be required too.
> 
> I also wrote an nbdkit tar file driver (supports writes, but not
> resizing).
> https://manpages.debian.org/testing/nbdkit-plugin-perl/nbdkit-tar-plugin.1.en.html

I should add that the other thorny issue with OVA files is that the
metadata contains a checksum (SHA1 or SHA256) of the disk images.  If
you modify the disk images in-place in the tar file then you need to
recalculate those.
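
For illustration, recomputing a digest over a region of a file (such
as a disk image embedded in a tar) could look like the sketch below.
The helper names are hypothetical, and the "ALGO(name)= hex" line
format is my recollection of what OVF manifest files look like, so
treat it as an assumption and check it against a real .mf file.

```python
import hashlib

def region_digest(path, offset=0, length=None, algorithm="sha256",
                  chunk_size=1 << 20):
    """Hash `length` bytes of `path` starting at `offset` (to EOF if None)."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        f.seek(offset)
        remaining = length
        while remaining is None or remaining > 0:
            want = chunk_size if remaining is None else min(chunk_size, remaining)
            chunk = f.read(want)
            if not chunk:
                break
            h.update(chunk)
            if remaining is not None:
                remaining -= len(chunk)
    return h.hexdigest()

def manifest_line(name, path, offset=0, length=None, algorithm="sha256"):
    """Format one manifest entry (assumed format: 'SHA256(name)= <hex>')."""
    digest = region_digest(path, offset, length, algorithm)
    return f"{algorithm.upper()}({name})= {digest}"
```

Note that the whole region has to be re-read after any in-place write,
which for a large disk image is not cheap.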

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-28 21:25           ` Richard W.M. Jones
@ 2018-05-29  6:44             ` Kevin Wolf
  2018-05-29 10:14               ` Max Reitz
  0 siblings, 1 reply; 157+ messages in thread
From: Kevin Wolf @ 2018-05-29  6:44 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: ehabkost, qemu-block, Michael S. Tsirkin, qemu-devel, Max Reitz,
	stefanha

On 28.05.2018 at 23:25, Richard W.M. Jones wrote:
> On Mon, May 28, 2018 at 10:20:54PM +0100, Richard W.M. Jones wrote:
> > On Mon, May 28, 2018 at 08:38:33PM +0200, Kevin Wolf wrote:
> > > Just accessing the image file within a tar archive is possible and we
> > > could write a block driver for that (I actually think we should do
> > > this), but it restricts you because certain operations like resizing
> > > aren't really possible in tar. Unfortunately, resizing is a really
> > > common operation for non-raw image formats.
> > 
> > We do this already in virt-v2v (using file.offset and file.size
> > parameters in the raw driver).
> > 
> > For virt-v2v we only need to read the source so resizing isn't an
> > issue.  For most of the cases we're talking about the downloaded image
> > would also be a template / base image, so I suppose only reading would
> > be required too.
> > 
> > I also wrote an nbdkit tar file driver (supports writes, but not
> > resizing).
> > https://manpages.debian.org/testing/nbdkit-plugin-perl/nbdkit-tar-plugin.1.en.html
> 
> I should add the other thorny issue with OVA files is that the
> metadata contains a checksum (SHA1 or SHA256) of the disk images.  If
> you modify the disk images in-place in the tar file then you need to
> recalculate those.

All of this means that OVA isn't really well suited to be used as a
native format for VM configuration + images. It's just for sharing
read-only images that are converted into another native format before
they are used.

Which is probably fair for the use case it was made for, but means that
we need something else to solve our problem.

Kevin

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-28 19:09           ` Kevin Wolf
@ 2018-05-29  9:23             ` Max Reitz
  2018-05-29 10:14               ` Kevin Wolf
  0 siblings, 1 reply; 157+ messages in thread
From: Max Reitz @ 2018-05-29  9:23 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Richard W.M. Jones, Michael S. Tsirkin, ehabkost, stefanha,
	qemu-devel, qemu-block

On 2018-05-28 21:09, Kevin Wolf wrote:
> On 28.05.2018 at 20:44, Max Reitz wrote:
>> On 2018-05-28 20:38, Kevin Wolf wrote:
>>> On 28.05.2018 at 20:30, Richard W.M. Jones wrote:
>>>> On Mon, May 28, 2018 at 08:10:32PM +0200, Max Reitz wrote:
>>>>> As someone who is just naive and doesn't see the big picture, I don't
>>>>> see what's wrong with using a tar file that contains the image and
>>>>> additional data.
>>>>
>>>> FWIW an OVA file is exactly this: an uncompressed tar file containing
>>>> disk image(s) and metadata.
>>>
>>> If we combine VM configuration and the disk image this way, I would
>>> still want to directly use that combined thing without having to extract
>>> its components first.
>>>
>>> Just accessing the image file within a tar archive is possible and we
>>> could write a block driver for that (I actually think we should do
>>> this), but it restricts you because certain operations like resizing
>>> aren't really possible in tar. Unfortunately, resizing is a really
>>> common operation for non-raw image formats.
>>>
>>> And if I think of a file format that can contain several different
>>> things that can be individually resized etc., I end up with qcow2 in the
>>> simple case or a full file system in the more complex case.
>>
>> Well, you end up with VMDK.
> 
> I don't think VMDK can save several different objects? It can have some
> metadata in the descriptor, and it can spread the contents of a single
> object across multiple files (with extents), but I don't think it has
> something comparable to e.g. qcow2 snapshots, which are separate objects
> with an individual size that can dynamically change.

Right, I tried to be funny and was over-simplifying in the process.

What I meant is: You end up with an image format that is spread across a
filesystem, like VMDK is (usually).  Then you have some metadata
descriptor file that describes the rest and multiple data object files.

(For completeness's sake: And you can use an external or an internal
filesystem, that is, use multiple files (like VMDK) or have an internal
filesystem (like tar, except tar doesn't allow fragmentation).)

Max


^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-29  6:44             ` Kevin Wolf
@ 2018-05-29 10:14               ` Max Reitz
  2018-06-05  9:21                 ` Dr. David Alan Gilbert
  2018-06-06 10:32                 ` [Qemu-devel] " Michal Suchánek
  0 siblings, 2 replies; 157+ messages in thread
From: Max Reitz @ 2018-05-29 10:14 UTC (permalink / raw)
  To: Kevin Wolf, Richard W.M. Jones
  Cc: ehabkost, qemu-block, Michael S. Tsirkin, qemu-devel, stefanha

On 2018-05-29 08:44, Kevin Wolf wrote:
> On 28.05.2018 at 23:25, Richard W.M. Jones wrote:
>> On Mon, May 28, 2018 at 10:20:54PM +0100, Richard W.M. Jones wrote:
>>> On Mon, May 28, 2018 at 08:38:33PM +0200, Kevin Wolf wrote:
>>>> Just accessing the image file within a tar archive is possible and we
>>>> could write a block driver for that (I actually think we should do
>>>> this), but it restricts you because certain operations like resizing
>>>> aren't really possible in tar. Unfortunately, resizing is a really
>>>> common operation for non-raw image formats.
>>>
>>> We do this already in virt-v2v (using file.offset and file.size
>>> parameters in the raw driver).
>>>
>>> For virt-v2v we only need to read the source so resizing isn't an
>>> issue.  For most of the cases we're talking about the downloaded image
>>> would also be a template / base image, so I suppose only reading would
>>> be required too.
>>>
>>> I also wrote an nbdkit tar file driver (supports writes, but not
>>> resizing).
>>> https://manpages.debian.org/testing/nbdkit-plugin-perl/nbdkit-tar-plugin.1.en.html
>>
>> I should add the other thorny issue with OVA files is that the
>> metadata contains a checksum (SHA1 or SHA256) of the disk images.  If
>> you modify the disk images in-place in the tar file then you need to
>> recalculate those.
> 
> All of this means that OVA isn't really well suited to be used as a
> native format for VM configuration + images. It's just for sharing
> read-only images that are converted into another native format before
> they are used.
> 
> Which is probably fair for the use case it was made for, but means that
> we need something else to solve our problem.

Maybe we should first narrow down our problem.  Maybe you have done that
already, but I'm quite in the dark still.

The original problem was that you need to supply a machine type to qemu,
and that multiple common architectures now have multiple machine types
and not necessarily all work with a single image.  So far so good, but I
have two issues here already:

(1) How is qemu supposed to interpret that information?  If it's stored
in the image file, I don't see a nice way of retrieving it before the
machine is initialized, at least not with qemu's current architecture.
Once we support configuring qemu solely through QMP, sure, you can do a
blockdev-add and then build the machine accordingly.  But that is not
here today, and I'm not sure this is a good idea either, because that
would mean automagic defaults for the machine-building QMP commands
derived from the blockdev-add earlier, which should get a plain "No".
Also, having to use QMP to build your machine wouldn't make anything
easier; at least not easier than just supplying a configuration file
along with the image.

(Building the magic into -blockdev might be less horrible, but such
magic (adding block devices influences machine defaults) to me still
doesn't seem worth not having to supply a config file along with the
disk image.)

(2) Again, I personally just really don't like saving such information
in a disk image.  One actual argument I can bring up for that distaste
is this: Suppose you have multiple images attached to your VM.  Now the
VM wants to store the machine type.  Where does it go?  Into all of
them?  But some of those images may only contain data and might be
intended to be shared between multiple VMs.  So those shouldn't receive
the mark.  Only disks with binaries should receive it.
But what if those binaries are just cross-compiled binaries for some
other VM?  Oh no, so not even binaries are a sure indicator...  So I
have no idea where the information is supposed to be stored.  In any
case, "the first image" just gets an outright "no" from me, and "all
images" gets an "I don't think this is a good idea".

Loading is fun, too.  OK, so you attach multiple disk images to a VM.
Oops, they have varying machine type information...  Now what?  Use the
information from the first one?  Definitely no.  Just ignore all of the
information in such a case and have the user supply the machine type
again?  Possible, but it seems weird to me that qemu would usually guess
the machine type, but once you attach some random other image to it, it
suddenly fails to do that.  But maybe it's just me who thinks this is weird.
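
To spell out one of the policies above (an explicit command-line
choice always wins; otherwise a hint is used only if every image that
carries one agrees; otherwise the user has to decide), here is a small
sketch.  The function and its inputs are hypothetical and nothing like
this exists in qemu today.

```python
def resolve_machine_type(image_hints, cli_machine=None):
    """Pick a machine type from per-image hints, or refuse to guess.

    image_hints: one entry per attached image, either a machine-type
    string or None for images that carry no hint (e.g. data-only disks).
    cli_machine: the explicit user choice, if any (always wins).
    """
    if cli_machine is not None:
        return cli_machine
    present = {hint for hint in image_hints if hint is not None}
    if len(present) == 1:
        return next(iter(present))
    if not present:
        raise ValueError("no machine type given and no image carries a hint")
    raise ValueError(
        "images disagree on machine type: " + ", ".join(sorted(present))
    )
```

Even this simple version shows the weirdness: attaching one more image
with a different hint turns a working invocation into an error.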


OK, so let's go a step further.  We have stored the machine type
information in order to not have to supply a config file with the qcow2
image -- because if we did, it could just contain the machine type and
that would be it.

So to me it follows naturally that just storing the machine type doesn't
make much sense if we cannot also store more VM configuration in a qcow2
file, because I don't see why you should be able to ship an image
without a config file only if all you need to supply is a machine type.
Often, you also need to supply how much memory the VM needs (which
depends on the OS on the image) or what storage controller to use (does
the OS have virtio drivers? (to be fair, it usually does, because you're
supplying a VM image in the first place)).

So I think if we decide to store the machine type, that is kind of a
slippery slope and then there are good arguments for storing even more
configuration options in the file, too.  But I really, really don't like
that.

For one thing, I suspect it will get really ugly implementation-wise.
Getting the machine type out of a disk image and actually interpreting
it automatically is bad enough, but getting possibly everything out of
it?  It's not going to be any better.

For another, how do we store the data?  key-value seems wrong if we want
to store everything.  JSON might be fine.  But eventually we just want
basically a qemu configuration file in there, I would think (which may
support JSON at some point?).   So basically we would store the data as
a binary blob and let the rest of qemu do its thing with it.  But then
please tell me why I fought so valiantly against storing random bitmaps
in qcow2 files.  I hate the idea of making qcow2 a random archive
format.  We have tar for that.


Unless I have got something terribly wrong (which is indeed a
possibility!), to me this proposal basically means (1) turning qcow2
into a VM description format for qemu, and (2) turning it into an
archive format along the way.

As explained, I don't like (2), but it would be necessary for (1), so yeah.

As for (1), just why?  I mean, if we want to do that, fine, but on one
hand that is absolutely not what qcow2 is right now (it is a VM-agnostic
disk image format), and on the other I simply don't see any good reason
to.  We have config files for that, they just lack the disk data.  I
don't see the difficulty in having to deal with two files.  And even if
it were too difficult for some people in some cases (to me this is
really hypothetically speaking), I'd rather integrate qcow2 (or any disk
image) into some other descriptive format than adding descriptive
capabilities to qcow2.[1]


tl;dr: I really don't get why it's so hard to supply a config file along
with a qcow2 image.  Is it so hard for people to realize that a VM does
not only consist of a disk?

Max


[1] It isn't like I think integrating disk images into a VM description
format (like config files) is worth doing.  I think having to deal with
multiple files is not an issue whatsoever (but again, this may be
because I haven't understood the problem).  I just think that (1) it
doesn't concern me as someone working on the block layer, because it
wouldn't change anything about qcow2 or the block layer in general, so
go ahead if you want to do anything in an area that doesn't concern me,
and (2) it just seems intuitively more natural to me and I think it
would be much nicer to implement.  So if someone has time to spare to
implement this, I wouldn't oppose it.  But I'm not that someone.


^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-29  9:23             ` Max Reitz
@ 2018-05-29 10:14               ` Kevin Wolf
  2018-05-29 13:16                 ` Eduardo Habkost
  0 siblings, 1 reply; 157+ messages in thread
From: Kevin Wolf @ 2018-05-29 10:14 UTC (permalink / raw)
  To: Max Reitz
  Cc: Richard W.M. Jones, Michael S. Tsirkin, ehabkost, stefanha,
	qemu-devel, qemu-block

On 29.05.2018 at 11:23, Max Reitz wrote:
> On 2018-05-28 21:09, Kevin Wolf wrote:
> > On 28.05.2018 at 20:44, Max Reitz wrote:
> >> On 2018-05-28 20:38, Kevin Wolf wrote:
> >>> On 28.05.2018 at 20:30, Richard W.M. Jones wrote:
> >>>> On Mon, May 28, 2018 at 08:10:32PM +0200, Max Reitz wrote:
> >>>>> As someone who is just naive and doesn't see the big picture, I don't
> >>>>> see what's wrong with using a tar file that contains the image and
> >>>>> additional data.
> >>>>
> >>>> FWIW an OVA file is exactly this: an uncompressed tar file containing
> >>>> disk image(s) and metadata.
> >>>
> >>> If we combine VM configuration and the disk image this way, I would
> >>> still want to directly use that combined thing without having to extract
> >>> its components first.
> >>>
> >>> Just accessing the image file within a tar archive is possible and we
> >>> could write a block driver for that (I actually think we should do
> >>> this), but it restricts you because certain operations like resizing
> >>> aren't really possible in tar. Unfortunately, resizing is a really
> >>> common operation for non-raw image formats.
> >>>
> >>> And if I think of a file format that can contain several different
> >>> things that can be individually resized etc., I end up with qcow2 in the
> >>> simple case or a full file system in the more complex case.
> >>
> >> Well, you end up with VMDK.
> > 
> > I don't think VMDK can save several different objects? It can have some
> > metadata in the descriptor, and it can spread the contents of a single
> > object across multiple files (with extents), but I don't think it has
> > something comparable to e.g. qcow2 snapshots, which are separate objects
> > with an individual size that can dynamically change.
> 
> Right, I tried to be funny and was over-simplifying in the process.
> 
> What I meant is: You end up with an image format that is spread on a
> filesystem, like VMDK is (usually).  Then you have some metadata
> descriptor file that describes the rest and multiple data object files.
> 
> (For completeness's sake: And you can use an external or an internal
> filesystem, that is, use multiple files (like VMDK) or have an internal
> filesystem (like tar, except tar doesn't allow fragmentation).)

Let's call the libvirt XML the image file and the qcow2 files its
data object files and we're done?

I'm afraid spreading things across multiple files doesn't meet the
requirements for the problem at hand, though...

Kevin

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-29 10:14               ` Kevin Wolf
@ 2018-05-29 13:16                 ` Eduardo Habkost
  0 siblings, 0 replies; 157+ messages in thread
From: Eduardo Habkost @ 2018-05-29 13:16 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Max Reitz, Richard W.M. Jones, Michael S. Tsirkin, stefanha,
	qemu-devel, qemu-block

On Tue, May 29, 2018 at 12:14:28PM +0200, Kevin Wolf wrote:
> On 29.05.2018 at 11:23, Max Reitz wrote:
> > On 2018-05-28 21:09, Kevin Wolf wrote:
> > > On 28.05.2018 at 20:44, Max Reitz wrote:
> > >> On 2018-05-28 20:38, Kevin Wolf wrote:
> > >>> On 28.05.2018 at 20:30, Richard W.M. Jones wrote:
> > >>>> On Mon, May 28, 2018 at 08:10:32PM +0200, Max Reitz wrote:
> > >>>>> As someone who is just naive and doesn't see the big picture, I don't
> > >>>>> see what's wrong with using a tar file that contains the image and
> > >>>>> additional data.
> > >>>>
> > >>>> FWIW an OVA file is exactly this: an uncompressed tar file containing
> > >>>> disk image(s) and metadata.
> > >>>
> > >>> If we combine VM configuration and the disk image this way, I would
> > >>> still want to directly use that combined thing without having to extract
> > >>> its components first.
> > >>>
> > >>> Just accessing the image file within a tar archive is possible and we
> > >>> could write a block driver for that (I actually think we should do
> > >>> this), but it restricts you because certain operations like resizing
> > >>> aren't really possible in tar. Unfortunately, resizing is a really
> > >>> common operation for non-raw image formats.
> > >>>
> > >>> And if I think of a file format that can contain several different
> > >>> things that can be individually resized etc., I end up with qcow2 in the
> > >>> simple case or a full file system in the more complex case.
> > >>
> > >> Well, you end up with VMDK.
> > > 
> > > I don't think VMDK can save several different objects? It can have some
> > > metadata in the descriptor, and it can spread the contents of a single
> > > object across multiple files (with extents), but I don't think it has
> > > something comparable to e.g. qcow2 snapshots, which are separate objects
> > > with an individual size that can dynamically change.
> > 
> > Right, I tried to be funny and was over-simplifying in the process.
> > 
> > What I meant is: You end up with an image format that is spread on a
> > filesystem, like VMDK is (usually).  Then you have some metadata
> > descriptor file that describes the rest and multiple data object files.
> > 
> > (For completeness's sake: And you can use an external or an internal
> > filesystem, that is, use multiple files (like VMDK) or have an internal
> > filesystem (like tar, except tar doesn't allow fragmentation).)
> 
> Let's call the libvirt XML the image file and the qcow2 files its
> data object files and we're done?

libvirt XML doesn't seem to be a solution for the problem, but a
separate VM specification file could work anyway.


> 
> I'm afraid spreading things across multiple files doesn't meet the
> requirements for the problem at hand, though...

Yeah, I wanted to make this an extension of qcow2, so people
don't have Yet Another file format to choose from.  We could also
let libvirt automatically write hints to disk images so things
would Just Work if you copied a disk image from an existing VM.

But I'm not the one working on management layers or guest image
tools, so I will just trust Richard's and Daniel's opinions on
this.

-- 
Eduardo

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-24 11:17   ` Richard W.M. Jones
@ 2018-05-29 14:03     ` Dr. David Alan Gilbert
  2018-05-29 14:14       ` Eduardo Habkost
  0 siblings, 1 reply; 157+ messages in thread
From: Dr. David Alan Gilbert @ 2018-05-29 14:03 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: Daniel P. Berrangé,
	kwolf, ehabkost, qemu-block, Michael S. Tsirkin, qemu-devel,
	mreitz, stefanha

* Richard W.M. Jones (rjones@redhat.com) wrote:
> On Fri, May 18, 2018 at 06:09:56PM +0100, Daniel P. Berrangé wrote:
> > The closest to a cross-hypervisor standard is OVF which can store
> > metadata about required hardware for a VM. I'm pretty sure it does
> > not have the concept of machine types, but maybe it has a way for
> > people to define metadata extensions. Since it is just XML at the
> > end of the day, even if there was nothing official in OVF, it would
> > be possible to just define a custom XML namespace and declare a
> > schema for that to follow.
> 
> I have a great deal of experience with the OVF "standard".
> TL;DR: DO NOT USE IT.

In addition to the detail below, from reading DMTF's OVF spec (DSP0243 v
2.1.1) I see absolutely nothing specifying hardware type.
Sure it can specify size of storage, number of ether cards, MAC
addresses for them etc - but I don't see anywhere to specify the type of
emulated system.

Dave

> Long answer copied from a rant I wrote on an internal mailing list a
> while back:
> 
>   Don't make the mistake of confusing OVF for a format.  It's not,
>   there are at least 4 non-interoperable OVF "format"s around:
> 
>    - 2 x oVirt OVF
>    - VMware's OVF used in exported OVA files
>    - VirtualBox's OVF used in their exported OVA files
> 
>   These are all different and do not interoperate *at all*.  So before
>   you decide "let's parse OVF", be precise about which format(s) you
>   actually want to parse.
> 
>   Also OVF is a hideous format.  Many fields are obviously internal data
>   dumps of VMware structures, complete with internal VMware IDs instead
>   of descriptive names.  Where there are descriptive names, they use
>   English strings instead of keywords, like: <rasd:AllocationUnits>MetaBytes</>
>   or my particular WTF favourite, a meaningful field which references
>   English (only) Wikipedia:
> 
>     <Disk ovf:format="http://en.wikipedia.org/wiki/Byte">
> 
>   File references are split over two places, and there are other
>   examples where data is needlessly duplicated or it's unclear what data
>   is supposed to be.
> 
>   Of course VMware Inc. are not stupid enough to use this format for
>   their own purposes.  They use a completely different format (VMX)
>   which is a lot like YAML.
> 
> Rich.
> 
> -- 
> Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
> Read my programming and virtualization blog: http://rwmj.wordpress.com
> Fedora Windows cross-compiler. Compile Windows programs, test, and
> build Windows installers. Over 100 libraries supported.
> http://fedoraproject.org/wiki/MinGW
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-23 16:35                 ` Markus Armbruster
@ 2018-05-29 14:06                   ` Dr. David Alan Gilbert
  2018-06-05 21:58                   ` Michal Suchánek
  1 sibling, 0 replies; 157+ messages in thread
From: Dr. David Alan Gilbert @ 2018-05-29 14:06 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Eduardo Habkost, kwolf, qemu-block, Michael S. Tsirkin,
	qemu-devel, mreitz, stefanha

* Markus Armbruster (armbru@redhat.com) wrote:

> Seriously, though: are disk images a sane way to package software?

They're not actually terrible.
You can just start a VM with them straight away, no unpacking required,
no installation time.  Especially when used with something like
-snapshot.
It's also easy for the person packaging them.

Dave
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-29 14:03     ` Dr. David Alan Gilbert
@ 2018-05-29 14:14       ` Eduardo Habkost
  2018-05-29 14:51         ` Richard W.M. Jones
  2018-05-29 15:31         ` Dr. David Alan Gilbert
  0 siblings, 2 replies; 157+ messages in thread
From: Eduardo Habkost @ 2018-05-29 14:14 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Richard W.M. Jones, Daniel P. Berrangé,
	kwolf, qemu-block, Michael S. Tsirkin, qemu-devel, mreitz,
	stefanha

On Tue, May 29, 2018 at 03:03:16PM +0100, Dr. David Alan Gilbert wrote:
> * Richard W.M. Jones (rjones@redhat.com) wrote:
> > On Fri, May 18, 2018 at 06:09:56PM +0100, Daniel P. Berrangé wrote:
> > > The closest to a cross-hypervisor standard is OVF which can store
> > > metadata about required hardware for a VM. I'm pretty sure it does
> > > not have the concept of machine types, but maybe it has a way for
> > > people to define metadata extensions. Since it is just XML at the
> > > end of the day, even if there was nothing official in OVF, it would
> > > be possible to just define a custom XML namespace and declare a
> > > schema for that to follow.
> > 
> > I have a great deal of experience with the OVF "standard".
> > TL;DR: DO NOT USE IT.
> 
> In addition to the detail below, from reading DMTF's OVF spec (DSP0243 v
> 2.1.1) I see absolutely nothing specifying hardware type.
> Sure it can specify size of storage, number of ether cards, MAC
> addresses for them etc - but I don't see anywhere to specify the type of
> emulated system.

Maybe the VirtualHardwareSection/System/vssd:VirtualSystemType
element could be used for that.  (DSP0243 v2.1.1, line 650).

But based on Richard's feedback, I think we shouldn't even try to
use it.

-- 
Eduardo

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-29 14:14       ` Eduardo Habkost
@ 2018-05-29 14:51         ` Richard W.M. Jones
  2018-05-29 15:31         ` Dr. David Alan Gilbert
  1 sibling, 0 replies; 157+ messages in thread
From: Richard W.M. Jones @ 2018-05-29 14:51 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Dr. David Alan Gilbert, Daniel P. Berrangé,
	kwolf, qemu-block, Michael S. Tsirkin, qemu-devel, mreitz,
	stefanha

On Tue, May 29, 2018 at 11:14:11AM -0300, Eduardo Habkost wrote:
> On Tue, May 29, 2018 at 03:03:16PM +0100, Dr. David Alan Gilbert wrote:
> > * Richard W.M. Jones (rjones@redhat.com) wrote:
> > > On Fri, May 18, 2018 at 06:09:56PM +0100, Daniel P. Berrangé wrote:
> > > > The closest to a cross-hypervisor standard is OVF which can store
> > > > metadata about required hardware for a VM. I'm pretty sure it does
> > > > not have the concept of machine types, but maybe it has a way for
> > > > people to define metadata extensions. Since it is just XML at the
> > > > end of the day, even if there was nothing official in OVF, it would
> > > > be possible to just define a custom XML namespace and declare a
> > > > schema for that to follow.
> > > 
> > > I have a great deal of experience with the OVF "standard".
> > > TL;DR: DO NOT USE IT.
> > 
> > In addition to the detail below, from reading DMTF's OVF spec (DSP0243 v
> > 2.1.1) I see absolutely nothing specifying hardware type.
> > Sure it can specify size of storage, number of ether cards, MAC
> > addresses for them etc - but I don't see anywhere to specify the type of
> > emulated system.
> 
> Maybe the VirtualHardwareSection/System/vssd:VirtualSystemType
> element could be used for that.  (DSP0243 v2.1.1, line 650).
> 
> But based on Richard's feedback, I think we shouldn't even try to
> use it.

Yes, save yourself time and worry by avoiding OVF altogether.

Note that in any case you need something quite different from any
existing metadata format.  Most guests will support a variety of
driver models (eg. pc or q35, sata or virtio-blk or virtio-scsi, ...).

You need to express what device drivers are installed in the guest,
(separately for boot and running) and then the management layer needs
to match the devices the hypervisor can emulate with the required
devices, select the best performing ones in each class, and present
those to the guest.

As far as I know, there is no existing metadata format which expresses
this.
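
To make the matching step concrete, here is a rough Python sketch of
what a management layer might do.  Everything in it is an assumption
for illustration: the class names, the preference ranking and the
function name choose_devices are made up, and no existing metadata
format or tool works this way as far as I know.

```python
# Hypothetical sketch; the ranking and names below are illustrative only.

# Device models per class, best-performing first (assumed ranking).
PREFERENCE = {
    "machine": ["q35", "pc"],
    "block": ["virtio-blk", "virtio-scsi", "sata"],
}


def choose_devices(guest_drivers, hypervisor_devices):
    """Pick, per device class, the best model that both sides support.

    Both arguments map a device class to a set of model names.
    Raises ValueError if a class required by the guest has no match.
    """
    chosen = {}
    for dev_class, supported_by_guest in guest_drivers.items():
        available = hypervisor_devices.get(dev_class, set())
        common = supported_by_guest & available
        if not common:
            raise ValueError(f"no usable {dev_class} device")
        ranking = PREFERENCE.get(dev_class, [])
        # Ranked models win; unranked ones sort last, alphabetically.
        chosen[dev_class] = min(
            common,
            key=lambda m: (ranking.index(m) if m in ranking else len(ranking), m),
        )
    return chosen
```

Since the drivers available at boot and at runtime can differ, the
same function would be run once per driver set.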

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-29 14:14       ` Eduardo Habkost
  2018-05-29 14:51         ` Richard W.M. Jones
@ 2018-05-29 15:31         ` Dr. David Alan Gilbert
  1 sibling, 0 replies; 157+ messages in thread
From: Dr. David Alan Gilbert @ 2018-05-29 15:31 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Richard W.M. Jones, Daniel P. Berrangé,
	kwolf, qemu-block, Michael S. Tsirkin, qemu-devel, mreitz,
	stefanha

* Eduardo Habkost (ehabkost@redhat.com) wrote:
> On Tue, May 29, 2018 at 03:03:16PM +0100, Dr. David Alan Gilbert wrote:
> > * Richard W.M. Jones (rjones@redhat.com) wrote:
> > > On Fri, May 18, 2018 at 06:09:56PM +0100, Daniel P. Berrangé wrote:
> > > > The closest to a cross-hypervisor standard is OVF which can store
> > > > metadata about required hardware for a VM. I'm pretty sure it does
> > > > not have the concept of machine types, but maybe it has a way for
> > > > people to define metadata extensions. Since it is just XML at the
> > > > end of the day, even if there was nothing official in OVF, it would
> > > > be possible to just define a custom XML namespace and declare a
> > > > schema for that to follow.
> > > 
> > > I have a great deal of experience with the OVF "standard".
> > > TL;DR: DO NOT USE IT.
> > 
> > In addition to the detail below, from reading DMTF's OVF spec (DSP0243 v
> > 2.1.1) I see absolutely nothing specifying hardware type.
> > Sure it can specify size of storage, number of ether cards, MAC
> > addresses for them etc - but I don't see anywhere to specify the type of
> > emulated system.
> 
> Maybe the VirtualHardwareSection/System/vssd:VirtualSystemType
> element could be used for that.  (DSP0243 v2.1.1, line 650).

Ah yes, you're right; they hadn't bothered putting that in any of the
examples.  A quick search suggests VMware use that as 'vmx-10' or
'vmx-12' as a 'hardware family'.

> But based on Richard's feedback, I think we shouldn't even try to
> use it.

Right; although if we have a key/value system, and the key/value
structures we use happen to match up with OVF where that makes
sense, then I guess it would make conversions easy.

Dave

> -- 
> Eduardo
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-29 10:14               ` Max Reitz
@ 2018-06-05  9:21                 ` Dr. David Alan Gilbert
  2018-06-05 19:03                   ` Eduardo Habkost
  2018-06-06 11:02                   ` Max Reitz
  2018-06-06 10:32                 ` [Qemu-devel] " Michal Suchánek
  1 sibling, 2 replies; 157+ messages in thread
From: Dr. David Alan Gilbert @ 2018-06-05  9:21 UTC (permalink / raw)
  To: Max Reitz
  Cc: Kevin Wolf, Richard W.M. Jones, qemu-block, Michael S. Tsirkin,
	qemu-devel, stefanha

<reawakening a fizzled out thread>

This seems to have fizzled out because of a lack of a concrete proposal;
so here is one based on a reply to Max's post:

* Max Reitz (mreitz@redhat.com) wrote:

<snip>

> The original problem was that you need to supply a machine type to qemu,
> and that multiple common architectures now have multiple machine types
> and not necessarily all work with a single image.  So far so good, but I
> have two issues here already:
> 
> (1) How is qemu supposed to interpret that information?  If it's stored
> in the image file, I don't see a nice way of retrieving it before the
> machine is initialized, at least not with qemu's current architecture.

<snip>

> (2) Again, I personally just really don't like saving such information
> in a disk image.  One actual argument I can bring up for that distaste
> is this: Suppose, you have multiple images attached to your VM.  Now the
> VM wants to store the machine type.  Where does it go?  Into all of
> them?

<snip>

> So I think if we decide to store the machine type, that is kind of a
> slippery slope and then there are good arguments for storing even more
> configuration options in the file, too.  But I really, really don't like
> that.

<snip>

> For another, how do we store the data?  key-value seems wrong if we want
> to store everything.  JSON might be fine.  But eventually we just want
> basically a qemu configuration file in there, I would think (which may
> support JSON at some point?).   So basically we would store the data as
> a binary blob and let the rest of qemu do its thing with it.  But then
> please tell me why I fought so valiantly against storing random bitmaps
> in qcow2 files.  I hate the idea of making qcow2 a random archive
> format.  We have tar for that.

<snip>

> tl;dr: I really don't get why it's so hard to supply a config file along
> with a qcow2 image.  Is it so hard for people to realize that a VM does
> not only consist of a disk?

Yes! Because in many cases that's all it needs, and it's ready to run
with no unpacking.

I think we should have:

--------------------------------------------------------------
Layer 0:
   QCOW provides a way to store a single string of arbitrary (but
limited?) length.
   QCOW provides a way to replace the string by a new string.
   The original or the new string will be stored after that;
   never some mix.
   Where a file 'b' has a backing file 'a', 'b' inherits the
   string from 'a' unless 'b' has its own string.
   Snapshots inherit their string from the main unless they have
   their own string.

Layer 1:
   The string shall always be a JSON 'object'; i.e. of the form
    { "something": ... , "more": ... }

   The key strings shall be non-null and non-empty and shall
   be unique.

Layer 2:
   '.'s in the key string shall indicate hierarchy
   
   Key strings shall be listed in qemu's 
      docs/specs/qcow-keys.rst

      that shall indicate their meaning and the meaning and
      valid formatting of the value associated with them.

   Key strings shall start with either:
      qemu.   in which case they must be listed in a file in
              the qemu source tree

      a reverse dotted name unique to the submitter, they may
              be listed in the same file in the source tree, e.g.
      com.redhat.

Layer 3:
   QEMU shall, for a given qcow2 file be able to dump the
   key values.

Layer 4:
   On creating a VM by importing a qcow2, a management layer
   shall inspect the key/values to influence the configuration
   of the VM created.   Where it imports multiple qcow2's it
   shall inspect all the files and flag disagreements.

   Management layers shall, on creating a qcow2, set the
   keys based on the VM the qcow2 is created for.  If the qcow2
   is created as an additional disk for an existing VM it's
   fine to leave the string empty (e.g. for a data disk).

--------------------------------------------------------------
   

Some reasoning:
   a) I've avoided the problem of when QEMU interprets the value
      by ignoring it and giving it to management layers at the point
      of VM import.
   b) I hate JSON, but there again nailing down a fixed format
      seems easiest and it makes the job of QCOW easy - a single
      string.
      (I would suggest in layer2 that the keys are sorted, but
      that's a pain to do in some json creators)
   c) Forcing the registry of keys might avoid silly duplication.
      We can but hope.
   d) I've not said it's a libvirt XML file since that seems
      a bit prescriptive.

Some initial suggested keys:

   "qemu.machine-types": [ "q35", "i440fx" ]
   "qemu.min-ram-MB": 1024


Dave

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-05  9:21                 ` Dr. David Alan Gilbert
@ 2018-06-05 19:03                   ` Eduardo Habkost
  2018-06-05 19:47                     ` Michael S. Tsirkin
                                       ` (2 more replies)
  2018-06-06 11:02                   ` Max Reitz
  1 sibling, 3 replies; 157+ messages in thread
From: Eduardo Habkost @ 2018-06-05 19:03 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Max Reitz, Kevin Wolf, qemu-block, Michael S. Tsirkin,
	qemu-devel, Richard W.M. Jones, stefanha

On Tue, Jun 05, 2018 at 10:21:59AM +0100, Dr. David Alan Gilbert wrote:
> <reawakening a fizzled out thread>
> 
> This seems to have fizzled out because of a lack of a concrete proposal;
> so here is one based on a reply to Max's post:
> 
> * Max Reitz (mreitz@redhat.com) wrote:
> 
> <snip>
> 
> > The original problem was that you need to supply a machine type to qemu,
> > and that multiple common architectures now have multiple machine types
> > and not necessarily all work with a single image.  So far so good, but I
> > have two issues here already:
> > 
> > (1) How is qemu supposed to interpret that information?  If it's stored
> > in the image file, I don't see a nice way of retrieving it before the
> > machine is initialized, at least not with qemu's current architecture.
> 
> <snip>
> 
> > (2) Again, I personally just really don't like saving such information
> > in a disk image.  One actual argument I can bring up for that distaste
> > is this: Suppose, you have multiple images attached to your VM.  Now the
> > VM wants to store the machine type.  Where does it go?  Into all of
> > them?
> 
> <snip>
> 
> > So I think if we decide to store the machine type, that is kind of a
> > slippery slope and then there are good arguments for storing even more
> > configuration options in the file, too.  But I really, really don't like
> > that.
> 
> <snip>
> 
> > For another, how do we store the data?  key-value seems wrong if we want
> > to store everything.  JSON might be fine.  But eventually we just want
> > basically a qemu configuration file in there, I would think (which may
> > support JSON at some point?).   So basically we would store the data as
> > a binary blob and let the rest of qemu do its thing with it.  But then
> > please tell me why I fought so valiantly against storing random bitmaps
> > in qcow2 files.  I hate the idea of making qcow2 a random archive
> > format.  We have tar for that.
> 
> <snip>
> 
> > tl;dr: I really don't get why it's so hard to supply a config file along
> > with a qcow2 image.  Is it so hard for people to realize that a VM does
> > not only consist of a disk?
> 
> Yes! Because in many cases that's all it needs, and it's ready to run
> with no unpacking.
> 
> I think we should have:
> 
> --------------------------------------------------------------
> Layer 0:
>    QCOW provides a way to store a single string of arbitrary (but
> limited?) length.
>    QCOW provides a way to replace the string by a new string.
>    The original or the new string will be stored after that;
>    never some mix.
>    Where a file 'b' has a backing file 'a', 'b' inherits the
>    string from 'a' unless 'b' has its own string.
>    Snapshots inherit their string from the main unless they have
>    their own string.
> 
> Layer 1:
>    The string shall always be a JSON 'object'; i.e. of the form
>     { "something": ... , "more": ... }
> 
>    The key strings shall be non-null and non-empty and shall
>    be unique.
> 

I'd prefer layer 0+1 to:

1) Allow multiple entries to be stored (implemented by layer 1
   in this proposal)
2) Identify each entry with a name (implemented by layer 1 in
   this proposal)
3) Allow arbitrary binary data to be stored on an entry
   (not possible with the JSON-based proposal, because JSON
   strings are not blobs, but Unicode strings).
4) Make it easy to replace only one entry while keeping others
   intact (not the case here, if all entries are stored in the
   same JSON string)

I think it would be simpler if layer 0 simply provided a list of
names/value pairs, where names are ascii strings, and values are
binary data[1].  It would make layer 1 unnecessary, and allow (3)
and (4) to happen.

[1] In other words, Rich's proposal of "named blobs":
https://www.mail-archive.com/qemu-block@nongnu.org/msg37856.html
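
To show what I mean, here is a rough Python sketch of such a list of
named blobs.  The record layout (a 2-byte name length and a 4-byte
value length, big-endian, followed by the name and the value) is
purely hypothetical and is not the qcow2 on-disk format; the entry
names are made up too.

```python
import struct

# Hypothetical record header: name length (u16), value length (u32).
HEADER = struct.Struct(">HI")


def pack_entries(entries):
    """Serialize a dict of {ASCII name: bytes value} as length-prefixed records."""
    out = bytearray()
    for name, value in entries.items():
        raw_name = name.encode("ascii")  # fails on non-ASCII names
        out += HEADER.pack(len(raw_name), len(value))
        out += raw_name + value
    return bytes(out)


def unpack_entries(data):
    """Parse records produced by pack_entries back into a dict."""
    entries = {}
    offset = 0
    while offset < len(data):
        name_len, value_len = HEADER.unpack_from(data, offset)
        offset += HEADER.size
        name = data[offset:offset + name_len].decode("ascii")
        offset += name_len
        entries[name] = data[offset:offset + value_len]
        offset += value_len
    return entries
```

Values are opaque bytes, so one entry can hold JSON, another a raw
binary blob, and a single entry can be replaced without reparsing or
rewriting the contents of the others.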


> Layer 2:
>    '.'s in the key string shall indicate hierarchy
>    
>    Key strings shall be listed in qemu's 
>       docs/specs/qcow-keys.rst
> 
>       that shall indicate their meaning and the meaning and
>       valid formatting of the value associated with them.
> 
>    Key strings shall start with either:
>       qemu.   in which case they must be listed in a file in
>               the qemu source tree
> 
>       a reverse dotted name unique to the submitter, they may
>               be listed in the same file in the source tree, e.g.
>       com.redhat.
> 

Based on this proposal for layer 2, it looks like you expect the
number of keys used on layer 1 to become large.

I would prefer a solution that expects a very small set of keys
for layer 0+1, and point to other specifications of how the blob
can be interpreted for each key.  This way we can experiment with
different solutions for layers 2-4, instead of deciding on a
specific format like JSON.


> Layer 3:
>    QEMU shall, for a given qcow2 file be able to dump the
>    key values.
> 
> Layer 4:
>    On creating a VM by importing a qcow2, a management layer
>    shall inspect the key/values to influence the configuration
>    of the VM created.   Where it imports multiple qcow2's it
>    shall inspect all the files and flag disagreements.
> 
>    Management layers shall, on creating a qcow2, set the
>    keys based on the VM the qcow2 is created for.  If the qcow2
>    is created as an additional disk for an existing VM it's
>    fine to leave the string empty (e.g. for a data disk).
> 
> --------------------------------------------------------------
>    
> 
> Some reasoning:
>    a) I've avoided the problem of when QEMU interprets the value
>       by ignoring it and giving it to management layers at the point
>       of VM import.
>    b) I hate JSON, but there again nailing down a fixed format
>       seems easiest and it makes the job of QCOW easy - a single
>       string.
>       (I would suggest in layer2 that the keys are sorted, but
>       that's a pain to do in some json creators)
>    c) Forcing the registry of keys might avoid silly duplication.
>       We can but hope.
>    d) I've not said it's a libvirt XML file since that seems
>       a bit prescriptive.
> 
> Some initial suggested keys:
> 
>    "qemu.machine-types": [ "q35", "i440fx" ]
>    "qemu.min-ram-MB": 1024
> 
> 
> Dave
> 
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> 

-- 
Eduardo

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-05 19:03                   ` Eduardo Habkost
@ 2018-06-05 19:47                     ` Michael S. Tsirkin
  2018-06-05 19:54                       ` [Qemu-devel] [Qemu-block] " Eric Blake
  2018-06-06  6:26                     ` [Qemu-devel] " Gerd Hoffmann
  2018-06-06  9:44                     ` Dr. David Alan Gilbert
  2 siblings, 1 reply; 157+ messages in thread
From: Michael S. Tsirkin @ 2018-06-05 19:47 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Dr. David Alan Gilbert, Max Reitz, Kevin Wolf, qemu-block,
	qemu-devel, Richard W.M. Jones, stefanha

On Tue, Jun 05, 2018 at 04:03:24PM -0300, Eduardo Habkost wrote:
> On Tue, Jun 05, 2018 at 10:21:59AM +0100, Dr. David Alan Gilbert wrote:
> > <reawakening a fizzled out thread>
> > 
> > This seems to have fizzled out because of a lack of a concrete proposal;
> > so here is one based on a reply to Max's post:
> > 
> > * Max Reitz (mreitz@redhat.com) wrote:
> > 
> > <snip>
> > 
> > > The original problem was that you need to supply a machine type to qemu,
> > > and that multiple common architectures now have multiple machine types
> > > and not necessarily all work with a single image.  So far so good, but I
> > > have two issues here already:
> > > 
> > > (1) How is qemu supposed to interpret that information?  If it's stored
> > > in the image file, I don't see a nice way of retrieving it before the
> > > machine is initialized, at least not with qemu's current architecture.
> > 
> > <snip>
> > 
> > > (2) Again, I personally just really don't like saving such information
> > > in a disk image.  One actual argument I can bring up for that distaste
> > > is this: Suppose, you have multiple images attached to your VM.  Now the
> > > VM wants to store the machine type.  Where does it go?  Into all of
> > > them?
> > 
> > <snip>
> > 
> > > So I think if we decide to store the machine type, that is kind of a
> > > slippery slope and then there are good arguments for storing even more
> > > configuration options in the file, too.  But I really, really don't like
> > > that.
> > 
> > <snip>
> > 
> > > For another, how do we store the data?  key-value seems wrong if we want
> > > to store everything.  JSON might be fine.  But eventually we just want
> > > basically a qemu configuration file in there, I would think (which may
> > > support JSON at some point?).   So basically we would store the data as
> > > a binary blob and let the rest of qemu do its thing with it.  But then
> > > please tell me why I fought so valiantly against storing random bitmaps
> > > in qcow2 files.  I hate the idea of making qcow2 a random archive
> > > format.  We have tar for that.
> > 
> > <snip>
> > 
> > > tl;dr: I really don't get why it's so hard to supply a config file along
> > > with a qcow2 image.  Is it so hard for people to realize that a VM does
> > > not only consist of a disk?
> > 
> > Yes! Because in many cases that's all it needs, and it's ready to run
> > with no unpacking.
> > 
> > I think we should have:
> > 
> > --------------------------------------------------------------
> > Layer 0:
> >    QCOW provides a way to store a single string of arbitrary (but
> > limited?) length.
> >    QCOW provides a way to replace the string by a new string.
> >    The original or the new string will be stored after that;
> >    never some mix.
> >    Where a file 'b' has a backing file 'a', 'b' inherits the
> >    string from 'a' unless 'b' has its own string.
> >    Snapshots inherit their string from the main unless they have
> >    their own string.
> > 
> > Layer 1:
> >    The string shall always be a JSON 'object'; i.e. of the form
> >     { "something": ... , "more": ... }
> > 
> >    The key strings shall be non-null and non-empty and shall
> >    be unique.
> > 
> 
> I'd prefer layer 0+1 to:
> 
> 1) Allow multiple entries to be stored (implemented by layer 1
>    in this proposal)
> 2) Identify each entry with a name (implemented by layer 1 in
>    this proposal)
> 3) Allow arbitrary binary data to be stored on an entry
>    (not possible with the JSON-based proposal, because JSON
>    strings are not blobs, but Unicode strings).
> 4) Make it easy to replace only one entry while keeping others
>    intact (not the case here, if all entries are stored in the
>    same JSON string)
> 
> I think it would be simpler if layer 0 simply provided a list of
> names/value pairs, where names are ascii strings, and values are
> binary data[1].  It would make layer 1 unnecessary, and allow (3)
> and (4) to happen.
> 
> [1] In other words, Rich's proposal of "named blobs":
> https://www.mail-archive.com/qemu-block@nongnu.org/msg37856.html

I think simple is beautiful, too. But assuming they
really are binary how are blobs encoded?
Did binary really mean UTF-8 here?

> 
> > Layer 2:
> >    '.'s in the key string shall indicate hierarchy
> >    
> >    Key strings shall be listed in qemu's 
> >       docs/specs/qcow-keys.rst
> > 
> >       that shall indicate their meaning and the meaning and
> >       valid formatting of the value associated with them.
> > 
> >    Key strings shall start with either:
> >       qemu.   in which case they must be listed in a file in
> >               the qemu source tree
> > 
> >       a reverse dotted name unique to the submitter, they may
> >               be listed in the same file in the source tree, e.g.
> >       com.redhat.
> > 
> 
> Based on this proposal for layer 2, it looks like you expect the
> number of keys used on layer 1 to become large.
> 
> I would prefer a solution that expects a very small set of keys
> for layer 0+1, and point to other specifications of how the blob
> can be interpreted for each key.  This way we can experiment with
> different solutions for layers 2-4, instead of deciding on a
> specific format like JSON.
> 
> 
> > Layer 3:
> >    QEMU shall, for a given qcow2 file be able to dump the
> >    key values.
> > 
> > Layer 4:
> >    On creating a VM by importing a qcow2, a management layer
> >    shall inspect the key/values to influence the configuration
> >    of the VM created.   Where it imports multiple qcow2's it
> >    shall inspect all the files and flag disagreements.
> > 
> >    Management layers shall, on creating a qcow2, set the
> >    keys based on the VM the qcow2 is created for.  If the qcow2
> >    is created as an additional disk for an existing VM, it's
> >    fine to leave the string empty (e.g. for a data disk).
> > 
> > --------------------------------------------------------------
> >    
> > 
> > Some reasoning:
> >    a) I've avoided the problem of when QEMU interprets the value
> >       by ignoring it and giving it to management layers at the point
> >       of VM import.
> >    b) I hate JSON, but there again nailing down a fixed format
> >       seems easiest and it makes the job of QCOW easy - a single
> >       string.
> >       (I would suggest in layer2 that the keys are sorted, but
> >       that's a pain to do in some json creators)
> >    c) Forcing the registry of keys might avoid silly duplication.
> >       We can but hope.
> >    d) I've not said it's a libvirt XML file since that seems
> >       a bit prescriptive.
> > 
> > Some initial suggested keys:
> > 
> >    "qemu.machine-types": [ "q35", "i440fx" ]
> >    "qemu.min-ram-MB": 1024
> > 
> > 
> > Dave
> > 
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> > 
> 
> -- 
> Eduardo

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] [Qemu-block] storing machine data in qcow images?
  2018-06-05 19:47                     ` Michael S. Tsirkin
@ 2018-06-05 19:54                       ` Eric Blake
  2018-06-05 19:58                         ` Richard W.M. Jones
  2018-06-05 20:06                         ` Michael S. Tsirkin
  0 siblings, 2 replies; 157+ messages in thread
From: Eric Blake @ 2018-06-05 19:54 UTC (permalink / raw)
  To: Michael S. Tsirkin, Eduardo Habkost
  Cc: Kevin Wolf, qemu-block, qemu-devel, Richard W.M. Jones, stefanha,
	Max Reitz, Dr. David Alan Gilbert

On 06/05/2018 02:47 PM, Michael S. Tsirkin wrote:

>>> Layer 1:
>>>     The string shall always be a JSON 'object'; i.e. of the form
>>>      { "something": ... , "more": ... }
>>>
>>>     The key strings shall be non-null and non-empty and shall
>>>     be unique.

>>
>> I think it would be simpler if layer 0 simply provided a list of
>> names/value pairs, where names are ascii strings, and values are
>> binary data[1].  It would make layer 1 unnecessary, and allow (3)
>> and (4) to happen.
>>
>> [1] In other words, Rich's proposal of "named blobs":
>> https://www.mail-archive.com/qemu-block@nongnu.org/msg37856.html
> 
> I think simple is beautiful, too. But assuming they
> really are binary how are blobs encoded?
> Did binary really mean UTF-8 here?

Binary blobs can always be base64 encoded for representation within a 
valid JSON UTF-8 string (and we already have several QMP interfaces that 
utilize base64 encoding to pass through what is otherwise invalid 
UTF-8).  It does inflate things slightly compared to a format that 
allows a raw length coupled with raw data, but that is not necessarily a 
problem.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] [Qemu-block] storing machine data in qcow images?
  2018-06-05 19:54                       ` [Qemu-devel] [Qemu-block] " Eric Blake
@ 2018-06-05 19:58                         ` Richard W.M. Jones
  2018-06-05 20:09                           ` Eric Blake
  2018-06-06  6:23                           ` Gerd Hoffmann
  2018-06-05 20:06                         ` Michael S. Tsirkin
  1 sibling, 2 replies; 157+ messages in thread
From: Richard W.M. Jones @ 2018-06-05 19:58 UTC (permalink / raw)
  To: Eric Blake
  Cc: Michael S. Tsirkin, Eduardo Habkost, Kevin Wolf, qemu-block,
	qemu-devel, stefanha, Max Reitz, Dr. David Alan Gilbert

On Tue, Jun 05, 2018 at 02:54:07PM -0500, Eric Blake wrote:
> On 06/05/2018 02:47 PM, Michael S. Tsirkin wrote:
> 
> >>>Layer 1:
> >>>    The string shall always be a JSON 'object'; i.e. of the form
> >>>     { "something": ... , "more": ... }
> >>>
> >>>    The key strings shall be non-null and non-empty and shall
> >>>    be unique.
> 
> >>
> >>I think it would be simpler if layer 0 simply provided a list of
> >>names/value pairs, where names are ascii strings, and values are
> >>binary data[1].  It would make layer 1 unnecessary, and allow (3)
> >>and (4) to happen.
> >>
> >>[1] In other words, Rich's proposal of "named blobs":
> >>https://www.mail-archive.com/qemu-block@nongnu.org/msg37856.html
> >
> >I think simple is beautiful, too. But assuming they
> >really are binary how are blobs encoded?
> >Did binary really mean UTF-8 here?

By binary I actually meant binary.  The idea is you could
store things like PNG images in them (for icons).

> Binary blobs can always be base64 encoded for representation within
> a valid JSON UTF-8 string (and we already have several QMP
> interfaces that utilize base64 encoding to pass through what is
> otherwise invalid UTF-8).  It does inflate things slightly compared
> to a format that allows a raw length coupled with raw data, but that
> is not necessarily a problem.

Of course how we represent them externally and/or while
using QMP / qemu-img to store and retrieve them is up for grabs.
Doesn't JSON allow binary to be encoded?  (Knowing how poorly
done JSON is, I wouldn't be surprised if not)

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] [Qemu-block] storing machine data in qcow images?
  2018-06-05 19:54                       ` [Qemu-devel] [Qemu-block] " Eric Blake
  2018-06-05 19:58                         ` Richard W.M. Jones
@ 2018-06-05 20:06                         ` Michael S. Tsirkin
  1 sibling, 0 replies; 157+ messages in thread
From: Michael S. Tsirkin @ 2018-06-05 20:06 UTC (permalink / raw)
  To: Eric Blake
  Cc: Eduardo Habkost, Kevin Wolf, qemu-block, qemu-devel,
	Richard W.M. Jones, stefanha, Max Reitz, Dr. David Alan Gilbert

On Tue, Jun 05, 2018 at 02:54:07PM -0500, Eric Blake wrote:
> On 06/05/2018 02:47 PM, Michael S. Tsirkin wrote:
> 
> > > > Layer 1:
> > > >     The string shall always be a JSON 'object'; i.e. of the form
> > > >      { "something": ... , "more": ... }
> > > > 
> > > >     The key strings shall be non-null and non-empty and shall
> > > >     be unique.
> 
> > > 
> > > I think it would be simpler if layer 0 simply provided a list of
> > > names/value pairs, where names are ascii strings, and values are
> > > binary data[1].  It would make layer 1 unnecessary, and allow (3)
> > > and (4) to happen.
> > > 
> > > [1] In other words, Rich's proposal of "named blobs":
> > > https://www.mail-archive.com/qemu-block@nongnu.org/msg37856.html
> > 
> > I think simple is beautiful, too. But assuming they
> > really are binary how are blobs encoded?
> > Did binary really mean UTF-8 here?
> 
> Binary blobs can always be base64 encoded for representation within a valid
> JSON UTF-8 string (and we already have several QMP interfaces that utilize
> base64 encoding to pass through what is otherwise invalid UTF-8).  It does
> inflate things slightly compared to a format that allows a raw length
> coupled with raw data, but that is not necessarily a problem.

OK so what's proposed here is something like the following:

[A-Za-z][^=\0]*=[^\0]*

Sound reasonable?

How about a fixed key at the beginning to specify the format?

E.g. 16 first bytes:

qcow2-format=00\0

anything else in first 16 bytes means some other format?
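
A minimal sketch of how a reader of such an area might look, assuming the
records are simply NUL-terminated and follow the 16-byte marker directly.
The function name and the duplicate-key check are illustrative assumptions,
not part of the proposal:

```python
import re

# The 16-byte marker suggested above: 12 + 1 + 2 + 1 bytes.
FORMAT_HEADER = b"qcow2-format=00\0"
# One record, as in the regular expression above (bytes, not text).
RECORD = re.compile(rb"[A-Za-z][^=\0]*=[^\0]*")


def parse_store(buf: bytes) -> dict:
    """Split a NUL-terminated key=value area into {key: value}, both bytes."""
    if not buf.startswith(FORMAT_HEADER):
        raise ValueError("unknown metadata format")
    entries = {}
    for record in buf[len(FORMAT_HEADER):].split(b"\0"):
        if not record:
            # Empty piece: trailing terminator or zero padding.
            continue
        if not RECORD.fullmatch(record):
            raise ValueError(f"malformed record: {record!r}")
        # The key cannot contain '=', so the first '=' ends it.
        key, _, value = record.partition(b"=")
        if key in entries:
            raise ValueError(f"duplicate key: {key!r}")
        entries[key] = value
    return entries
```

Keys and values stay as bytes here because the expression above does not
restrict them to ASCII or UTF-8.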

> -- 
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.           +1-919-301-3266
> Virtualization:  qemu.org | libvirt.org

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] [Qemu-block] storing machine data in qcow images?
  2018-06-05 19:58                         ` Richard W.M. Jones
@ 2018-06-05 20:09                           ` Eric Blake
  2018-06-05 20:28                             ` Michael S. Tsirkin
  2018-06-06  6:23                           ` Gerd Hoffmann
  1 sibling, 1 reply; 157+ messages in thread
From: Eric Blake @ 2018-06-05 20:09 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: Michael S. Tsirkin, Eduardo Habkost, Kevin Wolf, qemu-block,
	qemu-devel, stefanha, Max Reitz, Dr. David Alan Gilbert

On 06/05/2018 02:58 PM, Richard W.M. Jones wrote:
>> Binary blobs can always be base64 encoded for representation within
>> a valid JSON UTF-8 string (and we already have several QMP
>> interfaces that utilize base64 encoding to pass through what is
>> otherwise invalid UTF-8).  It does inflate things slightly compared
>> to a format that allows a raw length coupled with raw data, but that
>> is not necessarily a problem.
> 
> Of course how we represent them externally and/or while
> using QMP / qemu-img to store and retrieve them is up for grabs.
> Doesn't JSON allow binary to be encoded?  (Knowing how poorly
> done JSON is, I wouldn't be surprised if not)

JSON itself does not have a binary primitive; to pass arbitrary data 
through JSON you have to first encode that data into something like 
base64 that can then be represented as a UTF-8 string.  For reference, 
look at qapi/crypto.json and the definition of QCryptoSecretFormat.
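
As a small illustration of that point (a sketch using only the Python
standard library, with a made-up key name "icon"), raw bytes are rejected
by a JSON encoder but survive a round trip once base64 encoded:

```python
import base64
import json

blob = bytes(range(256))  # arbitrary binary data, not valid UTF-8

try:
    json.dumps({"icon": blob})  # bytes are not a JSON type
except TypeError as err:
    print("cannot serialise raw bytes:", err)

# Encode to base64 text first, then store it as an ordinary JSON string.
encoded = base64.b64encode(blob).decode("ascii")
document = json.dumps({"icon": encoded})

# Reading it back reverses both steps.
restored = base64.b64decode(json.loads(document)["icon"])
assert restored == blob
print(len(blob), "raw bytes ->", len(encoded), "base64 characters")
```

Base64 emits four characters for every three input bytes, which is the
inflation mentioned earlier in the thread.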

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] [Qemu-block] storing machine data in qcow images?
  2018-06-05 20:09                           ` Eric Blake
@ 2018-06-05 20:28                             ` Michael S. Tsirkin
  2018-06-05 20:46                               ` Eric Blake
  2018-06-06  8:07                               ` Dr. David Alan Gilbert
  0 siblings, 2 replies; 157+ messages in thread
From: Michael S. Tsirkin @ 2018-06-05 20:28 UTC (permalink / raw)
  To: Eric Blake
  Cc: Richard W.M. Jones, Eduardo Habkost, Kevin Wolf, qemu-block,
	qemu-devel, stefanha, Max Reitz, Dr. David Alan Gilbert

On Tue, Jun 05, 2018 at 03:09:17PM -0500, Eric Blake wrote:
> On 06/05/2018 02:58 PM, Richard W.M. Jones wrote:
> > > Binary blobs can always be base64 encoded for representation within
> > > a valid JSON UTF-8 string (and we already have several QMP
> > > interfaces that utilize base64 encoding to pass through what is
> > > otherwise invalid UTF-8).  It does inflate things slightly compared
> > > to a format that allows a raw length coupled with raw data, but that
> > > is not necessarily a problem.
> > 
> > Of course how we represent them externally and/or while
> > using QMP / qemu-img to store and retrieve them is up for grabs.
> > Doesn't JSON allow binary to be encoded?  (Knowing how poorly
> > done JSON is, I wouldn't be surprised if not)
> 
> JSON itself does not have a binary primitive; to pass arbitrary data through
> JSON you have to first encode that data into something like base64 that can
> then be represented as a UTF-8 string.  For reference, look at
> qapi/crypto.json and the definition of QCryptoSecretFormat.

But there isn't a way to figure out that a string is base64, which
means each application needs to know whether it's a string or
a binary.

How about specifying the encoding in the value?

string value:
[A-Za-z][^=\0]*=S[^\0]*

base64 value:
[A-Za-z][^=\0]*=B[A-Za-z0-9+/]*

or in the key?

string value:
S[A-Za-z][^=\0]*=[^\0]*

base64 value:
B[A-Za-z][^=\0]*=[A-Za-z0-9+/]*

?
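
A rough sketch of a decoder for the first variant (type letter at the start
of the value), assuming multi-character keys are intended and that base64
values may carry standard '=' padding, which the character class above does
not list. The function name is hypothetical:

```python
import base64
import re

# key '=' type-letter payload; S = plain string, B = base64.
TAGGED = re.compile(rb"([A-Za-z][^=\0]*)=([SB])([^\0]*)")


def decode_record(record: bytes):
    """Return (key, value) as bytes for one type-tagged record."""
    match = TAGGED.fullmatch(record)
    if match is None:
        raise ValueError(f"malformed record: {record!r}")
    key, tag, payload = match.groups()
    if tag == b"S":
        return key, payload
    # validate=True rejects characters outside the base64 alphabet.
    return key, base64.b64decode(payload, validate=True)
```

With this, a reader no longer has to know per key whether a value is text
or binary; the tag carries that information.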

-- 
MST

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] [Qemu-block] storing machine data in qcow images?
  2018-06-05 20:28                             ` Michael S. Tsirkin
@ 2018-06-05 20:46                               ` Eric Blake
  2018-06-05 21:26                                 ` Michael S. Tsirkin
  2018-06-06  8:07                               ` Dr. David Alan Gilbert
  1 sibling, 1 reply; 157+ messages in thread
From: Eric Blake @ 2018-06-05 20:46 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Richard W.M. Jones, Eduardo Habkost, Kevin Wolf, qemu-block,
	qemu-devel, stefanha, Max Reitz, Dr. David Alan Gilbert

On 06/05/2018 03:28 PM, Michael S. Tsirkin wrote:
> On Tue, Jun 05, 2018 at 03:09:17PM -0500, Eric Blake wrote:
>> On 06/05/2018 02:58 PM, Richard W.M. Jones wrote:
>>>> Binary blobs can always be base64 encoded for representation within
>>>> a valid JSON UTF-8 string (and we already have several QMP
>>>> interfaces that utilize base64 encoding to pass through what is
>>>> otherwise invalid UTF-8).  It does inflate things slightly compared
>>>> to a format that allows a raw length coupled with raw data, but that
>>>> is not necessarily a problem.
>>>
>>> Of course how we represent them externally and/or while
>>> using QMP / qemu-img to store and retrieve them is up for grabs.
>>> Doesn't JSON allow binary to be encoded?  (Knowing how poorly
>>> done JSON is, I wouldn't be surprised if not)
>>
>> JSON itself does not have a binary primitive; to pass arbitrary data through
>> JSON you have to first encode that data into something like base64 that can
>> then be represented as a UTF-8 string.  For reference, look at
>> qapi/crypto.json and the definition of QCryptoSecretFormat.
> 
> But there isn't a way to figure out that a string is base64, which
> means each application needs to know whether it's a string or
> a binary.

Other than if the key name is well-known (and interpretation of the 
value is done according to the key, if it is interpreted at all).

> 
> How about specifying the encoding in the value?
> 
> string value:
> [A-Za-z][^=\0]=S[^\0]*
> 
> base64 value:
> [A-Za-z][^=\0]=B[A-Za-z0-9+/]*
> 
> or the key:
> S[A-Za-z][^=\0]=[^\0]*
> 
> base64 value:
> B[A-Za-z][^=\0]=[A-Za-z0-9+/]*

If we're going to express a tuple of <key,type,value>, then let's 
describe it as a 3-tuple, rather than trying to overload that into a 
2-tuple key=value syntax.  After all, while you may want a high-level 
layer 1 representation in JSON or something else easy to hand off to 
other processing, there's no requirement that the layer 0 storage in 
qcow2 can't be a struct in some other layout, even if we have to convert 
between layouts for what is stored in the file vs. what is presented to 
the user.

By that argument, a single JSON object (if we insist on storing a single 
JSON object as layer 0) might be:

[ { "name": "foo", "type": "raw", "value": "bar" },
   { "name": "quux", "type": "base64", "value": "aGVsbG8=" } ]
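
For illustration only, a consumer of that layout could be sketched as
follows. The example is a JSON array of objects; the function name and the
choice to treat "raw" values as UTF-8 text are assumptions:

```python
import base64
import json

LAYER0 = """[ { "name": "foo", "type": "raw", "value": "bar" },
  { "name": "quux", "type": "base64", "value": "aGVsbG8=" } ]"""


def load_entries(text: str) -> dict:
    """Turn a list of name/type/value entries into {name: bytes}."""
    entries = {}
    for item in json.loads(text):
        name, kind, value = item["name"], item["type"], item["value"]
        if kind == "raw":
            data = value.encode("utf-8")
        elif kind == "base64":
            data = base64.b64decode(value, validate=True)
        else:
            raise ValueError(f"unknown type {kind!r} for entry {name!r}")
        entries[name] = data
    return entries


print(load_entries(LAYER0))
```

The explicit "type" field is what lets the reader decode each value without
knowing the key in advance.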

But figuring out how to add a "Header extension type" to 
docs/interop/qcow2.txt, and whether the key/value payload will usefully 
fit in that header extension (where we are tight on spacing, as ALL 
extensions combined must fit in a single cluster), or whether the header 
should instead have an offset field that points to some other cluster in 
the qcow2 image, is relatively straightforward, and relatively 
independent of the even bigger design question of whether we want to 
allow qcow2 as an image format to expose an arbitrary data store feature 
and what types of data should go into that store.
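
To make the "offset field" option concrete, here is a sketch of packing
such a header extension. It follows the general extension layout in
docs/interop/qcow2.txt as I understand it (big-endian 32-bit type, 32-bit
data length, data padded to a multiple of 8 bytes). The type value and the
offset/length payload are invented for illustration and are not allocated
anywhere:

```python
import struct

# Hypothetical extension type ("META" in ASCII); NOT a registered value.
HYPOTHETICAL_EXT_TYPE = 0x4D455441


def pack_pointer_extension(offset: int, length: int) -> bytes:
    """Pack a header extension that points at a metadata area elsewhere."""
    # Assumed payload: 64-bit offset and 64-bit length of the area.
    payload = struct.pack(">QQ", offset, length)
    header = struct.pack(">II", HYPOTHETICAL_EXT_TYPE, len(payload))
    # Pad the data so the next extension starts on an 8-byte boundary.
    padding = b"\0" * (-len(payload) % 8)
    return header + payload + padding
```

A pointer like this costs 24 bytes in the header cluster regardless of how
large the key/value area grows.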

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] [Qemu-block] storing machine data in qcow images?
  2018-06-05 20:46                               ` Eric Blake
@ 2018-06-05 21:26                                 ` Michael S. Tsirkin
  0 siblings, 0 replies; 157+ messages in thread
From: Michael S. Tsirkin @ 2018-06-05 21:26 UTC (permalink / raw)
  To: Eric Blake
  Cc: Richard W.M. Jones, Eduardo Habkost, Kevin Wolf, qemu-block,
	qemu-devel, stefanha, Max Reitz, Dr. David Alan Gilbert

On Tue, Jun 05, 2018 at 03:46:45PM -0500, Eric Blake wrote:
> On 06/05/2018 03:28 PM, Michael S. Tsirkin wrote:
> > On Tue, Jun 05, 2018 at 03:09:17PM -0500, Eric Blake wrote:
> > > On 06/05/2018 02:58 PM, Richard W.M. Jones wrote:
> > > > > Binary blobs can always be base64 encoded for representation within
> > > > > a valid JSON UTF-8 string (and we already have several QMP
> > > > > interfaces that utilize base64 encoding to pass through what is
> > > > > otherwise invalid UTF-8).  It does inflate things slightly compared
> > > > > to a format that allows a raw length coupled with raw data, but that
> > > > > is not necessarily a problem.
> > > > 
> > > > Of course how we represent them externally and/or while
> > > > using QMP / qemu-img to store and retrieve them is up for grabs.
> > > > Doesn't JSON allow binary to be encoded?  (Knowing how poorly
> > > > done JSON is, I wouldn't be surprised if not)
> > > 
> > > JSON itself does not have a binary primitive; to pass arbitrary data through
> > > JSON you have to first encode that data into something like base64 that can
> > > then be represented as a UTF-8 string.  For reference, look at
> > > qapi/crypto.json and the definition of QCryptoSecretFormat.
> > 
> > But there isn't a way to figure out that a string is base64, which
> > means each application needs to know whether it's a string or
> > a binary.
> 
> Other than if the key name is well-known (and interpretation of the value is
> done according to the key, if it is interpreted at all).
> 
> > 
> > How about specifying the encoding in the value?
> > 
> > string value:
> > [A-Za-z][^=\0]=S[^\0]*
> > 
> > base64 value:
> > [A-Za-z][^=\0]=B[A-Za-z0-9+/]*
> > 
> > or the key:
> > S[A-Za-z][^=\0]=[^\0]*
> > 
> > base64 value:
> > B[A-Za-z][^=\0]=[A-Za-z0-9+/]*
> 
> If we're going to express a tuple of <key,type,value>, then let's describe
> it as a 3-tuple, rather than trying to overload that into a 2-tuple
> key=value syntax.

It's a 3 tuple, isn't it?

Foobar=Sabc
1------23

"=" marks the end of the key, but we don't have so many types that we need
a separator between type and value.  Would you prefer:

Foobar=S:abc

then, just in case we add a huge number of types down the road?


>  After all, while you may want a high-level layer 1
> representation in JSON or something else easy to hand off to other
> processing, there's no requirement that the layer 0 storage in qcow2 can't
> be a struct in some other layout, even if we have to convert between layouts
> for what is stored in the file vs. what is presented to the user.
>
>
> By that argument, a single JSON object (if we insist on storing a single
> JSON object as layer 0) might be:
> 
> [ { "name": "foo", "type": "raw", "value": "bar" },
>   { "name": "quux", "type": "base64", "value": "aGVsbG8=" } ]

I suspect a lot of what we might want to save would be snippets from
QEMU command line, which are all 0-terminated strings but for which
there is no guarantee they are valid JSON or even UTF8 strings.

We could base64 encode them all, but that means some of JSON's advantages
(e.g. readability) are gone, and it will be easy for us to get confused
and forget to encode some string, which will lead to security issues.
I'd rather use a format that allows zero-terminated strings
as first-class citizens.


> But figuring out how to add a "Header extension type" to
> docs/interop/qcow2.txt, and whether the key/value payload will usefully fit
> in that header extension (where we are tight on spacing, as ALL extensions
> combined must fit in a single cluster), or whether the header should instead
> have an offset field that points to some other cluster in the qcow2 image,

I'm guessing we need an offset, yes.

> is relatively straightforward, and relatively independent of the even bigger
> design question of whether we want to allow qcow2 as an image format to
> expose an arbitrary data store feature and what types of data should go into
> that store.

I agree it can be decided later but it's not orthogonal in that
it only makes sense to argue what exactly we store there after
we have the capability to store *something* :)

I think we can start small: define the format first, then extend qemu-img
to either get/set the file so we can pipe it to some other utility, or
work with keys/values directly.

Next step will be to start defining contents. We already have two things I
know we want to store: the architecture and the machine type are useful
for bootable images.

Poking at these values from qemu and warning the user on a mismatch, or even
getting the default machine from there if none is specified, might
be a step after that.

> -- 
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.           +1-919-301-3266
> Virtualization:  qemu.org | libvirt.org

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-23 16:35                 ` Markus Armbruster
  2018-05-29 14:06                   ` Dr. David Alan Gilbert
@ 2018-06-05 21:58                   ` Michal Suchánek
  1 sibling, 0 replies; 157+ messages in thread
From: Michal Suchánek @ 2018-06-05 21:58 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Eduardo Habkost, kwolf, qemu-block, Michael S. Tsirkin,
	qemu-devel, mreitz, stefanha

On Wed, 23 May 2018 18:35:31 +0200
Markus Armbruster <armbru@redhat.com> wrote:

> Eduardo Habkost <ehabkost@redhat.com> writes:
> 
> > On Wed, May 23, 2018 at 01:19:46PM +0200, Markus Armbruster wrote:  
> >> Eduardo Habkost <ehabkost@redhat.com> writes:
> >>   
> >> > On Mon, May 21, 2018 at 07:44:40PM +0100, Daniel P. Berrangé
> >> > wrote:  
> >> >> On Mon, May 21, 2018 at 03:29:28PM -0300, Eduardo Habkost
> >> >> wrote:  
> >> >> > On Sat, May 19, 2018 at 08:05:06AM +0200, Markus Armbruster
> >> >> > wrote:  
...  
> >> >> > The point here is to allow users to simply copy an existing
> >> >> > disk image, and it will contain enough hints for a cloud
> >> >> > stack to choose reasonable defaults for machine-type and disk
> >> >> > type automatically.  Requiring the user to perform a separate
> >> >> > step to encapsulate the disk image in another file format
> >> >> > defeats the whole purpose of the proposal.  
> >> >> 
> >> >> It doesn't have to mean more work for the user - the application
> >> >> that is used to create the image can do that on their behalf.
> >> >> oVirt for example can import/export OVA files, containing OVF
> >> >> metadata. I could imagine virt-manager, and other tools adding
> >> >> export ability without much trouble if this was deemed a
> >> >> desirable thing. Bundling gives ability to have multiple disk
> >> >> images in one archive, which is something OVF does.  
> >> >
> >> > I have the impression that "the application that is used to
> >> > create the image" is a very large set.  It can be virt-manager,
> >> > virt-install, or even QEMU itself.
> >> >
> >> > Today people can simply create a VM on virt-manager, or run QEMU
> >> > manually, and upload the qcow2 image directly from its original
> >> > location (they don't need to copy/export it).  Don't we want the
> >> > same procedure to keep working instead of requiring users to use
> >> > another tool?  
> >> 
...
> >> With OVF, you solve the problem further up the stack: you do
> >> virtual appliances instead of disk images.
> >>   
> >
> > I guess the main problem is that people are already using disk
> > images as if they were virtual appliances.
> >
> > We can tell people to stop doing that and use OVF, but then we
> > won't make anybody's life any easier: publishers of images might
> > need to generate both qcow2 and OVF images if they want it to
> > work with older hosts; consumers will need to find out if they
> > need qcow2 or OVF.  
> 
> I'm afraid providing for "hints" in QCOW2 could only add problems.  To
> pick the right hints, publishers need to predict how future software
> consuming the image will interpret them.  Consumers may have to
> configure their software to interpret hints in various ways.

That depends on the hint.

If you define that the key libvirt-xml contains the libvirt machine
definition it is up to libvirt to define what exactly the hint contains.

It allows exporting a machine from virt-manager or virsh and importing
it into the same tool with exactly the same properties. Looking at it will
probably give you some idea how to configure the VM in another
virtualization solution.

Similarly, OpenStack can define a different hint for OpenStack VMs.

You are free to equip your appliance with multiple hints if you want it
to work on multiple virtualization solutions. It is much more
efficient than providing multiple appliance images that embed the
same disk image and different metadata. Virtualization solutions are
free to implement conversions to ease importing VMs from different
sources, even for something as horrible as OVF.

It is a bit limiting, though.

You cannot make an appliance with multiple disks when the metadata is
embedded in a disk image.

Thanks

Michal

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] [Qemu-block] storing machine data in qcow images?
  2018-06-05 19:58                         ` Richard W.M. Jones
  2018-06-05 20:09                           ` Eric Blake
@ 2018-06-06  6:23                           ` Gerd Hoffmann
  1 sibling, 0 replies; 157+ messages in thread
From: Gerd Hoffmann @ 2018-06-06  6:23 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: Eric Blake, Kevin Wolf, Eduardo Habkost, qemu-block,
	Michael S. Tsirkin, qemu-devel, Max Reitz, stefanha,
	Dr. David Alan Gilbert

  Hi,

> By binary I actually meant binary.  The idea is you could
> store things like PNG images in them (for icons).

I guess if we go that route we will also want to store a MIME type for each entry.

cheers,
  Gerd

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-05 19:03                   ` Eduardo Habkost
  2018-06-05 19:47                     ` Michael S. Tsirkin
@ 2018-06-06  6:26                     ` Gerd Hoffmann
  2018-06-06  9:44                     ` Dr. David Alan Gilbert
  2 siblings, 0 replies; 157+ messages in thread
From: Gerd Hoffmann @ 2018-06-06  6:26 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Dr. David Alan Gilbert, Kevin Wolf, qemu-block,
	Michael S. Tsirkin, Richard W.M. Jones, qemu-devel, stefanha,
	Max Reitz

  Hi,

> Based on this proposal for layer 2, it looks like you expect the
> number of keys used on layer 1 to become large.
> 
> I would prefer a solution that expects a very small set of keys
> for layer 0+1, and point to other specifications of how the blob
> can be interpreted for each key.  This way we can experiment with
> different solutions for layers 2-4, instead of deciding on a
> specific format like JSON.

Well, you never know what people will use this for ...

cheers,
  Gerd

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] [Qemu-block] storing machine data in qcow images?
  2018-06-05 20:28                             ` Michael S. Tsirkin
  2018-06-05 20:46                               ` Eric Blake
@ 2018-06-06  8:07                               ` Dr. David Alan Gilbert
  1 sibling, 0 replies; 157+ messages in thread
From: Dr. David Alan Gilbert @ 2018-06-06  8:07 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Eric Blake, Richard W.M. Jones, Eduardo Habkost, Kevin Wolf,
	qemu-block, qemu-devel, stefanha, Max Reitz

* Michael S. Tsirkin (mst@redhat.com) wrote:
> On Tue, Jun 05, 2018 at 03:09:17PM -0500, Eric Blake wrote:
> > On 06/05/2018 02:58 PM, Richard W.M. Jones wrote:
> > > > Binary blobs can always be base64 encoded for representation within
> > > > a valid JSON UTF-8 string (and we already have several QMP
> > > > interfaces that utilize base64 encoding to pass through what is
> > > > otherwise invalid UTF-8).  It does inflate things slightly compared
> > > > to a format that allows a raw length coupled with raw data, but that
> > > > is not necessarily a problem.
> > > 
> > > Of course how we represent them externally and/or while
> > > using QMP / qemu-img to store and retrieve them is up for grabs.
> > > Doesn't JSON allow binary to be encoded?  (Knowing how poorly
> > > done JSON is, I wouldn't be surprised if not)
> > 
> > JSON itself does not have a binary primitive; to pass arbitrary data through
> > JSON you have to first encode that data into something like base64 that can
> > then be represented as a UTF-8 string.  For reference, look at
> > qapi/crypto.json and the definition of QCryptoSecretFormat.
> 
> But there isn't a way to figure out that a string is base64, which
> means each application needs to know whether it's a string or
> a binary.
> 
> How about specifying the encoding in the value?
> 
> string value:
> [A-Za-z][^=\0]=S[^\0]*
> 
> base64 value:
> [A-Za-z][^=\0]=B[A-Za-z0-9+/]*
> 
> or the key:
> S[A-Za-z][^=\0]=[^\0]*
> 
> base64 value:
> B[A-Za-z][^=\0]=[A-Za-z0-9+/]*

Why reinvent the wheel yet again?
The world has enough encodings out there, and we're trying to add
something that's a trivial data store for a frankly trivial use.

While I hate json, it's at least standard.

Dave

> ?
> 
> -- 
> MST
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-05 19:03                   ` Eduardo Habkost
  2018-06-05 19:47                     ` Michael S. Tsirkin
  2018-06-06  6:26                     ` [Qemu-devel] " Gerd Hoffmann
@ 2018-06-06  9:44                     ` Dr. David Alan Gilbert
  2018-06-06 13:35                       ` Eduardo Habkost
  2 siblings, 1 reply; 157+ messages in thread
From: Dr. David Alan Gilbert @ 2018-06-06  9:44 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Max Reitz, Kevin Wolf, qemu-block, Michael S. Tsirkin,
	qemu-devel, Richard W.M. Jones, stefanha

* Eduardo Habkost (ehabkost@redhat.com) wrote:
> On Tue, Jun 05, 2018 at 10:21:59AM +0100, Dr. David Alan Gilbert wrote:
> > <reawakening a fizzled out thread>
> > 
> > This seems to have fizzled out because of a lack of a concrete proposal;
> > so here is one based on a reply to Max's post:
> > 
> > * Max Reitz (mreitz@redhat.com) wrote:
> > 
> > <snip>
> > 
> > > The original problem was that you need to supply a machine type to qemu,
> > > and that multiple common architectures now have multiple machine types
> > > and not necessarily all work with a single image.  So far so good, but I
> > > have two issues here already:
> > > 
> > > (1) How is qemu supposed to interpret that information?  If it's stored
> > > in the image file, I don't see a nice way of retrieving it before the
> > > machine is initialized, at least not with qemu's current architecture.
> > 
> > <snip>
> > 
> > > (2) Again, I personally just really don't like saving such information
> > > in a disk image.  One actual argument I can bring up for that distaste
> > > is this: Suppose, you have multiple images attached to your VM.  Now the
> > > VM wants to store the machine type.  Where does it go?  Into all of
> > > them?
> > 
> > <snip>
> > 
> > > So I think if we decide to store the machine type, that is kind of a
> > > slippery slope and then there are good arguments for storing even more
> > > configuration options in the file, too.  But I really, really don't like
> > > that.
> > 
> > <snip>
> > 
> > > For another, how do we store the data?  key-value seems wrong if we want
> > > to store everything.  JSON might be fine.  But eventually we just want
> > > basically a qemu configuration file in there, I would think (which may
> > > support JSON at some point?).   So basically we would store the data as
> > > a binary blob and let the rest of qemu do its thing with it.  But then
> > > please tell me why I fought so valiantly against storing random bitmaps
> > > in qcow2 files.  I hate the idea of making qcow2 a random archive
> > > format.  We have tar for that.
> > 
> > <snip>
> > 
> > > tl;dr: I really don't get why it's so hard to supply a config file along
> > > with a qcow2 image.  Is it so hard for people to realize that a VM does
> > > not only consist of a disk?
> > 
> > Yes! Because in many cases that's all it needs, and it's ready to run
> > with no unpacking.
> > 
> > I think we should have:
> > 
> > --------------------------------------------------------------
> > Layer 0:
> >    QCOW provides a way to store a single string of arbitrary (but
> > limited?) length.
> >    QCOW provides a way to replace the string by a new string.
> >    The original or the new string will be stored after that;
> >    never some mix.
> >    Where a file 'b' has a backing file 'a', 'b' inherits the
> > >    string from 'a' unless 'b' has its own string.
> >    Snapshots inherit their string from the main unless they have
> >    their own string.
> > 
> > Layer 1:
> >    The string shall always be a JSON 'object'; i.e. of the form
> >     { "something": ... , "more": ... }
> > 
> >    The key strings shall be non-null and non-empty and shall
> >    be unique.
> > 
> 
> I'd prefer layer 0+1 to:
> 
> 1) Allow multiple entries to be stored (implemented by layer 1
>    in this proposal)
> 2) Identify each entry with a name (implemented by layer 1 in
>    this proposal)
> 3) Allow arbitrary binary data to be stored on an entry
>    (not possible with the JSON-based proposal, because JSON
>    strings are not blobs, but Unicode strings).
> 4) Make it easy to replace only one entry while keeping others
>    intact (not the case here, if all entries are stored in the
>    same JSON string)
> 
> I think it would be simpler if layer 0 simply provided a list of
> names/value pairs, where names are ascii strings, and values are
> binary data[1].  It would make layer 1 unnecessary, and allow (3)
> and (4) to happen.
> 
> [1] In other words, Rich's proposal of "named blobs":
> https://www.mail-archive.com/qemu-block@nongnu.org/msg37856.html

My reasoning was just one of simplicity; each layer in this is
almost trivial.
The downside to my proposal is it's more expensive if you want
to change a single key; but as yet no one has suggested why we'd
want to do it frequently enough to worry.

My suggestion of JSON was just to try and stop the bikeshedding;
we seem to be putting a lot of effort into inventing a new
storage definition for something that so far we need to hold
one rarely changing short string, which even with all the discussion
has moved up to maybe 3 or 4 rarely changing pieces of data.
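
To show what that cost looks like, here is a rough sketch of changing
one key under the Layer 0/1 scheme: the whole string is parsed,
modified, and written back as one replacement.  replace_key() is a
made-up helper for illustration, not an existing qemu or qemu-img
interface; the keys are the ones suggested in the proposal.

```python
import json


def replace_key(stored, key, value):
    """Return a new Layer 1 string with one key changed.

    The entire string is parsed and re-serialized; Layer 0 would then
    swap the old string for the new one in a single replace operation.
    """
    obj = json.loads(stored)
    if not isinstance(obj, dict):
        raise ValueError("Layer 1 requires a JSON object")
    if not key:
        raise ValueError("keys must be non-empty")
    obj[key] = value
    # Sorted output keeps the stored string stable across rewrites.
    return json.dumps(obj, sort_keys=True)


old = '{"qemu.machine-types": ["q35", "i440fx"], "qemu.min-ram-MB": 1024}'
new = replace_key(old, "qemu.min-ram-MB", 2048)
```

For a string this short the full rewrite is negligible; it would only
matter if updates were frequent or the string grew large.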

Dave


> 
> 
> > Layer 2:
> >    '.'s in the key string shall indicate hierarchy
> >    
> >    Key strings shall be listed in qemu's 
> >       docs/specs/qcow-keys.rst
> > 
> >       that shall indicate their meaning and the meaning and
> > >       valid formatting of the value associated with them.
> > 
> >    Key strings shall start with either:
> >       qemu.   in which case they must be listed in a file in
> >               the qemu source tree
> > 
> >       a reverse dotted name unique to the submitter, they may
> >               be listed in the same file in the source tree, e.g.
> >       com.redhat.
> > 
> 
> Based on this proposal for layer 2, it looks like you expect the
> number of keys used on layer 1 to become large.
> 
> I would prefer a solution that expects a very small set of keys
> for layer 0+1, and point to other specifications of how the blob
> can be interpreted for each key.  This way we can experiment with
> different solutions for layers 2-4, instead of deciding on a
> specific format like JSON.
> 
> 
> > Layer 3:
> > >    QEMU shall, for a given qcow2 file, be able to dump the
> > >    key values.
> > 
> > Layer 4:
> >    On creating a VM by importing a qcow2, a management layer
> >    shall inspect the key/values to influence the configuration
> >    of the VM created.   Where it imports multiple qcow2's it
> >    shall inspect all the files and flag disagreements.
> > 
> > >    Management layers shall, on creating a qcow2, set the
> > >    keys based on the VM the qcow2 is created for.  If the qcow2
> > >    is created as an additional disk for an existing VM, it's
> > >    fine to leave the string empty (e.g. for a data disk).
> > 
> > --------------------------------------------------------------
> >    
> > 
> > Some reasoning:
> >    a) I've avoided the problem of when QEMU interprets the value
> >       by ignoring it and giving it to management layers at the point
> >       of VM import.
> >    b) I hate JSON, but there again nailing down a fixed format
> >       seems easiest and it makes the job of QCOW easy - a single
> >       string.
> >       (I would suggest in layer2 that the keys are sorted, but
> >       that's a pain to do in some json creators)
> >    c) Forcing the registry of keys might avoid silly duplication.
> >       We can but hope.
> >    d) I've not said it's a libvirt XML file since that seems
> >       a bit prescriptive.
> > 
> > Some initial suggested keys:
> > 
> >    "qemu.machine-types": [ "q35", "i440fx" ]
> >    "qemu.min-ram-MB": 1024
> > 
> > 
> > Dave
> > 
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> > 
> 
> -- 
> Eduardo
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] storing machine data in qcow images?
  2018-05-29 10:14               ` Max Reitz
  2018-06-05  9:21                 ` Dr. David Alan Gilbert
@ 2018-06-06 10:32                 ` Michal Suchánek
  2018-06-06 11:02                   ` Max Reitz
  1 sibling, 1 reply; 157+ messages in thread
From: Michal Suchánek @ 2018-06-06 10:32 UTC (permalink / raw)
  To: Max Reitz
  Cc: Kevin Wolf, Richard W.M. Jones, qemu-devel, stefanha, ehabkost,
	qemu-block, Michael S. Tsirkin


On Tue, 29 May 2018 12:14:15 +0200
Max Reitz <mreitz@redhat.com> wrote:

> On 2018-05-29 08:44, Kevin Wolf wrote:
> > Am 28.05.2018 um 23:25 hat Richard W.M. Jones geschrieben:  
> >> On Mon, May 28, 2018 at 10:20:54PM +0100, Richard W.M. Jones
> >> wrote:  
> >>> On Mon, May 28, 2018 at 08:38:33PM +0200, Kevin Wolf wrote:  
> >>>> Just accessing the image file within a tar archive is possible
> >>>> and we could write a block driver for that (I actually think we
> >>>> should do this), but it restricts you because certain operations
> >>>> like resizing aren't really possible in tar. Unfortunately,
> >>>> resizing is a really common operation for non-raw image
> >>>> formats.  
> >>>
> >>> We do this already in virt-v2v (using file.offset and file.size
> >>> parameters in the raw driver).
> >>>
> >>> For virt-v2v we only need to read the source so resizing isn't an
> >>> issue.  For most of the cases we're talking about the downloaded
> >>> image would also be a template / base image, so I suppose only
> >>> reading would be required too.
> >>>
> >>> I also wrote an nbdkit tar file driver (supports writes, but not
> >>> resizing).
> >>> https://manpages.debian.org/testing/nbdkit-plugin-perl/nbdkit-tar-plugin.1.en.html  
> >>
> >> I should add the other thorny issue with OVA files is that the
> >> metadata contains a checksum (SHA1 or SHA256) of the disk images.
> >> If you modify the disk images in-place in the tar file then you
> >> need to recalculate those.  
> > 
> > All of this means that OVA isn't really well suited to be used as a
> > native format for VM configuration + images. It's just for sharing
> > read-only images that are converted into another native format
> > before they are used.
> > 
> > Which is probably fair for the use case it was made for, but means
> > that we need something else to solve our problem.  
> 
> Maybe we should first narrow down our problem.  Maybe you have done
> that already, but I'm quite in the dark still.
> 
> The original problem was that you need to supply a machine type to
> qemu, and that multiple common architectures now have multiple
> machine types and not necessarily all work with a single image.  So
> far so good, but I have two issues here already:
> 
> (1) How is qemu supposed to interpret that information?  If it's
> stored in the image file, I don't see a nice way of retrieving it
> before the machine is initialized, at least not with qemu's current
> architecture. Once we support configuring qemu solely through QMP,
> sure, you can do a blockdev-add and then build the machine
> accordingly.  But that is not here today, and I'm not sure this is a
> good idea either, because that would mean automagic defaults for the
> machine-building QMP commands derived from the blockdev-add earlier,
> which should get a plain "No". Also, having to use QMP to build your
> machine wouldn't make anything easier; at least not easier than just
> supplying a configuration file along with the image.
> 
> (Building the magic into -blockdev might be less horrible, but such
> magic (adding block devices influences machine defaults) to me still
> doesn't seem worth not having to supply a config file along with the
> disk image.)
> 
> (2) Again, I personally just really don't like saving such information
> in a disk image.  One actual argument I can bring up for that distaste
> is this: Suppose, you have multiple images attached to your VM.  Now
> the VM wants to store the machine type.  Where does it go?  Into all
> of them?  But some of those images may only contain data and might be
> intended to be shared between multiple VMs.  So those shouldn't
> receive the mark.  Only disks with binaries should receive them.
> But what if those binaries are just cross-compiled binaries for some
> other VM?  Oh no, so not even binaries are a sure indicator...  So I
> have no idea where the information is supposed to be stored.  In any
> case, "the first image" just gets an outright "no" from me, and "all
> images" gets an "I don't think this is a good idea".
> 
> Loading is fun, too.  OK, so you attach multiple disk images to a VM.
> Oops, they have varying machine type information...  Now what?  Use
> the information from the first one?  Definitely no.  Just ignore all
> of the information in such a case and have the user supply the
> machine type again?  Possible, but it seems weird to me that qemu
> would usually guess the machine type, but once you attach some random
> other image to it, it suddenly fails to do that.  But maybe it's just
> me who thinks this is weird.
> 
> 
> OK, so let's go a step further.  We have stored the machine type
> information in order to not have to supply a config file with the
> qcow2 image -- because if we did, it could just contain the machine
> type and that would be it.
> 
> So to me it follows naturally that just storing the machine type
> doesn't make much sense if we cannot also store more VM configuration
> in a qcow2 file, because I don't see why you should be able to ship
> an image without a config file only if all you need to supply is a
> machine type. Often, you also need to supply how much memory the VM
> needs (which depends on the OS on the image) or what storage
> controller to use (does the OS have virtio drivers? (to be fair, it
> usually does, because you're supplying a VM image in the first
> place)).
> 
> So I think if we decide to store the machine type, that is kind of a
> slippery slope and then there are good arguments for storing even more
> configuration options in the file, too.  But I really, really don't
> like that.
> 
> For one thing, I suspect it to get really ugly implementation-wise.
> Getting the machine type out of a disk image and actually interpreting
> it automatically is bad enough, but getting possibly everything out of
> it?  It's not going to be any better.
> 
> For another, how do we store the data?  key-value seems wrong if we
> want to store everything.  JSON might be fine.  But eventually we
> just want basically a qemu configuration file in there, I would think
> (which may support JSON at some point?).   So basically we would
> store the data as a binary blob and let the rest of qemu do its thing
> with it.  But then please tell me why I fought so valiantly against
> storing random bitmaps in qcow2 files.  

Yes, I wonder. Why did you?

> I hate the idea of making qcow2 a random archive format.

What's wrong with that?

> We have tar for that.

It does not support expanding the stored files.

> 
> 
> Unless I have got something terribly wrong (which is indeed a
> possibility!), to me this proposal means basically to turn qcow2 into
> (1) a VM description format for qemu, and (2) to turn it into an
> archive format on the way.

And if you go all the way you can store multiple disks along with the
VM definition so you can have the whole appliance in one file. It
conveniently solves the problem of synchronizing snapshots across
multiple disk images and the question where to store the machine state
if you want to suspend it. 

Thanks

Michal



* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 10:32                 ` [Qemu-devel] " Michal Suchánek
@ 2018-06-06 11:02                   ` Max Reitz
  2018-06-06 11:19                     ` Michal Suchánek
                                       ` (2 more replies)
  0 siblings, 3 replies; 157+ messages in thread
From: Max Reitz @ 2018-06-06 11:02 UTC (permalink / raw)
  To: Michal Suchánek
  Cc: Kevin Wolf, Richard W.M. Jones, qemu-devel, stefanha, ehabkost,
	qemu-block, Michael S. Tsirkin


On 2018-06-06 12:32, Michal Suchánek wrote:
> On Tue, 29 May 2018 12:14:15 +0200
> Max Reitz <mreitz@redhat.com> wrote:
> 
>> On 2018-05-29 08:44, Kevin Wolf wrote:
>>> Am 28.05.2018 um 23:25 hat Richard W.M. Jones geschrieben:  
>>>> On Mon, May 28, 2018 at 10:20:54PM +0100, Richard W.M. Jones
>>>> wrote:  
>>>>> On Mon, May 28, 2018 at 08:38:33PM +0200, Kevin Wolf wrote:  
>>>>>> Just accessing the image file within a tar archive is possible
>>>>>> and we could write a block driver for that (I actually think we
>>>>>> should do this), but it restricts you because certain operations
>>>>>> like resizing aren't really possible in tar. Unfortunately,
>>>>>> resizing is a really common operation for non-raw image
>>>>>> formats.  
>>>>>
>>>>> We do this already in virt-v2v (using file.offset and file.size
>>>>> parameters in the raw driver).
>>>>>
>>>>> For virt-v2v we only need to read the source so resizing isn't an
>>>>> issue.  For most of the cases we're talking about the downloaded
>>>>> image would also be a template / base image, so I suppose only
>>>>> reading would be required too.
>>>>>
>>>>> I also wrote an nbdkit tar file driver (supports writes, but not
>>>>> resizing).
>>>>> https://manpages.debian.org/testing/nbdkit-plugin-perl/nbdkit-tar-plugin.1.en.html  
>>>>
>>>> I should add the other thorny issue with OVA files is that the
>>>> metadata contains a checksum (SHA1 or SHA256) of the disk images.
>>>> If you modify the disk images in-place in the tar file then you
>>>> need to recalculate those.  
>>>
>>> All of this means that OVA isn't really well suited to be used as a
>>> native format for VM configuration + images. It's just for sharing
>>> read-only images that are converted into another native format
>>> before they are used.
>>>
>>> Which is probably fair for the use case it was made for, but means
>>> that we need something else to solve our problem.  
>>
>> Maybe we should first narrow down our problem.  Maybe you have done
>> that already, but I'm quite in the dark still.
>>
>> The original problem was that you need to supply a machine type to
>> qemu, and that multiple common architectures now have multiple
>> machine types and not necessarily all work with a single image.  So
>> far so good, but I have two issues here already:
>>
>> (1) How is qemu supposed to interpret that information?  If it's
>> stored in the image file, I don't see a nice way of retrieving it
>> before the machine is initialized, at least not with qemu's current
>> architecture. Once we support configuring qemu solely through QMP,
>> sure, you can do a blockdev-add and then build the machine
>> accordingly.  But that is not here today, and I'm not sure this is a
>> good idea either, because that would mean automagic defaults for the
>> machine-building QMP commands derived from the blockdev-add earlier,
>> which should get a plain "No". Also, having to use QMP to build your
>> machine wouldn't make anything easier; at least not easier than just
>> supplying a configuration file along with the image.
>>
>> (Building the magic into -blockdev might be less horrible, but such
>> magic (adding block devices influences machine defaults) to me still
>> doesn't seem worth not having to supply a config file along with the
>> disk image.)
>>
>> (2) Again, I personally just really don't like saving such information
>> in a disk image.  One actual argument I can bring up for that distaste
>> is this: Suppose, you have multiple images attached to your VM.  Now
>> the VM wants to store the machine type.  Where does it go?  Into all
>> of them?  But some of those images may only contain data and might be
>> intended to be shared between multiple VMs.  So those shouldn't
>> receive the mark.  Only disks with binaries should receive them.
>> But what if those binaries are just cross-compiled binaries for some
>> other VM?  Oh no, so not even binaries are a sure indicator...  So I
>> have no idea where the information is supposed to be stored.  In any
>> case, "the first image" just gets an outright "no" from me, and "all
>> images" gets an "I don't think this is a good idea".
>>
>> Loading is fun, too.  OK, so you attach multiple disk images to a VM.
>> Oops, they have varying machine type information...  Now what?  Use
>> the information from the first one?  Definitely no.  Just ignore all
>> of the information in such a case and have the user supply the
>> machine type again?  Possible, but it seems weird to me that qemu
>> would usually guess the machine type, but once you attach some random
>> other image to it, it suddenly fails to do that.  But maybe it's just
>> me who thinks this is weird.
>>
>>
>> OK, so let's go a step further.  We have stored the machine type
>> information in order to not have to supply a config file with the
>> qcow2 image -- because if we did, it could just contain the machine
>> type and that would be it.
>>
>> So to me it follows naturally that just storing the machine type
>> doesn't make much sense if we cannot also store more VM configuration
>> in a qcow2 file, because I don't see why you should be able to ship
>> an image without a config file only if all you need to supply is a
>> machine type. Often, you also need to supply how much memory the VM
>> needs (which depends on the OS on the image) or what storage
>> controller to use (does the OS have virtio drivers? (to be fair, it
>> usually does, because you're supplying a VM image in the first
>> place)).
>>
>> So I think if we decide to store the machine type, that is kind of a
>> slippery slope and then there are good arguments for storing even more
>> configuration options in the file, too.  But I really, really don't
>> like that.
>>
>> For one thing, I suspect it to get really ugly implementation-wise.
>> Getting the machine type out of a disk image and actually interpreting
>> it automatically is bad enough, but getting possibly everything out of
>> it?  It's not going to be any better.
>>
>> For another, how do we store the data?  key-value seems wrong if we
>> want to store everything.  JSON might be fine.  But eventually we
>> just want basically a qemu configuration file in there, I would think
>> (which may support JSON at some point?).   So basically we would
>> store the data as a binary blob and let the rest of qemu do its thing
>> with it.  But then please tell me why I fought so valiantly against
>> storing random bitmaps in qcow2 files.  
> 
> Yes, I wonder. Why did you?

That was mostly directed at Kevin.

My reasoning was that a qcow2 file is a disk image.  All data stored
therein should be immediately associated with the stored data.  Another
reason was that from the perspective of qcow2 you don't lose anything by
tying the bitmaps directly to that data; all we lost was the capability
of storing bitmaps for unrelated raw files.

(And the reasoning for that is "if you want features, use qcow2" --
although R/W backing files may loosen that phrase.)

>> I hate the idea of making qcow2 a random archive format.
> 
> What's wrong with that?

The fact that qcow2 isn't.

From my perspective it would increase the format's complexity to a point
where you could just create a new format altogether.  Well, actually,
all you do is design a filesystem (or reuse an existing one).

>> We have tar for that.
> 
> It does not support expanding the stored files.

Nor does qcow2, because it does not support storing files at all.

Secondly, that completely depends on how you use it.  You can freely
expand the last file in the archive, for instance.  Also I've seen
people store files in chunks so they can indeed resize it.

(I'm wondering if we could write a block driver that could provide such
a chunk allocation transparently to qcow2...  Note that a qcow2 file
does not need to be contiguous, so you could in theory indeed store the
qcow2 file and its data in completely separate places in a tar file.)

What I'm trying to get at is that qcow2 was not designed to be a
container format for arbitrary files.  If you want to make it such, I'm
sure there are existing formats that work better.

>> Unless I have got something terribly wrong (which is indeed a
>> possibility!), to me this proposal means basically to turn qcow2 into
>> (1) a VM description format for qemu, and (2) to turn it into an
>> archive format on the way.
> 
> And if you go all the way you can store multiple disks along with the
> VM definition so you can have the whole appliance in one file. It
> conveniently solves the problem of synchronizing snapshots across
> multiple disk images and the question where to store the machine state
> if you want to suspend it. 

Yeah, but why make qcow2 that format?  That's what I completely fail to
understand.

If you want to have a single VM description file that contains the VM
configuration and some qcow2/raw/whatever files along with it for the
guest disk data, sure, go ahead.  But why does the format of the whole
thing need to be qcow2?

Max




* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-05  9:21                 ` Dr. David Alan Gilbert
  2018-06-05 19:03                   ` Eduardo Habkost
@ 2018-06-06 11:02                   ` Max Reitz
  2018-06-06 11:14                     ` Dr. David Alan Gilbert
  2018-06-06 11:22                     ` [Qemu-devel] [Qemu-block] " Peter Krempa
  1 sibling, 2 replies; 157+ messages in thread
From: Max Reitz @ 2018-06-06 11:02 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Kevin Wolf, Richard W.M. Jones, qemu-block, Michael S. Tsirkin,
	qemu-devel, stefanha


On 2018-06-05 11:21, Dr. David Alan Gilbert wrote:
> <reawakening a fizzled out thread>
> 
> This seems to have fizzled out because of a lack of a concrete proposal;
> so here is one based on a reply to Max's post:
> 
> * Max Reitz (mreitz@redhat.com) wrote:
> 
> <snip>
> 
>> The original problem was that you need to supply a machine type to qemu,
>> and that multiple common architectures now have multiple machine types
>> and not necessarily all work with a single image.  So far so good, but I
>> have two issues here already:
>>
>> (1) How is qemu supposed to interpret that information?  If it's stored
>> in the image file, I don't see a nice way of retrieving it before the
>> machine is initialized, at least not with qemu's current architecture.
> 
> <snip>
> 
>> (2) Again, I personally just really don't like saving such information
>> in a disk image.  One actual argument I can bring up for that distaste
>> is this: Suppose, you have multiple images attached to your VM.  Now the
>> VM wants to store the machine type.  Where does it go?  Into all of
>> them?
> 
> <snip>
> 
>> So I think if we decide to store the machine type, that is kind of a
>> slippery slope and then there are good arguments for storing even more
>> configuration options in the file, too.  But I really, really don't like
>> that.
> 
> <snip>
> 
>> For another, how do we store the data?  key-value seems wrong if we want
>> to store everything.  JSON might be fine.  But eventually we just want
>> basically a qemu configuration file in there, I would think (which may
>> support JSON at some point?).   So basically we would store the data as
>> a binary blob and let the rest of qemu do its thing with it.  But then
>> please tell me why I fought so valiantly against storing random bitmaps
>> in qcow2 files.  I hate the idea of making qcow2 a random archive
>> format.  We have tar for that.
> 
> <snip>
> 
>> tl;dr: I really don't get why it's so hard to supply a config file along
>> with a qcow2 image.  Is it so hard for people to realize that a VM does
>> not only consist of a disk?
> 
> Yes! Because in many cases that's all it needs, and it's ready to run
> with no unpacking.

It clearly is not, or we would not have this discussion.

The disk image is only enough if you want the default values for all of
qemu's configuration options, because today (and if I were to decide, in
the future, too) disk images do not configure the VM (well, they
configure the guest, but not the VM itself).

> I think we should have:
> 
> --------------------------------------------------------------
> Layer 0:
>    QCOW provides a way to store a single string of arbitrary (but
> limited?) length.
>    QCOW provides a way to replace the string by a new string.
>    The original or the new string will be stored after that;
>    never some mix.
>    Where a file 'b' has a backing file 'a', 'b' inherits the
>    string from 'a' unless 'b' has its own string.
>    Snapshots inherit their string from the main unless they have
>    their own string.
> 
> Layer 1:
>    The string shall always be a JSON 'object'; i.e. of the form
>     { "something": ... , "more": ... }
> 
>    The key strings shall be non-null and non-empty and shall
>    be unique.
> 
> Layer 2:
>    '.'s in the key string shall indicate hierarchy

I don't understand why we'd need dotted syntax when we already have
JSON, but that's not my issue.

>    Key strings shall be listed in qemu's 
>       docs/specs/qcow-keys.rst
> 
>       that shall indicate their meaning and the meaning and
>       valid formatting of the value associated with them.
> 
>    Key strings shall start with either:
>       qemu.   in which case they must be listed in a file in
>               the qemu source tree
> 
>       a reverse dotted name unique to the submitter, they may
>               be listed in the same file in the source tree, e.g.
>       com.redhat.

So this is just another configuration file format.

> Layer 3:
>    QEMU shall, for a given qcow2 file, be able to dump the
>    key values.
> 
> Layer 4:
>    On creating a VM by importing a qcow2, a management layer
>    shall inspect the key/values to influence the configuration
>    of the VM created.   Where it imports multiple qcow2's it
>    shall inspect all the files and flag disagreements.
> 
>    Management layers shall, on creating a qcow2, set the
>    keys based on the VM the qcow2 is created for.  If the qcow2
>    is created as an additional disk for an existing VM, it's
>    fine to leave the string empty (e.g. for a data disk).

This at least solves the issue of where qemu should store the data (qemu
doesn't care), and how qemu should interpret it (not at all).
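
As a sketch of what "inspect all the files and flag disagreements"
could mean for a management layer (merge_image_metadata() is a
hypothetical helper, and the input is assumed to be the Layer 1 JSON
string already read out of each image by some other means):

```python
import json


def merge_image_metadata(strings):
    """Combine per-image JSON strings and report conflicting keys.

    `strings` holds one Layer 1 string per imported image; an empty
    string means the image carries no metadata (e.g. a data disk).
    Returns (merged, conflicts): keys all images agree on, and keys
    with every differing value seen.
    """
    merged = {}
    conflicts = {}
    for s in strings:
        if not s:
            continue
        for key, value in json.loads(s).items():
            if key in conflicts:
                conflicts[key].append(value)
            elif key in merged and merged[key] != value:
                # First disagreement: move the key out of `merged`.
                conflicts[key] = [merged.pop(key), value]
            else:
                merged[key] = value
    return merged, conflicts


boot = '{"qemu.machine-types": ["q35"], "qemu.min-ram-MB": 1024}'
data = ''
other = '{"qemu.machine-types": ["i440fx"], "qemu.min-ram-MB": 1024}'
merged, conflicts = merge_image_metadata([boot, data, other])
```

What to do with a non-empty conflict set (refuse, ask the user, pick
one) is a policy decision left to the management layer.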

But I really, really, really do not like storing arbitrary data in qcow2
files.  I hated it badly enough when qemu knew what to do with it, but I
hate it even more when even qemu has no idea what to do with it.

Having a specification of what everything means in the qemu tree makes
things less unbearable, but still not to my liking.

> --------------------------------------------------------------
>    
> 
> Some reasoning:
>    a) I've avoided the problem of when QEMU interprets the value
>       by ignoring it and giving it to management layers at the point
>       of VM import.

Yes, but in the process you've made it completely opaque to qemu,
basically, which doesn't really make it better for me.  Not that
qemu-specific information in qcow2 files would be what I want, but, well.

But it does solve technical issues, I concede that.

>    b) I hate JSON, but there again nailing down a fixed format
>       seems easiest and it makes the job of QCOW easy - a single
>       string.

Not really.  The string can be rather long, so you probably don't want
to store it in the image header, and thus it's just a binary blob from
qcow2's perspective, essentially.

>       (I would suggest in layer2 that the keys are sorted, but
>       that's a pain to do in some json creators)
>    c) Forcing the registry of keys might avoid silly duplication.
>       We can but hope.
>    d) I've not said it's a libvirt XML file since that seems
>       a bit prescriptive.
> 
> Some initial suggested keys:
> 
>    "qemu.machine-types": [ "q35", "i440fx" ]
>    "qemu.min-ram-MB": 1024

I still don't understand why you'd want to put the configuration into
qcow2 instead of the other way around.

Or why you'd want to use a single file at all, because as this whole
thread shows, a disk image alone is clearly not sufficient to describe a VM.

(Or it may be in simple cases, but then that's because you don't need
any configuration.)

Max



^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 11:02                   ` Max Reitz
@ 2018-06-06 11:14                     ` Dr. David Alan Gilbert
  2018-06-06 11:26                       ` Max Reitz
  2018-06-06 11:42                       ` Richard W.M. Jones
  2018-06-06 11:22                     ` [Qemu-devel] [Qemu-block] " Peter Krempa
  1 sibling, 2 replies; 157+ messages in thread
From: Dr. David Alan Gilbert @ 2018-06-06 11:14 UTC (permalink / raw)
  To: Max Reitz
  Cc: Kevin Wolf, Richard W.M. Jones, armbru, qemu-block,
	Michael S. Tsirkin, qemu-devel, stefanha

* Max Reitz (mreitz@redhat.com) wrote:
> On 2018-06-05 11:21, Dr. David Alan Gilbert wrote:
> > <reawakening a fizzled out thread>
> > 
> > This seems to have fizzled out because of a lack of a concrete proposal;
> > so here is one based on a reply to Max's post:
> > 
> > * Max Reitz (mreitz@redhat.com) wrote:
> > 
> > <snip>
> > 
> >> The original problem was that you need to supply a machine type to qemu,
> >> and that multiple common architectures now have multiple machine types
> >> and not necessarily all work with a single image.  So far so good, but I
> >> have two issues here already:
> >>
> >> (1) How is qemu supposed to interpret that information?  If it's stored
> >> in the image file, I don't see a nice way of retrieving it before the
> >> machine is initialized, at least not with qemu's current architecture.
> > 
> > <snip>
> > 
> >> (2) Again, I personally just really don't like saving such information
> >> in a disk image.  One actual argument I can bring up for that distaste
> >> is this: Suppose, you have multiple images attached to your VM.  Now the
> >> VM wants to store the machine type.  Where does it go?  Into all of
> >> them?
> > 
> > <snip>
> > 
> >> So I think if we decide to store the machine type, that is kind of a
> >> slippery slope and then there are good arguments for storing even more
> >> configuration options in the file, too.  But I really, really don't like
> >> that.
> > 
> > <snip>
> > 
> >> For another, how do we store the data?  key-value seems wrong if we want
> >> to store everything.  JSON might be fine.  But eventually we just want
> >> basically a qemu configuration file in there, I would think (which may
> >> support JSON at some point?).   So basically we would store the data as
> >> a binary blob and let the rest of qemu do its thing with it.  But then
> >> please tell me why I fought so valiantly against storing random bitmaps
> >> in qcow2 files.  I hate the idea of making qcow2 a random archive
> >> format.  We have tar for that.
> > 
> > <snip>
> > 
> >> tl;dr: I really don't get why it's so hard to supply a config file along
> >> with a qcow2 image.  Is it so hard for people to realize that a VM does
> >> not only consist of a disk?
> > 
> > Yes! Because in many cases that's all it needs, and it's ready to run
> > with no unpacking.
> 
> It clearly is not, or we would not have this discussion.
> 
> The disk image is only enough if you want the default values for all of
> qemu's configuration options, because today (and if I were to decide, in
> the future, too) disk images do not configure the VM (well, they
> configure the guest, but not the VM itself).

The problem with having a separate file is that you either have to copy
it around with the image or have an archive.  If you have an archive
you have to have an unpacking step which then copies potentially a lot
of data, taking some reasonable amount of time.  Storing a simple bit
of data with the image avoids that.

> > I think we should have:
> > 
> > --------------------------------------------------------------
> > Layer 0:
> >    QCOW provides a way to store a single string of arbitrary (but
> > limited?) length.
> >    QCOW provides a way to replace the string by a new string.
> >    The original or the new string will be stored after that;
> >    never some mix.
> >    Where a file 'b' has a backing file 'a', 'b' inherits the
> >    string from 'a' unless 'b' has its own string.
> >    Snapshots inherit their string from the main unless they have
> >    their own string.
> > 
> > Layer 1:
> >    The string shall always be a JSON 'object'; i.e. of the form
> >     { "something": ... , "more": ... }
> > 
> >    The key strings shall be non-null and non-empty and shall
> >    be unique.
> > 
> > Layer 2:
> >    '.'s in the key string shall indicate hierarchy
> 
> > I don't understand why we'd need dotted syntax when we already have
> JSON, but that's not my issue.

I think someone earlier in the thread had asked about how we handled
hierarchy so I added it.
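
A minimal sketch of how the dotted keys in Layer 2 could map onto a
hierarchy, assuming the string has already been parsed into a flat
dictionary.  The name expand_dotted_keys is hypothetical and not part
of qemu:

```python
def expand_dotted_keys(flat):
    """Expand {"qemu.min-ram-MB": 1024} into {"qemu": {"min-ram-MB": 1024}}."""
    tree = {}
    for key, value in flat.items():
        parts = key.split(".")
        node = tree
        # Walk (and create) the intermediate levels named by the key.
        for part in parts[:-1]:
            node = node.setdefault(part, {})
            if not isinstance(node, dict):
                raise ValueError(f"key {key!r} descends into a non-object value")
        leaf = parts[-1]
        if leaf in node:
            raise ValueError(f"key {key!r} collides with an existing entry")
        node[leaf] = value
    return tree
```

With the suggested keys from the proposal, "qemu.machine-types" and
"qemu.min-ram-MB" would both end up under a single "qemu" object.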

> >    Key strings shall be listed in qemu's 
> >       docs/specs/qcow-keys.rst
> > 
> >       that shall indicate their meaning and the meaning and
> >       valid formatting of the value associated with them.
> > 
> >    Key strings shall start with either:
> >       qemu.   in which case they must be listed in a file in
> >               the qemu source tree
> > 
> >       a reverse dotted name unique to the submitter, they may
> >               be listed in the same file in the source tree, e.g.
> >       com.redhat.
> 
> So this is just another configuration file format.
> 
> > Layer 3:
> >    QEMU shall, for a given qcow2 file be able to dump the
> >    key values.
> > 
> > Layer 4:
> >    On creating a VM by importing a qcow2, a management layer
> >    shall inspect the key/values to influence the configuration
> >    of the VM created.   Where it imports multiple qcow2's it
> >    shall inspect all the files and flag disagreements.
> > 
> >    Management layers shall, on creating a qcow2, set the
> >    keys based on the VM the qcow2 is created for.  If the qcow2
> >    is created as an additional disk for an existing VM it's
> >    fine to leave the string empty (e.g. for a data disk).
> 
> This at least solves the issue of where qemu should store the data (qemu
> doesn't care), and how qemu should interpret it (not at all).
> 
> But I really, really, really do not like storing arbitrary data in qcow2
> files.  I hated it badly enough when qemu knew what to do with it, but I
> hate it even more when even qemu has no idea what to do with it.
> 
> Having a specification of what everything means in the qemu tree makes
> things less unbearable, but not to my liking still.

Have you said why you hate it so much?
Your hate for it seems to be making a simple solution hard.

> > --------------------------------------------------------------
> >    
> > 
> > Some reasoning:
> >    a) I've avoided the problem of when QEMU interprets the value
> >       by ignoring it and giving it to management layers at the point
> >       of VM import.
> 
> Yes, but in the process you've made it completely opaque to qemu,
> basically, which doesn't really make it better for me.  Not that
> qemu-specific information in qcow2 files would be what I want, but, well.
> 
> But it does solve technical issues, I concede that.
> 
> >    b) I hate JSON, but there again nailing down a fixed format
> >       seems easiest and it makes the job of QCOW easy - a single
> >       string.
> 
> Not really.  The string can be rather long, so you probably don't want
> to store it in the image header, and thus it's just a binary blob from
> qcow2's perspective, essentially.

Yes, but it's a single blob - I'm not asking for multiple keyed blobs
or the ability to update individual blobs; just one blob that I can
replace.
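
As a rough illustration of how little a consumer of that single blob
would need, here is a sketch that checks the Layer 1 and Layer 2 rules
quoted above.  parse_metadata is a made-up name, and the prefix check
only approximates the "qemu. or reverse dotted name" rule by requiring
at least two non-empty dot-separated parts:

```python
import json


def _reject_duplicates(pairs):
    # json.loads calls this for every JSON object with its (key, value)
    # pairs in document order, so repeated keys can be detected here.
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key: {key!r}")
        seen[key] = value
    return seen


def parse_metadata(blob):
    """Parse the metadata string and enforce the basic key rules."""
    obj = json.loads(blob, object_pairs_hook=_reject_duplicates)
    if not isinstance(obj, dict):
        raise ValueError("top level must be a JSON object")
    for key in obj:
        segments = key.split(".")
        # An empty key, a key without a prefix, or an empty segment fails.
        if len(segments) < 2 or not all(segments):
            raise ValueError(f"key {key!r} lacks a dotted prefix")
    return obj
```

Replacing the blob then amounts to writing a new, complete string; no
per-key update is needed.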

> >       (I would suggest in layer2 that the keys are sorted, but
> >       that's a pain to do in some json creators)
> >    c) Forcing the registry of keys might avoid silly duplication.
> >       We can but hope.
> >    d) I've not said it's a libvirt XML file since that seems
> >       a bit prescriptive.
> > 
> > Some initial suggested keys:
> > 
> >    "qemu.machine-types": [ "q35", "i440fx" ]
> >    "qemu.min-ram-MB": 1024
> 
> I still don't understand why you'd want to put the configuration into
> qcow2 instead of the other way around.
> 
> Or why you'd want to use a single file at all, because as this whole
> thread shows, a disk image alone is clearly not sufficient to describe a VM.
> 
> (Or it may be in simple cases, but then that's because you don't need
> any configuration.)

Because it avoids the unpacking associated with archives.

Dave

> Max
> 



--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 11:02                   ` Max Reitz
@ 2018-06-06 11:19                     ` Michal Suchánek
  2018-06-06 11:32                       ` Max Reitz
  2018-06-06 11:40                     ` Richard W.M. Jones
  2018-06-06 14:43                     ` Michael S. Tsirkin
  2 siblings, 1 reply; 157+ messages in thread
From: Michal Suchánek @ 2018-06-06 11:19 UTC (permalink / raw)
  To: Max Reitz
  Cc: Kevin Wolf, Richard W.M. Jones, qemu-devel, stefanha, ehabkost,
	qemu-block, Michael S. Tsirkin

On Wed, 6 Jun 2018 13:02:53 +0200
Max Reitz <mreitz@redhat.com> wrote:

> On 2018-06-06 12:32, Michal Suchánek wrote:
> > On Tue, 29 May 2018 12:14:15 +0200
> > Max Reitz <mreitz@redhat.com> wrote:
> >   
> >> On 2018-05-29 08:44, Kevin Wolf wrote:  
> >>> Am 28.05.2018 um 23:25 hat Richard W.M. Jones geschrieben:    
> >>>> On Mon, May 28, 2018 at 10:20:54PM +0100, Richard W.M. Jones
> >>>> wrote:    
> >>>>> On Mon, May 28, 2018 at 08:38:33PM +0200, Kevin Wolf wrote:    
> >>>>>> Just accessing the image file within a tar archive is possible
> >>>>>> and we could write a block driver for that (I actually think we
> >>>>>> should do this), but it restricts you because certain
> >>>>>> operations like resizing aren't really possible in tar.
> >>>>>> Unfortunately, resizing is a really common operation for
> >>>>>> non-raw image formats.    
> >>>>>
> >>>>> We do this already in virt-v2v (using file.offset and file.size
> >>>>> parameters in the raw driver).
> >>>>>
> >>>>> For virt-v2v we only need to read the source so resizing isn't
> >>>>> an issue.  For most of the cases we're talking about the
> >>>>> downloaded image would also be a template / base image, so I
> >>>>> suppose only reading would be required too.
> >>>>>
> >>>>> I also wrote an nbdkit tar file driver (supports writes, but not
> >>>>> resizing).
> >>>>> https://manpages.debian.org/testing/nbdkit-plugin-perl/nbdkit-tar-plugin.1.en.html    
> >>>>
> >>>> I should add the other thorny issue with OVA files is that the
> >>>> metadata contains a checksum (SHA1 or SHA256) of the disk images.
> >>>> If you modify the disk images in-place in the tar file then you
> >>>> need to recalculate those.    
> >>>
> >>> All of this means that OVA isn't really well suited to be used as
> >>> a native format for VM configuration + images. It's just for
> >>> sharing read-only images that are converted into another native
> >>> format before they are used.
> >>>
> >>> Which is probably fair for the use case it was made for, but means
> >>> that we need something else to solve our problem.    
> >>
> >> Maybe we should first narrow down our problem.  Maybe you have done
> >> that already, but I'm quite in the dark still.
> >>
> >> The original problem was that you need to supply a machine type to
> >> qemu, and that multiple common architectures now have multiple
> >> machine types and not necessarily all work with a single image.  So
> >> far so good, but I have two issues here already:
> >>
> >> (1) How is qemu supposed to interpret that information?  If it's
> >> stored in the image file, I don't see a nice way of retrieving it
> >> before the machine is initialized, at least not with qemu's current
> >> architecture. Once we support configuring qemu solely through QMP,
> >> sure, you can do a blockdev-add and then build the machine
> >> accordingly.  But that is not here today, and I'm not sure this is
> >> a good idea either, because that would mean automagic defaults for
> >> the machine-building QMP commands derived from the blockdev-add
> >> earlier, which should get a plain "No". Also, having to use QMP to
> >> build your machine wouldn't make anything easier; at least not
> >> easier than just supplying a configuration file along with the
> >> image.
> >>
> >> (Building the magic into -blockdev might be less horrible, but such
> >> magic (adding block devices influences machine defaults) to me
> >> still doesn't seem worth not having to supply a config file along
> >> with the disk image.)
> >>
> >> (2) Again, I personally just really don't like saving such
> >> information in a disk image.  One actual argument I can bring up
> >> for that distaste is this: Suppose, you have multiple images
> >> attached to your VM.  Now the VM wants to store the machine type.
> >> Where does it go?  Into all of them?  But some of those images may
> >> only contain data and might be intended to be shared between
> >> multiple VMs.  So those shouldn't receive the mark.  Only disks
> >> with binaries should receive them. But what if those binaries are
> >> just cross-compiled binaries for some other VM?  Oh no, so not
> >> even binaries are a sure indicator...  So I have no idea where the
> >> information is supposed to be stored.  In any case, "the first
> >> image" just gets an outright "no" from me, and "all images" gets
> >> an "I don't think this is a good idea".
> >>
> >> Loading is fun, too.  OK, so you attach multiple disk images to a
> >> VM. Oops, they have varying machine type information...  Now
> >> what?  Use the information from the first one?  Definitely no.
> >> Just ignore all of the information in such a case and have the
> >> user supply the machine type again?  Possible, but it seems weird
> >> to me that qemu would usually guess the machine type, but once you
> >> attach some random other image to it, it suddenly fails to do
> >> that.  But maybe it's just me who thinks this is weird.
> >>
> >>
> >> OK, so let's go a step further.  We have stored the machine type
> >> information in order to not have to supply a config file with the
> >> qcow2 image -- because if we did, it could just contain the machine
> >> type and that would be it.
> >>
> >> So to me it follows naturally that just storing the machine type
> >> doesn't make much sense if we cannot also store more VM
> >> configuration in a qcow2 file, because I don't see why you should
> >> be able to ship an image without a config file only if all you
> >> need to supply is a machine type. Often, you also need to supply
> >> how much memory the VM needs (which depends on the OS on the
> >> image) or what storage controller to use (does the OS have virtio
> >> drivers? (to be fair, it usually does, because you're supplying a
> >> VM image in the first place)).
> >>
> >> So I think if we decide to store the machine type, that is kind of
> >> a slippery slope and then there are good arguments for storing
> >> even more configuration options in the file, too.  But I really,
> >> really don't like that.
> >>
> >> For one thing, I suspect it to get really ugly implementation-wise.
> >> Getting the machine type out of a disk image and actually
> >> interpreting it automatically is bad enough, but getting possibly
> >> everything out of it?  It's not going to be any better.
> >>
> >> For another, how do we store the data?  key-value seems wrong if we
> >> want to store everything.  JSON might be fine.  But eventually we
> >> just want basically a qemu configuration file in there, I would
> >> think (which may support JSON at some point?).   So basically we
> >> would store the data as a binary blob and let the rest of qemu do
> >> its thing with it.  But then please tell me why I fought so
> >> valiantly against storing random bitmaps in qcow2 files.    
> > 
> > Yes, I wonder. Why did you?  
> 
> That was mostly directed at Kevin.
> 
> My reasoning was that a qcow2 file is a disk image.  All data stored
> therein should be immediately associated with the stored data.
> Another reason was that from the perspective of qcow2 you don't lose
> anything by tying the bitmaps directly to that data; all we lost was
> the capability of storing bitmaps for unrelated raw files.
> 
> (And the reasoning for that is "if you want features, use qcow2" --
> although R/W backing files may loosen that phrase.)
> 
> >> I hate the idea of making qcow2 a random archive format.  
> > 
> > What's wrong with that?  
> 
> The fact that qcow2 isn't.
> 
> From my perspective it would increase the format's complexity to a
> point where you could just create a new format altogether.  Well,
> actually, all you do is design a filesystem (or reuse an existing
> one).
> 
> >> We have tar for that.  
> > 
> > It does not support expanding the stored files.  
> 
> Nor does qcow2, because it does not support storing files at all.

AFAICT from the previous discussion it already does allow storing
multiple data streams that can be changed independently, so it basically
is an archive format or filesystem, except the streams are neither named
nor easily accessible separately outside of qemu.

> 
> Secondly, that completely depends on how you use it.  You can freely
> expand the last file in the archive, for instance.  Also I've seen
> people store files in chunks so they can indeed resize it.
> 
> (I'm wondering if we could write a block driver that could provide
> such a chunk allocation transparently to qcow2...  Note that a qcow2
> file does not need to be continuous, so you could in theory indeed
> store the qcow2 file and its data in completely separate places in a
> tar file.)

Which basically invents another new filesystem on top of tar for no
good reason. Especially when we already have support for a storage
format that is capable enough.

> 
> What I'm trying to get at is that qcow2 was not designed to be a
> container format for arbitrary files.  If you want to make it such,
> I'm sure there are existing formats that work better.

Such as?

> 
> >> Unless I have got something terribly wrong (which is indeed a
> >> possibility!), to me this proposal means basically to turn qcow2
> >> into (1) a VM description format for qemu, and (2) to turn it into
> >> an archive format on the way.  
> > 
> > And if you go all the way you can store multiple disks along with
> > the VM definition so you can have the whole appliance in one file.
> > It conveniently solves the problem of synchronizing snapshots across
> > multiple disk images and the question where to store the machine
> > state if you want to suspend it.   
> 
> Yeah, but why make qcow2 that format?  That's what I completely fail
> to understand.
> 
> If you want to have a single VM description file that contains the VM
> configuration and some qcow2/raw/whatever files along with it for the
> guest disk data, sure, go ahead.  But why does the format of the whole
> thing need to be qcow2?

Because then qemu can access the disk data from the image directly
without any need for extraction, copying to a different file, etc.

Thanks

Michal


^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] [Qemu-block] storing machine data in qcow images?
  2018-06-06 11:02                   ` Max Reitz
  2018-06-06 11:14                     ` Dr. David Alan Gilbert
@ 2018-06-06 11:22                     ` Peter Krempa
  1 sibling, 0 replies; 157+ messages in thread
From: Peter Krempa @ 2018-06-06 11:22 UTC (permalink / raw)
  To: Max Reitz
  Cc: Dr. David Alan Gilbert, Kevin Wolf, qemu-block,
	Michael S. Tsirkin, qemu-devel, Richard W.M. Jones, stefanha

On Wed, Jun 06, 2018 at 13:02:56 +0200, Max Reitz wrote:
> On 2018-06-05 11:21, Dr. David Alan Gilbert wrote:

[...]

> >       (I would suggest in layer2 that the keys are sorted, but
> >       that's a pain to do in some json creators)
> >    c) Forcing the registry of keys might avoid silly duplication.
> >       We can but hope.
> >    d) I've not said it's a libvirt XML file since that seems
> >       a bit prescriptive.
> > 
> > Some initial suggested keys:
> > 
> >    "qemu.machine-types": [ "q35", "i440fx" ]
> >    "qemu.min-ram-MB": 1024
> 
> I still don't understand why you'd want to put the configuration into
> qcow2 instead of the other way around.
> 
> Or why you'd want to use a single file at all, because as this whole
> thread shows, a disk image alone is clearly not sufficient to describe a VM.

I concur with many points made here. I think it would be wrong to put
anything besides disk-related data in a disk image.

If we want to have an all-in-one VM image, then it should be something
separate from this.

Another point against squashing all this irrelevant stuff into qcow2
is that if you use it only as a disk image under any higher-level
management, you don't care about any stored configuration.

In fact in libvirt we'd need a way to prevent any of the data from the
disk image from being applied to the configuration as it would create
problems.

Also there is a big difference between storing "suggestions" of
configuration and the actual full configuration itself. The
"suggestions" as to which storage controller or machine type to use are
sometimes helpful and sometimes not. Users may have their own
preference. And besides this we have the libosinfo project which can
provide suggestions.
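
To make the "suggestions" reading concrete, here is a sketch of how a
management layer might combine the per-image keys from the proposal
while letting an explicit user choice win.  The function name and the
return shape are assumptions for illustration only; nothing like this
exists in libvirt or qemu:

```python
def merge_suggestions(per_image, user_machine=None):
    """Combine metadata dicts from several images into one suggestion.

    per_image: one dict per attached image; an empty dict means the
    image carries no metadata (e.g. a data disk).
    Returns (acceptable machine types or None, minimum RAM in MB).
    """
    acceptable = None
    min_ram_mb = 0
    for meta in per_image:
        types = meta.get("qemu.machine-types")
        if types is not None:
            # Only machine types that every image accepts remain.
            acceptable = set(types) if acceptable is None else acceptable & set(types)
        min_ram_mb = max(min_ram_mb, meta.get("qemu.min-ram-MB", 0))

    if user_machine is not None:
        # The user's own preference overrides anything stored in images.
        return [user_machine], min_ram_mb
    if acceptable is None:
        return None, min_ram_mb
    if not acceptable:
        raise ValueError("images disagree on the machine type")
    return sorted(acceptable), min_ram_mb
```

Images without any keys simply do not take part in the vote, and a
disagreement is reported rather than silently resolved.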

I think that if the clear separation between a disk image format and an
all-in-one VM file is not kept it will end up in a big mess.


^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 11:14                     ` Dr. David Alan Gilbert
@ 2018-06-06 11:26                       ` Max Reitz
  2018-06-06 12:00                         ` Dr. David Alan Gilbert
  2018-06-06 11:42                       ` Richard W.M. Jones
  1 sibling, 1 reply; 157+ messages in thread
From: Max Reitz @ 2018-06-06 11:26 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Kevin Wolf, Richard W.M. Jones, armbru, qemu-block,
	Michael S. Tsirkin, qemu-devel, stefanha

On 2018-06-06 13:14, Dr. David Alan Gilbert wrote:
> * Max Reitz (mreitz@redhat.com) wrote:
>> On 2018-06-05 11:21, Dr. David Alan Gilbert wrote:
>>> <reawakening a fizzled out thread>
>>>
>>> This seems to have fizzled out because of a lack of a concrete proposal;
>>> so here is one based on a reply to Max's post:
>>>
>>> * Max Reitz (mreitz@redhat.com) wrote:
>>>
>>> <snip>
>>>
>>>> The original problem was that you need to supply a machine type to qemu,
>>>> and that multiple common architectures now have multiple machine types
>>>> and not necessarily all work with a single image.  So far so good, but I
>>>> have two issues here already:
>>>>
>>>> (1) How is qemu supposed to interpret that information?  If it's stored
>>>> in the image file, I don't see a nice way of retrieving it before the
>>>> machine is initialized, at least not with qemu's current architecture.
>>>
>>> <snip>
>>>
>>>> (2) Again, I personally just really don't like saving such information
>>>> in a disk image.  One actual argument I can bring up for that distaste
>>>> is this: Suppose, you have multiple images attached to your VM.  Now the
>>>> VM wants to store the machine type.  Where does it go?  Into all of
>>>> them?
>>>
>>> <snip>
>>>
>>>> So I think if we decide to store the machine type, that is kind of a
>>>> slippery slope and then there are good arguments for storing even more
>>>> configuration options in the file, too.  But I really, really don't like
>>>> that.
>>>
>>> <snip>
>>>
>>>> For another, how do we store the data?  key-value seems wrong if we want
>>>> to store everything.  JSON might be fine.  But eventually we just want
>>>> basically a qemu configuration file in there, I would think (which may
>>>> support JSON at some point?).   So basically we would store the data as
>>>> a binary blob and let the rest of qemu do its thing with it.  But then
>>>> please tell me why I fought so valiantly against storing random bitmaps
>>>> in qcow2 files.  I hate the idea of making qcow2 a random archive
>>>> format.  We have tar for that.
>>>
>>> <snip>
>>>
>>>> tl;dr: I really don't get why it's so hard to supply a config file along
>>>> with a qcow2 image.  Is it so hard for people to realize that a VM does
>>>> not only consist of a disk?
>>>
>>> Yes! Because in many cases that's all it needs, and it's ready to run
>>> with no unpacking.
>>
>> It clearly is not, or we would not have this discussion.
>>
>> The disk image is only enough if you want the default values for all of
>> qemu's configuration options, because today (and if I were to decide, in
>> the future, too) disk images do not configure the VM (well, they
>> configure the guest, but not the VM itself).
> 
> The problem with having a separate file is that you either have to copy
> it around with the image 

Which is just an inconvenience.

I understand it is an inconvenience and it would be nice to change it,
but please understand that I do not want qcow2 to become a filesystem
just to relieve an inconvenience.

(Note: I understand that you may not want qcow2 to become a filesystem,
but I do get that impression from others.)

>                           or have an archive. If you have an archive
> you have to have an unpacking step which then copies, potentially a lot
> of data taking some reasonable amount of time.

I'm sure this can be optimized, but yes, I get that.

(If you use e.g. tar and store the image data starting on an FS cluster
boundary (64 kB should be more than sufficient), I assume there is a way
to extract that data into a new file without copying anything.)

>                                                 Storing a simple bit
> of data with the image avoids that.

It is not a simple bit of data, as evidenced by the discussion about
storing binary blobs and MIME types going on.

>>> I think we should have:
>>>
>>> --------------------------------------------------------------
>>> Layer 0:
>>>    QCOW provides a way to store a single string of arbitrary (but
>>> limited?) length.
>>>    QCOW provides a way to replace the string by a new string.
>>>    The original or the new string will be stored after that;
>>>    never some mix.
>>>    Where a file 'b' has a backing file 'a', 'b' inherits the
>>>    string from 'a' unless 'b' has its own string.
>>>    Snapshots inherit their string from the main unless they have
>>>    their own string.
>>>
>>> Layer 1:
>>>    The string shall always be a JSON 'object'; i.e. of the form
>>>     { "something": ... , "more": ... }
>>>
>>>    The key strings shall be non-null and non-empty and shall
>>>    be unique.
>>>
>>> Layer 2:
>>>    '.'s in the key string shall indicate hierarchy
>>
>> I don't understand why we'd need dotted syntax when we already have
>> JSON, but that's not my issue.
> 
> I think someone earlier in the thread had asked about how we handled
> hierarchy so I added it.
> 
>>>    Key strings shall be listed in qemu's 
>>>       docs/specs/qcow-keys.rst
>>>
>>>       that shall indicate their meaning and the meaning and
>>>       valid formatting of the value associated with them.
>>>
>>>    Key strings shall start with either:
>>>       qemu.   in which case they must be listed in a file in
>>>               the qemu source tree
>>>
>>>       a reverse dotted name unique to the submitter, they may
>>>               be listed in the same file in the source tree, e.g.
>>>       com.redhat.
>>
>> So this is just another configuration file format.
>>
>>> Layer 3:
>>>    QEMU shall, for a given qcow2 file be able to dump the
>>>    key values.
>>>
>>> Layer 4:
>>>    On creating a VM by importing a qcow2, a management layer
>>>    shall inspect the key/values to influence the configuration
>>>    of the VM created.   Where it imports multiple qcow2's it
>>>    shall inspect all the files and flag disagreements.
>>>
>>>    Management layers shall, on creating a qcow2, set the
>>>    keys based on the VM the qcow2 is created for.  If the qcow2
>>>    is created as an additional disk for an existing VM it's
>>>    fine to leave the string empty (e.g. for a data disk).
>>
>> This at least solves the issue of where qemu should store the data (qemu
>> doesn't care), and how qemu should interpret it (not at all).
>>
>> But I really, really, really do not like storing arbitrary data in qcow2
>> files.  I hated it badly enough when qemu knew what to do with it, but I
>> hate it even more when even qemu has no idea what to do with it.
>>
>> Having a specification of what everything means in the qemu tree makes
>> things less unbearable, but not to my liking still.
> 
> Have you said why you hate it so much?
> Your hate for it seems to be making a simple solution hard.

Because it's a disk image format.  Data therein should be relevant to
the disk image.  I see qcow2 as a representation of data stored on a
physical storage medium.

Some metadata associated directly with that is fine (such as dirty
bitmaps, backing chains, things like that).  But configuring the whole
VM seems out of scope to me.

Also, making qcow2 a filesystem is not a simple solution.

...OK, let me back off here, I may be over-interpreting things and
throwing opinions of different people into one pot.

Maybe you don't want qcow2 to be a filesystem, and you just want to
store a single binary blob.  Well, OK, that's not that bad.  But in any
case, I wouldn't call it a simple solution anymore.

Yes, storing just the machine type somewhere would be possible with a
simple solution; but as I said (and the whole thread shows since then),
this is a slippery slope, and suddenly we arrive at storing arbitrary
binary data (like images?!) along with MIME types.  That will not be
possible with a simple solution anymore, I don't think.

>>> --------------------------------------------------------------
>>>    
>>>
>>> Some reasoning:
>>>    a) I've avoided the problem of when QEMU interprets the value
>>>       by ignoring it and giving it to management layers at the point
>>>       of VM import.
>>
>> Yes, but in the process you've made it completely opaque to qemu,
>> basically, which doesn't really make it better for me.  Not that
>> qemu-specific information in qcow2 files would be what I want, but, well.
>>
>> But it does solve technical issues, I concede that.
>>
>>>    b) I hate JSON, but there again nailing down a fixed format
>>>       seems easiest and it makes the job of QCOW easy - a single
>>>       string.
>>
>> Not really.  The string can be rather long, so you probably don't want
>> to store it in the image header, and thus it's just a binary blob from
>> qcow2's perspective, essentially.
> 
> Yes, but it's a single blob - I'm not asking for multiple keyed blobs
> or the ability to update individual blobs; just one blob that I can
> replace.

OK, you aren't, but others seem to be.

Or, well, you call it a single blob.  But actually the current ideas
seem to be to store a rather large configuration tree with binary data
in that blob, so to me personally there is absolutely no functional
difference to just storing a tar file in that blob.

So correct me if I'm wrong, but to me it appears that you effectively
want to store a filesystem in qcow2.[1]  Well, that's better than making
qcow2 the filesystem, but it still appears just the wrong way around to me.

[1] Yes, I know that the guest disk already contains an FS. :-P

>>>       (I would suggest in layer2 that the keys are sorted, but
>>>       that's a pain to do in some json creators)
>>>    c) Forcing the registry of keys might avoid silly duplication.
>>>       We can but hope.
>>>    d) I've not said it's a libvirt XML file since that seems
>>>       a bit prescriptive.
>>>
>>> Some initial suggested keys:
>>>
>>>    "qemu.machine-types": [ "q35", "i440fx" ]
>>>    "qemu.min-ram-MB": 1024
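
For illustration only, the "single string with sorted keys" idea quoted above could be sketched like this in Python. The function names are hypothetical, and treating the list order of "qemu.machine-types" as a preference order is an assumption, not something the proposal states.

```python
import json


def encode_metadata(meta):
    """Serialize the metadata dict as one compact JSON string.

    sort_keys=True yields a stable key order, which is the "sorted keys"
    suggestion from the quoted proposal.
    """
    return json.dumps(meta, sort_keys=True, separators=(",", ":")).encode("utf-8")


def pick_machine_type(blob, available):
    """Return the first listed machine type that the host offers, or None.

    Assumption: the list order expresses preference.
    """
    meta = json.loads(blob.decode("utf-8"))
    for machine_type in meta.get("qemu.machine-types", []):
        if machine_type in available:
            return machine_type
    return None
```

A management layer would call pick_machine_type() at import time with the set of machine types its QEMU binary reports; QEMU itself never needs to parse the blob.
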
>>
>> I still don't understand why you'd want to put the configuration into
>> qcow2 instead of the other way around.
>>
>> Or why you'd want to use a single file at all, because as this whole
>> thread shows, a disk image alone is clearly not sufficient to describe a VM.
>>
>> (Or it may be in simple cases, but then that's because you don't need
>> any configuration.)
> 
> Because it avoids the unpacking associated with archives.

I'm not talking about unpacking.  I'm talking about a potentially new
format which allows accessing the qcow2 file in-place.  It would
probably be trivial to write a block driver to allow this.

(And as I wrote in my response to Michal, I suspect that tar could
actually allow this, even though it would probably not be the ideal format.)

Max


* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 11:19                     ` Michal Suchánek
@ 2018-06-06 11:32                       ` Max Reitz
  2018-06-06 11:37                         ` Dr. David Alan Gilbert
                                           ` (2 more replies)
  0 siblings, 3 replies; 157+ messages in thread
From: Max Reitz @ 2018-06-06 11:32 UTC (permalink / raw)
  To: Michal Suchánek
  Cc: Kevin Wolf, Richard W.M. Jones, qemu-devel, stefanha, ehabkost,
	qemu-block, Michael S. Tsirkin

On 2018-06-06 13:19, Michal Suchánek wrote:
> On Wed, 6 Jun 2018 13:02:53 +0200
> Max Reitz <mreitz@redhat.com> wrote:
> 
>> On 2018-06-06 12:32, Michal Suchánek wrote:
>>> On Tue, 29 May 2018 12:14:15 +0200
>>> Max Reitz <mreitz@redhat.com> wrote:
>>>   
>>>> On 2018-05-29 08:44, Kevin Wolf wrote:  
>>>>> On 28.05.2018 at 23:25, Richard W.M. Jones wrote:
>>>>>> On Mon, May 28, 2018 at 10:20:54PM +0100, Richard W.M. Jones
>>>>>> wrote:    
>>>>>>> On Mon, May 28, 2018 at 08:38:33PM +0200, Kevin Wolf wrote:    
>>>>>>>> Just accessing the image file within a tar archive is possible
>>>>>>>> and we could write a block driver for that (I actually think we
>>>>>>>> should do this), but it restricts you because certain
>>>>>>>> operations like resizing aren't really possible in tar.
>>>>>>>> Unfortunately, resizing is a really common operation for
>>>>>>>> non-raw image formats.    
>>>>>>>
>>>>>>> We do this already in virt-v2v (using file.offset and file.size
>>>>>>> parameters in the raw driver).
>>>>>>>
>>>>>>> For virt-v2v we only need to read the source so resizing isn't
>>>>>>> an issue.  For most of the cases we're talking about the
>>>>>>> downloaded image would also be a template / base image, so I
>>>>>>> suppose only reading would be required too.
>>>>>>>
>>>>>>> I also wrote an nbdkit tar file driver (supports writes, but not
>>>>>>> resizing).
>>>>>>> https://manpages.debian.org/testing/nbdkit-plugin-perl/nbdkit-tar-plugin.1.en.html    
>>>>>>
>>>>>> I should add the other thorny issue with OVA files is that the
>>>>>> metadata contains a checksum (SHA1 or SHA256) of the disk images.
>>>>>> If you modify the disk images in-place in the tar file then you
>>>>>> need to recalculate those.    
>>>>>
>>>>> All of this means that OVA isn't really well suited to be used as
>>>>> a native format for VM configuration + images. It's just for
>>>>> sharing read-only images that are converted into another native
>>>>> format before they are used.
>>>>>
>>>>> Which is probably fair for the use case it was made for, but means
>>>>> that we need something else to solve our problem.    
>>>>
>>>> Maybe we should first narrow down our problem.  Maybe you have done
>>>> that already, but I'm quite in the dark still.
>>>>
>>>> The original problem was that you need to supply a machine type to
>>>> qemu, and that multiple common architectures now have multiple
>>>> machine types and not necessarily all work with a single image.  So
>>>> far so good, but I have two issues here already:
>>>>
>>>> (1) How is qemu supposed to interpret that information?  If it's
>>>> stored in the image file, I don't see a nice way of retrieving it
>>>> before the machine is initialized, at least not with qemu's current
>>>> architecture. Once we support configuring qemu solely through QMP,
>>>> sure, you can do a blockdev-add and then build the machine
>>>> accordingly.  But that is not here today, and I'm not sure this is
>>>> a good idea either, because that would mean automagic defaults for
>>>> the machine-building QMP commands derived from the blockdev-add
>>>> earlier, which should get a plain "No". Also, having to use QMP to
>>>> build your machine wouldn't make anything easier; at least not
>>>> easier than just supplying a configuration file along with the
>>>> image.
>>>>
>>>> (Building the magic into -blockdev might be less horrible, but such
>>>> magic (adding block devices influences machine defaults) to me
>>>> still doesn't seem worth not having to supply a config file along
>>>> with the disk image.)
>>>>
>>>> (2) Again, I personally just really don't like saving such
>>>> information in a disk image.  One actual argument I can bring up
>>>> for that distaste is this: Suppose you have multiple images
>>>> attached to your VM.  Now the VM wants to store the machine type.
>>>> Where does it go?  Into all of them?  But some of those images may
>>>> only contain data and might be intended to be shared between
>>>> multiple VMs.  So those shouldn't receive the mark.  Only disks
>>>> with binaries should receive them. But what if those binaries are
>>>> just cross-compiled binaries for some other VM?  Oh no, so not
>>>> even binaries are a sure indicator...  So I have no idea where the
>>>> information is supposed to be stored.  In any case, "the first
>>>> image" just gets an outright "no" from me, and "all images" gets
>>>> an "I don't think this is a good idea".
>>>>
>>>> Loading is fun, too.  OK, so you attach multiple disk images to a
>>>> VM. Oops, they have varying machine type information...  Now
>>>> what?  Use the information from the first one?  Definitely no.
>>>> Just ignore all of the information in such a case and have the
>>>> user supply the machine type again?  Possible, but it seems weird
>>>> to me that qemu would usually guess the machine type, but once you
>>>> attach some random other image to it, it suddenly fails to do
>>>> that.  But maybe it's just me who thinks this is weird.
>>>>
>>>>
>>>> OK, so let's go a step further.  We have stored the machine type
>>>> information in order to not have to supply a config file with the
>>>> qcow2 image -- because if we did, it could just contain the machine
>>>> type and that would be it.
>>>>
>>>> So to me it follows naturally that just storing the machine type
>>>> doesn't make much sense if we cannot also store more VM
>>>> configuration in a qcow2 file, because I don't see why you should
>>>> be able to ship an image without a config file only if all you
>>>> need to supply is a machine type. Often, you also need to supply
>>>> how much memory the VM needs (which depends on the OS on the
>>>> image) or what storage controller to use (does the OS have virtio
>>>> drivers? (to be fair, it usually does, because you're supplying a
>>>> VM image in the first place)).
>>>>
>>>> So I think if we decide to store the machine type, that is kind of
>>>> a slippery slope and then there are good arguments for storing
>>>> even more configuration options in the file, too.  But I really,
>>>> really don't like that.
>>>>
>>>> For one thing, I suspect it to get really ugly implementation-wise.
>>>> Getting the machine type out of a disk image and actually
>>>> interpreting it automatically is bad enough, but getting possibly
>>>> everything out of it?  It's not going to be any better.
>>>>
>>>> For another, how do we store the data?  key-value seems wrong if we
>>>> want to store everything.  JSON might be fine.  But eventually we
>>>> just want basically a qemu configuration file in there, I would
>>>> think (which may support JSON at some point?).   So basically we
>>>> would store the data as a binary blob and let the rest of qemu do
>>>> its thing with it.  But then please tell me why I fought so
>>>> valiantly against storing random bitmaps in qcow2 files.    
>>>
>>> Yes, I wonder. Why did you?  
>>
>> That was mostly directed at Kevin.
>>
>> My reasoning was that a qcow2 file is a disk image.  All data stored
>> therein should be immediately associated with the stored data.
>> Another reason was that from the perspective of qcow2 you don't lose
>> anything by tying the bitmaps directly to that data; all we lost was
>> the capability of storing bitmaps for unrelated raw files.
>>
>> (And the reasoning for that is "if you want features, use qcow2" --
>> although R/W backing files may loosen that phrase.)
>>
>>>> I hate the idea of making qcow2 a random archive format.  
>>>
>>> What's wrong with that?  
>>
>> The fact that qcow2 isn't.
>>
>> From my perspective it would increase the format's complexity to a
>> point where you could just create a new format altogether.  Well,
>> actually, all you do is design a filesystem (or reuse an existing
>> one).
>>
>>>> We have tar for that.  
>>>
>>> It does not support expanding the stored files.  
>>
>> Nor does qcow2, because it does not support storing files at all.
> 
> AFAICT from the previous discussion it already does allow storing
> multiple data streams that can be changed independently so it basically
> is an archive format or filesystem except the streams are not named nor
> easily accessible separately outside of qemu.

I don't quite understand what you are referring to.  We have snapshots,
we have bitmaps, yes, but all of that is directly related to the stored
guest disk data.

The only thing we currently have in qcow2 that is opaque is the VM state
that can be stored in snapshots (and don't hold me responsible for that).

>> Secondly, that completely depends on how you use it.  You can freely
>> expand the last file in the archive, for instance.  Also I've seen
>> people store files in chunks so they can indeed resize it.
>>
>> (I'm wondering if we could write a block driver that could provide
>> such a chunk allocation transparently to qcow2...  Note that a qcow2
>> file does not need to be contiguous, so you could in theory indeed
>> store the qcow2 file and its data in completely separate places in a
>> tar file.)
> 
> Which basically invents another new filesystem on top of tar for no
> good reason. Especially when we already have support for a storage
> format that is capable enough.

No different from inventing a filesystem on top of qcow2.

I don't think qcow2 is any more capable than tar.

>> What I'm trying to get at is that qcow2 was not designed to be a
>> container format for arbitrary files.  If you want to make it such,
>> I'm sure there are existing formats that work better.
> 
> Such as?

ext2?

It seems to me that you want to make qcow2 a filesystem.  Sure, the FS
we'd end up with would probably be simpler than ext2, but I assume
thanks to feature creep we'd eventually end up with a qcow2 format that
is a worse FS than a real FS (especially performance-wise), but that is
similarly complex.

>>>> Unless I have got something terribly wrong (which is indeed a
>>>> possibility!), to me this proposal means basically to turn qcow2
>>>> into (1) a VM description format for qemu, and (2) to turn it into
>>>> an archive format on the way.  
>>>
>>> And if you go all the way you can store multiple disks along with
>>> the VM definition so you can have the whole appliance in one file.
>>> It conveniently solves the problem of synchronizing snapshots across
>>> multiple disk images and the question where to store the machine
>>> state if you want to suspend it.   
>>
>> Yeah, but why make qcow2 that format?  That's what I completely fail
>> to understand.
>>
>> If you want to have a single VM description file that contains the VM
>> configuration and some qcow2/raw/whatever files along with it for the
>> guest disk data, sure, go ahead.  But why does the format of the whole
>> thing need to be qcow2?
> 
> Because then qemu can access the disk data from the image directly
> without any need for extraction, copying to different file, etc.

This does not explain why it needs to be qcow2.  There is absolutely no
reason why you couldn't use qcow2 files in-place inside of another file.

Max


* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 11:32                       ` Max Reitz
@ 2018-06-06 11:37                         ` Dr. David Alan Gilbert
  2018-06-06 11:44                           ` Max Reitz
  2018-06-06 11:43                         ` Michal Suchánek
  2018-06-11  8:44                         ` Richard W.M. Jones
  2 siblings, 1 reply; 157+ messages in thread
From: Dr. David Alan Gilbert @ 2018-06-06 11:37 UTC (permalink / raw)
  To: Max Reitz
  Cc: Michal Suchánek, Kevin Wolf, ehabkost, qemu-block,
	Michael S. Tsirkin, Richard W.M. Jones, qemu-devel, stefanha

* Max Reitz (mreitz@redhat.com) wrote:
> This does not explain why it needs to be qcow2.  There is absolutely no
> reason why you couldn't use qcow2 files in-place inside of another file.

Because then we'd have to change the whole stack to take advantage of
that.  Adding a feature into qcow2 means nothing else changes.

Dave

> Max
> 


--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 11:02                   ` Max Reitz
  2018-06-06 11:19                     ` Michal Suchánek
@ 2018-06-06 11:40                     ` Richard W.M. Jones
  2018-06-06 14:31                       ` Michael S. Tsirkin
  2018-06-06 14:43                     ` Michael S. Tsirkin
  2 siblings, 1 reply; 157+ messages in thread
From: Richard W.M. Jones @ 2018-06-06 11:40 UTC (permalink / raw)
  To: Max Reitz
  Cc: Michal Suchánek, Kevin Wolf, qemu-devel, stefanha, ehabkost,
	qemu-block, Michael S. Tsirkin

On Wed, Jun 06, 2018 at 01:02:53PM +0200, Max Reitz wrote:
> (I'm wondering if we could write a block driver that could provide such
> a chunk allocation transparently to qcow2...  Note that a qcow2 file
> does not need to be contiguous, so you could in theory indeed store the
> qcow2 file and its data in completely separate places in a tar file.)

nbdkit-split-plugin
(https://github.com/libguestfs/nbdkit/tree/master/plugins/split).  It
currently doesn't support resizing although it could do fairly easily.
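
As a rough sketch of what such a chunk mapping involves (this is not how nbdkit-split-plugin is implemented; the class and method names are made up for illustration), a driver only has to translate an offset in the virtual, contiguous file into an offset in the container:

```python
from bisect import bisect_right
from itertools import accumulate


class ChunkedFile:
    """Present several (offset, length) extents of a container file as one
    linear byte range."""

    def __init__(self, extents):
        self.extents = list(extents)
        # starts[i] is the virtual offset at which extent i begins;
        # the final entry is the total virtual size.
        self.starts = [0] + list(accumulate(length for _, length in self.extents))

    @property
    def size(self):
        return self.starts[-1]

    def locate(self, virt_off):
        """Map a virtual offset to (container offset, bytes left in that extent)."""
        if not 0 <= virt_off < self.size:
            raise ValueError("offset out of range")
        i = bisect_right(self.starts, virt_off) - 1
        phys_off, length = self.extents[i]
        delta = virt_off - self.starts[i]
        return phys_off + delta, length - delta
```

Growing the virtual file then amounts to appending another extent to the list, which is why resizing looks feasible with this kind of layout.
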

> Yeah, but why make qcow2 that format?  That's what I completely fail to
> understand.

I started off a long reply here, but I think you're right.  If we
cannot make people decide on and use a proper disk image + metadata
container, then it's unlikely we'll get them to add sensible
metadata to their qcow2 images either :-(

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 11:14                     ` Dr. David Alan Gilbert
  2018-06-06 11:26                       ` Max Reitz
@ 2018-06-06 11:42                       ` Richard W.M. Jones
  2018-06-06 11:48                         ` Daniel P. Berrangé
  1 sibling, 1 reply; 157+ messages in thread
From: Richard W.M. Jones @ 2018-06-06 11:42 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Max Reitz, Kevin Wolf, armbru, qemu-block, Michael S. Tsirkin,
	qemu-devel, stefanha

On Wed, Jun 06, 2018 at 12:14:07PM +0100, Dr. David Alan Gilbert wrote:
> The problem with having a separate file is that you either have to copy
> it around with the image or have an archive.  If you have an archive
> you have to have an unpacking step which then copies potentially a lot
> of data, taking some reasonable amount of time.  Storing a simple bit
> of data with the image avoids that.

This isn't really true.  For OVA (i.e. tar) we don't unpack them.
Adding file.offset and file.size in qemu's raw driver was crucial to
that optimization.
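
To make the in-place access concrete, here is a minimal sketch (Python, not virt-v2v's actual code) that finds where a member's data sits inside an uncompressed tar file and builds a json: pseudo-filename from it. The helper name and the exact nesting of the options are assumptions; only the offset and size parameters of the raw driver come from the description above.

```python
import json
import tarfile


def qemu_json_for_tar_member(tar_path, member_name, fmt="qcow2"):
    """Build a json: pseudo-filename pointing at a member inside a tar file."""
    # "r:" opens uncompressed archives only; a compressed tar cannot be
    # addressed by byte offset.
    with tarfile.open(tar_path, "r:") as tar:
        info = tar.getmember(member_name)
        if not info.isreg() or info.issparse():
            raise ValueError(f"{member_name} is not a plain regular file")
        offset, size = info.offset_data, info.size

    # A raw node restricted to the member's byte range within the archive.
    node = {
        "driver": "raw",
        "offset": offset,
        "size": size,
        "file": {"driver": "file", "filename": tar_path},
    }
    # Layer the image format on top unless the member is itself raw.
    if fmt != "raw":
        node = {"driver": fmt, "file": node}
    return "json:" + json.dumps(node)
```

The returned string would be passed wherever an image filename is expected. Nothing is copied out of the archive; the cost is one scan of the tar headers.
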

Rich.

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 11:32                       ` Max Reitz
  2018-06-06 11:37                         ` Dr. David Alan Gilbert
@ 2018-06-06 11:43                         ` Michal Suchánek
  2018-06-06 11:52                           ` Max Reitz
  2018-06-11  8:44                         ` Richard W.M. Jones
  2 siblings, 1 reply; 157+ messages in thread
From: Michal Suchánek @ 2018-06-06 11:43 UTC (permalink / raw)
  To: Max Reitz
  Cc: Kevin Wolf, Richard W.M. Jones, qemu-devel, stefanha, ehabkost,
	qemu-block, Michael S. Tsirkin

On Wed, 6 Jun 2018 13:32:47 +0200
Max Reitz <mreitz@redhat.com> wrote:

> On 2018-06-06 13:19, Michal Suchánek wrote:
> > On Wed, 6 Jun 2018 13:02:53 +0200
> > Max Reitz <mreitz@redhat.com> wrote:
> >   
> >> On 2018-06-06 12:32, Michal Suchánek wrote:  
> >>> On Tue, 29 May 2018 12:14:15 +0200
> >>> Max Reitz <mreitz@redhat.com> wrote:
> >>>     
> >>>> On 2018-05-29 08:44, Kevin Wolf wrote:    
> >>>>> On 28.05.2018 at 23:25, Richard W.M. Jones wrote:
> >>>>>> On Mon, May 28, 2018 at 10:20:54PM +0100, Richard W.M. Jones
> >>>>>> wrote:      
> >>>>>>> On Mon, May 28, 2018 at 08:38:33PM +0200, Kevin Wolf
> >>>>>>> wrote:      
> >>>>>>>> Just accessing the image file within a tar archive is
> >>>>>>>> possible and we could write a block driver for that (I
> >>>>>>>> actually think we should do this), but it restricts you
> >>>>>>>> because certain operations like resizing aren't really
> >>>>>>>> possible in tar. Unfortunately, resizing is a really common
> >>>>>>>> operation for non-raw image formats.      
> >>>>>>>
> >>>>>>> We do this already in virt-v2v (using file.offset and
> >>>>>>> file.size parameters in the raw driver).
> >>>>>>>
> >>>>>>> For virt-v2v we only need to read the source so resizing isn't
> >>>>>>> an issue.  For most of the cases we're talking about the
> >>>>>>> downloaded image would also be a template / base image, so I
> >>>>>>> suppose only reading would be required too.
> >>>>>>>
> >>>>>>> I also wrote an nbdkit tar file driver (supports writes, but
> >>>>>>> not resizing).
> >>>>>>> https://manpages.debian.org/testing/nbdkit-plugin-perl/nbdkit-tar-plugin.1.en.html      
> >>>>>>
> >>>>>> I should add the other thorny issue with OVA files is that the
> >>>>>> metadata contains a checksum (SHA1 or SHA256) of the disk
> >>>>>> images. If you modify the disk images in-place in the tar file
> >>>>>> then you need to recalculate those.      
> >>>>>
> >>>>> All of this means that OVA isn't really well suited to be used
> >>>>> as a native format for VM configuration + images. It's just for
> >>>>> sharing read-only images that are converted into another native
> >>>>> format before they are used.
> >>>>>
> >>>>> Which is probably fair for the use case it was made for, but
> >>>>> means that we need something else to solve our problem.      
> >>>>
> >>>> Maybe we should first narrow down our problem.  Maybe you have
> >>>> done that already, but I'm quite in the dark still.
> >>>>
> >>>> The original problem was that you need to supply a machine type
> >>>> to qemu, and that multiple common architectures now have multiple
> >>>> machine types and not necessarily all work with a single image.
> >>>> So far so good, but I have two issues here already:
> >>>>
> >>>> (1) How is qemu supposed to interpret that information?  If it's
> >>>> stored in the image file, I don't see a nice way of retrieving it
> >>>> before the machine is initialized, at least not with qemu's
> >>>> current architecture. Once we support configuring qemu solely
> >>>> through QMP, sure, you can do a blockdev-add and then build the
> >>>> machine accordingly.  But that is not here today, and I'm not
> >>>> sure this is a good idea either, because that would mean
> >>>> automagic defaults for the machine-building QMP commands derived
> >>>> from the blockdev-add earlier, which should get a plain "No".
> >>>> Also, having to use QMP to build your machine wouldn't make
> >>>> anything easier; at least not easier than just supplying a
> >>>> configuration file along with the image.
> >>>>
> >>>> (Building the magic into -blockdev might be less horrible, but
> >>>> such magic (adding block devices influences machine defaults) to
> >>>> me still doesn't seem worth not having to supply a config file
> >>>> along with the disk image.)
> >>>>
> >>>> (2) Again, I personally just really don't like saving such
> >>>> information in a disk image.  One actual argument I can bring up
> >>>> for that distaste is this: Suppose, you have multiple images
> >>>> attached to your VM.  Now the VM wants to store the machine type.
> >>>> Where does it go?  Into all of them?  But some of those images
> >>>> may only contain data and might be intended to be shared between
> >>>> multiple VMs.  So those shouldn't receive the mark.  Only disks
> >>>> with binaries should receive them. But what if those binaries are
> >>>> just cross-compiled binaries for some other VM?  Oh no, so not
> >>>> even binaries are a sure indicator...  So I have no idea where
> >>>> the information is supposed to be stored.  In any case, "the
> >>>> first image" just gets an outright "no" from me, and "all
> >>>> images" gets an "I don't think this is a good idea".
> >>>>
> >>>> Loading is fun, too.  OK, so you attach multiple disk images to a
> >>>> VM. Oops, they have varying machine type information...  Now
> >>>> what?  Use the information from the first one?  Definitely no.
> >>>> Just ignore all of the information in such a case and have the
> >>>> user supply the machine type again?  Possible, but it seems weird
> >>>> to me that qemu would usually guess the machine type, but once
> >>>> you attach some random other image to it, it suddenly fails to do
> >>>> that.  But maybe it's just me who thinks this is weird.
> >>>>
> >>>>
> >>>> OK, so let's go a step further.  We have stored the machine type
> >>>> information in order to not have to supply a config file with the
> >>>> qcow2 image -- because if we did, it could just contain the
> >>>> machine type and that would be it.
> >>>>
> >>>> So to me it follows naturally that just storing the machine type
> >>>> doesn't make much sense if we cannot also store more VM
> >>>> configuration in a qcow2 file, because I don't see why you should
> >>>> be able to ship an image without a config file only if all you
> >>>> need to supply is a machine type. Often, you also need to supply
> >>>> how much memory the VM needs (which depends on the OS on the
> >>>> image) or what storage controller to use (does the OS have virtio
> >>>> drivers? (to be fair, it usually does, because you're supplying a
> >>>> VM image in the first place)).
> >>>>
> >>>> So I think if we decide to store the machine type, that is kind
> >>>> of a slippery slope and then there are good arguments for storing
> >>>> even more configuration options in the file, too.  But I really,
> >>>> really don't like that.
> >>>>
> >>>> For one thing, I suspect it to get really ugly
> >>>> implementation-wise. Getting the machine type out of a disk
> >>>> image and actually interpreting it automatically is bad enough,
> >>>> but getting possibly everything out of it?  It's not going to be
> >>>> any better.
> >>>>
> >>>> For another, how do we store the data?  key-value seems wrong if
> >>>> we want to store everything.  JSON might be fine.  But
> >>>> eventually we just want basically a qemu configuration file in
> >>>> there, I would think (which may support JSON at some point?).
> >>>> So basically we would store the data as a binary blob and let
> >>>> the rest of qemu do its thing with it.  But then please tell me
> >>>> why I fought so valiantly against storing random bitmaps in
> >>>> qcow2 files.      
> >>>
> >>> Yes, I wonder. Why did you?    
> >>
> >> That was mostly directed at Kevin.
> >>
> >> My reasoning was that a qcow2 file is a disk image.  All data
> >> stored therein should be immediately associated with the stored
> >> data. Another reason was that from the perspective of qcow2 you
> >> don't lose anything by tying the bitmaps directly to that data;
> >> all we lost was the capability of storing bitmaps for unrelated
> >> raw files.
> >>
> >> (And the reasoning for that is "if you want features, use qcow2" --
> >> although R/W backing files may loosen that phrase.)
> >>  
> >>>> I hate the idea of making qcow2 a random archive format.    
> >>>
> >>> What's wrong with that?    
> >>
> >> The fact that qcow2 isn't.
> >>
> >> From my perspective it would increase the format's complexity to a
> >> point where you could just create a new format altogether.  Well,
> >> actually, all you do is design a filesystem (or reuse an existing
> >> one).
> >>  
> >>>> We have tar for that.    
> >>>
> >>> It does not support expanding the stored files.    
> >>
> >> Nor does qcow2, because it does not support storing files at all.  
> > 
> > AFAICT from the previous discussion it already does allow storing
> > multiple data streams that can be changed independently so it
> > basically is an archive format or filesystem except the streams are
> > not named nor easily accessible separately outside of qemu.  
> 
> I don't quite understand what you are referring to.  We have
> snapshots, we have bitmaps, yes, but all of that is related directly
> to the stored guest disk data.
> 
> The only thing we currently have in qcow2 that is opaque is the VM
> state that can be stored in snapshots (and don't hold me responsible
> for that).

But it has to be related to the stored disk data only because of your
dislike for stuff not related to the disk data. Not for a technical
reason. The format can sustain storing unrelated data.

> 
> >> Secondly, that completely depends on how you use it.  You can
> >> freely expand the last file in the archive, for instance.  Also
> >> I've seen people store files in chunks so they can indeed resize
> >> it.
> >>
> >> (I'm wondering if we could write a block driver that could provide
> >> such a chunk allocation transparently to qcow2...  Note that a
> >> qcow2 file does not need to be contiguous, so you could in theory
> >> indeed store the qcow2 file and its data in completely separate
> >> places in a tar file.)  
> > 
> > Which basically invents another new filesystem on top of tar for no
> > good reason. Especially when we already have support for a storage
> > format that is capable enough.
> 
> No different from inventing a filesystem on top of qcow2.
> 
> I don't think qcow2 is any more capable than tar.

It can natively resize the objects it stores. tar cannot do that so you
will have to store random nonsense in the tar archive that only makes
sense for your fs/tar driver.

> 
> >> What I'm trying to get at is that qcow2 was not designed to be a
> >> container format for arbitrary files.  If you want to make it such,
> >> I'm sure there are existing formats that work better.  
> > 
> > Such as?  
> 
> ext2?

So you want an ext2 driver in qemu instead of expanding qcow2 to work
not only for a single disk but also for an appliance?

> 
> It seems to me that you want to make qcow2 a filesystem.  Sure, the FS
> we'd end up with would probably be simpler than ext2, but I assume
> thanks to feature creep we'd eventually end up with a qcow2 format
> that is a worse FS than real FS (especially performance-wise), but
> that is similarly complex.

I do not see how the complexity increases drastically by assigning
user-visible names to some of the data stored in the image. I am not
familiar with the internals, though.

> 
> >>>> Unless I have got something terribly wrong (which is indeed a
> >>>> possibility!), to me this proposal means basically to turn qcow2
> >>>> into (1) a VM description format for qemu, and (2) to turn it
> >>>> into an archive format on the way.    
> >>>
> >>> And if you go all the way you can store multiple disks along with
> >>> the VM definition so you can have the whole appliance in one file.
> >>> It conveniently solves the problem of synchronizing snapshots
> >>> across multiple disk images and the question where to store the
> >>> machine state if you want to suspend it.     
> >>
> >> Yeah, but why make qcow2 that format?  That's what I completely
> >> fail to understand.
> >>
> >> If you want to have a single VM description file that contains the
> >> VM configuration and some qcow2/raw/whatever files along with it
> >> for the guest disk data, sure, go ahead.  But why does the format
> >> of the whole thing need to be qcow2?  
> > 
> > Because then qemu can access the disk data from the image directly
> > without any need for extraction, copying to different file, etc.  
> 
> This does not explain why it needs to be qcow2.  There is absolutely
> no reason why you couldn't use qcow2 files in-place inside of another
> file.

qemu cannot read the disk data from the file in-place.

Thanks

Michal

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 11:37                         ` Dr. David Alan Gilbert
@ 2018-06-06 11:44                           ` Max Reitz
  2018-06-06 12:16                             ` Dr. David Alan Gilbert
                                               ` (2 more replies)
  0 siblings, 3 replies; 157+ messages in thread
From: Max Reitz @ 2018-06-06 11:44 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Michal Suchánek, Kevin Wolf, ehabkost, qemu-block,
	Michael S. Tsirkin, Richard W.M. Jones, qemu-devel, stefanha

On 2018-06-06 13:37, Dr. David Alan Gilbert wrote:
> * Max Reitz (mreitz@redhat.com) wrote:
>> On 2018-06-06 13:19, Michal Suchánek wrote:
>>> On Wed, 6 Jun 2018 13:02:53 +0200
>>> Max Reitz <mreitz@redhat.com> wrote:
>>>
>>>> On 2018-06-06 12:32, Michal Suchánek wrote:
>>>>> On Tue, 29 May 2018 12:14:15 +0200
>>>>> Max Reitz <mreitz@redhat.com> wrote:

[...]

>>>>>> Unless I have got something terribly wrong (which is indeed a
>>>>>> possibility!), to me this proposal means basically to turn qcow2
>>>>>> into (1) a VM description format for qemu, and (2) to turn it into
>>>>>> an archive format on the way.  
>>>>>
>>>>> And if you go all the way you can store multiple disks along with
>>>>> the VM definition so you can have the whole appliance in one file.
>>>>> It conveniently solves the problem of synchronizing snapshots across
>>>>> multiple disk images and the question where to store the machine
>>>>> state if you want to suspend it.   
>>>>
>>>> Yeah, but why make qcow2 that format?  That's what I completely fail
>>>> to understand.
>>>>
>>>> If you want to have a single VM description file that contains the VM
>>>> configuration and some qcow2/raw/whatever files along with it for the
>>>> guest disk data, sure, go ahead.  But why does the format of the whole
>>>> thing need to be qcow2?
>>>
>>> Because then qemu can access the disk data from the image directly
>>> without any need for extraction, copying to different file, etc.
>>
>> This does not explain why it needs to be qcow2.  There is absolutely no
>> reason why you couldn't use qcow2 files in-place inside of another file.
> 
> Because then we'd have to change the whole stack to take advantage of
> that.  Adding a feature into qcow2 means nothing else changes.

Because it's a hack, right.  Storing binary data in a qcow2 file,
completely ignoring it in qemu (and being completely unusable to any
potential other users of the qcow2 format[1]) and only interpreting it
somewhere up the stack is a hack.

That is not necessarily a negative point, hacks can work wonderfully
well, and they usually are simple, that is correct.  But the thing is
that I feel like people have grand visions of what to get out of this.
Imagine, a single file that can configure all and any VM!

But hacks usually only solve a single issue.  Once you try to extend a
hack, it breaks down and becomes insufficient.

If we want a grand vision where a single file stores the whole VM, why
not invest the work and make it right from the start?

Max

[1] Yes, I concede that there are probably no other users of qcow2.  But
please forgive me for assuming that qcow2 was in a sense designed to be
a rather general image format that not only qemu could use.


^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 11:42                       ` Richard W.M. Jones
@ 2018-06-06 11:48                         ` Daniel P. Berrangé
  2018-06-06 11:53                           ` Max Reitz
                                             ` (2 more replies)
  0 siblings, 3 replies; 157+ messages in thread
From: Daniel P. Berrangé @ 2018-06-06 11:48 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: Dr. David Alan Gilbert, Kevin Wolf, qemu-block,
	Michael S. Tsirkin, qemu-devel, armbru, stefanha, Max Reitz

On Wed, Jun 06, 2018 at 12:42:28PM +0100, Richard W.M. Jones wrote:
> On Wed, Jun 06, 2018 at 12:14:07PM +0100, Dr. David Alan Gilbert wrote:
> > The problem with having a separate file is that you either have to copy
> > it around with the image or have an archive.  If you have an archive
> > you have to have an unpacking step which then copies, potentially a lot
> > of data taking some reasonable amount of time.  Storing a simple bit
> > of data with the image avoids that.
> 
> This isn't really true.  For OVA (ie. tar) we don't unpack them.
> Adding file.offset and file.size in qemu's raw driver was crucial to
> that optimization.

Though that assumes you're only using the qcow2 file in read-only mode.
As soon as you need write access you need to unpack from the OVA so that
the qcow2 file can grow its length when new sectors are allocated.
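
For the read-only case, the in-place access discussed above can be
sketched roughly as follows.  This is a minimal sketch, not virt-v2v's
actual code: it uses Python's tarfile module to find where a member's
data starts inside an uncompressed tar/OVA file, then builds a JSON
block specification that layers qcow2 on top of a raw node restricted
by offset and size.  The helper names are hypothetical, and the exact
option layout a given QEMU version accepts is an assumption that should
be checked against its documentation.

```python
import json
import tarfile


def tar_member_extent(archive_path, member_name):
    """Return (offset, size) in bytes of a member's data inside a tar file."""
    # "r:" opens the archive without transparent decompression; offsets into
    # a compressed archive would be meaningless for in-place access.
    with tarfile.open(archive_path, "r:") as tar:
        info = tar.getmember(member_name)
        return info.offset_data, info.size


def qcow2_in_tar_spec(archive_path, member_name):
    """Build a block specification for a qcow2 image stored inside a tar.

    Assumption: the raw format driver accepts "offset" and "size", as
    described in the thread; the surrounding layout is illustrative only.
    """
    offset, size = tar_member_extent(archive_path, member_name)
    return {
        "driver": "qcow2",
        "read-only": True,  # writes would need the member to grow, see above
        "file": {
            "driver": "raw",
            "offset": offset,
            "size": size,
            "file": {"driver": "file", "filename": archive_path},
        },
    }


def as_json_filename(spec):
    """Serialize the specification in the "json:{...}" pseudo-filename form."""
    return "json:" + json.dumps(spec)
```

The point of the sketch is that nothing is copied: only the byte range
of the member is computed, and the image stays where it is in the
archive.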

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 11:43                         ` Michal Suchánek
@ 2018-06-06 11:52                           ` Max Reitz
  2018-06-06 12:13                             ` Michal Suchánek
  0 siblings, 1 reply; 157+ messages in thread
From: Max Reitz @ 2018-06-06 11:52 UTC (permalink / raw)
  To: Michal Suchánek
  Cc: Kevin Wolf, Richard W.M. Jones, qemu-devel, stefanha, ehabkost,
	qemu-block, Michael S. Tsirkin

On 2018-06-06 13:43, Michal Suchánek wrote:
> On Wed, 6 Jun 2018 13:32:47 +0200
> Max Reitz <mreitz@redhat.com> wrote:
> 
>> On 2018-06-06 13:19, Michal Suchánek wrote:
>>> On Wed, 6 Jun 2018 13:02:53 +0200
>>> Max Reitz <mreitz@redhat.com> wrote:
>>>   
>>>> On 2018-06-06 12:32, Michal Suchánek wrote:  
>>>>> On Tue, 29 May 2018 12:14:15 +0200
>>>>> Max Reitz <mreitz@redhat.com> wrote:

[...]

>>>>>> I hate the idea of making qcow2 a random archive format.    
>>>>>
>>>>> What's wrong with that?    
>>>>
>>>> The fact that qcow2 isn't.
>>>>
>>>> From my perspective it would increase the format's complexity to a
>>>> point where you could just create a new format altogether.  Well,
>>>> actually, all you do is design a filesystem (or reuse an existing
>>>> one).
>>>>  
>>>>>> We have tar for that.    
>>>>>
>>>>> It does not support expanding the stored files.    
>>>>
>>>> Nor does qcow2, because it does not support storing files at all.  
>>>
>>> AFAICT from the previous discussion it already does allow storing
>>> multiple data streams that can be changed independently so it
>>> basically is an archive format or filesystem except the streams are
>>> not named nor easily accessible separately outside of qemu.  
>>
>> I don't quite understand what you are referring to.  We have
>> snapshots, we have bitmaps, yes, but all of that is related directly
>> to the stored guest disk data.
>>
>> The only thing we currently have in qcow2 that is opaque is the VM
>> state that can be stored in snapshots (and don't hold me responsible
>> for that).
> 
> But it has to be related to the stored disk data only because of your
> dislike for stuff not related to the disk data. Not for a technical
> reason. The format can sustain storing unrelated data.

Yes.  Technically, you can even store data without qcow2 knowing, except
that qemu-img check -r leaks will delete it, but yeah.

>>>> Secondly, that completely depends on how you use it.  You can
>>>> freely expand the last file in the archive, for instance.  Also
>>>> I've seen people store files in chunks so they can indeed resize
>>>> it.
>>>>
>>>> (I'm wondering if we could write a block driver that could provide
>>>> such a chunk allocation transparently to qcow2...  Note that a
>>>> qcow2 file does not need to be contiguous, so you could in theory
>>>> indeed store the qcow2 file and its data in completely separate
>>>> places in a tar file.)  
>>>
>>> Which basically invents another new filesystem on top of tar for no
>>> good reason. Especially when we already have support for a storage
>>> format that is capable enough.
>>
>> No different from inventing a filesystem on top of qcow2.
>>
>> I don't think qcow2 is any more capable than tar.
> 
> It can natively resize the objects it stores.

It does not store arbitrary objects.  It stores a guest disk (and
snapshots), and bitmaps.  These are stored in directory structures so
new chunks can be dynamically added.  Note that both use slightly
different structures.

Yes, it is technically possible to use a similar structure to store
arbitrary objects.  But currently it does not have that capability.

>                                                tar cannot do that so you
> will have to store random nonsense in the tar archive that only makes
> sense for your fs/tar driver.

Right.  But how is that different from storing random nonsense in a
qcow2 file that does not even make sense to qemu?

>>>> What I'm trying to get at is that qcow2 was not designed to be a
>>>> container format for arbitrary files.  If you want to make it such,
>>>> I'm sure there are existing formats that work better.  
>>>
>>> Such as?  
>>
>> ext2?
> 
> So you want an ext2 driver in qemu instead of expanding qcow2 to work
> not only for a single disk but also for an appliance?

Yes, because ext2 was designed to be a proper filesystem.  I'm not an FS
designer.  Well, not a good one anyway.  So I don't trust myself on
extending qcow2 to be a good FS -- and why would I, when there are
already numerous FS around.

>> It seems to me that you want to make qcow2 a filesystem.  Sure, the FS
>> we'd end up with would probably be simpler than ext2, but I assume
>> thanks to feature creep we'd eventually end up with a qcow2 format
>> that is a worse FS than real FS (especially performance-wise), but
>> that is similarly complex.
> 
> I do not see how the complexity increases drastically by assigning
> user-visible names to some of the data stored in the image. I am not
> familiar with the internals, though.

As I said, qcow2 does not store objects currently.  It stores specific
data structures that differ.  There is no common "object structure".

>>>>>> Unless I have got something terribly wrong (which is indeed a
>>>>>> possibility!), to me this proposal means basically to turn qcow2
>>>>>> into (1) a VM description format for qemu, and (2) to turn it
>>>>>> into an archive format on the way.    
>>>>>
>>>>> And if you go all the way you can store multiple disks along with
>>>>> the VM definition so you can have the whole appliance in one file.
>>>>> It conveniently solves the problem of synchronizing snapshots
>>>>> across multiple disk images and the question where to store the
>>>>> machine state if you want to suspend it.     
>>>>
>>>> Yeah, but why make qcow2 that format?  That's what I completely
>>>> fail to understand.
>>>>
>>>> If you want to have a single VM description file that contains the
>>>> VM configuration and some qcow2/raw/whatever files along with it
>>>> for the guest disk data, sure, go ahead.  But why does the format
>>>> of the whole thing need to be qcow2?  
>>>
>>> Because then qemu can access the disk data from the image directly
>>> without any need for extraction, copying to different file, etc.  
>>
>> This does not explain why it needs to be qcow2.  There is absolutely
>> no reason why you couldn't use qcow2 files in-place inside of another
>> file.
> 
> qemu cannot read the disk data from the file in-place.

Huh?  Why not?

Max


^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 11:48                         ` Daniel P. Berrangé
@ 2018-06-06 11:53                           ` Max Reitz
  2018-06-06 12:03                           ` Dr. David Alan Gilbert
  2018-06-06 12:29                           ` Richard W.M. Jones
  2 siblings, 0 replies; 157+ messages in thread
From: Max Reitz @ 2018-06-06 11:53 UTC (permalink / raw)
  To: Daniel P. Berrangé, Richard W.M. Jones
  Cc: Dr. David Alan Gilbert, Kevin Wolf, qemu-block,
	Michael S. Tsirkin, qemu-devel, armbru, stefanha

On 2018-06-06 13:48, Daniel P. Berrangé wrote:
> On Wed, Jun 06, 2018 at 12:42:28PM +0100, Richard W.M. Jones wrote:
>> On Wed, Jun 06, 2018 at 12:14:07PM +0100, Dr. David Alan Gilbert wrote:
>>> The problem with having a separate file is that you either have to copy
>>> it around with the image or have an archive.  If you have an archive
>>> you have to have an unpacking step which then copies, potentially a lot
>>> of data taking some reasonable amount of time.  Storing a simple bit
>>> of data with the image avoids that.
>>
>> This isn't really true.  For OVA (ie. tar) we don't unpack them.
>> Adding file.offset and file.size in qemu's raw driver was crucial to
>> that optimization.
> 
> Though that assumes you're only using the qcow2 file in read-only mode.
> As soon as you need write access you need to unpack from the OVA so that
> the qcow2 file can grow its length when new sectors are allocated.

Except if the qcow2 file is at the end of the archive.  Then all you
need to do is adjust the length field of the tar file header.
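
As a rough illustration of what "adjust the length field" involves,
here is a minimal sketch that grows the last member of an uncompressed
tar archive.  It works on an in-memory copy for clarity (a real
implementation would patch the file in place), and the function name is
hypothetical.  Besides the size field, the header checksum has to be
recomputed, the data has to be padded to a 512-byte boundary again, and
the end-of-archive marker (two zero blocks) has to be rewritten after
the new data.  The sketch assumes the last member has a single plain
header block, i.e. no extended (pax or GNU long-name) headers.

```python
import io
import tarfile

BLOCK = 512  # tar header and data blocks are 512 bytes


def grow_last_member(archive: bytes, extra: bytes) -> bytes:
    """Return a copy of `archive` with `extra` appended to its last member."""
    with tarfile.open(fileobj=io.BytesIO(archive)) as tar:
        last = tar.getmembers()[-1]
    header_off, data_off, old_size = last.offset, last.offset_data, last.size
    if data_off - header_off != BLOCK:
        raise ValueError("last member uses extended headers; not handled here")

    new_size = old_size + len(extra)
    header = bytearray(archive[header_off:header_off + BLOCK])

    # Size field: bytes 124..135, 11 octal digits plus a NUL terminator.
    header[124:136] = b"%011o\0" % new_size

    # Checksum field: bytes 148..155.  It is computed over the whole header
    # with the checksum field itself counted as eight spaces.
    header[148:156] = b" " * 8
    header[148:156] = b"%06o\0 " % sum(header)

    data = archive[data_off:data_off + old_size] + extra
    padding = b"\0" * (-len(data) % BLOCK)
    end_marker = b"\0" * (2 * BLOCK)
    return archive[:header_off] + bytes(header) + data + padding + end_marker
```

Everything before the last member's header is left untouched, which is
why this only works for the member at the end of the archive.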

Max


^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 11:26                       ` Max Reitz
@ 2018-06-06 12:00                         ` Dr. David Alan Gilbert
  2018-06-06 12:59                           ` Max Reitz
  0 siblings, 1 reply; 157+ messages in thread
From: Dr. David Alan Gilbert @ 2018-06-06 12:00 UTC (permalink / raw)
  To: Max Reitz
  Cc: Kevin Wolf, qemu-block, Michael S. Tsirkin, armbru, qemu-devel,
	Richard W.M. Jones, stefanha

* Max Reitz (mreitz@redhat.com) wrote:
> On 2018-06-06 13:14, Dr. David Alan Gilbert wrote:
> > * Max Reitz (mreitz@redhat.com) wrote:
> >> On 2018-06-05 11:21, Dr. David Alan Gilbert wrote:
> >>> <reawakening a fizzled out thread>
> >>>
> >>> This seems to have fizzled out because of a lack of a concrete proposal;
> >>> so here is one based on a reply to Max's post:
> >>>
> >>> * Max Reitz (mreitz@redhat.com) wrote:
> >>>
> >>> <snip>
> >>>
> >>>> The original problem was that you need to supply a machine type to qemu,
> >>>> and that multiple common architectures now have multiple machine types
> >>>> and not necessarily all work with a single image.  So far so good, but I
> >>>> have two issues here already:
> >>>>
> >>>> (1) How is qemu supposed to interpret that information?  If it's stored
> >>>> in the image file, I don't see a nice way of retrieving it before the
> >>>> machine is initialized, at least not with qemu's current architecture.
> >>>
> >>> <snip>
> >>>
> >>>> (2) Again, I personally just really don't like saving such information
> >>>> in a disk image.  One actual argument I can bring up for that distaste
> >>>> is this: Suppose, you have multiple images attached to your VM.  Now the
> >>>> VM wants to store the machine type.  Where does it go?  Into all of
> >>>> them?
> >>>
> >>> <snip>
> >>>
> >>>> So I think if we decide to store the machine type, that is kind of a
> >>>> slippery slope and then there are good arguments for storing even more
> >>>> configuration options in the file, too.  But I really, really don't like
> >>>> that.
> >>>
> >>> <snip>
> >>>
> >>>> For another, how do we store the data?  key-value seems wrong if we want
> >>>> to store everything.  JSON might be fine.  But eventually we just want
> >>>> basically a qemu configuration file in there, I would think (which may
> >>>> support JSON at some point?).   So basically we would store the data as
> >>>> a binary blob and let the rest of qemu do its thing with it.  But then
> >>>> please tell me why I fought so valiantly against storing random bitmaps
> >>>> in qcow2 files.  I hate the idea of making qcow2 a random archive
> >>>> format.  We have tar for that.
> >>>
> >>> <snip>
> >>>
> >>>> tl;dr: I really don't get why it's so hard to supply a config file along
> >>>> with a qcow2 image.  Is it so hard for people to realize that a VM does
> >>>> not only consist of a disk?
> >>>
> >>> Yes! Because in many cases that's all it needs, and it's ready to run
> >>> with no unpacking.
> >>
> >> It clearly is not, or we would not have this discussion.
> >>
> >> The disk image is only enough if you want the default values for all of
> >> qemu's configuration options, because today (and if I were to decide, in
> >> the future, too) disk images do not configure the VM (well, they
> >> configure the guest, but not the VM itself).
> > 
> > The problem with having a separate file is that you either have to copy
> > it around with the image 
> 
> Which is just an inconvenience.

It's more than that;  if it's a separate file then the tools can't
rely on users supplying it, and frankly they won't and they'll still
just supply an image.

> I understand it is an inconvenience and it would be nice to change it,
> but please understand that I do not want qcow2 to become a filesystem
> just to relieve an inconvenience.

I very much don't want it to be a filesystem; my reason for writing
down my spec the way I did was to make it clear that the only
thing I want of qcow2 is a single blob, no more; I don't want naming
of the blob or anything else.

> (Note: I understand that you may not want qcow2 to become a filesystem,
> but I do get the impression from others.)

My aim was to specify it to fulfill the requirements that everyone
else had asked for, but still only having one unmodifiable blob in qcow.

> >                           or have an archive. If you have an archive
> > you have to have an unpacking step which then copies, potentially a lot
> > of data taking some reasonable amount of time.
> 
> I'm sure this can be optimized, but yes, I get that.
> 
> (If you use e.g. tar and store the image data starting on an FS cluster
> boundary (64 kB should be more than sufficient), I assume there is a way
> to extract that data into a new file without copying anything.)

But then we have to modify all the current things that know how to
handle a qcow2.

> >                                                 Storing a simple bit
> > of data with the image avoids that.
> 
> It is not a simple bit of data, as evidenced by the discussion about
> storing binary blobs and MIME types going on.

All of the things they've suggested can be done inside that one blob;
even inside the json (or any other structure in that blob).

> >>> I think we should have:
> >>>
> >>> --------------------------------------------------------------
> >>> Layer 0:
> >>>    QCOW provides a way to store a single string of arbitrary (but
> >>> limited?) length.
> >>>    QCOW provides a way to replace the string by a new string.
> >>>    The original or the new string will be stored after that;
> >>>    never some mix.
> >>>    Where a file 'b' has a backing file 'a', 'b' inherits the
> >>>    string from 'a' unless 'b' has its own string.
> >>>    Snapshots inherit their string from the main unless they have
> >>>    their own string.
> >>>
> >>> Layer 1:
> >>>    The string shall always be a JSON 'object'; i.e. of the form
> >>>     { "something": ... , "more": ... }
> >>>
> >>>    The key strings shall be non-null and non-empty and shall
> >>>    be unique.
> >>>
> >>> Layer 2:
> >>>    '.'s in the key string shall indicate hierarchy
> >>
> >> I don't understand why we'd need dotted syntax when we already have
> >> JSON, but that's not my issue.
> > 
> > I think someone earlier in the thread had asked about how we handled
> > hierarchy so I added it.
> > 
> >>>    Key strings shall be listed in qemu's 
> >>>       docs/specs/qcow-keys.rst
> >>>
> >>>       that shall indicate their meaning and the meaning and
> >>>       valid formatting of the value associated with them.
> >>>
> >>>    Key strings shall start with either:
> >>>       qemu.   in which case they must be listed in a file in
> >>>               the qemu source tree
> >>>
> >>>       a reverse dotted name unique to the submitter, they may
> >>>               be listed in the same file in the source tree, e.g.
> >>>       com.redhat.
> >>
> >> So this is just another configuration file format.
> >>
> >>> Layer 3:
> >>>    QEMU shall, for a given qcow2 file be able to dump the
> >>>    key values.
> >>>
> >>> Layer 4:
> >>>    On creating a VM by importing a qcow2, a management layer
> >>>    shall inspect the key/values to influence the configuration
> >>>    of the VM created.   Where it imports multiple qcow2's it
> >>>    shall inspect all the files and flag disagreements.
> >>>
> >>>    Management layers shall, on creating a qcow2, set the
> >>>    keys based on the VM the qcow2 is created for.  If the qcow2
> >>>    is created as an additional disk for an existing VM, it's
> >>>    fine to leave the string empty (e.g. for a data disk).
> >>
> >> This at least solves the issue of where qemu should store the data (qemu
> >> doesn't care), and how qemu should interpret it (not at all).
> >>
> >> But I really, really, really do not like storing arbitrary data in qcow2
> >> files.  I hated it badly enough when qemu knew what to do with it, but I
> >> hate it even more when even qemu has no idea what to do with it.
> >>
> >> Having a specification of what everything means in the qemu tree makes
> >> things less unbearable, but not to my liking still.
> > 
> > Have you said why you hate it so much?
> > Your hate for it seems to be making a simple solution hard.
> 
> Because it's a disk image format.  Data therein should be relevant to
> the disk image.  I see qcow2 as a representation of data stored on a
> physical storage medium.

What we're missing here is the notes scribbled on the sticky label on
the disc;  you rarely need them on a physical drive in a computer,
LUNs on a SAN don't need them that much because they have a full
filesystem and don't move about much.  Here we're talking about an image
being downloaded or sent between people.

> Some metadata associated directly with that is fine (such as dirty
> bitmaps, backing chains, things like that).  But configuring the whole
> VM seems out of scope to me.
> 
> Also, making qcow2 a filesystem is not a simple solution.
> 
> ...OK, let me back off here, I may be over-interpreting things and
> throwing opinions of different people into one pot.
> 
> Maybe you don't want qcow2 to be a filesystem, and you just want to
> store a single binary blob.  Well, OK, that's not that bad.  But in any
> case, I wouldn't call it a simple solution anymore.
> 
> Yes, storing just the machine type somewhere would be possible with a
> simple solution; but as I said (and the whole thread shows since then),
> this is a slippery slope, and suddenly we arrive at storing arbitrary
> binary data (like images?!) along with MIME types.  That will not be
> possible with a simple solution anymore, I don't think.

Right; I was thinking we were too far down that slope to get rid
of all of those requirements, but I was trying to force it back to
being a single blob as far as QCOW2 saw it.

> >>> --------------------------------------------------------------
> >>>    
> >>>
> >>> Some reasoning:
> >>>    a) I've avoided the problem of when QEMU interprets the value
> >>>       by ignoring it and giving it to management layers at the point
> >>>       of VM import.
> >>
> >> Yes, but in the process you've made it completely opaque to qemu,
> >> basically, which doesn't really make it better for me.  Not that
> >> qemu-specific information in qcow2 files would be what I want, but, well.
> >>
> >> But it does solve technical issues, I concede that.
> >>
> >>>    b) I hate JSON, but there again nailing down a fixed format
> >>>       seems easiest and it makes the job of QCOW easy - a single
> >>>       string.
> >>
> >> Not really.  The string can be rather long, so you probably don't want
> >> to store it in the image header, and thus it's just a binary blob from
> >> qcow2's perspective, essentially.
> > 
> > Yes, but it's a single blob - I'm not asking for multiple keyed blobs
> > or the ability to update individual blobs; just one blob that I can
> > replace.
> 
> OK, you aren't, but others seem to be.
> 
> Or, well, you call it a single blob.  But actually the current ideas
> seem to be to store a rather large configuration tree with binary data
> in that blob, so to me personally there is absolutely no functional
> difference to just storing a tar file in that blob.
> 
> So correct me if I'm wrong, but to me it appears that you effectively
> want to store a filesystem in qcow2.[1]  Well, that's better than making
> qcow2 the filesystem, but it still appears just the wrong way around to me.

It's different in the sense that what we end up with is still a qcow2;
anything that just handles qcow2's and can pass them through doesn't
need to do anything different; users don't need to do anything
different.  No one has to pack/unpack the file.

> [1] Yes, I know that the guest disk already contains an FS. :-P
> 
> >>>       (I would suggest in layer2 that the keys are sorted, but
> >>>       that's a pain to do in some json creators)
> >>>    c) Forcing the registry of keys might avoid silly duplication.
> >>>       We can but hope.
> >>>    d) I've not said it's a libvirt XML file since that seems
> >>>       a bit prescriptive.
> >>>
> >>> Some initial suggested keys:
> >>>
> >>>    "qemu.machine-types": [ "q35", "i440fx" ]
> >>>    "qemu.min-ram-MB": 1024
> >>
> >> I still don't understand why you'd want to put the configuration into
> >> qcow2 instead of the other way around.
> >>
> >> Or why you'd want to use a single file at all, because as this whole
> >> thread shows, a disk image alone is clearly not sufficient to describe a VM.
> >>
> >> (Or it may be in simple cases, but then that's because you don't need
> >> any configuration.)
> > 
> > Because it avoids the unpacking associated with archives.
> 
> I'm not talking about unpacking.  I'm talking about a potentially new
> format which allows accessing the qcow2 file in-place.  It would
> probably be trivial to write a block driver to allow this.
> 
> (And as I wrote in my response to Michal, I suspect that tar could
> actually allow this, even though it would probably not be the ideal format.)

As above, I don't think this is trivial; you have to change all the
layers.  Let's say it was a tar: you'd have to somehow know that you're
importing one of these special tars, you'd also need a tool to
create them, and you'd have to worry about whether the alignment
is correct for the storage/memory you're using it with.
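
To make the alignment concern concrete, here is a minimal Python sketch,
assuming a plain USTAR archive built in memory. The member names
(vm.json, disk.qcow2), the fake image payload and the 64 kB CLUSTER
target are hypothetical choices for illustration only. It looks up where
a member's data starts inside the tar and checks that offset against two
alignments.

```python
import io
import tarfile

CLUSTER = 64 * 1024  # assumed alignment target, not mandated by anything


def add_bytes(tar, name, payload):
    """Add an in-memory payload to an open tar archive."""
    info = tarfile.TarInfo(name=name)
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))


def member_extent(archive, name):
    """Return (offset, size) of a member's data inside the archive bytes."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:") as tar:
        info = tar.getmember(name)
        return info.offset_data, info.size


# Build a toy archive: a small config file followed by a fake image.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
    add_bytes(tar, "vm.json", b'{"qemu.machine-types": ["q35"]}')
    add_bytes(tar, "disk.qcow2", b"QFI\xfb" + bytes(4092))
archive = buf.getvalue()

offset, size = member_extent(archive, "disk.qcow2")
sector_aligned = offset % 512 == 0       # tar always works in 512-byte blocks
cluster_aligned = offset % CLUSTER == 0  # not guaranteed without padding
print(offset, size, sector_aligned, cluster_aligned)
```

The offset and size found this way are the kind of values an in-place
reader would need; anything stricter than 512-byte alignment would have
to be arranged by whatever tool creates the archive.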

Dave

> Max
> 


--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 11:48                         ` Daniel P. Berrangé
  2018-06-06 11:53                           ` Max Reitz
@ 2018-06-06 12:03                           ` Dr. David Alan Gilbert
  2018-06-06 13:15                             ` Max Reitz
  2018-06-06 12:29                           ` Richard W.M. Jones
  2 siblings, 1 reply; 157+ messages in thread
From: Dr. David Alan Gilbert @ 2018-06-06 12:03 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Richard W.M. Jones, Kevin Wolf, qemu-block, Michael S. Tsirkin,
	armbru, qemu-devel, stefanha, Max Reitz

* Daniel P. Berrangé (berrange@redhat.com) wrote:
> On Wed, Jun 06, 2018 at 12:42:28PM +0100, Richard W.M. Jones wrote:
> > On Wed, Jun 06, 2018 at 12:14:07PM +0100, Dr. David Alan Gilbert wrote:
> > > The problem with having a separate file is that you either have to copy
> > > it around with the image or have an archive.  If you have an archive
> > > you have to have an unpacking step which then copies, potentially a lot
> > > of data taking some reasonable amount of time.  Storing a simple bit
> > > of data with the image avoids that.
> > 
> > This isn't really true.  For OVA (ie. tar) we don't unpack them.
> > Adding file.offset and file.size in qemu's raw driver was crucial to
> > that optimization.
> 
> Though that assumes you're only using the qcow2 file in read-only mode.
> As soon as you need write access you need to unpack from the OVA so that
> the qcow2 file can grow its length when new sectors are allocated.

And the person creating the OVA has to do that tarring rather than just
take the qcow2 they've just used in the VM.

Dave

> Regards,
> Daniel
> -- 
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 11:52                           ` Max Reitz
@ 2018-06-06 12:13                             ` Michal Suchánek
  2018-06-06 13:14                               ` Max Reitz
  0 siblings, 1 reply; 157+ messages in thread
From: Michal Suchánek @ 2018-06-06 12:13 UTC (permalink / raw)
  To: Max Reitz
  Cc: Kevin Wolf, Richard W.M. Jones, qemu-devel, stefanha, ehabkost,
	qemu-block, Michael S. Tsirkin

[-- Attachment #1: Type: text/plain, Size: 6873 bytes --]

On Wed, 6 Jun 2018 13:52:35 +0200
Max Reitz <mreitz@redhat.com> wrote:

> On 2018-06-06 13:43, Michal Suchánek wrote:
> > On Wed, 6 Jun 2018 13:32:47 +0200
> > Max Reitz <mreitz@redhat.com> wrote:
> >   
> >> On 2018-06-06 13:19, Michal Suchánek wrote:  
> >>> On Wed, 6 Jun 2018 13:02:53 +0200
> >>> Max Reitz <mreitz@redhat.com> wrote:
> >>>     
> >>>> On 2018-06-06 12:32, Michal Suchánek wrote:    
> >>>>> On Tue, 29 May 2018 12:14:15 +0200
> >>>>> Max Reitz <mreitz@redhat.com> wrote:  
> 
> [...]
> 
> >>>>>> I hate the idea of making qcow2 a random archive format.      
> >>>>>
> >>>>> What's wrong with that?      
> >>>>
> >>>> The fact that qcow2 isn't.
> >>>>
> >>>> From my perspective it would increase the format's complexity to
> >>>> a point where you could just create a new format altogether.
> >>>> Well, actually, all you do is design a filesystem (or reuse an
> >>>> existing one).
> >>>>    
> >>>>>> We have tar for that.      
> >>>>>
> >>>>> It does not support expanding the stored files.      
> >>>>
> >>>> Nor does qcow2, because it does not support storing files at
> >>>> all.    
> >>>
> >>> AFAICT from the previous discussion it already does allow storing
> >>> multiple data streams that can be changed independently so it
> >>> basically is an archive format or filesystem except the streams
> >>> are not named nor easily accessible separately outside of
> >>> qemu.    
> >>
> >> I don't quite understand what you are referring to.  We have
> >> snapshots, we have bitmaps, yes, but all of that are related
> >> directly to the stored guest disk data.
> >>
> >> The only thing we currently have in qcow2 that is opaque is the VM
> >> state that can be stored in snapshots (and don't hold me
> >> responsible for that).  
> > 
> > But it has to be related to the stored disk data only because of
> > your dislike for stuff not related to the disk data. Not for a
> > technical reason. The format can sustain storing unrelated data.  
> 
> Yes.  Technically, you can even store data without qcow2 knowing,
> except that qemu-img check -r leaks will delete it, but yeah.
> 
> >>>> Secondly, that completely depends on how you use it.  You can
> >>>> freely expand the last file in the archive, for instance.  Also
> >>>> I've seen people store files in chunks so they can indeed resize
> >>>> it.
> >>>>
> >>>> (I'm wondering if we could write a block driver that could
> >>>> provide such a chunk allocation transparently to qcow2...  Note
> >>>> that a qcow2 file does not need to be continuous, so you could
> >>>> in theory indeed store the qcow2 file and its data in completely
> >>>> separate places in a tar file.)    
> >>>
> >>> Which basically invents another new filesystem on top of tar for
> >>> no good reason. Especially when we have already support for
> >>> storage format that is capable enough.    
> >>
> >> No different from inventing a filesystem on top of qcow2.
> >>
> >> I don't think qcow2 is any more capable than tar.  
> > 
> > It can natively resize the objects it stores.  
> 
> It does not store arbitrary objects.  It stores a guest disk (and
> snapshots), and bitmaps.  These are stored in directory structures so
> new chunks can be dynamically added.  Note that both use slightly
> different structures.
> 
> Yes, it is technically possible to use a similar structure to store
> arbitrary objects.  But currently it does not have that capability.
> 
> >                                                tar cannot do that
> > so you will have to store random nonsense in the tar archive that
> > only makes sense for your fs/tar driver.  
> 
> Right.  But what's the difference to storing random nonsense in a
> qcow2 file that does not even make sense to qemu?
> 
> >>>> What I'm trying to get at is that qcow2 was not designed to be a
> >>>> container format for arbitrary files.  If you want to make it
> >>>> such, I'm sure there are existing formats that work better.    
> >>>
> >>> Such as?    
> >>
> >> ext2?  
> > 
> > So you want an ext2 driver in qemu instead of expanding qcow2 to
> > work not only for a single disk but also for an appliance?  
> 
> Yes, because ext2 was designed to be a proper filesystem.  I'm not an
> FS designer.  Well, not a good one anyway.  So I don't trust myself on
> extending qcow2 to be a good FS -- and why would I, when there are
> already numerous FS around.

Do you expect that the performance of qemu using the qcow2 driver over an
ext2 driver will be better than using the qcow2 driver directly with some
part semi-permanently occupied by a configuration blob? My bet is that it
will not.

The ext* drivers are designed to work with kernel VM infrastructure
which must be tuned for different usage scenarios and you would have to
duplicate that tuning in qemu to get competitive performance. Also you
get qcow2 and ext2 metadata which must be allocated, managed, etc. You
get more storage and performance overhead for no good reason.

On the other hand, qcow2 is designed for storing VM disk data and
hopefully was tuned to do that decently over the years. The primary use
case remains storing VM disk data. Adding a configuration blob does not
change that.


> >>>>>> Unless I have got something terribly wrong (which is indeed a
> >>>>>> possibility!), to me this proposal means basically to turn
> >>>>>> qcow2 into (1) a VM description format for qemu, and (2) to
> >>>>>> turn it into an archive format on the way.      
> >>>>>
> >>>>> And if you go all the way you can store multiple disks along
> >>>>> with the VM definition so you can have the whole appliance in
> >>>>> one file. It conveniently solves the problem of synchronizing
> >>>>> snapshots across multiple disk images and the question where to
> >>>>> store the machine state if you want to suspend it.       
> >>>>
> >>>> Yeah, but why make qcow2 that format?  That's what I completely
> >>>> fail to understand.
> >>>>
> >>>> If you want to have a single VM description file that contains
> >>>> the VM configuration and some qcow2/raw/whatever files along
> >>>> with it for the guest disk data, sure, go ahead.  But why does
> >>>> the format of the whole thing need to be qcow2?    
> >>>
> >>> Because then qemu can access the disk data from the image directly
> >>> without any need for extraction, copying to different file,
> >>> etc.    
> >>
> >> This does not explain why it needs to be qcow2.  There is
> >> absolutely no reason why you couldn't use qcow2 files in-place
> >> inside of another file.  
> > 
> > qemu cannot read the disk data from the file in-place.  
> 
> Hu?  Why not?

Well, it can possibly read the image if it happens to be contiguous. It
will not be able to update it without an fs driver, however.

Thanks

Michal



* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 11:44                           ` Max Reitz
@ 2018-06-06 12:16                             ` Dr. David Alan Gilbert
  2018-06-06 13:22                               ` Max Reitz
  2018-06-06 13:42                             ` [Qemu-devel] " Eduardo Habkost
  2018-06-06 14:46                             ` Michael S. Tsirkin
  2 siblings, 1 reply; 157+ messages in thread
From: Dr. David Alan Gilbert @ 2018-06-06 12:16 UTC (permalink / raw)
  To: Max Reitz
  Cc: Kevin Wolf, ehabkost, qemu-block, Michael S. Tsirkin,
	Richard W.M. Jones, qemu-devel, stefanha, Michal Suchánek

* Max Reitz (mreitz@redhat.com) wrote:
> On 2018-06-06 13:37, Dr. David Alan Gilbert wrote:
> > * Max Reitz (mreitz@redhat.com) wrote:
> >> On 2018-06-06 13:19, Michal Suchánek wrote:
> >>> On Wed, 6 Jun 2018 13:02:53 +0200
> >>> Max Reitz <mreitz@redhat.com> wrote:
> >>>
> >>>> On 2018-06-06 12:32, Michal Suchánek wrote:
> >>>>> On Tue, 29 May 2018 12:14:15 +0200
> >>>>> Max Reitz <mreitz@redhat.com> wrote:
> 
> [...]
> 
> >>>>>> Unless I have got something terribly wrong (which is indeed a
> >>>>>> possibility!), to me this proposal means basically to turn qcow2
> >>>>>> into (1) a VM description format for qemu, and (2) to turn it into
> >>>>>> an archive format on the way.  
> >>>>>
> >>>>> And if you go all the way you can store multiple disks along with
> >>>>> the VM definition so you can have the whole appliance in one file.
> >>>>> It conveniently solves the problem of synchronizing snapshots across
> >>>>> multiple disk images and the question where to store the machine
> >>>>> state if you want to suspend it.   
> >>>>
> >>>> Yeah, but why make qcow2 that format?  That's what I completely fail
> >>>> to understand.
> >>>>
> >>>> If you want to have a single VM description file that contains the VM
> >>>> configuration and some qcow2/raw/whatever files along with it for the
> >>>> guest disk data, sure, go ahead.  But why does the format of the whole
> >>>> thing need to be qcow2?
> >>>
> >>> Because then qemu can access the disk data from the image directly
> >>> without any need for extraction, copying to different file, etc.
> >>
> >> This does not explain why it needs to be qcow2.  There is absolutely no
> >> reason why you couldn't use qcow2 files in-place inside of another file.
> > 
> > Because then we'd have to change the whole stack to take advantage of
> > that.  Adding a feature into qcow2 means nothing else changes.
> 
> Because it's a hack, right.  Storing binary data in a qcow2 file,
> completely ignoring it in qemu (and being completely unusable to any
> potential other users of the qcow2 format[1]) and only interpreting it
> somewhere up the stack is a hack.

It's not a hack!
Seriously it's not.
There's nothing wrong with it being aimed higher up the stack than qemu,
the problem we started off with was what happens when a user downloads
a VM image and tries to import it into their VM system; we've already
got 2+ layers of management stuff in there - I want the information to
guide those layers, not form a complete set of configuration.

> That is not necessarily a negative point, hacks can work wonderfully
> well, and they usually are simple, that is correct.  But the thing is
> that I feel like people have grand visions of what to get out of this.
> Imagine, a single file that can configure all and any VM!
> 
> But hacks usually only solve a single issue.  Once you try to extend a
> hack, it breaks down and becomes insufficient.
> 
> If we want a grand vision where a single file stores the whole VM, why
> not invest the work and make it right from the start?

Because we won't get it right; however much we bikeshed about it
we'll just end up with a mess.   The right thing is to put in something
to hold configuration and then review the items of configuration we
add properly as we define them.

> Max
> 
> [1] Yes, I concede that there are probably no other users of qcow2.  But
> please forgive me for assuming that qcow2 was in a sense designed to be
> a rather general image format that not only qemu could use.

What makes it QEMU specific?  It's basically just the same key/value
setup as OVA, except putting them inside the qcow2.
We could use the same key/value definitions as OVA in the blob,
although their definitions aren't very portable either.
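
As a rough illustration of how little a management layer would need to
do with such a blob, here is a minimal Python sketch of the rules from
the earlier proposal in this thread: a JSON object, non-empty unique
keys, keys starting with "qemu." or a reverse dotted name, and
disagreements flagged across several images. The function names, the
exact prefix regex and the "com.redhat.example" key are hypothetical;
nothing like this exists in qemu.

```python
import json
import re

# Hypothetical reading of "reverse dotted name": at least two dotted
# components (e.g. "com.redhat.") followed by the key's own name.
VENDOR_PREFIX = re.compile(r"^[a-z0-9-]+(\.[a-z0-9-]+)+\.")


def reject_duplicates(pairs):
    """object_pairs_hook that refuses repeated keys in a JSON object."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key: {key!r}")
        seen[key] = value
    return seen


def parse_blob(text):
    """Parse one image's blob and check the key-naming rules."""
    data = json.loads(text, object_pairs_hook=reject_duplicates)
    if not isinstance(data, dict):
        raise ValueError("blob must be a JSON object")
    for key in data:
        if not key:
            raise ValueError("empty key")
        if not (key.startswith("qemu.") or VENDOR_PREFIX.match(key)):
            raise ValueError(f"key lacks a registered prefix: {key!r}")
    return data


def disagreements(blobs):
    """Return keys that carry different values in different blobs."""
    values = {}
    for blob in blobs:
        for key, value in blob.items():
            seen = values.setdefault(key, [])
            if value not in seen:
                seen.append(value)
    return {key: seen for key, seen in values.items() if len(seen) > 1}


disk0 = parse_blob(
    '{"qemu.machine-types": ["q35", "i440fx"], "qemu.min-ram-MB": 1024}'
)
disk1 = parse_blob('{"qemu.machine-types": ["q35"], "com.redhat.example": "x"}')
conflicts = disagreements([disk0, disk1])
print(conflicts)
```

Whether two differing lists count as a real conflict or should be
intersected is a policy question for the management layer; the sketch
only reports that the values differ.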

Dave



--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 11:48                         ` Daniel P. Berrangé
  2018-06-06 11:53                           ` Max Reitz
  2018-06-06 12:03                           ` Dr. David Alan Gilbert
@ 2018-06-06 12:29                           ` Richard W.M. Jones
  2 siblings, 0 replies; 157+ messages in thread
From: Richard W.M. Jones @ 2018-06-06 12:29 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Dr. David Alan Gilbert, Kevin Wolf, qemu-block,
	Michael S. Tsirkin, qemu-devel, armbru, stefanha, Max Reitz

On Wed, Jun 06, 2018 at 12:48:17PM +0100, Daniel P. Berrangé wrote:
> On Wed, Jun 06, 2018 at 12:42:28PM +0100, Richard W.M. Jones wrote:
> > On Wed, Jun 06, 2018 at 12:14:07PM +0100, Dr. David Alan Gilbert wrote:
> > > The problem with having a separate file is that you either have to copy
> > > it around with the image or have an archive.  If you have an archive
> > > you have to have an unpacking step which then copies, potentially a lot
> > > of data taking some reasonable amount of time.  Storing a simple bit
> > > of data with the image avoids that.
> > 
> > This isn't really true.  For OVA (ie. tar) we don't unpack them.
> > Adding file.offset and file.size in qemu's raw driver was crucial to
> > that optimization.
> 
> Though that assumes you're only using the qcow2 file in read-only mode.
> As soon as you need write access you need to unpack from the OVA so that
> the qcow2 file can grow its length when new sectors are allocated.

Sure but you cannot write to an OVA anyway because it contains
embedded checksums.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html


* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 12:00                         ` Dr. David Alan Gilbert
@ 2018-06-06 12:59                           ` Max Reitz
  2018-06-06 14:31                             ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 157+ messages in thread
From: Max Reitz @ 2018-06-06 12:59 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Kevin Wolf, qemu-block, Michael S. Tsirkin, armbru, qemu-devel,
	Richard W.M. Jones, stefanha

[-- Attachment #1: Type: text/plain, Size: 18317 bytes --]

On 2018-06-06 14:00, Dr. David Alan Gilbert wrote:
> * Max Reitz (mreitz@redhat.com) wrote:
>> On 2018-06-06 13:14, Dr. David Alan Gilbert wrote:
>>> * Max Reitz (mreitz@redhat.com) wrote:
>>>> On 2018-06-05 11:21, Dr. David Alan Gilbert wrote:
>>>>> <reawakening a fizzled out thread>
>>>>>
>>>>> This seems to have fizzled out because of a lack of a concrete proposal;
>>>>> so here is one based on a reply to Max's post:
>>>>>
>>>>> * Max Reitz (mreitz@redhat.com) wrote:
>>>>>
>>>>> <snip>
>>>>>
>>>>>> The original problem was that you need to supply a machine type to qemu,
>>>>>> and that multiple common architectures now have multiple machine types
>>>>>> and not necessarily all work with a single image.  So far so good, but I
>>>>>> have two issues here already:
>>>>>>
>>>>>> (1) How is qemu supposed to interpret that information?  If it's stored
>>>>>> in the image file, I don't see a nice way of retrieving it before the
>>>>>> machine is initialized, at least not with qemu's current architecture.
>>>>>
>>>>> <snip>
>>>>>
>>>>>> (2) Again, I personally just really don't like saving such information
>>>>>> in a disk image.  One actual argument I can bring up for that distaste
>>>>>> is this: Suppose, you have multiple images attached to your VM.  Now the
>>>>>> VM wants to store the machine type.  Where does it go?  Into all of
>>>>>> them?
>>>>>
>>>>> <snip>
>>>>>
>>>>>> So I think if we decide to store the machine type, that is kind of a
>>>>>> slippery slope and then there are good arguments for storing even more
>>>>>> configuration options in the file, too.  But I really, really don't like
>>>>>> that.
>>>>>
>>>>> <snip>
>>>>>
>>>>>> For another, how do we store the data?  key-value seems wrong if we want
>>>>>> to store everything.  JSON might be fine.  But eventually we just want
>>>>>> basically a qemu configuration file in there, I would think (which may
>>>>>> support JSON at some point?).   So basically we would store the data as
>>>>>> a binary blob and let the rest of qemu do its thing with it.  But then
>>>>>> please tell me why I fought so valiantly against storing random bitmaps
>>>>>> in qcow2 files.  I hate the idea of making qcow2 a random archive
>>>>>> format.  We have tar for that.
>>>>>
>>>>> <snip>
>>>>>
>>>>>> tl;dr: I really don't get why it's so hard to supply a config file along
>>>>>> with a qcow2 image.  Is it so hard for people to realize that a VM does
>>>>>> not only consist of a disk?
>>>>>
>>>>> Yes! Because in many cases that's all it needs, and it's ready to run
>>>>> with no unpacking.
>>>>
>>>> It clearly is not, or we would not have this discussion.
>>>>
>>>> The disk image is only enough if you want the default values for all of
>>>> qemu's configuration options, because today (and if I were to decide, in
>>>> the future, too) disk images do not configure the VM (well, they
>>>> configure the guest, but not the VM itself).
>>>
>>> The problem with having a separate file is that you either have to copy
>>> it around with the image 
>>
>> Which is just an inconvenience.
> 
> It's more than that;  if it's a separate file then the tools can't
> rely on users supplying it, and frankly they won't and they'll still
> just supply an image.

At which point you throw an error and tell them to specify the config file.

>> I understand it is an inconvenience and it would be nice to change it,
>> but please understand that I do not want qcow2 to become a filesystem
>> just to relieve an inconvenience.
> 
> I very much don't want it to be a filesystem; my reason for writing
> down my spec the way I did was to make it clear that the only
> thing I want of qcow2 is a single blob, no more; I don't want naming
> of the blob or anything else.
> 
>> (Note: I understand that you may not want qcow2 to become a filesystem,
>> but I do get the impression from others.)
> 
> My aim was to specify it to fulfill the requirements that everyone
> else had asked for, but still only having one unmodifiable blob in qcow.
> 
>>>                           or have an archive. If you have an archive
>>> you have to have an unpacking step which then copies, potentially a lot
>>> of data taking some reasonable amount of time.
>>
>> I'm sure this can be optimized, but yes, I get that.
>>
>> (If you use e.g. tar and store the image data starting on an FS cluster
>> boundary (64 kB should be more than sufficient), I assume there is a way
>> to extract that data into a new file without copying anything.)
> 
> But then we have to modify all the current things that know how to
> handle a qcow2.

Not in this case because it'd still be a flat qcow2 file in a simple tar
archive.

But you're right if we had a more complex format (like chunks stored in
a tar file).

>>>                                                 Storing a simple bit
>>> of data with the image avoids that.
>>
>> It is not a simple bit of data, as evidenced by the discussion about
>> storing binary blobs and MIME types going on.
> 
> All of the things they've suggested can be done inside that one blob;
> even inside the json (or any other structure in that blob).

Right, from qcow2's perspective it's a blob of data.  But you can put a
whole filesystem into a blob of data, and I get the impression that this
is what some are trying to do.

Once we store larger amounts of binary data in that blob (which is what
I'm fearing from comments on MIME types and PNG images), people will
realize that always having to re-store the whole blob if you modify
something in the middle is inefficient and that it needs to be
optimized.  I don't think you want to do that, but we haven't
implemented any of this yet and people are already asking for such
binary data inside of the blob.

I suspect it'll only get worse over time.

I think the most difficult thing about this discussion is that there are
different targets.

You just want to store a bit of information.  OK, good, but then I'd say
we could even just prepend that to the image file in a small header.

(Note that extending that header would not even be too complicated,
because you can easily move the qcow2 header somewhere else.  Say you
move it back by one cluster (e.g. 64 kB), then you just put the cluster
that was there originally to the end of the file, which is pretty much
trivial.  Then you copy that original data there and overwrite it with
the image header.  Done.)

Others want to store more binary data.  Then this may get inefficient
and insufficient.  But I'd think at this point it gets really
problematic to put the data into the qcow2 file because it really
doesn't belong there.  (I can't imagine anything that would warrant a
MIME type.)

Then I've heard proposals of storing multiple disk images.  Yes, you
could store multiple disks inside of a single qcow2 file, but it would
be basically exactly the same as storing just multiple qcow2 files, so...

And really, I still believe in my slippery slope argument, which means
that even if you just want to innocently store a machine type, we will
end up with something vastly more complex in the end.

Finally, it appears to me that you have a simple problem, found one
possible solution, and now you just focus on that solution instead of
taking a step back and looking at the problem again.

The problem: You want to store a binary blob and a disk image together.

Your solution: qcow2 has refcounting and thus "occupation bits".  You
can put data into it and it will leave it alone, as long as that area is
marked as occupied.  Let's put the data into the qcow2 file.

OK, let's look at the problem and its constraints again.

Hard constraint: Store a single file.
(I don't think this is a hard constraint, because I haven't been
convinced yet that handling more than a single file is so bad.)

Soft constraint: Max doesn't like storing blobs in qcow2.

So one solution is to ignore the soft constraint.  OK, valid solution, I
give you that.  But it doesn't leave me content, probably understandably so.


So let me try to understand how we end up with qcow2 as a result...  We
need a single file that needs to contain both the disk data and a binary
blob.  Or, well, even better would be if that file can store multiple
arbitrary objects, in a format of your choosing, but that makes things
more complicated, so let's leave that off for now.

So all you need is object storage (probably with a single root object
that references the rest in a custom format) and a way to tell which
areas of the file are occupied.  Now the issue is that both the disk
image and the blob may grow.  So both need mutual understanding of which
areas are occupied and which can be used for growth.  For the disk
image, the block layer would definitely need a driver to handle that,
which is not impossible.  But qcow2 would automatically handle it.

So, OK, for now this is my result.  If we create a new format, we'd need
a block driver for it (underneath qcow2) that handles the allocation.
With qcow2, we'd get it for free.


Hm, OK.

The simplest implementation for such an additional layer would get away
without actual occupation bits and just always allocate new storage at
the end of the file.  That should be sufficient, it would be quick and
not very complex.  But I see that it is additional complexity when
compared with just adding the blob to qcow2.
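
A minimal Python sketch of that "always allocate at the end" idea,
assuming 64 kB clusters and an in-memory file. The class name and its
methods are hypothetical and only model the bookkeeping, not a real
block driver: each object (disk image, blob) is a list of cluster
offsets, and growth never reuses or frees space.

```python
import io

CLUSTER = 64 * 1024  # assumed allocation unit


class AppendOnlyContainer:
    """Toy container: every object grows by taking new clusters at EOF."""

    def __init__(self, backing):
        self.backing = backing
        self.extents = {}  # object name -> list of cluster offsets

    def _allocate(self):
        """Reserve one zero-filled cluster at the end of the file."""
        end = self.backing.seek(0, io.SEEK_END)
        offset = -(-end // CLUSTER) * CLUSTER  # round up to a cluster boundary
        self.backing.seek(offset)
        self.backing.write(bytes(CLUSTER))
        return offset

    def grow(self, name, clusters=1):
        """Extend the named object; no occupation bitmap is consulted."""
        chunks = self.extents.setdefault(name, [])
        for _ in range(clusters):
            chunks.append(self._allocate())
        return chunks


store = AppendOnlyContainer(io.BytesIO())
store.grow("disk", 2)  # disk image takes the first two clusters
store.grow("blob", 1)  # the blob lands right after it
store.grow("disk", 1)  # later disk growth simply goes after the blob
print(store.extents)
```

The price of having no occupation bits is visible here: the objects end
up interleaved, and space is never reclaimed when one of them shrinks or
is rewritten.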


Well, in a sense, because we'd need block layer interfaces for
extracting the information from a qcow2 file through qemu-img.  So maybe
adding another block driver would actually mean less complexity...


[...]

>>>> But I really, really, really do not like storing arbitrary data in qcow2
>>>> files.  I hated it badly enough when qemu knew what to do with it, but I
>>>> hate it even more when even qemu has no idea what to do with it.
>>>>
>>>> Having a specification of what everything means in the qemu tree makes
>>>> things less unbearable, but not to my liking still.
>>>
>>> Have you said why you hate it so much?
>>> Your hate for it seems to be making a simple solution hard.
>>
>> Because it's a disk image format.  Data therein should be relevant to
>> the disk image.  I see qcow2 as a representation of data stored on a
>> physical storage medium.
> 
> What we're missing here is the notes scribbled on the sticky label on
> the disc;  you rarely need them on a physical drive in a computer,
> LUNs on a SAN don't need them that much because they have a full
> filesystem and don't move about much.  Here we're talking about an image
> being downloaded or sent between people.

Well, qcow2 doesn't even describe the device type, so the sticky label
may be off limits.

But really, if you create a VM, you need a configuration.  Like if you
set up a new computer, you need to know what you want.  Usually there is
no sticky label, but you just have to know and input it manually.  Maybe
you have a sheet of paper, which I'd call the configuration file.

>> Some metadata associated directly with that is fine (such as dirty
>> bitmaps, backing chains, things like that).  But configuring the whole
>> VM seems out of scope to me.
>>
>> Also, making qcow2 a filesystem is not a simple solution.
>>
>> ...OK, let me back off here, I may be over-interpreting things and
>> throwing opinions of different people into one pot.
>>
>> Maybe you don't want qcow2 to be a filesystem, and you just want to
>> store a single binary blob.  Well, OK, that's not that bad.  But in any
>> case, I wouldn't call it a simple solution anymore.
>>
>> Yes, storing just the machine type somewhere would be possible with a
>> simple solution; but as I said (and the whole thread shows since then),
>> this is a slippery slope, and suddenly we arrive at storing arbitrary
>> binary data (like images?!) along with MIME types.  That will not be
>> possible with a simple solution anymore, I don't think.
> 
> Right; I was thinking we were too far down that slope to get rid
> of all of those requirements, but I was trying to force it back to
> being a single blob as far as QCOW2 saw it.

A valiant effort, but I myself cannot see why we should forbid storing
more data once we started storing some data.  I myself do think that if
we store some VM configuration, we should be able to store all of it,
and allow for arbitrarily complex scenarios.

>>>>> --------------------------------------------------------------
>>>>>    
>>>>>
>>>>> Some reasoning:
>>>>>    a) I've avoided the problem of when QEMU interprets the value
>>>>>       by ignoring it and giving it to management layers at the point
>>>>>       of VM import.
>>>>
>>>> Yes, but in the process you've made it completely opaque to qemu,
>>>> basically, which doesn't really make it better for me.  Not that
>>>> qemu-specific information in qcow2 files would be what I want, but, well.
>>>>
>>>> But it does solve technical issues, I concede that.
>>>>
>>>>>    b) I hate JSON, but there again nailing down a fixed format
>>>>>       seems easiest and it makes the job of QCOW easy - a single
>>>>>       string.
>>>>
>>>> Not really.  The string can be rather long, so you probably don't want
>>>> to store it in the image header, and thus it's just a binary blob from
>>>> qcow2's perspective, essentially.
>>>
>>> Yes, but it's a single blob - I'm not asking for multiple keyed blobs
>>> or the ability to update individual blobs; just one blob that I can
>>> replace.
>>
>> OK, you aren't, but others seem to be.
>>
>> Or, well, you call it a single blob.  But actually the current ideas
>> seem to be to store a rather large configuration tree with binary data
>> in that blob, so to me personally there is absolutely no functional
>> difference to just storing a tar file in that blob.
>>
>> So correct me if I'm wrong, but to me it appears that you effectively
>> want to store a filesystem in qcow2.[1]  Well, that's better than making
>> qcow2 the filesystem, but it still appears just the wrong way around to me.
> 
> It's different in the sense that what we end up with is still a qcow2;
> anything that just handles qcow2's and can pass them through doesn't
> need to do anything different; users don't need to do anything
> different.  No one has to pack/unpack the file.

Packing/unpacking is a strawman because I'm doing my best to give
proposals that completely avoid that.

Users do need to do something different, because users do need to
realize that today there is no way to store VM configuration and disk
data in a single file.  So if they already start VMs just based on a
disk, then they are assuming behavior we do not have and that I'd call
naive.  But that is a strawman from my side, sorry.  Keeping naive users
happy is probably OK.

Keeping tools working is a good argument, but I'm not exactly sure what
the use cases are.  What I'd want is that in the end we have a way of
configuring a whole VM in a single file.[1]  Then, that file is no
longer just a disk image, it is a whole VM.  So maybe those tools need
to be adjusted anyway.

I assume that we have tools that work on disk images, and we trivially
want to keep them working on that VM's disk image without having to
incorporate a block layer.  Depending on the format we choose, that may
be very simple (maybe just use an offset for the qcow2 header).
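
As a rough illustration of "just use an offset for the qcow2 header",
the sketch below probes for a qcow2 header at a given offset inside a
larger file.  The 4096-byte description area and the function names are
assumptions made for the example; only the header field layout (magic,
version, backing file offset and size, cluster bits, virtual size, all
big-endian) follows the qcow2 format.

```python
import io
import struct

QCOW2_MAGIC = b"QFI\xfb"


def probe_qcow2_at(f, offset):
    """Return basic qcow2 header fields found at `offset`, or None."""
    f.seek(offset)
    header = f.read(32)
    if len(header) < 32 or header[:4] != QCOW2_MAGIC:
        return None
    # Fields after the magic: version (u32), backing file offset (u64),
    # backing file name size (u32), cluster bits (u32), virtual size (u64).
    version, _backing_off, _backing_size, cluster_bits, virtual_size = (
        struct.unpack(">IQIIQ", header[4:32])
    )
    return {
        "version": version,
        "cluster_bits": cluster_bits,
        "virtual_size": virtual_size,
    }


def demo():
    # Hypothetical container: 4096 bytes of VM description, then the image.
    description = b" " * 4096
    fake_header = struct.pack(">4sIQIIQ", QCOW2_MAGIC, 3, 0, 0, 16, 1 << 30)
    f = io.BytesIO(description + fake_header)
    return probe_qcow2_at(f, 0), probe_qcow2_at(f, 4096)
```

A tool that only understands plain qcow2 would need nothing more than
that offset to keep working on the embedded image, at least for reads.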

But if we want to store a whole VM in a single file, then storing
multiple disk images in that single file does not seem too far off to
me, and that would mean breaking those tools anyway.

[1] I still don't quite see the point, because just using more than a
single file is so much easier.

>> [1] Yes, I know that the guest disk already contains an FS. :-P
>>
>>>>>       (I would suggest in layer2 that the keys are sorted, but
>>>>>       that's a pain to do in some json creators)
>>>>>    c) Forcing the registry of keys might avoid silly duplication.
>>>>>       We can but hope.
>>>>>    d) I've not said it's a libvirt XML file since that seems
>>>>>       a bit prescriptive.
>>>>>
>>>>> Some initial suggested keys:
>>>>>
>>>>>    "qemu.machine-types": [ "q35", "i440fx" ]
>>>>>    "qemu.min-ram-MB": 1024
>>>>
>>>> I still don't understand why you'd want to put the configuration into
>>>> qcow2 instead of the other way around.
>>>>
>>>> Or why you'd want to use a single file at all, because as this whole
>>>> thread shows, a disk image alone is clearly not sufficient to describe a VM.
>>>>
>>>> (Or it may be in simple cases, but then that's because you don't need
>>>> any configuration.)
>>>
>>> Because it avoids the unpacking associated with archives.
>>
>> I'm not talking about unpacking.  I'm talking about a potentially new
>> format which allows accessing the qcow2 file in-place.  It would
>> probably be trivial to write a block driver to allow this.
>>
>> (And as I wrote in my response to Michal, I suspect that tar could
>> actually allow this, even though it would probably not be the ideal format.)
> 
> As above, I don't think this is trivial; you have to change all the
> layers;  let's say it was a tar; you'd have to somehow know that you're
> importing one of these special tars,

Which is trivial because it's just "Hey, look, it's a tar with that
description file".

>                                      you also have to have a tool to
> create them;

Also trivial.  Non-trivial is modifying them.

The workflow would be to create the tar with an empty qcow2 file, the VM
description you want, and then just using it.

Yes, using it is more difficult, but it wouldn't be a separate tool, it would
be built into qemu.  I can't say how difficult that implementation would
be, but it would not be trivial, that is correct.

>              and you have to worry about whether that alignment
> is correct for the storage/memory you're using it with.

Which would be difficult with tar, right.  But we don't have to use tar.

(And, no, I don't think creating a new container format is not worse for
interoperability than adding a blob to qcow2.)

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 12:13                             ` Michal Suchánek
@ 2018-06-06 13:14                               ` Max Reitz
  2018-06-06 13:45                                 ` Michal Suchánek
  0 siblings, 1 reply; 157+ messages in thread
From: Max Reitz @ 2018-06-06 13:14 UTC (permalink / raw)
  To: Michal Suchánek
  Cc: Kevin Wolf, Richard W.M. Jones, qemu-devel, stefanha, ehabkost,
	qemu-block, Michael S. Tsirkin

[-- Attachment #1: Type: text/plain, Size: 5965 bytes --]

On 2018-06-06 14:13, Michal Suchánek wrote:
> On Wed, 6 Jun 2018 13:52:35 +0200
> Max Reitz <mreitz@redhat.com> wrote:
> 
>> On 2018-06-06 13:43, Michal Suchánek wrote:
>>> On Wed, 6 Jun 2018 13:32:47 +0200
>>> Max Reitz <mreitz@redhat.com> wrote:
>>>   
>>>> On 2018-06-06 13:19, Michal Suchánek wrote:  
>>>>> On Wed, 6 Jun 2018 13:02:53 +0200
>>>>> Max Reitz <mreitz@redhat.com> wrote:

[...]

>>>>>> What I'm trying to get at is that qcow2 was not designed to be a
>>>>>> container format for arbitrary files.  If you want to make it
>>>>>> such, I'm sure there are existing formats that work better.    
>>>>>
>>>>> Such as?    
>>>>
>>>> ext2?  
>>>
>>> So you want an ext2 driver in qemu instead of expanding qcow2 to
>>> work not only for a single disk but also for an appliance?  
>>
>> Yes, because ext2 was designed to be a proper filesystem.  I'm not an
>> FS designer.  Well, not a good one anyway.  So I don't trust myself on
>> extending qcow2 to be a good FS -- and why would I, when there are
>> already numerous FS around.
> 
> Do you expect that performance of qemu using qcow2 driver over ext2
> driver will be better than using qcow driver directly with some part
> semi-permanently occupied by a configuration blob? My bet is not.

If you want to store multiple disk images in a single file?  I would
think so, yes.  With qcow2, I would assume it leads to fragmentation.  I
would hope that proper filesystems can mitigate this.

> The ext* drivers are designed to work with kernel VM infrastructure
> which must be tuned for different usage scenarios and you would have to
> duplicate that tuning in qemu to get competitive performance. Also you
> get qcow2 and ext2 metadata which must be allocated, managed, etc. You
> get more storage and performance overhead for no good reason.

Yes, there is a good reason.  You can add arbitrary configuration
options without having to worry about me.

Seriously, though, a real FS would allow you to be more expressive and
really do what you want without having to work around the quirks that
adding a not-real-FS in the most simple way possible to qcow2 would
bring with it.

Because this is part of my fear, that we now add a very simple blob for
just a sprinkle of data.  But over time it gets more and more complex
because we want to store more and more data to make things ever more
convenient[1], we notice that we need more features, the format gets
more complex, and in the end we have an FS that is just worse than a
real FS.

[1] And note that if I'm convinced to store VM configuration data in
qcow2, I will agree that we can store any data in there and it would be
nice if any VM could be provisioned and used that way.

> On the other hand, qcow is designed for storing VM disk data and
> hopefully was tuned to do that decently over the years. The primary use
> case remains storing VM disk data. Adding a configuration blob does not
> change that.

True.  So the argument is that qcow2 may be worse for storing arbitrary
data, but we don't have performance requirements for that; but we do
have performance requirements for disk data and adding another format
below qcow2 will not make it better.

I do think it is possible to not make things worse with a format under
qcow2, but that may require additional complexity, that you think is
pointless.

I understand that you think that, but I still believe that putting the
configuration into qcow2 is just the wrong way around and will come back
to bite us in the long run.

>>>>>>>> Unless I have got something terribly wrong (which is indeed a
>>>>>>>> possibility!), to me this proposal means basically to turn
>>>>>>>> qcow2 into (1) a VM description format for qemu, and (2) to
>>>>>>>> turn it into an archive format on the way.      
>>>>>>>
>>>>>>> And if you go all the way you can store multiple disks along
>>>>>>> with the VM definition so you can have the whole appliance in
>>>>>>> one file. It conveniently solves the problem of synchronizing
>>>>>>> snapshots across multiple disk images and the question where to
>>>>>>> store the machine state if you want to suspend it.       
>>>>>>
>>>>>> Yeah, but why make qcow2 that format?  That's what I completely
>>>>>> fail to understand.
>>>>>>
>>>>>> If you want to have a single VM description file that contains
>>>>>> the VM configuration and some qcow2/raw/whatever files along
>>>>>> with it for the guest disk data, sure, go ahead.  But why does
>>>>>> the format of the whole thing need to be qcow2?    
>>>>>
>>>>> Because then qemu can access the disk data from the image directly
>>>>> without any need for extraction, copying to different file,
>>>>> etc.    
>>>>
>>>> This does not explain why it needs to be qcow2.  There is
>>>> absolutely no reason why you couldn't use qcow2 files in-place
>>>> inside of another file.  
>>>
>>> qemu cannot read the disk data from the file in-place.  
>>
>> Huh?  Why not?
> 
> Well, it can possibly read the image if it happens to be contiguous. It
> will not be able to update it without a fs driver, however.

Yes, but first, such an FS driver would be possible (and as long as we
don't need real complexity, it could be very simple, like just using an
offset in a tar file and then just adjust the file length field on
allocations beyond the EOF).
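
A sketch of the read side of that tar idea, using Python's standard
tarfile module.  The member names and the "machine" key are made up for
the example.  It only locates the embedded image (offset and length);
growing the image past EOF would additionally require rewriting the
size field in the member's tar header and handling the end-of-archive
blocks, which is not attempted here.

```python
import io
import tarfile


def build_tar(members):
    """Create an uncompressed tar in memory from (name, payload) pairs."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, payload in members:
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def locate_member(archive_bytes, name):
    """Return (data offset, size) of a member inside the tar."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r") as tar:
        info = tar.getmember(name)
        return info.offset_data, info.size


def demo():
    disk = b"QFI\xfb" + b"\0" * 60  # stand-in for a real qcow2 image
    archive = build_tar([
        ("vm.json", b'{"machine": "q35"}'),  # hypothetical description file
        ("disk.qcow2", disk),
    ])
    offset, size = locate_member(archive, "disk.qcow2")
    # The image bytes are readable in place, without unpacking anything.
    return archive[offset:offset + size] == disk, offset % 512
```

The (offset, size) pair is the kind of information an offset-based
block driver would need in order to use the image where it sits.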

And secondly, I think adding another format has the advantage of easier
deprecation.  If we think we need something more complex, we are free to
design that and throw away the old format.  But if we add something to
qcow2, I would think it is there to stay.

So, yes, for qcow2 we might want to design something (overly?) complex
from the start that we hope will fulfill all our needs (which it won't,
because things never turn out that way).  But if we'd add a new format,
we could keep it simple in the beginning and start over later.

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 12:03                           ` Dr. David Alan Gilbert
@ 2018-06-06 13:15                             ` Max Reitz
  0 siblings, 0 replies; 157+ messages in thread
From: Max Reitz @ 2018-06-06 13:15 UTC (permalink / raw)
  To: Dr. David Alan Gilbert, Daniel P. Berrangé
  Cc: Richard W.M. Jones, Kevin Wolf, qemu-block, Michael S. Tsirkin,
	armbru, qemu-devel, stefanha

[-- Attachment #1: Type: text/plain, Size: 1291 bytes --]

On 2018-06-06 14:03, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrangé (berrange@redhat.com) wrote:
>> On Wed, Jun 06, 2018 at 12:42:28PM +0100, Richard W.M. Jones wrote:
>>> On Wed, Jun 06, 2018 at 12:14:07PM +0100, Dr. David Alan Gilbert wrote:
>>>> The problem with having a separate file is that you either have to copy
>>>> it around with the image or have an archive.  If you have an archive
>>>> you have to have an unpacking step which then copies, potentially a lot
>>>> of data taking some reasonable amount of time.  Storing a simple bit
>>>> of data with the image avoids that.
>>>
>>> This isn't really true.  For OVA (ie. tar) we don't unpack them.
>>> Adding file.offset and file.size in qemu's raw driver was crucial to
>>> that optimization.
>>
>> Though that assumes you're only using the qcow2 file in read-only mode.
>> As soon as you need write access you need to unpack from the OVA so that
>> the qcow2 file can grow its length when new sectors are allocated.
> 
> And the person creating the OVA has to do that tarring rather than just
> take the qcow2 they've just used in the VM.

Note that this again can be done efficiently.  You just overwrite the
beginning of the qcow2 file and move the overwritten clusters somewhere
else.

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 12:16                             ` Dr. David Alan Gilbert
@ 2018-06-06 13:22                               ` Max Reitz
  2018-06-06 14:02                                 ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 157+ messages in thread
From: Max Reitz @ 2018-06-06 13:22 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Kevin Wolf, ehabkost, qemu-block, Michael S. Tsirkin,
	Richard W.M. Jones, qemu-devel, stefanha, Michal Suchánek

[-- Attachment #1: Type: text/plain, Size: 5234 bytes --]

On 2018-06-06 14:16, Dr. David Alan Gilbert wrote:
> * Max Reitz (mreitz@redhat.com) wrote:
>> On 2018-06-06 13:37, Dr. David Alan Gilbert wrote:
>>> * Max Reitz (mreitz@redhat.com) wrote:
>>>> On 2018-06-06 13:19, Michal Suchánek wrote:
>>>>> On Wed, 6 Jun 2018 13:02:53 +0200
>>>>> Max Reitz <mreitz@redhat.com> wrote:
>>>>>
>>>>>> On 2018-06-06 12:32, Michal Suchánek wrote:
>>>>>>> On Tue, 29 May 2018 12:14:15 +0200
>>>>>>> Max Reitz <mreitz@redhat.com> wrote:
>>
>> [...]
>>
>>>>>>>> Unless I have got something terribly wrong (which is indeed a
>>>>>>>> possibility!), to me this proposal means basically to turn qcow2
>>>>>>>> into (1) a VM description format for qemu, and (2) to turn it into
>>>>>>>> an archive format on the way.  
>>>>>>>
>>>>>>> And if you go all the way you can store multiple disks along with
>>>>>>> the VM definition so you can have the whole appliance in one file.
>>>>>>> It conveniently solves the problem of synchronizing snapshots across
>>>>>>> multiple disk images and the question where to store the machine
>>>>>>> state if you want to suspend it.   
>>>>>>
>>>>>> Yeah, but why make qcow2 that format?  That's what I completely fail
>>>>>> to understand.
>>>>>>
>>>>>> If you want to have a single VM description file that contains the VM
>>>>>> configuration and some qcow2/raw/whatever files along with it for the
>>>>>> guest disk data, sure, go ahead.  But why does the format of the whole
>>>>>> thing need to be qcow2?
>>>>>
>>>>> Because then qemu can access the disk data from the image directly
>>>>> without any need for extraction, copying to different file, etc.
>>>>
>>>> This does not explain why it needs to be qcow2.  There is absolutely no
>>>> reason why you couldn't use qcow2 files in-place inside of another file.
>>>
>>> Because then we'd have to change the whole stack to take advantage of
>>> that.  Adding a feature into qcow2 means nothing else changes.
>>
>> Because it's a hack, right.  Storing binary data in a qcow2 file,
>> completely ignoring it in qemu (and being completely unusable to any
>> potential other users of the qcow2 format[1]) and only interpreting it
>> somewhere up the stack is a hack.
> 
> It's not a hack!
> Seriously it's not.
> There's nothing wrong with it being aimed higher up the stack than qemu,

Not really, but storing that information in a disk image file is, from
my perspective.  So far, qcow2 was always just for qemu.  (Hmm...  Maybe
backing links weren't, but at least they were intended for qemu originally.)

So this would mix information for different layers inside qcow2 which to
me sounds weird.  Maybe I just have to get used to it.

> the problem we started off with was what happens when a user downloads
> a VM image and tries to import it into their VM system;

Well, the VM system should choke without a config file. O:-)

>                                                         we've already
> got 2+ layers of management stuff in there - I want the information to
> guide those layers, not form a complete set of configuration.

But I do.

If we store some information, I don't see why we don't store all of it.

>> That is not necessarily a negative point, hacks can work wonderfully
>> well, and they usually are simple, that is correct.  But the thing is
>> that I feel like people have grand visions of what to get out of this.
>> Imagine, a single file that can configure all and any VM!
>>
>> But hacks usually only solve a single issue.  Once you try to extend a
>> hack, it breaks down and becomes insufficient.
>>
>> If we want a grand vision where a single file stores the whole VM, why
>> not invest the work and make it right from the start?
> 
> Because we won't get it right; however much we bikeshed about it
> we'll just end up with a mess.

Sure, but the same thing applies to putting it into qcow2.  The
difference is, for something outside of qcow2, throwing it away and
starting over is simple.

When putting it into qcow2, we can only do that if we really just put a
binary blob there that isn't described in the specification.

>                                  The right thing is to put in something
> to hold configuration and then review the items of configuration we
> add properly as we define them.

OK, but review them on what terms?  Whether they are simple enough?

As I said, I would want a whole configuration if we allow some
configuration.

(More below)

>> [1] Yes, I concede that there are probably no other users of qcow2.  But
>> please forgive me for assuming that qcow2 was in a sense designed to be
>> a rather general image format that not only qemu could use.
> 
> What makes it QEMU specific?  It's basically just the same key/value
> setup as OVA, except putting them inside the qcow2.

Well, not necessarily qemu-specific, but
${management_software}-specific, which comes down to the same.  Or,
well, I'd think that, but hold on.

> We could use the same keys/value definitions as OVA in the blob,
> although their definitions aren't very portable either.

So you're proposing that we only add options that seem portable for any VM?

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06  9:44                     ` Dr. David Alan Gilbert
@ 2018-06-06 13:35                       ` Eduardo Habkost
  0 siblings, 0 replies; 157+ messages in thread
From: Eduardo Habkost @ 2018-06-06 13:35 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Max Reitz, Kevin Wolf, qemu-block, Michael S. Tsirkin,
	qemu-devel, Richard W.M. Jones, stefanha

On Wed, Jun 06, 2018 at 10:44:20AM +0100, Dr. David Alan Gilbert wrote:
> * Eduardo Habkost (ehabkost@redhat.com) wrote:
> > On Tue, Jun 05, 2018 at 10:21:59AM +0100, Dr. David Alan Gilbert wrote:
> > > <reawakening a fizzled out thread>
> > > 
> > > This seems to have fizzled out because of a lack of a concrete proposal;
> > > so here is one based on a reply to Max's post:
> > > 
> > > * Max Reitz (mreitz@redhat.com) wrote:
> > > 
> > > <snip>
> > > 
> > > > The original problem was that you need to supply a machine type to qemu,
> > > > and that multiple common architectures now have multiple machine types
> > > > and not necessarily all work with a single image.  So far so good, but I
> > > > have two issues here already:
> > > > 
> > > > (1) How is qemu supposed to interpret that information?  If it's stored
> > > > in the image file, I don't see a nice way of retrieving it before the
> > > > machine is initialized, at least not with qemu's current architecture.
> > > 
> > > <snip>
> > > 
> > > > (2) Again, I personally just really don't like saving such information
> > > > in a disk image.  One actual argument I can bring up for that distaste
> > > > is this: Suppose, you have multiple images attached to your VM.  Now the
> > > > VM wants to store the machine type.  Where does it go?  Into all of
> > > > them?
> > > 
> > > <snip>
> > > 
> > > > So I think if we decide to store the machine type, that is kind of a
> > > > slippery slope and then there are good arguments for storing even more
> > > > configuration options in the file, too.  But I really, really don't like
> > > > that.
> > > 
> > > <snip>
> > > 
> > > > For another, how do we store the data?  key-value seems wrong if we want
> > > > to store everything.  JSON might be fine.  But eventually we just want
> > > > basically a qemu configuration file in there, I would think (which may
> > > > support JSON at some point?).   So basically we would store the data as
> > > > a binary blob and let the rest of qemu do its thing with it.  But then
> > > > please tell me why I fought so valiantly against storing random bitmaps
> > > > in qcow2 files.  I hate the idea of making qcow2 a random archive
> > > > format.  We have tar for that.
> > > 
> > > <snip>
> > > 
> > > > tl;dr: I really don't get why it's so hard to supply a config file along
> > > > with a qcow2 image.  Is it so hard for people to realize that a VM does
> > > > not only consist of a disk?
> > > 
> > > Yes! Because in many cases that's all it needs, and it's ready to run
> > > with no unpacking.
> > > 
> > > I think we should have:
> > > 
> > > --------------------------------------------------------------
> > > Layer 0:
> > >    QCOW provides a way to store a single string of arbitrary (but
> > > limited?) length.
> > >    QCOW provides a way to replace the string by a new string.
> > >    The original or the new string will be stored after that;
> > >    never some mix.
> > >    Where a file 'b' has a backing file 'a', 'b' inherits the
> > >    string from 'a' unless 'b' has its own string.
> > >    Snapshots inherit their string from the main unless they have
> > >    their own string.
> > > 
> > > Layer 1:
> > >    The string shall always be a JSON 'object'; i.e. of the form
> > >     { "something": ... , "more": ... }
> > > 
> > >    The key strings shall be non-null and non-empty and shall
> > >    be unique.
> > > 
> > 
> > I'd prefer layer 0+1 to:
> > 
> > 1) Allow multiple entries to be stored (implemented by layer 1
> >    in this proposal)
> > 2) Identify each entry with a name (implemented by layer 1 in
> >    this proposal)
> > 3) Allow arbitrary binary data to be stored on an entry
> >    (not possible with the JSON-based proposal, because JSON
> >    strings are not blobs, but Unicode strings).
> > 4) Make it easy to replace only one entry while keeping others
> >    intact (not the case here, if all entries are stored in the
> >    same JSON string)
> > 
> > I think it would be simpler if layer 0 simply provided a list of
> > names/value pairs, where names are ascii strings, and values are
> > binary data[1].  It would make layer 1 unnecessary, and allow (3)
> > and (4) to happen.
> > 
> > [1] In other words, Rich's proposal of "named blobs":
> > https://www.mail-archive.com/qemu-block@nongnu.org/msg37856.html
> 
> My reasoning was just one of simplicity; each layer in this is
> almost trivial.
> The downside to my proposal is it's more expensive if you want
> to change a single key; but as yet no one has suggested why we'd
> want to do it frequently enough to worry.
> 
> My suggestion of JSON was just to try and stop the bikeshedding;
> we seem to be putting a lot of effort into inventing a new
> storage definition for something that so far we need to hold
> one rarely changing short string, which even with all the discussion
> has moved up to maybe 3 or 4 rarely changing pieces of data.

I agree with that feeling, and I don't have a very strong
preference.

I would prefer if qcow2 let us store named blobs, because it
gives us more flexibility to decide about the data format later.
But if qcow2 maintainers tell us it's much simpler to let us
store a single string instead of named blobs, I won't complain
too loudly if we follow your suggestion.
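
To make the trade-off concrete, here is a small sketch of the
single-string variant quoted above, with hypothetical helper names.  It
checks the Layer 1 rules (a JSON object with unique, non-empty keys),
applies the Layer 0 inheritance rule for backing files, and shows that
changing one key means re-serializing and replacing the whole string.
Note that the key checks also apply to nested objects here, which is
stricter than the proposal requires.

```python
import json


def parse_layer1(text):
    """Parse the stored string, rejecting empty or duplicate keys."""
    def check_pairs(pairs):
        keys = [key for key, _ in pairs]
        if any(key == "" for key in keys):
            raise ValueError("empty key")
        if len(set(keys)) != len(keys):
            raise ValueError("duplicate key")
        return dict(pairs)

    value = json.loads(text, object_pairs_hook=check_pairs)
    if not isinstance(value, dict):
        raise ValueError("top level must be a JSON object")
    return value


def effective_string(chain):
    """Layer 0 inheritance: first image in the chain (top first) with a string."""
    for stored in chain:
        if stored is not None:
            return stored
    return None


def replace_key(text, key, new_value):
    """Changing one key rewrites the entire stored string."""
    obj = parse_layer1(text)
    obj[key] = new_value
    return json.dumps(obj, sort_keys=True)


def demo():
    base = '{"qemu.machine-types": ["q35", "i440fx"], "qemu.min-ram-MB": 1024}'
    inherited = effective_string([None, base])  # overlay has no own string
    updated = replace_key(inherited, "qemu.min-ram-MB", 2048)
    return parse_layer1(updated)
```

With named blobs, the replace_key step would instead touch a single
entry, at the price of a more involved on-disk structure.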

-- 
Eduardo

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 11:44                           ` Max Reitz
  2018-06-06 12:16                             ` Dr. David Alan Gilbert
@ 2018-06-06 13:42                             ` Eduardo Habkost
  2018-06-06 14:55                               ` Michael S. Tsirkin
  2018-06-06 14:46                             ` Michael S. Tsirkin
  2 siblings, 1 reply; 157+ messages in thread
From: Eduardo Habkost @ 2018-06-06 13:42 UTC (permalink / raw)
  To: Max Reitz
  Cc: Dr. David Alan Gilbert, Michal Suchánek, Kevin Wolf,
	qemu-block, Michael S. Tsirkin, Richard W.M. Jones, qemu-devel,
	stefanha

On Wed, Jun 06, 2018 at 01:44:02PM +0200, Max Reitz wrote:
> On 2018-06-06 13:37, Dr. David Alan Gilbert wrote:
> > * Max Reitz (mreitz@redhat.com) wrote:
> >> On 2018-06-06 13:19, Michal Suchánek wrote:
> >>> On Wed, 6 Jun 2018 13:02:53 +0200
> >>> Max Reitz <mreitz@redhat.com> wrote:
> >>>
> >>>> On 2018-06-06 12:32, Michal Suchánek wrote:
> >>>>> On Tue, 29 May 2018 12:14:15 +0200
> >>>>> Max Reitz <mreitz@redhat.com> wrote:
> 
> [...]
> 
> >>>>>> Unless I have got something terribly wrong (which is indeed a
> >>>>>> possibility!), to me this proposal means basically to turn qcow2
> >>>>>> into (1) a VM description format for qemu, and (2) to turn it into
> >>>>>> an archive format on the way.  
> >>>>>
> >>>>> And if you go all the way you can store multiple disks along with
> >>>>> the VM definition so you can have the whole appliance in one file.
> >>>>> It conveniently solves the problem of synchronizing snapshots across
> >>>>> multiple disk images and the question where to store the machine
> >>>>> state if you want to suspend it.   
> >>>>
> >>>> Yeah, but why make qcow2 that format?  That's what I completely fail
> >>>> to understand.
> >>>>
> >>>> If you want to have a single VM description file that contains the VM
> >>>> configuration and some qcow2/raw/whatever files along with it for the
> >>>> guest disk data, sure, go ahead.  But why does the format of the whole
> >>>> thing need to be qcow2?
> >>>
> >>> Because then qemu can access the disk data from the image directly
> >>> without any need for extraction, copying to different file, etc.
> >>
> >> This does not explain why it needs to be qcow2.  There is absolutely no
> >> reason why you couldn't use qcow2 files in-place inside of another file.
> > 
> > Because then we'd have to change the whole stack to take advantage of
> > that.  Adding a feature into qcow2 means nothing else changes.
> 
> Because it's a hack, right.  Storing binary data in a qcow2 file,
> completely ignoring it in qemu (and being completely unusable to any
> potential other users of the qcow2 format[1]) and only interpreting it
> somewhere up the stack is a hack.
> 
> That is not necessarily a negative point, hacks can work wonderfully
> well, and they usually are simple, that is correct.  But the thing is
> that I feel like people have grand visions of what to get out of this.
> Imagine, a single file that can configure all and any VM!
> 
> But hacks usually only solve a single issue.  Once you try to extend a
> hack, it breaks down and becomes insufficient.
> 
> If we want a grand vision where a single file stores the whole VM, why
> not invest the work and make it right from the start?

We don't want a grand vision where a single file stores the whole
VM.  This is exactly what I would like to avoid, by not inventing
a whole different appliance file format.

-- 
Eduardo

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 13:14                               ` Max Reitz
@ 2018-06-06 13:45                                 ` Michal Suchánek
  2018-06-06 13:50                                   ` Daniel P. Berrangé
  2018-06-06 14:17                                   ` Max Reitz
  0 siblings, 2 replies; 157+ messages in thread
From: Michal Suchánek @ 2018-06-06 13:45 UTC (permalink / raw)
  To: Max Reitz
  Cc: Kevin Wolf, ehabkost, qemu-block, Michael S. Tsirkin,
	Richard W.M. Jones, qemu-devel, stefanha

[-- Attachment #1: Type: text/plain, Size: 7492 bytes --]

On Wed, 6 Jun 2018 15:14:03 +0200
Max Reitz <mreitz@redhat.com> wrote:

> On 2018-06-06 14:13, Michal Suchánek wrote:
> > On Wed, 6 Jun 2018 13:52:35 +0200
> > Max Reitz <mreitz@redhat.com> wrote:
> >   
> >> On 2018-06-06 13:43, Michal Suchánek wrote:  
> >>> On Wed, 6 Jun 2018 13:32:47 +0200
> >>> Max Reitz <mreitz@redhat.com> wrote:
> >>>     
> >>>> On 2018-06-06 13:19, Michal Suchánek wrote:    
> >>>>> On Wed, 6 Jun 2018 13:02:53 +0200
> >>>>> Max Reitz <mreitz@redhat.com> wrote:  
> 
> [...]
> 
> >>>>>> What I'm trying to get at is that qcow2 was not designed to be
> >>>>>> a container format for arbitrary files.  If you want to make it
> >>>>>> such, I'm sure there are existing formats that work
> >>>>>> better.      
> >>>>>
> >>>>> Such as?      
> >>>>
> >>>> ext2?    
> >>>
> >>> So you want an ext2 driver in qemu instead of expanding qcow2 to
> >>> work not only for a single disk but also for an appliance?    
> >>
> >> Yes, because ext2 was designed to be a proper filesystem.  I'm not
> >> an FS designer.  Well, not a good one anyway.  So I don't trust
> >> myself on extending qcow2 to be a good FS -- and why would I, when
> >> there are already numerous FS around.  
> > 
> > Do you expect that performance of qemu using qcow2 driver over ext2
> > driver will be better than using qcow driver directly with some part
> > semi-permanently occupied by a configuration blob? My bet is not.  
> 
> If you want to store multiple disk images in a single file?  I would
> think so, yes.  With qcow2, I would assume it leads to
> fragmentation.  

How is that different from a single disk divided into two partitions
internally (without any knowledge on the qcow2 level)?

> I would hope that proper filesystems can mitigate this.

Not really. Not without much complexity and repeated maintenance.

> > The ext* drivers are designed to work with kernel VM infrastructure
> > which must be tuned for different usage scenarios and you would
> > have to duplicate that tuning in qemu to get competitive
> > performance. Also you get qcow2 and ext2 metadata which must be
> > allocated, managed, etc. You get more storage and performance
> > overhead for no good reason.  
> 
> Yes, there is a good reason.  You can add arbitrary configuration
> options without having to worry about me.

But I will not be able to use the images in qemu so it will be useless.

Well, there is FUSE and that is certainly blazing fast and ubiquitous,
I am sure.

> 
> Seriously, though, a real FS would allow you to be more expressive and
> really do what you want without having to work around the quirks that
> adding a not-real-FS in the most simple way possible to qcow2 would
> bring with it.
> 
> Because this is part of my fear, that we now add a very simple blob
> for just a sprinkle of data.  But over time it gets more and more
> complex because we want to store more and more data to make things
> ever more convenient[1], we notice that we need more features, the
> format gets more complex, and in the end we have an FS that is just
> worse than a real FS.
> 
> [1] And note that if I'm convinced to store VM configuration data in
> qemu, I will agree that we can store any data in there and it would be
> nice if any VM could be provisioned and used that way.
> 
> > On the other hand, qcow is designed for storing VM disk data and
> > hopefully was tuned to do that decently over the years. The primary
> > use case remains storing VM disk data. Adding a configuration blob
> > does not change that.  
> 
> True.  So the argument is that qcow2 may be worse for storing
> arbitrary data, but we don't have performance requirements for that;
> but we do have performance requirements for disk data and adding
> another format below qcow2 will not make it better.
> 
> I do think it is possible to not make things worse with a format under
> qcow2, but that may require additional complexity, that you think is
> pointless.
> 
> I understand that you think that, but I still believe that putting the
> configuration into qcow2 is just the wrong way around and will fall on
> our feet in the long run.

I think that *if* we want an 'appliance' format that stores a whole VM
in a single file to ease VM distribution then the logical place to look
in qemu is qcow. The reasons have been explained at length.

I understand that for some use cases simplifying the distribution of
VMs as much as possible is quite important.

> 
> >>>>>>>> Unless I have got something terribly wrong (which is indeed a
> >>>>>>>> possibility!), to me this proposal means basically to turn
> >>>>>>>> qcow2 into (1) a VM description format for qemu, and (2) to
> >>>>>>>> turn it into an archive format on the way.        
> >>>>>>>
> >>>>>>> And if you go all the way you can store multiple disks along
> >>>>>>> with the VM definition so you can have the whole appliance in
> >>>>>>> one file. It conveniently solves the problem of synchronizing
> >>>>>>> snapshots across multiple disk images and the question where
> >>>>>>> to store the machine state if you want to suspend it.         
> >>>>>>
> >>>>>> Yeah, but why make qcow2 that format?  That's what I completely
> >>>>>> fail to understand.
> >>>>>>
> >>>>>> If you want to have a single VM description file that contains
> >>>>>> the VM configuration and some qcow2/raw/whatever files along
> >>>>>> with it for the guest disk data, sure, go ahead.  But why does
> >>>>>> the format of the whole thing need to be qcow2?      
> >>>>>
> >>>>> Because then qemu can access the disk data from the image
> >>>>> directly without any need for extraction, copying to different
> >>>>> file, etc.      
> >>>>
> >>>> This does not explain why it needs to be qcow2.  There is
> >>>> absolutely no reason why you couldn't use qcow2 files in-place
> >>>> inside of another file.    
> >>>
> >>> qemu cannot read the disk data from the file in-place.    
> >>
> >> Hu?  Why not?  
> > 
> > Well, it can possibly read the image if it happens to be
> > contiguous. It will not be able to update it without a fs driver,
> > however.  
> 
> Yes, but first, such an FS driver would be possible (and as long as we
> don't need real complexity, it could be very simple, like just using
> an offset in a tar file and then just adjust the file length field on
> allocations beyond the EOF).

As said this will cover only the simplest case with readonly
configuration blob(s) and one disk image at the end. The solution is
obsolete even as we are designing it.

> 
> And secondly, I think adding another format has the advantage of
> easier deprecation.  If we think we need something more complex, we
> are free to design that and throw away the old format.  But if we add
> something to qcow2, I would think it is there to stay.

And then we had better add something general enough to work for the
foreseeable future, or not add anything at all.

> 
> So, yes, for qcow2 we might want to design something (overly?) complex
> from the start that we hope will fulfill all our needs (which it
> won't, because things never turn out that way).  But if we'd add a
> new format, we could keep it simple in the beginning and start over
> later.
> 

Yes, you basically want something like a tar file that can interleave
the archive members. Wait, isn't that a new filesystem?

Thanks

Michal

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 13:45                                 ` Michal Suchánek
@ 2018-06-06 13:50                                   ` Daniel P. Berrangé
  2018-06-06 14:14                                     ` Eduardo Habkost
  2018-06-06 14:17                                   ` Max Reitz
  1 sibling, 1 reply; 157+ messages in thread
From: Daniel P. Berrangé @ 2018-06-06 13:50 UTC (permalink / raw)
  To: Michal Suchánek
  Cc: Max Reitz, Kevin Wolf, ehabkost, qemu-block, Michael S. Tsirkin,
	Richard W.M. Jones, qemu-devel, stefanha

On Wed, Jun 06, 2018 at 03:45:10PM +0200, Michal Suchánek wrote:
> 
> I think that *if* we want an 'appliance' format that stores a whole VM
> in a single file to ease VM distribution then the logical place to look
> in qemu is qcow. The reasons have been explained at length.

I rather disagree. This is a common problem beyond just QEMU and everyone
just uses an existing archive format (TAR, ZIP) for bundling together
one or more disk images, metadata for config, and whatever other resources
are applicable for the vendor.  This works with any disk format (raw,
qcow2, vmdk, vpc, etc) so is preferable to inventing something that is
specific to qcow2 IMHO.
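
To make the archive approach concrete, here is a rough Python sketch
using only the standard tarfile module.  The member name
"vm-config.json", the JSON encoding, and the helper names bundle() and
locate() are assumptions for illustration, not anything a vendor or this
thread has specified.  It bundles a config plus disk images into a plain
tar and then reports where each member's data starts, which is the
offset a consumer would need in order to open a disk in place:

```python
import io
import json
import tarfile


def bundle(path, config, disks):
    """Write an uncompressed tar holding a JSON config plus disk images.

    `disks` maps a member name to its raw bytes (hypothetical layout).
    """
    with tarfile.open(path, "w") as tar:
        payload = json.dumps(config, indent=2).encode()
        info = tarfile.TarInfo("vm-config.json")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
        for name, data in disks.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))


def locate(path):
    """Return {member name: (byte offset of its data, size)}."""
    with tarfile.open(path, "r") as tar:
        return {m.name: (m.offset_data, m.size) for m in tar.getmembers()}
```

The offsets are only meaningful for an uncompressed tar, and growing a
member in place is a separate problem that this sketch does not address.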

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 13:22                               ` Max Reitz
@ 2018-06-06 14:02                                 ` Dr. David Alan Gilbert
  2018-06-06 14:33                                   ` Max Reitz
  0 siblings, 1 reply; 157+ messages in thread
From: Dr. David Alan Gilbert @ 2018-06-06 14:02 UTC (permalink / raw)
  To: Max Reitz
  Cc: Kevin Wolf, ehabkost, qemu-block, Michael S. Tsirkin,
	Richard W.M. Jones, qemu-devel, stefanha, Michal Suchánek

* Max Reitz (mreitz@redhat.com) wrote:
> On 2018-06-06 14:16, Dr. David Alan Gilbert wrote:
> > * Max Reitz (mreitz@redhat.com) wrote:
> >> On 2018-06-06 13:37, Dr. David Alan Gilbert wrote:
> >>> * Max Reitz (mreitz@redhat.com) wrote:
> >>>> On 2018-06-06 13:19, Michal Suchánek wrote:
> >>>>> On Wed, 6 Jun 2018 13:02:53 +0200
> >>>>> Max Reitz <mreitz@redhat.com> wrote:
> >>>>>
> >>>>>> On 2018-06-06 12:32, Michal Suchánek wrote:
> >>>>>>> On Tue, 29 May 2018 12:14:15 +0200
> >>>>>>> Max Reitz <mreitz@redhat.com> wrote:
> >>
> >> [...]
> >>
> >>>>>>>> Unless I have got something terribly wrong (which is indeed a
> >>>>>>>> possibility!), to me this proposal means basically to turn qcow2
> >>>>>>>> into (1) a VM description format for qemu, and (2) to turn it into
> >>>>>>>> an archive format on the way.  
> >>>>>>>
> >>>>>>> And if you go all the way you can store multiple disks along with
> >>>>>>> the VM definition so you can have the whole appliance in one file.
> >>>>>>> It conveniently solves the problem of synchronizing snapshots across
> >>>>>>> multiple disk images and the question where to store the machine
> >>>>>>> state if you want to suspend it.   
> >>>>>>
> >>>>>> Yeah, but why make qcow2 that format?  That's what I completely fail
> >>>>>> to understand.
> >>>>>>
> >>>>>> If you want to have a single VM description file that contains the VM
> >>>>>> configuration and some qcow2/raw/whatever files along with it for the
> >>>>>> guest disk data, sure, go ahead.  But why does the format of the whole
> >>>>>> thing need to be qcow2?
> >>>>>
> >>>>> Because then qemu can access the disk data from the image directly
> >>>>> without any need for extraction, copying to different file, etc.
> >>>>
> >>>> This does not explain why it needs to be qcow2.  There is absolutely no
> >>>> reason why you couldn't use qcow2 files in-place inside of another file.
> >>>
> >>> Because then we'd have to change the whole stack to take advantage of
> >>> that.  Adding a feature into qcow2 means nothing else changes.
> >>
> >> Because it's a hack, right.  Storing binary data in a qcow2 file,
> >> completely ignoring it in qemu (and being completely unusable to any
> >> potential other users of the qcow2 format[1]) and only interpreting it
> >> somewhere up the stack is a hack.
> > 
> > It's not a hack!
> > Seriously it's not.
> > There's nothing wrong with it being aimed higher up the stack than qemu,
> 
> Not really, but storing that information in a disk image file is, from
> my perspective.  So far, qcow2 was always just for qemu.  (Hmm...  Maybe
> backing links weren't, but at least they were intended for qemu originally.)
> 
> So this would mix information for different layers inside qcow2 which to
> me sounds weird.  Maybe I just have to get used to it.

The important point is it's explicitly for a different layer; we're not
mixing it - the guest can never get to this information.  It also saves
the higher level management layers ever having to look at the data the
guest can get to, which is a security advantage.

From my point of view, it really is the sticky label on the disc rather
than the contents of it.

> > the problem we started off with was what happens when a user downloads
> > a VM image and tries to import it into their VM system;
> 
> Well, the VM system should choke without a config file. O:-)
> 
> >                                                         we've already
> > got 2+ layers of management stuff in there - I want the information to
> > guide those layers, not form a complete set of configuration.
> 
> But I do.
> 
> If we store some information, I don't see why we don't store all of it.

Hmm, now that I generally don't like:
  a) That really would make it hypervisor specific
  b) It would be a lot of data
  c) Generally, the supplier of an image doesn't know how the end-user
     wants it configured - although for some appliances they might.
  d) Remember the only problem we had when we got here was how to stop
     the user shooting themselves in the foot by connecting the wrong
     image to the wrong VM type.  So I'm expecting to use this to
     contain requirements, nothing more.


> >> That is not necessarily a negative point, hacks can work wonderfully
> >> well, and they usually are simple, that is correct.  But the thing is
> >> that I feel like people have grand visions of what to get out of this.
> >> Imagine, a single file that can configure all and any VM!
> >>
> >> But hacks usually only solve a single issue.  Once you try to extend a
> >> hack, it breaks down and becomes insufficient.
> >>
> >> If we want a grand vision where a single file stores the whole VM, why
> >> not invest the work and make it right from the start?
> > 
> > Because we won't get it right; however much we bikeshed about it
> > we'll just end up with a mess.
> 
> Sure, but the same thing applies to putting it into qcow2.  The
> difference is, for something outside of qcow2, throwing it away and
> starting over is simple.
> 
> When putting it into qcow2, we can only do that if we really just put a
> binary blob there that isn't described in the specification.

Well, it's why I'm going for defined key/values that are stored in the
blob and only a few of them.  We've got a reasonable chance of being
able to define what we want from 3-4 key/values, it should be a lot
easier than trying to define a grand scheme.

> >                                  The right thing is to put in something
> > to hold configuration and then review the items of configuration we
> > add properly as we define them.
> 
> OK, but review them on what terms?  Whether they are simple enough?

Well, I'll take simple, makes sense - whatever - feel free to be the
maintainer for that list!

> As I said, I would want a whole configuration if we allow some
> configuration.

Well then you could go for libvirt XML as the contents; but I think
that's crossing layers even more.
(I would have veered more to it being exactly the same as an OVA
description except for rjones' dislike of it)

> (More below)
> 
> >> [1] Yes, I concede that there are probably no other users of qcow2.  But
> >> please forgive me for assuming that qcow2 was in a sense designed to be
> >> a rather general image format that not only qemu could use.
> > 
> > What makes it QEMU specific?  It's basically just the same key/value
> > setup as OVA, except putting them inside the qcow2.
> 
> Well, not necessarily qemu-specific, but
> ${management_software}-specific, which comes down to the same.  Or,
> well, I'd think that, but hold on.
> 
> > We could use the same keys/value definitions as OVA in the blob,
> > although their definitions aren't very portable either.
> 
> So you're proposing that we only add options that seem portable for any VM?

Hmm.  We should probably split them, so there should be general options
(e.g. minimum-ram) but also hypervisor specifics
(qemu.machine-class=q35); but that doesn't mean you can't add keys
for multiple hypervisors into the one blob.  I mean
something like:
    minimum-ram = 1G
    qemu.machine-class = q35
    anothervm.machine-class = ....
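
A rough Python sketch of how a management layer might read such a blob.
The "key = value" line syntax, the "#" comment rule, and the function
names parse_hints() and hints_for() are assumptions on top of the
example above; nothing here is an agreed format.  Keys without a dot are
treated as general, keys with a "name." prefix as belonging to that
hypervisor:

```python
def parse_hints(text):
    """Split 'key = value' lines into general and per-hypervisor hints."""
    general, specific = {}, {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and (assumed) comment lines
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        namespace, dot, name = key.partition(".")
        if dot:
            # e.g. "qemu.machine-class" -> specific["qemu"]["machine-class"]
            specific.setdefault(namespace, {})[name] = value
        else:
            general[key] = value
    return general, specific


def hints_for(text, hypervisor):
    """General hints plus those for one hypervisor; other namespaces are ignored."""
    general, specific = parse_hints(text)
    merged = dict(general)
    merged.update(specific.get(hypervisor, {}))
    return merged
```

With the three lines above, hints_for(blob, "qemu") would yield
minimum-ram and machine-class, and silently skip the anothervm entry.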

Dave

> Max
> 



--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 13:50                                   ` Daniel P. Berrangé
@ 2018-06-06 14:14                                     ` Eduardo Habkost
  2018-06-06 14:21                                       ` Max Reitz
  2018-06-06 14:24                                       ` Daniel P. Berrangé
  0 siblings, 2 replies; 157+ messages in thread
From: Eduardo Habkost @ 2018-06-06 14:14 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Michal Suchánek, Max Reitz, Kevin Wolf, qemu-block,
	Michael S. Tsirkin, Richard W.M. Jones, qemu-devel, stefanha

On Wed, Jun 06, 2018 at 02:50:10PM +0100, Daniel P. Berrangé wrote:
> On Wed, Jun 06, 2018 at 03:45:10PM +0200, Michal Suchánek wrote:
> > 
> > I think that *if* we want an 'appliance' format that stores a whole VM
> > in a single file to ease VM distribution then the logical place to look
> > in qemu is qcow. The reasons have been explained at length.
> 
> I rather disagree. This is a common problem beyond just QEMU and everyone
> just uses an existing archive format (TAR, ZIP) for bundling together
> one or more disk images, metadata for config, and whatever other resources
> are applicable for the vendor.  This works with any disk format (raw,
> qcow2, vmdk, vpc, etc) so is preferable to inventing something that is
> specific to qcow2 IMHO.

Now we have N+1 appliance file formats.  :)

(Whether we like it or not, qcow2 is already used as an appliance format
for single-disk VMs in practice.)

But I agree this must not be specific to qcow2.  The same VM
description format we agree upon should work with other disk
formats or with multi-disk appliances.

If we specify a reasonable VM description format for appliances
and make it work inside (e.g.) tar files, we will still have the
option of allowing the description to be placed inside qcow2 if we
really want to.  I don't think we need to finish this qcow2
bikeshedding exercise right now.

-- 
Eduardo

^ permalink raw reply	[flat|nested] 157+ messages in thread

* [Qemu-devel]  storing machine data in qcow images?
  2018-06-06 13:45                                 ` Michal Suchánek
  2018-06-06 13:50                                   ` Daniel P. Berrangé
@ 2018-06-06 14:17                                   ` Max Reitz
  2018-06-06 16:10                                     ` Eduardo Habkost
  1 sibling, 1 reply; 157+ messages in thread
From: Max Reitz @ 2018-06-06 14:17 UTC (permalink / raw)
  To: Michal Suchánek
  Cc: Kevin Wolf, ehabkost, qemu-block, Michael S. Tsirkin,
	Richard W.M. Jones, qemu-devel, stefanha

[-- Attachment #1: Type: text/plain, Size: 9892 bytes --]

On 2018-06-06 15:45, Michal Suchánek wrote:
> On Wed, 6 Jun 2018 15:14:03 +0200
> Max Reitz <mreitz@redhat.com> wrote:
> 
>> On 2018-06-06 14:13, Michal Suchánek wrote:
>>> On Wed, 6 Jun 2018 13:52:35 +0200
>>> Max Reitz <mreitz@redhat.com> wrote:
>>>   
>>>> On 2018-06-06 13:43, Michal Suchánek wrote:  
>>>>> On Wed, 6 Jun 2018 13:32:47 +0200
>>>>> Max Reitz <mreitz@redhat.com> wrote:
>>>>>     
>>>>>> On 2018-06-06 13:19, Michal Suchánek wrote:    
>>>>>>> On Wed, 6 Jun 2018 13:02:53 +0200
>>>>>>> Max Reitz <mreitz@redhat.com> wrote:  
>>
>> [...]
>>
>>>>>>>> What I'm trying to get at is that qcow2 was not designed to be
>>>>>>>> a container format for arbitrary files.  If you want to make it
>>>>>>>> such, I'm sure there are existing formats that work
>>>>>>>> better.      
>>>>>>>
>>>>>>> Such as?      
>>>>>>
>>>>>> ext2?    
>>>>>
>>>>> So you want an ext2 driver in qemu instead of expanding qcow2 to
>>>>> work not only for a single disk but also for an appliance?    
>>>>
>>>> Yes, because ext2 was designed to be a proper filesystem.  I'm not
>>>> an FS designer.  Well, not a good one anyway.  So I don't trust
>>>> myself on extending qcow2 to be a good FS -- and why would I, when
>>>> there are already numerous FS around.  
>>>
>>> Do you expect that performance of qemu using qcow2 driver over ext2
>>> driver will be better than using qcow driver directly with some part
>>> semi-permanently occupied by a configuration blob? My bet is not.  
>>
>> If you want to store multiple disk images in a single file?  I would
>> think so, yes.  With qcow2, I would assume it leads to
>> fragmentation.  
> 
> How is that different from a single disk divided into two partitions
> internally (without any knowledge on the qcow2 level)?

From how it's going to be fragmented, there is no difference.  If you
have multiple partitions and write to them concurrently, thus allocating
new areas, you get bad fragmentation.
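
A toy Python model of that effect.  It assumes a deliberately simplified
allocator that hands out the next free host cluster on the first write
to each guest cluster; this is not qcow2's actual allocation code, and
the names allocate() and extents() are made up for the sketch.

```python
def allocate(writes):
    """Map each guest cluster to a host cluster in order of first write."""
    mapping = {}
    for guest in writes:
        if guest not in mapping:
            mapping[guest] = len(mapping)  # next free host cluster
    return mapping


def extents(mapping, guest_clusters):
    """Count the contiguous host runs backing guest_clusters, read in order."""
    hosts = [mapping[g] for g in guest_clusters]
    return 1 + sum(1 for a, b in zip(hosts, hosts[1:]) if b != a + 1)


part_a = [0, 1, 2, 3]          # guest clusters of the first partition
part_b = [100, 101, 102, 103]  # guest clusters of the second partition

sequential = allocate(part_a + part_b)
interleaved = allocate([c for pair in zip(part_a, part_b) for c in pair])
```

In this model, writing one partition after the other leaves part_a in a
single host run, while alternating writes split it into one run per
cluster.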

>> I would hope that proper filesystems can mitigate this.
> 
> Not really. Not without much complexity and repeated maintenance.

Yes, a proper filesystem.  Which we'd get for free with multiple files.

>>> The ext* drivers are designed to work with kernel VM infrastructure
>>> which must be tuned for different usage scenarios and you would
>>> have to duplicate that tuning in qemu to get competitive
>>> performance. Also you get qcow2 and ext2 metadata which must be
>>> allocated, managed, etc. You get more storage and performance
>>> overhead for no good reason.  
>>
>> Yes, there is a good reason.  You can add arbitrary configuration
>> options without having to worry about me.
> 
> But I will not be able to use the images in qemu so it will be useless.

Neither can you with the current proposal because that is about adding
management layer configuration options which are opaque to qemu.

> Well, there is FUSE and that is certainly blazing fast and ubiquitous,
> I am sure.

If you want to use pre-existing drivers, you'd probably use a loop device.

Otherwise, you'd use the block layer for accessing the disk.

If you want blazingly fast, you probably won't use qcow2 anyway.  Or,
funnily enough, you'd want to probably split the qcow2 file into a
metadata and a data file, so you get even more files.  (But that is a
proposal for the future.)

>> Seriously, though, a real FS would allow you to be more expressive and
>> really do what you want without having to work around the quirks that
>> adding a not-real-FS in the most simple way possible to qcow2 would
>> bring with it.
>>
>> Because this is part of my fear, that we now add a very simple blob
>> for just a sprinkle of data.  But over time it gets more and more
>> complex because we want to store more and more data to make things
>> ever more convenient[1], we notice that we need more features, the
>> format gets more complex, and in the end we have an FS that is just
>> worse than a real FS.
>>
>> [1] And note that if I'm convinced to store VM configuration data in
>> qemu, I will agree that we can store any data in there and it would be
>> nice if any VM could be provisioned and used that way.
>>
>>> On the other hand, qcow is designed for storing VM disk data and
>>> hopefully was tuned to do that decently over the years. The primary
>>> use case remains storing VM disk data. Adding a configuration blob
>>> does not change that.  
>>
>> True.  So the argument is that qcow2 may be worse for storing
>> arbitrary data, but we don't have performance requirements for that;
>> but we do have performance requirements for disk data and adding
>> another format below qcow2 will not make it better.
>>
>> I do think it is possible to not make things worse with a format under
>> qcow2, but that may require additional complexity, that you think is
>> pointless.
>>
>> I understand that you think that, but I still believe that putting the
>> configuration into qcow2 is just the wrong way around and will fall on
>> our feet in the long run.
> 
> I think that *if* we want an 'appliance' format that stores a whole VM
> in a single file to ease VM distribution then the logical place to look
> in qemu is qcow. The reasons have been explained at length.

The reason being that it's the easiest place, yes.  That doesn't make it
the best place.

> I understand that for some use cases simplifying the distribution of
> VMs as much as possible is quite important.

I don't because still nobody has explained it to me.

The only explanation I got so far was "People are lazy and we have
defaults for everything, so we don't throw an error if people forget to
pass a configuration file."

Which to me still just makes it an inconvenience.

>>>>>>>>>> Unless I have got something terribly wrong (which is indeed a
>>>>>>>>>> possibility!), to me this proposal means basically to turn
>>>>>>>>>> qcow2 into (1) a VM description format for qemu, and (2) to
>>>>>>>>>> turn it into an archive format on the way.        
>>>>>>>>>
>>>>>>>>> And if you go all the way you can store multiple disks along
>>>>>>>>> with the VM definition so you can have the whole appliance in
>>>>>>>>> one file. It conveniently solves the problem of synchronizing
>>>>>>>>> snapshots across multiple disk images and the question where
>>>>>>>>> to store the machine state if you want to suspend it.         
>>>>>>>>
>>>>>>>> Yeah, but why make qcow2 that format?  That's what I completely
>>>>>>>> fail to understand.
>>>>>>>>
>>>>>>>> If you want to have a single VM description file that contains
>>>>>>>> the VM configuration and some qcow2/raw/whatever files along
>>>>>>>> with it for the guest disk data, sure, go ahead.  But why does
>>>>>>>> the format of the whole thing need to be qcow2?      
>>>>>>>
>>>>>>> Because then qemu can access the disk data from the image
>>>>>>> directly without any need for extraction, copying to different
>>>>>>> file, etc.      
>>>>>>
>>>>>> This does not explain why it needs to be qcow2.  There is
>>>>>> absolutely no reason why you couldn't use qcow2 files in-place
>>>>>> inside of another file.    
>>>>>
>>>>> qemu cannot read the disk data from the file in-place.    
>>>>
>>>> Hu?  Why not?  
>>>
>>> Well, it can possibly read the image if it happens to be
>>> contiguous. It will not be able to update it without a fs driver,
>>> however.  
>>
>> Yes, but first, such an FS driver would be possible (and as long as we
>> don't need real complexity, it could be very simple, like just using
>> an offset in a tar file and then just adjust the file length field on
>> allocations beyond the EOF).
> 
> As said this will cover only the simplest case with readonly
> configuration blob(s) and one disk image at the end. The solution is
> obsolete even as we are designing it.

As said this is not true because you can resize the configuration blob
by restructuring the qcow2 file.

It is true for one disk image, but others apparently don't want an
appliance file, so they only want a simple case where we only have one
disk image.

I'm noticing a pattern here, and that is that everybody has a different
opinion on what we actually want in the end, and it's just by chance
that we find ourselves in two camps ("put it in qcow2" vs. "put it
somewhere else").

Maybe we should first discuss what we actually want before we can
discuss where to put it.

>> And secondly, I think adding another format has the advantage of
>> easier deprecation.  If we think we need something more complex, we
>> are free to design that and throw away the old format.  But if we add
>> something to qcow2, I would think it is there to stay.
> 
> And then we had better add something general enough to work for the
> foreseeable future, or not add anything at all.

The intention is always to do everything right from the start, but that
is rarely successful.

>> So, yes, for qcow2 we might want to design something (overly?) complex
>> from the start that we hope will fulfill all our needs (which it
>> won't, because things never turn out that way).  But if we'd add a
>> new format, we could keep it simple in the beginning and start over
>> later.
>>
> 
> Yes, you basically want something like a tar file that can interleave
> the archive members. Wait, isn't that a new filesystem?

Exactly.  But I'd rather have this outside of qcow2 (and thus easily
deprecatable) than inside.

(And if we just need to store something simple, we don't need such a
filesystem, because then we can get away without fragmentation.  But if
we really need to grow the configuration options often, I suspect we'll
want some allocation information and the like inside that blob in qcow2,
too, so we'd end up with a filesystem there.)

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 14:14                                     ` Eduardo Habkost
@ 2018-06-06 14:21                                       ` Max Reitz
  2018-06-06 14:24                                       ` Daniel P. Berrangé
  1 sibling, 0 replies; 157+ messages in thread
From: Max Reitz @ 2018-06-06 14:21 UTC (permalink / raw)
  To: Eduardo Habkost, Daniel P. Berrangé
  Cc: Michal Suchánek, Kevin Wolf, qemu-block, Michael S. Tsirkin,
	Richard W.M. Jones, qemu-devel, stefanha

[-- Attachment #1: Type: text/plain, Size: 2019 bytes --]

On 2018-06-06 16:14, Eduardo Habkost wrote:
> On Wed, Jun 06, 2018 at 02:50:10PM +0100, Daniel P. Berrangé wrote:
>> On Wed, Jun 06, 2018 at 03:45:10PM +0200, Michal Suchánek wrote:
>>>
>>> I think that *if* we want an 'appliance' format that stores a whole VM
>>> in a single file to ease VM distribution then the logical place to look
>>> in qemu is qcow. The reasons have been explained at length.
>>
>> I rather disagree. This is a common problem beyond just QEMU and everyone
>> just uses an existing archive format (TAR, ZIP) for bundling together
>> one or more disk images, metadata for config, and whatever other resources
>> are applicable for the vendor.  This works with any disk format (raw,
>> qcow2, vmdk, vpc, etc) so is preferable to inventing something that is
>> specific to qcow2 IMHO.
> 
> Now we have N+1 appliance file formats.  :)
> 
> (Whether we like it or not, qcow2 is already used as an appliance format
> for single-disk VMs in practice.)
> 
> But I agree this must not be specific to qcow2.  The same VM
> description format we agree upon should work with other disk
> formats or with multi-disk appliances.
> 
> If we specify a reasonable VM description format for appliances
> and make it work inside (e.g.) tar files, we will still have the
> option of allowing the description to be placed inside qcow2 if we
> really want to.  I don't think we need to finish this qcow2
> bikeshedding exercise right now.

That actually sounds reasonable to me.

I better not think about it for too long so I don't come up with
something I very much dislike about it.

(Well, now I have.  The thing I still dislike is that we haven't talked
about what we actually want in the end.  Do we want just the very basic
configuration options that are portable?  But what really is the use
case for such basic information?  Won't people demand ever more
configuration options, then?  I certainly would.

And having to handle a full-blown config is probably a pain...)

Max


^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 14:14                                     ` Eduardo Habkost
  2018-06-06 14:21                                       ` Max Reitz
@ 2018-06-06 14:24                                       ` Daniel P. Berrangé
  1 sibling, 0 replies; 157+ messages in thread
From: Daniel P. Berrangé @ 2018-06-06 14:24 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Michal Suchánek, Max Reitz, Kevin Wolf, qemu-block,
	Michael S. Tsirkin, Richard W.M. Jones, qemu-devel, stefanha

On Wed, Jun 06, 2018 at 11:14:32AM -0300, Eduardo Habkost wrote:
> On Wed, Jun 06, 2018 at 02:50:10PM +0100, Daniel P. Berrangé wrote:
> > On Wed, Jun 06, 2018 at 03:45:10PM +0200, Michal Suchánek wrote:
> > > 
> > > I think that *if* we want an 'appliance' format that stores a whole VM
> > > in a single file to ease VM distribution then the logical place to look
> > > in qemu is qcow. The reasons have been explained at length.
> > 
> > I rather disagree. This is a common problem beyond just QEMU and everyone
> > just uses an existing archive format (TAR, ZIP) for bundling together
> > one or more disk images, metadata for config, and whatever other resources
> > are applicable for the vendor.  This works with any disk format (raw,
> > qcow2, vmdk, vpc, etc) so is preferable to inventing something that is
> > specific to qcow2 IMHO.
> 
> Now we have N+1 appliance file formats.  :)
> 
> (Like it or not, qcow2 is already used as an appliance format
> for single-disk VMs in practice.)
> 
> But I agree this must not be specific to qcow2.  The same VM
> description format we agree upon should work with other disk
> formats or with multi-disk appliances.
> 
> If we specify a reasonable VM description format for appliances
> and make it work inside (e.g.) tar files, we will still have the
> option of allowing the description to be placed inside qcow2 if we
> really want to.  I don't think we need to finish this qcow2
> bikeshedding exercise right now.

Yes, I think that is sensible, as once we actually try it out in real
world cases, we might then find a tar/zip is sufficient after all and
we don't need to do something extra for qcow2. Also means we can do
experiments without committing to a qcow2 format spec change right
away.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 11:40                     ` Richard W.M. Jones
@ 2018-06-06 14:31                       ` Michael S. Tsirkin
  0 siblings, 0 replies; 157+ messages in thread
From: Michael S. Tsirkin @ 2018-06-06 14:31 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: Max Reitz, Michal Suchánek, Kevin Wolf, qemu-devel,
	stefanha, ehabkost, qemu-block

On Wed, Jun 06, 2018 at 12:40:35PM +0100, Richard W.M. Jones wrote:
> I started off a long reply here, but I think you're right.  If we
> cannot make people decide on and use a proper disk image + metadata
> container, then it's also unlikely we'll get them to add sensible
> metadata to their qcow2 images either :-(
> 
> Rich.

But we can do it for them: ask how to run the image, then store that
in the image forever.

-- 
MST

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 12:59                           ` Max Reitz
@ 2018-06-06 14:31                             ` Dr. David Alan Gilbert
  2018-06-06 14:37                               ` Daniel P. Berrangé
  2018-06-06 14:51                               ` Max Reitz
  0 siblings, 2 replies; 157+ messages in thread
From: Dr. David Alan Gilbert @ 2018-06-06 14:31 UTC (permalink / raw)
  To: Max Reitz
  Cc: Kevin Wolf, qemu-block, Michael S. Tsirkin, armbru, qemu-devel,
	Richard W.M. Jones, stefanha

* Max Reitz (mreitz@redhat.com) wrote:
> On 2018-06-06 14:00, Dr. David Alan Gilbert wrote:
> > * Max Reitz (mreitz@redhat.com) wrote:
> >> On 2018-06-06 13:14, Dr. David Alan Gilbert wrote:
> >>> * Max Reitz (mreitz@redhat.com) wrote:
> >>>> On 2018-06-05 11:21, Dr. David Alan Gilbert wrote:
> >>>>> <reawakening a fizzled out thread>

<snip>

> >>> The problem with having a separate file is that you either have to copy
> >>> it around with the image 
> >>
> >> Which is just an inconvenience.
> > 
> > It's more than that;  if it's a separate file then the tools can't
> > rely on users supplying it, and frankly they won't and they'll still
> > just supply an image.
> 
> At which point you throw an error and tell them to specify the config file.

No:
   a) At the moment they get away with it for images since they're all
      'pc' and the management layers do the right thing.
   b) They'll give the wrong config file - then you'd need to add a flag
      to detect that - which means you'd need to add something to the
      qcow to match it to the config; loop back to the start!

We should make this EASY for users.

> >> I understand it is an inconvenience and it would be nice to change it,
> >> but please understand that I do not want qcow2 to become a filesystem
> >> just to relieve an inconvenience.
> > 
> > I very much don't want it to be a filesystem; my reason for writing
> > down my spec the way I did was to make it clear that the only
> > thing I want of qcow2 is a single blob, no more; I don't want naming
> > of the blob or anything else.
> > 
> >> (Note: I understand that you may not want qcow2 to become a filesystem,
> >> but I do get the impression from others.)
> > 
> > My aim was to specify it to fulfill the requirements that everyone
> > else had asked for, but still only having one unmodifiable blob in qcow.
> > 
> >>>                           or have an archive. If you have an archive
> >>> you have to have an unpacking step which then copies, potentially a lot
> >>> of data taking some reasonable amount of time.
> >>
> >> I'm sure this can be optimized, but yes, I get that.
> >>
> >> (If you use e.g. tar and store the image data starting on an FS cluster
> >> boundary (64 kB should be more than sufficient), I assume there is a way
> >> to extract that data into a new file without copying anything.)
> > 
> > But then we have to modify all the current things that know how to
> > handle a qcow2.
> 
> Not in this case because it'd still be a flat qcow2 file in a simple tar
> archive.
> 
> But you're right if we had a more complex format (like chunks stored in
> a tar file).

My only problem with using the tar like that is that all tools
everywhere would need to be updated to be able to parse them.
(Note if adding a blob to qcow2 like I'm asking for would break existing
qcow2 users then I don't want it either).
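
To make that concrete, a minimal sketch (Python, stdlib tarfile only) of
the extra step an updated tool would need before it could touch the image
in-place inside a plain, uncompressed tar. The member name "disk.qcow2"
and the stand-in payload are assumptions for illustration; nothing in
this thread fixes a layout.

```python
import io
import tarfile


def member_data_range(tar_bytes, name):
    """Return (offset, size) of a member's payload in an uncompressed tar."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:") as tar:
        info = tar.getmember(name)
        # offset_data is where the member's bytes start inside the archive.
        return info.offset_data, info.size


def build_demo_tar():
    """Build an in-memory tar holding one stand-in image (hypothetical name)."""
    payload = b"QFI\xfb" + b"\0" * 60  # stand-in bytes, not a valid qcow2 image
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        info = tarfile.TarInfo("disk.qcow2")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue(), payload


tar_bytes, payload = build_demo_tar()
offset, size = member_data_range(tar_bytes, "disk.qcow2")
# The image bytes are readable straight out of the archive, no unpacking.
in_place = tar_bytes[offset:offset + size]
```

The payload is only aligned to tar's 512-byte blocks here; getting it onto
a larger boundary would need padding that this sketch does not attempt.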

> >>>                                                 Storing a simple bit
> >>> of data with the image avoids that.
> >>
> >> It is not a simple bit of data, as evidenced by the discussion about
> >> storing binary blobs and MIME types going on.
> > 
> > All of the things they've suggested can be done inside that one blob;
> > even inside the json (or any other structure in that blob).
> 
> Right, from qcow2's perspective it's a blob of data.  But you can put a
> whole filesystem into a blob of data, and I get the impression that this
> is what some are trying to do.
> 
> Once we store larger amounts of binary data in that blob (which is what
> I'm fearing from comments on MIME types and PNG images), people will
> realize that always having to re-store the whole blob if you modify
> something in the middle is inefficient and that it needs to be
> optimized.  I don't think you want to do that, but we haven't
> implemented any of this yet and people are already asking for such
> binary data inside of the blob.
> 
> I suspect it'll only get worse over time.
> I think the most difficult thing about this discussion is that there are
> different targets.
> 
> You just want to store a bit of information.  OK, good, but then I'd say
> we could even just prepend that to the image file in a small header.


I think you're over-reading what people are asking for.
I think the PNG suggestion is again the 'label on the front' for a logo.
I've not seen anything that's not for either:
  a) The user to know what the image is
  b) The management layer to know what type of VM to create

> (Note that extending that header would not even be too complicated,
> because you can easily move the qcow2 header somewhere else.  Say you
> move it back by one cluster (e.g. 64 kB), then you just put the cluster
> that was there originally to the end of the file, which is pretty much
> trivial.  Then you copy that original data there and overwrite it with
> the image header.  Done.)
> 
> Others want to store more binary data.  Then this may get inefficient
> and insufficient.  But I'd think at this point it gets really
> problematic to put the data into the qcow2 file because it really
> doesn't belong there.  (I can't imagine anything that would warrant a
> MIME type.)

No, I can't imagine why anyone wants a MIME type either.

> Then I've heard proposals of storing multiple disk images.  Yes, you
> could store multiple disks inside of a single qcow2 file, but it would
> be basically exactly the same as storing just multiple qcow2 files, so...

No, completely agree.

> And really, I still believe in my slippery slope argument, which means
> that even if you just want to innocently store a machine type, we will
> end up with something vastly more complex in the end.
> 
> Finally, it appears to me that you have a simple problem, found one
> possible solution, and now you just focus on that solution instead of
> taking a step back and looking at the problem again.
> 
> The problem: You want to store a binary blob and a disk image together.
> 
> Your solution: qcow2 has refcounting and thus "occupation bits".  You
> can put data into it and it will leave it alone, as long as that area is
> marked as occupied.  Let's put the data into the qcow2 file.
> 
> OK, let's look at the problem and its constraints again.
> 
> Hard constraint: Store a single file.
> (I don't think this is a hard constraint, because I haven't been
> convinced yet that handling more than a single file is so bad.)

See above; I think it is.
My other hard constraint is that no tool has to change unless
it wants to make use of the new data.

> Soft constraint: Max doesn't like storing blobs in qcow2.
> 
> So one solution is to ignore the soft constraint.  OK, valid solution, I
> give you that.  But it doesn't leave me content, probably understandably so.
> 
> 
> So let me try to understand how we end up with qcow2 as a result...  We
> need a single file that needs to contain both the disk data and a binary
> blob.  Or, well, even better would be if that file can store multiple
> arbitrary objects, in a format of your choosing, but that makes things
> more complicated, so let's leave that off for now.
> 
> So all you need is object storage (probably with a single root object
> that references the rest in a custom format) and a way to tell which
> areas of the file are occupied.  Now the issue is that both the disk
> image and the blob may grow.  So both need mutual understanding of which
> areas are occupied and which can be used for growth.  For the disk
> image, the block layer would definitely need a driver to handle that,
> which is not impossible.  But qcow2 would automatically handle it.
> 
> So, OK, for now this is my result.  If we create a new format, we'd need
> a block driver for it (underneath qcow2) that handles the allocation.
> With qcow2, we'd get it for free.
> 
> 
> Hm, OK.
> 
> The simplest implementation for such an additional layer would get away
> without actual occupation bits and just always allocate new storage at
> the end of the file.  That should be sufficient, it would be quick and
> not very complex.  But I see that it is additional complexity when
> compared with just adding the blob to qcow2.
> 
> 
> Well, in a sense, because we'd need block layer interfaces for
> extracting the information from a qcow2 file through qemu-img.  So maybe
> adding another block driver would actually mean less complexity...
> 
> 
> [...]
> 
> >>>> But I really, really, really do not like storing arbitrary data in qcow2
> >>>> files.  I hated it badly enough when qemu knew what to do with it, but I
> >>>> hate it even more when even qemu has no idea what to do with it.
> >>>>
> >>>> Having a specification of what everything means in the qemu tree makes
> >>>> things less unbearable, but not to my liking still.
> >>>
> >>> Have you said why you hate it so much?
> >>> Your hate for it seems to be making a simple solution hard.
> >>
> >> Because it's a disk image format.  Data therein should be relevant to
> >> the disk image.  I see qcow2 as a representation of data stored on a
> >> physical storage medium.
> > 
> > What we're missing here is the notes scribbled on the sticky label on
> > the disc;  you rarely need them on a physical drive in a computer,
> > LUNs on a SAN don't need them that much because they have a full
> > filesystem and don't move about much.  Here we're talking about an image
> > being downloaded or sent between people.
> 
> Well, qcow2 doesn't even describe the device type, so the sticky label
> may be off limits.
> 
> But really, if you create a VM, you need a configuration.  Like if you
> set up a new computer, you need to know what you want.  Usually there is
> no sticky label, but you just have to know and input it manually.  Maybe
> you have a sheet of paper, which I'd call the configuration file.

Most things are figurable-out by the management tools/defaults or
are dependent on the whim of the user - we're only trying to stop the
user doing things that won't work.
Simpler example; what stops you trying to put the PPC qcow image into
your x86 VM system - nothing that I know of.  I just want to stop the
users shooting themselves in the foot.
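
As a rough sketch of the kind of check I mean (Python, assuming the blob
is JSON and reusing the two keys suggested earlier in this thread; the
function name and the messages are made up for illustration, and no
existing tool does this):

```python
import json


def check_requirements(blob_text, machine_type, ram_mb):
    """Return a list of reasons the image should not be used; empty if fine."""
    reqs = json.loads(blob_text)
    problems = []

    # Keys are the ones suggested in this thread; absent keys impose nothing.
    allowed = reqs.get("qemu.machine-types")
    if allowed is not None and machine_type not in allowed:
        problems.append(
            "machine type %r not in %r" % (machine_type, allowed))

    min_ram = reqs.get("qemu.min-ram-MB")
    if min_ram is not None and ram_mb < min_ram:
        problems.append(
            "needs at least %d MB of RAM, got %d" % (min_ram, ram_mb))

    return problems


blob = '{"qemu.machine-types": ["q35", "i440fx"], "qemu.min-ram-MB": 1024}'
ok = check_requirements(blob, "q35", 2048)
bad = check_requirements(blob, "pseries", 512)
```

The point is that the blob only carries requirements: the management
layer still picks everything else, and only refuses combinations the
image says cannot work.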

> >> Some metadata associated directly with that is fine (such as dirty
> >> bitmaps, backing chains, things like that).  But configuring the whole
> >> VM seems out of scope to me.
> >>
> >> Also, making qcow2 a filesystem is not a simple solution.
> >>
> >> ...OK, let me back off here, I may be over-interpreting things and
> >> throwing opinions of different people into one pot.
> >>
> >> Maybe you don't want qcow2 to be a filesystem, and you just want to
> >> store a single binary blob.  Well, OK, that's not that bad.  But in any
> >> case, I wouldn't call it a simple solution anymore.
> >>
> >> Yes, storing just the machine type somewhere would be possible with a
> >> simple solution; but as I said (and the whole thread shows since then),
> >> this is a slippery slope, and suddenly we arrive at storing arbitrary
> >> binary data (like images?!) along with MIME types.  That will not be
> >> possible with a simple solution anymore, I don't think.
> > 
> > Right; I was thinking we were too far down that slope to get rid
> > of all of those requirements, but I was trying to force it back to
> > being a single blob as far as QCOW2 saw it.
> 
> A valiant effort, but I myself cannot see why we should forbid storing
> more data once we started storing some data.  I myself do think that if
> we store some VM configuration, we should be able to store all of it,
> and allow for arbitrarily complex scenarios.
> 
> >>>>> --------------------------------------------------------------
> >>>>>    
> >>>>>
> >>>>> Some reasoning:
> >>>>>    a) I've avoided the problem of when QEMU interprets the value
> >>>>>       by ignoring it and giving it to management layers at the point
> >>>>>       of VM import.
> >>>>
> >>>> Yes, but in the process you've made it completely opaque to qemu,
> >>>> basically, which doesn't really make it better for me.  Not that
> >>>> qemu-specific information in qcow2 files would be what I want, but, well.
> >>>>
> >>>> But it does solve technical issues, I concede that.
> >>>>
> >>>>>    b) I hate JSON, but there again nailing down a fixed format
> >>>>>       seems easiest and it makes the job of QCOW easy - a single
> >>>>>       string.
> >>>>
> >>>> Not really.  The string can be rather long, so you probably don't want
> >>>> to store it in the image header, and thus it's just a binary blob from
> >>>> qcow2's perspective, essentially.
> >>>
> >>> Yes, but it's a single blob - I'm not asking for multiple keyed blobs
> >>> or the ability to update individual blobs; just one blob that I can
> >>> replace.
> >>
> >> OK, you aren't, but others seem to be.
> >>
> >> Or, well, you call it a single blob.  But actually the current ideas
> >> seem to be to store a rather large configuration tree with binary data
> >> in that blob, so to me personally there is absolutely no functional
> >> difference to just storing a tar file in that blob.
> >>
> >> So correct me if I'm wrong, but to me it appears that you effectively
> >> want to store a filesystem in qcow2.[1]  Well, that's better than making
> >> qcow2 the filesystem, but it still appears just the wrong way around to me.
> > 
> > It's different in the sense that what we end up with is still a qcow2;
> > anything that just handles qcow2's and can pass them through doesn't
> > need to do anything different; users don't need to do anything
> > different.  No one has to pack/unpack the file.
> 
> Packing/unpacking is a strawman because I'm doing my best to give
> proposals that completely avoid that.
> 
> Users do need to do something different, because users do need to
> realize that today there is no way to store VM configuration and disk
> data in a single file.  So if they already start VMs just based on a
> disk, then they are assuming behavior we do not have and that I'd call
> naive.  But that is a strawman from my side, sorry.  Keeping naive users
> happy is probably OK.

Remember this all works fine now and has done for many years;
it's the addition of q35 that breaks that assumption.
The users can already blindly pick up the qcow2 image and stuff it in and
it all works; all I want is for that to keep working.

> Keeping tools working is a good argument, but I'm not exactly sure what
> the use cases are.  What I'd want is that in the end we have a way of
> configuring a whole VM in a single file.[1]  Then, that file is no
> longer just a disk image, it is a whole VM.  So maybe those tools need
> to be adjusted anyway.
> 
> I assume that we have tools that work on disk images, and we trivially
> want to keep them working on that VM's disk image without having to
> incorporate a block layer.  Depending on the format we choose, that may
> be very simple (maybe just use an offset for the qcow2 header).
> 
> But if we want to store a whole VM in a single file, then storing
> multiple disk images in that single file does not seem too far off to
> me, and that would mean breaking those tools anyway.
> 
> [1] I still don't quite see the point, because just using more than a
> single file is so much easier.
> 
> >> [1] Yes, I know that the guest disk already contains an FS. :-P
> >>
> >>>>>       (I would suggest in layer2 that the keys are sorted, but
> >>>>>       that's a pain to do in some json creators)
> >>>>>    c) Forcing the registry of keys might avoid silly duplication.
> >>>>>       We can but hope.
> >>>>>    d) I've not said it's a libvirt XML file since that seems
> >>>>>       a bit prescriptive.
> >>>>>
> >>>>> Some initial suggested keys:
> >>>>>
> >>>>>    "qemu.machine-types": [ "q35", "i440fx" ]
> >>>>>    "qemu.min-ram-MB": 1024
> >>>>
> >>>> I still don't understand why you'd want to put the configuration into
> >>>> qcow2 instead of the other way around.
> >>>>
> >>>> Or why you'd want to use a single file at all, because as this whole
> >>>> thread shows, a disk image alone is clearly not sufficient to describe a VM.
> >>>>
> >>>> (Or it may be in simple cases, but then that's because you don't need
> >>>> any configuration.)
> >>>
> >>> Because it avoids the unpacking associated with archives.
> >>
> >> I'm not talking about unpacking.  I'm talking about a potentially new
> >> format which allows accessing the qcow2 file in-place.  It would
> >> probably be trivial to write a block driver to allow this.
> >>
> >> (And as I wrote in my response to Michal, I suspect that tar could
> >> actually allow this, even though it would probably not be the ideal format.)
> > 
> > As above, I don't think this is trivial; you have to change all the
> > layers;  let's say it was a tar; you'd have to somehow know that you're
> > importing one of these special tars,
> 
> Which is trivial because it's just "Hey, look, it's a tar with that
> description file".

Trivial? It's taking 100+ mails to add a tag to a qcow2 file! Can you
imagine what it takes to change libvirt, openstack, ovirt and the rest?


> >                                      you also have to have a tool to
> > create them;
> 
> Also trivial.  Non-trivial is modifying them.
> 
> The workflow would be to create the tar with an empty qcow2 file, the VM
> description you want, and then just using it.
> 
> Yes, using is more difficult, but it wouldn't be an own tool, it would
> be built into qemu.  I can't say how difficult that implementation would
> be, but it would not be trivial, that is correct.
> 
> >              and you have to worry about whether that alignment
> > is correct for the storage/memory you're using it with.
> 
> Which would be difficult with tar, right.  But we don't have to use tar.
> 
> (And, no, I don't think creating a new container format is not worse for
> interoperability than adding a blob to qcow2.)

If you were going to do this then you'd end up just using OVA.
You couldn't justify yet another format.

Dave

> Max
> 



--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 14:02                                 ` Dr. David Alan Gilbert
@ 2018-06-06 14:33                                   ` Max Reitz
  2018-06-06 14:41                                     ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 157+ messages in thread
From: Max Reitz @ 2018-06-06 14:33 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Kevin Wolf, ehabkost, qemu-block, Michael S. Tsirkin,
	Richard W.M. Jones, qemu-devel, stefanha, Michal Suchánek

On 2018-06-06 16:02, Dr. David Alan Gilbert wrote:
> * Max Reitz (mreitz@redhat.com) wrote:
>> On 2018-06-06 14:16, Dr. David Alan Gilbert wrote:
>>> * Max Reitz (mreitz@redhat.com) wrote:
>>>> On 2018-06-06 13:37, Dr. David Alan Gilbert wrote:
>>>>> * Max Reitz (mreitz@redhat.com) wrote:
>>>>>> On 2018-06-06 13:19, Michal Suchánek wrote:
>>>>>>> On Wed, 6 Jun 2018 13:02:53 +0200
>>>>>>> Max Reitz <mreitz@redhat.com> wrote:
>>>>>>>
>>>>>>>> On 2018-06-06 12:32, Michal Suchánek wrote:
>>>>>>>>> On Tue, 29 May 2018 12:14:15 +0200
>>>>>>>>> Max Reitz <mreitz@redhat.com> wrote:
>>>>
>>>> [...]
>>>>
>>>>>>>>>> Unless I have got something terribly wrong (which is indeed a
>>>>>>>>>> possibility!), to me this proposal means basically to turn qcow2
>>>>>>>>>> into (1) a VM description format for qemu, and (2) to turn it into
>>>>>>>>>> an archive format on the way.  
>>>>>>>>>
>>>>>>>>> And if you go all the way you can store multiple disks along with
>>>>>>>>> the VM definition so you can have the whole appliance in one file.
>>>>>>>>> It conveniently solves the problem of synchronizing snapshots across
>>>>>>>>> multiple disk images and the question where to store the machine
>>>>>>>>> state if you want to suspend it.   
>>>>>>>>
>>>>>>>> Yeah, but why make qcow2 that format?  That's what I completely fail
>>>>>>>> to understand.
>>>>>>>>
>>>>>>>> If you want to have a single VM description file that contains the VM
>>>>>>>> configuration and some qcow2/raw/whatever files along with it for the
>>>>>>>> guest disk data, sure, go ahead.  But why does the format of the whole
>>>>>>>> thing need to be qcow2?
>>>>>>>
>>>>>>> Because then qemu can access the disk data from the image directly
>>>>>>> without any need for extraction, copying to different file, etc.
>>>>>>
>>>>>> This does not explain why it needs to be qcow2.  There is absolutely no
>>>>>> reason why you couldn't use qcow2 files in-place inside of another file.
>>>>>
>>>>> Because then we'd have to change the whole stack to take advantage of
>>>>> that.  Adding a feature into qcow2 means nothing else changes.
>>>>
>>>> Because it's a hack, right.  Storing binary data in a qcow2 file,
>>>> completely ignoring it in qemu (and being completely unusable to any
>>>> potential other users of the qcow2 format[1]) and only interpreting it
>>>> somewhere up the stack is a hack.
>>>
>>> It's not a hack!
>>> Seriously it's not.
>>> There's nothing wrong with it being aimed higher up the stack than qemu,
>>
>> Not really, but storing that information in a disk image file is, from
>> my perspective.  So far, qcow2 was always just for qemu.  (Hmm...  Maybe
>> backing links weren't, but at least they were intended for qemu originally.)
>>
>> So this would mix information for different layers inside qcow2 which to
>> me sounds weird.  Maybe I just have to get used to it.
> 
> The important point is it's explicitly for a different layer; we're not
> mixing it - the guest can never get to this information.

Neither can it get to bitmaps, but bitmaps are still for qemu.

>                                                           It also saves
> the higher level management layers ever having to look at the data the
> guest can get to, which is a security advantage.

Er, well, yes, but guessing configuration options from the guest disk
contents is definitely a bad idea, I agree on that anyway.

> From my point of view, it really is the sticky label on the disc rather
> than the contents of it.

Sure, which is why I wouldn't put it in qcow2.  Content and meta-content
is what qcow2 currently stores, but not how to use it.

>>> the problem we started off with was what happens when a user downloads
>>> a VM image and tries to import it into their VM system;
>>
>> Well, the VM system should choke without a config file. O:-)
>>
>>>                                                         we've already
>>> got 2+ layers of management stuff in there - I want the information to
>>> guide those layers, not form a complete set of configuration.
>>
>> But I do.
>>
>> If we store some information, I don't see why we don't store all of it.
> 
> Hmm, now that generally I don't like:

Me neither.

>   a) That really would make it hypervisor specific

Yes.

>   b) It would be a lot of data

Yes.

>   c) Generally, the supplier of an image doesn't know how the end-user
>      wants it configured - although for some appliances they might.

Well, yes.  But just storing very basic information limits the use case
to a very basic case anyway, doesn't it?  So this wouldn't be worse.

Everything beyond a very basic use case can expect the user to take 30
seconds to download and pass a config file.

>   d) Remember the only problem we had when we got here was how to stop
>      the user shooting themselves in the foot by connecting the wrong
>      image to the wrong VM type.

Hm.  How exactly is that shooting yourself in the foot?  Won't it just
not work?

>                                   So I'm expecting to use this to
>      contain requirements, nothing more.

I assumed you'd want to relieve users of having to specify config
options in basic use cases.  This is why I believed it would be natural
to expand that scope.

So why is it so dangerous to connect a disk you just downloaded to e.g.
the wrong machine type?  I assumed it just wouldn't work and you'd try
again, until you realized that maybe you should read the download
description and do as it says ("download this config file, pass it").

>>>> That is not necessarily a negative point, hacks can work wonderfully
>>>> well, and they usually are simple, that is correct.  But the thing is
>>>> that I feel like people have grand visions of what to get out of this.
>>>> Imagine, a single file that can configure all and any VM!
>>>>
>>>> But hacks usually only solve a single issue.  Once you try to extend a
>>>> hack, it breaks down and becomes insufficient.
>>>>
>>>> If we want a grand vision where a single file stores the whole VM, why
>>>> not invest the work and make it right from the start?
>>>
>>> Because we won't get it right; however much we bikeshed about it
>>> we'll just end up with a mess.
>>
>> Sure, but the same thing applies to putting it into qcow2.  The
>> difference is, for something outside of qcow2, throwing it away and
>> starting over is simple.
>>
>> When putting it into qcow2, we can only do that if we really just put a
>> binary blob there that isn't described in the specification.
> 
> Well, it's why I'm going for defined key/values that are stored in the
> blob and only a few of them.  We've got a reasonable chance of being
> able to define what we want from 3-4 key/values, it should be a lot
> easier than trying to define a grand scheme.

Yes, as long as we can agree on how we can justify to future generations
why we really only want these very few very specific values.

>>>                                  The right thing is to put in something
>>> to hold configuration and then review the items of configuration we
>>> add properly as we define them.
>>
>> OK, but review them on what terms?  Whether they are simple enough?
> 
> Well, I'll take simple, make sense - whatever - feel free to be the
> maintainer for that list!

OK, none. :-P

It would need to be a very strict requirement, in any case.  Just
"simple" or "dense" do not suffice, because those can be stretched.

>> As I said, I would want a whole configuration if we allow some
>> configuration.
> 
> Well then you could go for libvirt XML as the contents; but I think
> that's crossing layers even more.
> (I would have veered more to it being exactly the same as an OVA
> description except for rjones' dislike of it)

Well, the format doesn't really matter for now, I think it's most
important to first talk about what kind of scope we want.

>> (More below)
>>
>>>> [1] Yes, I concede that there are probably no other users of qcow2.  But
>>>> please forgive me for assuming that qcow2 was in a sense designed to be
>>>> a rather general image format that not only qemu could use.
>>>
>>> What makes it QEMU specific?  It's basically just the same key/value
>>> setup as OVA, except putting them inside the qcow2.
>>
>> Well, not necessarily qemu-specific, but
>> ${management_software}-specific, which comes down to the same.  Or,
>> well, I'd think that, but hold on.
>>
>>> We could use the same keys/value definitions as OVA in the blob,
>>> although their definitions aren't very portable either.
>>
>> So you're proposing that we only add options that seem portable for any VM?
> 
> Hmm.  We should probably split them, so there should be general options
> (e.g. minimum-ram) but also hypervisor specifics
> (qemu.machine-class=q35); but that doesn't mean you can't add keys
> for multiple hypervisors into the one blob.  I mean
> something like:
>     minimum-ram = 1G
>     qemu.machine-class = q35
>     anothervm.machine-class = ....

Well, and that's my issue.  Once you have application-specific info, you
can go wild.  And I would go wild, without a reasonable and strict
requirement that the information we want to store has to fulfill.

For the record, I would've liked it if you'd said "only portable
options".  But I would have replied that I would fear we'd still end up
with someone saying "I'd like to store X and Y, let's just put them into
the specification, then they are portable [even if only this stack
supports them]" and we wouldn't really have won anything.

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 14:31                             ` Dr. David Alan Gilbert
@ 2018-06-06 14:37                               ` Daniel P. Berrangé
  2018-06-06 14:42                                 ` Dr. David Alan Gilbert
  2018-06-06 14:51                               ` Max Reitz
  1 sibling, 1 reply; 157+ messages in thread
From: Daniel P. Berrangé @ 2018-06-06 14:37 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Max Reitz, Kevin Wolf, qemu-block, Michael S. Tsirkin, armbru,
	qemu-devel, Richard W.M. Jones, stefanha

On Wed, Jun 06, 2018 at 03:31:35PM +0100, Dr. David Alan Gilbert wrote:
> > Not in this case because it'd still be a flat qcow2 file in a simple tar
> > archive.
> > 
> > But you're right if we had a more complex format (like chunks stored in
> > a tar file).
> 
> My only problem with using the tar like that is that all tools
> everywhere would need to be updated to be able to parse them.

I feel it is the opposite actually. By adding named blobs or custom
strings to qcow2, we've effectively invented a new type of archive
format, except apps can't use the normal unzip/tar tools/APIs they
already have. Instead they need to use qemu-img to read, add, or remove
blobs from qcow2. It is very compelling to use an existing archive
format like tar/zip because every language has APIs for dealing
with them and apps probably already do this for things like OVA.
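
To illustrate the point about existing archive APIs, here is a minimal
sketch using only Python's standard tarfile module. The member names
"vm-config.json" and "disk.qcow2" and the key "qemu.machine-class" are
assumptions made for this example, not an agreed layout. It reads the
config and reports where the image bytes sit inside an uncompressed tar,
which is the information an in-place reader would need.

```python
import io
import json
import tarfile


def build_bundle(config, disk_bytes):
    """Pack a JSON config and a disk image into an uncompressed tar (in memory)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        members = (
            ("vm-config.json", json.dumps(config).encode("utf-8")),  # hypothetical name
            ("disk.qcow2", disk_bytes),                              # hypothetical name
        )
        for name, data in members:
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def inspect_bundle(bundle):
    """Return (config, offset, size); offset/size locate the image inside the tar."""
    with tarfile.open(fileobj=io.BytesIO(bundle), mode="r:") as tar:
        config = json.load(tar.extractfile("vm-config.json"))
        disk = tar.getmember("disk.qcow2")
        # offset_data is the byte position of the member's payload in the archive.
        return config, disk.offset_data, disk.size


# A 64-byte stand-in for an image: the qcow2 magic followed by zero padding.
fake_image = b"QFI\xfb" + b"\0" * 60
bundle = build_bundle({"qemu.machine-class": "q35"}, fake_image)
config, offset, size = inspect_bundle(bundle)
print(config, offset, size)
```

Because the archive is uncompressed, the image payload is a contiguous
byte range, so nothing has to be copied out just to find it.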

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 14:33                                   ` Max Reitz
@ 2018-06-06 14:41                                     ` Dr. David Alan Gilbert
  2018-06-06 14:55                                       ` Max Reitz
  0 siblings, 1 reply; 157+ messages in thread
From: Dr. David Alan Gilbert @ 2018-06-06 14:41 UTC (permalink / raw)
  To: Max Reitz
  Cc: Kevin Wolf, ehabkost, qemu-block, Michael S. Tsirkin,
	Richard W.M. Jones, qemu-devel, stefanha, Michal Suchánek

* Max Reitz (mreitz@redhat.com) wrote:
> On 2018-06-06 16:02, Dr. David Alan Gilbert wrote:
> > * Max Reitz (mreitz@redhat.com) wrote:
> >> On 2018-06-06 14:16, Dr. David Alan Gilbert wrote:
> >>> * Max Reitz (mreitz@redhat.com) wrote:
> >>>> On 2018-06-06 13:37, Dr. David Alan Gilbert wrote:
> >>>>> * Max Reitz (mreitz@redhat.com) wrote:
> >>>>>> On 2018-06-06 13:19, Michal Suchánek wrote:
> >>>>>>> On Wed, 6 Jun 2018 13:02:53 +0200
> >>>>>>> Max Reitz <mreitz@redhat.com> wrote:
> >>>>>>>
> >>>>>>>> On 2018-06-06 12:32, Michal Suchánek wrote:
> >>>>>>>>> On Tue, 29 May 2018 12:14:15 +0200
> >>>>>>>>> Max Reitz <mreitz@redhat.com> wrote:
> >>>>
> >>>> [...]
> >>>>
> >>>>>>>>>> Unless I have got something terribly wrong (which is indeed a
> >>>>>>>>>> possibility!), to me this proposal means basically to turn qcow2
> >>>>>>>>>> into (1) a VM description format for qemu, and (2) to turn it into
> >>>>>>>>>> an archive format on the way.  
> >>>>>>>>>
> >>>>>>>>> And if you go all the way you can store multiple disks along with
> >>>>>>>>> the VM definition so you can have the whole appliance in one file.
> >>>>>>>>> It conveniently solves the problem of synchronizing snapshots across
> >>>>>>>>> multiple disk images and the question where to store the machine
> >>>>>>>>> state if you want to suspend it.   
> >>>>>>>>
> >>>>>>>> Yeah, but why make qcow2 that format?  That's what I completely fail
> >>>>>>>> to understand.
> >>>>>>>>
> >>>>>>>> If you want to have a single VM description file that contains the VM
> >>>>>>>> configuration and some qcow2/raw/whatever files along with it for the
> >>>>>>>> guest disk data, sure, go ahead.  But why does the format of the whole
> >>>>>>>> thing need to be qcow2?
> >>>>>>>
> >>>>>>> Because then qemu can access the disk data from the image directly
> >>>>>>> without any need for extraction, copying to different file, etc.
> >>>>>>
> >>>>>> This does not explain why it needs to be qcow2.  There is absolutely no
> >>>>>> reason why you couldn't use qcow2 files in-place inside of another file.
> >>>>>
> >>>>> Because then we'd have to change the whole stack to take advantage of
> >>>>> that.  Adding a feature into qcow2 means nothing else changes.
> >>>>
> >>>> Because it's a hack, right.  Storing binary data in a qcow2 file,
> >>>> completely ignoring it in qemu (and being completely unusable to any
> >>>> potential other users of the qcow2 format[1]) and only interpreting it
> >>>> somewhere up the stack is a hack.
> >>>
> >>> It's not a hack!
> >>> Seriously it's not.
> >>> There's nothing wrong with it being aimed higher up the stack than qemu,
> >>
> >> Not really, but storing that information in a disk image file is, from
> >> my perspective.  So far, qcow2 was always just for qemu.  (Hmm...  Maybe
> >> backing links weren't, but at least they were intended for qemu originally.)
> >>
> >> So this would mix information for different layers inside qcow2 which to
> >> me sounds weird.  Maybe I just have to get used to it.
> > 
> > The important point is it's explicitly for a different layer; we're not
> > mixing it - the guest can never get to this information.
> 
> Neither can it get to bitmaps, but bitmaps are still for qemu.
> 
> >                                                           It also saves
> > the higher level management layers ever having to look at the data the
> > guest can get to, which is a security advantage.
> 
> Er, well, yes, but guessing configuration options from the guest disk
> contents is definitely a bad idea, I agree on that anyway.
> 
> > From my point of view, it really is the sticky label on the disc rather
> > than the contents of it.
> 
> Sure, which is why I wouldn't put it in qcow2.  Content and meta-content
> is what qcow2 currently stores, but not how to use it.
> 
> >>> the problem we started off with was what happens when a user downloads
> >>> a VM image and tries to import it into their VM system;
> >>
> >> Well, the VM system should choke without a config file. O:-)
> >>
> >>>                                                         we've already
> >>> got 2+ layers of management stuff in there - I want the information to
> >>> guide those layers, not form a complete set of configuration.
> >>
> >> But I do.
> >>
> >> If we store some information, I don't see why we don't store all of it.
> > 
> > Hmm, now that generally I don't like:
> 
> Me neither.
> 
> >   a) That really would make it hypervisor specific
> 
> Yes.
> 
> >   b) It would be a lot of data
> 
> Yes.
> 
> >   c) Generally, the supplier of an image doesn't know how the end-user
> >      wants it configured - although for some appliances they might.
> 
> Well, yes.  But just storing very basic information limits the use case
> to a very basic case anyway, doesn't it?  So this wouldn't be worse.
> 
> Everything beyond a very basic use case can expect the user to take 30
> seconds to download and pass a config file.
> 
> >   d) Remember the only problem we had when we got here was how to stop
> >      the user shooting themselves in the foot by connecting the wrong
> >      image to the wrong VM type.
> 
> Hm.  How exactly is that shooting yourself in the foot?  Won't it just
> not work?
> 
> >                                   So I'm expecting to use this to
> >      contain requirements, nothing more.
> 
> I assumed you'd want to relieve users of having to specify config
> options in basic use cases.  This is why I believed it would be natural
> to expand that scope.
> 
> So why is it so dangerous to connect a disk you just downloaded to e.g.
> the wrong machine type?  I assumed it just wouldn't work and you'd try
> again, until you realized that maybe you should read the download
> description and do as it says ("download this config file, pass it").

That's bad!  Stuff should just-work; it currently just works, things
should get better and easier for our users.  And anyway, not working for
EFI for example can be just a blank screen.  Seriously - keep it easy
for the user!

And with 'pc' type VMs being all that's around it does just-work.

> >>>> That is not necessarily a negative point, hacks can work wonderfully
> >>>> well, and they usually are simple, that is correct.  But the thing is
> >>>> that I feel like people have grand visions of what to get out of this.
> >>>> Imagine, a single file that can configure all and any VM!
> >>>>
> >>>> But hacks usually only solve a single issue.  Once you try to extend a
> >>>> hack, it breaks down and becomes insufficient.
> >>>>
> >>>> If we want a grand vision where a single file stores the whole VM, why
> >>>> not invest the work and make it right from the start?
> >>>
> >>> Because we won't get it right; however much we bikeshed about it
> >>> we'll just end up with a mess.
> >>
> >> Sure, but the same thing applies to putting it into qcow2.  The
> >> difference is, for something outside of qcow2, throwing it away and
> >> starting over is simple.
> >>
> >> When putting it into qcow2, we can only do that if we really just put a
> >> binary blob there that isn't described in the specification.
> > 
> > Well, it's why I'm going for defined key/values that are stored in the
> > blob and only a few of them.  We've got a reasonable chance of being
> > able to define what we want from 3-4 key/values, it should be a lot
> > easier than trying to define a grand scheme.
> 
> Yes, as long as we can agree on how we can justify to future generations
> why we really only want these very few very specific values.
> 
> >>>                                  The right thing is to put in something
> >>> to hold configuration and then review the items of configuration we
> >>> add properly as we define them.
> >>
> >> OK, but review them on what terms?  Whether they are simple enough?
> > 
> > Well, I'll take simple, make sense - whatever - feel free to be the
> > maintainer for that list!
> 
> OK, none. :-P
> 
> It would need to be a very strict requirement, in any case.  Just
> "simple" or "dense" do not suffice, because those can be stretched.
> 
> >> As I said, I would want a whole configuration if we allow some
> >> configuration.
> > 
> > Well then you could go for libvirt XML as the contents; but I think
> > that's crossing layers even more.
> > (I would have veered more to it being exactly the same as an OVA
> > description except for rjones dislike of it)
> 
> Well, the format doesn't really matter for now, I think it's most
> important to first talk about what kind of scope we want.
> 
> >> (More below)
> >>
> >>>> [1] Yes, I concede that there are probably no other users of qcow2.  But
> >>>> please forgive me for assuming that qcow2 was in a sense designed to be
> >>>> a rather general image format that not only qemu could use.
> >>>
> >>> What makes it QEMU specific?  It's basically just the same key/value
> >>> setup as OVA, except putting them inside the qcow2.
> >>
> >> Well, not necessarily qemu-specific, but
> >> ${management_software}-specific, which comes down to the same.  Or,
> >> well, I'd think that, but hold on.
> >>
> >>> We could use the same keys/value definitions as OVA in the blob,
> >>> although their definitions aren't very portable either.
> >>
> >> So you're proposing that we only add options that seem portable for any VM?
> > 
> > Hmm.  We should probably split them, so there should be general options
> > (e.g. minimum-ram) but also hypervisor specifics
> > (qemu.machine-class=q35); but that doesn't mean you can't add keys
> > for multiple hypervisors into the one blob.  I mean
> > something like:
> >     minimum-ram = 1G
> >     qemu.machine-class = q35
> >     anothervm.machine-class = ....
> 
> Well, and that's my issue.  Once you have application-specific info, you
> can go wild.  And I would go wild, without a reasonable and strict
> requirement that the information we want to store has to fulfill.
> 
> For the record, I would've liked it if you'd said "only portable
> options".  But I would have replied that I would fear we'd still end up
> with someone saying "I'd like to store X and Y, let's just put them into
> the specification, then they are portable [even if only this stack
> supports them]" and we wouldn't really have won anything.

I couldn't second guess every other hypervisor on the planet to know
whether specifying a machine class would work for them.
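
A minimal sketch of the split discussed above, with general keys next to
hypervisor-prefixed ones in the same blob. The "key = value" syntax
follows the quoted example; the "general" namespace name, the function
names, and the placeholder value "some-class" are assumptions for
illustration only.

```python
def parse_requirements(text):
    """Parse 'key = value' lines into {namespace: {key: value}}.

    Keys without a dot are filed under the "general" namespace.
    """
    result = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a key/value line: {raw!r}")
        key = key.strip()
        namespace, dot, name = key.partition(".")
        if not dot:
            namespace, name = "general", key
        result.setdefault(namespace, {})[name] = value.strip()
    return result


def view_for(requirements, hypervisor):
    """Merge the general keys with the keys of one hypervisor; ignore the rest."""
    merged = dict(requirements.get("general", {}))
    merged.update(requirements.get(hypervisor, {}))
    return merged


example = """
minimum-ram = 1G
qemu.machine-class = q35
anothervm.machine-class = some-class
"""
reqs = parse_requirements(example)
qemu_view = view_for(reqs, "qemu")
print(qemu_view)
```

Each consumer reads only its own prefix plus the general keys, so keys
for other hypervisors can sit in the blob without being understood.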

Dave

> Max
> 



--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 14:37                               ` Daniel P. Berrangé
@ 2018-06-06 14:42                                 ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 157+ messages in thread
From: Dr. David Alan Gilbert @ 2018-06-06 14:42 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Max Reitz, Kevin Wolf, qemu-block, Michael S. Tsirkin, armbru,
	qemu-devel, Richard W.M. Jones, stefanha

* Daniel P. Berrangé (berrange@redhat.com) wrote:
> On Wed, Jun 06, 2018 at 03:31:35PM +0100, Dr. David Alan Gilbert wrote:
> > > Not in this case because it'd still be a flat qcow2 file in a simple tar
> > > archive.
> > > 
> > > But you're right if we had a more complex format (like chunks stored in
> > > a tar file).
> > 
> > My only problem with using the tar like that is that all tools
> > everywhere would need to be updated to be able to parse them.
> 
> I feel it is the opposite actually. By adding named blobs or custom
> strings to qcow2, we've effectively invented a new type of archive
> format, except apps can't use the normal unzip/tar tools/APIs they
> already have. Instead they need to use qemu-img to read, add, or remove
> blobs from qcow2. It is very compelling to use an existing archive
> format like tar/zip because every language has APIs for dealing
> with them and apps probably already do this for things like OVA.

My thinking was that a qcow2 with this extra data would still work in
all our existing systems.

Dave

> Regards,
> Daniel
> -- 
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 11:02                   ` Max Reitz
  2018-06-06 11:19                     ` Michal Suchánek
  2018-06-06 11:40                     ` Richard W.M. Jones
@ 2018-06-06 14:43                     ` Michael S. Tsirkin
  2018-06-06 14:57                       ` Eric Blake
  2018-06-06 15:02                       ` Max Reitz
  2 siblings, 2 replies; 157+ messages in thread
From: Michael S. Tsirkin @ 2018-06-06 14:43 UTC (permalink / raw)
  To: Max Reitz
  Cc: Michal Suchánek, Kevin Wolf, Richard W.M. Jones, qemu-devel,
	stefanha, ehabkost, qemu-block, Eric Blake

On Wed, Jun 06, 2018 at 01:02:53PM +0200, Max Reitz wrote:
> Yeah, but why make qcow2 that format?  That's what I completely fail to
> understand.

Because why not? It's cheap to add it there and is much easier
than teaching people about a new container format.

Eric Blake put it very well I think.  There are several things that
several people would like to see addressed:

(1) A sensible list of guest-visible aspects of the VM
  whose preservation across VM restarts we deem critical enough to support
  starting guests.
  At this point this includes at least architecture and machine type.

(2) A compact file format for serializing list (1)

(3) Ability to store file (2) in a qcow2 image


You are asking why store (2) in a qcow2 image specifically. The answer is
that it's just one place where we can store it; we don't need to involve
qemu-block at all to store it in other places.

But for many people it will be handy to have it in the same file, and
qcow2 is popular enough that many people will be well served if it's
there.
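
As a rough sketch of (1) and (2), assuming JSON as the compact format
and the key names "architecture" and "machine-type" (neither is settled
anywhere in this thread), the list could be kept strict by refusing any
key that is not on it:

```python
import json

# Hypothetical allow-list for (1); the thread only names these two aspects.
ALLOWED_KEYS = {"architecture", "machine-type"}


def serialize(aspects):
    """Encode the aspects as compact JSON with sorted keys (a stable byte string)."""
    unknown = set(aspects) - ALLOWED_KEYS
    if unknown:
        raise ValueError(f"keys outside the agreed list: {sorted(unknown)}")
    return json.dumps(aspects, sort_keys=True, separators=(",", ":")).encode("utf-8")


def deserialize(blob):
    """Decode a blob produced by serialize(); drop keys this reader does not know."""
    aspects = json.loads(blob.decode("utf-8"))
    if not isinstance(aspects, dict):
        raise ValueError("expected a JSON object")
    return {k: v for k, v in aspects.items() if k in ALLOWED_KEYS}


blob = serialize({"machine-type": "q35", "architecture": "x86_64"})
roundtrip = deserialize(blob)
print(blob)
```

Nothing here depends on where the bytes end up, which matches (3) being
a separate question from (1) and (2).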

-- 
MST

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 11:44                           ` Max Reitz
  2018-06-06 12:16                             ` Dr. David Alan Gilbert
  2018-06-06 13:42                             ` [Qemu-devel] " Eduardo Habkost
@ 2018-06-06 14:46                             ` Michael S. Tsirkin
  2018-06-06 15:04                               ` Max Reitz
  2 siblings, 1 reply; 157+ messages in thread
From: Michael S. Tsirkin @ 2018-06-06 14:46 UTC (permalink / raw)
  To: Max Reitz
  Cc: Dr. David Alan Gilbert, Michal Suchánek, Kevin Wolf,
	ehabkost, qemu-block, Richard W.M. Jones, qemu-devel, stefanha

On Wed, Jun 06, 2018 at 01:44:02PM +0200, Max Reitz wrote:
> Because it's a hack, right.  Storing binary data in a qcow2 file,
> completely ignoring it in qemu (and being completely unusable to any
> potential other users of the qcow2 format[1]) and only interpreting it
> somewhere up the stack is a hack.

It's just a first step and it ensures compatibility with old QEMU
versions. But down the road I think we will start warning the
user if the machine type does not match, and possibly even
get the type from there if the user didn't supply it.
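
The behaviour described here could look roughly like the following
sketch. The function name and arguments are hypothetical and this is
not QEMU code; it only spells out the two cases (warn on mismatch, fall
back to the stored type when none is supplied).

```python
import warnings


def resolve_machine_type(image_hint, user_choice):
    """Pick the machine type to start with.

    image_hint:  machine type recorded in the image, or None if absent
    user_choice: machine type given by the user, or None if not supplied
    """
    if user_choice is None:
        # Nothing supplied: use the stored type. This may be None as well,
        # in which case the caller's usual default applies.
        return image_hint
    if image_hint is not None and user_choice != image_hint:
        warnings.warn(
            f"image was prepared for machine type {image_hint!r}, "
            f"but {user_choice!r} was requested"
        )
    # An explicit choice always wins; the stored type is only advisory.
    return user_choice


print(resolve_machine_type("q35", None))
```

An image without the extra data takes the same path as before, which is
the compatibility property mentioned above.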

-- 
MST

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 14:31                             ` Dr. David Alan Gilbert
  2018-06-06 14:37                               ` Daniel P. Berrangé
@ 2018-06-06 14:51                               ` Max Reitz
  2018-06-06 15:05                                 ` Dr. David Alan Gilbert
  2018-06-06 15:09                                 ` Michael S. Tsirkin
  1 sibling, 2 replies; 157+ messages in thread
From: Max Reitz @ 2018-06-06 14:51 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Kevin Wolf, qemu-block, Michael S. Tsirkin, armbru, qemu-devel,
	Richard W.M. Jones, stefanha

[-- Attachment #1: Type: text/plain, Size: 16862 bytes --]

On 2018-06-06 16:31, Dr. David Alan Gilbert wrote:
> * Max Reitz (mreitz@redhat.com) wrote:
>> On 2018-06-06 14:00, Dr. David Alan Gilbert wrote:
>>> * Max Reitz (mreitz@redhat.com) wrote:
>>>> On 2018-06-06 13:14, Dr. David Alan Gilbert wrote:
>>>>> * Max Reitz (mreitz@redhat.com) wrote:
>>>>>> On 2018-06-05 11:21, Dr. David Alan Gilbert wrote:
>>>>>>> <reawakening a fizzled out thread>
> 
> <snip>
> 
>>>>> The problem with having a separate file is that you either have to copy
>>>>> it around with the image 
>>>>
>>>> Which is just an inconvenience.
>>>
>>> It's more than that;  if it's a separate file then the tools can't
>>> rely on users supplying it, and frankly they won't and they'll still
>>> just supply an image.
>>
>> At which point you throw an error and tell them to specify the config file.
> 
> No:
>    a) At the moment they get away with it for images since they're all
>       'pc' and the management layers do the right thing.

So so far nobody has complained?  I don't really see the problem then.

If deploying a disk and using all the defaults works out for users,
great.  If they want more options, apparently they already know they
have to provide some config.

>    b) They'll give the wrong config file - then you'd need to add a flag
>      to detect that - which means you'd need to add something to the
>      qcow to match it to the config; loop back to the start!

I'm not sure how seriously I should take this argument.  Do stupid
things, win stupid prizes.

If that's the issue, add a UUID to qcow2 files and reference it from the
config file.
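
A sketch of that check, assuming the image carried a UUID (qcow2 has no
such field today) and the config file referenced it under a
hypothetical "image-uuid" key:

```python
import uuid


def config_matches_image(config, image_uuid):
    """Return True only if the config names exactly this image's UUID."""
    ref = config.get("image-uuid")  # hypothetical key name
    if ref is None:
        return False
    try:
        return uuid.UUID(ref) == image_uuid
    except (ValueError, AttributeError, TypeError):
        # Malformed or non-string reference: treat as a mismatch.
        return False


image_uuid = uuid.UUID("12345678-1234-5678-1234-567812345678")
good = {"image-uuid": "12345678-1234-5678-1234-567812345678"}
wrong = {"image-uuid": "87654321-4321-8765-4321-876543218765"}
print(config_matches_image(good, image_uuid), config_matches_image(wrong, image_uuid))
```

The image then only needs to store an identifier, and everything else
stays in the config file.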

> We should make this EASY for users.

To me, having a simple config file they can edit manually certainly
seems simpler than having to use specific tools to edit it inside of the
qcow2 file.

>>>> I understand it is an inconvenience and it would be nice to change it,
>>>> but please understand that I do not want qcow2 to become a filesystem
>>>> just to relieve an inconvenience.
>>>
>>> I very much don't want it to be a filesystem; my reason for writing
>>> down my spec the way I did was to make it clear that the only
>>> thing I want of qcow2 is a single blob, no more; I don't want naming
>>> of the blob or anything else.
>>>
>>>> (Note: I understand that you may not want qcow2 to become a filesystem,
>>>> but I do get the impression from others.)
>>>
>>> My aim was to specify it to fulfill the requirements that everyone
>>> else had asked for, but still only having one unmodifiable blob in qcow.
>>>
>>>>>                           or have an archive. If you have an archive
>>>>> you have to have an unpacking step which then copies, potentially a lot
>>>>> of data taking some reasonable amount of time.
>>>>
>>>> I'm sure this can be optimized, but yes, I get that.
>>>>
>>>> (If you use e.g. tar and store the image data starting on an FS cluster
>>>> boundary (64 kB should be more than sufficient), I assume there is a way
>>>> to extract that data into a new file without copying anything.)
>>>
>>> But then we have to modify all the current things that know how to
>>> handle a qcow2.
>>
>> Not in this case because it'd still be a flat qcow2 file in a simple tar
>> archive.
>>
>> But you're right if we had a more complex format (like chunks stored in
>> a tar file).
> 
> My only problem with using the tar like that is that all tools
> everywhere would need to be updated to be able to parse them.
> (Note if adding a blob to qcow2 like I'm asking for would break existing
> qcow2 users then I don't want it either).

OK, so I suppose this goes back to "let's decide what we really want to
configure first, and how we can limit our scope in an effective and
long-lasting way".

>>>>>                                                 Storing a simple bit
>>>>> of data with the image avoids that.
>>>>
>>>> It is not a simple bit of data, as evidenced by the discussion about
>>>> storing binary blobs and MIME types going on.
>>>
>>> All of the things they've suggested can be done inside that one blob;
>>> even inside the json (or any other structure in that blob).
>>
>> Right, from qcow2's perspective it's a blob of data.  But you can put a
>> whole filesystem into a blob of data, and I get the impression that this
>> is what some are trying to do.
>>
>> Once we store larger amounts of binary data in that blob (which is what
>> I'm fearing from comments on MIME types and PNG images), people will
>> realize that always having to re-store the whole blob if you modify
>> something in the middle is inefficient and that it needs to be
>> optimized.  I don't think you want to do that, but we haven't
>> implemented any of this yet and people are already asking for such
>> binary data inside of the blob.
>>
>> I suspect it'll only get worse over time.
>> I think the most difficult thing about this discussion is that there are
>> different targets.
>>
>> You just want to store a bit of information.  OK, good, but then I'd say
>> we could even just prepend that to the image file in a small header.
> 
> 
> I think you're over-reading what people are asking for.
> I think the PNG suggestion is again the 'label on the front' for a logo.

Which is OK if you store like everything, but very much over the top for
your suggestion.  Again, different people want different things and I
feel like that is the real discussion we should be having right now and
not necessarily where to store it.

Because I think (maybe I'm wrong, though) where to store it heavily
depends on what we want to store and how we want to use it.

> I've not seen anything that's not for either:
>   a) The user to know what the image is

I thought the use case was they just downloaded it.

Otherwise, they should manage their filenames reasonably, come on.
Seriously, adding a cute picture because users are too stupid to manage
their VMs is *not* qcow2's problem.

>   b) The management layer to know what type of VM to create

Apparently this is really what you want.  I really still don't see the
difficulty in supplying a config file (or the danger in not doing so, or
in supplying the wrong one), but, hey, it would be a nice feature indeed.

(I just don't like the tradeoff in complexity.)

>> (Note that extending that header would not even be too complicated,
>> because you can easily move the qcow2 header somewhere else.  Say you
>> move it back by one cluster (e.g. 64 kB), then you just put the cluster
>> that was there originally to the end of the file, which is pretty much
>> trivial.  Then you copy that original data there and overwrite it with
>> the image header.  Done.)
>>
>> Others want to store more binary data.  Then this may get inefficient
>> and insufficient.  But I'd think at this point it gets really
>> problematic to put the data into the qcow2 file because it really
>> doesn't belong there.  (I can't imagine anything that would warrant a
>> MIME type.)
> 
> No, I can't imagine why anyone wants a MIME type either.
> 
>> Then I've heard proposals of storing multiple disk images.  Yes, you
>> could store multiple disks inside of a single qcow2 file, but it would
>> be basically exactly the same as storing just multiple qcow2 files, so...
> 
> No, completely agree.

So, yeah, we need a discussion on what to store first, probably in a new
thread, and maybe one that is not about qcow2 specifically.

>> And really, I still believe in my slippery slope argument, which means
>> that even if you just want to innocently store a machine type, we will
>> end up with something vastly more complex in the end.
>>
>> Finally, it appears to me that you have a simple problem, found one
>> possible solution, and now you just focus on that solution instead of
>> taking a step back and looking at the problem again.
>>
>> The problem: You want to store a binary blob and a disk image together.
>>
>> Your solution: qcow2 has refcounting and thus "occupation bits".  You
>> can put data into it and it will leave it alone, as long as that area is
>> marked as occupied.  Let's put the data into the qcow2 file.
>>
>> OK, let's look at the problem and its constraints again.
>>
>> Hard constraint: Store a single file.
>> (I don't think this is a hard constraint, because I haven't been
>> convinced yet that handling more than a single file is so bad.)
> 
> See above; I think it is.

I know, but you haven't convinced me yet. :-)

> My other hard constraint is that no tool has to change unless
> it wants to make use of the new data.

Sure that it isn't a soft constraint?  If most tools can stay unchanged
but some very specific ones have to be changed, that seems reasonable to me.

>> Soft constraint: Max doesn't like storing blobs in qcow2.
>>
>> So one solution is to ignore the soft constraint.  OK, valid solution, I
>> give you that.  But it doesn't leave me content, probably understandably so.

[...]

>> But really, if you create a VM, you need a configuration.  Like if you
>> set up a new computer, you need to know what you want.  Usually there is
>> no sticky label, but you just have to know and input it manually.  Maybe
>> you have a sheet of paper, which I'd call the configuration file.
> 
> Most things are figurable-out by the management tools/defaults or
> are dependent on the whim of the user - we're only trying to stop the
> user doing things that won't work.

But what's so bad about an empty screen because the user hasn't read the
download description?

> Simpler example; what stops you trying to put the PPC qcow image into
> your x86 VM system - nothing that I know of.  I just want to stop the
> users shooting themselves in the foot.

They haven't shot themselves in the foot, they've just wasted a bit of
their time, which could've been avoided by reading before clicking.

[...]

>>>>>>> --------------------------------------------------------------
>>>>>>>    
>>>>>>>
>>>>>>> Some reasoning:
>>>>>>>    a) I've avoided the problem of when QEMU interprets the value
>>>>>>>       by ignoring it and giving it to management layers at the point
>>>>>>>       of VM import.
>>>>>>
>>>>>> Yes, but in the process you've made it completely opaque to qemu,
>>>>>> basically, which doesn't really make it better for me.  Not that
>>>>>> qemu-specific information in qcow2 files would be what I want, but, well.
>>>>>>
>>>>>> But it does solve technical issues, I concede that.
>>>>>>
>>>>>>>    b) I hate JSON, but there again nailing down a fixed format
>>>>>>>       seems easiest and it makes the job of QCOW easy - a single
>>>>>>>       string.
>>>>>>
>>>>>> Not really.  The string can be rather long, so you probably don't want
>>>>>> to store it in the image header, and thus it's just a binary blob from
>>>>>> qcow2's perspective, essentially.
>>>>>
>>>>> Yes, but it's a single blob - I'm not asking for multiple keyed blobs
>>>>> or the ability to update individual blobs; just one blob that I can
>>>>> replace.
>>>>
>>>> OK, you aren't, but others seem to be.
>>>>
>>>> Or, well, you call it a single blob.  But actually the current ideas
>>>> seem to be to store a rather large configuration tree with binary data
>>>> in that blob, so to me personally there is absolutely no functional
>>>> difference to just storing a tar file in that blob.
>>>>
>>>> So correct me if I'm wrong, but to me it appears that you effectively
>>>> want to store a filesystem in qcow2.[1]  Well, that's better than making
>>>> qcow2 the filesystem, but it still appears just the wrong way around to me.
>>>
>>> It's different in the sense that what we end up with is still a qcow2;
>>> anything that just handles qcow2's and can pass them through doesn't
>>> need to do anything different; users don't need to do anything
>>> different.  No one has to pack/unpack the file.
>>
>> Packing/unpacking is a strawman because I'm doing my best to give
>> proposals that completely avoid that.
>>
>> Users do need to do something different, because users do need to
>> realize that today there is no way to store VM configuration and disk
>> data in a single file.  So if they already start VMs just based on a
>> disk, then they are assuming behavior we do not have and that I'd call
>> naive.  But that is a strawman from my side, sorry.  Keeping naive users
>> happy is probably OK.
> 
> Remember this all works fine now and has done for many years;
> it's the addition of q35 that breaks that assumption.
> The users can already blindly pick up the qcow2 image and stuff it in

Which probably was blind luck already.  And if it wasn't, that means
they knew the defaults are what they want.  So now they'd know they
aren't and they have to offer a config file along with the disk image.

> and it all works; all I want is for that to keep working.

And all I say is that it's not unreasonable to expect users to realize
that a VM is more than a disk image, just like a computer is more than a
disk drive; and that handling two files really is not the end of the world.

(And neither is wasting someone's time because they can't read.)

Firstly, I agree it's a nice thing to have, but it's not worth it if we
don't come up with clear rules on how to prevent developing a full
appliance format.

Or maybe we want that (because I still believe that you can always come
up with obscure options without which the VM won't boot in your specific
case), but then this is beyond just storing a tiny bit of data in a
qcow2 image.

[...]

>>>> [1] Yes, I know that the guest disk already contains an FS. :-P
>>>>
>>>>>>>       (I would suggest in layer2 that the keys are sorted, but
>>>>>>>       that's a pain to do in some json creators)
>>>>>>>    c) Forcing the registry of keys might avoid silly duplication.
>>>>>>>       We can but hope.
>>>>>>>    d) I've not said it's a libvirt XML file since that seems
>>>>>>>       a bit prescriptive.
>>>>>>>
>>>>>>> Some initial suggested keys:
>>>>>>>
>>>>>>>    "qemu.machine-types": [ "q35", "i440fx" ]
>>>>>>>    "qemu.min-ram-MB": 1024
>>>>>>
>>>>>> I still don't understand why you'd want to put the configuration into
>>>>>> qcow2 instead of the other way around.
>>>>>>
>>>>>> Or why you'd want to use a single file at all, because as this whole
>>>>>> thread shows, a disk image alone is clearly not sufficient to describe a VM.
>>>>>>
>>>>>> (Or it may be in simple cases, but then that's because you don't need
>>>>>> any configuration.)
>>>>>
>>>>> Because it avoids the unpacking associated with archives.
>>>>
>>>> I'm not talking about unpacking.  I'm talking about a potentially new
>>>> format which allows accessing the qcow2 file in-place.  It would
>>>> probably be trivial to write a block driver to allow this.
>>>>
>>>> (And as I wrote in my response to Michal, I suspect that tar could
>>>> actually allow this, even though it would probably not be the ideal format.)
>>>
>>> As above, I don't think this is trivial; you have to change all the
>>> layers;  let's say it was a tar; you'd have to somehow know that you're
>>> importing one of these special tars,
>>
>> Which is trivial because it's just "Hey, look, it's a tar with that
>> description file".
> 
> Trivial? It's taking 100+ mails to add a tag to a qcow2 file! Can you
> imagine what it takes to change libvirt, openstack, ovirt and the rest?

:-)

The implementation is trivial is what I meant, just like the
implementation would be rather simple for qcow2 to store a binary blob
and completely ignore it.

>>>                                      you also have to have a tool to
>>> create them;
>>
>> Also trivial.  Non-trivial is modifying them.
>>
>> The workflow would be to create the tar with an empty qcow2 file, the VM
>> description you want, and then just using it.
>>
>> Yes, using is more difficult, but it wouldn't be an own tool, it would
>> be built into qemu.  I can't say how difficult that implementation would
>> be, but it would not be trivial, that is correct.
>>
>>>              and you have to worry about whether that alignment
>>> is correct for the storage/memory you're using it with.
>>
>> Which would be difficult with tar, right.  But we don't have to use tar.
>>
>> (And, no, I don't think creating a new container format is not worse for
>> interoperability than adding a blob to qcow2.)
> 
> If you were going to do this then you'd end up just using OVA.
> You couldn't justify yet another format.

Sure, the exact format doesn't matter to me (or at least currently I
don't think it does...).  I'm more interested in scope and what good it
actually brings.

Max



* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 14:41                                     ` Dr. David Alan Gilbert
@ 2018-06-06 14:55                                       ` Max Reitz
  2018-06-06 15:25                                         ` Michal Suchánek
  0 siblings, 1 reply; 157+ messages in thread
From: Max Reitz @ 2018-06-06 14:55 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Kevin Wolf, ehabkost, qemu-block, Michael S. Tsirkin,
	Richard W.M. Jones, qemu-devel, stefanha, Michal Suchánek


On 2018-06-06 16:41, Dr. David Alan Gilbert wrote:
> * Max Reitz (mreitz@redhat.com) wrote:

[...]

>> So why is it so dangerous to connect a disk you just downloaded to e.g.
>> the wrong machine type?  I assumed it just wouldn't work and you'd try
>> again, until you realized that maybe you should read the download
>> description and do as it says ("download this config file, pass it").
> 
> That's bad!  Stuff should just-work;

That's how it always should be.  Life's tough, though.

>                                      it currently just works,

Due to sheer blind luck, I'd say.

>                                                               things
> should get better and easier for our users.

Users using a whole VM stack plus management, but then handling two
files instead of one is too much to ask?

>                                              And anyway, not working for
> EFI for example can be just a blank screen.  Seriously - keep it easy
> for the user!

Thinking this through makes you end up with appliances.

> And with 'pc' type VMs being all that's around it does just-work.
> 

[...]

>>>>>> [1] Yes, I concede that there are probably no other users of qcow2.  But
>>>>>> please forgive me for assuming that qcow2 was in a sense designed to be
>>>>>> a rather general image format that not only qemu could use.
>>>>>
>>>>> What makes it QEMU specific?  It's basically just the same key/value
>>>>> setup as OVA, except putting them inside the qcow2.
>>>>
>>>> Well, not necessarily qemu-specific, but
>>>> ${management_software}-specific, which comes down to the same.  Or,
>>>> well, I'd think that, but hold on.
>>>>
>>>>> We could use the same keys/value definitions as OVA in the blob,
>>>>> although their definitions aren't very portable either.
>>>>
>>>> So you're proposing that we only add options that seem portable for any VM?
>>>
>>> Hmm.  We should probably split them, so there should be general options
>>> (e.g. minimum-ram) but also hypervisor specifics
>>> (qemu.machine-class=q35); but that doesn't mean you can't add keys
>>> for multiple hypervisors into the one blob.  I mean
>>> something like:
>>>     minimum-ram = 1G
>>>     qemu.machine-class = q35
>>>     anothervm.machine-class = ....
>>
>> Well, and that's my issue.  Once you have application-specific info, you
>> can go wild.  And I would go wild, without a reasonable and strict
>> requirement that the information we want to store has to fulfill.
>>
>> For the record, I would've liked it if you'd said "only portable
>> options".  But I would have replied that I would fear we'd still end up
>> with someone saying "I'd like to store X and Y, let's just put them into
>> the specification, then they are portable [even if only this stack
>> supports them]" and we wouldn't really have won anything.
> 
> I couldn't second guess every other hypervisor on the planet to know
> whether specifying a machine class would work for them.

If they support the config format...
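
A minimal sketch of the split quoted above (general options next to
hypervisor-specific ones in a single blob), assuming a flat key/value
mapping in which a "name." prefix marks a hypervisor namespace.  The
function name split_keys and the placeholder value for the
"anothervm" key are illustrative only; nothing here is an agreed
format.

```python
def split_keys(meta, hypervisor):
    """Split a flat key/value mapping into portable and hypervisor-specific parts."""
    generic = {}
    specific = {}
    for key, value in meta.items():
        prefix, dot, rest = key.partition(".")
        if not dot:
            # No namespace prefix: treat it as a portable, general option.
            generic[key] = value
        elif prefix == hypervisor:
            # Our own namespace: keep it, with the prefix stripped.
            specific[rest] = value
        # Keys in other hypervisors' namespaces are silently ignored.
    return generic, specific


meta = {
    "minimum-ram": "1G",
    "qemu.machine-class": "q35",
    "anothervm.machine-class": "placeholder",  # hypothetical value
}
generic, specific = split_keys(meta, "qemu")
print(generic)   # portable options
print(specific)  # options in the "qemu." namespace
```

Note that nothing in such a scheme limits what may be placed under a
namespace, which is exactly the scope concern raised in this message.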

Max



* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 13:42                             ` [Qemu-devel] " Eduardo Habkost
@ 2018-06-06 14:55                               ` Michael S. Tsirkin
  2018-06-06 14:57                                 ` Max Reitz
  2018-06-11 14:10                                 ` Kevin Wolf
  0 siblings, 2 replies; 157+ messages in thread
From: Michael S. Tsirkin @ 2018-06-06 14:55 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Max Reitz, Dr. David Alan Gilbert, Michal Suchánek,
	Kevin Wolf, qemu-block, Richard W.M. Jones, qemu-devel, stefanha

On Wed, Jun 06, 2018 at 10:42:33AM -0300, Eduardo Habkost wrote:
> > If we want a grand vision where a single file stores the whole VM, why
> > not invest the work and make it right from the start?
> 
> We don't want a grand vision where a single file stores the whole
> VM.  This is exactly what I would like to avoid, by not inventing
> a whole different appliance file format.

Besides, trying to get a grand vision from the start is a sure
way to never have the design leave the drawing board.

What we are asking for at this point is a way to stick a named blob in
an image that people can use with qemu without jumping through hoops.

It seems like a generic enough addition that it seems highly likely
to be useful down the road and harmless enough that maintaining
it won't become a burden.

Can we agree on that as a first step, so we can build that foundation
and move on to actually building ways to use it?

-- 
MST


* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 14:55                               ` Michael S. Tsirkin
@ 2018-06-06 14:57                                 ` Max Reitz
  2018-06-11 14:10                                 ` Kevin Wolf
  1 sibling, 0 replies; 157+ messages in thread
From: Max Reitz @ 2018-06-06 14:57 UTC (permalink / raw)
  To: Michael S. Tsirkin, Eduardo Habkost
  Cc: Dr. David Alan Gilbert, Michal Suchánek, Kevin Wolf,
	qemu-block, Richard W.M. Jones, qemu-devel, stefanha


On 2018-06-06 16:55, Michael S. Tsirkin wrote:
> On Wed, Jun 06, 2018 at 10:42:33AM -0300, Eduardo Habkost wrote:
>>> If we want a grand vision where a single file stores the whole VM, why
>>> not invest the work and make it right from the start?
>>
>> We don't want a grand vision where a single file stores the whole
>> VM.  This is exactly what I would like to avoid, by not inventing
>> a whole different appliance file format.
> 
> Besides, trying to get a grand vision from the start is a sure
> way to never have the design leave the drawing board.

Yes, but with our own non-qcow2 format we could easily start with
something simple and start over later.

> What we are asking for at this point is a way to stick a named blob in
> an image that people can use with qemu without jumping through hoops.
> 
> It seems like a generic enough addition that it seems highly likely
> to be useful down the road and harmless enough that maintaining
> it won't become a burden.

Yes.  Its genericity is a big part of what's bothering me.

> Can we agree on that as a first step, so we can build that foundation
> and move on to actually building ways to use it?

No, because I don't want to agree on putting anything inside qcow2
before I know its scope.

Max



* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 14:43                     ` Michael S. Tsirkin
@ 2018-06-06 14:57                       ` Eric Blake
  2018-06-06 20:39                         ` Eric Blake
  2018-06-06 15:02                       ` Max Reitz
  1 sibling, 1 reply; 157+ messages in thread
From: Eric Blake @ 2018-06-06 14:57 UTC (permalink / raw)
  To: Michael S. Tsirkin, Max Reitz
  Cc: Michal Suchánek, Kevin Wolf, Richard W.M. Jones, qemu-devel,
	stefanha, ehabkost, qemu-block

On 06/06/2018 09:43 AM, Michael S. Tsirkin wrote:
> On Wed, Jun 06, 2018 at 01:02:53PM +0200, Max Reitz wrote:
>> Yeah, but why make qcow2 that format?  That's what I completely fail to
>> understand.
> 
> Because why not? It's cheap to add it there and is much easier
> than teaching people about a new container format.

tar is not a new container format, but it is a new format to various 
toolchains - that said, if we popularize tar as the format for including 
a config file alongside a qcow2 image, it's not that hard to fix the 
stack to start passing that file around as the new preferred file type.

> 
> Eric Blake put it very well I think.  There are several things that
> several people would like to see addressed:
> 
> (1) A sensible list of guest visible aspects of the VM
>    preserving which across VM restarts we deem critical enough to support
>    starting guests.
>    At this point this includes at least architecture and machine type.

This part is true no matter where we store it. So as I see it, the 
question is now whether to store it in qcow2, or whether to store qcow2 
+ config in a tar file.

> 
> (2) A compact file format for serializing list (1)

tar files already serve this purpose (even if we don't like the current 
OVA specification for the config side, it at least proves that tar files 
are usable in this manner)

> 
> (3) Ability to store file (2) in a qcow2 image

As Dan pointed out, if we can FIRST popularize storing a config file + 
qcow2 in a tar file, THEN we can (for convenience) let qcow2 directly 
store the same config file, if it helps things (although if popularizing 
tar files works, we wouldn't need it in qcow2).

Maybe the step we want to take now is to add a new block driver to qemu 
that supports tar files including resize.  That is, where we currently do:

qcow2 -> posix file

we would instead popularize:

qcow2 -> tar driver -> posix file

where the tar driver takes care of finding the right subset within the 
overall tar file, AND makes it easy to resize (by updating tar metadata 
any time qcow2 wants to resize larger, and assuming qcow2 is the last 
member of the tar archive)
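
To illustrate "finding the right subset within the overall tar file",
here is a small sketch using Python's standard tarfile module rather
than a real qemu block driver.  The member names vm.conf and
disk.qcow2, the config contents and the helper names are made up for
the example; the point is only that a member's data is a contiguous
byte range that can be addressed in place, without unpacking.

```python
import io
import tarfile


def build_archive(config: bytes, image: bytes) -> bytes:
    """Create an in-memory ustar archive with the image as the last member."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name, data in (("vm.conf", config), ("disk.qcow2", image)):
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def locate(archive: bytes, name: str):
    """Return (offset, size) of a member's data within the archive."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:") as tar:
        info = tar.getmember(name)
        return info.offset_data, info.size


# Stand-in for a qcow2 file: the magic bytes followed by zero padding.
image = b"QFI\xfb" + b"\0" * 1020
archive = build_archive(b"machine=q35\n", image)  # hypothetical config line
offset, size = locate(archive, "disk.qcow2")

# The image can be read in place from the archive; no unpacking step.
assert archive[offset:offset + size] == image
```

Growing the image would additionally require rewriting the size field
in the member's tar header and the end-of-archive padding, which this
sketch does not attempt.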

Perhaps we can even automate it so that the tar driver is automatically 
inserted even when not explicitly specified (after all, you can already 
start qemu by giving just the file name holding a qcow2 image, and qemu 
can figure out to use the qcow2 driver) - although we DO have to be 
careful of avoiding CVEs (probing image formats MUST NOT allow a guest 
to convert a raw image into something that qemu would treat as a tar, 
any more than it would misinterpret raw as qcow2)
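
The probing hazard can be sketched as follows.  The magic values are
those of the formats themselves (qcow2 files start with "QFI\xfb",
and ustar headers carry "ustar" at byte offset 257); the function
names are illustrative and this is not qemu's actual probing code.

```python
QCOW2_MAGIC = b"QFI\xfb"
USTAR_MAGIC = b"ustar"
USTAR_MAGIC_OFFSET = 257


def probe(header: bytes) -> str:
    """Guess the image format purely from the first bytes of the file."""
    if header[:4] == QCOW2_MAGIC:
        return "qcow2"
    if header[USTAR_MAGIC_OFFSET:USTAR_MAGIC_OFFSET + 5] == USTAR_MAGIC:
        return "tar"
    return "raw"


def choose_format(header: bytes, declared=None) -> str:
    """Only probe when the user did not state the format explicitly."""
    return declared if declared is not None else probe(header)


# A guest with a raw disk controls every byte of its first sector, so
# it can make the image look like a tar archive to a naive prober.
sector = bytearray(512)
sector[USTAR_MAGIC_OFFSET:USTAR_MAGIC_OFFSET + 5] = USTAR_MAGIC

print(probe(bytes(sector)))                          # misdetected as "tar"
print(choose_format(bytes(sector), declared="raw"))  # stays "raw"
```

An explicitly declared format avoids the misdetection; automatic
insertion of a tar driver would need an equivalent safeguard.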

> 
> 
> You are asking why store (2) in qcow2 image specifically. The answer is
> it's just one place where we can store it. The answer is we don't need
> to involve qemu-block at all for storing it in other places.

Then why start with qcow2? Let's start with tar files, and enhance 
qemu-block to make tar files containing qcow2 easier to use in qemu (and 
NOT with making qcow2 larger just to bypass tar files).

> 
> But for many people it will be handy to have it in the same file, and
> qcow2 is popular enough that many people will be well served if it's
> there.
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 14:43                     ` Michael S. Tsirkin
  2018-06-06 14:57                       ` Eric Blake
@ 2018-06-06 15:02                       ` Max Reitz
  1 sibling, 0 replies; 157+ messages in thread
From: Max Reitz @ 2018-06-06 15:02 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Michal Suchánek, Kevin Wolf, Richard W.M. Jones, qemu-devel,
	stefanha, ehabkost, qemu-block, Eric Blake


On 2018-06-06 16:43, Michael S. Tsirkin wrote:
> On Wed, Jun 06, 2018 at 01:02:53PM +0200, Max Reitz wrote:
>> Yeah, but why make qcow2 that format?  That's what I completely fail to
>> understand.
> 
> Because why not? It's cheap to add it there and is much easier
> than teaching people about a new container format.

"new" container format?  qcow2 is not a container format, so that would
be teaching people something new anyway.

> Eric Blake put it very well I think.  There are several things that
> several people would like to see addressed:
> 
> (1) A sensible list of guest visible aspects of the VM
>   preserving which across VM restarts we deem critical enough to support
>   starting guests.
>   At this point this includes at least architecture and machine type.

If you use a whole management layer, that is trivial anyway (and that
seems to be Dave's assumption), because that management layer can store
its configuration somewhere.  Dave's issue was about downloading foreign
images, not about restarts.

> (2) A compact file format for serializing list (1)

OK.

> (3) Ability to store file (2) in a qcow2 image

I see that people are asking this, yes, although I don't quite get the
point.

> You are asking why store (2) in qcow2 image specifically. The answer is
> it's just one place where we can store it. The answer is we don't need
> to involve qemu-block at all for storing it in other places.
> 
> But for many people it will be handy to have it in the same file, and
> qcow2 is popular enough that many people will be well served if it's
> there.

Note that I'm also asking why it needs to be a single file.
Furthermore, I'm asking what the final scope is intended to be.  I don't
believe qcow2 to be the correct format for a whole appliance, but I do
believe that assuming that people want to provide just a qcow2 file
without anything else means already accepting it *is* an appliance.  But
Dave seems to reject that idea.

Max



* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 14:46                             ` Michael S. Tsirkin
@ 2018-06-06 15:04                               ` Max Reitz
  0 siblings, 0 replies; 157+ messages in thread
From: Max Reitz @ 2018-06-06 15:04 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Dr. David Alan Gilbert, Michal Suchánek, Kevin Wolf,
	ehabkost, qemu-block, Richard W.M. Jones, qemu-devel, stefanha


On 2018-06-06 16:46, Michael S. Tsirkin wrote:
> On Wed, Jun 06, 2018 at 01:44:02PM +0200, Max Reitz wrote:
>> Because it's a hack, right.  Storing binary data in a qcow2 file,
>> completely ignoring it in qemu (and being completely unusable to any
>> potential other users of the qcow2 format[1]) and only interpreting it
>> somewhere up the stack is a hack.
> 
> It's just a first step and it ensures compatibility with old QEMU
> versions. But down the road I think we will start warning
> user if the machine type does not match, and possibly even
> get the type from there if user didn't supply it.

If it's a first step we should have an idea on what the following steps
should be.

I don't want people to convince me that adding a blob to qcow2 is a good
idea because "we are only going to store two or three values!" and then
that blob grows out of control because "well, now we have it anyway".

Max



* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 14:51                               ` Max Reitz
@ 2018-06-06 15:05                                 ` Dr. David Alan Gilbert
  2018-06-06 15:36                                   ` Eric Blake
  2018-06-06 17:49                                   ` Max Reitz
  2018-06-06 15:09                                 ` Michael S. Tsirkin
  1 sibling, 2 replies; 157+ messages in thread
From: Dr. David Alan Gilbert @ 2018-06-06 15:05 UTC (permalink / raw)
  To: Max Reitz
  Cc: Kevin Wolf, qemu-block, Michael S. Tsirkin, armbru, qemu-devel,
	Richard W.M. Jones, stefanha

* Max Reitz (mreitz@redhat.com) wrote:
> On 2018-06-06 16:31, Dr. David Alan Gilbert wrote:
> > * Max Reitz (mreitz@redhat.com) wrote:
> >> On 2018-06-06 14:00, Dr. David Alan Gilbert wrote:
> >>> * Max Reitz (mreitz@redhat.com) wrote:
> >>>> On 2018-06-06 13:14, Dr. David Alan Gilbert wrote:
> >>>>> * Max Reitz (mreitz@redhat.com) wrote:
> >>>>>> On 2018-06-05 11:21, Dr. David Alan Gilbert wrote:
> >>>>>>> <reawakening a fizzled out thread>
> > 
> > <snip>
> > 
> >>>>> The problem with having a separate file is that you either have to copy
> >>>>> it around with the image 
> >>>>
> >>>> Which is just an inconvenience.
> >>>
> >>> It's more than that;  if it's a separate file then the tools can't
> >>> rely on users supplying it, and frankly they won't and they'll still
> >>> just supply an image.
> >>
> >> At which point you throw an error and tell them to specify the config file.
> > 
> > No:
> >    a) At the moment they get away with it for images since they're all
> >       'pc' and the management layers do the right thing.
> 
> So so far nobody has complained?  I don't really see the problem then.
> 
> If deploying a disk and using all the defaults works out for users,
> great.  If they want more options, apparently they already know they
> have to provide some config.

This problem all came about because of q35.  We can't change defaults to
use q35 because importing existing images might break so we need to
start flagging stuff as q35 - all we're trying to do is make stuff no
more broken than today.
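
As a rough sketch of what "no more broken than today" could mean for
a management layer: untagged images keep the current default, and
only tagged images change behaviour.  The key name
"qemu.machine-types" is taken from the earlier proposal in this
thread; the function name and the `supported` parameter are
assumptions made for the example.

```python
def pick_machine_type(image_meta, supported, default="pc"):
    """Choose a machine type for an imported image."""
    wanted = image_meta.get("qemu.machine-types")
    if not wanted:
        # Untagged (existing) image: behave exactly as today.
        return default
    for machine in wanted:
        if machine in supported:
            return machine
    # Tagged, but nothing we can provide: fail loudly, not with a blank screen.
    raise ValueError("image needs one of %r, host offers %r" % (wanted, supported))


print(pick_machine_type({}, ["pc", "q35"]))
print(pick_machine_type({"qemu.machine-types": ["q35"]}, ["pc", "q35"]))
```

The explicit error in the last branch is what replaces the silent
boot failure described elsewhere in the thread.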

> >    b) They'll give the wrong config file - then you'd need to add a flag
> >      to detect that - which means you'd need to add something to the
> >      qcow to match it to the config; loop back to the start!
> 
> I'm not sure how seriously I should take this argument.  Do stupid
> things, win stupid prizes.
> 
> If that's the issue, add a UUID to qcow2 files and reference it from the
> config file.

Is a UUID a small string? :-)

> > We should make this EASY for users.
> 
> To me, having a simple config file they can edit manually certainly
> seems simpler than having to use specific tools to edit it inside of the
> qcow2 file.

The users never touch the tools; they click and import the VM image.

> >>>> I understand it is an inconvenience and it would be nice to change it,
> >>>> but please understand that I do not want qcow2 to become a filesystem
> >>>> just to relieve an inconvenience.
> >>>
> >>> I very much don't want it to be a filesystem; my reason for writing
> >>> down my spec the way I did was to make it clear that the only
> >>> thing I want of qcow2 is a single blob, no more; I don't want naming
> >>> of the blob or anything else.
> >>>
> >>>> (Note: I understand that you may not want qcow2 to become a filesystem,
> >>>> but I do get the impression from others.)
> >>>
> >>> My aim was to specify it to fulfill the requirements that everyone
> >>> else had asked for, but still only having one unmodifiable blob in qcow.
> >>>
> >>>>>                           or have an archive. If you have an archive
> >>>>> you have to have an unpacking step which then copies, potentially a lot
> >>>>> of data taking some reasonable amount of time.
> >>>>
> >>>> I'm sure this can be optimized, but yes, I get that.
> >>>>
> >>>> (If you use e.g. tar and store the image data starting on an FS cluster
> >>>> boundary (64 kB should be more than sufficient), I assume there is a way
> >>>> to extract that data into a new file without copying anything.)
> >>>
> >>> But then we have to modify all the current things that know how to
> >>> handle a qcow2.
> >>
> >> Not in this case because it'd still be a flat qcow2 file in a simple tar
> >> archive.
> >>
> >> But you're right if we had a more complex format (like chunks stored in
> >> a tar file).
> > 
> > My only problem with using the tar like that is that all tools
> > everywhere would need to be updated to be able to parse them.
> > (Note if adding a blob to qcow2 like I'm asking for would break existing
> > qcow2 users then I don't want it either).
> 
> OK, so I suppose this goes back to "let's decide what we really want to
> configure first, and how we can limit our scope in an effective and
> long-lasting way".
> 
> >>>>>                                                 Storing a simple bit
> >>>>> of data with the image avoids that.
> >>>>
> >>>> It is not a simple bit of data, as evidenced by the discussion about
> >>>> storing binary blobs and MIME types going on.
> >>>
> >>> All of the things they've suggested can be done inside that one blob;
> >>> even inside the json (or any other structure in that blob).
> >>
> >> Right, from qcow2's perspective it's a blob of data.  But you can put a
> >> whole filesystem into a blob of data, and I get the impression that this
> >> is what some are trying to do.
> >>
> >> Once we store larger amounts of binary data in that blob (which is what
> >> I'm fearing from comments on MIME types and PNG images), people will
> >> realize that always having to re-store the whole blob if you modify
> >> something in the middle is inefficient and that it needs to be
> >> optimized.  I don't think you want to do that, but we haven't
> >> implemented any of this yet and people are already asking for such
> >> binary data inside of the blob.
> >>
> >> I suspect it'll only get worse over time.
> >> I think the most difficult thing about this discussion is that there are
> >> different targets.
> >>
> >> You just want to store a bit of information.  OK, good, but then I'd say
> >> we could even just prepend that to the image file in a small header.
> > 
> > 
> > I think you're over-reading what people are asking for.
> > I think the PNG suggestion is again the 'label on the front' for a logo.
> 
> Which is OK if you store like everything, but very much over the top for
> your suggestion.  Again, different people want different things and I
> feel like that is the real discussion we should be having right now and
> not necessarily where to store it.
> 
> Because I think (maybe I'm wrong, though) where to store it heavily
> depends on what we want to store and how we want to use it.
> 
> > I've not seen anything that's not for either:
> >   a) The user to know what the image is
> 
> I thought the use case was they just downloaded it.

Or pulled it from that big directory of images.

> Otherwise, they should manage their filenames reasonably, come on.
> Seriously, adding a cute picture because users are too stupid to manage
> their VMs is *not* qcow2's problem.

Well, it's someone's problem; we already have magic to display those
images in some of the higher level tools.

> >   b) The management layer to know what type of VM to create
> 
> Apparently this is really what you want.  I really still don't see the
> difficulty in supplying a config file (or the danger in not doing so, or
> in supplying the wrong one), but, hey, it would be a nice feature indeed.
> 
> (I just don't like the tradeoff in complexity.)

Remember, give this to someone who doesn't understand what the
difference is between the machine types etc.

> >> (Note that extending that header would not even be too complicated,
> >> because you can easily move the qcow2 header somewhere else.  Say you
> >> move it back by one cluster (e.g. 64 kB), then you just put the cluster
> >> that was there originally to the end of the file, which is pretty much
> >> trivial.  Then you copy that original data there and overwrite it with
> >> the image header.  Done.)
> >>
> >> Others want to store more binary data.  Then this may get inefficient
> >> and insufficient.  But I'd think at this point it gets really
> >> problematic to put the data into the qcow2 file because it really
> >> doesn't belong there.  (I can't imagine anything that would warrant a
> >> MIME type.)
> > 
> > No, I can't imagine why anyone wants a MIME type either.
> > 
> >> Then I've heard proposals of storing multiple disk images.  Yes, you
> >> could store multiple disks inside of a single qcow2 file, but it would
> >> be basically exactly the same as storing just multiple qcow2 files, so...
> > 
> > No, completely agree.
> 
> So, yeah, we need a discussion on what to store first, probably in a new
> thread, and maybe one that is not about qcow2 specifically.
> 
> >> And really, I still believe in my slippery slope argument, which means
> >> that even if you just want to innocently store a machine type, we will
> >> end up with something vastly more complex in the end.
> >>
> >> Finally, it appears to me that you have a simple problem, found one
> >> possible solution, and now you just focus on that solution instead of
> >> taking a step back and looking at the problem again.
> >>
> >> The problem: You want to store a binary blob and a disk image together.
> >>
> >> Your solution: qcow2 has refcounting and thus "occupation bits".  You
> >> can put data into it and it will leave it alone, as long as that area is
> >> marked as occupied.  Let's put the data into the qcow2 file.
> >>
> >> OK, let's look at the problem and its constraints again.
> >>
> >> Hard constraint: Store a single file.
> >> (I don't think this is a hard constraint, because I haven't been
> >> convinced yet that handling more than a single file is so bad.)
> > 
> > See above; I think it is.
> 
> I know, but you haven't convinced me yet. :-)
> 
> > My other hard constraint is that no tool has to change unless
> > it wants to make use of the new data.
> 
> Sure that it isn't a soft constraint?  If most tools can stay unchanged
> but some very specific ones have to be changed, that seems reasonable to me.

The hard constraint is the normal path stays unchanged; we can change
the tools to make use of the extra data, but not change what's out
there.

> >> Soft constraint: Max doesn't like storing blobs in qcow2.
> >>
> >> So one solution is to ignore the soft constraint.  OK, valid solution, I
> >> give you that.  But it doesn't leave me content, probably understandably so.
> 
> [...]
> 
> >> But really, if you create a VM, you need a configuration.  Like if you
> >> set up a new computer, you need to know what you want.  Usually there is
> >> no sticky label, but you just have to know and input it manually.  Maybe
> >> you have a sheet of paper, which I'd call the configuration file.
> > 
> > Most things are figurable-out by the management tools/defaults or
> > are dependent on the whim of the user - we're only trying to stop the
> > user doing things that won't work.
> 
> But what's so bad about an empty screen because the user hasn't read the
> download description?

Because it's got to be EASY for the customer; seriously - stop punishing
the user for not noticing something.
We've got to help the users, if not we get people asking why their VM
system has given them a black screen, or why the image they just
downloaded didn't work - it's basic user friendliness.

It's not obvious why it's failed; if it was as simple as a nice box
popping up telling them they'd booted it wouldn't be too bad; but some
of them will waste 3 hours trying to figure out wth happened.

*seriously* think about our users.


> > Simpler example; what stops you trying to put the PPC qcow image into
> > your x86 VM system - nothing that I know of.  I just want to stop the
> > users shooting themselves in the foot.
> 
> They haven't shot themselves in the foot, they've just wasted a bit of
> their time, which could've been avoided by reading before clicking.

*seriously* think about our users.

> [...]
> 
> >>>>>>> --------------------------------------------------------------
> >>>>>>>    
> >>>>>>>
> >>>>>>> Some reasoning:
> >>>>>>>    a) I've avoided the problem of when QEMU interprets the value
> >>>>>>>       by ignoring it and giving it to management layers at the point
> >>>>>>>       of VM import.
> >>>>>>
> >>>>>> Yes, but in the process you've made it completely opaque to qemu,
> >>>>>> basically, which doesn't really make it better for me.  Not that
> >>>>>> qemu-specific information in qcow2 files would be what I want, but, well.
> >>>>>>
> >>>>>> But it does solve technical issues, I concede that.
> >>>>>>
> >>>>>>>    b) I hate JSON, but there again nailing down a fixed format
> >>>>>>>       seems easiest and it makes the job of QCOW easy - a single
> >>>>>>>       string.
> >>>>>>
> >>>>>> Not really.  The string can be rather long, so you probably don't want
> >>>>>> to store it in the image header, and thus it's just a binary blob from
> >>>>>> qcow2's perspective, essentially.
> >>>>>
> >>>>> Yes, but it's a single blob - I'm not asking for multiple keyed blobs
> >>>>> or the ability to update individual blobs; just one blob that I can
> >>>>> replace.
> >>>>
> >>>> OK, you aren't, but others seem to be.
> >>>>
> >>>> Or, well, you call it a single blob.  But actually the current ideas
> >>>> seem to be to store a rather large configuration tree with binary data
> >>>> in that blob, so to me personally there is absolutely no functional
> >>>> difference to just storing a tar file in that blob.
> >>>>
> >>>> So correct me if I'm wrong, but to me it appears that you effectively
> >>>> want to store a filesystem in qcow2.[1]  Well, that's better than making
> >>>> qcow2 the filesystem, but it still appears just the wrong way around to me.
> >>>
> >>> It's different in the sense that what we end up with is still a qcow2;
> >>> anything that just handles qcow2's and can pass them through doesn't
> >>> need to do anything different; users don't need to do anything
> >>> different.  No one has to pack/unpack the file.
> >>
> >> Packing/unpacking is a strawman because I'm doing my best to give
> >> proposals that completely avoid that.
> >>
> >> Users do need to do something different, because users do need to
> >> realize that today there is no way to store VM configuration and disk
> >> data in a single file.  So if they already start VMs just based on a
> >> disk, then they are assuming behavior we do not have and that I'd call
> >> naive.  But that is a strawman from my side, sorry.  Keeping naive users
> >> happy is probably OK.
> > 
> > Remember this all works fine now and has done for many years;
> > it's the addition of q35 that breaks that assumption.
> > The users can already blindly pick up the qcow2 image and stuff it in
> 
> Which probably was blind luck already.  And if it wasn't, that means
> they knew the defaults are what they want.  So now they'd know they
> aren't and they have to offer a config file along with the disk image.

No, it's not blind luck; it's that the management tools know how to get
it right.

> > and it all works; all I want is for that to keep working.
> 
> And all I say is that it's not unreasonable to expect users to realize
> that a VM is more than a disk image, just like a computer is more than a
> disk drive; and that handling two files really is not the end of the world.
> 
> (And neither is wasting someone's time because they can't read.)


*seriously* think about our users.

> Firstly, I agree it's a nice thing to have, but it's not worth it if we
> don't come up with clear rules on how to prevent developing a full
> appliance format.
> 
> Or maybe we want that (because I still believe that you can always come
> up with obscure options without which the VM won't boot in your specific
> case), but then this is beyond just storing a tiny bit of data in a
> qcow2 image.

I don't want to protect them from really trying to shoot themselves in
the foot; I just want to make sure the easy-path works.  Download an
image, tell the tool to import; VM works. All good.
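
To make that easy path concrete, here is a minimal sketch of what an
import step could do with the two keys suggested earlier in the thread
("qemu.machine-types" and "qemu.min-ram-MB").  Everything else is an
assumption for illustration: the function name, the list of machine
types the tool supports, and the fallback for images without metadata.

```python
import json

# Assumption: the machine types this (hypothetical) management tool can create.
SUPPORTED_MACHINE_TYPES = ["i440fx", "q35"]
# Assumption: what the tool picks when the image carries no metadata at all.
DEFAULT_MACHINE_TYPE = "i440fx"


def choose_machine(blob, host_ram_mb):
    """Pick a machine type and minimum RAM from the image's metadata blob.

    `blob` is the JSON string stored with the image, or None for a plain
    image without metadata (which must keep working as it does today).
    """
    if blob is None:
        return DEFAULT_MACHINE_TYPE, None

    meta = json.loads(blob)

    # The image lists acceptable machine types in order of preference.
    wanted = meta.get("qemu.machine-types", [DEFAULT_MACHINE_TYPE])
    machine = next((m for m in wanted if m in SUPPORTED_MACHINE_TYPES), None)
    if machine is None:
        # Fail with a clear message instead of booting to a black screen.
        raise ValueError("image needs one of %s; none is supported" % wanted)

    min_ram = meta.get("qemu.min-ram-MB")
    if min_ram is not None and min_ram > host_ram_mb:
        raise ValueError("image needs %d MB of RAM, only %d MB available"
                         % (min_ram, host_ram_mb))
    return machine, min_ram
```

An image without the blob takes the first branch and behaves as it does
now; only an image that declares something the tool cannot satisfy is
rejected, and it is rejected with a readable error.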

> [...]
> 
> >>>> [1] Yes, I know that the guest disk already contains an FS. :-P
> >>>>
> >>>>>>>       (I would suggest in layer2 that the keys are sorted, but
> >>>>>>>       that's a pain to do in some json creators)
> >>>>>>>    c) Forcing the registry of keys might avoid silly duplication.
> >>>>>>>       We can but hope.
> >>>>>>>    d) I've not said it's a libvirt XML file since that seems
> >>>>>>>       a bit prescriptive.
> >>>>>>>
> >>>>>>> Some initial suggested keys:
> >>>>>>>
> >>>>>>>    "qemu.machine-types": [ "q35", "i440fx" ]
> >>>>>>>    "qemu.min-ram-MB": 1024
> >>>>>>
> >>>>>> I still don't understand why you'd want to put the configuration into
> >>>>>> qcow2 instead of the other way around.
> >>>>>>
> >>>>>> Or why you'd want to use a single file at all, because as this whole
> >>>>>> thread shows, a disk image alone is clearly not sufficient to describe a VM.
> >>>>>>
> >>>>>> (Or it may be in simple cases, but then that's because you don't need
> >>>>>> any configuration.)
> >>>>>
> >>>>> Because it avoids the unpacking associated with archives.
> >>>>
> >>>> I'm not talking about unpacking.  I'm talking about a potentially new
> >>>> format which allows accessing the qcow2 file in-place.  It would
> >>>> probably be trivial to write a block driver to allow this.
> >>>>
> >>>> (And as I wrote in my response to Michal, I suspect that tar could
> >>>> actually allow this, even though it would probably not be the ideal format.)
> >>>
> >>> As above, I don't think this is trivial; you have to change all the
> >>> layers;  let's say it was a tar; you'd have to somehow know that you're
> >>> importing one of these special tars,
> >>
> >> Which is trivial because it's just "Hey, look, it's a tar with that
> >> description file".
> > 
> > Trivial? It's taking 100+ mails to add a tag to a qcow2 file! Can you
> > imagine what it takes to change libvirt, openstack, ovirt and the rest?
> 
> :-)
> 
> The implementation is trivial is what I meant, just like the
> implementation would be rather simple for qcow2 to store a binary blob
> and completely ignore it.

But then you'd have people shipping .newformat files as well as qcow2
files and you'd have to persuade people to start doing that, and they'd
ship both or none or....

> >>>                                      you also have to have a tool to
> >>> create them;
> >>
> >> Also trivial.  Non-trivial is modifying them.
> >>
> >> The workflow would be to create the tar with an empty qcow2 file, the VM
> >> description you want, and then just using it.
> >>
> >> Yes, using is more difficult, but it wouldn't be an own tool, it would
> >> be built into qemu.  I can't say how difficult that implementation would
> >> be, but it would not be trivial, that is correct.
> >>
> >>>              and you have to worry about whether that alignment
> >>> is correct for the storage/memory you're using it with.
> >>
> >> Which would be difficult with tar, right.  But we don't have to use tar.
> >>
> >> (And, no, I don't think creating a new container format is not worse for
> >> interoperability than adding a blob to qcow2.)
> > 
> > If you were going to do this then you'd end up just using OVA.
> > You couldn't justify yet another format.
> 
> Sure, the exact format doesn't matter to me (or at least currently I
> don't think it does...).  I'm more interested in scope and what good it
> actually brings.

Dave

> Max
> 



--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 14:51                               ` Max Reitz
  2018-06-06 15:05                                 ` Dr. David Alan Gilbert
@ 2018-06-06 15:09                                 ` Michael S. Tsirkin
  2018-06-06 17:06                                   ` Max Reitz
  1 sibling, 1 reply; 157+ messages in thread
From: Michael S. Tsirkin @ 2018-06-06 15:09 UTC (permalink / raw)
  To: Max Reitz
  Cc: Dr. David Alan Gilbert, Kevin Wolf, qemu-block, armbru,
	qemu-devel, Richard W.M. Jones, stefanha

On Wed, Jun 06, 2018 at 04:51:39PM +0200, Max Reitz wrote:
> On 2018-06-06 16:31, Dr. David Alan Gilbert wrote:
> > * Max Reitz (mreitz@redhat.com) wrote:
> >> On 2018-06-06 14:00, Dr. David Alan Gilbert wrote:
> >>> * Max Reitz (mreitz@redhat.com) wrote:
> >>>> On 2018-06-06 13:14, Dr. David Alan Gilbert wrote:
> >>>>> * Max Reitz (mreitz@redhat.com) wrote:
> >>>>>> On 2018-06-05 11:21, Dr. David Alan Gilbert wrote:
> >>>>>>> <reawakening a fizzled out thread>
> > 
> > <snip>
> > 
> >>>>> The problem with having a separate file is that you either have to copy
> >>>>> it around with the image 
> >>>>
> >>>> Which is just an inconvenience.
> >>>
> >>> It's more than that;  if it's a separate file then the tools can't
> >>> rely on users supplying it, and frankly they won't and they'll still
> >>> just supply an image.
> >>
> >> At which point you throw an error and tell them to specify the config file.
> > 
> > No:
> >    a) At the moment they get away with it for images since they're all
> >       'pc' and the management layers do the right thing.
> 
> So so far nobody has complained?  I don't really see the problem then.
> 
> If deploying a disk and using all the defaults works out for users,
> great.  If they want more options, apparently they already know they
> have to provide some config.

QEMU's usability is terrible. There are tons of tools out there to try
to tame it, but of course they lack the knowledge of the VM internals
that QEMU has.


> >    b) They'll give the wrong config file - then you'd need to add a flag
> >      to detect that - which means you'd need to add something to the
> >      qcow to match it to the config; loop back to the start!
> 
> I'm not sure how seriously I should take this argument.  Do stupid
> things, win stupid prizes.
> 
> If that's the issue, add a UUID to qcow2 files and reference it from the
> config file.
> 
> > We should make this EASY for users.
> 
> To me, having a simple config file they can edit manually certainly
> seems simpler than having to use specific tools to edit it inside of the
> qcow2 file.

I think you are one of the happy users familiar with qemu intricacies
and/or using a tool on top that does it for you.

> >>>> I understand it is an inconvenience and it would be nice to change it,
> >>>> but please understand that I do not want qcow2 to become a filesystem
> >>>> just to relieve an inconvenience.
> >>>
> >>> I very much don't want it to be a filesystem; my reason for writing
> >>> down my spec the way I did was to make it clear that the only
> >>> thing I want of qcow2 is a single blob, no more; I don't want naming
> >>> of the blob or anything else.
> >>>
> >>>> (Note: I understand that you may not want qcow2 to become a filesystem,
> >>>> but I do get the impression from others.)
> >>>
> >>> My aim was to specify it to fulfill the requirements that everyone
> >>> else had asked for, but still only having one unmodifiable blob in qcow.
> >>>
> >>>>>                           or have an archive. If you have an archive
> >>>>> you have to have an unpacking step which then copies, potentially a lot
> >>>>> of data taking some reasonable amount of time.
> >>>>
> >>>> I'm sure this can be optimized, but yes, I get that.
> >>>>
> >>>> (If you use e.g. tar and store the image data starting on an FS cluster
> >>>> boundary (64 kB should be more than sufficient), I assume there is a way
> >>>> to extract that data into a new file without copying anything.)
> >>>
> >>> But then we have to modify all the current things that know how to
> >>> handle a qcow2.
> >>
> >> Not in this case because it'd still be a flat qcow2 file in a simple tar
> >> archive.
> >>
> >> But you're right if we had a more complex format (like chunks stored in
> >> a tar file).
> > 
> > My only problem with using the tar like that is that all tools
> > everywhere would need to be updated to be able to parse them.
> > (Note if adding a blob to qcow2 like I'm asking for would break existing
> > qcow2 users then I don't want it either).
> 
> OK, so I suppose this goes back to "let's decide what we really want to
> configure first, and how we can limit our scope in an effective and
> long-lasting way".
> 
> >>>>>                                                 Storing a simple bit
> >>>>> of data with the image avoids that.
> >>>>
> >>>> It is not a simple bit of data, as evidenced by the discussion about
> >>>> storing binary blobs and MIME types going on.
> >>>
> >>> All of the things they've suggested can be done inside that one blob;
> >>> even inside the json (or any other structure in that blob).
> >>
> >> Right, from qcow2's perspective it's a blob of data.  But you can put a
> >> whole filesystem into a blob of data, and I get the impression that this
> >> is what some are trying to do.
> >>
> >> Once we store larger amounts of binary data in that blob (which is what
> >> I'm fearing from comments on MIME types and PNG images), people will
> >> realize that always having to re-store the whole blob if you modify
> >> something in the middle is inefficient and that it needs to be
> >> optimized.  I don't think you want to do that, but we haven't
> >> implemented any of this yet and people are already asking for such
> >> binary data inside of the blob.
> >>
> >> I suspect it'll only get worse over time.
> >> I think the most difficult thing about this discussion is that there are
> >> different targets.
> >>
> >> You just want to store a bit of information.  OK, good, but then I'd say
> >> we could even just prepend that to the image file in a small header.
> > 
> > 
> > I think you're over-reading what people are asking for.
> > I think the PNG suggestion is again the 'label on the front' for a logo.
> 
> Which is OK if you store like everything, but very much over the top for
> your suggestion.  Again, different people want different things and I
> feel like that is the real discussion we should be having right now and
> not necessarily where to store it.
> 
> Because I think (maybe I'm wrong, though) where to store it heavily
> depends on what we want to store and how we want to use it.

I don't really see why.

> > I've not seen anything that's not for either:
> >   a) The user to know what the image is
> 
> I thought the use case was they just downloaded it.
> 
> Otherwise, they should manage their filenames reasonably, come on.
> Seriously, adding a cute picture because users are too stupid to manage
> their VMs is *not* qcow2's problem.

QEMU is hard to use right and it is QEMU's problem. Users aren't stupid
but neither do they have the time to learn internals of the tools they
use.


> >   b) The management layer to know what type of VM to create
> 
> Apparently this is really what you want.  I really still don't see the
> difficulty in supplying a config file (or the danger in not doing so, or
> in supplying the wrong one), but, hey, it would be a nice feature indeed.
> 
> (I just don't like the tradeoff in complexity.)
> 
> >> (Note that extending that header would not even be too complicated,
> >> because you can easily move the qcow2 header somewhere else.  Say you
> >> move it back by one cluster (e.g. 64 kB), then you just put the cluster
> >> that was there originally to the end of the file, which is pretty much
> >> trivial.  Then you copy that original data there and overwrite it with
> >> the image header.  Done.)
> >>
> >> Others want to store more binary data.  Then this may get inefficient
> >> and insufficient.  But I'd think at this point it gets really
> >> problematic to put the data into the qcow2 file because it really
> >> doesn't belong there.  (I can't imagine anything that would warrant a
> >> MIME type.)
> > 
> > No, I can't imagine why anyone wants a MIME type either.
> > 
> >> Then I've heard proposals of storing multiple disk images.  Yes, you
> >> could store multiple disks inside of a single qcow2 file, but it would
> >> be basically exactly the same as storing just multiple qcow2 files, so...
> > 
> > No, completely agree.
> 
> So, yeah, we need a discussion on what to store first, probably in a new
> thread, and maybe one that is not about qcow2 specifically.
> 
> >> And really, I still believe in my slippery slope argument, which means
> >> that even if you just want to innocently store a machine type, we will
> >> end up with something vastly more complex in the end.
> >>
> >> Finally, it appears to me that you have a simple problem, found one
> >> possible solution, and now you just focus on that solution instead of
> >> taking a step back and looking at the problem again.
> >>
> >> The problem: You want to store a binary blob and a disk image together.
> >>
> >> Your solution: qcow2 has refcounting and thus "occupation bits".  You
> >> can put data into it and it will leave it alone, as long as that area is
> >> marked as occupied.  Let's put the data into the qcow2 file.
> >>
> >> OK, let's look at the problem and its constraints again.
> >>
> >> Hard constraint: Store a single file.
> >> (I don't think this is a hard constraint, because I haven't been
> >> convinced yet that handling more than a single file is so bad.)
> > 
> > See above; I think it is.
> 
> I know, but you haven't convinced me yet. :-)
> 
> > My other hard constraint is that no tool has to change unless
> > it wants to make use of the new data.
> 
> Sure that it isn't a soft constraint?  If most tools can stay unchanged
> but some very specific ones have to be changed, that seems reasonable to me.
> 
> >> Soft constraint: Max doesn't like storing blobs in qcow2.
> >>
> >> So one solution is to ignore the soft constraint.  OK, valid solution, I
> >> give you that.  But it doesn't leave me content, probably understandably so.
> 
> [...]
> 
> >> But really, if you create a VM, you need a configuration.  Like if you
> >> set up a new computer, you need to know what you want.  Usually there is
> >> no sticky label, but you just have to know and input it manually.  Maybe
> >> you have a sheet of paper, which I'd call the configuration file.
> > 
> > Most things are figurable-out by the management tools/defaults or
> > are dependent on the whim of the user - we're only trying to stop the
> > user doing things that won't work.
> 
> But what's so bad about an empty screen because the user hasn't read the
> download description?

Because the user just learns to avoid QEMU in the future as being too hard.

> > Simpler example; what stops you trying to put the PPC qcow image into
> > your x86 VM system - nothing that I know of.  I just want to stop the
> > users shooting themselves in the foot.
> 
> They haven't shot themselves in the foot, they've just wasted a bit of
> their time, which could've been avoided by reading before clicking.
> 
> [...]

Software developers are paid to save people's time.

> >>>>>>> --------------------------------------------------------------
> >>>>>>>    
> >>>>>>>
> >>>>>>> Some reasoning:
> >>>>>>>    a) I've avoided the problem of when QEMU interprets the value
> >>>>>>>       by ignoring it and giving it to management layers at the point
> >>>>>>>       of VM import.
> >>>>>>
> >>>>>> Yes, but in the process you've made it completely opaque to qemu,
> >>>>>> basically, which doesn't really make it better for me.  Not that
> >>>>>> qemu-specific information in qcow2 files would be what I want, but, well.
> >>>>>>
> >>>>>> But it does solve technical issues, I concede that.
> >>>>>>
> >>>>>>>    b) I hate JSON, but there again nailing down a fixed format
> >>>>>>>       seems easiest and it makes the job of QCOW easy - a single
> >>>>>>>       string.
> >>>>>>
> >>>>>> Not really.  The string can be rather long, so you probably don't want
> >>>>>> to store it in the image header, and thus it's just a binary blob from
> >>>>>> qcow2's perspective, essentially.
> >>>>>
> >>>>> Yes, but it's a single blob - I'm not asking for multiple keyed blobs
> >>>>> or the ability to update individual blobs; just one blob that I can
> >>>>> replace.
> >>>>
> >>>> OK, you aren't, but others seem to be.
> >>>>
> >>>> Or, well, you call it a single blob.  But actually the current ideas
> >>>> seem to be to store a rather large configuration tree with binary data
> >>>> in that blob, so to me personally there is absolutely no functional
> >>>> difference to just storing a tar file in that blob.
> >>>>
> >>>> So correct me if I'm wrong, but to me it appears that you effectively
> >>>> want to store a filesystem in qcow2.[1]  Well, that's better than making
> >>>> qcow2 the filesystem, but it still appears just the wrong way around to me.
> >>>
> >>> It's different in the sense that what we end up with is still a qcow2;
> >>> anything that just handles qcow2's and can pass them through doesn't
> >>> need to do anything different; users don't need to do anything
> >>> different.  No one has to pack/unpack the file.
> >>
> >> Packing/unpacking is a strawman because I'm doing my best to give
> >> proposals that completely avoid that.
> >>
> >> Users do need to do something different, because users do need to
> >> realize that today there is no way to store VM configuration and disk
> >> data in a single file.  So if they already start VMs just based on a
> >> disk, then they are assuming behavior we do not have and that I'd call
> >> naive.  But that is a strawman from my side, sorry.  Keeping naive users
> >> happy is probably OK.
> > 
> > Remember this all works fine now and has done for many years;
> > it's the addition of q35 that breaks that assumption.
> > The users can already blindly pick up the qcow2 image and stuff it in
> 
> Which probably was blind luck already.  And if it wasn't, that means
> they knew the defaults are what they want.  So now they'd know they
> aren't and they have to offer a config file along with the disk image.
> 
> > and it all works; all I want is for that to keep working.
> 
> And all I say is that it's not unreasonable to expect users to realize
> that a VM is more than a disk image, just like a computer is more than a
> disk drive; and that handling two files really is not the end of the world.
> 
> (And neither is wasting someone's time because they can't read.)
> 
> Firstly, I agree it's a nice thing to have, but it's not worth it if we
> don't come up with clear rules on how to prevent developing a full
> appliance format.
> 
> Or maybe we want that (because I still believe that you can always come
> up with obscure options without which the VM won't boot in your specific
> case), but then this is beyond just storing a tiny bit of data in a
> qcow2 image.
> 
> [...]

Either we'll add more and more data later or we won't. Why worry about
it from the start? We'll never get anywhere if we do.


> >>>> [1] Yes, I know that the guest disk already contains an FS. :-P
> >>>>
> >>>>>>>       (I would suggest in layer2 that the keys are sorted, but
> >>>>>>>       that's a pain to do in some json creators)
> >>>>>>>    c) Forcing the registry of keys might avoid silly duplication.
> >>>>>>>       We can but hope.
> >>>>>>>    d) I've not said it's a libvirt XML file since that seems
> >>>>>>>       a bit prescriptive.
> >>>>>>>
> >>>>>>> Some initial suggested keys:
> >>>>>>>
> >>>>>>>    "qemu.machine-types": [ "q35", "i440fx" ]
> >>>>>>>    "qemu.min-ram-MB": 1024
> >>>>>>
> >>>>>> I still don't understand why you'd want to put the configuration into
> >>>>>> qcow2 instead of the other way around.
> >>>>>>
> >>>>>> Or why you'd want to use a single file at all, because as this whole
> >>>>>> thread shows, a disk image alone is clearly not sufficient to describe a VM.
> >>>>>>
> >>>>>> (Or it may be in simple cases, but then that's because you don't need
> >>>>>> any configuration.)
> >>>>>
> >>>>> Because it avoids the unpacking associated with archives.
> >>>>
> >>>> I'm not talking about unpacking.  I'm talking about a potentially new
> >>>> format which allows accessing the qcow2 file in-place.  It would
> >>>> probably be trivial to write a block driver to allow this.
> >>>>
> >>>> (And as I wrote in my response to Michal, I suspect that tar could
> >>>> actually allow this, even though it would probably not be the ideal format.)
> >>>
> >>> As above, I don't think this is trivial; you have to change all the
> >>> layers;  let's say it was a tar; you'd have to somehow know that you're
> >>> importing one of these special tars,
> >>
> >> Which is trivial because it's just "Hey, look, it's a tar with that
> >> description file".
> > 
> > Trivial? It's taking 100+ mails to add a tag to a qcow2 file! Can you
> > imagine what it takes to change libvirt, openstack, ovirt and the rest?
> 
> :-)
> 
> The implementation is trivial is what I meant, just like the
> implementation would be rather simple for qcow2 to store a binary blob
> and completely ignore it.

Old QEMU can't handle tar files. You need to unpack them,
then figure out that there are two files in the tar, one
is just for new qemu versions, one is portable. At which point
you need to go figure out what is your QEMU version.


> >>>                                      you also have to have a tool to
> >>> create them;
> >>
> >> Also trivial.  Non-trivial is modifying them.
> >>
> >> The workflow would be to create the tar with an empty qcow2 file, the VM
> >> description you want, and then just using it.
> >>
> >> Yes, using is more difficult, but it wouldn't be an own tool, it would
> >> be built into qemu.  I can't say how difficult that implementation would
> >> be, but it would not be trivial, that is correct.
> >>
> >>>              and you have to worry about whether that alignment
> >>> is correct for the storage/memory you're using it with.
> >>
> >> Which would be difficult with tar, right.  But we don't have to use tar.
> >>
> >> (And, no, I don't think creating a new container format is not worse for
> >> interoperability than adding a blob to qcow2.)
> > 
> > If you were going to do this then you'd end up just using OVA.
> > You couldn't justify yet another format.
> 
> Sure, the exact format doesn't matter to me (or at least currently I
> don't think it does...).  I'm more interested in scope and what good it
> actually brings.
> 
> Max
> 


* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 14:55                                       ` Max Reitz
@ 2018-06-06 15:25                                         ` Michal Suchánek
  2018-06-06 18:02                                           ` Max Reitz
  0 siblings, 1 reply; 157+ messages in thread
From: Michal Suchánek @ 2018-06-06 15:25 UTC (permalink / raw)
  To: Max Reitz
  Cc: Dr. David Alan Gilbert, Kevin Wolf, ehabkost, qemu-block,
	Michael S. Tsirkin, Richard W.M. Jones, qemu-devel, stefanha


On Wed, 6 Jun 2018 16:55:08 +0200
Max Reitz <mreitz@redhat.com> wrote:

> On 2018-06-06 16:41, Dr. David Alan Gilbert wrote:
> > * Max Reitz (mreitz@redhat.com) wrote:  
> 
> [...]
> 
> >> So why is it so dangerous to connect a disk you just downloaded to
> >> e.g. the wrong machine type?  I assumed it just wouldn't work and
> >> you'd try again, until you realized that maybe you should read the
> >> download description and do as it says ("download this config
> >> file, pass it").  
> > 
> > That's bad!  Stuff should just-work;  
> 
> That's how it always should be.  Life's tough, though.
> 
> >                                      it currently just works,  
> 
> Due to sheer blind luck, I'd say.

It's TimeProvenSolution(tm).

> 
> >                                                               things
> > should get better and easier for our users.  
> 
> Users using a whole VM stack plus management, but then handling two
> files instead of one is too much to ask?

What you don't seem to realize is there are cases when there is an
'administrator' who has set up the VM stack plus management and 'joe
user' who wants to run some random VM on that stack.

And if you download an appliance compatible with the stack it should
just work. For a long time the 'appliance' for qemu based
virtualization was a simple qcow2 file which was sized sufficiently for
the VM to run but shrunk for transport. And although it is technically
wrong it JustWorked(tm).

> 
> >                                              And anyway, not
> > working for EFI for example can be just a blank screen.  Seriously
> > - keep it easy for the user!  
> 
> Thinking this through makes you end up with appliances.

And those can in general have more than one disk.

Thanks

Michal



* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 15:05                                 ` Dr. David Alan Gilbert
@ 2018-06-06 15:36                                   ` Eric Blake
  2018-06-06 16:11                                     ` Michal Suchánek
  2018-06-06 16:32                                     ` Daniel P. Berrangé
  2018-06-06 17:49                                   ` Max Reitz
  1 sibling, 2 replies; 157+ messages in thread
From: Eric Blake @ 2018-06-06 15:36 UTC (permalink / raw)
  To: Dr. David Alan Gilbert, Max Reitz
  Cc: Kevin Wolf, qemu-block, Michael S. Tsirkin, armbru, qemu-devel,
	Richard W.M. Jones, stefanha

On 06/06/2018 10:05 AM, Dr. David Alan Gilbert wrote:

>> If that's the issue, add a UUID to qcow2 files and reference it from the
>> config file.
> 
> Is a UUID a small string :-)

Even better, it's something that you could stick directly in the qcow2 
header (and which therefore cannot grow to a larger size) - it would be 
a well-constrained scoped addition.  Maybe the analogy to actual 
hardware would be that the config file is like a sticky note, and a UUID 
embedded in the qcow2 file would be the disk serial number; if you are 
paranoid that the sticky note could be too easily pulled off one disk 
and put on another, then the sticky note can include the serial number.
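
As a rough sketch of that "serial number on the sticky note" check:
qcow2 has no UUID header field today, so the UUID read from the image is
simply passed in as a parameter here, and the "image-uuid" config key
and the function name are both made up for illustration.

```python
import json
import uuid


def config_matches_image(config_text, image_uuid):
    """Return True if the config file may be used with the given image.

    `config_text` is the JSON config file content; `image_uuid` is the
    uuid.UUID that a (hypothetical) qcow2 header field would provide.
    """
    config = json.loads(config_text)
    referenced = config.get("image-uuid")
    if referenced is None:
        # Assumption: a config that does not pin an image fits any image.
        return True
    # uuid.UUID normalizes case and formatting before comparing.
    return uuid.UUID(referenced) == image_uuid
```

A management tool could run this at import time and refuse, with a clear
message, a config file that was written for a different disk.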

> 
>>> We should make this EASY for users.
>>
>> To me, having a simple config file they can edit manually certainly
>> seems simpler than having to use specific tools to edit it inside of the
>> qcow2 file.
> 
> The users never touch the tools; they click and import the VM image.

And if we make it easy to import a tar file as the VM image, then that's 
still the case.

>> Sure that it isn't a soft constraint?  If most tools can stay unchanged
>> but some very specific ones have to be changed, that seems reasonable to me.
> 
> The hard constraint is the normal path stays unchanged; we can change
> the tools to make use of the extra data, but not change what's out
> there.

But for the new config to be useful, you have to modify at least one 
tool in the path.  At which point, it is just as easy to say: "libvirt 
is now smart enough to read the config file out of a .qcow2 to know that 
it should prefer a q35 machine" as it is to say "libvirt is now smart 
enough to treat a .tar file containing .qcow2 and a config file that 
states that it should prefer a q35 machine", and either approach 
requires just a single file for the user to download.  Or, if you are 
worried about what happens in a too-old system that doesn't understand 
the config file, you have either: "the tool didn't know that the .qcow2 
contained a config snippet, and tried to open the qcow2 file with pc 
even though q35 would have been better" or "the tool didn't know that 
the .tar file contains a config snippet and .qcow2 image, and could not 
run the image".  Either way, the image with the new config data doesn't 
run unless the user realizes they need to upgrade their system - but 
trying (and failing) to run is actually less friendly than flat out 
claiming unrecognized format and failing early.  So going with the tar 
format actually encourages users to upgrade, unlike going with enhancing 
qcow2.
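
A small sketch of the "fail early" behavior I mean, under stated
assumptions: the only hard fact used is that a qcow2 image starts with
the magic bytes "QFI\xfb"; the member names disk.qcow2 and config.json,
and the idea of a tar-based appliance at all, are placeholders rather
than an agreed format.

```python
import io
import tarfile

QCOW2_MAGIC = b"QFI\xfb"  # first four bytes of a qcow2 image


def classify(data):
    """Classify a downloaded file as 'qcow2', 'appliance' or 'unknown'."""
    if data[:4] == QCOW2_MAGIC:
        return "qcow2"
    try:
        with tarfile.open(fileobj=io.BytesIO(data)) as tar:
            names = tar.getnames()
    except tarfile.TarError:
        # Not a tar archive either: refuse up front.
        return "unknown"
    # Hypothetical layout: one disk image plus one config file.
    if "disk.qcow2" in names and "config.json" in names:
        return "appliance"
    return "unknown"
```

A tool that predates the new format only has the first check, so it
reports the tar as an unrecognized format immediately, instead of
booting a q35-only disk on pc and leaving the user with a blank screen.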


> 
> Because it's got to be EASY for the customer; seriously - stop punishing
> the user for not noticing something.
> We've got to help the users, if not we get people asking why their VM
> system has given them a black screen, or why the image they just
> downloaded didn't work - it's basic user friendliness.
> 
> It's not obvious why it's failed; if it was as simple as a nice box
> popping up telling them they'd booted it wouldn't be too bad; but some
> of them will waste 3 hours trying to figure out wth happened.
> 
> *seriously* think about our users.

I am - to me, telling a user that "here is your image, it is a new file 
extension, therefore you need a new qemu before you can even try to use 
it - but once you have that, everything just works" is nicer than "here 
is the same extension you've always used, but it might not work for you 
and it might be 3 hours of your time to figure out why it didn't work".

>>
>> The implementation is trivial is what I meant, just like the
>> implementation would be rather simple for qcow2 to store a binary blob
>> and completely ignore it.
> 
> But then you'd have people shipping .newformat files as well as qcow2
> files and you'd have to persuade people to start doing that, and they'd
> ship both or none or....

Why would you intentionally ship a .qcow2 that only works on q35 if 
.newformat is already known to be nicer to users?  Just ship .newformat! 
  People shipping just .qcow2 would be limited to those used to 'pc' as 
the default, but even they could be encouraged to tweak their process to 
make it easier for end consumers to take advantage of .newformat.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 14:17                                   ` Max Reitz
@ 2018-06-06 16:10                                     ` Eduardo Habkost
  2018-06-06 18:09                                       ` Max Reitz
  0 siblings, 1 reply; 157+ messages in thread
From: Eduardo Habkost @ 2018-06-06 16:10 UTC (permalink / raw)
  To: Max Reitz
  Cc: Michal Suchánek, Kevin Wolf, qemu-block, Michael S. Tsirkin,
	Richard W.M. Jones, qemu-devel, stefanha

On Wed, Jun 06, 2018 at 04:17:14PM +0200, Max Reitz wrote:
> On 2018-06-06 15:45, Michal Suchánek wrote:
> > On Wed, 6 Jun 2018 15:14:03 +0200
> > Max Reitz <mreitz@redhat.com> wrote:
> > 
> >> On 2018-06-06 14:13, Michal Suchánek wrote:
> >>> On Wed, 6 Jun 2018 13:52:35 +0200
> >>> Max Reitz <mreitz@redhat.com> wrote:
> >>>   
> >>>> On 2018-06-06 13:43, Michal Suchánek wrote:  
> >>>>> On Wed, 6 Jun 2018 13:32:47 +0200
> >>>>> Max Reitz <mreitz@redhat.com> wrote:
> >>>>>     
> >>>>>> On 2018-06-06 13:19, Michal Suchánek wrote:    
> >>>>>>> On Wed, 6 Jun 2018 13:02:53 +0200
> >>>>>>> Max Reitz <mreitz@redhat.com> wrote:  
> >>
> >> [...]
> >>
> >>>>>>>> What I'm trying to get at is that qcow2 was not designed to be
> >>>>>>>> a container format for arbitrary files.  If you want to make it
> >>>>>>>> such, I'm sure there are existing formats that work
> >>>>>>>> better.      
> >>>>>>>
> >>>>>>> Such as?      
> >>>>>>
> >>>>>> ext2?    
> >>>>>
> >>>>> So you want an ext2 driver in qemu instead of expanding qcow2 to
> >>>>> work not only for a single disk but also for an appliance?    
> >>>>
> >>>> Yes, because ext2 was designed to be a proper filesystem.  I'm not
> >>>> an FS designer.  Well, not a good one anyway.  So I don't trust
> >>>> myself on extending qcow2 to be a good FS -- and why would I, when
> >>>> there are already numerous FS around.  
> >>>
> >>> Do you expect that performance of qemu using qcow2 driver over ext2
> >>> driver will be better than using qcow driver directly with some part
> >>> semi-permanently occupied by a configuration blob? My bet is not.  
> >>
> >> If you want to store multiple disk images in a single file?  I would
> >> think so, yes.  With qcow2, I would assume it leads to
> >> fragmentation.  
> > 
> > How is that different from single disk divided into two partitions
> > internally (without any knowledge on the qcow2 level)?
> 
> From how it's going to be fragmented, there is no difference.  If you
> have multiple partitions and write to them concurrently, thus allocating
> new areas, you get bad fragmentation.
> 
> >> I would hope that proper filesystems can mitigate this.
> > 
> > Not really. Not without much complexity and repeated maintenance.
> 
> Yes, a proper filesystem.  Which we'd get for free with multiple files.
> 
> >>> The ext* drivers are designed to work with kernel VM infrastructure
> >>> which must be tuned for different usage scenarios and you would
> >>> have to duplicate that tuning in qemu to get competitive
> >>> performance. Also you get qcow2 and ext2 metadata which must be
> >>> allocated, managed, etc. You get more storage and performance
> >>> overhead for no good reason.  
> >>
> >> Yes, there is a good reason.  You can add arbitrary configuration
> >> options without having to worry about me.
> > 
> > But I will not be able to use the images in qemu so it will be useless.
> 
> Neither can you with the current proposal because that is about adding
> management layer configuration options which are opaque to qemu.
> 
> > Well, there is FUSE and that is certainly blazing fast and ubiquitous,
> > I am sure.
> 
> If you want to use pre-existing drivers, you'd probably use a loop device.
> 
> Otherwise, you'd use the block layer for accessing the disk.
> 
> If you want blazingly fast, you probably won't use qcow2 anyway.  Or,
> funnily enough, you'd want to probably split the qcow2 file into a
> metadata and a data file, so you get even more files.  (But that is a
> proposal for the future.)
> 
> >> Seriously, though, a real FS would allow you to be more expressive and
> >> really do what you want without having to work around the quirks that
> >> adding a not-real-FS in the most simple way possible to qcow2 would
> >> bring with it.
> >>
> >> Because this is part of my fear, that we now add a very simple blob
> >> for just a sprinkle of data.  But over time it gets more and more
> >> complex because we want to store more and more data to make things
> >> ever more convenient[1], we notice that we need more features, the
> >> format gets more complex, and in the end we have an FS that is just
> >> worse than a real FS.
> >>
> >> [1] And note that if I'm convinced to store VM configuration data in
> >> qemu, I will agree that we can store any data in there and it would be
> >> nice if any VM could be provisioned and used that way.
> >>
> >>> On the other hand, qcow is designed for storing VM disk data and
> >>> hopefully was tuned to do that decently over the years. The primary
> >>> use case remains storing VM disk data. Adding a configuration blob
> >>> does not change that.  
> >>
> >> True.  So the argument is that qcow2 may be worse for storing
> >> arbitrary data, but we don't have performance requirements for that;
> >> but we do have performance requirements for disk data and adding
> >> another format below qcow2 will not make it better.
> >>
> >> I do think it is possible to not make things worse with a format under
> >> qcow2, but that may require additional complexity, that you think is
> >> pointless.
> >>
> >> I understand that you think that, but I still believe that putting the
> >> configuration into qcow2 is just the wrong way around and will fall on
> >> our feet in the long run.
> > 
> > I think that *if* we want an 'appliance' format that stores a whole VM
> > in a single file to ease VM distribution then the logical place to look
> > in qemu is qcow. The reasons have been explained at length.
> 
> The reason being that it's the easiest place, yes.  That doesn't make it
> the best place.
> 
> > I understand that for some use cases simplifying the distribution of
> > VMs as much as possible is quite important.
> 
> I don't because still nobody has explained it to me.
> 
> The only explanation I got so far was "People are lazy and we have
> defaults for everything, so we don't throw an error if people forget to
> pass a configuration file."

People don't pass a configuration file today because there's no
standard for such a configuration file.  qcow2 is already used
today as an appliance file format because there's no better
option.  People download disk images from appliance and OS
providers, import them into a cloud system, and it works out of
the box because (luckily) "pc" is enough for most of them.

We can specify a true appliance file format, and ask people to
use it.  But then providers of single-disk appliances and OSes
will need to publish two appliance images: qcow2 disk image for
old systems that don't support the new format, and one in the new
appliance format, for systems that support it.


> 
> Which to me still just makes it an inconvenience.

Well, there are small inconveniences and there are big
inconveniences that together make a system unnecessarily hard to
use.  I'd say this one falls somewhere in the middle.


[...]
> I'm noticing a pattern here, and that is that everybody has a different
> opinion on what we actually want in the end, and it's just by chance
> that we find ourselves in two camps ("put it in qcow2" vs. "put it
> somewhere else").
> 
> Maybe we should first discuss what we actually want before we can
> discuss where to put it.

I'm inclined to agree.  Once we figure out a good VM description
format, we can justify a proposal to allow embedding the VM
description in qcow2 for convenience.

-- 
Eduardo

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 15:36                                   ` Eric Blake
@ 2018-06-06 16:11                                     ` Michal Suchánek
  2018-06-06 16:37                                       ` Eric Blake
  2018-06-06 16:32                                     ` Daniel P. Berrangé
  1 sibling, 1 reply; 157+ messages in thread
From: Michal Suchánek @ 2018-06-06 16:11 UTC (permalink / raw)
  To: Eric Blake
  Cc: Dr. David Alan Gilbert, Max Reitz, Kevin Wolf, qemu-block,
	Michael S. Tsirkin, armbru, qemu-devel, Richard W.M. Jones,
	stefanha

On Wed, 6 Jun 2018 10:36:20 -0500
Eric Blake <eblake@redhat.com> wrote:

> On 06/06/2018 10:05 AM, Dr. David Alan Gilbert wrote:
> 
> >   
> >>> We should make this EASY for users.  
> >>
> >> To me, having a simple config file they can edit manually certainly
> >> seems simpler than having to use specific tools to edit it inside
> >> of the qcow2 file.  
> > 
> > The users never touch the tools; they click and import the VM
> > image.  
> 
> And if we make it easy to import a tar file as the VM image, then
> that's still the case.
> 
> >> Sure that it isn't a soft constraint?  If most tools can stay
> >> unchanged but some very specific ones have to be changed, that
> >> seems reasonable to me.  
> > 
> > The hard constraint is the normal path stays unchanged; we can
> > change the tools to make use of the extra data, but not change
> > what's out there.  
> 
> But for the new config to be useful, you have to modify at least one 
> tool in the path.  At which point, it is just as easy to say:
> "libvirt is now smart enough to read the config file out of a .qcow2
> to know that it should prefer a q35 machine" as it is to say "libvirt
> is now smart enough to treat a .tar file containing .qcow2 and a
> config file that states that it should prefer a q35 machine", and
> either approach requires just a single file for the user to
> download.  Or, if you are worried about what happens in a too-old
> system that doesn't understand the config file, you have either: "the
> tool didn't know that the .qcow2 contained a config snippet, and
> tried to open the qcow2 file with pc even though q35 would have been
> better" or "the tool didn't know that the .tar file contains a config
> snippet and .qcow2 image, and could not run the image".  Either way,
> the image with the new config data doesn't run unless the user
> realizes they need to upgrade their system - but trying (and failing)
> to run is actually less friendly than flat out claiming unrecognized
> format and failing early.  So going with the tar format actually
> encourages users to upgrade, unlike going with enhancing qcow2.
> 

Yes, that's a good argument.

The reason for storing the config inside the qcow2 file is that you have
one self-contained file that can be updated by qemu itself.

So you take an image file, point your management console to it, boot
it, change something on it, shut it down, and publish the image as the
new revision of the VM.

With the tar file the management console needs to chew the tar file,
save a copy of the config file and the disk image somewhere, and when
you update the image you have to re-export the tar file.

This is a lot of lengthy copying. The upside is that the console can
check that the machine you are exporting is not running and you are not
prone to publishing images in inconsistent state.

Thanks

Michal

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 15:36                                   ` Eric Blake
  2018-06-06 16:11                                     ` Michal Suchánek
@ 2018-06-06 16:32                                     ` Daniel P. Berrangé
  2018-06-06 16:36                                       ` Dr. David Alan Gilbert
  2018-06-07 10:02                                       ` Andrea Bolognani
  1 sibling, 2 replies; 157+ messages in thread
From: Daniel P. Berrangé @ 2018-06-06 16:32 UTC (permalink / raw)
  To: Eric Blake
  Cc: Dr. David Alan Gilbert, Max Reitz, Kevin Wolf, qemu-block,
	Michael S. Tsirkin, armbru, qemu-devel, Richard W.M. Jones,
	stefanha

On Wed, Jun 06, 2018 at 10:36:20AM -0500, Eric Blake wrote:
> On 06/06/2018 10:05 AM, Dr. David Alan Gilbert wrote:
> 
> > > If that's the issue, add a UUID to qcow2 files and reference it from the
> > > config file.
> > 
> > Is a UUID a small string :-)
> 
> Even better, it's something that you could stick directly in the qcow2
> header (and which therefore cannot grow to a larger size) - it would be a
> well-constrained scoped addition.  Maybe the analogy to actual hardware
> would be that the config file is like a sticky note, and a UUID embedded in
> the qcow2 file would be the disk serial number; if you are paranoid that the
> sticky note could be too easily pulled off one disk and put on another, then
> the sticky note can include the serial number.
> 
> > 
> > > > We should make this EASY for users.
> > > 
> > > To me, having a simple config file they can edit manually certainly
> > > seems simpler than having to use specific tools to edit it inside of the
> > > qcow2 file.
> > 
> > The users never touch the tools; they click and import the VM image.
> 
> And if we make it easy to import a tar file as the VM image, then that's
> still the case.
> 
> > > Sure that it isn't a soft constraint?  If most tools can stay unchanged
> > > but some very specific ones have to be changed, that seems reasonable to me.
> > 
> > The hard constraint is the normal path stays unchanged; we can change
> > the tools to make use of the extra data, but not change what's out
> > there.
> 
> But for the new config to be useful, you have to modify at least one tool in
> the path.  At which point, it is just as easy to say: "libvirt is now smart
> enough to read the config file out of a .qcow2 to know that it should prefer
> a q35 machine" as it is to say "libvirt is now smart enough to treat a .tar
> file containing .qcow2 and a config file that states that it should prefer a
> q35 machine", and either approach requires just a single file for the user
> to download.

Just to be clear, libvirt isn't going to do either of those things.

Whether there is metadata stuffed inside qcow2, or in a metadata file
inside a tar file, libvirt is not going to look inside either of them.
The XML is the only place libvirt deals with the hardware config.

Extracting machine type is always going to be a job for the layer above
such as OpenStack/OVirt/Virt-manager/etc. They will then decide whether
or not they want to honour that info, and if so, put it into the XML
they give to libvirt.

As mentioned elsewhere, IMHO, it is more friendly to those tools
to use pre-existing formats, eg TAR and XML/JSON, for which
their respective programming languages already have APIs/parsers.
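
As a rough illustration of that division of labour, the sketch below shows
a management layer turning a JSON hint into the <os><type> element of the
libvirt domain XML using only standard-library parsers. The JSON keys
"arch" and "machine" and the function name are assumptions made for this
example; no such metadata schema has been agreed on in this thread.

```python
import json
import xml.etree.ElementTree as ET


def machine_hint_to_os_xml(metadata_json, default_machine="pc"):
    """Build a libvirt <os> element from a (hypothetical) JSON hint."""
    hints = json.loads(metadata_json)
    os_el = ET.Element("os")
    type_el = ET.SubElement(os_el, "type")
    # The management layer decides whether to honour the hint;
    # here it simply falls back to a default when the key is absent.
    type_el.set("arch", hints.get("arch", "x86_64"))
    type_el.set("machine", hints.get("machine", default_machine))
    type_el.text = "hvm"
    return ET.tostring(os_el, encoding="unicode")


fragment = machine_hint_to_os_xml('{"machine": "q35"}')
```

The resulting fragment would be merged into the full domain XML that the
upper layer hands to libvirt; libvirt itself never sees the JSON.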


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 16:32                                     ` Daniel P. Berrangé
@ 2018-06-06 16:36                                       ` Dr. David Alan Gilbert
  2018-06-07 10:02                                       ` Andrea Bolognani
  1 sibling, 0 replies; 157+ messages in thread
From: Dr. David Alan Gilbert @ 2018-06-06 16:36 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Eric Blake, Max Reitz, Kevin Wolf, qemu-block,
	Michael S. Tsirkin, armbru, qemu-devel, Richard W.M. Jones,
	stefanha

* Daniel P. Berrangé (berrange@redhat.com) wrote:
> On Wed, Jun 06, 2018 at 10:36:20AM -0500, Eric Blake wrote:
> > On 06/06/2018 10:05 AM, Dr. David Alan Gilbert wrote:
> > 
> > > > If that's the issue, add a UUID to qcow2 files and reference it from the
> > > > config file.
> > > 
> > > Is a UUID a small string :-)
> > 
> > Even better, it's something that you could stick directly in the qcow2
> > header (and which therefore cannot grow to a larger size) - it would be a
> > well-constrained scoped addition.  Maybe the analogy to actual hardware
> > would be that the config file is like a sticky note, and a UUID embedded in
> > the qcow2 file would be the disk serial number; if you are paranoid that the
> > sticky note could be too easily pulled off one disk and put on another, then
> > the sticky note can include the serial number.
> > 
> > > 
> > > > > We should make this EASY for users.
> > > > 
> > > > To me, having a simple config file they can edit manually certainly
> > > > seems simpler than having to use specific tools to edit it inside of the
> > > > qcow2 file.
> > > 
> > > The users never touch the tools; they click and import the VM image.
> > 
> > And if we make it easy to import a tar file as the VM image, then that's
> > still the case.
> > 
> > > > Sure that it isn't a soft constraint?  If most tools can stay unchanged
> > > > but some very specific ones have to be changed, that seems reasonable to me.
> > > 
> > > The hard constraint is the normal path stays unchanged; we can change
> > > the tools to make use of the extra data, but not change what's out
> > > there.
> > 
> > But for the new config to be useful, you have to modify at least one tool in
> > the path.  At which point, it is just as easy to say: "libvirt is now smart
> > enough to read the config file out of a .qcow2 to know that it should prefer
> > a q35 machine" as it is to say "libvirt is now smart enough to treat a .tar
> > file containing .qcow2 and a config file that states that it should prefer a
> > q35 machine", and either approach requires just a single file for the user
> > to download.
> 
> Just to be clear, libvirt isn't going to do either of those things.
> 
> Whether there is metadata stuffed inside qcow2, or in a metadata file
> inside a tar file, libvirt is not going to look inside either of them.
> The XML is the only place libvirt deals with the hardware config.
> 
> Extracting machine type is always going to be a job for the layer above
> such as OpenStack/OVirt/Virt-manager/etc. They will then decide whether
> or not they want to honour that info, and if so, put it into the XML
> they give to libvirt.
> 
> As mentioned elsewhere, IMHO, it is more friendly to those tools
> to use pre-existing formats, eg TAR and XML/JSON, for which
> their respective programming languages already have APIs/parsers.

Libvirt could provide a wrapper around whichever format to extract the
data and provide it to the upper layer.  It could also validate against
it to see if the constraints were met as a service for an upper layer.

Dave

> 
> Regards,
> Daniel
> -- 
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 16:11                                     ` Michal Suchánek
@ 2018-06-06 16:37                                       ` Eric Blake
  0 siblings, 0 replies; 157+ messages in thread
From: Eric Blake @ 2018-06-06 16:37 UTC (permalink / raw)
  To: Michal Suchánek
  Cc: Dr. David Alan Gilbert, Max Reitz, Kevin Wolf, qemu-block,
	Michael S. Tsirkin, armbru, qemu-devel, Richard W.M. Jones,
	stefanha

On 06/06/2018 11:11 AM, Michal Suchánek wrote:

> The reason for storing the config inside the qcow2 file is that you have
> one self-contained file that can be updated by qemu itself.
> 
> So you take an image file, point your management console to it, boot
> it, change something on it, shut it down, and publish the image as the
> new revision of the VM.
> 
> With the tar file the management console needs to chew the tar file,
> save a copy of the config file and the disk image somewhere, and when
> you update the image you have to re-export the tar file.

That's not necessarily true.  We're arguing that as long as the qcow2 
image is last in the tar file, it should still be relatively easy to 
write a qemu block driver that manages tar files for in-place qcow2 
editing.  GNU tar can also do --append or --update to modify the config 
file; although we may have to be careful if that starts to make the 
qcow2 image not the last member in the tar file.  On the other hand, if 
updating the config file is a common enough operation, I would not be 
surprised if we end up teaching the qemu tar driver how to resize images 
even when the qcow2 portion of the tar file is not last, or adding 
qemu-img commands that wrap typical tar file manipulations.
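
A minimal sketch of the layout described above, using Python's tarfile
module: the config goes first, the disk image last, and the byte offset of
the disk payload inside the archive is looked up so that a (hypothetical)
block driver could access it in place. The member names config.json and
disk.qcow2 and both helper functions are assumptions for illustration; the
"disk" here is placeholder bytes, not a valid qcow2 image.

```python
import io
import json
import tarfile


def build_appliance(config, disk_bytes):
    """Write config.json first and disk.qcow2 last into an in-memory tar."""
    buf = io.BytesIO()
    cfg = json.dumps(config).encode()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, payload in (("config.json", cfg), ("disk.qcow2", disk_bytes)):
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def locate_disk(archive_bytes, disk_name="disk.qcow2"):
    """Return (data offset, size, is_last_member) for the disk image."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:") as tar:
        members = tar.getmembers()
    disk = next(m for m in members if m.name == disk_name)
    return disk.offset_data, disk.size, members[-1].name == disk_name


archive = build_appliance({"machine": "q35"}, b"QFI\xfb" + b"\0" * 100)
offset, size, is_last = locate_disk(archive)
```

Because tar stores member data uncompressed and contiguous, the range
[offset, offset + size) is the image itself, and a disk that is the last
member can grow without moving any other member's data.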

> 
> This is a lot of lengthy copying. The upside is that the console can
> check that the machine you are exporting is not running and you are not
> prone to publishing images in inconsistent state.

Read-only qcow2 within a tar does NOT require lengthy copying.  Updating 
qcow2 within tar might, but then again, we are arguing that with a sane 
layout, copying the ENTIRE qcow2 portion is NOT always going to be 
necessary.  For example, modern Linux has things like 
fallocate(FALLOC_FL_INSERT_RANGE) (on file systems that are capable) for 
enlarging the head of a tar file enough to stick in a larger config file 
at the front, with a LOT less data movement effort when compared to 
moving the qcow2 tail of that same tar file.
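
For reference, a hedged sketch of that call from Python via ctypes, since
the os module does not expose FALLOC_FL_INSERT_RANGE. It assumes 64-bit
Linux with glibc and a file system that supports the operation (e.g. XFS
or ext4 with extents); align_up() and insert_range() are names invented
for this example.

```python
import ctypes
import ctypes.util
import os

FALLOC_FL_INSERT_RANGE = 0x20  # value from <linux/falloc.h>


def align_up(n, block):
    """Round n up to the next multiple of block."""
    return -(-n // block) * block


def insert_range(fd, offset, length):
    """Insert a hole of `length` bytes at `offset`, shifting later data up.

    Assumption: 64-bit Linux, where off_t is 64 bits wide. Both offset
    and length must be multiples of the file system block size, and
    offset must lie before the current end of file.
    """
    libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
    libc.fallocate.argtypes = [ctypes.c_int, ctypes.c_int,
                               ctypes.c_int64, ctypes.c_int64]
    libc.fallocate.restype = ctypes.c_int
    if libc.fallocate(fd, FALLOC_FL_INSERT_RANGE, offset, length) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


# Space needed to grow a config member from 900 to 5000 bytes, rounded
# up to a 4096-byte file system block (tar pads member data to 512).
growth = align_up(align_up(5000, 512) - align_up(900, 512), 4096)
```

The inserted range reads back as zeros, so whatever tool does this would
still have to write valid tar content into it afterwards; the point is only
that the qcow2 payload behind it does not have to be copied.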

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 15:09                                 ` Michael S. Tsirkin
@ 2018-06-06 17:06                                   ` Max Reitz
  2018-06-07 21:43                                     ` Michael S. Tsirkin
  0 siblings, 1 reply; 157+ messages in thread
From: Max Reitz @ 2018-06-06 17:06 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Dr. David Alan Gilbert, Kevin Wolf, qemu-block, armbru,
	qemu-devel, Richard W.M. Jones, stefanha

On 2018-06-06 17:09, Michael S. Tsirkin wrote:
> On Wed, Jun 06, 2018 at 04:51:39PM +0200, Max Reitz wrote:
>> On 2018-06-06 16:31, Dr. David Alan Gilbert wrote:
>>> * Max Reitz (mreitz@redhat.com) wrote:
>>>> On 2018-06-06 14:00, Dr. David Alan Gilbert wrote:
>>>>> * Max Reitz (mreitz@redhat.com) wrote:
>>>>>> On 2018-06-06 13:14, Dr. David Alan Gilbert wrote:
>>>>>>> * Max Reitz (mreitz@redhat.com) wrote:
>>>>>>>> On 2018-06-05 11:21, Dr. David Alan Gilbert wrote:
>>>>>>>>> <reawakening a fizzled out thread>
>>>
>>> <snip>
>>>
>>>>>>> The problem with having a separate file is that you either have to copy
>>>>>>> it around with the image 
>>>>>>
>>>>>> Which is just an inconvenience.
>>>>>
>>>>> It's more than that;  if it's a separate file then the tools can't
>>>>> rely on users supplying it, and frankly they won't and they'll still
>>>>> just supply an image.
>>>>
>>>> At which point you throw an error and tell them to specify the config file.
>>>
>>> No:
>>>    a) At the moment they get away with it for images since they're all
>>>       'pc' and the management layers do the right thing.
>>
>> So so far nobody has complained?  I don't really see the problem then.
>>
>> If deploying a disk and using all the defaults works out for users,
>> great.  If they want more options, apparently they already know they
>> have to provide some config.
> 
> QEMU's usability is terrible. There are tons of tools out there to try
> to tame it, but of course they lack the knowledge of the VM internals
> that QEMU has.

Er, yeah, OK.  But it was my understanding that we decided that we have
a management layer on top of qemu to make things simple.

Also, this is once more a case of first deciding what we want at all.
Dave wants configuration options for the upper management layer which
are completely opaque to qemu.  That has nothing to do whatsoever with
the usability of qemu itself.

>>>    b) They'll give the wrong config file - then you'd need to add a flag
>>>      to detect that - which means you'd need to add something to the
>>>      qcow to match it to the config; loop back to the start!
>>
>> I'm not sure how seriously I should take this argument.  Do stupid
>> things, win stupid prizes.
>>
>> If that's the issue, add a UUID to qcow2 files and reference it from the
>> config file.
>>
>>> We should make this EASY for users.
>>
>> To me, having a simple config file they can edit manually certainly
>> seems simpler than having to use specific tools to edit it inside of the
>> qcow2 file.
> 
> I think you are one of the happy users familiar with qemu intricacies
> and/or using a tool on top that does it for you.

Yeah, virt-manager and sometimes libvirt directly.  Works nicely.  In
any case, having to manage more than a single file was never one of my
worries.  In fact, I never had to manage any file because both tools do
it for me.

And again, I don't know what the usability of qemu has to do with what
Dave is proposing.

[...]
>> Because I think (maybe I'm wrong, though) where to store it heavily
>> depends on what we want to store and how we want to use it.
> 
> I don't really see why.

For instance, supporting full-blown appliances would mean supporting
multiple images.  Maybe in multiple formats.  Maybe the user wants
runtime performance and is willing to give up a bit of installation time
for that (e.g. for unpacking an archive).

In any case, if we want to be able to configure every kind of VM, tying
everything to qcow2 seems like a bad idea.  First defining a format and
then deciding on whether it makes sense to be able to put it into qcow2
for certain subcases seems much more reasonable.

And if you make the format decidedly qcow2-independent, the whole
"putting it into qcow2 is the simplest implementation" argument becomes
rather weak.

>>> I've not seen anything that's not for either:
>>>   a) The user to know what the image is
>>
>> I thought the use case was they just downloaded it.
>>
>> Otherwise, they should manage their filenames reasonably, come on.
>> Seriously, adding a cute picture because users are too stupid to manage
>> their VMs is *not* qcow2's problem.
> 
> QEMU is hard to use right and it is QEMU's problem. Users aren't stupid
> but neither do they have the time to learn internals of the tools they
> use.

Technically, it's the users' problem.  It may be qemu's fault, though.

I will not say it is qemu's fault, because I was always told we have a
management layer to make things simple again.  "qemu worries about
execution, management layer worries about policy" is what I was told.

Also, I have no idea what you are talking about.  I gave a very specific
example.  How is adding a picture to a VM disk image going to help
anyone?  If that's the issue people are facing, I would argue they
probably have a multitude of different issues with using qemu, because I
fully agree with you on that point -- using qemu for complex cases is
hard.  Well, no, it's simple, really, but then you probably won't get
the best out of it.  (As can be seen by the fact that some people seem
to start their VM just based on a disk image, and that seems to work...)

So, using qemu in the best way possible is hard.  But a pictogram in a
disk image will not solve that problem.  I was always told that using a
management layer solves the problem.  And as I understood, this was what
Dave's proposal was about, the management layer, not qemu.

I would expect from the management layer to at least make managing VMs
easy.  The management layer can give names.  It can present pictures.
It can manage files.  It can export a config file + disk image so that
it can be imported somewhere else.

Therefore, I don't know what you mean by "learn internals of the tools
they use".  They don't need to do that, if they use a management layer.
All they need to do is to supply everything the management layer may ask
of them, and I do not understand why it is too difficult to request a
plain config file that the user doesn't even need to understand.  They
just need to download it along with the disk image.


But all of that writing once again comes down to this: You are talking
about qemu.  Dave is talking about something higher in the management
layer.  Those are different things, and as I said, we first need to find
common ground there.

This is exactly why I said "where to store it heavily depends on what we
want to store and how we want to use it."  As long as we don't know
that, all of us are using strawman arguments where some other party
suddenly chimes in and says "no, no, no, this is not what I'm talking
about".  Yes, maybe you aren't, but someone else is.

[...]

>>>> But really, if you create a VM, you need a configuration.  Like if you
>>>> set up a new computer, you need to know what you want.  Usually there is
>>>> no sticky label, but you just have to know and input it manually.  Maybe
>>>> you have a sheet of paper, which I'd call the configuration file.
>>>
>>> Most things are figurable-out by the management tools/defaults or
>>> are dependent on the whim of the user - we're only trying to stop the
>>> user doing things that won't work.
>>
>> But what's so bad about an empty screen because the user hasn't read the
>> download description?
> 
> Because user just learns to avoid QEMU as being too hard in the future.

So you want appliances, do I understand that correctly?  Because that is
exactly what Dave doesn't want.

Furthermore, another case of "qemu is too hard to use".  I will not
argue against you there, because that may very well be true, but I will
once again say that I was of the impression that we had management
layers to handle that complexity.

>>> Simpler example; what stops you trying to put the PPC qcow image into
>>> your x86 VM system - nothing that I know of.  I just want to stop the
>>> users shooting themselves in the foot.
>>
>> They haven't shot themselves in the foot, they've just wasted a bit of
>> their time, which could've been avoided by reading before clicking.
>>
>> [...]
> 
> Software developers are being paid for saving people's time.

Very good point, but I did say something like this before: I do not
oppose appliances whatsoever.  In fact, it seems like a nice thing to have.

But, here's the deal: I do not think putting that data into qcow2 to be
the best solution.  Furthermore, I have things to do that I consider
more important than developing an appliance solution.  Therefore, it's
not like I'm sitting around doing nothing when I could be developing a
solution to this issue here.

I kept saying that I consider all of this an inconvenience.  Yes, it
would be nice to have.  But I have things on my to do list that are hard
feature requests, things that people really do need.  We all have.  We
all need to decide how we can use our own time as efficiently as
possible.  And I do not think that developing an appliance solution
would be the best use of my time.  (Until my manager disagrees.)

>>>>>>>>> --------------------------------------------------------------
>>>>>>>>>    
>>>>>>>>>
>>>>>>>>> Some reasoning:
>>>>>>>>>    a) I've avoided the problem of when QEMU interprets the value
>>>>>>>>>       by ignoring it and giving it to management layers at the point
>>>>>>>>>       of VM import.
>>>>>>>>
>>>>>>>> Yes, but in the process you've made it completely opaque to qemu,
>>>>>>>> basically, which doesn't really make it better for me.  Not that
>>>>>>>> qemu-specific information in qcow2 files would be what I want, but, well.
>>>>>>>>
>>>>>>>> But it does solve technical issues, I concede that.
>>>>>>>>
>>>>>>>>>    b) I hate JSON, but there again nailing down a fixed format
>>>>>>>>>       seems easiest and it makes the job of QCOW easy - a single
>>>>>>>>>       string.
>>>>>>>>
>>>>>>>> Not really.  The string can be rather long, so you probably don't want
>>>>>>>> to store it in the image header, and thus it's just a binary blob from
>>>>>>>> qcow2's perspective, essentially.
>>>>>>>
>>>>>>> Yes, but it's a single blob - I'm not asking for multiple keyed blobs
>>>>>>> or the ability to update individual blobs; just one blob that I can
>>>>>>> replace.
>>>>>>
>>>>>> OK, you aren't, but others seem to be.
>>>>>>
>>>>>> Or, well, you call it a single blob.  But actually the current ideas
>>>>>> seem to be to store a rather large configuration tree with binary data
>>>>>> in that blob, so to me personally there is absolutely no functional
>>>>>> difference to just storing a tar file in that blob.
>>>>>>
>>>>>> So correct me if I'm wrong, but to me it appears that you effectively
>>>>>> want to store a filesystem in qcow2.[1]  Well, that's better than making
>>>>>> qcow2 the filesystem, but it still appears just the wrong way around to me.
>>>>>
>>>>> It's different in the sense that what we end up with is still a qcow2;
>>>>> anything that just handles qcow2's and can pass them through doesn't
>>>>> need to do anything different; users don't need to do anything
>>>>> different.  No one has to pack/unpack the file.
>>>>
>>>> Packing/unpacking is a strawman because I'm doing my best to give
>>>> proposals that completely avoid that.
>>>>
>>>> Users do need to do something different, because users do need to
>>>> realize that today there is no way to store VM configuration and disk
>>>> data in a single file.  So if they already start VMs just based on a
>>>> disk, then they are assuming behavior we do not have and that I'd call
>>>> naive.  But that is a strawman from my side, sorry.  Keeping naive users
>>>> happy is probably OK.
>>>
>>> Remember this all works fine now and has done for many years;
>>> it's the addition of q35 that breaks that assumption.
>>> The users can already blindly pick up the qcow2 image and stuff it in
>>
>> Which probably was blind luck already.  And if it wasn't, that means
>> they knew the defaults are what they want.  So now they'd know they
>> aren't and they have to offer a config file along with the disk image.
>>
>>> and it all works; all I want is for that to keep working.
>>
>> And all I say is that it's not unreasonable to expect users to realize
>> that a VM is more than a disk image, just like a computer is more than a
>> disk drive; and that handling two files really is not the end of the world.
>>
>> (And neither is wasting someone's time because they can't read.)
>>
>> Firstly, I agree it's a nice thing to have, but it's not worth it if we
>> don't come up with clear rules on how to prevent developing a full
>> appliance format.
>>
>> Or maybe we want that (because I still believe that you can always come
>> up with obscure options without which the VM won't boot in your specific
>> case), but then this is beyond just storing a tiny bit of data in a
>> qcow2 image.
>>
>> [...]
> 
> Either we'll add more and more data later or we won't. Why worry about
> it from the start? We'll never get anywhere if we do.

That is not a very good argument.  Adding things always means having to
support them later.  It does make a lot of sense to worry about this
burden before starting, and thus trying to find the best possible
solution for the future, not the easiest hack for now.

And as I've said multiple times now, but I can't repeat myself often
enough, I think it would be most efficient if we worried about what we
want to store first, before we worry about where to store it.  I believe
that once we have a hard requirement on what we want to store and how to
use it (that most people agree on), we will have a set of constraints on
how we can represent that data and where it needs to be stored, and this
will give us a simple yes or no to the question whether the data needs
to be stored in qcow2, or whether there is any better way (or whether it
can be stored in qcow2, but need not be).

>>>>>> [1] Yes, I know that the guest disk already contains an FS. :-P
>>>>>>
>>>>>>>>>       (I would suggest in layer2 that the keys are sorted, but
>>>>>>>>>       that's a pain to do in some json creators)
>>>>>>>>>    c) Forcing the registry of keys might avoid silly duplication.
>>>>>>>>>       We can but hope.
>>>>>>>>>    d) I've not said it's a libvirt XML file since that seems
>>>>>>>>>       a bit prescriptive.
>>>>>>>>>
>>>>>>>>> Some initial suggested keys:
>>>>>>>>>
>>>>>>>>>    "qemu.machine-types": [ "q35", "i440fx" ]
>>>>>>>>>    "qemu.min-ram-MB": 1024
>>>>>>>>
>>>>>>>> I still don't understand why you'd want to put the configuration into
>>>>>>>> qcow2 instead of the other way around.
>>>>>>>>
>>>>>>>> Or why you'd want to use a single file at all, because as this whole
>>>>>>>> thread shows, a disk image alone is clearly not sufficient to describe a VM.
>>>>>>>>
>>>>>>>> (Or it may be in simple cases, but then that's because you don't need
>>>>>>>> any configuration.)
>>>>>>>
>>>>>>> Because it avoids the unpacking associated with archives.
>>>>>>
>>>>>> I'm not talking about unpacking.  I'm talking about a potentially new
>>>>>> format which allows accessing the qcow2 file in-place.  It would
>>>>>> probably be trivial to write a block driver to allow this.
>>>>>>
>>>>>> (And as I wrote in my response to Michal, I suspect that tar could
>>>>>> actually allow this, even though it would probably not be the ideal format.)
>>>>>
>>>>> As above, I don't think this is trivial; you have to change all the
>>>>> layers;  let's say it was a tar; you'd have to somehow know that you're
>>>>> importing one of these special tars,
>>>>
>>>> Which is trivial because it's just "Hey, look, it's a tar with that
>>>> description file".
>>>
>>> Trivial? It's taking 100+ mails to add a tag to a qcow2 file! Can you
>>> imagine what it takes to change libvirt, openstack, ovirt and the rest?
>>
>> :-)
>>
>> The implementation is trivial is what I meant, just like the
>> implementation would be rather simple for qcow2 to store a binary blob
>> and completely ignore it.
> 
> Old QEMU can't handle tar files. You need to unpack them,
> then figure out that there are two files in the tar, one
> is just for new qemu versions, one is portable. At which point
> you need to go figure out what is your QEMU version.

And old qemu versions will just give you a blank screen for a qcow2 file
with required non-default options.
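
As an aside on the in-place tar idea quoted above: a minimal sketch,
assuming an uncompressed tar, of why no unpacking would be needed.  Tar
stores each member's data contiguously after its header, so a reader
only needs a byte offset and a length.  The member names
("vm-config.json", "disk.qcow2") and the stand-in image contents are
made up for illustration; this is not an existing qemu block driver.

```python
import io
import tarfile

# Stand-in for a disk image (illustrative only); real qcow2 files start
# with the magic bytes "QFI\xfb".
DISK = b"QFI\xfb" + bytes(1020)
CONFIG = b'{"qemu.machine-types": ["q35"]}'


def build_archive() -> bytes:
    """Build an uncompressed tar holding a config file and a disk image."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in (("vm-config.json", CONFIG), ("disk.qcow2", DISK)):
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def locate_member(archive: bytes, name: str) -> tuple[int, int]:
    """Return (offset, size) of a member's data inside the archive."""
    # Mode "r:" accepts uncompressed archives only, which is what makes
    # the offset meaningful for direct access.
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:") as tar:
        info = tar.getmember(name)
        return info.offset_data, info.size


archive = build_archive()
offset, size = locate_member(archive, "disk.qcow2")

# Read the image directly out of the archive bytes, without extracting it.
in_place = archive[offset:offset + size]
print(in_place == DISK)
```

A reader that knows the offset and size could treat that byte range as
the image.  Writes that grow the image would be the awkward part, which
fits the remark that tar would probably not be the ideal format.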

Max


^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 15:05                                 ` Dr. David Alan Gilbert
  2018-06-06 15:36                                   ` Eric Blake
@ 2018-06-06 17:49                                   ` Max Reitz
  1 sibling, 0 replies; 157+ messages in thread
From: Max Reitz @ 2018-06-06 17:49 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Kevin Wolf, qemu-block, Michael S. Tsirkin, armbru, qemu-devel,
	Richard W.M. Jones, stefanha

On 2018-06-06 17:05, Dr. David Alan Gilbert wrote:
> * Max Reitz (mreitz@redhat.com) wrote:
>> On 2018-06-06 16:31, Dr. David Alan Gilbert wrote:
>>> * Max Reitz (mreitz@redhat.com) wrote:
>>>> On 2018-06-06 14:00, Dr. David Alan Gilbert wrote:
>>>>> * Max Reitz (mreitz@redhat.com) wrote:
>>>>>> On 2018-06-06 13:14, Dr. David Alan Gilbert wrote:
>>>>>>> * Max Reitz (mreitz@redhat.com) wrote:
>>>>>>>> On 2018-06-05 11:21, Dr. David Alan Gilbert wrote:
>>>>>>>>> <reawakening a fizzled out thread>
>>>
>>> <snip>
>>>
>>>>>>> The problem with having a separate file is that you either have to copy
>>>>>>> it around with the image 
>>>>>>
>>>>>> Which is just an inconvenience.
>>>>>
>>>>> It's more than that;  if it's a separate file then the tools can't
>>>>> rely on users supplying it, and frankly they won't and they'll still
>>>>> just supply an image.
>>>>
>>>> At which point you throw an error and tell them to specify the config file.
>>>
>>> No:
>>>    a) At the moment they get away with it for images since they're all
>>>       'pc' and the management layers do the right thing.
>>
>> So so far nobody has complained?  I don't really see the problem then.
>>
>> If deploying a disk and using all the defaults works out for users,
>> great.  If they want more options, apparently they already know they
>> have to provide some config.
> 
> This problem all came about because of q35.  We can't change defaults to
> use q35 because importing existing images might break so we need to
> start flagging stuff as q35 - all we're trying to do is make stuff no
> more broken than today.

Well, I really don't want to go into this much further.  I understand
that you see it as a bug prevention.  But as I've said, we've always had
this issue, e.g. with minimum RAM required.  It's nothing new.

But the most important thing is that other people seem to really
disagree with your "I don't want it to become an appliance" stance.

>>>    b) They'll give the wrong config file - then you'd need to add a flag
>>>      to detect that - which means you'd need to add something to the
>>>      qcow to match it to the config; loop back to the start!
>>
>> I'm not sure how seriously I should take this argument.  Do stupid
>> things, win stupid prizes.
>>
>> If that's the issue, add a UUID to qcow2 files and reference it from the
>> config file.
> 
> Is a UUID a small string :-)

Well, it certainly isn't unheard of for disk image formats, so I'm much
more inclined to include it.

Though UUIDs tend to be nasty to handle, but as long as they don't
change...  Oh well.
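
A minimal sketch of that UUID cross-check, under loud assumptions:
qcow2 has no UUID field today, so image_uuid stands for whatever a
hypothetical tool would read from the image, and the "disk-uuid" config
key is invented for illustration.

```python
import json
import uuid


def config_matches_image(config_text: str, image_uuid: uuid.UUID) -> bool:
    """Return True if the config file references exactly this image."""
    config = json.loads(config_text)
    wanted = config.get("disk-uuid")  # hypothetical key
    if not isinstance(wanted, str):
        return False
    try:
        return uuid.UUID(wanted) == image_uuid
    except ValueError:
        # The config contained something that is not a valid UUID.
        return False


# image_uuid would come from the image file; here it is simply made up.
image_uuid = uuid.UUID("12345678-1234-5678-1234-567812345678")
right_config = json.dumps({"disk-uuid": str(image_uuid)})
wrong_config = json.dumps({"disk-uuid": "87654321-4321-8765-4321-876543218765"})

print(config_matches_image(right_config, image_uuid))
print(config_matches_image(wrong_config, image_uuid))
```

With a check like this, the "wrong config file" case turns into an
explicit error at import time rather than a VM that silently fails to
boot.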

>>> We should make this EASY for users.
>>
>> To me, having a simple config file they can edit manually certainly
>> seems simpler than having to use specific tools to edit it inside of the
>> qcow2 file.
> 
> The users never touch the tools; they click and import the VM image.

Not really, no.  I know these are the end users, but others (the
providers) probably do want to edit the config file.

[...]

>>>>>>>                                                 Storing a simple bit
>>>>>>> of data with the image avoids that.
>>>>>>
>>>>>> It is not a simple bit of data, as evidenced by the discussion about
>>>>>> storing binary blobs and MIME types going on.
>>>>>
>>>>> All of the things they've suggested can be done inside that one blob;
>>>>> even inside the json (or any other structure in that blob).
>>>>
>>>> Right, from qcow2's perspective it's a blob of data.  But you can put a
>>>> whole filesystem into a blob of data, and I get the impression that this
>>>> is what some are trying to do.
>>>>
>>>> Once we store larger amounts of binary data in that blob (which is what
>>>> I'm fearing from comments on MIME types and PNG images), people will
>>>> realize that always having to re-store the whole blob if you modify
>>>> something in the middle is inefficient and that it needs to be
>>>> optimized.  I don't think you want to do that, but we haven't
>>>> implemented any of this yet and people are already asking for such
>>>> binary data inside of the blob.
>>>>
>>>> I suspect it'll only get worse over time.
>>>> I think the most difficult thing about this discussion is that there are
>>>> different targets.
>>>>
>>>> You just want to store a bit of information.  OK, good, but then I'd say
>>>> we could even just prepend that to the image file in a small header.
>>>
>>>
>>> I think you're over-reading what people are asking for.
>>> I think the PNG suggestion is again the 'label on the front' for a logo.
>>
>> Which is OK if you store like everything, but very much over the top for
>> your suggestion.  Again, different people want different things and I
>> feel like that is the real discussion we should be having right now and
>> not necessarily where to store it.
>>
>> Because I think (maybe I'm wrong, though) where to store it heavily
>> depends on what we want to store and how we want to use it.
>>
>>> I've not seen anything that's not for either:
>>>   a) The user to know what the image is
>>
>> I thought the use case was they just downloaded it.
> 
> Or pulled it from that big directory of images.

I don't know what to say.

Maybe this: I do not believe you can convince me that this is a
reasonable use case.  Please do not make me fix the problems people get
because they cannot identify their own files.

>> Otherwise, they should manage their filenames reasonably, come on.
>> Seriously, adding a cute picture because users are too stupid to manage
>> their VMs is *not* qcow2's problem.
> 
> Well, it's someone's problem; we already have magic to display those
> images in some of the higher level tools.

Sure.  virt-manager gives me cute pictures for my VMs, and I'm grateful
for that.  (Actually just cute names, but you know.)

But I absolutely do not see this as a qcow2-level issue.  It might be an
appliance-level issue, though.

>>>   b) The management layer to know what type of VM to create
>>
>> Apparently this is really what you want.  I really still don't see the
>> difficulty in supplying a config file (or the danger in not doing so, or
>> in supplying the wrong one), but, hey, it would be a nice feature indeed.
>>
>> (I just don't like the tradeoff in complexity.)
> 
> Remember, give this to someone who doesn't understand what the
> difference is between the machine types etc.

But you don't tell them "Configure this machine type".  There are two
download links.  One says "Disk image".  The other says "VM config
file".  The description says to download both and to import the VM
config file into their management application, which will then proceed
to ask for the disk image (automatically, perchance!), and that's it.
No understanding required.
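
Roughly, as a sketch only: the import path of such a management
application could look like the following.  The keys
"qemu.machine-types" and "qemu.min-ram-MB" are the ones Dave suggested
earlier in the thread; the "disk-image" key, the function name and the
error type are hypothetical and not taken from any existing tool.

```python
import json
import os
import tempfile


class VmImportError(Exception):
    """Raised when a VM config cannot be turned into a runnable VM."""


def import_vm(config_path: str) -> dict:
    """Read the VM config first, then locate the disk image it names."""
    with open(config_path) as f:
        config = json.load(f)

    # "disk-image" is a hypothetical key: image path relative to the config.
    disk_path = os.path.join(os.path.dirname(config_path), config["disk-image"])
    if not os.path.exists(disk_path):
        raise VmImportError(
            f"disk image {config['disk-image']!r} is missing; "
            "download it next to the config file"
        )

    return {
        # Assumption: the first listed machine type is the preferred one.
        "machine": config["qemu.machine-types"][0],
        "ram_mb": config["qemu.min-ram-MB"],
        "disk": disk_path,
    }


def demo() -> None:
    with tempfile.TemporaryDirectory() as d:
        config_path = os.path.join(d, "vm.json")
        with open(config_path, "w") as f:
            json.dump(
                {
                    "qemu.machine-types": ["q35", "i440fx"],
                    "qemu.min-ram-MB": 1024,
                    "disk-image": "disk.qcow2",
                },
                f,
            )
        try:
            import_vm(config_path)  # the disk image has not been downloaded yet
        except VmImportError as e:
            print("import failed:", e)

        with open(os.path.join(d, "disk.qcow2"), "wb") as f:
            f.write(b"\0" * 512)  # placeholder image contents
        print(import_vm(config_path)["machine"])


if __name__ == "__main__":
    demo()
```

The point of starting from the config file is that a forgotten disk
image becomes a clear error message instead of a blank screen.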

[...]

>>>> And really, I still believe in my slippery slope argument, which means
>>>> that even if you just want to innocently store a machine type, we will
>>>> end up with something vastly more complex in the end.
>>>>
>>>> Finally, it appears to me that you have a simple problem, found one
>>>> possible solution, and now you just focus on that solution instead of
>>>> taking a step back and looking at the problem again.
>>>>
>>>> The problem: You want to store a binary blob and a disk image together.
>>>>
>>>> Your solution: qcow2 has refcounting and thus "occupation bits".  You
>>>> can put data into it and it will leave it alone, as long as that area is
>>>> marked as occupied.  Let's put the data into the qcow2 file.
>>>>
>>>> OK, let's look at the problem and its constraints again.
>>>>
>>>> Hard constraint: Store a single file.
>>>> (I don't think this is a hard constraint, because I haven't been
>>>> convinced yet that handling more than a single file is so bad.)
>>>
>>> See above; I think it is.
>>
>> I know, but you haven't convinced me yet. :-)
>>
>>> My other hard constraint is that no tool has to change unless
>>> it wants to make use of the new data.
>>
>> Sure that it isn't a soft constraint?  If most tools can stay unchanged
>> but some very specific ones have to be changed, that seems reasonable to me.
> 
> The hard constraint is the normal path stays unchanged; we can change
> the tools to make use of the extra data, but not change what's out
> there.

Ah, right, because you want the data to be visible in previously legacy
VMs, too.  I see.  Though I'd argue that legacy VMs are already covered
by a management application which can store all of that information
somewhere else.  Managing multiple files is easy for a management
application and never visible to the user.  Exporting and importing VMs
is the point at which it would get visible, but I'd think that is just a
temporary state (before the VM is imported into the user's management
application, at which point that target application can just create its
own configuration file again).

>>>> Soft constraint: Max doesn't like storing blobs in qcow2.
>>>>
>>>> So one solution is to ignore the soft constraint.  OK, valid solution, I
>>>> give you that.  But it doesn't leave me content, probably understandably so.
>>
>> [...]
>>
>>>> But really, if you create a VM, you need a configuration.  Like if you
>>>> set up a new computer, you need to know what you want.  Usually there is
>>>> no sticky label, but you just have to know and input it manually.  Maybe
>>>> you have a sheet of paper, which I'd call the configuration file.
>>>
>>> Most things are figurable-out by the management tools/defaults or
>>> are dependent on the whim of the user - we're only trying to stop the
>>> user doing things that won't work.
>>
>> But what's so bad about an empty screen because the user hasn't read the
>> download description?
> 
> Because it's got to be EASY for the customer; seriously - stop punishing
> the user for not noticing something.
> We've got to help the users, if not we get people asking why their VM
> system has given them a black screen, or why the image they just
> downloaded didn't work - it's basic user friendliness.

Again, to me that user friendliness would be provided by not telling
people to download disk images, but by telling them to download
configuration files, which get imported as a VM, at which point the
application asks for the disk image.

> It's not obvious why it's failed; if it was as simple as a nice box
> popping up telling them they'd booted it wouldn't be too bad; but some
> of them will waste 3 hours trying to figure out wth happened.

If they imported the config file instead of the disk first, they'd get
the box.

> *seriously* think about our users.

The thing is that the only report I've heard about something like this
is from myself.  I always had issues with remembering to pump up the
minimum amount of RAM required to boot a Linux image.

But I do realize that if the download page not only offered a raw disk
image, but a qemu config file with it, I would have used it.

>>> Simpler example; what stops you trying to put the PPC qcow image into
>>> your x86 VM system - nothing that I know of.  I just want to stop the
>>> users shooting themselves in the foot.
>>
>> They haven't shot themselves in the foot, they've just wasted a bit of
>> their time, which could've been avoided by reading before clicking.
> 
> *seriously* think about our users.

I do and I realize that setting the machine type is not sufficient.
What you are now asking for is an appliance.

I don't like to repeat myself again and again, but I do still think an
appliance would be nice to have.  *But*:

I do not think qemu is the right place to manage it, though I may be
wrong.  Also, you do not want it to become an appliance.

And most importantly, if we want appliances, we have to think seriously
about it and not just handwave it as "Max doesn't want a tiny bit of
data in a qcow2 file, what a party spoiler.  Let's just force him or
make someone else add support and start some random thing, because we
gotta start somewhere, right?"

I do not get that impression from you, I should say.  I do get the
impression that you think more seriously about this than I do, but I
also believe that you should gather what everybody wants instead of just
arguing with me that it needs to be in qcow2.

I do not believe that I have any authority on what configuration options
we need to store whatsoever.  I can only give hunches there and what I'd
find useful.

But I do believe that I have some authority over what makes sense in
qcow2 and what doesn't.  So if you get a consensus on what to store (and
there is no consensus on that whatsoever), you can indeed make me add
support in qcow2, because I believe at that point there will be very
good arguments for adding such support.

So as long as we are in this thread which bears a "qcow" in its subject,
I will respond and say "no".

>>>>>>>>> --------------------------------------------------------------
>>>>>>>>>    
>>>>>>>>>
>>>>>>>>> Some reasoning:
>>>>>>>>>    a) I've avoided the problem of when QEMU interprets the value
>>>>>>>>>       by ignoring it and giving it to management layers at the point
>>>>>>>>>       of VM import.
>>>>>>>>
>>>>>>>> Yes, but in the process you've made it completely opaque to qemu,
>>>>>>>> basically, which doesn't really make it better for me.  Not that
>>>>>>>> qemu-specific information in qcow2 files would be what I want, but, well.
>>>>>>>>
>>>>>>>> But it does solve technical issues, I concede that.
>>>>>>>>
>>>>>>>>>    b) I hate JSON, but there again nailing down a fixed format
>>>>>>>>>       seems easiest and it makes the job of QCOW easy - a single
>>>>>>>>>       string.
>>>>>>>>
>>>>>>>> Not really.  The string can be rather long, so you probably don't want
>>>>>>>> to store it in the image header, and thus it's just a binary blob from
>>>>>>>> qcow2's perspective, essentially.
>>>>>>>
>>>>>>> Yes, but it's a single blob - I'm not asking for multiple keyed blobs
>>>>>>> or the ability to update individual blobs; just one blob that I can
>>>>>>> replace.
>>>>>>
>>>>>> OK, you aren't, but others seem to be.
>>>>>>
>>>>>> Or, well, you call it a single blob.  But actually the current ideas
>>>>>> seem to be to store a rather large configuration tree with binary data
>>>>>> in that blob, so to me personally there is absolutely no functional
>>>>>> difference to just storing a tar file in that blob.
>>>>>>
>>>>>> So correct me if I'm wrong, but to me it appears that you effectively
>>>>>> want to store a filesystem in qcow2.[1]  Well, that's better than making
>>>>>> qcow2 the filesystem, but it still appears just the wrong way around to me.
>>>>>
>>>>> It's different in the sense that what we end up with is still a qcow2;
>>>>> anything that just handles qcow2's and can pass them through doesn't
>>>>> need to do anything different; users don't need to do anything
>>>>> different.  No one has to pack/unpack the file.
>>>>
>>>> Packing/unpacking is a strawman because I'm doing my best to give
>>>> proposals that completely avoid that.
>>>>
>>>> Users do need to do something different, because users do need to
>>>> realize that today there is no way to store VM configuration and disk
>>>> data in a single file.  So if they already start VMs just based on a
>>>> disk, then they are assuming behavior we do not have and that I'd call
>>>> naive.  But that is a strawman from my side, sorry.  Keeping naive users
>>>> happy is probably OK.
>>>
>>> Remember this all works fine now and has done for many years;
>>> it's the addition of q35 that breaks that assumption.
>>> The users can already blindly pick up the qcow2 image and stuff it in
>>
>> Which probably was blind luck already.  And if it wasn't, that means
>> they knew the defaults are what they want.  So now they'd know they
>> aren't and they have to offer a config file along with the disk image.
> 
> No, it's not blind luck; it's that the management tools know how to get
> it right.

Good!  I mean it.

>>> and it all works; all I want is for that to keep working.
>>
>> And all I say is that it's not unreasonable to expect users to realize
>> that a VM is more than a disk image, just like a computer is more than a
>> disk drive; and that handling two files really is not the end of the world.
>>
>> (And neither is wasting someone's time because they can't read.)
> 
> 
> *seriously* think about our users.

You haven't yet told me why expecting users to download two files
instead of one is so bad.  You just say they won't.

OK, so I'd say the VM config is more important.  That should be
displayed prominently and users should download that first, and that
would be nice because we could trivially emit errors when users forget
to download the disk images with them.

But you say that we unfortunately have reached a point where everyone
already uses disk images and nobody uses config files, so training
people to do something else is going to be practically impossible.

I can see your point, but I don't like that you accuse me of hating
users when I'm just saying that they've been doing it wrong.

Also, blackmail doesn't work.  As long as there is no consensus on what
to store, I don't like being told "but it's so easy" and "it will help a
great deal, you just hate our users".


Also, you were saying "shoot themselves in the foot".  To me that meant
they could seriously break something while that was apparently not the
case.  It just meant that they wasted some time, which yes, is bad
enough (not least because time is money), but it's not the end of the
world.  That is what I meant by everything you replied to with "think
about our users".

>> Firstly, I agree it's a nice thing to have, but it's not worth it if we
>> don't come up with clear rules on how to prevent developing a full
>> appliance format.
>>
>> Or maybe we want that (because I still believe that you can always come
>> up with obscure options without which the VM won't boot in your specific
>> case), but then this is beyond just storing a tiny bit of data in a
>> qcow2 image.
> 
> I don't want to protect them from really trying to shoot themselves in
> the foot; I just want to make sure the easy-path works.  Download an
> image, tell the tool to import; VM works. All good.

OK, good, that is a good use case that I understand.  I have exactly two
issues with it:

(1) I am not sure how far the easy path goes.  Ultimately, it does mean
an appliance, to which in principle I am not opposed.

But there are many open questions for which there just is no consensus
yet.  What application is that appliance for? (qemu? libvirt? Some other
management application?)  Do we actually want a full-blown appliance?
Do we really want just qcow2 for appliances?

Before these questions are answered in a consensus, there is just no
reason to discuss what the best kind of representation is.

(2) And after this is answered, someone has to decide for themselves
that they think working on this is important enough.


I do assume that once we (or you, because I do not have real authority
there) have a consensus on (1), it will be evident
whether we need qcow2 support, or more generally, what kind of qemu
block layer support is required.  I just ask of you not to assume you'll
need qcow2 support beforehand.

Once you know you need qcow2 support for very specific reasons and for a
rather specific use case, we can continue the qcow2-specific part.  If
you have good reasons and a specific use case that multiple people agree
on, I will not oppose it and if I have the time, I can implement it
myself, if you need me to.


But the current hand-waving where everyone I'm talking with wants
something else (and nobody but you seems really clear about what they
want specifically) really does not make me want to add or back support now.

[...]

>>>>>> I'm not talking about unpacking.  I'm talking about a potentially new
>>>>>> format which allows accessing the qcow2 file in-place.  It would
>>>>>> probably be trivial to write a block driver to allow this.
>>>>>>
>>>>>> (And as I wrote in my response to Michal, I suspect that tar could
>>>>>> actually allow this, even though it would probably not be the ideal format.)
>>>>>
>>>>> As above, I don't think this is trivial; you have to change all the
>>>>> layers;  let's say it was a tar; you'd have to somehow know that you're
>>>>> importing one of these special tars,
>>>>
>>>> Which is trivial because it's just "Hey, look, it's a tar with that
>>>> description file".
>>>
>>> Trivial? It's taking 100+ mails to add a tag to a qcow2 file! Can you
>>> imagine what it takes to change libvirt, openstack, ovirt and the rest?
>>
>> :-)
>>
>> The implementation is trivial is what I meant, just like the
>> implementation would be rather simple for qcow2 to store a binary blob
>> and completely ignore it.
> 
> But then you'd have people shipping .newformat files as well as qcow2
> files and you'd have to persuade people to start doing that, and they'd
> ship both or none or....

Hm.  Reasonable point.

I mean, I'd say it's not so hard, for multiple reasons.

First, I'd say that our users can figure it out.  But you don't think
they can figure out downloading two files (which may be right!  I can
understand that people can't know everything about every single tool
they use, and that it really is unreasonable to ask that of them,
although I do believe that people can intuitively know that a VM needs a
config file, which is why I'm still not of your exact opinion).  So if
they don't "want" to download two files, it is going to be difficult to
make them provide the new format.

Secondly, as far as I have understood you (and I mean you and not e.g.
Michael), you are mainly worried about the management layer.  I would
assume that the management layer could give users a specific way of
exporting VMs which makes things simpler than them having to find the
disk image themselves.  This would make exporting in the new format
trivial, because they'd do that automatically.

(This exporting process might allow exporting just the disk image, while
emitting a warning that this does not include VM configuration options.)

Max


^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 15:25                                         ` Michal Suchánek
@ 2018-06-06 18:02                                           ` Max Reitz
  2018-06-06 18:33                                             ` Michal Suchánek
  0 siblings, 1 reply; 157+ messages in thread
From: Max Reitz @ 2018-06-06 18:02 UTC (permalink / raw)
  To: Michal Suchánek
  Cc: Dr. David Alan Gilbert, Kevin Wolf, ehabkost, qemu-block,
	Michael S. Tsirkin, Richard W.M. Jones, qemu-devel, stefanha

On 2018-06-06 17:25, Michal Suchánek wrote:
> On Wed, 6 Jun 2018 16:55:08 +0200
> Max Reitz <mreitz@redhat.com> wrote:
> 
>> On 2018-06-06 16:41, Dr. David Alan Gilbert wrote:
>>> * Max Reitz (mreitz@redhat.com) wrote:  
>>
>> [...]
>>
>>>> So why is it so dangerous to connect a disk you just downloaded to
>>>> e.g. the wrong machine type?  I assumed it just wouldn't work and
>>>> you'd try again, until you realized that maybe you should read the
>>>> download description and do as it says ("download this config
>>>> file, pass it").  
>>>
>>> That's bad!  Stuff should just-work;  
>>
>> That's how it always should be.  Life's tough, though.
>>
>>>                                      it currently just works,  
>>
>> Due to sheer blind luck, I'd say.
> 
> It's TimeProvenSolution(tm).
> 
>>
>>>                                                               things
>>> should get better and easier for our users.  
>>
>> Users using a whole VM stack plus management, but then handling two
>> files instead of one is too much to ask?
> 
> What you don't seem to realize is there are cases when there is an
> 'administrator' who has set up the VM stack plus management and 'joe
> user' who wants to run some random VM on that stack.
> 
> And if you download an appliance compatible with the stack it should
> just work. For a long time the 'appliance' for qemu based
> virtualization was a simple qcow2 file which was sized sufficiently for
> the VM to run but shrunk for transport. And although it is technically
> wrong it JustWorked(tm).

Hm, yes.  As I replied to Dave, I understand, but I would think this
then requires a real appliance solution.  I think you do want such a
solution, but Dave doesn't.

My problem is that I cannot accept Dave's arguments on why to include
this blob in qcow2 if someone else already plans on making that blob the
basis for qcow2 appliances.

And I still do not think that qcow2 is the right format for VM
appliances.  To convince me, we'd first need a consensus on what the
appliances are for (Michael seems to want them for qemu directly,
apparently you want them for something higher up the stack) and thus
what they are supposed to be capable of exactly.

Like, one thing that is important to discuss is this (but please not in
this thread...): If we agree on making an appliance format (qcow2 or
not), is it for running VMs off or do we just want it for VM
export/import?  The former might mean we need qcow2, because there is no
good way to offer good performance with multiple disks otherwise (but
this would constrain us e.g. in the disk image format -- no raw images
for you, then).  But the latter can work just fine with a normal
archival format as long as building/decomposing it is possible without
copying.

(I would think that you can move blocks from one file to another, so
with proper alignment you should be able to build/decompose an archive
from/into its members without copying.)
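
To make the alignment point concrete, here is a minimal sketch of the
arithmetic.  It assumes a plain tar-style layout (a 512-byte header in
front of each member, member data padded to 512 bytes) and a 4096-byte
filesystem block size; both figures are assumptions for illustration,
and how the filler would be represented inside a real archive is left
open.

```python
TAR_BLOCK = 512   # tar header size and data padding granularity
FS_BLOCK = 4096   # assumed filesystem block size (hypothetical choice)


def padding_before_header(offset, fs_block=FS_BLOCK):
    """Bytes of filler to place at `offset` so that the member data,
    which follows a 512-byte tar header, starts on an fs_block boundary."""
    data_start = offset + TAR_BLOCK
    return (-data_start) % fs_block


def layout(member_sizes, fs_block=FS_BLOCK):
    """Return (header_offset, data_offset) for each member size."""
    offset = 0
    result = []
    for size in member_sizes:
        offset += padding_before_header(offset, fs_block)
        data_offset = offset + TAR_BLOCK
        result.append((offset, data_offset))
        # tar pads member data up to the next 512-byte boundary
        offset = data_offset + (-(-size // TAR_BLOCK)) * TAR_BLOCK
    return result
```

With every data offset being a multiple of the block size, whole blocks
of a member could in principle be handed over to a separate file rather
than copied, which is the property the paragraph above relies on.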

>>>                                              And anyway, not
>>> working for EFI for example can be just a blank screen.  Seriously
>>> - keep it easy for the user!  
>>
>> Thinking this through makes you end up with appliances.
> 
> And those can in general have more than one disk.

Indeed.  And thus knowing what we want is important so we can make a
good decision on where to store it instead of just focusing on where it
would be apparently simplest.

Max



* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 16:10                                     ` Eduardo Habkost
@ 2018-06-06 18:09                                       ` Max Reitz
  0 siblings, 0 replies; 157+ messages in thread
From: Max Reitz @ 2018-06-06 18:09 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Michal Suchánek, Kevin Wolf, qemu-block, Michael S. Tsirkin,
	Richard W.M. Jones, qemu-devel, stefanha


On 2018-06-06 18:10, Eduardo Habkost wrote:
> On Wed, Jun 06, 2018 at 04:17:14PM +0200, Max Reitz wrote:
>> On 2018-06-06 15:45, Michal Suchánek wrote:

[...]

>>> I understand that for some use cases simplifying the distribution of
>>> VMs as much as possible is quite important.
>>
>> I don't because still nobody has explained it to me.
>>
>> The only explanation I got so far was "People are lazy and we have
>> defaults for everything, so we don't throw an error if people forget to
>> pass a configuration file."
> 
> People don't pass a configuration file today because there's no
> standard for such a configuration file.  qcow2 is already used
> today as an appliance file format because there's no better
> option.  People download disk images from appliance and OS
> providers, import them into a cloud system, and it works out of
> the box because (luckily) "pc" is enough for most of them.

OK.

> We can specify a true appliance file format, and ask people to
> use it.  But then providers of single-disk appliances and OSes
> will need to publish two appliance images: qcow2 disk image for
> old systems that don't support the new format, and one in the new
> appliance format, for systems that support it.

True.  This is a valid argument for making a compatible change to qcow2.
But: If these options are actually important, the qcow2 file is not
really compatible either.  You might end up with a blank screen again.

Depending on how we'd design a new format, it might contain a config
file and a qcow2 file which may be easily extractable.  If so, users of
legacy software could easily extract the qcow2 file and might even be
tempted to look at the config file when it doesn't work, and maybe that
could even help them.
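
As a sketch of what "easily extractable" could look like for legacy
software, assume the new format were simply a tar archive with the disk
stored under a fixed member name.  The name "disk.qcow2" below is
hypothetical; no layout has been agreed on.

```python
import shutil
import tarfile


def extract_disk(archive_path, dest_path, disk_name="disk.qcow2"):
    """Copy only the disk image member of the archive to dest_path.

    `disk_name` is a hypothetical member name used for illustration.
    """
    with tarfile.open(archive_path) as tar:
        member = tar.getmember(disk_name)   # raises KeyError if absent
        if not member.isfile():
            raise ValueError(f"{disk_name} is not a regular file")
        # Stream the member out instead of unpacking the whole archive.
        with tar.extractfile(member) as src, open(dest_path, "wb") as dst:
            shutil.copyfileobj(src, dst)
    return dest_path
```

A tool that knows nothing about the config file would then just be
handed the extracted qcow2 file as before.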

Also, I think that it is not too unreasonable to ask providers to
provide two formats.

But that does not make me disregard your argument.  It is still valid,
especially with your convenience note below.

(Providers could choose whether it is best for them to include the
description in the qcow2 file, or whether to offer an archive that
contains e.g. a raw file.)

>> Which to me still just makes it an inconvenience.
> 
> Well, there are small inconveniences and there are big
> inconveniences that together make a system unnecessarily hard to
> use.  I'd say this one falls somewhere in the middle.

Hm, OK.

> [...]
>> I'm noticing a pattern here, and that is that everybody has a different
>> opinion on what we actually want in the end, and it's just by chance
>> that we find ourselves in two camps ("put it in qcow2" vs. "put it
>> somewhere else").
>>
>> Maybe we should first discuss what we actually want before we can
>> discuss where to put it.
> 
> I'm inclined to agree.  Once we figure out a good VM description
> format, we can justify a proposal to allow embedding the VM
> description in qcow2 for convenience.

OK.

Max



* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 18:02                                           ` Max Reitz
@ 2018-06-06 18:33                                             ` Michal Suchánek
  2018-06-06 18:36                                               ` Eduardo Habkost
  0 siblings, 1 reply; 157+ messages in thread
From: Michal Suchánek @ 2018-06-06 18:33 UTC (permalink / raw)
  To: Max Reitz
  Cc: Kevin Wolf, ehabkost, qemu-block, Michael S. Tsirkin, qemu-devel,
	Richard W.M. Jones, Dr. David Alan Gilbert, stefanha


On Wed, 6 Jun 2018 20:02:54 +0200
Max Reitz <mreitz@redhat.com> wrote:

> On 2018-06-06 17:25, Michal Suchánek wrote:
> > On Wed, 6 Jun 2018 16:55:08 +0200
> > Max Reitz <mreitz@redhat.com> wrote:
> >   
> >> On 2018-06-06 16:41, Dr. David Alan Gilbert wrote:  
> >>> * Max Reitz (mreitz@redhat.com) wrote:    

> >> Users using a whole VM stack plus management, but then handling two
> >> files instead of one is too much to ask?  
> > 
> > What you don't seem to realize is there are cases when there is an
> > 'administrator' who has set up the VM stack plus management and 'joe
> > user' who wants to run some random VM on that stack.
> > 
> > And if you download an appliance compatible with the stack it should
> > just work. For a long time the 'appliance' for qemu based
> > virtualization was a simple qcow2 file which was sized sufficiently
> > for the VM to run but shrunk for transport. And although it is
> > technically wrong it JustWorked(tm).  
> 
> Hm, yes.  As I replied to Dave, I understand, but I would think this
> then requires a real appliance solution.  I think you do want such a
> solution, but Dave doesn't.

Yes, Dave wants a poor man's half-assed appliance and insists on it not
being an appliance. Duh.

In the pc world to maintain the status quo with minimum changes you
only need to know if the image uses EFI or legacy BIOS and you can
maintain the illusion that the TimeProvenSolution(tm) JustWorks(tm).

Sneaking that single piece of information somewhere seems to be the
goal here.

> 
> My problem is that I cannot accept Dave's arguments on why to include
> this blob in qcow2 if someone else already plans on making that blob
> the basis for qcow2 appliances.
> 
> And I still do not think that qcow2 is the right format for VM
> appliances.  To convince me, we'd first need a consensus on what the
> appliances are for (Michael seems to want them for qemu directly,

Let's put this straight: qemu as is cannot run appliances. It is not
designed for that and it would be a big feature to create enough
management inside qemu (or around qemu but part of qemu distribution) to
change that.

As it stands qemu always takes the configuration from the outside -
either from the user directly or from a separate management layer.

> apparently you want them for something higher up the stack) and thus
> what they are supposed to be capable of exactly.

Using qcow2 would be kind of cool but it has its limitations and
drawbacks as well.

You could use qcow2 as transport format and convert the VM to use
raw disks or whatever if you need the performance. And you could run
the VM directly from qcow2 without additional processing which is its
advantage.

It would fail miserably with tools not aware of the extra metadata,
however.

Lastly we are missing a developer of a management layer committed to
support such appliances.

> 
> Like, one thing that is important to discuss is this (but please not
> in this thread...): If we agree on making an appliance format (qcow2
> or not), is it for running VMs off or do we just want it for VM
> export/import?  The former might mean we need qcow2, because there is
> no good way to offer good performance with multiple disks otherwise
> (but this would constrain us e.g. in the disk image format -- no raw
> images for you, then).  But the latter can work just fine with a
> normal archival format as long as building/decomposing it is possible
> without copying.

Indeed, using an archive should be good enough for the 1-click download
solution. It will take time to extract but it will typically take even
more time to download or publish. So optimizing the format for speed of
export/import might be misplaced.

Thanks

Michal


* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 18:33                                             ` Michal Suchánek
@ 2018-06-06 18:36                                               ` Eduardo Habkost
  2018-06-07 18:27                                                 ` [Qemu-devel] [Qemu-block] " Kashyap Chamarthy
  0 siblings, 1 reply; 157+ messages in thread
From: Eduardo Habkost @ 2018-06-06 18:36 UTC (permalink / raw)
  To: Michal Suchánek
  Cc: Max Reitz, Kevin Wolf, qemu-block, Michael S. Tsirkin,
	qemu-devel, Richard W.M. Jones, Dr. David Alan Gilbert, stefanha

On Wed, Jun 06, 2018 at 08:33:39PM +0200, Michal Suchánek wrote:
[...]
> Lastly we are missing a developer of a management layer committed to
> support such appliances.

This is important.  Without developers of management tools
willing to help specify the requirements and implement the
feature, all the work in the lower layers would be useless.

-- 
Eduardo


* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 14:57                       ` Eric Blake
@ 2018-06-06 20:39                         ` Eric Blake
  2018-06-06 21:01                           ` Gerd Hoffmann
  0 siblings, 1 reply; 157+ messages in thread
From: Eric Blake @ 2018-06-06 20:39 UTC (permalink / raw)
  To: Michael S. Tsirkin, Max Reitz
  Cc: Kevin Wolf, ehabkost, qemu-block, Richard W.M. Jones, qemu-devel,
	stefanha, Michal Suchánek

On 06/06/2018 09:57 AM, Eric Blake wrote:
> On 06/06/2018 09:43 AM, Michael S. Tsirkin wrote:
>> On Wed, Jun 06, 2018 at 01:02:53PM +0200, Max Reitz wrote:
>>> Yeah, but why make qcow2 that format?  That's what I completely fail to
>>> understand.
>>
>> Because why not? It's cheap to add it there and is much easier
>> than teaching people about a new container format.
> 
> tar is not a new container format, but it is a new format to various 
> toolchains - that said, if we popularize tar as the format for including 
> a config file alongside a qcow2 image, it's not that hard to fix the 
> stack to start passing that file around as the new preferred file type.

On a completely different front, 'qcow2' as a file format comes with 
some psychological baggage.  If someone was using it 8 years ago, before 
we did coroutine optimizations, it was noticeably slower than raw, and 
relatively easier to get into a corrupted image condition that resulted 
in data loss.  Just one VM lost, and it leaves a sour taste in your 
mouth, where you are unwilling to trust that file format (even though 
the file format was not necessarily the cause of the corruption). 
Marketing-wise, we failed with our improvements ('qcow2v3' is so much 
more of a mouthful than 'qcow3'), and it took years to flip the defaults 
from v2 as the default to v3 as the default (more so in downstream 
distros than upstream), in part because we couldn't convince people of 
the improvements they would be gaining by moving to v3.  Historically, 
there's also 'qed' which was promised as a way to fix some of the poor 
performance of qcow2, but which ended up not being any better than our 
actual qcow2v3 improvements, so no one ended up switching to that. So, 
to some extent, various high-level consumers still have the notion that 
'raw' files are better/safer/faster than 'qcow2' files because of an 
anecdote from years ago, even if we have since fixed the speed parity 
and added locking to eliminate careless data loss.
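
For anyone unsure which of the two they actually have: the qcow2 header
starts with the magic bytes "QFI\xfb" followed by a 32-bit big-endian
version field, so the version can be read without qemu-img.  A minimal
sketch (the function name is just for illustration):

```python
import struct

QCOW2_MAGIC = b"QFI\xfb"


def qcow2_version(path):
    """Return the header version field (2 or 3 for qcow2), or None if
    the file does not start with the qcow magic."""
    with open(path, "rb") as f:
        header = f.read(8)
    if len(header) < 8 or header[:4] != QCOW2_MAGIC:
        return None
    (version,) = struct.unpack(">I", header[4:8])
    return version
```

An image reporting 3 here is what this mail calls 'qcow2v3'.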

If we DO add a new tar-file block driver to qemu, that could serve as a 
marketing opportunity to convince people that the new format has all of 
the features that you can't get from just a raw file, and does not 
suffer from the slowness or data corruption they were worried about in 
qcow2.  Thus, even if our new format is just a thin wrapper around a 
config file plus the existing qcow2v3 we already know and love, the mere 
fact that it is a new format may get people to move away from raw images 
in situations where just the name 'qcow2' is unable to do so, at which 
point we can help them take advantage of the features made possible by 
qcow2.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 20:39                         ` Eric Blake
@ 2018-06-06 21:01                           ` Gerd Hoffmann
  0 siblings, 0 replies; 157+ messages in thread
From: Gerd Hoffmann @ 2018-06-06 21:01 UTC (permalink / raw)
  To: Eric Blake
  Cc: Michael S. Tsirkin, Max Reitz, Kevin Wolf, ehabkost, qemu-block,
	qemu-devel, Richard W.M. Jones, stefanha, Michal Suchánek

  Hi,

> our actual qcow2v3 improvements, so no one ended up switching to that. So,
> to some extent, various high-level consumers still have the notion that
> 'raw' files are better/safer/faster than 'qcow2' files because of an
> anecdote from years ago, even if we have since fixed the speed parity and
> added locking to eliminate careless data loss.

When I use raw images the reasons are different ones.  Most of the time
it is that I want to use the image with something which isn't qemu and thus
doesn't understand qcow2.  Booting an image as a container, for example
(systemd-nspawn -i $image.raw).

cheers,
  Gerd


* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 16:32                                     ` Daniel P. Berrangé
  2018-06-06 16:36                                       ` Dr. David Alan Gilbert
@ 2018-06-07 10:02                                       ` Andrea Bolognani
  2018-06-07 10:22                                         ` Daniel P. Berrangé
  2018-06-07 10:32                                         ` Richard W.M. Jones
  1 sibling, 2 replies; 157+ messages in thread
From: Andrea Bolognani @ 2018-06-07 10:02 UTC (permalink / raw)
  To: Daniel P. Berrangé, Eric Blake
  Cc: Kevin Wolf, qemu-block, Michael S. Tsirkin, qemu-devel, armbru,
	Dr. David Alan Gilbert, Richard W.M. Jones, stefanha, Max Reitz

On Wed, 2018-06-06 at 17:32 +0100, Daniel P. Berrangé wrote:
> On Wed, Jun 06, 2018 at 10:36:20AM -0500, Eric Blake wrote:
> > But for the new config to be useful, you have to modify at least one tool in
> > the path.  At which point, it is just as easy to say: "libvirt is now smart
> > enough to read the config file out of a .qcow2 to know that it should prefer
> > a q35 machine" as it is to say "libvirt is now smart enough to treat a .tar
> > file containing .qcow2 and a config file that states that it should prefer a
> > q35 machine", and either approach requires just a single file for the user
> > to download.
> 
> Just to be clear, libvirt isn't going to do either of those things.
> 
> Whether there is metadata stuffed inside qcow2, or in a metadata file
> inside a tar file, libvirt is not going to look inside either of them.
> The XML is the only place libvirt deals with the hardware config.
> 
> Extracting machine type is always going to be a job for the layer above
> such as OpenStack/OVirt/Virt-manager/etc. They will then decide whether
> or not they want to honour that info, and if so, put it into the XML
> they give to libvirt.
> 
> As mentioned elsewhere, IMHO, it is more friendly to those tools
> to use pre-existing formats, eg TAR and XML/JSON, for which
> their respective programming languages already have APIs/parsers.
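
To illustrate the division of labour described in the quoted text, here
is a minimal sketch of what a management layer might do once it has
decided to honour the hints: turn them into the <os> element of the
domain XML it hands to libvirt.  The hint keys "arch" and "machine" are
hypothetical names; only the <os>/<type> element with its arch and
machine attributes is existing libvirt XML.

```python
import xml.etree.ElementTree as ET


def os_element(hints):
    """Build the <os> element of a libvirt domain XML from a hints dict.

    The keys "arch" and "machine" are hypothetical hint names.
    """
    os_el = ET.Element("os")
    type_el = ET.SubElement(os_el, "type")
    if "arch" in hints:
        type_el.set("arch", hints["arch"])
    if "machine" in hints:
        type_el.set("machine", hints["machine"])
    type_el.text = "hvm"
    return os_el
```

Calling ET.tostring(os_element({"arch": "x86_64", "machine": "q35"}),
encoding="unicode") yields the fragment to embed in the full domain
definition; libvirt itself never sees the hints file.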

Something that I haven't seen mentioned in the thread - and this
looks like as good a point as any to jump in - is that for q35
guests using EFI as well as aarch64 guests the "one click import"
experience requires not only hints about the machine (and firmware!)
type, but also a copy of the EFI variable store:

  $ virt-builder fedora-27 --arch aarch64 --notes
  Fedora® 27 Server (aarch64)

  [...]

  You will need to use the associated UEFI NVRAM variables file:
    http://libguestfs.org/download/builder/fedora-27-aarch64-nvram.xz

While hints might be considered a reasonable fit for qcow2, I think
it's pretty hard to argue for embedding the NVRAM file in there,
which to me signals quite clearly that an archive containing the
disk image(s) *and* the configuration hints *and* other ancillary
files such as the NVRAM is the only way to build a solution that's
not dead on arrival.

It's pretty easy then to imagine using something like

  $ virt-builder \
    fedora-27 \
    --arch aarch64 \
    --format qva \
    --output f27-aarch64.qva

or download the equivalent from some website, followed by

  $ virt-install \
    --name f27-aarch64 \
    --import \
    --input f27-aarch64.qva

or the equivalent pointy-clicky import step and having things
Just Work™, provided sufficient hints are included in the archive;
the user, or the management application, would of course be able
to override such hints at import time.
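
The override behaviour could be as simple as the sketch below, which
assumes the archive is a tar file carrying its hints as a JSON member.
The member name "hints.json" and the hint keys are hypothetical; nothing
like the qva format exists yet.

```python
import json
import tarfile


def load_hints(archive_path, overrides=None, hints_name="hints.json"):
    """Read the hints shipped in the archive and apply local overrides.

    Values supplied by the user or the management application take
    precedence over whatever the archive suggests.
    """
    with tarfile.open(archive_path) as tar:
        fileobj = tar.extractfile(hints_name)  # KeyError if absent
        if fileobj is None:
            raise ValueError(f"{hints_name} is not a regular file")
        with fileobj:
            hints = json.load(fileobj)
    hints.update(overrides or {})
    return hints
```

For example, load_hints("f27-aarch64.qva", {"machine": "virt-2.12"})
would keep the archive's other hints and replace only the machine type.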

-- 
Andrea Bolognani / Red Hat / Virtualization


* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-07 10:02                                       ` Andrea Bolognani
@ 2018-06-07 10:22                                         ` Daniel P. Berrangé
  2018-06-07 11:17                                           ` Andrea Bolognani
  2018-06-07 10:32                                         ` Richard W.M. Jones
  1 sibling, 1 reply; 157+ messages in thread
From: Daniel P. Berrangé @ 2018-06-07 10:22 UTC (permalink / raw)
  To: Andrea Bolognani
  Cc: Eric Blake, Kevin Wolf, qemu-block, Michael S. Tsirkin,
	qemu-devel, armbru, Dr. David Alan Gilbert, Richard W.M. Jones,
	stefanha, Max Reitz

On Thu, Jun 07, 2018 at 12:02:29PM +0200, Andrea Bolognani wrote:
> On Wed, 2018-06-06 at 17:32 +0100, Daniel P. Berrangé wrote:
> > On Wed, Jun 06, 2018 at 10:36:20AM -0500, Eric Blake wrote:
> > > But for the new config to be useful, you have to modify at least one tool in
> > > the path.  At which point, it is just as easy to say: "libvirt is now smart
> > > enough to read the config file out of a .qcow2 to know that it should prefer
> > > a q35 machine" as it is to say "libvirt is now smart enough to treat a .tar
> > > file containing .qcow2 and a config file that states that it should prefer a
> > > q35 machine", and either approach requires just a single file for the user
> > > to download.
> > 
> > Just to be clear, libvirt isn't going to do either of those things.
> > 
> > > Whether there is metadata stuffed inside qcow2, or in a metadata file
> > inside a tar file, libvirt is not going to look inside either of them.
> > The XML is the only place libvirt deals with the hardware config.
> > 
> > Extracting machine type is always going to be a job for the layer above
> > such as OpenStack/OVirt/Virt-manager/etc. They will then decide whether
> > or not they want to honour that info, and if so, put it into the XML
> > they give to libvirt.
> > 
> > As mentioned elsewhere, IMHO, it is more friendly to those tools
> > to use pre-existing formats, eg TAR and XML/JSON, for which
> > > their respective programming languages already have APIs/parsers.
> 
> Something that I haven't seen mentioned in the thread - and this
> looks like as good a point as any to jump in - is that for q35
> guests using EFI as well as aarch64 guests the "one click import"
> experience requires not only hints about the machine (and firmware!)
> type, but also a copy of the EFI variable store:
> 
>   $ virt-builder fedora-27 --arch aarch64 --notes
>   Fedora® 27 Server (aarch64)
> 
>   [...]
> 
>   You will need to use the associated UEFI NVRAM variables file:
>     http://libguestfs.org/download/builder/fedora-27-aarch64-nvram.xz
> 
> While hints might be considered a reasonable fit for qcow2, I think
> it's pretty hard to argue for embedding the NVRAM file in there,
> which to me signals quite clearly that an archive containing the
> disk image(s) *and* the configuration hints *and* other ancillary
> files such as the NVRAM is the only way to build a solution that's
> not dead on arrival.

On a similar theme, I can imagine users wanting to provide a TPM
data blob too, and for AMD SEV we'd need to be able to provide a
DH key, and session blob too IIUC.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-07 10:02                                       ` Andrea Bolognani
  2018-06-07 10:22                                         ` Daniel P. Berrangé
@ 2018-06-07 10:32                                         ` Richard W.M. Jones
  2018-06-07 10:35                                           ` Dr. David Alan Gilbert
                                                             ` (2 more replies)
  1 sibling, 3 replies; 157+ messages in thread
From: Richard W.M. Jones @ 2018-06-07 10:32 UTC (permalink / raw)
  To: Andrea Bolognani
  Cc: Daniel P. Berrangé,
	Eric Blake, Kevin Wolf, qemu-block, Michael S. Tsirkin,
	qemu-devel, armbru, Dr. David Alan Gilbert, stefanha, Max Reitz

On Thu, Jun 07, 2018 at 12:02:29PM +0200, Andrea Bolognani wrote:
> Something that I haven't seen mentioned in the thread - and this
> looks like as good a point as any to jump in - is that for q35
> guests using EFI as well as aarch64 guests the "one click import"
> experience requires not only hints about the machine (and firmware!)
> type, but also a copy of the EFI variable store:
> 
>   $ virt-builder fedora-27 --arch aarch64 --notes
>   Fedora® 27 Server (aarch64)
> 
>   [...]
> 
>   You will need to use the associated UEFI NVRAM variables file:
>     http://libguestfs.org/download/builder/fedora-27-aarch64-nvram.xz

This is true, although only sometimes.  If the bootloader[*] has a
working fallback path then usually it is able to boot and reset the
UEFI varstore back to the correct values.  We have had bugs before
where the fallback path was not working, eg:

  https://bugzilla.redhat.com/show_bug.cgi?id=1353689 (yours!)
  https://bugzilla.redhat.com/show_bug.cgi?id=1558793

Another problem which Laszlo mentioned is the varstore isn't portable
between UEFI implementations, or if the UEFI is compiled with
different options.  You can even imagine shipping multiple
varstores(!) which argues for a tar-like format.
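
A sketch of how an importer might choose among several shipped
varstores, assuming the archive carried a small manifest that maps a
firmware build identifier to a varstore member.  The manifest shape and
the identifiers are entirely hypothetical.

```python
def pick_varstore(varstores, firmware_id):
    """Return the archive member name of the varstore built for
    `firmware_id`, or None if no shipped varstore matches.

    `varstores` is a list of {"firmware": ..., "file": ...} dicts
    (a hypothetical manifest format).
    """
    for entry in varstores:
        if entry["firmware"] == firmware_id:
            return entry["file"]
    return None
```

On None the importer would start from an empty varstore and depend on
the fallback path mentioned above to repopulate it.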

> While hints might be considered a reasonable fit for qcow2, I think
> it's pretty hard to argue for embedding the NVRAM file in there,
> which to me signals quite clearly that an archive containing the
> disk image(s) *and* the configuration hints *and* other ancillary
> files such as the NVRAM is the only way to build a solution that's
> not dead on arrival.

The tar argument is quite strong.  Just not the wretched OVA/OVF :-)

> It's pretty easy then to imagine using something like
> 
>   $ virt-builder \
>     fedora-27 \
>     --arch aarch64 \
>     --format qva \
>     --output f27-aarch64.qva
> 
> or download the equivalent from some website, followed by
> 
>   $ virt-install \
>     --name f27-aarch64 \
>     --import \
>     --input f27-aarch64.qva
> 
> or the equivalent pointy-clicky import step and having things
> Just Work™, provided sufficient hints are included in the archive;
> the user, or the management application, would of course be able
> to override such hints at import time.

RFEs for virt-builder & virt-install one day :-)

Rich.

[*] I'm not sure exactly which bit of the bootloader does this,
whether it's UEFI itself, or the grub-efi in the guest.


-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines.  Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v


* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-07 10:32                                         ` Richard W.M. Jones
@ 2018-06-07 10:35                                           ` Dr. David Alan Gilbert
  2018-06-07 10:36                                           ` Daniel P. Berrangé
  2018-06-07 10:51                                           ` Andrea Bolognani
  2 siblings, 0 replies; 157+ messages in thread
From: Dr. David Alan Gilbert @ 2018-06-07 10:35 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: Andrea Bolognani, Daniel P. Berrangé,
	Eric Blake, Kevin Wolf, qemu-block, Michael S. Tsirkin,
	qemu-devel, armbru, stefanha, Max Reitz

* Richard W.M. Jones (rjones@redhat.com) wrote:
> On Thu, Jun 07, 2018 at 12:02:29PM +0200, Andrea Bolognani wrote:
> > Something that I haven't seen mentioned in the thread - and this
> > looks like as good a point as any to jump in - is that for q35
> > guests using EFI as well as aarch64 guests the "one click import"
> > experience requires not only hints about the machine (and firmware!)
> > type, but also a copy of the EFI variable store:
> > 
> >   $ virt-builder fedora-27 --arch aarch64 --notes
> >   Fedora® 27 Server (aarch64)
> > 
> >   [...]
> > 
> >   You will need to use the associated UEFI NVRAM variables file:
> >     http://libguestfs.org/download/builder/fedora-27-aarch64-nvram.xz
> 
> This is true, although only sometimes.  If the bootloader[*] has a
> working fallback path then usually it is able to boot and reset the
> UEFI varstore back to the correct values.  We have had bugs before
> where the fallback path was not working, eg:
> 
>   https://bugzilla.redhat.com/show_bug.cgi?id=1353689 (yours!)
>   https://bugzilla.redhat.com/show_bug.cgi?id=1558793
> 
> Another problem which Laszlo mentioned is the varstore isn't portable
> between UEFI implementations, or if the UEFI is compiled with
> different options.  You can even imagine shipping multiple
> varstores(!) which argues for a tar-like format.

Given that level of incompatibility with var stores (which I've seen
myself) I don't see how you can distribute them with images.

Dave

> > While hints might be considered a reasonable fit for qcow2, I think
> > it's pretty hard to argue for embedding the NVRAM file in there,
> > which to me signals quite clearly that an archive containing the
> > disk image(s) *and* the configuration hints *and* other ancillary
> > files such as the NVRAM is the only way to build a solution that's
> > not dead on arrival.
> 
> The tar argument is quite strong.  Just not the wretched OVA/OVF :-)
> 
> > It's pretty easy then to imagine using something like
> > 
> >   $ virt-builder \
> >     fedora-27 \
> >     --arch aarch64 \
> >     --format qva \
> >     --output f27-aarch64.qva
> > 
> > or download the equivalent from some website, followed by
> > 
> >   $ virt-install \
> >     --name f27-aarch64 \
> >     --import \
> >     --input f27-aarch64.qva
> > 
> > or the equivalent pointy-clicky import step and having things
> > Just Work™, provided sufficient hints are included in the archive;
> > the user, or the management application, would of course be able
> > to override such hints at import time.
> 
> RFEs for virt-builder & virt-install one day :-)
> 
> Rich.
> 
> [*] I'm not sure exactly which bit of the bootloader does this,
> whether it's UEFI itself, or the grub-efi in the guest.
> 
> 
> -- 
> Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
> Read my programming and virtualization blog: http://rwmj.wordpress.com
> virt-p2v converts physical machines to virtual machines.  Boot with a
> live CD or over the network (PXE) and turn machines into KVM guests.
> http://libguestfs.org/virt-v2v
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-07 10:32                                         ` Richard W.M. Jones
  2018-06-07 10:35                                           ` Dr. David Alan Gilbert
@ 2018-06-07 10:36                                           ` Daniel P. Berrangé
  2018-06-07 10:54                                             ` Andrea Bolognani
  2018-06-07 21:18                                             ` Michael S. Tsirkin
  2018-06-07 10:51                                           ` Andrea Bolognani
  2 siblings, 2 replies; 157+ messages in thread
From: Daniel P. Berrangé @ 2018-06-07 10:36 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: Andrea Bolognani, Eric Blake, Kevin Wolf, qemu-block,
	Michael S. Tsirkin, qemu-devel, armbru, Dr. David Alan Gilbert,
	stefanha, Max Reitz

On Thu, Jun 07, 2018 at 11:32:18AM +0100, Richard W.M. Jones wrote:
> On Thu, Jun 07, 2018 at 12:02:29PM +0200, Andrea Bolognani wrote:
> > Something that I haven't seen mentioned in the thread - and this
> > looks like as good a point as any to jump in - is that for q35
> > guests using EFI as well as aarch64 guests the "one click import"
> > experience requires not only hints about the machine (and firmware!)
> > type, but also a copy of the EFI variable store:
> > 
> >   $ virt-builder fedora-27 --arch aarch64 --notes
> >   Fedora® 27 Server (aarch64)
> > 
> >   [...]
> > 
> >   You will need to use the associated UEFI NVRAM variables file:
> >     http://libguestfs.org/download/builder/fedora-27-aarch64-nvram.xz
> 
> This is true, although only sometimes.  If the bootloader[*] has a
> working fallback path then usually it is able to boot and reset the
> UEFI varstore back to the correct values.  We have had bugs before
> where the fallback path was not working, eg:
> 
>   https://bugzilla.redhat.com/show_bug.cgi?id=1353689 (yours!)
>   https://bugzilla.redhat.com/show_bug.cgi?id=1558793
> 
> Another problem which Laszlo mentioned is the varstore isn't portable
> between UEFI implementations, or if the UEFI is compiled with
> different options.  You can even imagine shipping multiple
> varstores(!) which argues for a tar-like format.

Could we perhaps imagine shipping the actual UEFI bios, rather
than only the varstore?  The bios blob runs in guest context,
so there shouldn't be any security concerns from hosting
vendors with running user provided bios. Mostly it's a matter
of confidence that the interface between bios & qemu is stable
which feels easier than assuming varstore vs different bios is
portable. IIRC, shipping actual UEFI BIOS is something that was
desirable for AMD SEV usage.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-07 10:32                                         ` Richard W.M. Jones
  2018-06-07 10:35                                           ` Dr. David Alan Gilbert
  2018-06-07 10:36                                           ` Daniel P. Berrangé
@ 2018-06-07 10:51                                           ` Andrea Bolognani
  2018-06-07 19:38                                             ` Laszlo Ersek
  2 siblings, 1 reply; 157+ messages in thread
From: Andrea Bolognani @ 2018-06-07 10:51 UTC (permalink / raw)
  To: Richard W.M. Jones
  Cc: Daniel P. Berrangé,
	Eric Blake, Kevin Wolf, qemu-block, Michael S. Tsirkin,
	qemu-devel, armbru, Dr. David Alan Gilbert, stefanha, Max Reitz

On Thu, 2018-06-07 at 11:32 +0100, Richard W.M. Jones wrote:
> On Thu, Jun 07, 2018 at 12:02:29PM +0200, Andrea Bolognani wrote:
> > Something that I haven't seen mentioned in the thread - and this
> > looks like as good a point as any to jump in - is that for q35
> > guests using EFI as well as aarch64 guests the "one click import"
> > experience requires not only hints about the machine (and firmware!)
> > type, but also a copy of the EFI variable store:
> > 
> >   $ virt-builder fedora-27 --arch aarch64 --notes
> >   Fedora® 27 Server (aarch64)
> > 
> >   [...]
> > 
> >   You will need to use the associated UEFI NVRAM variables file:
> >     http://libguestfs.org/download/builder/fedora-27-aarch64-nvram.xz
> 
> This is true, although only sometimes.  If the bootloader[*] has a
> working fallback path then usually it is able to boot and reset the
> UEFI varstore back to the correct values.  We have had bugs before
> where the fallback path was not working, eg:
> 
>   https://bugzilla.redhat.com/show_bug.cgi?id=1353689 (yours!)
>   https://bugzilla.redhat.com/show_bug.cgi?id=1558793

[...]
> [*] I'm not sure exactly which bit of the bootloader does this,
> whether it's UEFI itself, or the grub-efi in the guest.

IIUC the UEFI spec itself reserves certain file names in the ESP
for this fallback mechanism; it's then up to the guest operating
system to actually install something appropriate there.

In Fedora and RHEL, shim is what takes care of it (except when it
doesn't ;), but in Debian and Ubuntu AFAIK shim is not included
and the fallback path doesn't work at all, which makes providing
the NVRAM file a hard requirement to boot such guests.

-- 
Andrea Bolognani / Red Hat / Virtualization

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-07 10:36                                           ` Daniel P. Berrangé
@ 2018-06-07 10:54                                             ` Andrea Bolognani
  2018-06-07 19:24                                               ` Laszlo Ersek
  2018-06-07 21:19                                               ` Michael S. Tsirkin
  2018-06-07 21:18                                             ` Michael S. Tsirkin
  1 sibling, 2 replies; 157+ messages in thread
From: Andrea Bolognani @ 2018-06-07 10:54 UTC (permalink / raw)
  To: Daniel P. Berrangé, Richard W.M. Jones
  Cc: Eric Blake, Kevin Wolf, qemu-block, Michael S. Tsirkin,
	qemu-devel, armbru, Dr. David Alan Gilbert, stefanha, Max Reitz

On Thu, 2018-06-07 at 11:36 +0100, Daniel P. Berrangé wrote:
> On Thu, Jun 07, 2018 at 11:32:18AM +0100, Richard W.M. Jones wrote:
> > Another problem which Laszlo mentioned is the varstore isn't portable
> > between UEFI implementations, or if the UEFI is compiled with
> > different options.  You can even imagine shipping multiple
> > varstores(!) which argues for a tar-like format.
> 
> Could we perhaps imagine shipping the actual UEFI bios, rather
> than only the varstore?  The bios blob runs in guest context,
> so there shouldn't be any security concerns from hosting
> vendors with running user provided bios. Mostly it's a matter
> of confidence that the interface between bios & qemu is stable
> which feels easier than assuming varstore vs different bios is
> portable.

That sounds sensible, and further reinforces the idea that we
need way more than a single string baked into the qcow2 file.

-- 
Andrea Bolognani / Red Hat / Virtualization

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-07 10:22                                         ` Daniel P. Berrangé
@ 2018-06-07 11:17                                           ` Andrea Bolognani
  2018-06-07 12:38                                             ` Daniel P. Berrangé
  0 siblings, 1 reply; 157+ messages in thread
From: Andrea Bolognani @ 2018-06-07 11:17 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Eric Blake, Kevin Wolf, qemu-block, Michael S. Tsirkin,
	qemu-devel, armbru, Dr. David Alan Gilbert, Richard W.M. Jones,
	stefanha, Max Reitz

On Thu, 2018-06-07 at 11:22 +0100, Daniel P. Berrangé wrote:
> On Thu, Jun 07, 2018 at 12:02:29PM +0200, Andrea Bolognani wrote:
> > While hints might be considered a reasonable fit for qcow2, I think
> > it's pretty hard to argue for embedding the NVRAM file in there,
> > which to me signals quite clearly that an archive containing the
> > disk image(s) *and* the configuration hints *and* other ancillary
> > files such as the NVRAM is the only way to build a solution that's
> > not dead on arrival.
> 
> On a similar theme, I can imagine users wanting to provide a TPM
> data blob too, and for AMD SEV we'd need to be able to provide a
> DH key, and session blob too IIUC.

I'm not familiar with the technologies you're talking about, but
all that sounds like something very security sensitive and not
something eg. the Fedora project would want to bake into their
cloud images.

Perhaps we should keep in mind that this kind of archive format
lends itself quite naturally to both generic ready-made images and
custom, fully configured images: in the former case it would only
contain the few things mentioned above, while in the latter it might
also have security sensitive data that's specific to the deployment
it's going to be used against.

For non-vanilla images, it might be interesting to include the
libvirt XML in its entirety, which would make it trivial to keep
around a fully self-contained copy of a guest that can be imported back
into libvirt with a single click; on the other hand, the management
layer might want to override that, and for generic images we
probably want to avoid the security implications of people
importing potentially untrusted configurations into the system
libvirt instance and stick to just a few hints instead.

So there's at least two partially overlapping use cases right
there. Fun :)

-- 
Andrea Bolognani / Red Hat / Virtualization

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-07 11:17                                           ` Andrea Bolognani
@ 2018-06-07 12:38                                             ` Daniel P. Berrangé
  2018-06-07 13:49                                               ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 157+ messages in thread
From: Daniel P. Berrangé @ 2018-06-07 12:38 UTC (permalink / raw)
  To: Andrea Bolognani
  Cc: Eric Blake, Kevin Wolf, qemu-block, Michael S. Tsirkin,
	qemu-devel, armbru, Dr. David Alan Gilbert, Richard W.M. Jones,
	stefanha, Max Reitz

On Thu, Jun 07, 2018 at 01:17:24PM +0200, Andrea Bolognani wrote:
> On Thu, 2018-06-07 at 11:22 +0100, Daniel P. Berrangé wrote:
> > On Thu, Jun 07, 2018 at 12:02:29PM +0200, Andrea Bolognani wrote:
> > > While hints might be considered a reasonable fit for qcow2, I think
> > > it's pretty hard to argue for embedding the NVRAM file in there,
> > > which to me signals quite clearly that an archive containing the
> > > disk image(s) *and* the configuration hints *and* other ancillary
> > > files such as the NVRAM is the only way to build a solution that's
> > > not dead on arrival.
> > 
> > On a similar theme, I can imagine users wanting to provide a TPM
> > data blob too, and for AMD SEV we'd need to be able to provide a
> > DH key, and session blob too IIUC.
> 
> I'm not familiar with the technologies you're talking about, but
> all that sounds like something very security sensitive and not
> something eg. the Fedora project would want to bake into their
> cloud images.
> 
> Perhaps we should keep in mind that this kind of archive format
> lends itself quite naturally to both generic ready-made images and
> custom, fully configured images: in the former case it would only
> > contain the few things mentioned above, while in the latter it might
> also have security sensitive data that's specific to the deployment
> it's going to be used against.

I don't think there's any such distinction. A downstream user
may build generic ready-made images, or fully configured app
specific images. Both can contain the security sensitive data.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-07 12:38                                             ` Daniel P. Berrangé
@ 2018-06-07 13:49                                               ` Dr. David Alan Gilbert
  2018-06-07 14:06                                                 ` Andrea Bolognani
  0 siblings, 1 reply; 157+ messages in thread
From: Dr. David Alan Gilbert @ 2018-06-07 13:49 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Andrea Bolognani, Eric Blake, Kevin Wolf, qemu-block,
	Michael S. Tsirkin, qemu-devel, armbru, Richard W.M. Jones,
	stefanha, Max Reitz

* Daniel P. Berrangé (berrange@redhat.com) wrote:
> On Thu, Jun 07, 2018 at 01:17:24PM +0200, Andrea Bolognani wrote:
> > On Thu, 2018-06-07 at 11:22 +0100, Daniel P. Berrangé wrote:
> > > On Thu, Jun 07, 2018 at 12:02:29PM +0200, Andrea Bolognani wrote:
> > > > While hints might be considered a reasonable fit for qcow2, I think
> > > > it's pretty hard to argue for embedding the NVRAM file in there,
> > > > which to me signals quite clearly that an archive containing the
> > > > disk image(s) *and* the configuration hints *and* other ancillary
> > > > files such as the NVRAM is the only way to build a solution that's
> > > > not dead on arrival.
> > > 
> > > On a similar theme, I can imagine users wanting to provide a TPM
> > > data blob too, and for AMD SEV we'd need to be able to provide a
> > > DH key, and session blob too IIUC.
> > 
> > I'm not familiar with the technologies you're talking about, but
> > all that sounds like something very security sensitive and not
> > something eg. the Fedora project would want to bake into their
> > cloud images.
> > 
> > Perhaps we should keep in mind that this kind of archive format
> > lends itself quite naturally to both generic ready-made images and
> > custom, fully configured images: in the former case it would only
> > > contain the few things mentioned above, while in the latter it might
> > also have security sensitive data that's specific to the deployment
> > it's going to be used against.
> 
> I don't think there's any such distinction. A downstream user
> may build generic ready-made images, or fully configured app
> specific images. Both can contain the security sensitive data.

Including the nvram and efi makes me nervous; but I can see why together
they might work.  However, there's no guarantee that EFI has been tested
with the QEMU it's used on and ... that could be trouble.
Also, if we're going to start including the EFI rom then that would have
to be migrated with the VM so that after a restart on a different host
it's still using the right ROM that's compatible with its varfile.

Dave

> Regards,
> Daniel
> -- 
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-07 13:49                                               ` Dr. David Alan Gilbert
@ 2018-06-07 14:06                                                 ` Andrea Bolognani
  2018-06-07 14:45                                                   ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 157+ messages in thread
From: Andrea Bolognani @ 2018-06-07 14:06 UTC (permalink / raw)
  To: Dr. David Alan Gilbert, Daniel P. Berrangé
  Cc: Eric Blake, Kevin Wolf, qemu-block, Michael S. Tsirkin,
	qemu-devel, armbru, Richard W.M. Jones, stefanha, Max Reitz

On Thu, 2018-06-07 at 14:49 +0100, Dr. David Alan Gilbert wrote:
> Including the nvram and efi makes me nervous; but I can see why together
> they might work.  However, there's no guarantee that EFI has been tested
> with the QEMU it's used on and ... that could be trouble.

If the QEMU binary doesn't support EFI, then a guest expecting
EFI won't be able to start regardless of where the EFI ROM came
from.

> Also, if we're going to start including the EFI rom then that would have
> to be migrated with the VM so that after a restart on a different host
> it's still using the right ROM that's compatible with its varfile.

That's a problem that needs to be addressed anyway, because even
as it is now you could easily find yourself trying and failing
to migrate a guest between two hosts that have different and
incompatible EFI ROMs installed.

-- 
Andrea Bolognani / Red Hat / Virtualization

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-07 14:06                                                 ` Andrea Bolognani
@ 2018-06-07 14:45                                                   ` Dr. David Alan Gilbert
  2018-06-07 14:56                                                     ` Andrea Bolognani
  0 siblings, 1 reply; 157+ messages in thread
From: Dr. David Alan Gilbert @ 2018-06-07 14:45 UTC (permalink / raw)
  To: Andrea Bolognani
  Cc: Daniel P. Berrangé,
	Eric Blake, Kevin Wolf, qemu-block, Michael S. Tsirkin,
	qemu-devel, armbru, Richard W.M. Jones, stefanha, Max Reitz

* Andrea Bolognani (abologna@redhat.com) wrote:
> On Thu, 2018-06-07 at 14:49 +0100, Dr. David Alan Gilbert wrote:
> > Including the nvram and efi makes me nervous; but I can see why together
> > they might work.  However, there's no guarantee that EFI has been tested
> > with the QEMU it's used on and ... that could be trouble.
> 
> If the QEMU binary doesn't support EFI, then a guest expecting
> EFI won't be able to start regardless of where the EFI ROM came
> from.

No, I mean if the QEMU doesn't support that *particular* EFI.

> > Also, if we're going to start including the EFI rom then that would have
> > to be migrated with the VM so that after a restart on a different host
> > it's still using the right ROM that's compatible with its varfile.
> 
> That's a problem that needs to be addressed anyway, because even
> as it is now you could easily find yourself trying and failing
> to migrate a guest between two hosts that have different and
> incompatible EFI ROMs installed.

True; although I was working on the basis that vendors who cared about
migration compatibility would couple the EFI versions with machine types
to ensure that the variable data didn't become incompatible.
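
To make that coupling concrete, here is a minimal sketch of a
pre-migration check.  Everything in it is hypothetical: the firmware
build labels are placeholders, and no vendor is known (from this
thread) to maintain a table like this; it only illustrates pinning one
firmware build per machine type.

```python
# Hypothetical table: each machine type is pinned to one firmware build.
# The build labels are made-up placeholders, not real package names.
FIRMWARE_FOR_MACHINE = {
    "pc-q35-2.11": "ovmf-build-A",
    "pc-q35-2.12": "ovmf-build-B",
}


def can_migrate(machine_type, dest_firmware_builds):
    """Return True if the destination host has the firmware build that
    the guest's machine type is pinned to.

    Unknown machine types are refused rather than guessed at.
    """
    required = FIRMWARE_FOR_MACHINE.get(machine_type)
    return required is not None and required in dest_firmware_builds
```

With such a table, a varfile written by one firmware build would never
be handed to a different build just because the guest moved hosts.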

Dave

> -- 
> Andrea Bolognani / Red Hat / Virtualization
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-07 14:45                                                   ` Dr. David Alan Gilbert
@ 2018-06-07 14:56                                                     ` Andrea Bolognani
  2018-06-07 15:25                                                       ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 157+ messages in thread
From: Andrea Bolognani @ 2018-06-07 14:56 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Daniel P. Berrangé,
	Eric Blake, Kevin Wolf, qemu-block, Michael S. Tsirkin,
	qemu-devel, armbru, Richard W.M. Jones, stefanha, Max Reitz

On Thu, 2018-06-07 at 15:45 +0100, Dr. David Alan Gilbert wrote:
> * Andrea Bolognani (abologna@redhat.com) wrote:
> > On Thu, 2018-06-07 at 14:49 +0100, Dr. David Alan Gilbert wrote:
> > > Including the nvram and efi makes me nervous; but I can see why together
> > > they might work.  However, there's no guarantee that EFI has been tested
> > > with the QEMU it's used on and ... that could be trouble.
> > 
> > If the QEMU binary doesn't support EFI, then a guest expecting
> > EFI won't be able to start regardless of where the EFI ROM came
> > from.
> 
> No, I mean if the QEMU doesn't support that *particular* EFI.

I could be wrong, but I feel like it's significantly less likely
that a random QEMU binary won't like a random EFI ROM than it is
for a random EFI ROM to not like a random EFI NVRAM.

> > > Also, if we're going to start including the EFI rom then that would have
> > > to be migrated with the VM so that after a restart on a different host
> > > it's still using the right ROM that's compatible with its varfile.
> > 
> > That's a problem that needs to be addressed anyway, because even
> > as it is now you could easily find yourself trying and failing
> > to migrate a guest between two hosts that have different and
> > incompatible EFI ROMs installed.
> 
> True; although I was working on the basis that vendors who cared about
> migration compatibility would couple the EFI versions with machine types
> to ensure that the variable data didn't become incompatible.

As far as I know, nobody is actually doing this at the moment.

-- 
Andrea Bolognani / Red Hat / Virtualization

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-07 14:56                                                     ` Andrea Bolognani
@ 2018-06-07 15:25                                                       ` Dr. David Alan Gilbert
  2018-06-07 20:38                                                         ` Gerd Hoffmann
  0 siblings, 1 reply; 157+ messages in thread
From: Dr. David Alan Gilbert @ 2018-06-07 15:25 UTC (permalink / raw)
  To: Andrea Bolognani
  Cc: Daniel P. Berrangé,
	Eric Blake, Kevin Wolf, qemu-block, Michael S. Tsirkin,
	qemu-devel, armbru, Richard W.M. Jones, stefanha, Max Reitz

* Andrea Bolognani (abologna@redhat.com) wrote:
> On Thu, 2018-06-07 at 15:45 +0100, Dr. David Alan Gilbert wrote:
> > * Andrea Bolognani (abologna@redhat.com) wrote:
> > > On Thu, 2018-06-07 at 14:49 +0100, Dr. David Alan Gilbert wrote:
> > > > Including the nvram and efi makes me nervous; but I can see why together
> > > > they might work.  However, there's no guarantee that EFI has been tested
> > > > with the QEMU it's used on and ... that could be trouble.
> > > 
> > > If the QEMU binary doesn't support EFI, then a guest expecting
> > > EFI won't be able to start regardless of where the EFI ROM came
> > > from.
> > 
> > No, I mean if the QEMU doesn't support that *particular* EFI.
> 
> I could be wrong, but I feel like it's significantly less likely
> that a random QEMU binary won't like a random EFI ROM than it is
> for a random EFI ROM to not like a random EFI NVRAM.

True, but it's not that rare to find SeaBIOS+qemu version problems;
so I'll assume the same happens with EFI.

> > > > Also, if we're going to start including the EFI rom then that would have
> > > > to be migrated with the VM so that after a restart on a different host
> > > > it's still using the right ROM that's compatible with its varfile.
> > > 
> > > That's a problem that needs to be addressed anyway, because even
> > > as it is now you could easily find yourself trying and failing
> > > to migrate a guest between two hosts that have different and
> > > incompatible EFI ROMs installed.
> > 
> > True; although I was working on the basis that vendors who cared about
> > migration compatibility would couple the EFI versions with machine types
> > to ensure that the variable data didn't become incompatible.
> 
> As far as I know, nobody is actually doing this at the moment.

I'm assuming we'll have to.

Dave


> -- 
> Andrea Bolognani / Red Hat / Virtualization
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

* Re: [Qemu-devel] [Qemu-block] storing machine data in qcow images?
  2018-06-06 18:36                                               ` Eduardo Habkost
@ 2018-06-07 18:27                                                 ` Kashyap Chamarthy
  0 siblings, 0 replies; 157+ messages in thread
From: Kashyap Chamarthy @ 2018-06-07 18:27 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Michal Suchánek, Kevin Wolf, qemu-block, Michael S. Tsirkin,
	Richard W.M. Jones, qemu-devel, stefanha, Max Reitz,
	Dr. David Alan Gilbert

On Wed, Jun 06, 2018 at 03:36:53PM -0300, Eduardo Habkost wrote:
> On Wed, Jun 06, 2018 at 08:33:39PM +0200, Michal Suchánek wrote:
> [...]
> > Lastly we are missing a developer of a management layer committed to
> > support such appliances.
> 
> This is important.  Without developers of management tools
> willing to help specify the requirements and implement the
> feature, all the work in the lower layers would be useless.

FWIW, I'm following along from the OpenStack 'Nova' (it allows you to
provision VMs / QEMU instances) point of view.

Here is a bug (filed by Eduardo) that is tracking what needs to be fixed in
Nova:

    https://bugzilla.redhat.com/show_bug.cgi?id=1581414 -- OpenStack
    shouldn't break if the default machine-type in QEMU is "q35"

Refer to comment#6 and comment#11 for some analysis as to where Nova
makes assumptions for machine types.  (There is one instance of it.)

Related: Elsewhere on this KM-long thread, Dan Berrangé and myself have
noted how Nova allows configuring machine types today -- either via
setting a config attribute (in /etc/nova/nova.conf) per Compute node
(where QEMU processes are launched) or via setting a metadata property
per template disk image, which is used to launch Nova instances (VMs).
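
As a rough illustration of how those two knobs could combine, here is a
small sketch.  The function name, the "machine_type" key and the
precedence (per-image property first, per-host default second) are
assumptions made for this example, not a description of Nova's actual
code.

```python
def pick_machine_type(arch, image_properties, host_defaults):
    """Choose a machine type for a new instance.

    image_properties: metadata attached to the template disk image.
    host_defaults: per-architecture defaults from the Compute node's
    configuration.  Both are plain dicts in this sketch, and the
    "machine_type" key is a hypothetical name.
    """
    machine = image_properties.get("machine_type")
    if machine:
        # Assumed precedence: the image's own hint wins.
        return machine
    # Otherwise fall back to the host default; None means "let QEMU
    # pick its built-in default".
    return host_defaults.get(arch)
```

The last fallback is the case the bug above is about: if nothing is
set anywhere, the result depends on whatever QEMU's default happens
to be.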

-- 
/kashyap

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-07 10:54                                             ` Andrea Bolognani
@ 2018-06-07 19:24                                               ` Laszlo Ersek
  2018-06-08  8:21                                                 ` Dr. David Alan Gilbert
  2018-06-07 21:19                                               ` Michael S. Tsirkin
  1 sibling, 1 reply; 157+ messages in thread
From: Laszlo Ersek @ 2018-06-07 19:24 UTC (permalink / raw)
  To: Andrea Bolognani, Daniel P. Berrangé, Richard W.M. Jones
  Cc: Kevin Wolf, qemu-block, Michael S. Tsirkin, armbru, qemu-devel,
	stefanha, Max Reitz, Dr. David Alan Gilbert

On 06/07/18 12:54, Andrea Bolognani wrote:
> On Thu, 2018-06-07 at 11:36 +0100, Daniel P. Berrangé wrote:
>> On Thu, Jun 07, 2018 at 11:32:18AM +0100, Richard W.M. Jones wrote:
>>> Another problem which Laszlo mentioned is the varstore isn't portable
>>> between UEFI implementations, or if the UEFI is compiled with
>>> different options.  You can even imagine shipping multiple
>>> varstores(!) which argues for a tar-like format.
>>
>> Could we perhaps imagine shipping the actual UEFI bios, rather
>> than only the varstore?  The bios blob runs in guest context,
>> so there shouldn't be any security concerns from hosting
>> vendors with running user provided bios. Mostly it's a matter
>> of confidence that the interface between bios & qemu is stable
>> which feels easier than assuming varstore vs different bios is
>> portable.
> 
> That sounds sensible, and further reinforces the idea that we
> need way more than a single string baked into the qcow2 file.
> 

Sorry for arriving late (thanks Rich for the Fwd).

The contents of the non-volatile UEFI variables should be considered
part of (permanent) guest state, such as disk contents. Therefore I'd
argue for bundling the varstore file with the disk image(s).

In turn, the best way to ensure compatibility between varstore and
firmware binary is to just bundle the firmware binary as well. It's
generally not large (x86) or if it is, it compresses extremely well
(aarch64). For extra politeness, image providers can bundle a text file
with their firmware build options (like a kernel config), possibly even
a JSON document conforming to the new firmware schema (qemu commit
3a0adfc9bfcf), but that's not a hard requirement I guess.
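
To illustrate the bundling idea, a minimal sketch using a plain tar
archive follows.  The layout, the "manifest.json" name and the role
names are all hypothetical; no such archive format has been agreed on
in this thread, and the manifest here is not the QEMU firmware schema.

```python
import io
import json
import tarfile


def build_bundle(out_path, members, hints):
    """Write a tar archive holding the given files plus a manifest.

    members maps a role ("disk", "firmware", "varstore") to a path on
    the local filesystem; hints is a dict of key-value pairs such as
    the machine type.  All names are hypothetical.
    """
    manifest = {"hints": hints, "files": {}}
    with tarfile.open(out_path, "w") as tar:
        for role, path in sorted(members.items()):
            arcname = f"{role}.bin"
            tar.add(path, arcname=arcname)
            manifest["files"][role] = arcname
        # Store the manifest last, as an in-memory member.
        data = json.dumps(manifest, indent=2, sort_keys=True).encode()
        info = tarfile.TarInfo("manifest.json")
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))


def read_manifest(bundle_path):
    """Return the manifest dict stored in a bundle."""
    with tarfile.open(bundle_path, "r") as tar:
        with tar.extractfile("manifest.json") as fh:
            return json.load(fh)
```

An importer would read the manifest first and could then override any
hint before extracting the disk, firmware and varstore members.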

If such a VM is to be migrated between hosts, I'd expect the host admin
to take care of installing the fw binary on all eligible hosts.


Regarding compat between QEMU and firmware binary, I see three cases:

(1) Static requirements presented by the firmware for the QEMU
configuration. (Such as -D SMM_REQUIRE.) With the domain configuration
captured one way or another alongside the disk image anyway, this should
not be a problem.

(2) New firmware launched on old QEMU. The firmware generally detects or
negotiates features with QEMU, so this should be safe.

(Discounting firmware regressions, of course -- for example, search
<https://www.mail-archive.com/qemu-devel@nongnu.org/msg471901.html> for
the string "I messed up".)

(3) Old firmware launched on new QEMU. This scenario has given us a lot
more grief than (2), but I think for the appliance distribution use
case, it can be folded into case (1) above -- specify the machine type
too in the domain config, and that should be compatible with the old
firmware.

(The handling of (3) is not uniform between upstream QEMU and various
downstreams. For example, consider
<https://bugs.launchpad.net/qemu/+bug/1715700>. This was a latent bug in
OVMF that got exposed by a new QEMU (due to a valid QEMU change), even
when using old machine types. The upstream solution was to fix edk2 and
stick with QEMU as-was (although the agreement around that hadn't been
universal). Conversely, one downstream solution was to restrict the
otherwise valid QEMU change to new machine types
<https://bugzilla.redhat.com/show_bug.cgi?id=1489800#c5>.)


All in all I agree with Daniel's proposal; it seems to be the most
robust one.

And, I too recall that, under AMD SEV, users will be supposed to, or
allowed to, provide their own firmware binaries.

Thanks!
Laszlo

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-07 10:51                                           ` Andrea Bolognani
@ 2018-06-07 19:38                                             ` Laszlo Ersek
  0 siblings, 0 replies; 157+ messages in thread
From: Laszlo Ersek @ 2018-06-07 19:38 UTC (permalink / raw)
  To: Andrea Bolognani, Richard W.M. Jones
  Cc: Kevin Wolf, qemu-block, Michael S. Tsirkin, qemu-devel, armbru,
	stefanha, Max Reitz, Dr. David Alan Gilbert

On 06/07/18 12:51, Andrea Bolognani wrote:
> On Thu, 2018-06-07 at 11:32 +0100, Richard W.M. Jones wrote:
>> On Thu, Jun 07, 2018 at 12:02:29PM +0200, Andrea Bolognani wrote:
>>> Something that I haven't seen mentioned in the thread - and this
>>> looks like as good a point as any to jump in - is that for q35
>>> guests using EFI as well as aarch64 guests the "one click import"
>>> experience requires not only hints about the machine (and firmware!)
>>> type, but also a copy of the EFI variable store:
>>>
>>>   $ virt-builder fedora-27 --arch aarch64 --notes
>>>   Fedora® 27 Server (aarch64)
>>>
>>>   [...]
>>>
>>>   You will need to use the associated UEFI NVRAM variables file:
>>>     http://libguestfs.org/download/builder/fedora-27-aarch64-nvram.xz
>>
>> This is true, although only sometimes.  If the bootloader[*] has a
>> working fallback path then usually it is able to boot and reset the
>> UEFI varstore back to the correct values.  We have had bugs before
>> where the fallback path was not working, eg:
>>
>>   https://bugzilla.redhat.com/show_bug.cgi?id=1353689 (yours!)
>>   https://bugzilla.redhat.com/show_bug.cgi?id=1558793
>
> [...]
>> [*] I'm not sure exactly which bit of the bootloader does this,
>> whether it's UEFI itself, or the grub-efi in the guest.
>
> IIUC the UEFI spec itself reserves certain file names in the ESP
> for this fallback mechanism; it's then up to the guest operating
> system to actually install something appropriate there.
>
> In Fedora and RHEL, shim is what takes care of it (except when it
> doesn't ;), but in Debian and Ubuntu AFAIK shim is not included
> and the fallback path doesn't work at all, which makes providing
> the NVRAM file a hard requirement to boot such guests.

Quoting the UEFI-2.7 spec:

> 3.4.3 Boot Option Variables Default Boot Behavior
>
> [...] the boot options require a standard default behavior in the
> exceptional case that valid boot options are not present on a
> platform. The default behavior must be invoked any time the BootOrder
> variable does not exist or only points to nonexistent boot options, or
> if no entry in BootOrder can successfully be executed.
>
> If system firmware supports boot option recovery as described in
> Section 3.4, system firmware must include a PlatformRecovery####
> variable specifying a short-form File Path Media Device Path (see
> Section 3.1.2) containing the platform default file path for removable
> media (see Table 11). [...]

(Note from Laszlo: think '\EFI\BOOT\BOOTX64.EFI' on the system disk's
EFI System Partition.)

> It is expected that this default boot will load an operating system or
> a maintenance utility.
>
> If this is an operating system setup program it is then responsible
> for setting the requisite environment variables for subsequent boots.
> [...]
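
(To make the quoted rule concrete, here is a rough Python sketch of the
decision it describes. It is illustrative only, not firmware code: the
function name, its parameters and the can_execute callback are made up,
and it ignores the PlatformRecovery#### indirection and simply returns
the default removable-media path for the architecture.)

```python
# Illustrative model of the "default boot behavior" rule quoted above.
# Not taken from any real firmware; all names here are hypothetical.

# Default removable-media file names, one per architecture.
DEFAULT_BOOT_FILE = {
    "x86_64": r"\EFI\BOOT\BOOTX64.EFI",
    "i386": r"\EFI\BOOT\BOOTIA32.EFI",
    "aarch64": r"\EFI\BOOT\BOOTAA64.EFI",
    "arm": r"\EFI\BOOT\BOOTARM.EFI",
}


def pick_boot_path(arch, boot_order, boot_options, can_execute):
    """Return the path to boot, falling back to the default file.

    boot_order   -- list of option numbers (BootOrder), or None if unset
    boot_options -- dict: option number -> file path (the Boot#### entries)
    can_execute  -- hypothetical callback, path -> bool
    """
    for number in boot_order or []:
        path = boot_options.get(number)
        if path is not None and can_execute(path):
            return path
    # BootOrder is missing, points only at nonexistent options,
    # or none of its entries could be executed: use the default.
    return DEFAULT_BOOT_FILE[arch]
```

With an empty varstore (boot_order=None) the sketch returns the default
path, which is the case where shim or an equivalent has to be installed
there by the guest OS.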

More details:
<https://blog.uncooperative.org/blog/2014/02/06/the-efi-system-partition/>.

Thanks
Laszlo

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-07 15:25                                                       ` Dr. David Alan Gilbert
@ 2018-06-07 20:38                                                         ` Gerd Hoffmann
  0 siblings, 0 replies; 157+ messages in thread
From: Gerd Hoffmann @ 2018-06-07 20:38 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Andrea Bolognani, Kevin Wolf, qemu-block, Michael S. Tsirkin,
	qemu-devel, armbru, Richard W.M. Jones, stefanha, Max Reitz

  Hi,

> > I could be wrong, but I feel like it's significantly less likely
> > that a random QEMU binary won't like a random EFI ROM than it is
> > for a random EFI ROM to not like a random EFI NVRAM.
> 
> True, but it's not that rare to find SeaBIOS+qemu version problems;

Hmm?  Any recent examples?  Since we switched over to have qemu generate
the acpi tables instead of expecting the firmware to do it (qemu 1.5 or
1.6 IIRC) there were no hard lockstep updates.  Only soft dependencies
à la "if you want to use the new qemu feature foo you also need a seabios
supporting the new feature foo".

> so I'll assume the same happens with EFI.

We try to avoid it but sometimes it doesn't work out as we like.

cheers,
  Gerd

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-07 10:36                                           ` Daniel P. Berrangé
  2018-06-07 10:54                                             ` Andrea Bolognani
@ 2018-06-07 21:18                                             ` Michael S. Tsirkin
  1 sibling, 0 replies; 157+ messages in thread
From: Michael S. Tsirkin @ 2018-06-07 21:18 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Richard W.M. Jones, Andrea Bolognani, Eric Blake, Kevin Wolf,
	qemu-block, qemu-devel, armbru, Dr. David Alan Gilbert, stefanha,
	Max Reitz

On Thu, Jun 07, 2018 at 11:36:20AM +0100, Daniel P. Berrangé wrote:
> On Thu, Jun 07, 2018 at 11:32:18AM +0100, Richard W.M. Jones wrote:
> > On Thu, Jun 07, 2018 at 12:02:29PM +0200, Andrea Bolognani wrote:
> > > Something that I haven't seen mentioned in the thread - and this
> > > looks like as good a point as any to jump in - is that for q35
> > > guests using EFI as well as aarch64 guests the "one click import"
> > > experience requires not only hints about the machine (and firmware!)
> > > type, but also a copy of the EFI variable store:
> > > 
> > >   $ virt-builder fedora-27 --arch aarch64 --notes
> > >   Fedora® 27 Server (aarch64)
> > > 
> > >   [...]
> > > 
> > >   You will need to use the associated UEFI NVRAM variables file:
> > >     http://libguestfs.org/download/builder/fedora-27-aarch64-nvram.xz
> > 
> > This is true, although only sometimes.  If the bootloader[*] has a
> > working fallback path then usually it is able to boot and reset the
> > UEFI varstore back to the correct values.  We have had bugs before
> > where the fallback path was not working, eg:
> > 
> >   https://bugzilla.redhat.com/show_bug.cgi?id=1353689 (yours!)
> >   https://bugzilla.redhat.com/show_bug.cgi?id=1558793
> > 
> > Another problem which Laszlo mentioned is the varstore isn't portable
> > between UEFI implementations, or if the UEFI is compiled with
> > different options.  You can even imagine shipping multiple
> > varstores(!) which argues for a tar-like format.
> 
> Could we perhaps imagine shipping the actual UEFI bios, rather
> than only the varstore.

That's pretty unusual, UEFI is designed to abstract away the
hardware. It normally ships with the hardware.

I don't think it's a good idea to stick firmware itself in the image:
updating guest images is already a problem, at least we can easily fix
firmware bugs by dnf update on the host.

> The bios blob runs in guest context,
> so there shouldn't be able security concerns from hosting
> vendors with running user provided bios.

It seems possible that users that do supply their own firmware
will want to save it with the image. I don't think
we should do it for the standard firmware.


> Mostly its a matter
> of confidence that the interface between bios & qemu is stable
> which feels easier than assuming varstore vs different bios is
> portable. IIRC, shipping actual UEFI BIOS is something that was
> desirable for AMD SEV usage.

For SEV, storing the unencrypted binary in the image and having QEMU read
it out and write it into guest memory isn't any better than shipping it
with QEMU.

> 
> Regards,
> Daniel
> -- 
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-07 10:54                                             ` Andrea Bolognani
  2018-06-07 19:24                                               ` Laszlo Ersek
@ 2018-06-07 21:19                                               ` Michael S. Tsirkin
  1 sibling, 0 replies; 157+ messages in thread
From: Michael S. Tsirkin @ 2018-06-07 21:19 UTC (permalink / raw)
  To: Andrea Bolognani
  Cc: Daniel P. Berrangé,
	Richard W.M. Jones, Eric Blake, Kevin Wolf, qemu-block,
	qemu-devel, armbru, Dr. David Alan Gilbert, stefanha, Max Reitz

On Thu, Jun 07, 2018 at 12:54:33PM +0200, Andrea Bolognani wrote:
> On Thu, 2018-06-07 at 11:36 +0100, Daniel P. Berrangé wrote:
> > On Thu, Jun 07, 2018 at 11:32:18AM +0100, Richard W.M. Jones wrote:
> > > Another problem which Laszlo mentioned is the varstore isn't portable
> > > between UEFI implementations, or if the UEFI is compiled with
> > > different options.  You can even imagine shipping multiple
> > > varstores(!) which argues for a tar-like format.
> > 
> > Could we perhaps imagine shipping the actual UEFI bios, rather
> > than only the varstore.  The bios blob runs in guest context,
> > so there shouldn't be able security concerns from hosting
> > vendors with running user provided bios. Mostly its a matter
> > of confidence that the interface between bios & qemu is stable
> > which feels easier than assuming varstore vs different bios is
> > portable.
> 
> That sounds sensible, and further reinforces the idea that we
> need way more than a single string baked into the qcow2 file.

I don't think anyone said we want a single string.
What was proposed is a set of key value pairs with
values being binary blobs.
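
To illustrate what "key value pairs with binary blobs" could look like
on disk, here is a minimal Python sketch. The layout (a 16-bit key
length and a 32-bit value length, both big-endian, followed by the key
and the value) and the key names are assumptions made up for this
example; nothing like this is defined for qcow2 today.

```python
import struct

# Hypothetical record header: key length (u16) + value length (u32),
# big-endian with no padding.
_HEADER = struct.Struct(">HI")


def pack_pairs(pairs):
    """Serialize a dict of str -> bytes into one flat blob."""
    out = bytearray()
    for key, value in pairs.items():
        raw_key = key.encode("utf-8")
        out += _HEADER.pack(len(raw_key), len(value))
        out += raw_key
        out += value
    return bytes(out)


def unpack_pairs(blob):
    """Parse a blob produced by pack_pairs back into a dict."""
    pairs = {}
    offset = 0
    while offset < len(blob):
        key_len, value_len = _HEADER.unpack_from(blob, offset)
        offset += _HEADER.size
        key = blob[offset:offset + key_len].decode("utf-8")
        offset += key_len
        pairs[key] = blob[offset:offset + value_len]
        offset += value_len
    return pairs


# Example: a short string and an opaque binary value side by side.
example = {"machine-type": b"q35", "nvram": bytes([0x00, 0xFF, 0x10])}
assert unpack_pairs(pack_pairs(example)) == example
```

The point is only that values are opaque bytes, so a machine type
string and an NVRAM file can sit side by side without the container
caring what they mean.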

> -- 
> Andrea Bolognani / Red Hat / Virtualization

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 17:06                                   ` Max Reitz
@ 2018-06-07 21:43                                     ` Michael S. Tsirkin
  2018-06-09 21:34                                       ` Max Reitz
  0 siblings, 1 reply; 157+ messages in thread
From: Michael S. Tsirkin @ 2018-06-07 21:43 UTC (permalink / raw)
  To: Max Reitz
  Cc: Dr. David Alan Gilbert, Kevin Wolf, qemu-block, armbru,
	qemu-devel, Richard W.M. Jones, stefanha

On Wed, Jun 06, 2018 at 07:06:27PM +0200, Max Reitz wrote:
> On 2018-06-06 17:09, Michael S. Tsirkin wrote:
> > On Wed, Jun 06, 2018 at 04:51:39PM +0200, Max Reitz wrote:
> >> On 2018-06-06 16:31, Dr. David Alan Gilbert wrote:
> >>> * Max Reitz (mreitz@redhat.com) wrote:
> >>>> On 2018-06-06 14:00, Dr. David Alan Gilbert wrote:
> >>>>> * Max Reitz (mreitz@redhat.com) wrote:
> >>>>>> On 2018-06-06 13:14, Dr. David Alan Gilbert wrote:
> >>>>>>> * Max Reitz (mreitz@redhat.com) wrote:
> >>>>>>>> On 2018-06-05 11:21, Dr. David Alan Gilbert wrote:
> >>>>>>>>> <reawakening a fizzled out thread>
> >>>
> >>> <snip>
> >>>
> >>>>>>> The problem with having a separate file is that you either have to copy
> >>>>>>> it around with the image 
> >>>>>>
> >>>>>> Which is just an inconvenience.
> >>>>>
> >>>>> It's more than that;  if it's a separate file then the tools can't
> >>>>> rely on users supplying it, and frankly they won't and they'll still
> >>>>> just supply an image.
> >>>>
> >>>> At which point you throw an error and tell them to specify the config file.
> >>>
> >>> No:
> >>>    a) At the moment they get away with it for images since they're all
> >>>       'pc' and the management layers do the right thing.
> >>
> >> So so far nobody has complained?  I don't really see the problem then.
> >>
> >> If deploying a disk and using all the defaults works out for users,
> >> great.  If they want more options, apparently they already know they
> >> have to provide some config.
> > 
> > QEMU's usability is terrible. There are tons of tools out there to try
> > to tame it, but of course they lack the knowledge of the VM internals
> > that QEMU has.
> 
> Er, yeah, OK.  But it was my understanding that we decided that we have
> a management layer on top of qemu to make things simple.

Who's we? I don't think the QEMU community completely gave up on people
using QEMU directly. It will need to be much more user-friendly than it
is right now. But it's possible. Fabrice built an emulator in
JavaScript: you go to a URL and, bam, it runs a VM.

> Also, this is once more a case of first deciding what we want at all.

Who's we here again? Different people want different things. Enough
people seem to want to store tagged data with a disk image that it might
be worth someone's while to try to add that capability for starters to
qemu-img.

> Dave wants configuration options for the upper management layer which
> are completely opaque to qemu.  That has nothing to do whatsoever with
> the usability of qemu itself.

That's why I keep saying, let's start with implementing a mechanism,
worry about policy later if at all.

> >>>    b) They'll give the wrong config file - then you'd need to add a flag
> >>>      to detect that - which means you'd need to add something to the
> >>>      qcow to match it to the config; loop back to teh start!
> >>
> >> I'm not sure how seriously I should take this argument.  Do stupid
> >> things, win stupid prizes.
> >>
> >> If that's the issue, add a UUID to qcow2 files and reference it from the
> >> config file.
> >>
> >>> We should make this EASY for users.
> >>
> >> To me, having a simple config file they can edit manually certainly
> >> seems simpler than having to use specific tools to edit it inside of the
> >> qcow2 file.
> > 
> > I think you are one of the happy users familiar with qemu intricacies
> > and/or using a tool on top that does it for you.
> 
> Yeah, virt-manager and sometimes libvirt directly.  Works nicely.  In
> any case, having to manage more than a single file was never one of my
> worries.  In fact, I never had to manage any file because both tools do
> it for me.
> 
> And again, I don't know what the usability of qemu has to do with what
> Dave is proposing.
> 
> [...]

I think what we are seeing here is many people jumping on the
bandwagon and finding more and more uses for the ability to store
meta-data in the qcow2 file.

This just means we should make it flexible enough to possibly
support more uses. It does not mean we need to make it
read mail on day 1.

> >> Because I think (maybe I'm wrong, though) where to store it heavily
> >> depends on what we want to store and how we want to use it.
> > 
> > I don't really see why.
> 
> For instance, supporting full-blown appliances would mean supporting
> multiple images.  Maybe in multiple formats.  Maybe the user wants
> runtime performance and is willing to give up a bit of installation time
> for that (e.g. for unpacking an archive).
> 
> In any case, if we want to be able to configure every kind of VM, tying
> everything to qcow2 seems like a bad idea.  First defining a format and
> then deciding on whether it makes sense to be able to put it into qcow2
> for certain subcases seems much more reasonable.
> 
> And if you make the format decidedly qcow2-independent, the whole
> "putting it into qcow2 is the simplest implementation" argument becomes
> rather weak.

I don't see why. Yes I think it's a separate format that we should just
allow storing in qcow2 for usability.


> >>> I've not seen anything that's not for either:
> >>>   a) The user to know what the image is
> >>
> >> I thought the use case was they just downloaded it.
> >>
> >> Otherwise, they should manage their filenames reasonably, come on.
> >> Seriously, adding a cute picture because users are too stupid to manage
> >> their VMs is *not* qcow2's problem.
> > 
> > QEMU is hard to use right and it is QEMU's problem. Users aren't stupid
> > but neither do they have the time to learn internals of the tools they
> > use.
> 
> Technically, it's the users' problem.
>  It may be qemu's fault, though.

I find solving problems interesting. I don't find assigning blame
interesting.

> 
> I will not say it is qemu's fault, because I was always told we have a
> management layer to make things simple again.  "qemu worries about
> execution, management layer worries about policy" is what I was told.
> 
> Also, I have no idea what you are talking about.  I gave a very specific
> example.  How is adding a picture to a VM disk image going to help
> anyone?  If that's the issue people are facing, I would argue they
> probably have a multitude of different issues with using qemu, because I
> fully agree with you on that point -- using qemu for complex cases is
> hard.  Well, no, it's simple, really, but then you probably won't get
> the best out of it.  (As can be seen by the fact that some people seem
> to start their VM just based on a disk image, and that seems to work...)
> 
> So, using qemu in the best way possible is hard.  But a pictogram in a
> disk image will not solve that problem.  I was always told that using a
> management layer solves the problem.  And as I understood, this was what
> Dave's proposal was about, the management layer, not qemu.
> 
> I would expect from the management layer to at least make managing VMs
> easy.  The management layer can give names.  It can present pictures.
> It can manage files.  It can export a config file + disk image so that
> it can be imported somewhere else.
> 
> Therefore, I don't know what you mean by "learn internals of the tools
> they use".  They don't need to do that, if they use a management layer.
> All they need to do is to supply everything the management layer may ask
> of them, and I do not understand why it is too difficult to request a
> plain config file that the user doesn't even need to understand.  They
> just need to download it along with the disk image.
> 
> 
> But all of that writing once again comes down to this: You are talking
> about qemu.  Dave is talking about something higher in the management
> layer.  Those are different things, and as I said, we first need to find
> common ground there.

The common ground is that both Dave and I find it useful to store meta-data
in the disk image.

> This is exactly why I said "where to store it heavily depends on what we
> want to store and how we want to use it."  As long as we don't know
> that, all of us are using strawman arguments where some other party
> suddenly chimes in and says "no, no, no, this is not what I'm talking
> about".  Yes, maybe you aren't, but someone else is.
> 
> [...]

Looks like discussion has run its course.

I think it's time for someone motivated enough to send a patch.
If enough interested people ack it, we will know it addresses
some of their needs.


> >>>> But really, if you create a VM, you need a configuration.  Like if you
> >>>> set up a new computer, you need to know what you want.  Usually there is
> >>>> no sticky label, but you just have to know and input it manually.  Maybe
> >>>> you have a sheet of paper, which I'd call the configuration file.
> >>>
> >>> Most things are figurable-out by the management tools/defaults or
> >>> are dependent on the whim of the user - we're only trying to stop the
> >>> user doing things that wont work.
> >>
> >> But what's so bad about an empty screen because the user hasn't read the
> >> download description?
> > 
> > Because user just learns to avoid QEMU as being too hard in the future.
> 
> So you want appliances, do I understand that correctly?  Because that is
> exactly what Dave doesn't want.

That's policy. I see no need to prevent people from building appliances,
though right now I'm not interested in building them myself.
If there's a mechanism both kinds of people can use, then great.

> Furthermore, another case of "qemu is too hard to use".  I will not
> argue against you there, because that may very well be true, but I will
> once again say that I was of the impression that we had management
> layers to handle that complexity.
> 
> >>> Simpler example; what stops you trying to put the PPC qcow image into
> >>> your x86 VM system - nothing that I know of.  I just want to stop the
> >>> users shooting themselves in the foot.
> >>
> >> They haven't shot themselves in the foot, they've just wasted a bit of
> >> their time, which could've been avoided by reading before clicking.
> >>
> >> [...]
> > 
> > Software developers are being paid for saving people's time.
> 
> Very good point, but I did say something like this before: I do not
> oppose appliances whatsoever.  In fact, it seems like a nice thing to have.
> 
> But, here's the deal: I do not think putting that data into qcow2 to be
> the best solution.  Furthermore, I have things to do that I consider
> more important than developing an appliance solution.  Therefore, it's
> not like I'm sitting around doing nothing when I could be developing a
> solution to this issue here.
> 
> I kept saying that I consider all of this an inconvenience.  Yes, it
> would be nice to have.  But I have things on my to do list that are hard
> feature requests, things that people really do need.  We all have.  We
> all need to decide how we can use our own time as efficiently as
> possible.  And I do not think that developing an appliance solution
> would be the best use of my time.  (Until my manager disagrees.)

As long as you don't start sending nacks on the basis that it's also not
the best use of others' time, I don't mind.

> >>>>>>>>> --------------------------------------------------------------
> >>>>>>>>>    
> >>>>>>>>>
> >>>>>>>>> Some reasoning:
> >>>>>>>>>    a) I've avoided the problem of when QEMU interprets the value
> >>>>>>>>>       by ignoring it and giving it to management layers at the point
> >>>>>>>>>       of VM import.
> >>>>>>>>
> >>>>>>>> Yes, but in the process you've made it completely opaque to qemu,
> >>>>>>>> basically, which doesn't really make it better for me.  Not that
> >>>>>>>> qemu-specific information in qcow2 files would be what I want, but, well.
> >>>>>>>>
> >>>>>>>> But it does solve technical issues, I concede that.
> >>>>>>>>
> >>>>>>>>>    b) I hate JSON, but there again nailing down a fixed format
> >>>>>>>>>       seems easiest and it makes the job of QCOW easy - a single
> >>>>>>>>>       string.
> >>>>>>>>
> >>>>>>>> Not really.  The string can be rather long, so you probably don't want
> >>>>>>>> to store it in the image header, and thus it's just a binary blob from
> >>>>>>>> qcow2's perspective, essentially.
> >>>>>>>
> >>>>>>> Yes, but it's a single blob - I'm not asking for multiple keyed blobs
> >>>>>>> or the ability to update individual blobs; just one blob that I can
> >>>>>>> replace.
> >>>>>>
> >>>>>> OK, you aren't, but others seem to be.
> >>>>>>
> >>>>>> Or, well, you call it a single blob.  But actually the current ideas
> >>>>>> seem to be to store a rather large configuration tree with binary data
> >>>>>> in that blob, so to me personally there is absolutely no functional
> >>>>>> difference to just storing a tar file in that blob.
> >>>>>>
> >>>>>> So correct me if I'm wrong, but to me it appears that you effectively
> >>>>>> want to store a filesystem in qcow2.[1]  Well, that's better than making
> >>>>>> qcow2 the filesystem, but it still appears just the wrong way around to me.
> >>>>>
> >>>>> It's different in the sense that what we end up with is still a qcow2;
> >>>>> anything that just handles qcow2's and can pass them through doesn't
> >>>>> need to do anything different; users don't need to do anything
> >>>>> different.  No one has to pack/unpack the file.
> >>>>
> >>>> Packing/unpacking is a strawman because I'm doing my best to give
> >>>> proposals that completely avoid that.
> >>>>
> >>>> Users do need to do something different, because users do need to
> >>>> realize that today there is no way to store VM configuration and disk
> >>>> data in a single file.  So if they already start VMs just based on a
> >>>> disk, then they are assuming behavior we do not have and that I'd call
> >>>> naive.  But that is a strawman from my side, sorry.  Keeping naive users
> >>>> happy is probably OK.
> >>>
> >>> Remember this all works fine now and has done for many years;
> >>> it's the addition of q35 that breaks that assumption.
> >>> The users can already blidly pick up the qcow2 image and stuff it in
> >>
> >> Which probably was blind luck already.  And if it wasn't, that means
> >> they knew the defaults are what they want.  So now they'd know they
> >> aren't and they have to offer a config file along with the disk image.
> >>
> >>> and it all works; all I want is for that to keep working.
> >>
> >> And all I say is that it's not unreasonable to expect users to realize
> >> that a VM is more than a disk image, just like a computer is more than a
> >> disk drive; and that handling two files really is not the end of the world.
> >>
> >> (And neither is wasting someone's time because they can't read.)
> >>
> >> Firstly, I agree it's a nice thing to have, but it's not worth it if we
> >> don't come up with clear rules on how to prevent developing a full
> >> appliance format.
> >>
> >> Or maybe we want that (because I still believe that you can always come
> >> up with obscure options without which the VM won't boot in your specific
> >> case), but then this is beyond just storing a tiny bit of data in a
> >> qcow2 image.
> >>
> >> [...]
> > 
> > Either we'll add more and more data later or we won't. Why worry about
> > it from the start? We'll never get anywhere if we do.
> 
> That is not a very good argument.  Adding things always means having to
> support them later.  It does make a lot of sense to worry about this
> burden before starting, and thus trying to find the best possible
> solution for the future, not the easiest hack for now.
> 
> And as I've said multiple times now, but I can't repeat myself often
> enough, I think it would be most efficient if we worried about what we
> want to store first, before we worry about where to store it.  I believe
> that once we have a hard requirement on what we want to store and how to
> use it (that most people agree on), we will have a set of constraints on
> how we can represent that data and where it needs to be stored, and this
> will give us a simple yes or no to the question whether the data needs
> to be stored in qcow2, or whether there is any better way (or whether it
> can be stored in qcow2, but need not be).

Well the subject says it, does it not? We want to store
machine data there.
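
As a rough illustration of how a management tool might consume such
data, here is a small Python sketch using the two keys suggested in the
quoted proposal ("qemu.machine-types" and "qemu.min-ram-MB"). The
function name, its parameters and the warning texts are hypothetical;
this is a sketch of the idea, not an existing QEMU or libvirt
interface.

```python
import json


def check_image_hints(hints_json, host_machine_types, host_ram_mb):
    """Return a list of human-readable problems; empty means OK.

    hints_json         -- JSON text read from the image (hypothetical)
    host_machine_types -- machine types the local QEMU offers
    host_ram_mb        -- RAM the user plans to give the VM, in MB
    """
    hints = json.loads(hints_json)
    problems = []

    wanted = hints.get("qemu.machine-types")
    if wanted is not None and not set(wanted) & set(host_machine_types):
        problems.append("image wants one of %s, host offers %s"
                        % (wanted, list(host_machine_types)))

    min_ram = hints.get("qemu.min-ram-MB")
    if min_ram is not None and host_ram_mb < min_ram:
        problems.append("image wants at least %d MB of RAM, got %d"
                        % (min_ram, host_ram_mb))

    return problems
```

Unknown keys are ignored and missing keys are not an error, so an image
without any hints behaves exactly as it does today.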


> >>>>>> [1] Yes, I know that the guest disk already contains an FS. :-P
> >>>>>>
> >>>>>>>>>       (I would suggest in layer2 that the keys are sorted, but
> >>>>>>>>>       that's a pain to do in some json creators)
> >>>>>>>>>    c) Forcing the registry of keys might avoid silly duplication.
> >>>>>>>>>       We can but hope.
> >>>>>>>>>    d) I've not said it's a libvirt XML file since that seems
> >>>>>>>>>       a bit prescriptive.
> >>>>>>>>>
> >>>>>>>>> Some initial suggested keys:
> >>>>>>>>>
> >>>>>>>>>    "qemu.machine-types": [ "q35", "i440fx" ]
> >>>>>>>>>    "qemu.min-ram-MB": 1024
> >>>>>>>>
> >>>>>>>> I still don't understand why you'd want to put the configuration into
> >>>>>>>> qcow2 instead of the other way around.
> >>>>>>>>
> >>>>>>>> Or why you'd want to use a single file at all, because as this whole
> >>>>>>>> thread shows, a disk image alone is clearly not sufficient to describe a VM.
> >>>>>>>>
> >>>>>>>> (Or it may be in simple cases, but then that's because you don't need
> >>>>>>>> any configuration.)
> >>>>>>>
> >>>>>>> Because it avoids the unpacking associated with archives.
> >>>>>>
> >>>>>> I'm not talking about unpacking.  I'm talking about a potentially new
> >>>>>> format which allows accessing the qcow2 file in-place.  It would
> >>>>>> probably be trivial to write a block driver to allow this.
> >>>>>>
> >>>>>> (And as I wrote in my response to Michal, I suspect that tar could
> >>>>>> actually allow this, even though it would probably not be the ideal format.)
> >>>>>
> >>>>> As above, I don't think this is trivial; you have to change all the
> >>>>> layers;  lets say it was a tar; you'd have to somehow know that you're
> >>>>> importing one of these special tars,
> >>>>
> >>>> Which is trivial because it's just "Hey, look, it's a tar with that
> >>>> description file".
> >>>
> >>> Trivial? It's taking 100+ mails to add a tag to a qcow2 file! Can you
> >>> imagine what it takes to change libvirt, openstack, ovirt and the rest?
> >>
> >> :-)
> >>
> >> The implementation is trivial is what I meant, just like the
> >> implementation would be rather simple for qcow2 to store a binary blob
> >> and completely ignore it.
> > 
> > Old QEMU can't handle tar files. You need to unpack them,
> > then figure out that there are two files in the tar, one
> > is just for new qemu versions, one is portable. At which point
> > you need to go figure out what is your QEMU version.
> 
> And old qemu versions will just give you a blank screen for a qcow2 file
> with required non-default options.
> 
> Max

Compatibility is not worthless simply because we do not have time travel.

-- 
MST

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-07 19:24                                               ` Laszlo Ersek
@ 2018-06-08  8:21                                                 ` Dr. David Alan Gilbert
  2018-06-08  8:41                                                   ` Daniel P. Berrangé
  0 siblings, 1 reply; 157+ messages in thread
From: Dr. David Alan Gilbert @ 2018-06-08  8:21 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: Andrea Bolognani, Daniel P. Berrangé,
	Richard W.M. Jones, Kevin Wolf, qemu-block, Michael S. Tsirkin,
	armbru, qemu-devel, stefanha, Max Reitz

* Laszlo Ersek (lersek@redhat.com) wrote:
> On 06/07/18 12:54, Andrea Bolognani wrote:
> > On Thu, 2018-06-07 at 11:36 +0100, Daniel P. Berrangé wrote:
> >> On Thu, Jun 07, 2018 at 11:32:18AM +0100, Richard W.M. Jones wrote:
> >>> Another problem which Laszlo mentioned is the varstore isn't portable
> >>> between UEFI implementations, or if the UEFI is compiled with
> >>> different options.  You can even imagine shipping multiple
> >>> varstores(!) which argues for a tar-like format.
> >>
> >> Could we perhaps imagine shipping the actual UEFI bios, rather
> >> than only the varstore.  The bios blob runs in guest context,
> >> so there shouldn't be able security concerns from hosting
> >> vendors with running user provided bios. Mostly its a matter
> >> of confidence that the interface between bios & qemu is stable
> >> which feels easier than assuming varstore vs different bios is
> >> portable.
> > 
> > That sounds sensible, and further reinforces the idea that we
> > need way more than a single string baked into the qcow2 file.
> > 
> 
> Sorry for arriving late (thanks Rich for the Fwd).
> 
> The contents of the non-volatile UEFI variables should be considered
> part of (permanent) guest state, such as disk contents. Therefore I'd
> argue for bundling the varstore file with the disk image(s).
> 
> In turn, the best way to ensure comaptibility between varstore and
> firmware binary is to just bundle the firmware binary as well. It's
> generally not large (x86) or if it is, it compresses extremely well
> (aarch64). For extra politeness, image providers can bundle a text file
> with their firmware build options (like a kernel config), possibly even
> a JSON document conforming to the new firmware schema (qemu commit
> 3a0adfc9bfcf), but that's not a hard requirement I guess.
> 
> If such a VM is to be migrated between hosts, I'd expect the host admin
> to take care of installing the fw binary on all eligible hosts.

There's no way they can do that if they're just importing VMs from
templates that include the image; who is going to keep track of which
BIOSs are needed where?

Dave

> Regarding compat between QEMU and firmware binary, I see three cases:
> 
> (1) Static requirements presented by the firmware for the QEMU
> configuration. (Such as -D SMM_REQUIRE.) With the domain configuration
> captured one way or another alongside the disk image anyway, this should
> not be a problem.
> 
> (2) New firmware launched on old QEMU. The firmware generally detects or
> negotiates features with QEMU, so this should be safe.
> 
> (Discounting firmware regressions, of course -- for example, search
> <https://www.mail-archive.com/qemu-devel@nongnu.org/msg471901.html> for
> the string "I messed up".)
> 
> (3) Old firmware launched on new QEMU. This scenario has given us a lot
> more grief than (2), but I think for the appliance distribution use
> case, it can be folded into case (1) above -- specify the machine type
> too in the domain config, and that should be compatible with the old
> firmware.
> 
> (The handling of (3) is not uniform between upstream QEMU and various
> downstreams. For example, consider
> <https://bugs.launchpad.net/qemu/+bug/1715700>. This was a latent bug in
> OVMF that got exposed by a new QEMU (due to a valid QEMU change), even
> when using old machine types. The upstream solution was to fix edk2 and
> stick with QEMU as-was (although the agreement around that hadn't been
> universal). Conversely, one downstream solution was to restrict the
> otherwise valid QEMU change to new machine types
> <https://bugzilla.redhat.com/show_bug.cgi?id=1489800#c5>.)
> 
> 
> All in all I agree with Daniel's proposal; it seems to be the most
> robust one.
> 
> And, I too recall that, under AMD SEV, users will be supposed to, or
> allowed to, provide their own firmware binaries.
> 
> Thanks!
> Laszlo
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-08  8:21                                                 ` Dr. David Alan Gilbert
@ 2018-06-08  8:41                                                   ` Daniel P. Berrangé
  2018-06-08  8:53                                                     ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 157+ messages in thread
From: Daniel P. Berrangé @ 2018-06-08  8:41 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Laszlo Ersek, Andrea Bolognani, Richard W.M. Jones, Kevin Wolf,
	qemu-block, Michael S. Tsirkin, armbru, qemu-devel, stefanha,
	Max Reitz

On Fri, Jun 08, 2018 at 09:21:30AM +0100, Dr. David Alan Gilbert wrote:
> * Laszlo Ersek (lersek@redhat.com) wrote:
> > On 06/07/18 12:54, Andrea Bolognani wrote:
> > > On Thu, 2018-06-07 at 11:36 +0100, Daniel P. Berrangé wrote:
> > >> On Thu, Jun 07, 2018 at 11:32:18AM +0100, Richard W.M. Jones wrote:
> > >>> Another problem which Laszlo mentioned is the varstore isn't portable
> > >>> between UEFI implementations, or if the UEFI is compiled with
> > >>> different options.  You can even imagine shipping multiple
> > >>> varstores(!) which argues for a tar-like format.
> > >>
> > >> Could we perhaps imagine shipping the actual UEFI bios, rather
> > >> than only the varstore.  The bios blob runs in guest context,
> > >> so there shouldn't be able security concerns from hosting
> > >> vendors with running user provided bios. Mostly its a matter
> > >> of confidence that the interface between bios & qemu is stable
> > >> which feels easier than assuming varstore vs different bios is
> > >> portable.
> > > 
> > > That sounds sensible, and further reinforces the idea that we
> > > need way more than a single string baked into the qcow2 file.
> > > 
> > 
> > Sorry for arriving late (thanks Rich for the Fwd).
> > 
> > The contents of the non-volatile UEFI variables should be considered
> > part of (permanent) guest state, such as disk contents. Therefore I'd
> > argue for bundling the varstore file with the disk image(s).
> > 
> > In turn, the best way to ensure compatibility between varstore and
> > firmware binary is to just bundle the firmware binary as well. It's
> > generally not large (x86) or if it is, it compresses extremely well
> > (aarch64). For extra politeness, image providers can bundle a text file
> > with their firmware build options (like a kernel config), possibly even
> > a JSON document conforming to the new firmware schema (qemu commit
> > 3a0adfc9bfcf), but that's not a hard requirement I guess.
> > 
> > If such a VM is to be migrated between hosts, I'd expect the host admin
> > to take care of installing the fw binary on all eligible hosts.
> 
> There's no way they can do that if they're just importing VMs from
> templates that include the image; who is going to keep track of which
> BIOSs are needed where?

It isn't that unusual a requirement. When OpenStack deploys a VM, it
has the user-provided image as a base file, and then creates a qcow2
overlay.  If the VM is cold migrated (i.e. not running) to another
host, OpenStack has to make sure the same base file gets copied across
to the new host so that the overlay still works. Copying the BIOS file
and vars state across at the same time is no more difficult than what
it's already doing.
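
As a rough illustration only (not OpenStack code), the copy set for a
cold migration could be sketched as below. The function name and its
parameters are hypothetical; the only point is that the firmware binary
and the varstore join the base file and overlay in the same list.

```python
from pathlib import Path


def files_to_copy(overlay, base, firmware=None, varstore=None):
    """Return the files that must exist on the destination host.

    Hypothetical helper: the base image comes first because the
    overlay is unusable without it; firmware and varstore are added
    only when the guest actually has them.
    """
    needed = [Path(base), Path(overlay)]
    needed += [Path(p) for p in (firmware, varstore) if p is not None]
    return needed
```

A guest without bundled firmware yields just the base file and the
overlay, which matches the existing behaviour described above.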


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-08  8:41                                                   ` Daniel P. Berrangé
@ 2018-06-08  8:53                                                     ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 157+ messages in thread
From: Dr. David Alan Gilbert @ 2018-06-08  8:53 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Laszlo Ersek, Andrea Bolognani, Richard W.M. Jones, Kevin Wolf,
	qemu-block, Michael S. Tsirkin, armbru, qemu-devel, stefanha,
	Max Reitz

* Daniel P. Berrangé (berrange@redhat.com) wrote:
> On Fri, Jun 08, 2018 at 09:21:30AM +0100, Dr. David Alan Gilbert wrote:
> > * Laszlo Ersek (lersek@redhat.com) wrote:
> > > On 06/07/18 12:54, Andrea Bolognani wrote:
> > > > On Thu, 2018-06-07 at 11:36 +0100, Daniel P. Berrangé wrote:
> > > >> On Thu, Jun 07, 2018 at 11:32:18AM +0100, Richard W.M. Jones wrote:
> > > >>> Another problem which Laszlo mentioned is the varstore isn't portable
> > > >>> between UEFI implementations, or if the UEFI is compiled with
> > > >>> different options.  You can even imagine shipping multiple
> > > >>> varstores(!) which argues for a tar-like format.
> > > >>
> > > >> Could we perhaps imagine shipping the actual UEFI bios, rather
> > > >> than only the varstore.  The bios blob runs in guest context,
> > > >> so there shouldn't be able security concerns from hosting
> > > >> vendors with running user provided bios. Mostly its a matter
> > > >> of confidence that the interface between bios & qemu is stable
> > > >> which feels easier than assuming varstore vs different bios is
> > > >> portable.
> > > > 
> > > > That sounds sensible, and further reinforces the idea that we
> > > > need way more than a single string baked into the qcow2 file.
> > > > 
> > > 
> > > Sorry for arriving late (thanks Rich for the Fwd).
> > > 
> > > The contents of the non-volatile UEFI variables should be considered
> > > part of (permanent) guest state, such as disk contents. Therefore I'd
> > > argue for bundling the varstore file with the disk image(s).
> > > 
> > > In turn, the best way to ensure compatibility between varstore and
> > > firmware binary is to just bundle the firmware binary as well. It's
> > > generally not large (x86) or if it is, it compresses extremely well
> > > (aarch64). For extra politeness, image providers can bundle a text file
> > > with their firmware build options (like a kernel config), possibly even
> > > a JSON document conforming to the new firmware schema (qemu commit
> > > 3a0adfc9bfcf), but that's not a hard requirement I guess.
> > > 
> > > If such a VM is to be migrated between hosts, I'd expect the host admin
> > > to take care of installing the fw binary on all eligible hosts.
> > 
> > There's no way they can do that if they're just importing VMs from
> > templates that include the image; who is going to keep track of which
> > BIOSs are needed where?
> 
> It isn't that unusual a requirement. When Openstack deploys a VM, it
> has the user provided image as a base file, and then creates  qcow2
> overlay.  If the VM is cold migrated (ie not running) to another
> host, OpenStack has to make sure the same base file gets copied across
> to the new host so that the overlay still works. Copying the BIOS file
> and vars state across at the same time is no more difficult than what
> its already doing.

I'm kind of OK with management layers doing it; but Laszlo was
suggesting it was an admin's problem;  if we can make it something
manageable by higher levels that's OK.
(Although I'm still concerned that making images that bundle a UEFI
image and are still portable is not going to work.)

Dave

> Regards,
> Daniel
> -- 
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-07 21:43                                     ` Michael S. Tsirkin
@ 2018-06-09 21:34                                       ` Max Reitz
  2018-06-11  2:06                                         ` Michael S. Tsirkin
  0 siblings, 1 reply; 157+ messages in thread
From: Max Reitz @ 2018-06-09 21:34 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Dr. David Alan Gilbert, Kevin Wolf, qemu-block, armbru,
	qemu-devel, Richard W.M. Jones, stefanha

[-- Attachment #1: Type: text/plain, Size: 14922 bytes --]

On 2018-06-07 23:43, Michael S. Tsirkin wrote:
> On Wed, Jun 06, 2018 at 07:06:27PM +0200, Max Reitz wrote:

[...]

>> Er, yeah, OK.  But it was my understanding that we decided that we have
>> a management layer on top of qemu to make things simple.
> 
> Who's we?

Everyone I'm usually talking to when it comes to adding features to the
block layer, be it on the mailing list or at KVM Forum.

Also my own judgment, when I look at how extensively the block layer
uses QMP, which by definition is not meant to be used by humans.

So this is strongly biased by my immediate environment.

>           I don't think the QEMU community completely gave up on people
> using QEMU directly. It will need to be much more user-friendly than it
> is right now.

You are the very first person I hear this from.  Don't take this as
disagreement, it's just that most of the energy recently seems to go into
making qemu interoperate more easily with management layers than with
end users.  Those goals are not necessarily mutually exclusive, but one
thing these changes did was to make "modern" qemu configuration rather
verbose, which is not what I'd call user-friendly.

>               But it's possible. Fabrice built an emulator in
> javascript, you go to a URL bam it runs a VM.

So?

You can still give qemu an image directly and bam it runs a VM.  It's
just that you get an i440 with an IDE HDD, probably like the JS qemu.

The whole point of this thread was that you might want to run q35, and I
don't know, maybe virtio-blk/scsi.  So exactly what you don't get from
the JS VM.

qemu would be very easy to use if it didn't offer any configuration
options.  The problem is that it offers a huge load of configuration
options and it is not reasonable to expect every user to know all of them.

One way to solve this is to add a management layer which knows all of
the options and may choose good defaults based on knowledge that the
qemu process itself doesn't have (i.e. "put implementation in qemu, and
choose the policy in the management layer").  This is at least what the
block layer usually chooses to do.

>> Also, this is once more a case of first deciding what we want at all.
> 
> Who's we here again?

The participants of this discussion.  Maybe I should have said "you"
because it doesn't necessarily concern myself.

OTOH, we don't have an appliance solution so far (as far as I'm aware),
so I suppose it concerns all of the qemu community, and probably
everyone working somewhere up any qemu management stack as well.

>                      Different people want different things. Enough
> people seem to want to store tagged data with a disk image that it might
> be worth someone's while to try to add that capability for starters to
> qemu-img.

I disagree, and I will explain why below. [1]

>> Dave wants configuration options for the upper management layer which
>> are completely opaque to qemu.  That has nothing to do whatsoever with
>> the usability of qemu itself.
> 
> That's why I keep saying, let's start with implementing a mechanism,
> worry about policy later if at all.

[1] I wholeheartedly disagree.

Say you have this mechanism now.  Whatever mechanism it is, because
there isn't even any consensus on that (you'd like key-value binary
object storage, Dave just wants a simple string with some key-value
format that hasn't been defined in detail, although he has given some
proposals).

Then you need to implement something on top of it, store some values,
interpret them somehow.  And this is exactly what I am asking right now.
 What do you want to store?  Where do you want to interpret it?

I do not understand why you think it's harder to decide that now than
after you have extended qcow2.

I absolutely do think designing the qcow2 extension at least becomes
simpler if you do that design now.


There is a case to be made that one shouldn't worry too much about the
future.  Sometimes you just gotta start somewhere and see where it
leads, maybe the initial design was crap, too bad, then you need to
start over.  But at least you got something done.

But we have talked for half a week now.  In my very personal opinion we
definitely haven't reached the point yet where we just need to start
with something.  Appliances would be a big thing, no need to rush it.

[...]

> I think what we are seeing here is many people jumping on the
> bandwagon and finding more and more uses for ability to store
> meta-data in the qcow2 file.

I don't see that here, but it may very well be true that other people
may find it useful even for other purposes than appliances.

> This just means we should make it flexible enough to possibly
> support more uses. It does not mean we need to make it
> read mail on day 1.

So you are saying that we may end up with multiple parties storing
(meta-)data independently in the qcow2 file?

That would be an argument on why we'd want opaque metadata storage
before there are concrete design documents, and on why it doesn't matter
that you focus on qemu whereas others focus on the management layer.

Though it would mean opaque storage, as I've said, and that storing data
in the qcow2 file still probably does not make sense for some of the
possible use cases.  For instance, you propose storing data for qemu
proper without any management layer, but this poses the question again
of who should interpret that data.  (A management layer may just query
the image before launching qemu and then set the appropriate options,
but it gets difficult for the block layer to open the qcow2 image and
change the machine type when qemu is already running.  Though maybe you
could at least error out when incompatible options have been used to
launch qemu.  Hm.  Another question was who'd be supposed to store the
data.)
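
To make the "management layer queries the image first" case concrete,
here is a minimal sketch. It assumes the metadata has already been read
from the image by some means and is handed in as a dict; the key name
"machine-type" and the function itself are hypothetical. Only the
-machine and -drive options and the ",," comma escaping are existing
qemu command line behaviour.

```python
def build_qemu_argv(image_path, metadata, binary="qemu-system-x86_64"):
    """Build a qemu command line from image metadata (illustration).

    `metadata` is a plain dict of strings; how it was extracted from
    the image is deliberately left out, since no such mechanism has
    been agreed on.
    """
    argv = [binary]
    machine = metadata.get("machine-type")  # hypothetical key name
    if machine:
        argv += ["-machine", machine]
    # Commas inside an option value must be doubled for qemu.
    escaped = image_path.replace(",", ",,")
    argv += ["-drive", f"file={escaped},format=qcow2"]
    return argv
```

The policy (which keys to honour, what to do on conflict) stays in the
caller, which is the split between implementation and policy described
earlier in this mail.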

[...]

>> And if you make the format decidedly qcow2-independent, the whole
>> "putting it into qcow2 is the simplest implementation" argument becomes
>> rather weak.
> 
> I don't see why. Yes I think it's a separate format that we should just
> allow storing in qcow2 for usability.

It becomes weak because storing it in qcow2 would no longer be the
simplest implementation if you'd need to be able to read it from a file
outside of a qcow2 image anyway.

It still may be the easiest use case for users.

[...]

>> But all of that writing once again comes down to this: You are talking
>> about qemu.  Dave is talking about something higher in the management
>> layer.  Those are different things, and as I said, we first need to find
>> common ground there.
> 
> The common ground is that both me and Dave find it useful to store meta-data
> in the disk image.

Though it seems to me that you have very different ideas on how to store
it.  As far as I have understood, Dave just wants to store a bit of data
that might even go into the image header, whereas you'd prefer a
full-blown infrastructure for binary storage and large objects, because
someone might want that at some point.

That isn't to say I personally prefer either, it just means that those
are different and deciding on one naturally changes what to do.

There may be even other approaches, I don't know.

And my idea was that we should evaluate the different use cases for
storing arbitrary metadata in a qcow2 file, and then we'd see whether it
does or doesn't make sense to do so, for each case.

I think making qemu store something in the qcow2 file doesn't bring too
much because first, qemu couldn't really interpret that information by
itself (it could at best detect conflicting configuration, though in my
head checking qemu configuration in the qcow2 driver screams complexity;
and storing that information automatically would be problematic); and
secondly, I don't think that we can solve the complexity that is modern
qemu configuration by just putting it into a qcow2 file.  That will only
work as long as the user doesn't want to change anything.  There is
probably much more to say, but all of that would deserve its own thread.

Regarding designing an appliance format, there has already been some
discussion, so I won't say anything about that now, except that it too
should go into its own thread so that people are aware and don't think
it's just a qcow2 issue.

>> This is exactly why I said "where to store it heavily depends on what we
>> want to store and how we want to use it."  As long as we don't know
>> that, all of us are using strawman arguments where some other party
>> suddenly chimes in and says "no, no, no, this is not what I'm talking
>> about".  Yes, maybe you aren't, but someone else is.
>>
>> [...]
> 
> Looks like discussion has run its course.
> 
> I think it's time for someone motivated enough to send a patch.
> If enough interested people ack it, we will know it addresses
> some of their needs.

OK, but don't expect me to merge it at the current state of this
discussion[2].  If you or someone else sends a patch, I may raise my
concerns (though I will probably not NACK it) and I will defer the
decision to Kevin.

[2] By that I mean that I will not merge a feature-adding patch with a
justification of "We'll find a use for it later", which is exactly what
you've said in this mail.  (Though Dave does have concrete intentions,
so if he can expand on that, give an ACK and explain why this is worth
the effort (and the effort depends on the implementation complexity),
then things will probably be different.)

[...]

>> So you want appliances, do I understand that correctly?  Because that is
>> exactly what Dave doesn't want.
> 
> That's policy. I see no need to prevent people from building appliances,

Dave does.  He explicitly raised the idea of limiting the things that
can be stored in a qcow2 file, precisely because he didn't want the
feature to become too complex.

> though right now I'm not interested in building them myself.
> We there's a mechanism both kinds of people can use, then great.

Such a flexible feature means complexity, and in my opinion a complex
feature needs a reasonable justification.  "People will find a use for
it" is not reasonable.

Although it really is not unlikely that I'm wrong and that a flexible
metadata storage does not need to be complex.

[...]

>>> Software developers are being paid for saving people's time.
>>
>> Very good point, but I did say something like this before: I do not
>> oppose appliances whatsoever.  In fact, it seems like a nice thing to have.
>>
>> But, here's the deal: I do not think putting that data into qcow2 to be
>> the best solution.  Furthermore, I have things to do that I consider
>> more important than developing an appliance solution.  Therefore, it's
>> not like I'm sitting around doing nothing when I could be developing a
>> solution to this issue here.
>>
>> I kept saying that I consider all of this an inconvenience.  Yes, it
>> would be nice to have.  But I have things on my to do list that are hard
>> feature requests, things that people really do need.  We all have.  We
>> all need to decide how we can use our own time as efficiently as
>> possible.  And I do not think that developing an appliance solution
>> would be the best use of my time.  (Until my manager disagrees.)
> 
> As long as you don't start sending nacks on the basis that it's also not
> the best use of other's time, I don't mind.

No, but I'm one of the qcow2 maintainers, so any feature added to qcow2
just is a burden to myself and may therefore cost me time.

Luckily there is another qcow2 maintainer, so I will not NACK features
based on the fact that they are just a burden.  If Kevin wants to merge
a feature that I don't deem sufficiently useful, then he can go right ahead.

[...]

>> And as I've said multiple times now, but I can't repeat myself often
>> enough, I think it would be most efficient if we worried about what we
>> want to store first, before we worry about where to store it.  I believe
>> that once we have a hard requirement on what we want to store and how to
>> use it (that most people agree on), we will have a set of constraints on
>> how we can represent that data and where it needs to be stored, and this
>> will give us a simple yes or no to the question whether the data needs
>> to be stored in qcow2, or whether there is any better way (or whether it
>> can be stored in qcow2, but need not be).
> 
> Well the subject says it, does it not? We want to store
> machine data there.

Are you sure?

Firstly, that is not sufficiently precise.  Do you want an appliance,
i.e. store everything?  Do you want to store just something and limit
everyone in what can be stored (Dave's proposal)?  That is a difference,
and that is exactly what I was asking.

Secondly, in this mail you even seemed to propose storing just any
metadata that might be related to a VM (or maybe not even that).  This
too has some (meta-?)influence on the design.

For instance, if you want to store only VM-related information, we can
document those structures in the qcow2 specification in the qemu tree
(or at least link to another document in the qemu tree).  But if you
want to be able to store any metadata that just anyone wants to store,
then that won't be possible.  (And this would have implications.  For
instance, someone might decide on storing metadata that makes the qcow2
image effectively unreadable (e.g. by storing a special backing link).
I wouldn't like that, but we couldn't do anything about it if we'd allow
storage of arbitrary metadata.)

[...]

>>> Old QEMU can't handle tar files. You need to unpack them,
>>> then figure out that there are two files in the tar, one
>>> is just for new qemu versions, one is portable. At which point
>>> you need to go figure out what is your QEMU version.
>>
>> And old qemu versions will just give you a blank screen for a qcow2 file
>> with required non-default options.
> 
> Compatibility is not worthless simply because we do not have time travel.

Dave was saying that the worst thing about the whole q35 thing is that
users download an image and have no idea why it isn't working.  Figuring
that out may take a long time, because nothing is even throwing an error
message.

If we had a new format, users couldn't even run it in qemu, so they
would quickly figure out that in order to run this VM, they need to
update their stack.

If we just add this information to qcow2, those users with outdated qemu
versions would again have to figure out why the image isn't working.

Sure, that is a minor issue that would solve itself by users slowly
upgrading their qemu, but it goes to show that compatibility may not
always be what is best.

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 157+ messages in thread

* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-09 21:34                                       ` Max Reitz
@ 2018-06-11  2:06                                         ` Michael S. Tsirkin
  2018-06-11  8:16                                           ` Michal Suchánek
  0 siblings, 1 reply; 157+ messages in thread
From: Michael S. Tsirkin @ 2018-06-11  2:06 UTC (permalink / raw)
  To: Max Reitz
  Cc: Dr. David Alan Gilbert, Kevin Wolf, qemu-block, armbru,
	qemu-devel, Richard W.M. Jones, stefanha

On Sat, Jun 09, 2018 at 11:34:03PM +0200, Max Reitz wrote:
> qemu would be very easy to use if it didn't offer any configuration
> options.  The problem is that it offers a huge load of configuration
> options and it is not reasonable to expect every user to know all of them.

Right, but once one user did find a way to make a specific guest work in
a specific VM, it should not be as hard as it is to replicate the success.

...

> > This just means we should make it flexible enough to possibly
> > support more uses. It does not mean we need to make it
> > read mail on day 1.
> 
> So you are saying that we may end up with multiple parties storing
> (meta-)data independently in the qcow2 file?

Absolutely.

> before there are concrete design documents, and on why it doesn't matter
> that you focus on qemu whereas others focus on the management layer.

Right. That's what I'm saying.

> Though it would mean opaque storage, as I've said, and that storing data
> in the qcow2 file still probably does not make sense for some of the
> possible use cases.  For instance, you propose storing data for qemu
> proper without any management layer, but this poses the question again
> of who should interpret that data.  (A management layer may just query
> the image before launching qemu and then set the appropriate options,
> but it gets difficult for the block layer to open the qcow2 image and
> change the machine type when qemu is already running.  Though maybe you
> could at least error out when incompatible options have been used to
> launch qemu.  Hm.  Another question was who'd be supposed to store the
> data.)
> 
> [...]
> 
> >> And if you make the format decidedly qcow2-independent, the whole
> >> "putting it into qcow2 is the simplest implementation" argument becomes
> >> rather weak.
> > 
> > I don't see why. Yes I think it's a separate format that we should just
> > allow storing in qcow2 for usability.
> 
> It becomes weak because storing it in qcow2 would no longer be the
> simplest implementation if you'd need to be able to read it from a file
> outside of a qcow2 image anyway.
> 
> It still may be the easiest use case for users.
> 
> [...]
> 
> >> But all of that writing once again comes down to this: You are talking
> >> about qemu.  Dave is talking about something higher in the management
> >> layer.  Those are different things, and as I said, we first need to find
> >> common ground there.
> > 
> > The common ground is that both me and Dave find it useful to store meta-data
> > in the disk image.
> 
> Though it seems to me that you have very different ideas on how to store
> it.  As far as I have understood, Dave just wants to store a bit of data
> that might even go into the image header, whereas you'd prefer a
> full-blown infrastructure for binary storage and large objects, because
> someone might want that at some point.
> 
> That isn't to say I personally prefer either, it just means that those
> are different and deciding on one naturally changes what to do.
> 
> There may be even other approaches, I don't know.

I thought I heard Dave utter "key-value store" at some point,
which likely precludes "go directly into the header".


> And my idea was that we should evaluate the different use cases for
> storing arbitrary metadata in a qcow2 file, and then we'd see whether it
> does or doesn't make sense to do so, for each case.

As block guys maybe you could ask more specific questions then.
E.g. "would 1/2K of data be sufficient for these purposes"?
That's a more valid point than a generic "tell us what it's
for" question.
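
For illustration, a size question like that could be checked with a
sketch along these lines. The "key=value" line encoding, the key names
and the 512-byte default are all assumptions made for this example; no
on-disk format has been agreed on in this thread.

```python
def encode_metadata(pairs, limit=512):
    """Encode key-value pairs as 'key=value' lines within a byte budget.

    Hypothetical encoding: keys are sorted so the output is
    deterministic, and anything that would break the line format
    is rejected rather than escaped.
    """
    lines = []
    for key, value in sorted(pairs.items()):
        if "=" in key or "\n" in key or "\n" in value:
            raise ValueError(f"unsupported characters in entry {key!r}")
        lines.append(f"{key}={value}")
    blob = "\n".join(lines).encode("utf-8")
    if len(blob) > limit:
        raise ValueError(f"metadata is {len(blob)} bytes, limit is {limit}")
    return blob
```

An architecture name and a machine type fit in a few dozen bytes under
this encoding; a firmware binary or a varstore obviously would not.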

> I think making qemu store something in the qcow2 file doesn't bring too
> much because first, qemu couldn't really interpret that information by
> itself (it could at best detect conflicting configuration, though in my
> head checking qemu configuration in the qcow2 driver screams complexity;
> and storing that information automatically would be problematic); and
> secondly, I don't think that we can solve the complexity that is modern
> qemu configuration by just putting it into a qcow2 file.  That will only
> work as long as the user doesn't want to change anything.  There is
> probably much more to say, but all of that would deserve its own thread.

But I do think we can arrange for the configuration to be worked out
only once, and not once per user of an image.

> Regarding designing an appliance format, there has already been some
> discussion, so I won't say anything about that now, except that it too
> should go into its own thread so that people are aware and don't think
> it's just a qcow2 issue.
> 
> >> This is exactly why I said "where to store it heavily depends on what we
> >> want to store and how we want to use it."  As long as we don't know
> >> that, all of us are using strawman arguments where some other party
> >> suddenly chimes in and says "no, no, no, this is not what I'm talking
> >> about".  Yes, maybe you aren't, but someone else is.
> >>
> >> [...]
> > 
> > Looks like discussion has run its course.
> > 
> > I think it's time for someone motivated enough to send a patch.
> > If enough interested people ack it, we will know it addresses
> > some of their needs.
> 
> OK, but don't expect me to merge it at the current state of this
> discussion[2].  If you or someone else sends a patch, I may raise my
> concerns (though I will probably not NACK it) and I will defer the
> decision to Kevin.
> 
> [2] By that I mean that I will not merge a feature-adding patch with a
> justification of "We'll find a use for it later", which is exactly what
> you've said in this mail.  (Though Dave does have concrete intentions,
> so if he can expand on that, give an ACK and explain why this is worth
> the effort (and the effort depends on the implementation complexity),
> then things will probably be different.)
> 
> [...]

I'd expect a first patch to have the ability to store and retrieve the
machine type used. Hopefully also for qemu to check and warn if
it does not match the specified one, and probably to use it
as the default if nothing was specified.
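
The behaviour described above could be sketched roughly as follows.
This is an illustration of the policy only, not qemu code; the function
name and the wording of the warning are made up for the example.

```python
def resolve_machine_type(stored, requested):
    """Pick a machine type from the image's value and the user's.

    Returns (machine_type, warning). An explicit request always wins;
    the stored value is used only as a default, and a mismatch yields
    a warning instead of an error.
    """
    if requested is None:
        return stored, None
    if stored is not None and stored != requested:
        warning = (f"image was created for machine type '{stored}', "
                   f"but '{requested}' was requested")
        return requested, warning
    return requested, None
```

If neither side specifies anything the result is (None, None), in which
case the caller would fall back to its usual built-in default.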


> >> So you want appliances, do I understand that correctly?  Because that is
> >> exactly what Dave doesn't want.
> > 
> > That's policy. I see no need to prevent people from building appliances,
> 
> Dave does.  He explicitly raised the idea of limiting the things that
> can be stored in a qcow2 file, precisely because he didn't want the
> feature to become too complex.
> 
> > though right now I'm not interested in building them myself.
> > We there's a mechanism both kinds of people can use, then great.
> 
> Such a flexible feature means complexity, and in my opinion a complex
> feature needs a reasonable justification.  "People will find a use for
> it" is not reasonable.
> 
> Although it really is not unlikely that I'm wrong and that a flexible
> metadata storage does not need to be complex.
> 
> [...]

I'm not saying people will find a use for it. I think we
already have one use for it and if it's generic enough
more people will find more uses for it.


> >>> Software developers are being paid for saving people's time.
> >>
> >> Very good point, but I did say something like this before: I do not
> >> oppose appliances whatsoever.  In fact, it seems like a nice thing to have.
> >>
> >> But, here's the deal: I do not think putting that data into qcow2 to be
> >> the best solution.  Furthermore, I have things to do that I consider
> >> more important than developing an appliance solution.  Therefore, it's
> >> not like I'm sitting around doing nothing when I could be developing a
> >> solution to this issue here.
> >>
> >> I kept saying that I consider all of this an inconvenience.  Yes, it
> >> would be nice to have.  But I have things on my to do list that are hard
> >> feature requests, things that people really do need.  We all have.  We
> >> all need to decide how we can use our own time as efficiently as
> >> possible.  And I do not think that developing an appliance solution
> >> would be the best use of my time.  (Until my manager disagrees.)
> > 
> > As long as you don't start sending nacks on the basis that it's also not
> > the best use of other's time, I don't mind.
> 
> No, but I'm one of the qcow2 maintainers, so any feature added to qcow2
> just is a burden to myself and may therefore cost me time.
>
> Luckily there is another qcow2 maintainer, so I will not NACK features
> based on the fact that they are just a burden.  If Kevin wants to merge
> a feature that I don't deem sufficiently useful, then he can go right ahead.
> 
> [...]
> >> And as I've said multiple times now, but I can't repeat myself often
> >> enough, I think it would be most efficient if we worried about what we
> >> want to store first, before we worry about where to store it.  I believe
> >> that once we have a hard requirement on what we want to store and how to
> >> use it (that most people agree on), we will have a set of constraints on
> >> how we can represent that data and where it needs to be stored, and this
> >> will give us a simple yes or no to the question whether the data needs
> >> to be stored in qcow2, or whether there is any better way (or whether it
> >> can be stored in qcow2, but need not be).
> > 
> > Well the subject says it, does it not? We want to store
> > machine data there.
> 
> Are you sure?
> 
> Firstly, that is not sufficiently precise.  Do you want an appliance,
> i.e. store everything?  Do you want to store just something and limit
> everyone in what can be stored (Dave's proposal)?  That is a difference,
> and that is exactly what I was asking.
> 
> Secondly, in this mail you even seemed to propose storing just any
> metadata that might be related to a VM (or maybe not even that).  This
> too has some (meta-?)influence on the design.
> 
> For instance, if you want to store only VM-related information, we can
> document those structures in the qcow2 specification in the qemu tree
> (or at least link to another document in the qemu tree).  But if you
> want to be able to store any metadata that just anyone wants to store,
> then that won't be possible.  (And this would have implications.  For
> instance, someone might decide on storing metadata that makes the qcow2
> image effectively unreadable (e.g. by storing a special backing link).
> I wouldn't like that, but we couldn't do anything about it if we'd allow
> storage of arbitrary metadata.)
> 
> [...]


And what if I say "whatever"? Yes, I see a very specific use case
which will be well served by adding a specific bit of data
in the disk image. Others see other uses.

> >>> Old QEMU can't handle tar files. You need to unpack them,
> >>> then figure out that there are two files in the tar, one
> >>> is just for new qemu versions, one is portable. At which point
> >>> you need to go figure out what is your QEMU version.
> >>
> >> And old qemu versions will just give you a blank screen for a qcow2 file
> >> with required non-default options.
> > 
> > Compatibility is not worthless simply because we do not have time travel.
> 
> Dave was saying that the worst thing about the whole q35 thing is that
> users download an image and have no idea why it isn't working.  Figuring
> that out may take a long time, because nothing is even throwing an error
> message.
> 
> If we had a new format, users couldn't even run it in qemu, so they
> would quickly figure out that in order to run this VM, they need to
> update their stack.

Since users of old software then can't use your
image at all, most people simply will not create
the new fangled image format.

> If we just add this information to qcow2, those users with outdated qemu
> versions would again have to figure out why the image isn't working.

By that metric compatibility is never worth it unless you have
the ability to add new functionality retroactively to existing
software.

> 
> Sure, that is a minor issue that would solve itself by users slowly
> upgrading their qemu, but it goes to show that compatibility may not
> always be what is best.
> 
> Max
> 

Certainly in this case the feature is minor, so compatibility is
extremely important.

-- 
MST


* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-11  2:06                                         ` Michael S. Tsirkin
@ 2018-06-11  8:16                                           ` Michal Suchánek
  0 siblings, 0 replies; 157+ messages in thread
From: Michal Suchánek @ 2018-06-11  8:16 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Max Reitz, Kevin Wolf, qemu-block, armbru, qemu-devel,
	Richard W.M. Jones, stefanha, Dr. David Alan Gilbert

On Mon, 11 Jun 2018 05:06:53 +0300
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Sat, Jun 09, 2018 at 11:34:03PM +0200, Max Reitz wrote:
....
> > Dave was saying that the worst thing about the whole q35 thing is
> > that users download an image and have no idea why it isn't
> > working.  Figuring that out may take a long time, because nothing
> > is even throwing an error message.
> > 
> > If we had a new format, users couldn't even run it in qemu, so they
> > would quickly figure out that in order to run this VM, they need to
> > update their stack.  
> 
> Since users of old software then can't use your
> image at all, most people simply will not create
> the new fangled image format.

They will have to because without the new fangles the images just don't
work.

> 
> > If we just add this information to qcow2, those users with outdated
> > qemu versions would again have to figure out why the image isn't
> > working.  
> 
> By that metric compatibility is never worth it unless you have
> the ability to add new functionality retroactively to existing
> software.

Compatibility is worth it when you add a new extension that is useful
but not critical. When the image does not work without interpreting the
extended metadata, the failure is hard to diagnose, and old versions of
qemu just ignore the extended metadata, the extension has failed its
primary purpose: making images work out of the box.
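
To make the "old versions just ignore it" point concrete, here is a
minimal sketch of how a reader walks the qcow2 header extension area
and skips types it does not know. The record layout (big-endian type
and length, data padded to a multiple of 8 bytes, type 0 ending the
area) and the backing-format type are taken from the qcow2 spec as far
as I know it. MACHINE_HINTS and its payload are hypothetical and exist
only for this illustration; nothing like them is defined anywhere.

```python
import struct

END_OF_EXTENSIONS = 0x00000000
BACKING_FORMAT = 0xE2792ACA   # backing file format name (qcow2 spec)
MACHINE_HINTS = 0x4D414348    # HYPOTHETICAL type id, not in any spec


def pack_extension(ext_type, data):
    """Build one extension record: type, length, data, zero padding."""
    padding = (-len(data)) % 8
    return struct.pack(">II", ext_type, len(data)) + data + b"\0" * padding


def read_extensions(buf, known):
    """Return (understood, skipped) for a reader that knows `known` types."""
    understood, skipped = {}, []
    offset = 0
    while offset + 8 <= len(buf):
        ext_type, length = struct.unpack_from(">II", buf, offset)
        offset += 8
        if ext_type == END_OF_EXTENSIONS:
            break
        data = buf[offset:offset + length]
        offset += length + (-length) % 8   # step over data and padding
        if ext_type in known:
            understood[ext_type] = data
        else:
            skipped.append(ext_type)       # silently ignored, no error
    return understood, skipped


area = (pack_extension(BACKING_FORMAT, b"raw")
        + pack_extension(MACHINE_HINTS, b'{"machine": "q35"}')
        + pack_extension(END_OF_EXTENSIONS, b""))

# An "old" reader only knows the backing format extension.
old_understood, old_skipped = read_extensions(area, {BACKING_FORMAT})
```

The old reader ends up with MACHINE_HINTS in old_skipped and carries
on as if the image had no machine hints at all, which is exactly the
silent failure described above.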

Thanks

Michal


* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 11:32                       ` Max Reitz
  2018-06-06 11:37                         ` Dr. David Alan Gilbert
  2018-06-06 11:43                         ` Michal Suchánek
@ 2018-06-11  8:44                         ` Richard W.M. Jones
  2 siblings, 0 replies; 157+ messages in thread
From: Richard W.M. Jones @ 2018-06-11  8:44 UTC (permalink / raw)
  To: Max Reitz
  Cc: Michal Suchánek, Kevin Wolf, qemu-devel, stefanha, ehabkost,
	qemu-block, Michael S. Tsirkin

On Wed, Jun 06, 2018 at 01:32:47PM +0200, Max Reitz wrote:
> ext2?

I wrote an nbdkit plugin for ext2/ext3/ext4 last week.

  https://github.com/libguestfs/nbdkit/tree/master/plugins/ext2

It uses libext2fs from e2fsprogs and I think there are some lessons
for anyone who wants to use ext2 to store disk images.

(1) You cannot have more than one host process accessing a single
filesystem image, even read-only.  This is because opening an ext2+
filesystem even read-only causes writes, replaying the journal (for
ext3+) or writing to the superblock.

I'm sure there are some common use-cases such as overlays sharing a
common backing store which are excluded by this restriction.

(2) Within a single process you cannot have more than one libext2fs
handle open on the filesystem image.  This could make qemu block
drivers a bit awkward (although not impossible) because if two
instances of an ext2 qemu block driver both opened different disks in
the same filesystem they'd need to share a handle.

(3) You can resize files in the filesystem, although because we're
waiting for the NBD resize extension to be finalized my plugin does
not do that.

(4) Trim/discard appears to be possible (and it should be possible to
punch holes in the filesystem image) but I couldn't actually
understand how to make it work.  The same goes for fast zeroing.
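
The handle sharing in (2) could be sketched roughly as below: a
per-process registry that hands out one reference-counted handle per
filesystem image.  This is only an illustration, not code from the
nbdkit plugin or from qemu.  FakeFsHandle is a hypothetical stand-in
for a real libext2fs handle, and the registry is not thread-safe.

```python
import os


class FakeFsHandle:
    """HYPOTHETICAL stand-in for an open filesystem handle."""

    def __init__(self, image_path):
        self.image_path = image_path
        self.closed = False

    def close(self):
        self.closed = True


class SharedHandleRegistry:
    """One handle per filesystem image, shared by all users in a process."""

    def __init__(self, opener=FakeFsHandle):
        self._opener = opener
        self._entries = {}   # canonical image path -> [handle, refcount]

    def acquire(self, image_path):
        key = os.path.realpath(image_path)
        entry = self._entries.get(key)
        if entry is None:
            # First user of this image: open it exactly once.
            entry = [self._opener(key), 0]
            self._entries[key] = entry
        entry[1] += 1
        return entry[0]

    def release(self, image_path):
        key = os.path.realpath(image_path)
        entry = self._entries[key]
        entry[1] -= 1
        if entry[1] == 0:
            # Last user gone: close the handle and forget the image.
            entry[0].close()
            del self._entries[key]


registry = SharedHandleRegistry()
disk_a = registry.acquire("/tmp/fs.img")   # e.g. for disk1.img inside fs.img
disk_b = registry.acquire("/tmp/fs.img")   # e.g. for disk2.img inside fs.img
```

Both block driver instances get the same handle object, and it is
closed only after each of them has called release().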

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org


* Re: [Qemu-devel] storing machine data in qcow images?
  2018-06-06 14:55                               ` Michael S. Tsirkin
  2018-06-06 14:57                                 ` Max Reitz
@ 2018-06-11 14:10                                 ` Kevin Wolf
  1 sibling, 0 replies; 157+ messages in thread
From: Kevin Wolf @ 2018-06-11 14:10 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Eduardo Habkost, Max Reitz, Dr. David Alan Gilbert,
	Michal Suchánek, qemu-block, Richard W.M. Jones, qemu-devel,
	stefanha

Am 06.06.2018 um 16:55 hat Michael S. Tsirkin geschrieben:
> On Wed, Jun 06, 2018 at 10:42:33AM -0300, Eduardo Habkost wrote:
> > > If we want a grand vision where a single file stores the whole VM, why
> > > not invest the work and make it right from the start?
> > 
> > We don't want a grand vision where a single file stores the whole
> > VM.  This is exactly what I would like to avoid, by not inventing
> > a whole different appliance file format.
> 
> Besides, trying to get a grand vision from the start is a sure
> way to never have the design leave the drawing board.
> 
> What we are asking for at this point is a way to stick a named blob in
> an image that people can use with qemu without jumping through hoops.
> 
> It seems like a generic enough addition that it seems highly likely
> to be useful down the road and harmless enough that maintaining
> it won't become a burden.
> 
> Can we agree on that as a first step, so we can build that foundation
> and move on to actually building ways to use it?

As you don't seem to believe Max, here's my opinion: No.

I'm okay with adding some well-specified information to qcow2 (I'm
thinking of a JSON document that is validated against a schema) where
the meaning is clear from the qcow2 spec, for all users.

Allowing undefined blobs would add data to qcow2 images whose meaning is
only understood by whatever highlevel tool that wrote it, and thereby
fragment the one qcow2 format that we have today into many subformats
whose specs are scattered all over the net (if they even exist). This is
not where I want to go.

So before I'll even think of adding some extra information, we're back
to what I already said two weeks ago: Show me what you want to store
there. If we go with my proposal to use JSON, show me the JSON schema
(with doc comments, obviously) and we'll talk.
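
For illustration only, a minimal sketch of what "well-specified and
validated" could mean in practice.  The field names are hypothetical,
borrowed from the examples at the start of this thread (architecture,
machine type, CPU flags); they are not a proposed or agreed schema,
and the hand-written check merely stands in for a real JSON Schema
validator.

```python
import json

# HYPOTHETICAL field set: name -> (expected Python type, required?)
FIELDS = {
    "architecture": (str, True),
    "machine-type": (str, True),
    "cpu-flags": (list, False),
}


def validate_machine_hints(text):
    """Parse a JSON document and reject anything the field set does not define."""
    doc = json.loads(text)
    if not isinstance(doc, dict):
        raise ValueError("top level must be a JSON object")
    unknown = set(doc) - set(FIELDS)
    if unknown:
        # No undefined blobs: every key must be documented.
        raise ValueError("undefined keys: " + ", ".join(sorted(unknown)))
    for name, (expected_type, required) in FIELDS.items():
        if name not in doc:
            if required:
                raise ValueError("missing required key: " + name)
            continue
        if not isinstance(doc[name], expected_type):
            raise ValueError("wrong type for key: " + name)
    return doc


hints = validate_machine_hints(
    '{"architecture": "x86_64", "machine-type": "q35"}'
)
```

Rejecting unknown keys is the part that matters here: a document with
a tool-private key fails validation instead of quietly creating a
subformat.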

Kevin


end of thread

Thread overview: 157+ messages
2018-05-18 15:30 [Qemu-devel] storing machine data in qcow images? Michael S. Tsirkin
2018-05-18 16:49 ` Eduardo Habkost
2018-05-18 17:09 ` Daniel P. Berrangé
2018-05-18 17:41   ` Eduardo Habkost
2018-05-19  6:05     ` Markus Armbruster
2018-05-21 18:29       ` Eduardo Habkost
2018-05-21 18:44         ` Daniel P. Berrangé
2018-05-21 19:01           ` Eduardo Habkost
2018-05-23 11:19             ` Markus Armbruster
2018-05-23 12:13               ` Eduardo Habkost
2018-05-23 16:35                 ` Markus Armbruster
2018-05-29 14:06                   ` Dr. David Alan Gilbert
2018-06-05 21:58                   ` Michal Suchánek
2018-05-21 20:18     ` Daniel P. Berrangé
2018-05-21 20:33       ` Eduardo Habkost
2018-05-24  9:58         ` Kashyap Chamarthy
2018-05-22  7:35   ` Gerd Hoffmann
2018-05-22 10:53     ` Eduardo Habkost
2018-05-22 14:19     ` Michael S. Tsirkin
2018-05-22 15:02       ` Kevin Wolf
2018-05-22 15:14         ` Eduardo Habkost
2018-05-23  2:12         ` Fam Zheng
2018-05-23  9:16           ` Kevin Wolf
2018-05-23 14:46             ` Michael S. Tsirkin
2018-05-24 11:17   ` Richard W.M. Jones
2018-05-29 14:03     ` Dr. David Alan Gilbert
2018-05-29 14:14       ` Eduardo Habkost
2018-05-29 14:51         ` Richard W.M. Jones
2018-05-29 15:31         ` Dr. David Alan Gilbert
2018-05-22  8:50 ` Philipp Hahn
2018-05-24 11:32 ` Richard W.M. Jones
2018-05-24 14:56   ` Michael S. Tsirkin
2018-05-24 15:08     ` Kevin Wolf
2018-05-24 15:19       ` Michael S. Tsirkin
2018-05-24 15:20       ` Richard W.M. Jones
2018-05-24 16:25         ` Markus Armbruster
2018-05-28 18:10   ` Max Reitz
2018-05-28 18:30     ` Richard W.M. Jones
2018-05-28 18:38       ` Kevin Wolf
2018-05-28 18:44         ` Max Reitz
2018-05-28 19:09           ` Kevin Wolf
2018-05-29  9:23             ` Max Reitz
2018-05-29 10:14               ` Kevin Wolf
2018-05-29 13:16                 ` Eduardo Habkost
2018-05-28 21:20         ` Richard W.M. Jones
2018-05-28 21:25           ` Richard W.M. Jones
2018-05-29  6:44             ` Kevin Wolf
2018-05-29 10:14               ` Max Reitz
2018-06-05  9:21                 ` Dr. David Alan Gilbert
2018-06-05 19:03                   ` Eduardo Habkost
2018-06-05 19:47                     ` Michael S. Tsirkin
2018-06-05 19:54                       ` [Qemu-devel] [Qemu-block] " Eric Blake
2018-06-05 19:58                         ` Richard W.M. Jones
2018-06-05 20:09                           ` Eric Blake
2018-06-05 20:28                             ` Michael S. Tsirkin
2018-06-05 20:46                               ` Eric Blake
2018-06-05 21:26                                 ` Michael S. Tsirkin
2018-06-06  8:07                               ` Dr. David Alan Gilbert
2018-06-06  6:23                           ` Gerd Hoffmann
2018-06-05 20:06                         ` Michael S. Tsirkin
2018-06-06  6:26                     ` [Qemu-devel] " Gerd Hoffmann
2018-06-06  9:44                     ` Dr. David Alan Gilbert
2018-06-06 13:35                       ` Eduardo Habkost
2018-06-06 11:02                   ` Max Reitz
2018-06-06 11:14                     ` Dr. David Alan Gilbert
2018-06-06 11:26                       ` Max Reitz
2018-06-06 12:00                         ` Dr. David Alan Gilbert
2018-06-06 12:59                           ` Max Reitz
2018-06-06 14:31                             ` Dr. David Alan Gilbert
2018-06-06 14:37                               ` Daniel P. Berrangé
2018-06-06 14:42                                 ` Dr. David Alan Gilbert
2018-06-06 14:51                               ` Max Reitz
2018-06-06 15:05                                 ` Dr. David Alan Gilbert
2018-06-06 15:36                                   ` Eric Blake
2018-06-06 16:11                                     ` Michal Suchánek
2018-06-06 16:37                                       ` Eric Blake
2018-06-06 16:32                                     ` Daniel P. Berrangé
2018-06-06 16:36                                       ` Dr. David Alan Gilbert
2018-06-07 10:02                                       ` Andrea Bolognani
2018-06-07 10:22                                         ` Daniel P. Berrangé
2018-06-07 11:17                                           ` Andrea Bolognani
2018-06-07 12:38                                             ` Daniel P. Berrangé
2018-06-07 13:49                                               ` Dr. David Alan Gilbert
2018-06-07 14:06                                                 ` Andrea Bolognani
2018-06-07 14:45                                                   ` Dr. David Alan Gilbert
2018-06-07 14:56                                                     ` Andrea Bolognani
2018-06-07 15:25                                                       ` Dr. David Alan Gilbert
2018-06-07 20:38                                                         ` Gerd Hoffmann
2018-06-07 10:32                                         ` Richard W.M. Jones
2018-06-07 10:35                                           ` Dr. David Alan Gilbert
2018-06-07 10:36                                           ` Daniel P. Berrangé
2018-06-07 10:54                                             ` Andrea Bolognani
2018-06-07 19:24                                               ` Laszlo Ersek
2018-06-08  8:21                                                 ` Dr. David Alan Gilbert
2018-06-08  8:41                                                   ` Daniel P. Berrangé
2018-06-08  8:53                                                     ` Dr. David Alan Gilbert
2018-06-07 21:19                                               ` Michael S. Tsirkin
2018-06-07 21:18                                             ` Michael S. Tsirkin
2018-06-07 10:51                                           ` Andrea Bolognani
2018-06-07 19:38                                             ` Laszlo Ersek
2018-06-06 17:49                                   ` Max Reitz
2018-06-06 15:09                                 ` Michael S. Tsirkin
2018-06-06 17:06                                   ` Max Reitz
2018-06-07 21:43                                     ` Michael S. Tsirkin
2018-06-09 21:34                                       ` Max Reitz
2018-06-11  2:06                                         ` Michael S. Tsirkin
2018-06-11  8:16                                           ` Michal Suchánek
2018-06-06 11:42                       ` Richard W.M. Jones
2018-06-06 11:48                         ` Daniel P. Berrangé
2018-06-06 11:53                           ` Max Reitz
2018-06-06 12:03                           ` Dr. David Alan Gilbert
2018-06-06 13:15                             ` Max Reitz
2018-06-06 12:29                           ` Richard W.M. Jones
2018-06-06 11:22                     ` [Qemu-devel] [Qemu-block] " Peter Krempa
2018-06-06 10:32                 ` [Qemu-devel] " Michal Suchánek
2018-06-06 11:02                   ` Max Reitz
2018-06-06 11:19                     ` Michal Suchánek
2018-06-06 11:32                       ` Max Reitz
2018-06-06 11:37                         ` Dr. David Alan Gilbert
2018-06-06 11:44                           ` Max Reitz
2018-06-06 12:16                             ` Dr. David Alan Gilbert
2018-06-06 13:22                               ` Max Reitz
2018-06-06 14:02                                 ` Dr. David Alan Gilbert
2018-06-06 14:33                                   ` Max Reitz
2018-06-06 14:41                                     ` Dr. David Alan Gilbert
2018-06-06 14:55                                       ` Max Reitz
2018-06-06 15:25                                         ` Michal Suchánek
2018-06-06 18:02                                           ` Max Reitz
2018-06-06 18:33                                             ` Michal Suchánek
2018-06-06 18:36                                               ` Eduardo Habkost
2018-06-07 18:27                                                 ` [Qemu-devel] [Qemu-block] " Kashyap Chamarthy
2018-06-06 13:42                             ` [Qemu-devel] " Eduardo Habkost
2018-06-06 14:55                               ` Michael S. Tsirkin
2018-06-06 14:57                                 ` Max Reitz
2018-06-11 14:10                                 ` Kevin Wolf
2018-06-06 14:46                             ` Michael S. Tsirkin
2018-06-06 15:04                               ` Max Reitz
2018-06-06 11:43                         ` Michal Suchánek
2018-06-06 11:52                           ` Max Reitz
2018-06-06 12:13                             ` Michal Suchánek
2018-06-06 13:14                               ` Max Reitz
2018-06-06 13:45                                 ` Michal Suchánek
2018-06-06 13:50                                   ` Daniel P. Berrangé
2018-06-06 14:14                                     ` Eduardo Habkost
2018-06-06 14:21                                       ` Max Reitz
2018-06-06 14:24                                       ` Daniel P. Berrangé
2018-06-06 14:17                                   ` Max Reitz
2018-06-06 16:10                                     ` Eduardo Habkost
2018-06-06 18:09                                       ` Max Reitz
2018-06-11  8:44                         ` Richard W.M. Jones
2018-06-06 11:40                     ` Richard W.M. Jones
2018-06-06 14:31                       ` Michael S. Tsirkin
2018-06-06 14:43                     ` Michael S. Tsirkin
2018-06-06 14:57                       ` Eric Blake
2018-06-06 20:39                         ` Eric Blake
2018-06-06 21:01                           ` Gerd Hoffmann
2018-06-06 15:02                       ` Max Reitz
