All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] RFC v2: blockdev_add & friends, brief rationale, QMP docs
@ 2010-06-10 17:45 Markus Armbruster
  2010-06-15  9:04 ` Avi Kivity
  2010-06-15 13:44 ` [Qemu-devel] " Avi Kivity
  0 siblings, 2 replies; 25+ messages in thread
From: Markus Armbruster @ 2010-06-10 17:45 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Christoph Hellwig, Luiz Capitulino, Gerd Hoffmann,
	Avi Kivity

v2: Cover protocols
    Split blockdev_change into media_insert and media_remove
    Option syntax
    list TODOs


Rationale: Why new commands for block devices?
==============================================

We want a clean separation between host part and guest part.  Existing
-drive and drive_add don't provide that; they were designed to specify
both parts together.  Moreover, drive_add is limited to adding virtio
drives (with pci_add's help) and SCSI drives.

We already have commands to specify the guest part: -device and
device_add.

Support for defining just a host part for use with -device and
device_add was grafted onto -drive and drive_add (if=none), but it's a
mess.  Some parts are redundant, other parts are broken.

For instance, unit, bus, index, addr are redundant: -device/device_add
don't use them.  They provide their own parameters to specify bus and
bus-specific address.

The checks whether rerror, werror, readonly, cyls, heads, secs are sane
for a particular guest device model are broken.  The checks are in the
-drive code, which used to know what the guest device model supports,
but can't with if=none.

Moreover, -drive/drive_add have several flaws:

* Many parameters set with -drive/drive_add silently revert to defaults
  on media change.

* There are two ways to specify protocol, and both are flawed: you
  either use parameter format (then you can neither specify a
  non-default format, nor supply protocol options), or encode it in
  parameter file (yet another ad hoc mini-language, breaks filenames
  with ':').

* Behavior when format= is missing is insecure.

My proposed solution is a new option -blockdev and monitor command
blockdev_add.  These specify only the host drive.  Guest drive
properties are left to -device/device_add.  We keep -drive for backwards
compatibility and command line convenience.  Except we get rid of
if=none (may need a grace period).

New monitor command blockdev_del works regardless of how the host block
device was created.

New monitor commands media_insert, media_remove provide full control
over the host part, unlike the existing change command.

Summary of the host / guest split:

-drive options                  host or guest?
bus, unit, if, index, addr      guest, already covered by qdev
cyls, heads, secs, trans        guest, new qdev properties
                                  (but defaults depend on image)
media                           guest FIXME
snapshot, cache, aio            host, blockdev_add options
file, format                    host, blockdev_add options
                                  separate option for protocol
                                  format is just that, not protocol
                                  file is just a filename, no protocol
rerror, werror                  host, guest device models will
                                  reject values they don't support
serial                          guest, new qdev properties
readonly                        both host & guest, qdev will refuse to
                                  connect readonly host to read/write
                                  guest


QMP command docs
================

blockdev_add
------------

Add host block device.

Arguments:

- "id": the host block device's ID, must be unique (json-string)
- "format": image format (json-string, optional)
    - Possible values: "raw", "qcow2", ...
- "protocol": image access protocol (json-object, optional)
    - Has a member "type" (json-string), and depending on its value
      additional members
    - For "type" = "file", [...]
      - "file": name of image file (json-string)
    - For "type" = "nbd":
      - "domain": address family (json-string, optional)
          - Possible values: "inet" (default), "unix"
      - "file": name of socket file (json-string), only with "domain" =
        "unix"
      - "host": host name (json-string), only with "domain" = "inet"
      - "port": port (json-int), only with "domain" = "inet"
    - For "type" = "blkdebug":
      - "config": name of config file (json-string)
      - "protocol": image access protocol (json-object), as above
    [...]
- "aio": host AIO (json-string, optional)
    - Possible values: "threads" (default), "native"
- "cache": host cache usage (json-string, optional)
    - Possible values: "writethrough" (default), "writeback", "unsafe",
                       "none"
- "readonly": open image read-only (json-bool, optional, default false)
- "rerror": what to do on read error (json-string, optional)
    - Possible values: "report" (default), "ignore", "stop"
- "werror": what to do on write error (json-string, optional)
    - Possible values: "enospc" (default), "report", "ignore", "stop"
- "snapshot": enable snapshot (json-bool, optional, default false)

Example:

-> { "execute": "blockdev_add",
     "arguments": { "id": "blk1", "format": "raw",
                    "protocol": { "type": "file", "file": "fedora.img" } } }
<- { "return": {} }

-> { "execute": "blockdev_add",
     "arguments": {
       "id": "blk2", "format": "qcow2",
       "protocol": {
         "type": "blkdebug", "config": "test.blkdebug",
         "protocol": { "type": "file", "file": "test.qcow2" }
       }
     }
   }
<- { "return": {} }

Notes:

(1) If argument "protocol" is missing, all other optional arguments must
    be missing as well.  This defines a block device with no media
    inserted.

(2) In theory, the protocols form a tree.  In practice, all protocols
    but blkdebug have no children, and blkdebug has just one.

(3) It's possible to list supported disk formats by running QEMU with
    arguments "-blockdev_add \?".


blockdev_del
------------

Remove a host block device.

Arguments:

- "id": the host block device's ID (json-string)

Example:

-> { "execute": "blockdev_del", "arguments": { "id": "blk1" } }
<- { "return": {} }


media_insert
------------

Insert media into an empty host block device.

Arguments are exactly like blockdev_add, except "protocol" is mandatory.


media_remove
------------

Remove media from a host block device.

Arguments:

- "id": the host block device's ID (json-string)

Example:

-> { "execute": "media_remove", "arguments": { "id": "blk1" } }
<- { "return": {} }


Command line syntax
===================

We want common setups to be simple, and fancy setups to be possible.
"Simple" means use of common QemuOpts syntax, and reasonable defaults.

The trouble are protocols.  QemuOpts provides a list of options.
Protocols stack, so in the general case, we need a tree of lists of
options.

I can see a couple of ways to do this, and each of them sucks in its own
way.

Let me start with QMP snippets for the examples I'm going to use:

* The common case: a single protocol

  {
      "id": "blk1", "format": "raw",
      "protocol": { "type": "file", "file": "fedora.img" }
  }

* blkdebug

  {
      "id": "blk2", "format": "qcow2",
      "protocol": {
          "type": "blkdebug", "config": "test.blkdebug",
          "protocol": { "type": "file", "file": "test.qcow2" }
      }
  }

* Avi's hypothetical mirror protocol

  {
      "id": "blk3", "format": "raw",
      "protocol": {
          "type": "mirror",
          "protocols": [
              { "type": "file", "file": "local.img" }
              { "type": "nbd", "domain": "unix", "socket": "nbd-sock" }
          ]
      }
  }

I think we can all agree that the common case should look as simple as
this:

    -blockdev id=blk1,file=fedora.img                (*)

Here, file=fedora.img has the intuitive meaning "use fedora.img with the
appropriate protocol and a default format".  Since fedora.img is not a
special file, the appropriate protocol is "file".  The default format is
"raw".  Maybe we want to pick default format based on filename,
say. "qcow2" when the file name ends with ".qcow2".

Now let's examine what we can do to provide for the general case.

1. Make QemuOpts recursive, add brackets to its syntax

   This has several serious drawbacks:

   * No matter what kind of bracket we choose, the shell will be unhappy
     with it, unless we quote them.  Angle brackets fail destructively
     (redirection).  Curly braces suffer brace expansion.  Parenthesis
     fail reliably.  Square brackets suffer filename expansion.
     However, since actual match is unlikely, they're likely to just
     work.  Except with shopt failglob they fail reliably.  I guess
     they're the least bad choice.

   * We need a way to escape the brackets, or else we can't do filenames
     containing brackets.

     We better choose an escape syntax that doesn't interfere with the
     shell's.

     If we make brackets special characters in all option values, then
     no matter what escape syntax we choose, we'll break existing usage.
     In particular, brackets in filenames would need escaping.

     The one exception is to use ',' as escape character, but that's
     just too ugly.

     If we make brackets special characters only in values of structured
     options like protocol, our syntax becomes context-sensitive.  Ugh.

   * Our config file format is in INI syntax.  QemuOpts correspond to
     INI sections.  Sections can't be nested, so recursive QemuOpts
     don't translate.

   Examples:

   * Single protocol:

     -blockdev id=blk1,format=raw,protocol=[file,file=fedora.img]

     Requires suitable syntactic sugar to get the simple form (*).

   * blkdebug

     -blockdev id=blk2,format=qcow2,\
     protocol=[blkdebug,config=test.blkdebug,\
     protocol=[file,file=test.qcow2]]

   * Avi's mirror:

     -blockdev id=blk3,format=raw,\
     protocol=[mirror,\
     [file,file=local.img],\
     [nbd,domain=unix,sockert=nbd-sock]]

2. We already have a syntax to specify trees, namely JSON, so use it

   If -blockdev's argument starts with '{', it's a JSON object suitable
   as argument of blockdev_add in QMP.

   We still provide ordinary QemuOpts syntax for the cases that can be
   expressed with it, i.e. single protocol.

   I figure we'd want syntactic sugar for blkdebug, to permit its use
   from the command line without having to resort to JSON.

3. Stack protocols through named references

   The first protocol is "inlined" into -blockdev.  Any further
   protocols need to be referenced by name.

   Best explained by example:

   * Single protocol:

     -blockdev id=blk1,format=raw,protocol=file,file=fedora.img

     To get the simple form (*), make protocol optional with a suitable
     default.

   * blkdebug

     -blockdev id=blk2,format=qcow2,protocol=blkdebug,config=test.blkdebug,\
     base=blk2-base
     -blockproto id=blk2-base,protocol=file,file=test.qcow2

   * Avi's mirror:

     -blockdev id=blk3,format=raw,protocol=mirror,\
     base=blk3-base1,base=blk3=base2
     -blockproto id=blk3-base1,protocol=file,file=local.img
     -blockproto id=blk3-base2,protocol=nbd,domain=unix,sockert=nbd-sock

   Anything but a single protocol becomes pretty verbose.  Syntactic
   sugar for the blkdebug case would be possible; not sure it's worth
   it.

   No QemuOpts syntax changes.  INI can handle this just fine.


TODO
====

* Pick a command line syntax

* Human monitor syntax

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] RFC v2: blockdev_add & friends, brief rationale, QMP docs
  2010-06-10 17:45 [Qemu-devel] RFC v2: blockdev_add & friends, brief rationale, QMP docs Markus Armbruster
@ 2010-06-15  9:04 ` Avi Kivity
  2010-06-15 12:23   ` Markus Armbruster
  2010-06-15 13:44 ` [Qemu-devel] " Avi Kivity
  1 sibling, 1 reply; 25+ messages in thread
From: Avi Kivity @ 2010-06-15  9:04 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Kevin Wolf, Christoph Hellwig, Gerd Hoffmann, qemu-devel,
	Luiz Capitulino

On 06/10/2010 08:45 PM, Markus Armbruster wrote:
> QMP command docs
> ================
>
> blockdev_add
> ------------
>
> Add host block device.
>
> Arguments:
>
> - "id": the host block device's ID, must be unique (json-string)
> - "format": image format (json-string, optional)
>      - Possible values: "raw", "qcow2", ...
> - "protocol": image access protocol (json-object, optional)
>      - Has a member "type" (json-string), and depending on its value
>        additional members
>      - For "type" = "file", [...]
>        - "file": name of image file (json-string)
>      - For "type" = "nbd":
>        - "domain": address family (json-string, optional)
>            - Possible values: "inet" (default), "unix"
>        - "file": name of socket file (json-string), only with "domain" =
>          "unix"
>        - "host": host name (json-string), only with "domain" = "inet"
>        - "port": port (json-int), only with "domain" = "inet"
>      - For "type" = "blkdebug":
>        - "config": name of config file (json-string)
>        - "protocol": image access protocol (json-object), as above
>      [...]
> - "aio": host AIO (json-string, optional)
>      - Possible values: "threads" (default), "native"
> - "cache": host cache usage (json-string, optional)
>      - Possible values: "writethrough" (default), "writeback", "unsafe",
>                         "none"
> - "readonly": open image read-only (json-bool, optional, default false)
> - "rerror": what to do on read error (json-string, optional)
>      - Possible values: "report" (default), "ignore", "stop"
> - "werror": what to do on write error (json-string, optional)
>      - Possible values: "enospc" (default), "report", "ignore", "stop"
> - "snapshot": enable snapshot (json-bool, optional, default false)
>
> Example:
>
> ->  { "execute": "blockdev_add",
>       "arguments": { "id": "blk1", "format": "raw",
>                      "protocol": { "type": "file", "file": "fedora.img" } } }
> <- { "return": {} }
>
> ->  { "execute": "blockdev_add",
>       "arguments": {
>         "id": "blk2", "format": "qcow2",
>         "protocol": {
>           "type": "blkdebug", "config": "test.blkdebug",
>           "protocol": { "type": "file", "file": "test.qcow2" }
>         }
>       }
>     }
> <- { "return": {} }
>
> Notes:
>
> (1) If argument "protocol" is missing, all other optional arguments must
>      be missing as well.  This defines a block device with no media
>      inserted.
>
> (2) In theory, the protocols form a tree.  In practice, all protocols
>      but blkdebug have no children, and blkdebug has just one.
>
> (3) It's possible to list supported disk formats by running QEMU with
>      arguments "-blockdev_add \?".
>
>
> blockdev_del
> ------------
>
> Remove a host block device.
>
> Arguments:
>
> - "id": the host block device's ID (json-string)
>
> Example:
>
> ->  { "execute": "blockdev_del", "arguments": { "id": "blk1" } }
> <- { "return": {} }
>
>
> media_insert
> ------------
>
> Insert media into an empty host block device.
>
> Arguments are exactly like blockdev_add, except "protocol" is mandatory.
>
>
> media_remove
> ------------
>
> Remove media from a host block device.
>
> Arguments:
>
> - "id": the host block device's ID (json-string)
>
> Example:
>
> ->  { "execute": "media_remove", "arguments": { "id": "blk1" } }
> <- { "return": {} }
>
>    

media_insert/remove seem to duplicate blockdev_add/del.  Perhaps we 
don't need them?

To change media, tell the guest device to eject, blockdev_del, 
blockdev_add, reassociate the guest and host parts.

To pretend you're a media changer, blockdev_add all your cds at once and 
just change the guest/host association when you want to hear a new band.

For a fake a multipath setup, blockdev_add one device, associate it with 
multiple guest interfaces.

Otherwise, looks good.

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] RFC v2: blockdev_add & friends, brief rationale, QMP docs
  2010-06-15  9:04 ` Avi Kivity
@ 2010-06-15 12:23   ` Markus Armbruster
  2010-06-15 12:43     ` Avi Kivity
  0 siblings, 1 reply; 25+ messages in thread
From: Markus Armbruster @ 2010-06-15 12:23 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Kevin Wolf, Christoph Hellwig, Gerd Hoffmann, qemu-devel,
	Luiz Capitulino

Avi Kivity <avi@redhat.com> writes:

> On 06/10/2010 08:45 PM, Markus Armbruster wrote:
>> QMP command docs
>> ================
>>
>> blockdev_add
>> ------------
>>
>> Add host block device.
>>
>> Arguments:
>>
>> - "id": the host block device's ID, must be unique (json-string)
>> - "format": image format (json-string, optional)
>>      - Possible values: "raw", "qcow2", ...
>> - "protocol": image access protocol (json-object, optional)
>>      - Has a member "type" (json-string), and depending on its value
>>        additional members
>>      - For "type" = "file", [...]
>>        - "file": name of image file (json-string)
>>      - For "type" = "nbd":
>>        - "domain": address family (json-string, optional)
>>            - Possible values: "inet" (default), "unix"
>>        - "file": name of socket file (json-string), only with "domain" =
>>          "unix"
>>        - "host": host name (json-string), only with "domain" = "inet"
>>        - "port": port (json-int), only with "domain" = "inet"
>>      - For "type" = "blkdebug":
>>        - "config": name of config file (json-string)
>>        - "protocol": image access protocol (json-object), as above
>>      [...]
>> - "aio": host AIO (json-string, optional)
>>      - Possible values: "threads" (default), "native"
>> - "cache": host cache usage (json-string, optional)
>>      - Possible values: "writethrough" (default), "writeback", "unsafe",
>>                         "none"
>> - "readonly": open image read-only (json-bool, optional, default false)
>> - "rerror": what to do on read error (json-string, optional)
>>      - Possible values: "report" (default), "ignore", "stop"
>> - "werror": what to do on write error (json-string, optional)
>>      - Possible values: "enospc" (default), "report", "ignore", "stop"
>> - "snapshot": enable snapshot (json-bool, optional, default false)
>>
>> Example:
>>
>> ->  { "execute": "blockdev_add",
>>       "arguments": { "id": "blk1", "format": "raw",
>>                      "protocol": { "type": "file", "file": "fedora.img" } } }
>> <- { "return": {} }
>>
>> ->  { "execute": "blockdev_add",
>>       "arguments": {
>>         "id": "blk2", "format": "qcow2",
>>         "protocol": {
>>           "type": "blkdebug", "config": "test.blkdebug",
>>           "protocol": { "type": "file", "file": "test.qcow2" }
>>         }
>>       }
>>     }
>> <- { "return": {} }
>>
>> Notes:
>>
>> (1) If argument "protocol" is missing, all other optional arguments must
>>      be missing as well.  This defines a block device with no media
>>      inserted.
>>
>> (2) In theory, the protocols form a tree.  In practice, all protocols
>>      but blkdebug have no children, and blkdebug has just one.
>>
>> (3) It's possible to list supported disk formats by running QEMU with
>>      arguments "-blockdev_add \?".
>>
>>
>> blockdev_del
>> ------------
>>
>> Remove a host block device.
>>
>> Arguments:
>>
>> - "id": the host block device's ID (json-string)
>>
>> Example:
>>
>> ->  { "execute": "blockdev_del", "arguments": { "id": "blk1" } }
>> <- { "return": {} }
>>
>>
>> media_insert
>> ------------
>>
>> Insert media into an empty host block device.
>>
>> Arguments are exactly like blockdev_add, except "protocol" is mandatory.
>>
>>
>> media_remove
>> ------------
>>
>> Remove media from a host block device.
>>
>> Arguments:
>>
>> - "id": the host block device's ID (json-string)
>>
>> Example:
>>
>> ->  { "execute": "media_remove", "arguments": { "id": "blk1" } }
>> <- { "return": {} }
>>
>>    
>
> media_insert/remove seem to duplicate blockdev_add/del.  Perhaps we
> don't need them?
>
> To change media, tell the guest device to eject, blockdev_del,
> blockdev_add, reassociate the guest and host parts.

Device model code is not prepared to have host parts go away and come
back during operation.  The device model driver attaches to the host
part on initialization, and detaches only when the device gets destroyed
(hot unplug).

If we yank out the host part during operation as you propose, then the
device model driver's pointer to the host part becomes null.  I don't
see that ending happily.

> To pretend you're a media changer, blockdev_add all your cds at once
> and just change the guest/host association when you want to hear a new
> band.
>
> For a fake a multipath setup, blockdev_add one device, associate it
> with multiple guest interfaces.
>
> Otherwise, looks good.

Any preference on the command line option syntax?

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] RFC v2: blockdev_add & friends, brief rationale, QMP docs
  2010-06-15 12:23   ` Markus Armbruster
@ 2010-06-15 12:43     ` Avi Kivity
  2010-06-15 13:27       ` Markus Armbruster
  0 siblings, 1 reply; 25+ messages in thread
From: Avi Kivity @ 2010-06-15 12:43 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Kevin Wolf, Christoph Hellwig, Gerd Hoffmann, qemu-devel,
	Luiz Capitulino

On 06/15/2010 03:23 PM, Markus Armbruster wrote:
>
>> media_insert/remove seem to duplicate blockdev_add/del.  Perhaps we
>> don't need them?
>>
>> To change media, tell the guest device to eject, blockdev_del,
>> blockdev_add, reassociate the guest and host parts.
>>      
> Device model code is not prepared to have host parts go away and come
> back during operation.  The device model driver attaches to the host
> part on initialization, and detaches only when the device gets destroyed
> (hot unplug).
>
> If we yank out the host part during operation as you propose, then the
> device model driver's pointer to the host part becomes null.  I don't
> see that ending happily.
>    

I'm only talking about the interface, not the implementation.  Internal 
design details shouldn't be exposed.

For the implementation, I imagine you can create an empty blockdev 
during guest device creation and treat blockdev_add/blockdev_del as 
media change/eject.

>> To pretend you're a media changer, blockdev_add all your cds at once
>> and just change the guest/host association when you want to hear a new
>> band.
>>
>> For a fake a multipath setup, blockdev_add one device, associate it
>> with multiple guest interfaces.
>>
>> Otherwise, looks good.
>>      
> Any preference on the command line option syntax?
>    

Something incredibly long and complicated?

We might keep the existing stuff which is already complicated enough for 
users, and ask machines to build guests using QMP instead of the command 
line.

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] RFC v2: blockdev_add & friends, brief rationale, QMP docs
  2010-06-15 12:43     ` Avi Kivity
@ 2010-06-15 13:27       ` Markus Armbruster
  2010-06-15 13:40         ` Avi Kivity
  0 siblings, 1 reply; 25+ messages in thread
From: Markus Armbruster @ 2010-06-15 13:27 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Kevin Wolf, Christoph Hellwig, Gerd Hoffmann, qemu-devel,
	Luiz Capitulino

Avi Kivity <avi@redhat.com> writes:

> On 06/15/2010 03:23 PM, Markus Armbruster wrote:
>>
>>> media_insert/remove seem to duplicate blockdev_add/del.  Perhaps we
>>> don't need them?
>>>
>>> To change media, tell the guest device to eject, blockdev_del,
>>> blockdev_add, reassociate the guest and host parts.
>>>      
>> Device model code is not prepared to have host parts go away and come
>> back during operation.  The device model driver attaches to the host
>> part on initialization, and detaches only when the device gets destroyed
>> (hot unplug).
>>
>> If we yank out the host part during operation as you propose, then the
>> device model driver's pointer to the host part becomes null.  I don't
>> see that ending happily.
>>    
>
> I'm only talking about the interface, not the implementation.
> Internal design details shouldn't be exposed.
>
> For the implementation, I imagine you can create an empty blockdev
> during guest device creation and treat blockdev_add/blockdev_del as
> media change/eject.

If blockdev_del only ejects media, then we need another command to get
rid of a blockdev.

Unless you propose that blockdev_del merely ejects media if the blockdev
is being used by a device, but destroys the blockdev outright if not.
But that would be sick, wouldn't it?

>>> To pretend you're a media changer, blockdev_add all your cds at once
>>> and just change the guest/host association when you want to hear a new
>>> band.
>>>
>>> For a fake a multipath setup, blockdev_add one device, associate it
>>> with multiple guest interfaces.
>>>
>>> Otherwise, looks good.
>>>      
>> Any preference on the command line option syntax?
>>    
>
> Something incredibly long and complicated?
>
> We might keep the existing stuff which is already complicated enough
> for users, and ask machines to build guests using QMP instead of the
> command line.

I sketched three ways to do -blockdev.  They attempt to make the simple
simple, and the complex possible.  Any preference among them?

We can't simply keep -drive, because of the flaws I listed in the
rationale.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] RFC v2: blockdev_add & friends, brief rationale, QMP docs
  2010-06-15 13:27       ` Markus Armbruster
@ 2010-06-15 13:40         ` Avi Kivity
  2010-06-15 14:54           ` Markus Armbruster
  0 siblings, 1 reply; 25+ messages in thread
From: Avi Kivity @ 2010-06-15 13:40 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Kevin Wolf, Christoph Hellwig, Gerd Hoffmann, qemu-devel,
	Luiz Capitulino

On 06/15/2010 04:27 PM, Markus Armbruster wrote:
>
>> I'm only talking about the interface, not the implementation.
>> Internal design details shouldn't be exposed.
>>
>> For the implementation, I imagine you can create an empty blockdev
>> during guest device creation and treat blockdev_add/blockdev_del as
>> media change/eject.
>>      
> If blockdev_del only ejects media, then we need another command to get
> rid of a blockdev.
>    

No.  If you have a blockdev just to satisfy the guest device's need for 
a blockdev pointer, you can delete it automatically when the last 
reference goes.


> Unless you propose that blockdev_del merely ejects media if the blockdev
> is being used by a device, but destroys the blockdev outright if not.
> But that would be sick, wouldn't it?
>    

Create a blockdev implicitly with guest device creation, and use 
blockdev_add (or media_attach) to attach the media.

(but that creates a window where the guest device is visible but media 
is not yet inserted?)

>>>> To pretend you're a media changer, blockdev_add all your cds at once
>>>> and just change the guest/host association when you want to hear a new
>>>> band.
>>>>
>>>> For a fake a multipath setup, blockdev_add one device, associate it
>>>> with multiple guest interfaces.
>>>>
>>>> Otherwise, looks good.
>>>>
>>>>          
>>> Any preference on the command line option syntax?
>>>
>>>        
>> Something incredibly long and complicated?
>>
>> We might keep the existing stuff which is already complicated enough
>> for users, and ask machines to build guests using QMP instead of the
>> command line.
>>      
> I sketched three ways to do -blockdev.  They attempt to make the simple
> simple, and the complex possible.  Any preference among them?
>
>    

I'll look again.

> We can't simply keep -drive, because of the flaws I listed in the
> rationale.
>    

Ok.

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Qemu-devel] Re: RFC v2: blockdev_add & friends, brief rationale, QMP docs
  2010-06-10 17:45 [Qemu-devel] RFC v2: blockdev_add & friends, brief rationale, QMP docs Markus Armbruster
  2010-06-15  9:04 ` Avi Kivity
@ 2010-06-15 13:44 ` Avi Kivity
  2010-06-15 14:39   ` Gerd Hoffmann
  2010-06-16 11:20   ` Kevin Wolf
  1 sibling, 2 replies; 25+ messages in thread
From: Avi Kivity @ 2010-06-15 13:44 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Kevin Wolf, Christoph Hellwig, qemu-devel, Luiz Capitulino,
	Gerd Hoffmann

On 06/10/2010 08:45 PM, Markus Armbruster wrote:
>
>
>     * Our config file format is in INI syntax.  QemuOpts correspond to
>       INI sections.  Sections can't be nested, so recursive QemuOpts
>       don't translate.
>    

git (and probably others) use

[a "b"]
     c = d

for

    a.b.c=d

>     Examples:
>
>     * Single protocol:
>
>       -blockdev id=blk1,format=raw,protocol=[file,file=fedora.img]
>
>       Requires suitable syntactic sugar to get the simple form (*).
>
>     * blkdebug
>
>       -blockdev id=blk2,format=qcow2,\
>       protocol=[blkdebug,config=test.blkdebug,\
>       protocol=[file,file=test.qcow2]]
>
>     * Avi's mirror:
>
>       -blockdev id=blk3,format=raw,\
>       protocol=[mirror,\
>       [file,file=local.img],\
>       [nbd,domain=unix,sockert=nbd-sock]]
>
> 2. We already have a syntax to specify trees, namely JSON, so use it
>
>     If -blockdev's argument starts with '{', it's a JSON object suitable
>     as argument of blockdev_add in QMP.
>
>     We still provide ordinary QemuOpts syntax for the cases that can be
>     expressed with it, i.e. single protocol.
>
>     I figure we'd want syntactic sugar for blkdebug, to permit its use
>     from the command line without having to resort to JSON.
>    

Might be nice as a general extension to QemuOpts.

> 3. Stack protocols through named references
>
>     The first protocol is "inlined" into -blockdev.  Any further
>     protocols need to be referenced by name.
>
>     Best explained by example:
>
>     * Single protocol:
>
>       -blockdev id=blk1,format=raw,protocol=file,file=fedora.img
>
>       To get the simple form (*), make protocol optional with a suitable
>       default.
>
>     * blkdebug
>
>       -blockdev id=blk2,format=qcow2,protocol=blkdebug,config=test.blkdebug,\
>       base=blk2-base
>       -blockproto id=blk2-base,protocol=file,file=test.qcow2
>
>     * Avi's mirror:
>
>       -blockdev id=blk3,format=raw,protocol=mirror,\
>       base=blk3-base1,base=blk3=base2
>       -blockproto id=blk3-base1,protocol=file,file=local.img
>       -blockproto id=blk3-base2,protocol=nbd,domain=unix,sockert=nbd-sock
>
>     Anything but a single protocol becomes pretty verbose.  Syntactic
>     sugar for the blkdebug case would be possible; not sure it's worth
>     it.
>
>     No QemuOpts syntax changes.  INI can handle this just fine.
>
>    

Looks like the least painful option as no new infrastructure is needed.  
I'd go with this.



-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Qemu-devel] Re: RFC v2: blockdev_add & friends, brief rationale, QMP docs
  2010-06-15 13:44 ` [Qemu-devel] " Avi Kivity
@ 2010-06-15 14:39   ` Gerd Hoffmann
  2010-06-16 11:20   ` Kevin Wolf
  1 sibling, 0 replies; 25+ messages in thread
From: Gerd Hoffmann @ 2010-06-15 14:39 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Kevin Wolf, Christoph Hellwig, qemu-devel, Markus Armbruster,
	Luiz Capitulino

On 06/15/10 15:44, Avi Kivity wrote:
> On 06/10/2010 08:45 PM, Markus Armbruster wrote:
>>
>>
>> * Our config file format is in INI syntax. QemuOpts correspond to
>> INI sections. Sections can't be nested, so recursive QemuOpts
>> don't translate.
>
> git (and probably others) use
>
> [a "b"]
> c = d
>
> for
>
> a.b.c=d

'b' is the id in qemu, i.e. -device driver=foo,id=bar,arg=value 
translates to

[device "bar"]
   driver = "foo"
   arg = "value"

in qemu.

Oh, and there is -set device.bar.arg2=value2 btw ...

cheers,
   Gerd

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] RFC v2: blockdev_add & friends, brief rationale, QMP docs
  2010-06-15 13:40         ` Avi Kivity
@ 2010-06-15 14:54           ` Markus Armbruster
  2010-06-16  9:50             ` Avi Kivity
  0 siblings, 1 reply; 25+ messages in thread
From: Markus Armbruster @ 2010-06-15 14:54 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Kevin Wolf, Christoph Hellwig, Gerd Hoffmann, qemu-devel,
	Luiz Capitulino

Avi Kivity <avi@redhat.com> writes:

> On 06/15/2010 04:27 PM, Markus Armbruster wrote:
>>
>>> I'm only talking about the interface, not the implementation.
>>> Internal design details shouldn't be exposed.
>>>
>>> For the implementation, I imagine you can create an empty blockdev
>>> during guest device creation and treat blockdev_add/blockdev_del as
>>> media change/eject.
>>>      
>> If blockdev_del only ejects media, then we need another command to get
>> rid of a blockdev.
>>    
>
> No.  If you have a blockdev just to satisfy the guest device's need
> for a blockdev pointer, you can delete it automatically when the last
> reference goes.

That's not how netdev and chardev behave.

>> Unless you propose that blockdev_del merely ejects media if the blockdev
>> is being used by a device, but destroys the blockdev outright if not.
>> But that would be sick, wouldn't it?
>>    
>
> Create a blockdev implicitly with guest device creation, and use
> blockdev_add (or media_attach) to attach the media.
>
> (but that creates a window where the guest device is visible but media
> is not yet inserted?)

Think so, for hot plug.

Actually, the device model driver would reject an empty blockdev, unless
it's a device with removable media, such as a CD-ROM.

Having a device with fixed media go through a "no media yet" state
during initialization just complicates things.  Defining the media
*before* creating the device is much simpler.

>>>>> To pretend you're a media changer, blockdev_add all your cds at once
>>>>> and just change the guest/host association when you want to hear a new
>>>>> band.
>>>>>
>>>>> For a fake a multipath setup, blockdev_add one device, associate it
>>>>> with multiple guest interfaces.
>>>>>
>>>>> Otherwise, looks good.
>>>>>
>>>>>          
>>>> Any preference on the command line option syntax?
>>>>
>>>>        
>>> Something incredibly long and complicated?
>>>
>>> We might keep the existing stuff which is already complicated enough
>>> for users, and ask machines to build guests using QMP instead of the
>>> command line.
>>>      
>> I sketched three ways to do -blockdev.  They attempt to make the simple
>> simple, and the complex possible.  Any preference among them?
>>
>>    
>
> I'll look again.

Thanks!

>> We can't simply keep -drive, because of the flaws I listed in the
>> rationale.
>>    
>
> Ok.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] RFC v2: blockdev_add & friends, brief rationale, QMP docs
  2010-06-15 14:54           ` Markus Armbruster
@ 2010-06-16  9:50             ` Avi Kivity
  2010-06-16 11:02               ` Markus Armbruster
  0 siblings, 1 reply; 25+ messages in thread
From: Avi Kivity @ 2010-06-16  9:50 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Kevin Wolf, Christoph Hellwig, Gerd Hoffmann, qemu-devel,
	Luiz Capitulino

On 06/15/2010 05:54 PM, Markus Armbruster wrote:
> Avi Kivity<avi@redhat.com>  writes:
>
>    
>> On 06/15/2010 04:27 PM, Markus Armbruster wrote:
>>      
>>>        
>>>> I'm only talking about the interface, not the implementation.
>>>> Internal design details shouldn't be exposed.
>>>>
>>>> For the implementation, I imagine you can create an empty blockdev
>>>> during guest device creation and treat blockdev_add/blockdev_del as
>>>> media change/eject.
>>>>
>>>>          
>>> If blockdev_del only ejects media, then we need another command to get
>>> rid of a blockdev.
>>>
>>>        
>> No.  If you have a blockdev just to satisfy the guest device's need
>> for a blockdev pointer, you can delete it automatically when the last
>> reference goes.
>>      
> That's not how netdev and chardev behave.
>    

Ok.  To me duplicate argument lists suggest a lack of orthogonality, though.

How about

   blockdev_add id=...
   media_attach blockdev=..., media-parameters
   media_detach blockdev=...
   blockdev_del id=...

So blockdev_add/del define/remove a slot for the media, 
media_attach/detach connect it to media.

>>> Unless you propose that blockdev_del merely ejects media if the blockdev
>>> is being used by a device, but destroys the blockdev outright if not.
>>> But that would be sick, wouldn't it?
>>>
>>>        
>> Create a blockdev implicitly with guest device creation, and use
>> blockdev_add (or media_attach) to attach the media.
>>
>> (but that creates a window where the guest device is visible but media
>> is not yet inserted?)
>>      
> Think so, for hot plug.
>
> Actually, the device model driver would reject an empty blockdev, unless
> it's a device with removable media, such as a CD-ROM.
>
> Having a device with fixed media go through a "no media yet" state
> during initialization just complicates things.  Defining the media
> *before* creating the device is much simpler.
>    

That is true.  Maybe we should just ignore the duplication.

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] RFC v2: blockdev_add & friends, brief rationale, QMP docs
  2010-06-16  9:50             ` Avi Kivity
@ 2010-06-16 11:02               ` Markus Armbruster
  2010-06-16 11:06                 ` Avi Kivity
  0 siblings, 1 reply; 25+ messages in thread
From: Markus Armbruster @ 2010-06-16 11:02 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Kevin Wolf, Christoph Hellwig, Gerd Hoffmann, qemu-devel,
	Luiz Capitulino

Avi Kivity <avi@redhat.com> writes:

> On 06/15/2010 05:54 PM, Markus Armbruster wrote:
>> Avi Kivity<avi@redhat.com>  writes:
>>
>>    
>>> On 06/15/2010 04:27 PM, Markus Armbruster wrote:
>>>      
>>>>        
>>>>> I'm only talking about the interface, not the implementation.
>>>>> Internal design details shouldn't be exposed.
>>>>>
>>>>> For the implementation, I imagine you can create an empty blockdev
>>>>> during guest device creation and treat blockdev_add/blockdev_del as
>>>>> media change/eject.
>>>>>
>>>>>          
>>>> If blockdev_del only ejects media, then we need another command to get
>>>> rid of a blockdev.
>>>>
>>>>        
>>> No.  If you have a blockdev just to satisfy the guest device's need
>>> for a blockdev pointer, you can delete it automatically when the last
>>> reference goes.
>>>      
>> That's not how netdev and chardev behave.
>>    
>
> Ok.  To me duplicate argument lists suggest a lack of orthogonality, though.
>
> How about
>
>   blockdev_add id=...
>   media_attach blockdev=..., media-parameters
>   media_detach blockdev=...
>   blockdev_del id=...
>
> So blockdev_add/del define/remove a slot for the media,
> media_attach/detach connect it to media.

Yes, those are orthogonal operations.  Two more are side effects of
device_add and device_del: attach device model, detach device model.

My proposal provides two inessential operations purely for convenience:
define slot + connect media, and disconnect media + remove slot.  These
patterns are very common.

>>>> Unless you propose that blockdev_del merely ejects media if the blockdev
>>>> is being used by a device, but destroys the blockdev outright if not.
>>>> But that would be sick, wouldn't it?
>>>>
>>>>        
>>> Create a blockdev implicitly with guest device creation, and use
>>> blockdev_add (or media_attach) to attach the media.
>>>
>>> (but that creates a window where the guest device is visible but media
>>> is not yet inserted?)
>>>      
>> Think so, for hot plug.
>>
>> Actually, the device model driver would reject an empty blockdev, unless
>> it's a device with removable media, such as a CD-ROM.
>>
>> Having a device with fixed media go through a "no media yet" state
>> during initialization just complicates things.  Defining the media
>> *before* creating the device is much simpler.
>>    
>
> That is true.  Maybe we should just ignore the duplication.

It's fine to create empty blockdev, insert media, attach device model,
even if the device model wants fixed media.

Nevertheless, I think my convenience shortcuts make sense.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] RFC v2: blockdev_add & friends, brief rationale, QMP docs
  2010-06-16 11:02               ` Markus Armbruster
@ 2010-06-16 11:06                 ` Avi Kivity
  0 siblings, 0 replies; 25+ messages in thread
From: Avi Kivity @ 2010-06-16 11:06 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Kevin Wolf, Christoph Hellwig, Gerd Hoffmann, qemu-devel,
	Luiz Capitulino

On 06/16/2010 02:02 PM, Markus Armbruster wrote:
>
>> That is true.  Maybe we should just ignore the duplication.
>>      
> It's fine to create empty blockdev, insert media, attach device model,
> even if the device model wants fixed media.
>
> Nevertheless, I think my convenience shortcuts make sense.
>    

OK.

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Qemu-devel] Re: RFC v2: blockdev_add & friends, brief rationale, QMP docs
  2010-06-15 13:44 ` [Qemu-devel] " Avi Kivity
  2010-06-15 14:39   ` Gerd Hoffmann
@ 2010-06-16 11:20   ` Kevin Wolf
  2010-06-16 12:41     ` Markus Armbruster
  1 sibling, 1 reply; 25+ messages in thread
From: Kevin Wolf @ 2010-06-16 11:20 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Christoph Hellwig, qemu-devel, Markus Armbruster,
	Luiz Capitulino, Gerd Hoffmann

Am 15.06.2010 15:44, schrieb Avi Kivity:
> On 06/10/2010 08:45 PM, Markus Armbruster wrote:
>>
>>
>>     * Our config file format is in INI syntax.  QemuOpts correspond to
>>       INI sections.  Sections can't be nested, so recursive QemuOpts
>>       don't translate.
>>    
> 
> git (and probably others) use
> 
> [a "b"]
>      c = d
> 
> for
> 
>     a.b.c=d
> 
>>     Examples:
>>
>>     * Single protocol:
>>
>>       -blockdev id=blk1,format=raw,protocol=[file,file=fedora.img]
>>
>>       Requires suitable syntactic sugar to get the simple form (*).
>>
>>     * blkdebug
>>
>>       -blockdev id=blk2,format=qcow2,\
>>       protocol=[blkdebug,config=test.blkdebug,\
>>       protocol=[file,file=test.qcow2]]
>>
>>     * Avi's mirror:
>>
>>       -blockdev id=blk3,format=raw,\
>>       protocol=[mirror,\
>>       [file,file=local.img],\
>>       [nbd,domain=unix,sockert=nbd-sock]]
>>
>> 2. We already have a syntax to specify trees, namely JSON, so use it
>>
>>     If -blockdev's argument starts with '{', it's a JSON object suitable
>>     as argument of blockdev_add in QMP.
>>
>>     We still provide ordinary QemuOpts syntax for the cases that can be
>>     expressed with it, i.e. single protocol.
>>
>>     I figure we'd want syntactic sugar for blkdebug, to permit its use
>>     from the command line without having to resort to JSON.
>>    
> 
> Might be nice as a general extension to QemuOpts.

I agree.

> 
>> 3. Stack protocols through named references
>>
>>     The first protocol is "inlined" into -blockdev.  Any further
>>     protocols need to be referenced by name.
>>
>>     Best explained by example:
>>
>>     * Single protocol:
>>
>>       -blockdev id=blk1,format=raw,protocol=file,file=fedora.img
>>
>>       To get the simple form (*), make protocol optional with a suitable
>>       default.
>>
>>     * blkdebug
>>
>>       -blockdev id=blk2,format=qcow2,protocol=blkdebug,config=test.blkdebug,\
>>       base=blk2-base
>>       -blockproto id=blk2-base,protocol=file,file=test.qcow2
>>
>>     * Avi's mirror:
>>
>>       -blockdev id=blk3,format=raw,protocol=mirror,\
>>       base=blk3-base1,base=blk3=base2
>>       -blockproto id=blk3-base1,protocol=file,file=local.img
>>       -blockproto id=blk3-base2,protocol=nbd,domain=unix,sockert=nbd-sock
>>
>>     Anything but a single protocol becomes pretty verbose.  Syntactic
>>     sugar for the blkdebug case would be possible; not sure it's worth
>>     it.
>>
>>     No QemuOpts syntax changes.  INI can handle this just fine.
>>
>>    
> 
> Looks like the least painful option as no new infrastructure is needed.  
> I'd go with this.

But it's painful to type for the user. After all -blockdev on the
command line is for the user, as tools should use QMP. Also note that
this syntax mixes format and protocol options on one line which I
consider confusing at best.

As I told Markus already in private before he posted this, I prefer the
bracket solution for its clarity and simplicity, even though it comes at
the cost of having additional characters that need to be escaped.

Kevin

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Qemu-devel] Re: RFC v2: blockdev_add & friends, brief rationale, QMP docs
  2010-06-16 11:20   ` Kevin Wolf
@ 2010-06-16 12:41     ` Markus Armbruster
  2010-06-16 13:41       ` Anthony Liguori
  0 siblings, 1 reply; 25+ messages in thread
From: Markus Armbruster @ 2010-06-16 12:41 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Christoph Hellwig, qemu-devel, Luiz Capitulino, Gerd Hoffmann,
	Avi Kivity

Kevin Wolf <kwolf@redhat.com> writes:

> Am 15.06.2010 15:44, schrieb Avi Kivity:
>> On 06/10/2010 08:45 PM, Markus Armbruster wrote:
>>>
>>>
>>>     * Our config file format is in INI syntax.  QemuOpts correspond to
>>>       INI sections.  Sections can't be nested, so recursive QemuOpts
>>>       don't translate.
>>>    
>> 
>> git (and probably others) use
>> 
>> [a "b"]
>>      c = d
>> 
>> for
>> 
>>     a.b.c=d
>> 
>>>     Examples:
>>>
>>>     * Single protocol:
>>>
>>>       -blockdev id=blk1,format=raw,protocol=[file,file=fedora.img]
>>>
>>>       Requires suitable syntactic sugar to get the simple form (*).
>>>
>>>     * blkdebug
>>>
>>>       -blockdev id=blk2,format=qcow2,\
>>>       protocol=[blkdebug,config=test.blkdebug,\
>>>       protocol=[file,file=test.qcow2]]
>>>
>>>     * Avi's mirror:
>>>
>>>       -blockdev id=blk3,format=raw,\
>>>       protocol=[mirror,\
>>>       [file,file=local.img],\
>>>       [nbd,domain=unix,sockert=nbd-sock]]
>>>
>>> 2. We already have a syntax to specify trees, namely JSON, so use it
>>>
>>>     If -blockdev's argument starts with '{', it's a JSON object suitable
>>>     as argument of blockdev_add in QMP.
>>>
>>>     We still provide ordinary QemuOpts syntax for the cases that can be
>>>     expressed with it, i.e. single protocol.
>>>
>>>     I figure we'd want syntactic sugar for blkdebug, to permit its use
>>>     from the command line without having to resort to JSON.
>>>    
>> 
>> Might be nice as a general extension to QemuOpts.
>
> I agree.
>
>> 
>>> 3. Stack protocols through named references
>>>
>>>     The first protocol is "inlined" into -blockdev.  Any further
>>>     protocols need to be referenced by name.
>>>
>>>     Best explained by example:
>>>
>>>     * Single protocol:
>>>
>>>       -blockdev id=blk1,format=raw,protocol=file,file=fedora.img
>>>
>>>       To get the simple form (*), make protocol optional with a suitable
>>>       default.
>>>
>>>     * blkdebug
>>>
>>>       -blockdev id=blk2,format=qcow2,protocol=blkdebug,config=test.blkdebug,\
>>>       base=blk2-base
>>>       -blockproto id=blk2-base,protocol=file,file=test.qcow2
>>>
>>>     * Avi's mirror:
>>>
>>>       -blockdev id=blk3,format=raw,protocol=mirror,\
>>>       base=blk3-base1,base=blk3=base2
>>>       -blockproto id=blk3-base1,protocol=file,file=local.img
>>>       -blockproto id=blk3-base2,protocol=nbd,domain=unix,sockert=nbd-sock
>>>
>>>     Anything but a single protocol becomes pretty verbose.  Syntactic
>>>     sugar for the blkdebug case would be possible; not sure it's worth
>>>     it.
>>>
>>>     No QemuOpts syntax changes.  INI can handle this just fine.
>>>
>>>    
>> 
>> Looks like the least painful option as no new infrastructure is needed.  
>> I'd go with this.
>
> But it's painful to type for the user. After all -blockdev on the
> command line is for the user, as tools should use QMP. Also note that
> this syntax mixes format and protocol options on one line which I
> consider confusing at best.
>
> As I told Markus already in private before he posted this, I prefer the
> bracket solution for its clarity and simplicity, even though it comes at
> the cost of having additional characters that need to be escaped.

I dont't think 1. is less painful than 3.  Let's compare the two:

* Single protocol: identical with suitable syntactical sugar, namely

      -blockdev id=blk1,file=fedora.img

  Unsugared it's

      -blockdev id=blk1,format=raw,protocol=[file,file=fedora.img]
  vs.

      -blockdev id=blk1,format=raw,protocol=file,file=fedora.img

  I sure prefer the latter.  The brackets look like noise.  You need to
  understand protocol stacking for them to make any sense.

  Regarding confusion caused by mixing format and protocol options: yes,
  the brackets force you to distinguish between protocol options and
  other options.  But I doubt that'll reduce confusion here.  Either you
  understand protocols.  Then I doubt you need brackets to unconfuse
  you.  Or you don't understand protocols.  Then whether to put an
  option inside or outside the brackets is voodoo.

* blkdebug:

      -blockdev id=blk2,format=qcow2,\
      protocol=[blkdebug,config=test.blkdebug,\
      protocol=[file,file=test.qcow2]]

  vs.

      -blockdev id=blk2,format=qcow2,protocol=blkdebug,config=test.blkdebug,\
      base=blk2-base
      -blockproto id=blk2-base,protocol=file,file=test.qcow2

  Both look equally painfull to me.  It's slightly shorter with
  brackets, but I can't recognize "clarity and simplicity" there.

  To really reduce pain, we'd have to invent special-purpose syntactic
  sugar in either case.

* Avi's mirror:

      -blockdev id=blk3,format=raw,\
      protocol=[mirror,\
      [file,file=local.img],\
      [nbd,domain=unix,sockert=nbd-sock]]

  vs.

      -blockdev id=blk3,format=raw,protocol=mirror,\
      base=blk3-base1,base=blk3=base2
      -blockproto id=blk3-base1,protocol=file,file=local.img
      -blockproto id=blk3-base2,protocol=nbd,domain=unix,sockert=nbd-sock

  Same as for blkdebug.

Both with brackets and with -blockproto the protocol is clearly
separated.  Only the topmost protocol isn't with -blockproto.  I keep
that in -blockdev merely for brevity.  It could live in its own
-blockproto.

Brackets cause some ugly complications, both in the code and in the user
interface, as detailed in my original message.  We need some pretty
convincing advantages to justify those complications.  I can't see them.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] Re: RFC v2: blockdev_add & friends, brief rationale, QMP docs
  2010-06-16 12:41     ` Markus Armbruster
@ 2010-06-16 13:41       ` Anthony Liguori
  2010-06-16 13:57         ` Kevin Wolf
  0 siblings, 1 reply; 25+ messages in thread
From: Anthony Liguori @ 2010-06-16 13:41 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Kevin Wolf, Christoph Hellwig, qemu-devel, Luiz Capitulino,
	Avi Kivity, Gerd Hoffmann

On 06/16/2010 07:41 AM, Markus Armbruster wrote:
> Kevin Wolf<kwolf@redhat.com>  writes:
>
>    
>> Am 15.06.2010 15:44, schrieb Avi Kivity:
>>      
>>> On 06/10/2010 08:45 PM, Markus Armbruster wrote:
>>>        
>>>>
>>>>      * Our config file format is in INI syntax.  QemuOpts correspond to
>>>>        INI sections.  Sections can't be nested, so recursive QemuOpts
>>>>        don't translate.
>>>>
>>>>          
>>> git (and probably others) use
>>>
>>> [a "b"]
>>>       c = d
>>>
>>> for
>>>
>>>      a.b.c=d
>>>
>>>        
>>>>      Examples:
>>>>
>>>>      * Single protocol:
>>>>
>>>>        -blockdev id=blk1,format=raw,protocol=[file,file=fedora.img]
>>>>
>>>>        Requires suitable syntactic sugar to get the simple form (*).
>>>>
>>>>      * blkdebug
>>>>
>>>>        -blockdev id=blk2,format=qcow2,\
>>>>        protocol=[blkdebug,config=test.blkdebug,\
>>>>        protocol=[file,file=test.qcow2]]
>>>>
>>>>      * Avi's mirror:
>>>>
>>>>        -blockdev id=blk3,format=raw,\
>>>>        protocol=[mirror,\
>>>>        [file,file=local.img],\
>>>>        [nbd,domain=unix,sockert=nbd-sock]]
>>>>
>>>> 2. We already have a syntax to specify trees, namely JSON, so use it
>>>>
>>>>      If -blockdev's argument starts with '{', it's a JSON object suitable
>>>>      as argument of blockdev_add in QMP.
>>>>
>>>>      We still provide ordinary QemuOpts syntax for the cases that can be
>>>>      expressed with it, i.e. single protocol.
>>>>
>>>>      I figure we'd want syntactic sugar for blkdebug, to permit its use
>>>>      from the command line without having to resort to JSON.
>>>>
>>>>          
>>> Might be nice as a general extension to QemuOpts.
>>>        
>> I agree.
>>
>>      
>>>        
>>>> 3. Stack protocols through named references
>>>>
>>>>      The first protocol is "inlined" into -blockdev.  Any further
>>>>      protocols need to be referenced by name.
>>>>
>>>>      Best explained by example:
>>>>
>>>>      * Single protocol:
>>>>
>>>>        -blockdev id=blk1,format=raw,protocol=file,file=fedora.img
>>>>
>>>>        To get the simple form (*), make protocol optional with a suitable
>>>>        default.
>>>>
>>>>      * blkdebug
>>>>
>>>>        -blockdev id=blk2,format=qcow2,protocol=blkdebug,config=test.blkdebug,\
>>>>        base=blk2-base
>>>>        -blockproto id=blk2-base,protocol=file,file=test.qcow2
>>>>
>>>>      * Avi's mirror:
>>>>
>>>>        -blockdev id=blk3,format=raw,protocol=mirror,\
>>>>        base=blk3-base1,base=blk3=base2
>>>>        -blockproto id=blk3-base1,protocol=file,file=local.img
>>>>        -blockproto id=blk3-base2,protocol=nbd,domain=unix,sockert=nbd-sock
>>>>
>>>>      Anything but a single protocol becomes pretty verbose.  Syntactic
>>>>      sugar for the blkdebug case would be possible; not sure it's worth
>>>>      it.
>>>>
>>>>      No QemuOpts syntax changes.  INI can handle this just fine.
>>>>
>>>>
>>>>          
>>> Looks like the least painful option as no new infrastructure is needed.
>>> I'd go with this.
>>>        
>> But it's painful to type for the user. After all -blockdev on the
>> command line is for the user, as tools should use QMP. Also note that
>> this syntax mixes format and protocol options on one line which I
>> consider confusing at best.
>>
>> As I told Markus already in private before he posted this, I prefer the
>> bracket solution for its clarity and simplicity, even though it comes at
>> the cost of having additional characters that need to be escaped.
>>      
> I dont't think 1. is less painful than 3.  Let's compare the two:
>
> * Single protocol: identical with suitable syntactical sugar, namely
>
>        -blockdev id=blk1,file=fedora.img
>    

First, let me say that -blockdev is not something that I believe is 
targeted at users.  It's incredible unfair for us to expect a user to type:

-blockdev id=blk1,file=fedora.img -device ide-drive,drive=blk1,bus=0,unit=0

Instead of:

-hda fedora.img

I had to look up the device syntax just to write that.  There's no way 
users are going to do this.  We should drop any notion of syntactical 
sugar IMHO.  -blockdev is for management tools, scripts, and as an 
infrastructure for config files.

>    Unsugared it's
>
>        -blockdev id=blk1,format=raw,protocol=[file,file=fedora.img]
>    vs.
>
>        -blockdev id=blk1,format=raw,protocol=file,file=fedora.img
>    

Specifying nesting in a single option is a bad idea.  It should be:

-blockdev id=blk1,format=raw,protocol=blk2  \
-blockdev id=blk2,file=fedora.img

But honestly, I'm thoroughly confused about the distinction between 
protocol and format.  I had thought that protocols were a type of format 
and I'm not sure why we're making a distinction.

>    I sure prefer the latter.  The brackets look like noise.  You need to
>    understand protocol stacking for them to make any sense.
>
>    Regarding confusion caused by mixing format and protocol options: yes,
>    the brackets force you to distinguish between protocol options and
>    other options.  But I doubt that'll reduce confusion here.  Either you
>    understand protocols.  Then I doubt you need brackets to unconfuse
>    you.  Or you don't understand protocols.  Then whether to put an
>    option inside or outside the brackets is voodoo.
>    

If the above is necessary just to create a raw image, then we're doing 
something wrong in the block layer.  If should be possible to just say:

-blockdev id=blk1,format=raw,file=fedora.img

Regards,

Anthony Liguori

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] Re: RFC v2: blockdev_add & friends, brief rationale, QMP docs
  2010-06-16 13:41       ` Anthony Liguori
@ 2010-06-16 13:57         ` Kevin Wolf
  2010-06-16 14:24           ` Christoph Hellwig
  2010-06-16 18:07           ` Anthony Liguori
  0 siblings, 2 replies; 25+ messages in thread
From: Kevin Wolf @ 2010-06-16 13:57 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Christoph Hellwig, qemu-devel, Markus Armbruster,
	Luiz Capitulino, Avi Kivity, Gerd Hoffmann

Am 16.06.2010 15:41, schrieb Anthony Liguori:
> On 06/16/2010 07:41 AM, Markus Armbruster wrote:
>> Kevin Wolf<kwolf@redhat.com>  writes:
>>> But it's painful to type for the user. After all -blockdev on the
>>> command line is for the user, as tools should use QMP. Also note that
>>> this syntax mixes format and protocol options on one line which I
>>> consider confusing at best.
>>>
>>> As I told Markus already in private before he posted this, I prefer the
>>> bracket solution for its clarity and simplicity, even though it comes at
>>> the cost of having additional characters that need to be escaped.
>>>      
>> I dont't think 1. is less painful than 3.  Let's compare the two:
>>
>> * Single protocol: identical with suitable syntactical sugar, namely
>>
>>        -blockdev id=blk1,file=fedora.img
>>    
> 
> First, let me say that -blockdev is not something that I believe is 
> targeted at users.  It's incredible unfair for us to expect a user to type:
> 
> -blockdev id=blk1,file=fedora.img -device ide-drive,drive=blk1,bus=0,unit=0
> 
> Instead of:
> 
> -hda fedora.img

Sure thing, as long as -hda provides all the options. I usually start
off with -hda, but after a while I need to set some option and switch to
-drive. This is what most users are using today.

If we're not going to extend -drive to cover all features, then users
will (have to) start using -blockdev.

> I had to look up the device syntax just to write that.  There's no way 
> users are going to do this.  We should drop any notion of syntactical 
> sugar IMHO.  -blockdev is for management tools, scripts, and as an 
> infrastructure for config files.

In that case, let's go for the JSON version. But it requires that
everything that -blockdev provides is accessible with -drive, too (or
that we're okay with users hating us).

> 
>>    Unsugared it's
>>
>>        -blockdev id=blk1,format=raw,protocol=[file,file=fedora.img]
>>    vs.
>>
>>        -blockdev id=blk1,format=raw,protocol=file,file=fedora.img
>>    
> 
> Specifying nesting in a single option is a bad idea.  It should be:
> 
> -blockdev id=blk1,format=raw,protocol=blk2  \
> -blockdev id=blk2,file=fedora.img

I agree, but again, this makes it an option only for tools.

> But honestly, I'm thoroughly confused about the distinction between 
> protocol and format.  I had thought that protocols were a type of format 
> and I'm not sure why we're making a distinction.

Technically, they are mostly the same. Logically, they are not. You have
one image format driver (raw, qcow2, ...) that accesses its image data
through one or more stacked protocols (file, host_device, nbd, http, ...).

In the past we've had quite some trouble because there was no clear
distinction. raw and file was the same. If you had an image on a block
device, you were asking for trouble.

>>    I sure prefer the latter.  The brackets look like noise.  You need to
>>    understand protocol stacking for them to make any sense.
>>
>>    Regarding confusion caused by mixing format and protocol options: yes,
>>    the brackets force you to distinguish between protocol options and
>>    other options.  But I doubt that'll reduce confusion here.  Either you
>>    understand protocols.  Then I doubt you need brackets to unconfuse
>>    you.  Or you don't understand protocols.  Then whether to put an
>>    option inside or outside the brackets is voodoo.
>>    
> 
> If the above is necessary just to create a raw image, then we're doing 
> something wrong in the block layer.  If should be possible to just say:
> 
> -blockdev id=blk1,format=raw,file=fedora.img

I think we all agree on this (although it contradicts what you said
above, because file is a property of the protocol). The question is how
to specify protocols explicitly.

Kevin

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] Re: RFC v2: blockdev_add & friends, brief rationale, QMP docs
  2010-06-16 13:57         ` Kevin Wolf
@ 2010-06-16 14:24           ` Christoph Hellwig
  2010-06-16 14:47             ` Markus Armbruster
  2010-06-16 18:07           ` Anthony Liguori
  1 sibling, 1 reply; 25+ messages in thread
From: Christoph Hellwig @ 2010-06-16 14:24 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Christoph Hellwig, Markus Armbruster, qemu-devel,
	Luiz Capitulino, Avi Kivity, Gerd Hoffmann

On Wed, Jun 16, 2010 at 03:57:46PM +0200, Kevin Wolf wrote:
> Sure thing, as long as -hda provides all the options. I usually start
> off with -hda, but after a while I need to set some option and switch to
> -drive. This is what most users are using today.

I've never even bothered using -hda - it's not anywhere near useful
for usable setup (cache=none, virtio & co).

> If we're not going to extend -drive to cover all features, then users
> will (have to) start using -blockdev.

-drive is such a mess that we need to fully replace it.

> Technically, they are mostly the same. Logically, they are not. You have
> one image format driver (raw, qcow2, ...) that accesses its image data
> through one or more stacked protocols (file, host_device, nbd, http, ...).

That's where it gets really confusing.  To my naive sense everything
that takes fileish input below is an image format.  So to me stacking
image formats makes sense, stacking protocols as the fundamental way
to access data (file/http/nbd) doesn't.

The only stacking protocol right now is blkdebug, and that would fit
much better as a stacking image format, which could use ->bdrv_open
instead of ->bdrv_file_open, etc.  The only thing we'd require for that
is allowing to pass options to image formats.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] Re: RFC v2: blockdev_add & friends, brief rationale, QMP docs
  2010-06-16 14:24           ` Christoph Hellwig
@ 2010-06-16 14:47             ` Markus Armbruster
  0 siblings, 0 replies; 25+ messages in thread
From: Markus Armbruster @ 2010-06-16 14:47 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Kevin Wolf, Christoph Hellwig, qemu-devel, Luiz Capitulino,
	Avi Kivity, Gerd Hoffmann

Christoph Hellwig <hch@lst.de> writes:

> On Wed, Jun 16, 2010 at 03:57:46PM +0200, Kevin Wolf wrote:
>> Sure thing, as long as -hda provides all the options. I usually start
>> off with -hda, but after a while I need to set some option and switch to
>> -drive. This is what most users are using today.
>
> I've never even bothered using -hda - it's not anywhere near useful
> for usable setup (cache=none, virtio & co).
>
>> If we're not going to extend -drive to cover all features, then users
>> will (have to) start using -blockdev.
>
> -drive is such a mess that we need to fully replace it.

Agree.

>> Technically, they are mostly the same. Logically, they are not. You have
>> one image format driver (raw, qcow2, ...) that accesses its image data
>> through one or more stacked protocols (file, host_device, nbd, http, ...).
>
> That's where it gets really confusing.  To my naive sense everything
> that takes fileish input below is an image format.  So to me stacking
> image formats makes sense, stacking protocols as the fundamental way
> to access data (file/http/nbd) doesn't.
>
> The only stacking protocol right now is blkdebug, and that would fit
> much better as a stacking image format, which could use ->bdrv_open
> instead of ->bdrv_file_open, etc.  The only thing we'd require for that
> is allowing to pass options to image formats.

So we still don't agree on what a protocol is?!?

Talking about concrete command line syntax makes no sense at all before
we agree on the abstract syntax, i.e. the information we want to convey.
Please go right back to the head of the thread, and review the QMP
command docs.  The QMP JSON is a straightforward concrete syntax for the
abstract syntax.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] Re: RFC v2: blockdev_add & friends, brief rationale, QMP docs
  2010-06-16 13:57         ` Kevin Wolf
  2010-06-16 14:24           ` Christoph Hellwig
@ 2010-06-16 18:07           ` Anthony Liguori
  2010-06-17  8:20             ` Kevin Wolf
  1 sibling, 1 reply; 25+ messages in thread
From: Anthony Liguori @ 2010-06-16 18:07 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Christoph Hellwig, qemu-devel, Markus Armbruster,
	Luiz Capitulino, Avi Kivity, Gerd Hoffmann

On 06/16/2010 08:57 AM, Kevin Wolf wrote:
> Am 16.06.2010 15:41, schrieb Anthony Liguori:
>    
>> On 06/16/2010 07:41 AM, Markus Armbruster wrote:
>>      
>>> Kevin Wolf<kwolf@redhat.com>   writes:
>>>        
>>>> But it's painful to type for the user. After all -blockdev on the
>>>> command line is for the user, as tools should use QMP. Also note that
>>>> this syntax mixes format and protocol options on one line which I
>>>> consider confusing at best.
>>>>
>>>> As I told Markus already in private before he posted this, I prefer the
>>>> bracket solution for its clarity and simplicity, even though it comes at
>>>> the cost of having additional characters that need to be escaped.
>>>>
>>>>          
>>> I dont't think 1. is less painful than 3.  Let's compare the two:
>>>
>>> * Single protocol: identical with suitable syntactical sugar, namely
>>>
>>>         -blockdev id=blk1,file=fedora.img
>>>
>>>        
>> First, let me say that -blockdev is not something that I believe is
>> targeted at users.  It's incredible unfair for us to expect a user to type:
>>
>> -blockdev id=blk1,file=fedora.img -device ide-drive,drive=blk1,bus=0,unit=0
>>
>> Instead of:
>>
>> -hda fedora.img
>>      
> Sure thing, as long as -hda provides all the options. I usually start
> off with -hda, but after a while I need to set some option and switch to
> -drive. This is what most users are using today.
>
> If we're not going to extend -drive to cover all features, then users
> will (have to) start using -blockdev.
>
>    
>> I had to look up the device syntax just to write that.  There's no way
>> users are going to do this.  We should drop any notion of syntactical
>> sugar IMHO.  -blockdev is for management tools, scripts, and as an
>> infrastructure for config files.
>>      
> In that case, let's go for the JSON version.

We need separate options to map to a configuration file.  We already 
represent trees of information in the configuration files and there's an 
established way of doing this (naming nodes with 'id' and then 
referencing them as a parent).

>   But it requires that
> everything that -blockdev provides is accessible with -drive, too (or
> that we're okay with users hating us).
>    

I'm happy for -drive to die.  I think we should support -hda and 
-blockdev.  -blockdev should be optimized for config files, not single 
argument input.  IOW:

[blockdev "blk2"]
  format = "raw"
  file = "/path/to/base.img"
  cache = "writeback"

[blockdev "blk1"]
   format = "qcow2"
   file = "/path/to/leaf.img"
   cache="off"
   backing_dev = "blk2"

[device "disk1"]
   driver = "ide-drive"
   blockdev = "blk1"
   bus = "0"
   unit = "0"

Or:

qemu -blockdev id=blk2,format=raw,file=/path/to/base.img,cache=writeback \
           -blockdev 
id=blk1,format=qcow2,file=/path/to/leaf.img,backing_dev=blk2 \
           -device ide-disk,blockdev=blk1,bus=0,unit=0

Or:

qemu -hda /path/to/leaf.img

And if a user really feels they need to modify the defaults, they can do:

qemu -hda /path/to/leaf.img -writeconfig myconf.cfg

And edit from there.

>> But honestly, I'm thoroughly confused about the distinction between
>> protocol and format.  I had thought that protocols were a type of format
>> and I'm not sure why we're making a distinction.
>>      
> Technically, they are mostly the same. Logically, they are not. You have
> one image format driver (raw, qcow2, ...) that accesses its image data
> through one or more stacked protocols (file, host_device, nbd, http, ...).
>
> In the past we've had quite some trouble because there was no clear
> distinction. raw and file was the same. If you had an image on a block
> device, you were asking for trouble.
>    

As Christoph mentions, we really don't have stacked protocols and I'm 
not sure they make sense.

>>>     I sure prefer the latter.  The brackets look like noise.  You need to
>>>     understand protocol stacking for them to make any sense.
>>>
>>>     Regarding confusion caused by mixing format and protocol options: yes,
>>>     the brackets force you to distinguish between protocol options and
>>>     other options.  But I doubt that'll reduce confusion here.  Either you
>>>     understand protocols.  Then I doubt you need brackets to unconfuse
>>>     you.  Or you don't understand protocols.  Then whether to put an
>>>     option inside or outside the brackets is voodoo.
>>>
>>>        
>> If the above is necessary just to create a raw image, then we're doing
>> something wrong in the block layer.  If should be possible to just say:
>>
>> -blockdev id=blk1,format=raw,file=fedora.img
>>      
> I think we all agree on this (although it contradicts what you said
> above, because file is a property of the protocol). The question is how
> to specify protocols explicitly.
>    

I think raw doesn't make very much sense then.  What's the point of it 
if it's just a thin wrapper around a protocol?

Regards,

Anthony Liguori

> Kevin
>    

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] Re: RFC v2: blockdev_add & friends, brief rationale, QMP docs
  2010-06-16 18:07           ` Anthony Liguori
@ 2010-06-17  8:20             ` Kevin Wolf
  2010-06-17 13:01               ` Anthony Liguori
  0 siblings, 1 reply; 25+ messages in thread
From: Kevin Wolf @ 2010-06-17  8:20 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Christoph Hellwig, qemu-devel, Markus Armbruster,
	Luiz Capitulino, Avi Kivity, Gerd Hoffmann

Am 16.06.2010 20:07, schrieb Anthony Liguori:
>>   But it requires that
>> everything that -blockdev provides is accessible with -drive, too (or
>> that we're okay with users hating us).
>>    
> 
> I'm happy for -drive to die.  I think we should support -hda and 
> -blockdev. 

-hda is not sufficient for most users. It doesn't provide any options.
It doesn't even support virtio. If -drive is going to die (and we seem
to agree all on that), then -blockdev needs to be usable for users (and
it's only you who contradicts so far).

> -blockdev should be optimized for config files, not single 
> argument input.  IOW:
> 
> [blockdev "blk2"]
>   format = "raw"
>   file = "/path/to/base.img"
>   cache = "writeback"
> 
> [blockdev "blk1"]
>    format = "qcow2"
>    file = "/path/to/leaf.img"
>    cache="off"
>    backing_dev = "blk2"
> 
> [device "disk1"]
>    driver = "ide-drive"
>    blockdev = "blk1"
>    bus = "0"
>    unit = "0"

You don't specify the backing file of an image on the command line (or
in the configuration file). It's saved as part of the image. It's more
like this (for a simple raw image file):

[blockdev-protocol "proto1"]
   protocol = "file"
   file = "/path/to/image.img"

[blockdev "blk1"]
   format = "raw"
   cache="off"
   protocol = "proto1"

[device "disk1"]
   driver = "ide-drive"
   blockdev = "blk1"
   bus = "0"
   unit = "0"

(This would be Markus' option 3, I think)

> Or:
> 
> qemu -blockdev id=blk2,format=raw,file=/path/to/base.img,cache=writeback \
>            -blockdev 
> id=blk1,format=qcow2,file=/path/to/leaf.img,backing_dev=blk2 \
>            -device ide-disk,blockdev=blk1,bus=0,unit=0
> 
> Or:
> 
> qemu -hda /path/to/leaf.img
> 
> And if a user really feels they need to modify the defaults, they can do:
> 
> qemu -hda /path/to/leaf.img -writeconfig myconf.cfg
> 
> And edit from there.
> 
>>> But honestly, I'm thoroughly confused about the distinction between
>>> protocol and format.  I had thought that protocols were a type of format
>>> and I'm not sure why we're making a distinction.
>>>      
>> Technically, they are mostly the same. Logically, they are not. You have
>> one image format driver (raw, qcow2, ...) that accesses its image data
>> through one or more stacked protocols (file, host_device, nbd, http, ...).
>>
>> In the past we've had quite some trouble because there was no clear
>> distinction. raw and file was the same. If you had an image on a block
>> device, you were asking for trouble.
>>    
> 
> As Christoph mentions, we really don't have stacked protocols and I'm 
> not sure they make sense.

Right, if we go for Christoph's suggestion, we don't need stacked
protocols. We'll have stacked formats instead. I'm not sure if you like
this any better. ;-)

We do have stacking today. -hda blkdebug:test.cfg:foo.qcow2 is qcow2 on
blkdebug on file. We need to be able to represent this.

>>>>     I sure prefer the latter.  The brackets look like noise.  You need to
>>>>     understand protocol stacking for them to make any sense.
>>>>
>>>>     Regarding confusion caused by mixing format and protocol options: yes,
>>>>     the brackets force you to distinguish between protocol options and
>>>>     other options.  But I doubt that'll reduce confusion here.  Either you
>>>>     understand protocols.  Then I doubt you need brackets to unconfuse
>>>>     you.  Or you don't understand protocols.  Then whether to put an
>>>>     option inside or outside the brackets is voodoo.
>>>>
>>>>        
>>> If the above is necessary just to create a raw image, then we're doing
>>> something wrong in the block layer.  If should be possible to just say:
>>>
>>> -blockdev id=blk1,format=raw,file=fedora.img
>>>      
>> I think we all agree on this (although it contradicts what you said
>> above, because file is a property of the protocol). The question is how
>> to specify protocols explicitly.
> 
> I think raw doesn't make very much sense then.  What's the point of it 
> if it's just a thin wrapper around a protocol?

That it can be wrapped around any protocol. It's just about separating
code for handling the content of an image and code for accessing the image.

Ever tried something like "qemu-img create -f raw /dev/something 10G"?
You need the host_device protocol there, not the file protocol. When we
had raw == file this completely failed. And it's definitely reasonable
to expect that it works because the image format _is_ raw, it's just not
saved in a file.

Or the famous qcow2 images on block devices. Why did qemu guess the
format correctly when qcow2 was saved in a file, but not on a host
device? This was just inconsistent.

I've had more than one bug report about things like this which are
magically fixed when you do the layering right.

Kevin

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] Re: RFC v2: blockdev_add & friends, brief rationale, QMP docs
  2010-06-17  8:20             ` Kevin Wolf
@ 2010-06-17 13:01               ` Anthony Liguori
  2010-06-17 14:15                 ` Kevin Wolf
  2010-06-18  7:06                 ` Markus Armbruster
  0 siblings, 2 replies; 25+ messages in thread
From: Anthony Liguori @ 2010-06-17 13:01 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Christoph Hellwig, qemu-devel, Markus Armbruster,
	Luiz Capitulino, Avi Kivity, Gerd Hoffmann

On 06/17/2010 03:20 AM, Kevin Wolf wrote:
> Am 16.06.2010 20:07, schrieb Anthony Liguori:
>    
>>>    But it requires that
>>> everything that -blockdev provides is accessible with -drive, too (or
>>> that we're okay with users hating us).
>>>
>>>        
>> I'm happy for -drive to die.  I think we should support -hda and
>> -blockdev.
>>      
> -hda is not sufficient for most users. It doesn't provide any options.
> It doesn't even support virtio. If -drive is going to die (and we seem
> to agree all on that), then -blockdev needs to be usable for users (and
> it's only you who contradicts so far).
>    

I've always thought we should have a -vda argument and an -sda argument 
specifically for specifying virtio and scsi disks.

>> -blockdev should be optimized for config files, not single
>> argument input.  IOW:
>>
>> [blockdev "blk2"]
>>    format = "raw"
>>    file = "/path/to/base.img"
>>    cache = "writeback"
>>
>> [blockdev "blk1"]
>>     format = "qcow2"
>>     file = "/path/to/leaf.img"
>>     cache="off"
>>     backing_dev = "blk2"
>>
>> [device "disk1"]
>>     driver = "ide-drive"
>>     blockdev = "blk1"
>>     bus = "0"
>>     unit = "0"
>>      
> You don't specify the backing file of an image on the command line (or
> in the configuration file).

But we really ought to allow it.  Backing files are implemented as part 
of the core block layer, not the actual block formats.  Today the block 
layer queries the block format for the name of the backing file but gets 
no additional options from the block format.  File isn't necessarily 
enough information to successfully open the backing device so why treat 
it specially?

I think we should keep the current ability to query the block format for 
a backing file name but we should also support hooking up the backing 
device without querying the block format at all.  It makes the model 
much more elegant IMHO because then we're just creating block devices 
and hooking them up.  All block devices are created equal more or less.

>   It's saved as part of the image. It's more
> like this (for a simple raw image file):
>
> [blockdev-protocol "proto1"]
>     protocol = "file"
>     file = "/path/to/image.img"
>
> [blockdev "blk1"]
>     format = "raw"
>     cache="off"
>     protocol = "proto1"
>
> [device "disk1"]
>     driver = "ide-drive"
>     blockdev = "blk1"
>     bus = "0"
>     unit = "0"
>
> (This would be Markus' option 3, I think)
>    

I don't understand why we need two layers of abstraction here.  Why not 
just:

[blockdev "proto1"]
   protocol = "file"
   cache = "off"
   file = "/path/to/image.img"

Why does the cache option belong with raw and not with file and why 
can't we just use file directly?As Christoph mentions, we really don't 
have stacked protocols and I'm

>> not sure they make sense.
>>      
> Right, if we go for Christoph's suggestion, we don't need stacked
> protocols. We'll have stacked formats instead. I'm not sure if you like
> this any better. ;-)
>
> We do have stacking today. -hda blkdebug:test.cfg:foo.qcow2 is qcow2 on
> blkdebug on file. We need to be able to represent this.
>    

I think we need to do stacking in a device specific way.  When you look 
at something like vmdk, it should actually support multiple leafs since 
the format does support such a thing.  So what I'd suggest is:

[blockdev "part1"]
   format = "raw"
   file = "image000.vmdk"

[blockdev "part2"]
   format = "raw"
   file = "image001.vmdk"

[blockdev "image"]
   format = "vmdk"
   section0 = "part1"
   section1 = "part2"

Note, we'll need to support this sort of model in order to support a 
disk that creates an automatic partition table (which would be a pretty 
useful feature).  For blkdebug, it would look like:

[blockdev "disk"]
   format = "qcow2"
   file = "foo.qcow2"

[blockdev "debug"]
   format = "blkdebug"
   blockdev = "disk"

>> I think raw doesn't make very much sense then.  What's the point of it
>> if it's just a thin wrapper around a protocol?
>>      
> That it can be wrapped around any protocol. It's just about separating
> code for handling the content of an image and code for accessing the image.
>
> Ever tried something like "qemu-img create -f raw /dev/something 10G"?
> You need the host_device protocol there, not the file protocol. When we
> had raw == file this completely failed. And it's definitely reasonable
> to expect that it works because the image format _is_ raw, it's just not
> saved in a file.
>    

No, I don't actually thing it's reasonable.  There's nothing meaningful 
that command can do.  Also, I've never understand creating qcow2 images 
on a physical device.  qcow2 needs to grow dynamically and physical 
devices can't.

I understand that we need to support the later use case but I don't 
think creating this layer of user-visible abstraction is the right thing 
to do.  This is an obscure use case and it shouldn't be the model that 
we force upon our users.

> Or the famous qcow2 images on block devices. Why did qemu guess the
> format correctly when qcow2 was saved in a file, but not on a host
> device? This was just inconsistent.
>
> I've had more than one bug report about things like this which are
> magically fixed when you do the layering right.
>    

Beyond qcow2 on physical devices, when does this issue actually come up?

Regards,

Anthony Liguori

> Kevin
>    

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] Re: RFC v2: blockdev_add & friends, brief rationale, QMP docs
  2010-06-17 13:01               ` Anthony Liguori
@ 2010-06-17 14:15                 ` Kevin Wolf
  2010-06-18  8:20                   ` Markus Armbruster
  2010-06-18  7:06                 ` Markus Armbruster
  1 sibling, 1 reply; 25+ messages in thread
From: Kevin Wolf @ 2010-06-17 14:15 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Christoph Hellwig, qemu-devel, Markus Armbruster,
	Luiz Capitulino, Avi Kivity, Gerd Hoffmann

Am 17.06.2010 15:01, schrieb Anthony Liguori:
> On 06/17/2010 03:20 AM, Kevin Wolf wrote:
>> Am 16.06.2010 20:07, schrieb Anthony Liguori:
>>    
>>>>    But it requires that
>>>> everything that -blockdev provides is accessible with -drive, too (or
>>>> that we're okay with users hating us).
>>>>
>>>>        
>>> I'm happy for -drive to die.  I think we should support -hda and
>>> -blockdev.
>>>      
>> -hda is not sufficient for most users. It doesn't provide any options.
>> It doesn't even support virtio. If -drive is going to die (and we seem
>> to agree all on that), then -blockdev needs to be usable for users (and
>> it's only you who contradicts so far).
>>    
> 
> I've always thought we should have a -vda argument and an -sda argument 
> specifically for specifying virtio and scsi disks.

It would at least fix the most obvious problem. However, it still
doesn't allow passing options.

>>> -blockdev should be optimized for config files, not single
>>> argument input.  IOW:
>>>
>>> [blockdev "blk2"]
>>>    format = "raw"
>>>    file = "/path/to/base.img"
>>>    cache = "writeback"
>>>
>>> [blockdev "blk1"]
>>>     format = "qcow2"
>>>     file = "/path/to/leaf.img"
>>>     cache="off"
>>>     backing_dev = "blk2"
>>>
>>> [device "disk1"]
>>>     driver = "ide-drive"
>>>     blockdev = "blk1"
>>>     bus = "0"
>>>     unit = "0"
>>>      
>> You don't specify the backing file of an image on the command line (or
>> in the configuration file).
> 
> But we really ought to allow it.  Backing files are implemented as part 
> of the core block layer, not the actual block formats.  

The generic block layer knows the name of the backing file, so it can be
displayed in tools, but that's about it. Calling this the
"implementation" of backing files is daring.

I see no use case for specifying it on the command line. The only thing
you can achieve better with it is corrupting your image because you
specify the wrong/no backing file next time.

> Today the block 
> layer queries the block format for the name of the backing file but gets 
> no additional options from the block format.  File isn't necessarily 
> enough information to successfully open the backing device so why treat 
> it specially?
> 
> I think we should keep the current ability to query the block format for 
> a backing file name but we should also support hooking up the backing 
> device without querying the block format at all.  It makes the model 
> much more elegant IMHO because then we're just creating block devices 
> and hooking them up.  All block devices are created equal more or less.
> 
>>   It's saved as part of the image. It's more
>> like this (for a simple raw image file):
>>
>> [blockdev-protocol "proto1"]
>>     protocol = "file"
>>     file = "/path/to/image.img"
>>
>> [blockdev "blk1"]
>>     format = "raw"
>>     cache="off"
>>     protocol = "proto1"
>>
>> [device "disk1"]
>>     driver = "ide-drive"
>>     blockdev = "blk1"
>>     bus = "0"
>>     unit = "0"
>>
>> (This would be Markus' option 3, I think)
>>    
> 
> I don't understand why we need two layers of abstraction here.  Why not 
> just:
> 
> [blockdev "proto1"]
>    protocol = "file"
>    cache = "off"
>    file = "/path/to/image.img"
> 
> Why does the cache option belong with raw and not with file and why 
> can't we just use file directly?

The cache option is shared along the chain, so it probably fits best in
the blockdev.

And we don't use file directly because it's wrong. Users say that their
image is in raw format, and they don't get why they should have to make
a difference between a raw image stored on a block device and one stored
in a file.

> As Christoph mentions, we really don't 
> have stacked protocols and I'm

The only question is if we call them stacked formats or stacked
protocols. One of them exists.

>>> not sure they make sense.
>>>      
>> Right, if we go for Christoph's suggestion, we don't need stacked
>> protocols. We'll have stacked formats instead. I'm not sure if you like
>> this any better. ;-)
>>
>> We do have stacking today. -hda blkdebug:test.cfg:foo.qcow2 is qcow2 on
>> blkdebug on file. We need to be able to represent this.
>>    
> 
> I think we need to do stacking in a device specific way.  When you look 
> at something like vmdk, it should actually support multiple leafs since 
> the format does support such a thing.  So what I'd suggest is:
> 
> [blockdev "part1"]
>    format = "raw"
>    file = "image000.vmdk"
> 
> [blockdev "part2"]
>    format = "raw"
>    file = "image001.vmdk"
> 
> [blockdev "image"]
>    format = "vmdk"
>    section0 = "part1"
>    section1 = "part2"

Actually, I'd prefer to read that information from the VMDK file instead
of requiring the user to configure this manually...

> Note, we'll need to support this sort of model in order to support a 
> disk that creates an automatic partition table (which would be a pretty 
> useful feature). 

Sounds like a good example of a useful protocol.

Markus, I'm afraid we've found an equivalent to Avi's mirror. If not
even more complicated, because we'd need to accept any length for the
list of partitions - possibly an option that should take an array?

> For blkdebug, it would look like:
> 
> [blockdev "disk"]
>    format = "qcow2"
>    file = "foo.qcow2"
> 
> [blockdev "debug"]
>    format = "blkdebug"
>    blockdev = "disk"

Something's wrong here. You end up with a chain qcow2 -> blkdebug ->
file, where file needs the file name. So format = "qcow2" and file =
"foo.qcow" can't belong to the same section.

I'm not sure how you'd build this from your description and which one
should be the blockdev that is attached to a guest device (this is what
is made much clearer if you distinguish between blockdev and
blockdev-protocol)

>>> I think raw doesn't make very much sense then.  What's the point of it
>>> if it's just a thin wrapper around a protocol?
>>>      
>> That it can be wrapped around any protocol. It's just about separating
>> code for handling the content of an image and code for accessing the image.
>>
>> Ever tried something like "qemu-img create -f raw /dev/something 10G"?
>> You need the host_device protocol there, not the file protocol. When we
>> had raw == file this completely failed. And it's definitely reasonable
>> to expect that it works because the image format _is_ raw, it's just not
>> saved in a file.
> 
> No, I don't actually thing it's reasonable.  There's nothing meaningful 
> that command can do. 

Except not adding special cases.

And I can only repeat it once more: Users don't understand why they need
to tell qemu that they are working on a block device. qemu can know this
very well by itself.

> Also, I've never understand creating qcow2 images 
> on a physical device.  qcow2 needs to grow dynamically and physical 
> devices can't.

We can discuss this over and over again, but it's used and we can't
ignore that.

> I understand that we need to support the later use case but I don't 
> think creating this layer of user-visible abstraction is the right thing 
> to do.  This is an obscure use case and it shouldn't be the model that 
> we force upon our users.

No, you're trying to force your model upon our users, which they have
made very clear in bug reports that they don't understand. The whole
point of the abstraction is that the internal code organization isn't
visible to the user any more (and of course that it makes the code nicer
as a consequence).

On the other hand, what does having raw == file buy us? Have you heard
of any problems that were caused by the change? (Other than the block
driver whitelist which needed to be changed in the configure call if you
were using it)

Kevin

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] Re: RFC v2: blockdev_add & friends, brief rationale, QMP docs
  2010-06-17 13:01               ` Anthony Liguori
  2010-06-17 14:15                 ` Kevin Wolf
@ 2010-06-18  7:06                 ` Markus Armbruster
  1 sibling, 0 replies; 25+ messages in thread
From: Markus Armbruster @ 2010-06-18  7:06 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Kevin Wolf, Christoph Hellwig, qemu-devel, Luiz Capitulino,
	Avi Kivity, Gerd Hoffmann

Anthony Liguori <anthony@codemonkey.ws> writes:

> On 06/17/2010 03:20 AM, Kevin Wolf wrote:
>> Am 16.06.2010 20:07, schrieb Anthony Liguori:
>>    
>>>>    But it requires that
>>>> everything that -blockdev provides is accessible with -drive, too (or
>>>> that we're okay with users hating us).
>>>>
>>>>        
>>> I'm happy for -drive to die.  I think we should support -hda and
>>> -blockdev.
>>>      
>> -hda is not sufficient for most users. It doesn't provide any options.
>> It doesn't even support virtio. If -drive is going to die (and we seem
>> to agree all on that), then -blockdev needs to be usable for users (and
>> it's only you who contradicts so far).
>>    
>
> I've always thought we should have a -vda argument and an -sda
> argument specifically for specifying virtio and scsi disks.

"-hda F" is a cognitive dead end: if you need more control, knowing -hda
doesn't give you a better idea where to look for it than not knowing it.

This is different with "-drive file=F".  It can grow with the user's
needs.

-blockdev's primary mission is clean host / guest separation, and clean
interface to block devices for QMP and for config files.

If it turns out to be too verbose and complex for everyday command line
use, we shouldn't take -hda & similar as excuse for leaving it there.
We still need a usable option that can grow with the user's needs.  And
it better be able to grow to the point where you have full control over
the device model, i.e. you use -device.

[...]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] Re: RFC v2: blockdev_add & friends, brief rationale, QMP docs
  2010-06-17 14:15                 ` Kevin Wolf
@ 2010-06-18  8:20                   ` Markus Armbruster
  2010-06-18  9:36                     ` Kevin Wolf
  0 siblings, 1 reply; 25+ messages in thread
From: Markus Armbruster @ 2010-06-18  8:20 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Christoph Hellwig, qemu-devel, Luiz Capitulino, Avi Kivity,
	Gerd Hoffmann

Kevin Wolf <kwolf@redhat.com> writes:

> Am 17.06.2010 15:01, schrieb Anthony Liguori:
>> On 06/17/2010 03:20 AM, Kevin Wolf wrote:
>>> Am 16.06.2010 20:07, schrieb Anthony Liguori:
>>>    
>>>>>    But it requires that
>>>>> everything that -blockdev provides is accessible with -drive, too (or
>>>>> that we're okay with users hating us).
>>>>>
>>>>>        
>>>> I'm happy for -drive to die.  I think we should support -hda and
>>>> -blockdev.
>>>>      
>>> -hda is not sufficient for most users. It doesn't provide any options.
>>> It doesn't even support virtio. If -drive is going to die (and we seem
>>> to agree all on that), then -blockdev needs to be usable for users (and
>>> it's only you who contradicts so far).
>>>    
>> 
>> I've always thought we should have a -vda argument and an -sda argument 
>> specifically for specifying virtio and scsi disks.
>
> It would at least fix the most obvious problem. However, it still
> doesn't allow passing options.
>
>>>> -blockdev should be optimized for config files, not single
>>>> argument input.  IOW:
>>>>
>>>> [blockdev "blk2"]
>>>>    format = "raw"
>>>>    file = "/path/to/base.img"
>>>>    cache = "writeback"
>>>>
>>>> [blockdev "blk1"]
>>>>     format = "qcow2"
>>>>     file = "/path/to/leaf.img"
>>>>     cache="off"
>>>>     backing_dev = "blk2"
>>>>
>>>> [device "disk1"]
>>>>     driver = "ide-drive"
>>>>     blockdev = "blk1"
>>>>     bus = "0"
>>>>     unit = "0"
>>>>      
>>> You don't specify the backing file of an image on the command line (or
>>> in the configuration file).
>> 
>> But we really ought to allow it.  Backing files are implemented as part 
>> of the core block layer, not the actual block formats.  
>
> The generic block layer knows the name of the backing file, so it can be
> displayed in tools, but that's about it. Calling this the
> "implementation" of backing files is daring.

You're both partly wrong :)

COW format drivers retrieve the "name" of the backing file from the
image on open, and store it in a well-known place in their
BlockDriverState.  qcow2 also retrieves the format.

Then generic block code opens the backing "file", and stores a pointer
to the backing BlockDriverState in the COW format's BlockDriverState.

The COW format driver then uses the backing "file" BlockDriverState
however it sees fit.

Generic code takes care of closing it, and also has a hand in commit and
encryption.

Aside: I put "name" and "file" in quotes, because the name is really an
encoding of protocol + arguments.  It works just like -drive with
file=NAME (and format=FORMAT if qcow2).  But for the COW format driver,
it's just a string (or two) it uses to get access to a backing image.
It's fine with any format + protocol combination, so it leaves that to
generic code.

Aside^2: We can define the semantics of QCOW2 however we please.  But is
it really sane to interpret a backing file name "fat:duck" we find in
some VMDK image as "vvfat over directory duck"?

> I see no use case for specifying it on the command line. The only thing
> you can achieve better with it is corrupting your image because you
> specify the wrong/no backing file next time.

Anthony has a point on the conceptual level: COW is a stacking of
BlockDriverStates.

But Kevin is right on usability.  We could permit overriding of backing
file on the command line / config file / QMP, but it would be a
dangerous expert feature, and no replacement for storing references to
backing files in the COW images.

>> Today the block 
>> layer queries the block format for the name of the backing file but gets 
>> no additional options from the block format.  File isn't necessarily 
>> enough information to successfully open the backing device so why treat 
>> it specially?

Parameters for opening a block device are flags, format, filename (which
is really an encoding of protocol + arguments).  So the only piece of
information that's missing for backing files is flags.

Flags encode -drive's parameters snapshot, cache, aio, readonly.  For
backing files, we force readonly on and snapshot off.  We inherit cache
and aio settings from the backed file.  Which of these do you want to
control separately?

>> I think we should keep the current ability to query the block format for 
>> a backing file name but we should also support hooking up the backing 
>> device without querying the block format at all.  It makes the model 
>> much more elegant IMHO because then we're just creating block devices 
>> and hooking them up.  All block devices are created equal more or less.
>> 
>>>   It's saved as part of the image. It's more
>>> like this (for a simple raw image file):
>>>
>>> [blockdev-protocol "proto1"]
>>>     protocol = "file"
>>>     file = "/path/to/image.img"
>>>
>>> [blockdev "blk1"]
>>>     format = "raw"
>>>     cache="off"
>>>     protocol = "proto1"
>>>
>>> [device "disk1"]
>>>     driver = "ide-drive"
>>>     blockdev = "blk1"
>>>     bus = "0"
>>>     unit = "0"
>>>
>>> (This would be Markus' option 3, I think)
>>>    
>> 
>> I don't understand why we need two layers of abstraction here.  Why not 
>> just:
>> 
>> [blockdev "proto1"]
>>    protocol = "file"
>>    cache = "off"
>>    file = "/path/to/image.img"
>> 
>> Why does the cache option belong with raw and not with file and why 
>> can't we just use file directly?
>
> The cache option is shared along the chain, so it probably fits best in
> the blockdev.
>
> And we don't use file directly because it's wrong. Users say that their
> image is in raw format, and they don't get why they should have to make
> a difference between a raw image stored on a block device and one stored
> in a file.
>
>> As Christoph mentions, we really don't 
>> have stacked protocols and I'm

"missing the end of my sentence"?

What we *really* don't have is a common understanding of "format" and
"protocol"!

> The only question is if we call them stacked formats or stacked
> protocols. One of them exists.

It does internally.  But until we develop a stable understanding of
"format", "protocol" and how they stack, we better avoid exposing the
stacking at external interfaces.

>>>> not sure they make sense.
>>>>      
>>> Right, if we go for Christoph's suggestion, we don't need stacked
>>> protocols. We'll have stacked formats instead. I'm not sure if you like
>>> this any better. ;-)
>>>
>>> We do have stacking today. -hda blkdebug:test.cfg:foo.qcow2 is qcow2 on
>>> blkdebug on file. We need to be able to represent this.
>>>    
>> 
>> I think we need to do stacking in a device specific way.  When you look 
>> at something like vmdk, it should actually support multiple leafs since 
>> the format does support such a thing.  So what I'd suggest is:
>> 
>> [blockdev "part1"]
>>    format = "raw"
>>    file = "image000.vmdk"
>> 
>> [blockdev "part2"]
>>    format = "raw"
>>    file = "image001.vmdk"
>> 
>> [blockdev "image"]
>>    format = "vmdk"
>>    section0 = "part1"
>>    section1 = "part2"
>
> Actually, I'd prefer to read that information from the VMDK file instead
> of requiring the user to configure this manually...
>
>> Note, we'll need to support this sort of model in order to support a 
>> disk that creates an automatic partition table (which would be a pretty 
>> useful feature). 
>
> Sounds like a good example of a useful protocol.
>
> Markus, I'm afraid we've found an equivalent to Avi's mirror. If not
> even more complicated, because we'd need to accept any length for the
> list of partitions - possibly an option that should take an array?

I wouldn't say it's more complicated than Avi's mirror.  The
zero-one-infinity rule applies again: zero inferior protocols is
simplest (no stacking), then up to one inferior protocol (protocol
stack), then any number above one (protocol tree).

[...]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] Re: RFC v2: blockdev_add & friends, brief rationale, QMP docs
  2010-06-18  8:20                   ` Markus Armbruster
@ 2010-06-18  9:36                     ` Kevin Wolf
  0 siblings, 0 replies; 25+ messages in thread
From: Kevin Wolf @ 2010-06-18  9:36 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Christoph Hellwig, qemu-devel, Luiz Capitulino, Avi Kivity,
	Gerd Hoffmann

Am 18.06.2010 10:20, schrieb Markus Armbruster:
>>>>> -blockdev should be optimized for config files, not single
>>>>> argument input.  IOW:
>>>>>
>>>>> [blockdev "blk2"]
>>>>>    format = "raw"
>>>>>    file = "/path/to/base.img"
>>>>>    cache = "writeback"
>>>>>
>>>>> [blockdev "blk1"]
>>>>>     format = "qcow2"
>>>>>     file = "/path/to/leaf.img"
>>>>>     cache="off"
>>>>>     backing_dev = "blk2"
>>>>>
>>>>> [device "disk1"]
>>>>>     driver = "ide-drive"
>>>>>     blockdev = "blk1"
>>>>>     bus = "0"
>>>>>     unit = "0"
>>>>>      
>>>> You don't specify the backing file of an image on the command line (or
>>>> in the configuration file).
>>>
>>> But we really ought to allow it.  Backing files are implemented as part 
>>> of the core block layer, not the actual block formats.  
>>
>> The generic block layer knows the name of the backing file, so it can be
>> displayed in tools, but that's about it. Calling this the
>> "implementation" of backing files is daring.
> 
> You're both partly wrong :)

Yes, it was not entirely correct. What I was thinking of (and it is
probably what I should have written) is that the real COW implementation
is part of the driver and not generic infrastructure. If the format
driver doesn't do COW, there's no use in specifying a backing file to
block.c.

> COW format drivers retrieve the "name" of the backing file from the
> image on open, and store it in a well-known place in their
> BlockDriverState.  qcow2 also retrieves the format.
> 
> Then generic block code opens the backing "file", and stores a pointer
> to the backing BlockDriverState in the COW format's BlockDriverState.
> 
> The COW format driver then uses the backing "file" BlockDriverState
> however it sees fit.
> 
> Generic code takes care of closing it, and also has a hand in commit and
> encryption.
> 
> Aside: I put "name" and "file" in quotes, because the name is really an
> encoding of protocol + arguments.  It works just like -drive with
> file=NAME (and format=FORMAT if qcow2).  But for the COW format driver,
> it's just a string (or two) it uses to get access to a backing image.
> It's fine with any format + protocol combination, so it leaves that to
> generic code.
> 
> Aside^2: We can define the semantics of QCOW2 however we please.  But is
> it really sane to interpret a backing file name "fat:duck" we find in
> some VMDK image as "vvfat over directory duck"?

Good point, though I don't think it's a real problem. But it's true that
we use VMDK in a very qemu-specific way. We can read VMDKs that VMware
can't read (with a qcow2 backing file, for example), but in the same way
it may happen that we interpret a file name wrong if it comes from
VMware rather than qemu. Disabling protocol parsing here would disable
additional features that were possible with qemu - but to be honest, I
don't think that either option is a real problem.

>> I see no use case for specifying it on the command line. The only thing
>> you can achieve better with it is corrupting your image because you
>> specify the wrong/no backing file next time.
> 
> Anthony has a point on the conceptual level: COW is a stacking of
> BlockDriverStates.

It is. But it's not the same stacking as the format/protocol one, it's a
second dimension. I doubt that it makes sense to let the user specify
this stacking on the command line.

> But Kevin is right on usability.  We could permit overriding of backing
> file on the command line / config file / QMP, but it would be a
> dangerous expert feature, and no replacement for storing references to
> backing files in the COW images.

I even refuse to call it a feature as long as you can't tell me a use
case. I think it's useless - there are cases that are already possible
with having the backing file name stored in the image file and you just
duplicate it on the command line; and the other cases don't even work.

>>> Today the block 
>>> layer queries the block format for the name of the backing file but gets 
>>> no additional options from the block format.  File isn't necessarily 
>>> enough information to successfully open the backing device so why treat 
>>> it specially?
> 
> Parameters for opening a block device are flags, format, filename (which
> is really an encoding of protocol + arguments).  So the only piece of
> information that's missing for backing files is flags.
> 
> Flags encode -drive's parameters snapshot, cache, aio, readonly.  For
> backing files, we force readonly on and snapshot off.  We inherit cache
> and aio settings from the backed file.  Which of these do you want to
> control separately?
> 
>>> I think we should keep the current ability to query the block format for 
>>> a backing file name but we should also support hooking up the backing 
>>> device without querying the block format at all.  It makes the model 
>>> much more elegant IMHO because then we're just creating block devices 
>>> and hooking them up.  All block devices are created equal more or less.
>>>
>>>>   It's saved as part of the image. It's more
>>>> like this (for a simple raw image file):
>>>>
>>>> [blockdev-protocol "proto1"]
>>>>     protocol = "file"
>>>>     file = "/path/to/image.img"
>>>>
>>>> [blockdev "blk1"]
>>>>     format = "raw"
>>>>     cache="off"
>>>>     protocol = "proto1"
>>>>
>>>> [device "disk1"]
>>>>     driver = "ide-drive"
>>>>     blockdev = "blk1"
>>>>     bus = "0"
>>>>     unit = "0"
>>>>
>>>> (This would be Markus' option 3, I think)
>>>>    
>>>
>>> I don't understand why we need two layers of abstraction here.  Why not 
>>> just:
>>>
>>> [blockdev "proto1"]
>>>    protocol = "file"
>>>    cache = "off"
>>>    file = "/path/to/image.img"
>>>
>>> Why does the cache option belong with raw and not with file and why 
>>> can't we just use file directly?
>>
>> The cache option is shared along the chain, so it probably fits best in
>> the blockdev.
>>
>> And we don't use file directly because it's wrong. Users say that their
>> image is in raw format, and they don't get why they should have to make
>> a difference between a raw image stored on a block device and one stored
>> in a file.
>>
>>> As Christoph mentions, we really don't 
>>> have stacked protocols and I'm
> 
> "missing the end of my sentence"?
> 
> What we *really* don't have is a common understanding of "format" and
> "protocol"!
> 
>> The only question is if we call them stacked formats or stacked
>> protocols. One of them exists.
> 
> It does internally.  But until we develop a stable understanding of
> "format", "protocol" and how they stack, we better avoid exposing the
> stacking at external interfaces.

Right, we better develop a stable understand first. But that doesn't
mean that the problem is solved yet just because it's deferred.

>>>>> not sure they make sense.
>>>>>      
>>>> Right, if we go for Christoph's suggestion, we don't need stacked
>>>> protocols. We'll have stacked formats instead. I'm not sure if you like
>>>> this any better. ;-)
>>>>
>>>> We do have stacking today. -hda blkdebug:test.cfg:foo.qcow2 is qcow2 on
>>>> blkdebug on file. We need to be able to represent this.
>>>>    
>>>
>>> I think we need to do stacking in a device specific way.  When you look 
>>> at something like vmdk, it should actually support multiple leafs since 
>>> the format does support such a thing.  So what I'd suggest is:
>>>
>>> [blockdev "part1"]
>>>    format = "raw"
>>>    file = "image000.vmdk"
>>>
>>> [blockdev "part2"]
>>>    format = "raw"
>>>    file = "image001.vmdk"
>>>
>>> [blockdev "image"]
>>>    format = "vmdk"
>>>    section0 = "part1"
>>>    section1 = "part2"
>>
>> Actually, I'd prefer to read that information from the VMDK file instead
>> of requiring the user to configure this manually...
>>
>>> Note, we'll need to support this sort of model in order to support a 
>>> disk that creates an automatic partition table (which would be a pretty 
>>> useful feature). 
>>
>> Sounds like a good example of a useful protocol.
>>
>> Markus, I'm afraid we've found an equivalent to Avi's mirror. If not
>> even more complicated, because we'd need to accept any length for the
>> list of partitions - possibly an option that should take an array?
> 
> I wouldn't say it's more complicated than Avi's mirror.  The
> zero-one-infinity rule applies again: zero inferior protocols is
> simplest (no stacking), then up to one inferior protocol (protocol
> stack), then any number above one (protocol tree).

Anyway, it's more than purely theoretical, it's an example that could be
useful in practice. So when we decide the final structure (after having
developed a clear idea about formats and protocols) we should definitely
consider this.

Kevin

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2010-06-18  9:37 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-06-10 17:45 [Qemu-devel] RFC v2: blockdev_add & friends, brief rationale, QMP docs Markus Armbruster
2010-06-15  9:04 ` Avi Kivity
2010-06-15 12:23   ` Markus Armbruster
2010-06-15 12:43     ` Avi Kivity
2010-06-15 13:27       ` Markus Armbruster
2010-06-15 13:40         ` Avi Kivity
2010-06-15 14:54           ` Markus Armbruster
2010-06-16  9:50             ` Avi Kivity
2010-06-16 11:02               ` Markus Armbruster
2010-06-16 11:06                 ` Avi Kivity
2010-06-15 13:44 ` [Qemu-devel] " Avi Kivity
2010-06-15 14:39   ` Gerd Hoffmann
2010-06-16 11:20   ` Kevin Wolf
2010-06-16 12:41     ` Markus Armbruster
2010-06-16 13:41       ` Anthony Liguori
2010-06-16 13:57         ` Kevin Wolf
2010-06-16 14:24           ` Christoph Hellwig
2010-06-16 14:47             ` Markus Armbruster
2010-06-16 18:07           ` Anthony Liguori
2010-06-17  8:20             ` Kevin Wolf
2010-06-17 13:01               ` Anthony Liguori
2010-06-17 14:15                 ` Kevin Wolf
2010-06-18  8:20                   ` Markus Armbruster
2010-06-18  9:36                     ` Kevin Wolf
2010-06-18  7:06                 ` Markus Armbruster

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.