From: Eric Blake <eblake@redhat.com>
To: Fam Zheng <famz@redhat.com>, qemu-devel@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>,
	qemu-block@nongnu.org, Markus Armbruster <armbru@redhat.com>,
	mreitz@redhat.com, vsementsov@parallels.com,
	Stefan Hajnoczi <stefanha@redhat.com>,
	jsnow@redhat.com
Subject: Re: [Qemu-devel] [RFC PATCH 01/16] doc: Add QBM format specification
Date: Tue, 26 Jan 2016 10:51:18 -0700
Message-ID: <56A7B216.6070108@redhat.com>
In-Reply-To: <1453804705-7205-2-git-send-email-famz@redhat.com>

On 01/26/2016 03:38 AM, Fam Zheng wrote:
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  docs/specs/qbm.md | 118 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 118 insertions(+)
>  create mode 100644 docs/specs/qbm.md
> 
> diff --git a/docs/specs/qbm.md b/docs/specs/qbm.md
> new file mode 100644
> index 0000000..b91910b
> --- /dev/null
> +++ b/docs/specs/qbm.md
> @@ -0,0 +1,118 @@
> +QEMU Block Bitmap (QBM)
> +=======================

No explicit copyright mention means that this document is GPLv2+ by
default.  I don't know if any 3rd-party implementation trying to use
this spec would object to that, or if a looser license is desirable here
(I don't personally care, but just raising the point).

> +
> +QBM is a multi-file disk format to allow storing persistent block bitmaps along
> +with the tracked data image.  A QBM image includes one json descriptor file,
> +one data image, one or more bitmap files that describe the block dirty status

s/one or more/and one or more/

Must it have block dirty status bitmaps, or can you have a QBM image
with just an allocation bitmap?

> +of the data image.
> +
> +The json file describes the structure of the image. The structure of the json
> +descriptor file is:

Please mention that the file must be valid JSON per RFC 7159.  Probably
also worth requiring that the file is a text file (ends in a newline).
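
To make that concrete, here is a rough Python sketch of what a reader could
check before looking at any fields.  The function name is made up for
illustration, and rejecting NaN/Infinity is my assumption about what "valid
JSON" should mean here (Python's json module accepts them by default, although
the RFC grammar does not):

```python
import json


def _reject_constant(name):
    # NaN, Infinity and -Infinity are accepted by Python's parser by default,
    # but they are not part of the JSON grammar.
    raise ValueError(f"{name} is not valid JSON")


def check_descriptor_text(raw: bytes) -> dict:
    """Parse descriptor bytes and return the value of the top-level "QBM" key.

    Raises ValueError if the bytes are not UTF-8 JSON text ending in a newline,
    or if the top level is not a dictionary with the single key "QBM".
    """
    if not raw.endswith(b"\n"):
        raise ValueError("descriptor must end in a newline")
    # Both decode errors and JSON syntax errors are ValueError subclasses.
    doc = json.loads(raw.decode("utf-8"), parse_constant=_reject_constant)
    if not isinstance(doc, dict) or set(doc) != {"QBM"}:
        raise ValueError('top level must be { "QBM": ... }')
    return doc["QBM"]
```

This only covers well-formedness; the per-field rules would still need their
own checks.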

> +
> +    QBM-JSON-FILE := { "QBM": DESC-JSON }
> +
> +    DESC-JSON := { "version": 1,
> +                   "image": IMAGE,
> +                   "BITMAPS": BITMAPS

s/"BITMAPS"/"bitmaps"/

See my thoughts below on whether this is the ideal top-level structure.

> +                 }
> +
> +Fields in the top level json dictionary are:
> +
> +@version: An integer which must be 1.
> +@image: A dictionary in IMAGE schema, as described later. It provides the
> +        information of the data image where user data is stored. Its format is
> +        documented in the "IMAGE schema" section.
> +@bitmaps: A dictionary that describes one ore more bitmap files. The keys into
> +          the dictionary are the names of bitmap, which must be strings, and
> +          each value is a dictionary describing the information of the bitmap,
> +          as documented below in the "BITMAP schema" section.

Making 'bitmaps' be a dictionary means that we now have keys that might
not be an identifier.  Although this is valid JSON, it may trip up some
tools.  Would it be better to make 'bitmaps' be a list of dictionaries,
where each dictionary has a 'name':'value' member, so that bitmap names
that are not a (C, Python, whatever) identifier are still valid?
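
As a rough illustration of the two shapes (a sketch only; the helper name is
hypothetical and not part of the patch), converting the dictionary form into
the list form is mechanical, and the list form keeps arbitrary names as plain
string values rather than keys:

```python
def bitmaps_dict_to_list(bitmaps: dict) -> list:
    """Turn {"name": {...}, ...} into [{"name": "name", ...}, ...].

    Assumes no bitmap description already carries a "name" member.
    """
    result = []
    for name, desc in bitmaps.items():
        if "name" in desc:
            raise ValueError(f"bitmap {name!r} already has a 'name' member")
        # The user-supplied name becomes an ordinary string value.
        result.append({"name": name, **desc})
    return result
```

The cost of the list form is that uniqueness of names is no longer enforced by
the JSON structure and would have to be checked by the parser.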

> +
> +=== IMAGE schema ===
> +
> +An IMAGE records the information of an image (such as a data image or a backing
> +file). It has following fields:

I liked how you showed DESC-JSON := for the top level; should you do the
same here for IMAGE?

> +
> +@file: The file name string of the referenced image. If it's a relative path,
> +       the file should be found relative to the descriptor file's
> +       location.

Does that mean we'll have to use 'json:...' encoding for representing
network resources?  Should we instead reuse some of the
qapi/block-core.json representation of a block device?

> +@format: The format string of the file.

Nice that format is mandatory.  Do we want to call out a finite list of
supported formats, or leave it open-ended in this spec?

> +
> +=== BITMAP schema ===
> +
> +A BITMAP dictionary records the information of a bitmap (such as a dirty bitmap
> +or a block allocation status bitmap). It has following mandatory fields:
> +
> +@file: The name of the bitmap file. The bitmap file is in little endian, both

s/in //

> +       byte-order-wise and bit-order-wise, which means the LSB in the byte 0
> +       corresponds to the first sectors.

s/the byte 0/the first byte/

Again, should we be reusing something from qapi/block-core.json, to
allow network devices with more structure than just 'json:...' naming?
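
For what it's worth, this is how I read the bit ordering, written as a small
Python sketch.  The function name is made up, and it assumes the whole bitmap
file has been read into memory and that the granularity is the
granularity-bytes value described in the next hunk:

```python
def bit_for_offset(bitmap: bytes, offset: int, granularity: int) -> bool:
    """Return the bitmap bit that tracks the given byte offset of the image.

    Bit 0 (the LSB) of the first byte tracks the first `granularity` bytes of
    the image, bit 1 the next `granularity` bytes, and so on.
    """
    if granularity < 512 or granularity & (granularity - 1):
        raise ValueError("granularity must be a power of 2 and at least 512")
    index = offset // granularity      # which bit tracks this offset
    byte, bit = divmod(index, 8)       # little-endian bit order within a byte
    return bool((bitmap[byte] >> bit) & 1)
```

With a bitmap whose first byte is 0x05 and a granularity of 512, offsets 0 and
1024 are set and offset 512 is clear.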

> +@granularity-bytes: How many bytes of data does one bit in the bitmap track.
> +                    This value must be a power of 2 and no less than 512.
> +@type: The type of the bitmap.  Currently only "dirty" and "allocation" are
> +       supported.
> +       "dirty" indicates a block dirty bitmap; "allocation" indicates a
> +       allocation status bitmap. There must be at most one "allocation" bitmap.
> +
> +If the type of the bitmap is "allocation", an extra field "backing" is also
> +accepted:
> +
> +@backing: a dictionary as specified in the IMAGE schema. It can be used to
> +          adding a backing file to raw image.

s/adding/add/

As promised above, would an alternative representation be any better?
That is, I'm trying to see if I could write qapi to describe the
structure you've presented here, and I fell short when it comes to
naming a particular bitmap.  Also, since there is at most one
'type':'allocation' member of 'bitmaps', I wonder if separating it out
would make it easier to locate.

Here's my proposal for an alternative schema, written in qapi:

{ 'struct': 'Other', 'data': { ...however we describe a network file... } }
{ 'alternate': 'File', 'data': { 'file': 'str', 'struct': 'Other' } }
{ 'struct': 'Bitmap', 'data': {
   'name':'str', 'file': 'File',
   'granularity-bytes':'int' } }
{ 'struct': 'AllocationBitmap', 'base': 'Bitmap', 'data': {
  '*backing': 'Image' } }
{ 'struct': 'Image', 'data': {
  'file': 'File',
  'format': 'str' # would an enum be better?
} }
{ 'struct': 'Desc', 'data': {
  'version': 'int', 'image': 'Image',
  '*allocation': 'AllocationBitmap',
  'bitmaps': [ 'Bitmap' ]
} }
{ 'struct': 'QBM', 'data': { 'QBM': 'Desc' } }

where the json description file must consist of a single 'QBM' qapi
struct, and where the use of a QAPI alternate type 'File' allows us to
specify either a file name or a formal structure for describing a
network resource.  Below, I'll rewrite your example with my schema...

> +
> +
> +=== Extended fields ===
> +
> +Implementations are allowed to extend the format schema by inserting additinoal

s/additinoal/additional/

> +members into above dictionaries, with key names that starts with either
> +an "ext-hard-" or an "ext-soft-" prefix.

Should we be more like qcow2 and have a third category of auto-clear
keys (if you don't recognize the key, remove it upon editing the file,
but reading is okay)?  Feature negotiation via this approach requires
reading every member of 'bitmaps' (well, we have to do that anyway to
parse the full JSON structure); would it be any better to have an
up-front section in the top level that describes what features are in
use, rather than requiring all new features to use the 'ext-' namespace?

Should we require hard failure on any key whose name is not recognized
(other than the weirdness of your proposal having the keys of 'bitmaps'
be user-supplied names)?

> +
> +Extended fields prefixed with "ext-soft-" are optional and can be ignored by
> +parsers if they do not support it; fields starting with "ext-hard-" are
> +mandatory and cannot be ignored, a parser should not proceed parsing the image
> +if it does not support it.

Is the entire image really invalidated if an unsupported extension is
tied to one particular bitmap, or only that bitmap?
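
To pin down what I think the text is asking of a parser, here is a sketch of
the per-dictionary key check in Python.  The function and parameter names are
hypothetical, and the hard failure on any other unrecognized key is my
suggestion from above, not something the patch states:

```python
def check_keys(members: dict, known: set, supported_ext: set) -> list:
    """Classify the keys of one dictionary from the descriptor.

    Returns the list of "ext-soft-" keys that were ignored.  Raises ValueError
    on an unsupported "ext-hard-" key, and (as an assumption) on any other key
    that is not recognized.
    """
    ignored = []
    for key in members:
        if key in known or key in supported_ext:
            continue
        if key.startswith("ext-soft-"):
            ignored.append(key)    # optional: safe to skip when reading
        elif key.startswith("ext-hard-"):
            raise ValueError(f"unsupported mandatory extension: {key}")
        else:
            raise ValueError(f"unrecognized key: {key}")
    return ignored
```

Whether the ValueError for a hard extension should reject the whole image or
only the dictionary it appears in is exactly the open question.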

> +
> +It is strongly recommended that the application names are also included in the
> +extention name string, such as "ext-hard-qemu-", if the effect or

s/extention/extension/

> +interpretation of the field is local to a specific application.
> +
> +For example, QEMU can implement a "checksum" feature to make sure no files
> +referred to by the json descriptor are modified inconsistently, by adding
> +"ext-soft-qemu-checksum" fields in "image" and "bitmaps" descriptions, like in
> +the json text found below.

If an extension proves to be useful, how do we standardize it later?
Will it always have to carry the 'ext-' prefix?

You said that soft extensions can be ignored on parse - but if we write
to the file, couldn't we possibly be invalidating the contents of the
extension field, and not leaving a breadcrumb for the future reader that
understands the extension to know that we messed it up?  I think an
auto-clear feature would be useful (preserve the checksum field, but
clear the associated auto-clear so that the newer reader knows that the
checksum has to be checked).
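
A sketch of what I mean, in Python.  Everything here is hypothetical: the
top-level "autoclear" list and the feature name are invented for illustration
and appear nowhere in the patch:

```python
def prepare_for_write(desc: dict, understood: set) -> dict:
    """Return a copy of the descriptor to store after modifying the image.

    Extension fields are preserved as-is, but every entry of the (invented)
    "autoclear" list that this writer does not understand is dropped, so a
    later reader knows the matching extension data may be stale.
    """
    updated = dict(desc)  # shallow copy; nested members are left untouched
    updated["autoclear"] = [
        feature for feature in desc.get("autoclear", [])
        if feature in understood
    ]
    return updated
```

A reader that understands "qemu-checksum" and finds it missing from the list
would then recompute the checksum instead of trusting the stored one.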

Should we be thinking about a write lock extension (similar to the
current thread on qcow2 write locks), to make it less likely that two
writers will be modifying the descriptor, image file, and/or bitmaps at
the same time?

> +
> +=== QBM descriptor file example ===
> +
> +This is the content of a QBM image's json descriptor file, which contains a
> +data image (data.img), and three bitmaps, out of which the "allocation" bitmap
> +associates a backing file to this image (base.img).
> +
> +{ "QBM": {
> +    "version": 1,
> +    "image": {
> +        "file": "data.img",
> +        "format": "raw"
> +        "ext-soft-qemu-checksum": "9eff24b72bd693cc8aa3e887141b96f8",

Comma on wrong row.

> +    },
> +    "bitmaps": {
> +        "0": {
> +            "file": "bitmap0.bin",
> +            "granularity-bytes": 512,
> +            "type": "dirty"
> +        },
> +        "1": {
> +            "file": "bitmap1.bin",
> +            "granularity-bytes": 4096,
> +            "type": "dirty"
> +        },
> +        "2": {
> +            "file": "bitmap3.bin",
> +            "granularity-bytes": 4096,
> +            "type": "allocation"

Missing comma

> +            "backing": {
> +                "file": "base.img",
> +                "format": "raw"
> +                "ext-soft-qemu-checksum": "fcad1f672b2fb19948405e7a1a18c2a7",

Comma on wrong row

> +            },

No trailing commas in JSON

Your approach means that the same bitmap cannot be used for both 'dirty'
and 'allocation' (because all entries in the 'bitmaps' dictionary must
have distinct names).  (Although nothing stops two bitmap entries from
naming the same 'file'.)

> +        }
> +    }
> +} }

So, rewriting to my schema above, this would be:

{ "QBM": {
    "version": 1,
    "image": {
        "file": "data.img",
        "format": "raw",
        "ext-soft-qemu-checksum": "9eff24b72bd693cc8aa3e887141b96f8"
    },
    "allocation": {
        "name": "2",
        "file": "bitmap3.bin",
        "granularity-bytes": 4096,
        "backing": {
            "file": "base.img",
            "format": "raw",
            "ext-soft-qemu-checksum": "fcad1f672b2fb19948405e7a1a18c2a7"
        }
    },
    "bitmaps": [
        {
            "name": "0",
            "file": "bitmap0.bin",
            "granularity-bytes": 512
        },
        {
            "name": "1",
            "file": "bitmap1.bin",
            "granularity-bytes": 4096
        }
    ]
} }
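
For anyone who wants to play with this layout, below is a rough structural
check in Python, following the key names used in the example above.  It is a
sketch under my own assumptions, not generated from the qapi: the helper names
are invented, "ext-soft-" keys are simply skipped, and a structured 'file' is
accepted as any dictionary since its members are still undefined:

```python
def _check_members(obj, required, optional=()):
    # Require a dictionary with all `required` keys and no unexpected ones.
    if not isinstance(obj, dict):
        raise ValueError("expected a dictionary")
    missing = set(required) - set(obj)
    if missing:
        raise ValueError(f"missing members: {sorted(missing)}")
    for key in obj:
        if key in required or key in optional or key.startswith("ext-soft-"):
            continue
        raise ValueError(f"unexpected member: {key}")


def _check_file(value):
    # Alternate 'File': a plain file name or a structured description.
    if not isinstance(value, (str, dict)):
        raise ValueError("'file' must be a string or a dictionary")


def check_image(image):
    _check_members(image, {"file", "format"})
    _check_file(image["file"])
    if not isinstance(image["format"], str):
        raise ValueError("'format' must be a string")


def check_bitmap(bitmap, allow_backing=False):
    optional = {"backing"} if allow_backing else set()
    _check_members(bitmap, {"name", "file", "granularity-bytes"}, optional)
    _check_file(bitmap["file"])
    if not isinstance(bitmap["name"], str):
        raise ValueError("'name' must be a string")
    gran = bitmap["granularity-bytes"]
    if not isinstance(gran, int) or gran < 512 or gran & (gran - 1):
        raise ValueError("'granularity-bytes' must be a power of 2, >= 512")
    if "backing" in bitmap:
        check_image(bitmap["backing"])


def check_qbm(doc):
    """Validate a parsed descriptor and return the inner description."""
    _check_members(doc, {"QBM"})
    desc = doc["QBM"]
    _check_members(desc, {"version", "image", "bitmaps"}, {"allocation"})
    if desc["version"] != 1:
        raise ValueError("'version' must be 1")
    check_image(desc["image"])
    if "allocation" in desc:
        check_bitmap(desc["allocation"], allow_backing=True)
    if not isinstance(desc["bitmaps"], list):
        raise ValueError("'bitmaps' must be a list")
    names = []
    for bitmap in desc["bitmaps"]:
        check_bitmap(bitmap)
        names.append(bitmap["name"])
    if len(names) != len(set(names)):
        raise ValueError("bitmap names must be unique")
    return desc
```

Feeding it the result of json.loads() on the descriptor text is enough to
exercise it.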

> +
> 

Awkward to end in a blank line.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


