qemu-devel.nongnu.org archive mirror
From: Kevin Wolf <kwolf@redhat.com>
To: Fam Zheng <famz@redhat.com>
Cc: Alberto Garcia <berto@igalia.com>,
	qemu-block@nongnu.org, jsnow@redhat.com,
	Peter Lieven <pl@kamp.de>,
	qemu-devel@nongnu.org, Markus Armbruster <armbru@redhat.com>,
	vsementsov@parallels.com, Stefan Hajnoczi <stefanha@redhat.com>,
	"Denis V. Lunev" <den@openvz.org>,
	pbonzini@redhat.com, mreitz@redhat.com
Subject: Re: [Qemu-devel] [RFC PATCH 00/16] Qemu Bit Map (QBM) - an overlay format for persistent dirty bitmap
Date: Tue, 23 Feb 2016 18:43:30 +0100
Message-ID: <20160223174330.GF8176@noname.redhat.com>
In-Reply-To: <20160223034050.GD26360@ad.usersys.redhat.com>

On 23.02.2016 at 04:40, Fam Zheng wrote:
> (I'm Cc'ing a few more people here just in case they have different visions
> about raw image use cases.)
> 
> On Mon, 02/22 15:24, Kevin Wolf wrote:
> > On 26.01.2016 at 11:38, Fam Zheng wrote:
> > > This series introduces a simple format to enable support of persistence of
> > > block dirty bitmaps. Block dirty bitmap is the tool to achieve incremental
> > > backup, and persistence of block dirty bitmap makes incremental backup possible
> > > across VM shutdowns, where existing in-memory dirty bitmaps cannot survive.
> > > 
> > > When user creates a "persisted" dirty bitmap, the QBM driver will create a
> > > binary file and synchronize it with the existing in-memory block dirty bitmap
> > > (BdrvDirtyBitmap). When the VM is powered down, the binary file has all the
> > > bits saved on disk, which will be loaded and used to initialize the in-memory
> > > block dirty bitmap next time the guest is started.
> > > 
> > > The idea of the format is to reuse as much existing infrastructure as possible
> > > and avoid introducing complex data structures - it works with any image format,
> > > by gluing together plain bitmap files with a JSON descriptor file. The
> > > advantage of this approach over extending existing formats, such as qcow2, is
> > > that the new feature is implemented by an orthogonal driver, in a format
> > > agnostic way. This way, even raw images can have their persistent dirty
> > > bitmaps.  (And you will notice in this series, with a little forging to the
> > > spec, raw images can also have backing files through a QBM overlay!)
> > > 
> > > Rather than superseding it, this is intended to coexist with the
> > > qcow2 bitmap extension that Vladimir is working on.  The block driver interface
> > > changes in this series also try to be generic and compatible for both drivers.
> > 
> > So as I already told Fam last week, before we discuss any technical
> > details here, we first need to discuss whether this is even the right
> > thing to do. Currently I'm doubtful, as this is another attempt to
> > introduce a new native image format in qemu.
> > 
> > Let's recap the image formats and what we tell users about them today:
> > 
> > * qcow2: This is the default choice for disk images. It gives you access
> >   to all of the features in qemu at a good performance. If it doesn't
> >   perform well in your case, we'll fix it.
> > 
> > * raw: Use this when you need absolute performance and don't need any
> >   features from an image format, so you want to get any complexity just
> >   out of the way and pass requests as directly as possible from the
> >   guest device to the host kernel.
> > 
> > * Anything else: Only use them to convert into raw or qcow2.
> > 
> > Now using bitmaps is clearly on the "features" side, which suggests that
> > qcow2 is the format of choice for this. If you want to introduce a new
> > format, you need to justify it with evidence that...
> > 
> > 1. there is a relevant use case that qcow2 doesn't cover
> > 2. qcow2 can't be fixed/enhanced to cover the use case
> > 
> > The one thing that people have claimed in the past that qcow2 can't
> > provide is enough performance. This is where QED tried to come in and
> > promised a compromise between performance (then a bit faster than qcow2)
> > and features (almost none, but supports backing files). We all know that
> > it was a failure because you had to sacrifice features and still the
> > idea that qcow2 couldn't be fixed was wrong, so today we have a QED
> > driver that is much slower than qcow2 despite having fewer features.
> > 
> > Now for QBM. First, let's have a look at the image format that it can be
> > used with. qcow2 doesn't need it if we continue with Vladimir's
> > extension. Other non-raw formats are only supposed to be used for
> > conversion. The only thing that's really left is raw.
> 
> Yes, I agree with this point.
> 
> > Now adding a
> > feature only for raw, as a compromise between features and performance,
> > looks an awful lot like what QED tried. We don't want to go there.
> > 
> > Even if we wanted to support persistent dirty bitmaps with raw images
> > (which has to be discussed based on use cases), it's still questionable
> > whether we need a new image format with JSON descriptor files instead of
> > just raw bitmaps that can be added with a QMP command.
> > 
> 
> I don't think a QMP interface alone is enough. In the persistent backup use
> case, when starting a guest, a command line interface is more appropriate for
> continuing the dirty tracking that was enabled at shutdown.

Yes, I was sloppy. Maybe s/QMP command/runtime option/ gets closer.

> I'd justify in two parts, one is "why" and the other is "how".
> 
> So to answer why: I worked on QBM because I feel it is wrong to leave raw
> behind. Ceph and LVM users use the raw format. You could technically use
> qcow2 with Ceph, but that is discouraged[1] or even refused by OpenStack[2].
> We've seen qcow2 on top of LVs, but that is not the dominant setup.

Ceph is definitely a valid point. I think we agree that qcow2 can't
provide what we need there today.

The question is whether qcow2 can be extended to provide it. As we
discussed last week internally and today on the call, a possible idea
would be to extend qcow2 to act as the filter driver here, where all I/O
is redirected to the backing file and only the bitmaps remain in the
qcow2 layer.

> The scope of "features" for which we tell users they have to use qcow2 should
> be those that are format specific, not "block features" in general.  Backing
> file, internal/external snapshot, thin provisioning, compression and encryption
> are all great examples of format features, whereas things including throttling,
> statistics, migration, mirroring and backing up are IMHO not.  Actually we
> already support snapshotting a raw image, with a qcow2 overlay.  We've even
> implemented non-persistent incremental backup for raw today, through
> drive-backup.  If we decide that qcow2 is the only possible format that can do
> persistent backup, I'm not really a huge fan of that.

Yeah, but that's just a feeling, not a use case.

> Then "how"?
> 
> Actually, I thought we could do it in a way similar to quorum. The way quorum
> driver works is by specifying tediously long options. A snippet from
> qemu-iotests to build a quorum driver with 3 children is like this:
> 
>     quorum="driver=raw,file.driver=quorum,file.vote-threshold=2"
>     quorum="$quorum,file.children.0.file.filename=$TEST_DIR/1.raw"
>     quorum="$quorum,file.children.1.file.filename=$TEST_DIR/2.raw"
>     quorum="$quorum,file.children.2.file.filename=$TEST_DIR/3.raw"
>     quorum="$quorum,file.children.0.driver=raw"
>     quorum="$quorum,file.children.1.driver=raw"
>     quorum="$quorum,file.children.2.driver=raw"
> 
> Though very repetitive, it is also very simple: all children are almost
> symmetrical (identical in user data). The only thing the user or management
> tool has to ensure is that the images contain the same data.

By the way, the repetitiveness would be greatly reduced if the test case
were using the json: pseudo-protocol.
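
As a minimal sketch of that point (not taken from the test suite), the same
three-child quorum could be written as a single json: filename. The loop and
the fallback value for TEST_DIR are assumptions added only so the snippet runs
standalone; the option keys are the ones from the quoted snippet.

```shell
#!/bin/sh
# Assumed fallback so the sketch runs outside qemu-iotests, which sets TEST_DIR.
TEST_DIR="${TEST_DIR:-/tmp}"

# Build the children array once instead of repeating every dotted key.
children=""
for i in 1 2 3; do
    child="{\"driver\":\"raw\",\"file\":{\"filename\":\"$TEST_DIR/$i.raw\"}}"
    children="${children:+$children,}$child"
done

# Same options as the dotted form: raw on top of quorum with vote-threshold=2.
quorum="json:{\"driver\":\"raw\",\"file\":{\"driver\":\"quorum\",\"vote-threshold\":2,\"children\":[$children]}}"
printf '%s\n' "$quorum"
```

The nested object keeps the per-child options together, so adding a fourth
child means extending the loop rather than adding two more option lines.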

> Unfortunately the logic is more complicated in a persistent incremental backup
> scenario. Manual users will have to specify bitmap file names and
> granularities, which they may no longer remember two weeks after creating
> the bitmap and can easily get wrong.  Management seems a must in this case, but
> the interface we provide to them still feels way too low level. Anyway, I do
> think we can consider a "banana" (dummy name) driver for persistent bitmap
> management which is structured like quorum:
> 
>     banana="driver=raw,file.driver=banana,file.mode=synchronous"
>     banana="$banana,file.image.file.filename=$TEST_IMG"
>     banana="$banana,file.bitmaps.0.file.filename=$TEST_DIR/bm0.raw"
>     banana="$banana,file.bitmaps.0.granularity=65536"
>     banana="$banana,file.bitmaps.0.name=bm0"
>     banana="$banana,file.bitmaps.1.file.filename=$TEST_DIR/bm1.raw"
>     banana="$banana,file.bitmaps.1.granularity=1048576"
>     banana="$banana,file.bitmaps.1.name=bm1"
>     ...
> 
> But we're merely inlining the information from the QBM JSON format into the
> command line. IMO the two approaches are only one step apart.

It wasn't as clear to me before I read this explanation, but is the QBM
on-disk file format really just reinventing qemu config files then?
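
To illustrate the question, here is a rough sketch of the quoted "banana"
options written as one json: string. "banana" is the dummy driver name from the
quoted mail and does not exist in qemu; the name:granularity list, the loop, and
the fallback values for TEST_DIR and TEST_IMG are assumptions made only so the
snippet runs on its own.

```shell
#!/bin/sh
# Assumed fallbacks so the sketch runs outside qemu-iotests.
TEST_DIR="${TEST_DIR:-/tmp}"
TEST_IMG="${TEST_IMG:-$TEST_DIR/t.raw}"

# One entry per bitmap, given as name:granularity (values from the quoted mail).
bitmaps=""
for spec in bm0:65536 bm1:1048576; do
    name="${spec%%:*}"
    gran="${spec##*:}"
    entry="{\"name\":\"$name\",\"granularity\":$gran,\"file\":{\"filename\":\"$TEST_DIR/$name.raw\"}}"
    bitmaps="${bitmaps:+$bitmaps,}$entry"
done

# Hypothetical driver: "banana" is a placeholder, not a real block driver.
banana="json:{\"driver\":\"raw\",\"file\":{\"driver\":\"banana\",\"mode\":\"synchronous\",\"image\":{\"file\":{\"filename\":\"$TEST_IMG\"}},\"bitmaps\":[$bitmaps]}}"
printf '%s\n' "$banana"
```

Written this way, the runtime options carry the same fields a descriptor file
would (image, bitmap file names, names, granularities), which is what makes the
two look like one mechanism.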

Kevin
