From: Fam Zheng <famz@redhat.com>
To: Kevin Wolf <kwolf@redhat.com>
Cc: Alberto Garcia <berto@igalia.com>,
	qemu-block@nongnu.org, jsnow@redhat.com,
	Peter Lieven <pl@kamp.de>,
	qemu-devel@nongnu.org, Markus Armbruster <armbru@redhat.com>,
	vsementsov@parallels.com, Stefan Hajnoczi <stefanha@redhat.com>,
	"Denis V. Lunev" <den@openvz.org>,
	pbonzini@redhat.com, mreitz@redhat.com
Subject: Re: [Qemu-devel] [RFC PATCH 00/16] Qemu Bit Map (QBM) - an overlay format for persistent dirty bitmap
Date: Wed, 24 Feb 2016 08:49:28 +0800
Message-ID: <20160224004928.GC749@ad.usersys.redhat.com>
In-Reply-To: <20160223174330.GF8176@noname.redhat.com>

On Tue, 02/23 18:43, Kevin Wolf wrote:
> Am 23.02.2016 um 04:40 hat Fam Zheng geschrieben:
> > (I'm Cc'ing a few more people here just in case they have different visions
> > about raw image use cases.)
> > 
> > On Mon, 02/22 15:24, Kevin Wolf wrote:
> > > Am 26.01.2016 um 11:38 hat Fam Zheng geschrieben:
> > > > This series introduces a simple format to enable support of persistence of
> > > > block dirty bitmaps. Block dirty bitmap is the tool to achieve incremental
> > > > backup, and persistence of block dirty bitmap makes incremental backup possible
> > > > across VM shutdowns, where existing in-memory dirty bitmaps cannot survive.
> > > > 
> > > > When user creates a "persisted" dirty bitmap, the QBM driver will create a
> > > > binary file and synchronize it with the existing in-memory block dirty bitmap
> > > > (BdrvDirtyBitmap). When the VM is powered down, the binary file has all the
> > > > bits saved on disk, which will be loaded and used to initialize the in-memory
> > > > block dirty bitmap next time the guest is started.
> > > > 
> > > > The idea of the format is to reuse as much existing infrastructure as possible
> > > > and avoid introducing complex data structures - it works with any image format,
> > > > by gluing together plain bitmap files with a JSON descriptor file. The
> > > > advantage of this approach over extending existing formats, such as qcow2, is
> > > > that the new feature is implemented by an orthogonal driver, in a format
> > > > agnostic way. This way, even raw images can have their persistent dirty
> > > > bitmaps.  (And you will notice in this series, with a little forging to the
> > > > spec, raw images can also have backing files through a QBM overlay!)
> > > > 
> > > > Rather than superseding it, this is intended to coexist with the
> > > > qcow2 bitmap extension that Vladimir is working on.  The block driver interface
> > > > changes in this series also try to be generic and compatible for both drivers.
> > > 
> > > So as I already told Fam last week, before we discuss any technical
> > > details here, we first need to discuss whether this is even the right
> > > thing to do. Currently I'm doubtful, as this is another attempt to
> > > introduce a new native image format in qemu.
> > > 
> > > Let's recap the image formats and what we tell users about them today:
> > > 
> > > * qcow2: This is the default choice for disk images. It gives you access
> > >   to all of the features in qemu at a good performance. If it doesn't
> > >   perform well in your case, we'll fix it.
> > > 
> > > * raw: Use this when you need absolute performance and don't need any
> > >   features from an image format, so you want to get any complexity just
> > >   out of the way and pass requests as directly as possible from the
> > >   guest device to the host kernel.
> > > 
> > > * Anything else: Only use them to convert into raw or qcow2.
> > > 
> > > Now using bitmaps is clearly on the "features" side, which suggests that
> > > qcow2 is the format of choice for this. If you want to introduce a new
> > > format, you need to justify it with evidence that...
> > > 
> > > 1. there is a relevant use case that qcow2 doesn't cover
> > > 2. qcow2 can't be fixed/enhanced to cover the use case
> > > 
> > > The one thing that people have claimed in the past that qcow2 can't
> > > provide is enough performance. This is where QED tried to come in and
> > > promised a compromise between performance (then a bit faster than qcow2)
> > > and features (almost none, but supports backing files). We all know that
> > > it was a failure because you had to sacrifice features and still the
> > > idea that qcow2 couldn't be fixed was wrong, so today we have a QED
> > > driver that is much slower than qcow2 despite having fewer features.
> > > 
> > > Now for QBM. First, let's have a look at the image format that it can be
> > > used with. qcow2 doesn't need it if we continue with Vladimir's
> > > extension. Other non-raw formats are only supposed to be used for
> > > conversion. The only thing that's really left is raw.
> > 
> > Yes, I agree with this point.
> > 
> > > Now adding a
> > > feature only for raw, as a compromise between features and performance,
> > > looks an awful lot like what QED tried. We don't want to go there.
> > > 
> > > Even if we wanted to support persistent dirty bitmaps with raw images
> > > (which has to be discussed based on use cases), it's still questionable
> > > whether we need a new image format with JSON descriptor files instead of
> > > just raw bitmaps that can be added with a QMP command.
> > > 
> > 
> > I don't think a QMP interface alone is enough. In the persistent backup use
> > case, a command line interface is more appropriate when starting a guest, to
> > continue the dirty tracking that was enabled at shutdown.
> 
> Yes, I was sloppy. Maybe s/QMP command/runtime option/ gets closer.
> 
> > I'd justify in two parts, one is "why" and the other is "how".
> > 
> > So to answer why.  The reason I worked on QBM is that I feel it is wrong to
> > leave raw behind. Ceph and LVM users use raw format. You could technically
> > use qcow2 with ceph but that is discouraged[1] or even refused by openstack[2].
> > We've seen qcow2 on top of LVs but that is not the dominant setup.
> 
> Ceph is definitely a valid point. I think we agree that qcow2 can't
> provide what we need there today.
> 
> The question is whether qcow2 can be extended to provide it. As we
> discussed last week internally and today on the call, a possible idea
> would be to extend qcow2 to act as the filter driver here, where all I/O
> is redirected to the backing file and only the bitmaps remain in the
> qcow2 layer.
> 
> > The scope of "features" for which we tell users they have to use qcow2 should
> > be those that are format specific, not "block features" in general.  Backing
> > file, internal/external snapshot, thin provisioning, compression and
> > encryption are all great examples of format features, whereas things including
> > throttling, statistics, migration, mirroring and backing up are IMHO not.
> > Actually we already support snapshotting a raw image, with a qcow2 overlay.
> > We've even implemented non-persistent incremental backup for raw today,
> > through drive-backup.  If we decide that qcow2 is the only format that can do
> > persistent backup, I'm not really a huge fan of that.
> 
> Yeah, but that's just a feeling, not a use case.
> 
> > Then "how"?
> > 
> > Actually, I thought we could do it in a way similar to quorum. The way quorum
> > driver works is by specifying tediously long options. A snippet from
> > qemu-iotests to build a quorum driver with 3 children is like this:
> > 
> >     quorum="driver=raw,file.driver=quorum,file.vote-threshold=2"
> >     quorum="$quorum,file.children.0.file.filename=$TEST_DIR/1.raw"
> >     quorum="$quorum,file.children.1.file.filename=$TEST_DIR/2.raw"
> >     quorum="$quorum,file.children.2.file.filename=$TEST_DIR/3.raw"
> >     quorum="$quorum,file.children.0.driver=raw"
> >     quorum="$quorum,file.children.1.driver=raw"
> >     quorum="$quorum,file.children.2.driver=raw"
> > 
> > Though very repetitive, it is also very simple: all children are almost
> > symmetrical (identical in user data). The only thing the user or management
> > tool has to ensure is that the images have the same data.
> 
> By the way, the repetitiveness would be greatly reduced if the test case
> were using the json: pseudo-protocol.
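
As a rough sketch of that point: the dotted options above map onto one nested
JSON object, which can then be passed as a single json: filename. The helper
names (unflatten, listify) and the TEST_DIR value below are made up for
illustration, and the mapping assumes that purely numeric key components (the
children indices) are meant as list positions.

```python
import json

TEST_DIR = "/tmp/qemu-iotests"  # hypothetical stand-in for $TEST_DIR


def listify(node):
    """Convert dicts whose keys are all digits ("0", "1", ...) into lists."""
    if not isinstance(node, dict):
        return node
    converted = {key: listify(value) for key, value in node.items()}
    if converted and all(key.isdigit() for key in converted):
        return [converted[key] for key in sorted(converted, key=int)]
    return converted


def unflatten(options):
    """Turn {"a.b.0.c": v} style dotted options into nested dicts/lists."""
    root = {}
    for dotted_key, value in options.items():
        parts = dotted_key.split(".")
        node = root
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return listify(root)


# The same options as in the quoted test snippet, written as a flat dict.
options = {
    "driver": "raw",
    "file.driver": "quorum",
    "file.vote-threshold": 2,
}
for i in range(3):
    options[f"file.children.{i}.file.filename"] = f"{TEST_DIR}/{i + 1}.raw"
    options[f"file.children.{i}.driver"] = "raw"

nested = unflatten(options)
json_filename = "json:" + json.dumps(nested)
print(json_filename)
```

The three children then become three entries of one "children" array rather
than six separate option strings.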
> 
> > Unfortunately the logic is more complicated in a persistent incremental backup
> > scenario. Manual users will have to specify bitmap file names and
> > granularities, which they may no longer remember two weeks after creating the
> > bitmap, and which they can get wrong.  Management seems a must in this case,
> > but the interface we provide to them still feels way too low level. Anyway, I
> > do think we can consider a "banana" (dummy name) driver for persistent bitmap
> > management which is structured like quorum:
> > 
> >     banana="driver=raw,file.driver=banana,file.mode=synchronous"
> >     banana="$banana,file.image.file.filename=$TEST_IMG"
> >     banana="$banana,file.bitmaps.0.file.filename=$TEST_DIR/bm0.raw"
> >     banana="$banana,file.bitmaps.0.granularity=65536"
> >     banana="$banana,file.bitmaps.0.name=bm0"
> >     banana="$banana,file.bitmaps.1.file.filename=$TEST_DIR/bm1.raw"
> >     banana="$banana,file.bitmaps.1.granularity=1048576"
> >     banana="$banana,file.bitmaps.1.name=bm1"
> >     ...
> > 
> > But we're merely inlining the information from the QBM JSON format into the
> > command line. IMO the two approaches are only one step apart.
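
To illustrate the "inlining" in a hedged way: a nested descriptor can be
flattened mechanically into the dotted options shown above. The descriptor
layout below is hypothetical and only mirrors the option names from the
"banana" example; it is not the QBM specification, and the file names are
placeholders.

```python
def flatten(node, prefix=""):
    """Yield (dotted_key, value) pairs for a nested dict/list structure."""
    if isinstance(node, dict):
        items = node.items()
    elif isinstance(node, list):
        items = ((str(index), value) for index, value in enumerate(node))
    else:
        # Leaf value: emit it under the accumulated dotted key.
        yield prefix, node
        return
    for key, value in items:
        child = f"{prefix}.{key}" if prefix else key
        yield from flatten(value, child)


# Hypothetical descriptor, NOT the real QBM JSON layout.
descriptor = {
    "mode": "synchronous",
    "image": {"file": {"filename": "test.img"}},
    "bitmaps": [
        {"file": {"filename": "bm0.raw"}, "granularity": 65536, "name": "bm0"},
        {"file": {"filename": "bm1.raw"}, "granularity": 1048576, "name": "bm1"},
    ],
}


def to_drive_options(desc):
    """Build the comma-separated option string for the dummy banana driver."""
    opts = [("driver", "raw"), ("file.driver", "banana")]
    opts += [("file." + key, value) for key, value in flatten(desc)]
    return ",".join(f"{key}={value}" for key, value in opts)


banana = to_drive_options(descriptor)
print(banana)
```

Either form carries the same information; the difference is only whether it
lives in a file next to the image or has to be repeated on every command line.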
> 
> It wasn't as clear to me before I read this explanation, but is the QBM
> on-disk file format really just reinventing qemu config files then?

I agree they look alike on the surface, but are qemu config files updated by
QEMU? An image is both read and, more importantly, written by the driver
following a definite format specification. I think that is fundamentally
different.

Fam
