From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([140.186.70.92]:60713) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1RznGd-0000Ob-Ks for qemu-devel@nongnu.org; Tue, 21 Feb 2012 05:48:21 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1RznGY-0006g3-UI for qemu-devel@nongnu.org; Tue, 21 Feb 2012 05:48:19 -0500 Received: from mx1.redhat.com ([209.132.183.28]:47766) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1RznGY-0006fx-MG for qemu-devel@nongnu.org; Tue, 21 Feb 2012 05:48:14 -0500 Message-ID: <4F43773B.6060109@redhat.com> Date: Tue, 21 Feb 2012 11:51:39 +0100 From: Kevin Wolf MIME-Version: 1.0 References: <73865e0ce364c40e0eb65ec6b22b819d@mail.gmail.com> <4F31153E.9010205@codemonkey.ws> <4F311839.9030709@redhat.com> <4F311BBA.8000708@codemonkey.ws> <4F312FD3.5020206@zerto.com> <4F3137DB.1040503@redhat.com> <4F3139CE.4040103@zerto.com> <4F314798.8010009@redhat.com> <4F3211D0.3070502@zerto.com> <4F323875.1000000@redhat.com> <4F3244C2.1040604@zerto.com> <4F32489A.80307@redhat.com> <4F32788C.60904@zerto.com> <4F40FBD6.2000500@zerto.com> <4F425987.20103@redhat.com> <4F435DD2.8080600@redhat.com> <4F4360C7.5080806@redhat.com> <4F4368BF.4040707@redhat.com> <4F436D76.6090206@redhat.com> In-Reply-To: <4F436D76.6090206@redhat.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] BlockDriverState stack and BlockListeners List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Paolo Bonzini Cc: =?UTF-8?B?16rXldee16gg15HXnyDXkNeV16g=?= , Stefan Hajnoczi , Jeff Cody , dlaor@redhat.com, qemu-devel@nongnu.org, Markus Armbruster , Zhi Yong Wu , Federico Simoncelli , Ori Mamluk , =?UTF-8?B?16LXldeT15Mg16fXk9ed?= , Yair Kuszpet Am 21.02.2012 11:09, schrieb Paolo Bonzini: >>> 2) BlockListener would be entirely an implementation detail, used in the >>> implementation of other commands. >> >> Depending on what you mean by command (presumably QMP commands?), > > QMP commands, command-line (-drive), whatever. > >> And on the other hand, protocols like file are entirely implementation >> details as well, and still they are BlockDrivers. > > True. Formats and protocols also do not have perfectly overlapping > functionality (backing images are only a format thing, for example). And even protocols and protocols don't. Compare file to blkdebug, for example. In fact, blkdebug and blkverify are already very close to what BlockListeners would be. >>>> The main difference that I see is that the listeners stay always on top. >>>> For example, let's assume that if implemented a blkmirror driver in >>>> today's infrastructure, you would get a BlockDriverState stack like >>>> blkmirror -> qcow2 -> file. If you take a live snapshot now, you don't >>>> want to have the blkmirror applied to the old top-level image, which is >>>> now a read-only backing file. Instead, it should move to the new >>>> top-level image. >>> >>> Yes, that's because a BlockListener always applies to a >>> BlockDriverState, and live snapshots close+reopen the BDS but do not >>> delete/recreate it. >> >> Hm, that's an interesting angle to look at it. The reasoning makes sense >> to me (though I would reword it as a BlockListener belongs to a >> drive/block device rather than a BDS, which is an implementation detail). > > Yes. > >> However, live snapshots can't close and reopen the BDS in the future, >> because the reopen could fail and you must not have closed the old image >> yet in this case. So what Jeff and I were looking into recently is to >> change this so that new top-level images are opened first without >> backing file and then the backing file relationship is created with the >> existing BDS. >> >> Of course, we stumbled across the thing that you're telling me here, >> that devices refer to the same BDS as before. So their content must be >> swapped, but some data like the device name (and now BlockListeners) >> stay with the top-level image instead. >> >> Which in turn reminds me of a discussion I had with Stefan a while ago, >> where we came to the conclusion that we need to separate the >> representation of an image file and a "view" of it which represents a >> block device (as a whole). [...] >> >> Isn't it cool how everything is connected with everything? :-) > > :) > > So we'd have: > > - Protocols -- Protocols tell you where to get the raw bits. > - Formats -- Formats transform those raw bits into a block device. > - Views -- Views can move from a format to another. A format can use a > default view implementation, or provide its own (e.g. to access > different snapshots). > - Listeners -- I think a view can have-a listener? Where protocols, formats and listeners are-a image (Not the best name, I'm open for suggestions. Something like "BDS stack building block" would be most accurate...). Or actually not is-a in terms of inheritance, but I think it would be the very same thing without any subclassing, implemented by a BlockDriver. > with the following relationship: > > - A format has-a protocol for the raw bits and has-a view for the > backing file. An image has-a image from which it takes its data (bs->file). And it has-a view for the backing file, yes. Both could be a listener. > - A view has-a format, a device has-a view. > - A view can have-a listener? Or is it formats? A view has-a image. This may happen to be a listener, which in turn has-a image (could be another listener, a format or a protocol). The question is what the semantics is with live snapshots (there are probably similar problems, but this is the obvious one). For example we could now have: mirror -> qcow2 -> blkdebug -> file There are two listeners here, mirror and blkdebug. (Things like blkdebug are why view has-a listener isn't enough). After creating an external snapshot, we expect the graph to look like this (the arrow down is the backing file): mirror -> qcow2 -> file | +-> qcow2 -> blkdebug -> file The question is: Can we assume that any listeners that are on top of the first format or protocol (i.e. those that would fit your model) should move to the new top-level view? Or would it sometimes make sense to keep it at the old one? > But I think we're digressing... No, in fact I think we need to get an idea of what we want to have in the end. And we need to do it soon, almost any new topic that's coming up ends up in a discussion about the same shortcomings of the block layer. It doesn't make sense to hack in more and more stuff without having proper infrastructure for it. >>>> So maybe we just need to extend the current BlockDriverState stack to >>>> distinguish "normal" and "always on top" BlockDrivers, where the latter >>>> would roughly correspond to BlockListeners? >>> >>> I would prefer having separate objects. Even if you do not count fields >>> related to throttling or copy-on-read or other tasks in the list above, >>> there are many fields in BDS that do not really apply to BlockListeners. >>> Backing files, device ops, encryption, even size. Having extra methods >>> is not a big problem, but unwanted data items smell... >> >> Most other block drivers use only little of them. We can try to clean up >> some of them (and making the separation described above would probably >> help with it), but BlockListeners aren't really different here from >> existing drivers. > > True, the question only matters insofar as having a separate data > structure simplifies the design. ("Simplify" means "we can actually > understand it and be reasonably sure that it's correct and implementable"). Having only one kind of building block for the block driver graph is a simplification, too. Or at least one common base class. Kevin