QEMU-Devel Archive on lore.kernel.org
 help / color / Atom feed
From: Kevin Wolf <kwolf@redhat.com>
To: Max Reitz <mreitz@redhat.com>
Cc: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
	qemu-devel@nongnu.org, qemu-block@nongnu.org
Subject: Re: [PATCH v6 28/42] stream: Deal with filters
Date: Wed, 11 Dec 2019 17:35:17 +0100
Message-ID: <20191211163517.GD6505@linux.fritz.box> (raw)
In-Reply-To: <38b4e4bf-9553-e231-98f5-91dcb6994462@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 9331 bytes --]

Am 11.12.2019 um 17:12 hat Max Reitz geschrieben:
> On 11.12.19 16:52, Kevin Wolf wrote:
> > Am 11.12.2019 um 13:52 hat Max Reitz geschrieben:
> >> On 16.09.19 11:52, Max Reitz wrote:
> >>> On 13.09.19 16:16, Kevin Wolf wrote:
> >>>> Am 09.08.2019 um 18:13 hat Max Reitz geschrieben:
> >>>>> Because of the recent changes that make the stream job independent of
> >>>>> the base node and instead track the node above it, we have to split that
> >>>>> "bottom" node into two cases: The bottom COW node, and the node directly
> >>>>> above the base node (which may be an R/W filter or the bottom COW node).
> >>>>>
> >>>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
> >>>>> ---
> >>>>>  qapi/block-core.json |  4 ++++
> >>>>>  block/stream.c       | 52 ++++++++++++++++++++++++++++----------------
> >>>>>  blockdev.c           |  2 +-
> >>>>>  3 files changed, 38 insertions(+), 20 deletions(-)
> >>
> >>
> >> [...]
> >>
> >>>>> +    if (bdrv_freeze_chain(bs, above_base, errp) < 0) {
> >>>>>          return;
> >>>>>      }
> >>>>
> >>>> Hm... This feels odd. There are two places where stopping to freeze the
> >>>> chain would make obvious sense: At base, like we originally did; or at
> >>>> base_overlay, like we (intend to) do since commit c624b015, because we
> >>>> say that we don't actually mind if the user replaces the base image. I
> >>>> don't see how stopping at the first filter above base makes sense.
> >>>>
> >>>> So should this use bottom_cow_node/base_overlay instead of above_base?
> >>>
> >>> I suppose I thought “Better be safe than sorry”.
> >>>
> >>>> You couldn't use StreamBlockJob.above_base any more then because it
> >>>> could change, but you also don't really need it anywhere. It's only used
> >>>> for unfreezing (which would change) and for finding the base (you can
> >>>> still find bdrv_backing_chain_next(s->base_overlay)). I guess this would
> >>>> even be a code simplification.
> >>>
> >>> Great, I’ll see to it.
> >>
> >> On second thought (yes, I know, it’s been a couple of months...) I’m not
> >> so sure.
> >>
> >> If @base is a filter, then bdrv_backing_chain_next(s->base_overlay) will
> >> not return it.  So then the filter will be dropped, but that probably
> >> isn’t what the user intended.
> >>
> >> (In fact, the block-stream doc says “When streaming completes the image
> >> file will have the base file as its backing file.”)
> > 
> > Hm... Okay, let's try an example:
> > 
> >     ... <- backing <- filter1 <- filter2 <- filter3 <- top
> >                          |         |                    |
> >                       @base    above_base            @device
> >                                                    base_overlay
> > 
> > 
> > The expected result after the job has completed is:
> > 
> >     ... <- backing <- filter1 <- top
> > 
> > This means that filter1 must not go away until the job has completed. In
> > other words, we would need to freeze the link between @base and
> > above_base.
> > 
> > If we use backing_bs(above_base) as we currently do, we wouldn't
> > necessarily get filter1 as the new backing child of top (as the
> > documentation promises), but whatever is the backing child of filter2
> > when the job completes. In other words, documentation and code don't
> > match currently.
> 
> Correct.
> 
> > Let's look at a few more examples to see which of the options makes more
> > sense:
> > 
> > 1. Assume we drop filter1 while the stream job is running, i.e. backing
> >    is now the backing child of filter 2. I think it's obvious that when
> >    the stream job completes, you want backing to be the new backing
> >    child of top and not add filter1 back to the chain.
> > 
> > 2. If we add filter1b between filter1 and filter2, do we want to drop it
> >    when the job completes? It's not entirely clear, but there are
> >    probably cases where keeping it makes sense. The user can always drop
> >    it manually again, but if we delete a node, some requests will go
> >    through unfiltered.
> 
> Depends.  If the base-node was given to be literally "backing", then I’d
> say the user wants us to use "backing" as the new base still.
> 
> > 3. Imagine we replace filter1 wholesale, for example because a
> >    concurrently running mirror job performs a storage migration for
> >    everything up to filter1. Do we then want to reinstate the whole old
> >    subtree when the stream job completes? Certainly not.
> 
> I’m not sure.  Again, if I as the user specified the "backing" node-name
> as the base, I’d expect that to be the new backing file in all cases.
> 
> > So from this I would probably conclude that the bug is the
> > documentation, not necessarily in the code.
> 
> It certainly is true that it does not address what happens when you
> meddle with base.

Which is because it wasn't relevant then: You were not allowed to meddle
with base when block-stream documentation was written. It's only a
possible case since commit c624b015.

If we really want to guarantee that it's exactly base that will be used
at the end, then we need to freeze all links down to base, essentially
reverting commit c624b015 and disallowing some operations that would be
possible if we didn't make that promise. I don't think that's the right
thing to do, but at least it would be self-consistent.

Promising base, but not freezing the links, leads to weird results in
some of the cases mentioned above.

> (Except it can be argued (and I suppose I did argue that) that as it is
> it does say the base file should be the backing file after stream, so if
> the base file is still there, I do interpret that to mean that the
> stream job always uses that as the backing node.  Of course even if it
> says that, that doesn’t mean that it makes sense.  And if it doesn’t
> make sense, then that’s the definition of a bug in the documentation, yes.)
> 
> >> So now this gets hairy.  We do want exactly @base as the backing file
> >> unless the user changed the graph.  But how can we detect that and if it
> >> hasn’t changed find @base again?
> >>
> >> What this patch did in this version worked because the graph was frozen
> >> until @above_base.
> > 
> > No, it didn't provide the promised semantics, because "unless the user
> > changed the graph" isn't part of the documentation.But the promised
> > semantics are probably not what we want, so it's okay.
> 
> I would have said that no semantics are promised for when the node is no
> longer a valid base, so we can basically do what we think makes sense.

Like undefined behaviour in C?

Sounds great, let's reboot the guest into some 512-byte demo then.

> >> Alternatively, we could store a pointer to @base directly (or its node
> >> nmae) and then see whether there is some node between s->base_overlay
> >> and bdrv_backing_chain_next(s->base_overlay) that matches that at the
> >> end of streaming.
> >>
> >> Well, actually, a pointer won’t work because of course maybe that node
> >> was deleted and the area is now used for an unrelated node that the user
> >> doesn’t want as the new backing file.
> >>
> >> The node name could actually work, though.  I mean, if there is some
> >> node in the immediate backing filter chain of base_overlay with that
> >> node name after streaming, it does make sense to assume that the user
> >> wants this to be the backing file; regardless of whether that’s exactly
> >> the same node as it was at the start of streaming.
> >>
> >> But we now also have to think about what to do when there is no node
> >> with such a node name.  Should we keep all filters below base_overlay?
> >> Or should we drop all of them?  I actually think keeping them is the
> >> safer choice...
> > 
> > Using node names feels completely wrong to me. If we can't keep a
> > pointer (with a reference) because we want the user to be able to
> > delete the node, then we certainly can't keep the node name either
> > because that could refer to an entirely different node when the job
> > completes.
> 
> I was thinking that if the user does graph manipulation, we can expect
> them to have given base-node.  So they actually did refer to a node name
> and it makes sense to me to keep looking it up.
> 
> (As a side note: I originally intended to say “storing a pointer or the
> node-name are the only things that come to mind, but obviously both are
> stupid” – but then I changed my mind and realized that the node name
> maybe actually isn’t that stupid.)

I think it is. A node name refers to the node with that name at the
time of the QMP command. What node was referenced in that command
doesn't change later because of graph manipulations.

> > I don't think it's useful to waste too much thought on how to implement
> > the behaviour required by the documentation. The requirement seems to be
> > just wrong.
> > 
> > So in the end, maybe your current code is fine, and the way to address
> > the "doesn't make obvious sense" part is having a comment that explains
> > why above_base is where we stop freezing.
> 
> Let’s maybe talk about it tomorrow. :-)

Okay, added to my preliminary agenda.

Kevin

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

  reply index

Thread overview: 136+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-09 16:13 [Qemu-devel] [PATCH v6 00/42] block: " Max Reitz
2019-08-09 16:13 ` [Qemu-devel] [PATCH v6 01/42] block: Mark commit and mirror as filter drivers Max Reitz
2019-08-09 16:13 ` [Qemu-devel] [PATCH v6 02/42] copy-on-read: Support compressed writes Max Reitz
2019-08-09 16:13 ` [Qemu-devel] [PATCH v6 03/42] throttle: " Max Reitz
2019-08-09 16:13 ` [Qemu-devel] [PATCH v6 04/42] block: Add child access functions Max Reitz
2019-08-09 16:56   ` Eric Blake
2019-09-04 16:16   ` Kevin Wolf
2019-09-09  7:56     ` Max Reitz
2019-09-09  9:36       ` Kevin Wolf
2019-09-09 14:04         ` Max Reitz
2019-09-09 16:13           ` Kevin Wolf
2019-09-10  9:14             ` Max Reitz
2019-09-10 10:47               ` Kevin Wolf
2019-09-10 11:36                 ` Max Reitz
2019-09-10 12:48                   ` Kevin Wolf
2019-09-10 12:59                     ` Max Reitz
2019-09-10 13:10                       ` Kevin Wolf
2019-08-09 16:13 ` [Qemu-devel] [PATCH v6 05/42] block: Add chain helper functions Max Reitz
2019-08-09 17:01   ` Eric Blake
2019-08-09 16:13 ` [Qemu-devel] [PATCH v6 06/42] qcow2: Implement .bdrv_storage_child() Max Reitz
2019-08-09 17:07   ` Eric Blake
2019-08-09 16:13 ` [Qemu-devel] [PATCH v6 07/42] block: *filtered_cow_child() for *has_zero_init() Max Reitz
2019-08-09 16:13 ` [Qemu-devel] [PATCH v6 08/42] block: bdrv_set_backing_hd() is about bs->backing Max Reitz
2019-08-09 16:13 ` [Qemu-devel] [PATCH v6 09/42] block: Include filters when freezing backing chain Max Reitz
2019-08-10 13:32   ` Vladimir Sementsov-Ogievskiy
2019-08-12 12:56     ` Max Reitz
2019-09-05 13:05   ` Kevin Wolf
2019-09-09  8:02     ` Max Reitz
2019-09-09  9:40       ` Kevin Wolf
2019-08-09 16:13 ` [Qemu-devel] [PATCH v6 10/42] block: Drop bdrv_is_encrypted() Max Reitz
2019-08-10 13:42   ` Vladimir Sementsov-Ogievskiy
2019-08-09 16:13 ` [Qemu-devel] [PATCH v6 11/42] block: Add bdrv_supports_compressed_writes() Max Reitz
2019-09-05 13:11   ` Kevin Wolf
2019-09-09  8:09     ` Max Reitz
2019-08-09 16:13 ` [Qemu-devel] [PATCH v6 12/42] block: Use bdrv_filtered_rw* where obvious Max Reitz
2019-08-09 16:13 ` [Qemu-devel] [PATCH v6 13/42] block: Use CAFs in block status functions Max Reitz
2019-08-09 16:13 ` [Qemu-devel] [PATCH v6 14/42] block: Use CAFs when working with backing chains Max Reitz
2019-08-10 15:19   ` Vladimir Sementsov-Ogievskiy
2019-09-05 14:05   ` Kevin Wolf
2019-09-09  8:25     ` Max Reitz
2019-09-09  9:55       ` Kevin Wolf
2019-09-09 14:08         ` Max Reitz
2019-08-09 16:13 ` [Qemu-devel] [PATCH v6 15/42] block: Re-evaluate backing file handling in reopen Max Reitz
2019-08-10 16:05   ` Vladimir Sementsov-Ogievskiy
2019-08-09 16:13 ` [Qemu-devel] [PATCH v6 16/42] block: Flush all children in generic code Max Reitz
2019-08-10 15:36   ` Vladimir Sementsov-Ogievskiy
2019-08-12 12:58     ` Max Reitz
2019-09-05 16:24       ` Kevin Wolf
2019-09-09  8:31         ` Max Reitz
2019-09-09 10:01           ` Kevin Wolf
2019-09-09 14:15             ` Max Reitz
2019-08-09 16:13 ` [Qemu-devel] [PATCH v6 17/42] block: Use CAFs in bdrv_refresh_limits() Max Reitz
2019-08-09 16:13 ` [Qemu-devel] [PATCH v6 18/42] block: Use CAFs in bdrv_refresh_filename() Max Reitz
2019-08-10 16:22   ` Vladimir Sementsov-Ogievskiy
2019-08-09 16:13 ` [Qemu-devel] [PATCH v6 19/42] block: Use CAF in bdrv_co_rw_vmstate() Max Reitz
2019-08-09 16:13 ` [Qemu-devel] [PATCH v6 20/42] block/snapshot: Fix fallback Max Reitz
2019-08-10 16:34   ` Vladimir Sementsov-Ogievskiy
2019-08-12 13:06     ` Max Reitz
2019-09-10 11:56   ` Kevin Wolf
2019-09-10 12:04     ` Max Reitz
2019-09-10 12:49       ` Kevin Wolf
2019-09-10 13:06         ` Max Reitz
2019-08-09 16:13 ` [Qemu-devel] [PATCH v6 21/42] block: Use CAFs for debug breakpoints Max Reitz
2019-08-09 16:13 ` [Qemu-devel] [PATCH v6 22/42] block: Fix bdrv_get_allocated_file_size's fallback Max Reitz
2019-08-10 16:41   ` Vladimir Sementsov-Ogievskiy
2019-08-12 13:09     ` Max Reitz
2019-08-12 17:14       ` Vladimir Sementsov-Ogievskiy
2019-08-12 19:15         ` Max Reitz
2019-09-10 14:52   ` Kevin Wolf
2019-09-11  6:20     ` Max Reitz
2019-09-11  6:55       ` Kevin Wolf
2019-09-11  7:37         ` Max Reitz
2019-09-11  8:27           ` Kevin Wolf
2019-09-11 10:00             ` Max Reitz
2019-09-11 10:31               ` Kevin Wolf
2019-09-11 11:00                 ` Max Reitz
2019-09-12 10:34                   ` Kevin Wolf
2019-11-14 13:11                   ` Max Reitz
2019-08-09 16:13 ` [Qemu-devel] [PATCH v6 23/42] blockdev: Use CAF in external_snapshot_prepare() Max Reitz
2019-09-10 15:02   ` Kevin Wolf
2019-09-11  6:21     ` Max Reitz
2019-08-09 16:13 ` [Qemu-devel] [PATCH v6 24/42] block: Use child access functions for QAPI queries Max Reitz
2019-08-10 16:57   ` Vladimir Sementsov-Ogievskiy
2019-08-09 16:13 ` [Qemu-devel] [PATCH v6 25/42] mirror: Deal with filters Max Reitz
2019-08-12 11:09   ` Vladimir Sementsov-Ogievskiy
2019-08-12 13:26     ` Max Reitz
2019-08-14 15:17       ` Vladimir Sementsov-Ogievskiy
2019-08-31  9:57   ` Vladimir Sementsov-Ogievskiy
2019-09-02 14:35     ` Max Reitz
2019-09-03  8:32       ` Vladimir Sementsov-Ogievskiy
2019-09-09  7:41         ` Max Reitz
2019-09-13 12:55   ` Kevin Wolf
2019-09-16 10:26     ` Max Reitz
2019-08-09 16:13 ` [Qemu-devel] [PATCH v6 26/42] backup: " Max Reitz
2019-08-09 16:13 ` [Qemu-devel] [PATCH v6 27/42] commit: " Max Reitz
2019-08-31 10:44   ` Vladimir Sementsov-Ogievskiy
2019-09-02 14:55     ` Max Reitz
2019-08-09 16:13 ` [Qemu-devel] [PATCH v6 28/42] stream: " Max Reitz
2019-08-12 11:55   ` Vladimir Sementsov-Ogievskiy
2019-09-13 14:16   ` Kevin Wolf
2019-09-16  9:52     ` Max Reitz
2019-09-16 14:47       ` Kevin Wolf
2019-12-11 12:52       ` Max Reitz
2019-12-11 15:52         ` Kevin Wolf
2019-12-11 16:12           ` Max Reitz
2019-12-11 16:35             ` Kevin Wolf [this message]
2019-08-09 16:13 ` [Qemu-devel] [PATCH v6 29/42] nbd: Use CAF when looking for dirty bitmap Max Reitz
2019-08-09 16:13 ` [Qemu-devel] [PATCH v6 30/42] qemu-img: Use child access functions Max Reitz
2019-08-12 12:14   ` Vladimir Sementsov-Ogievskiy
2019-08-12 13:28     ` Max Reitz
2019-08-14 16:04   ` Vladimir Sementsov-Ogievskiy
2019-08-09 16:13 ` [Qemu-devel] [PATCH v6 31/42] block: Drop backing_bs() Max Reitz
2019-08-09 16:13 ` [Qemu-devel] [PATCH v6 32/42] block: Make bdrv_get_cumulative_perm() public Max Reitz
2019-08-09 16:13 ` [Qemu-devel] [PATCH v6 33/42] blockdev: Fix active commit choice Max Reitz
2019-08-09 16:13 ` [Qemu-devel] [PATCH v6 34/42] block: Inline bdrv_co_block_status_from_*() Max Reitz
2019-08-09 16:14 ` [Qemu-devel] [PATCH v6 35/42] block: Fix check_to_replace_node() Max Reitz
2019-08-15 15:21   ` Vladimir Sementsov-Ogievskiy
2019-08-15 17:01     ` Max Reitz
2019-08-16 11:01       ` Vladimir Sementsov-Ogievskiy
2019-08-16 13:30         ` Max Reitz
2019-08-16 14:24           ` Vladimir Sementsov-Ogievskiy
2019-08-09 16:14 ` [Qemu-devel] [PATCH v6 36/42] iotests: Add tests for mirror @replaces loops Max Reitz
2019-08-09 16:14 ` [Qemu-devel] [PATCH v6 37/42] block: Leave BDS.backing_file constant Max Reitz
2019-08-16 16:16   ` Vladimir Sementsov-Ogievskiy
2019-08-09 16:14 ` [Qemu-devel] [PATCH v6 38/42] iotests: Let complete_and_wait() work with commit Max Reitz
2019-08-23  5:59   ` Vladimir Sementsov-Ogievskiy
2019-08-09 16:14 ` [Qemu-devel] [PATCH v6 39/42] iotests: Add filter commit test cases Max Reitz
2019-08-31 11:41   ` Vladimir Sementsov-Ogievskiy
2019-09-02 15:06     ` Max Reitz
2019-08-31 12:35   ` Vladimir Sementsov-Ogievskiy
2019-09-02 15:09     ` Max Reitz
2019-08-09 16:14 ` [Qemu-devel] [PATCH v6 40/42] iotests: Add filter mirror " Max Reitz
2019-08-31 12:35   ` Vladimir Sementsov-Ogievskiy
2019-08-09 16:14 ` [Qemu-devel] [PATCH v6 41/42] iotests: Add test for commit in sub directory Max Reitz
2019-08-09 16:14 ` [Qemu-devel] [PATCH v6 42/42] iotests: Test committing to overridden backing Max Reitz
2019-09-03  9:18   ` Vladimir Sementsov-Ogievskiy

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191211163517.GD6505@linux.fritz.box \
    --to=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=vsementsov@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

QEMU-Devel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/qemu-devel/0 qemu-devel/git/0.git
	git clone --mirror https://lore.kernel.org/qemu-devel/1 qemu-devel/git/1.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 qemu-devel qemu-devel/ https://lore.kernel.org/qemu-devel \
		qemu-devel@nongnu.org
	public-inbox-index qemu-devel

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.nongnu.qemu-devel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git