On 31.05.19 18:26, Max Reitz wrote: > On 16.04.19 12:02, Vladimir Sementsov-Ogievskiy wrote: >> 10.04.2019 23:20, Max Reitz wrote: >>> What bs->file and bs->backing mean depends on the node. For filter >>> nodes, both signify a node that will eventually receive all R/W >>> accesses. For format nodes, bs->file contains metadata and data, and >>> bs->backing will not receive writes -- instead, writes are COWed to >>> bs->file. Usually. >>> >>> In any case, it is not trivial to guess what a child means exactly with >>> our currently limited form of expression. It is better to introduce >>> some functions that actually guarantee a meaning: >>> >>> - bdrv_filtered_cow_child() will return the child that receives requests >>> filtered through COW. That is, reads may or may not be forwarded >>> (depending on the overlay's allocation status), but writes never go to >>> this child. >>> >>> - bdrv_filtered_rw_child() will return the child that receives requests >>> filtered through some very plain process. Reads and writes issued to >>> the parent will go to the child as well (although timing, etc. may be >>> modified). >>> >>> - All drivers but quorum (but quorum is pretty opaque to the general >>> block layer anyway) always only have one of these children: All read >>> requests must be served from the filtered_rw_child (if it exists), so >>> if there was a filtered_cow_child in addition, it would not receive >>> any requests at all. >>> (The closest here is mirror, where all requests are passed on to the >>> source, but with write-blocking, write requests are "COWed" to the >>> target. But that just means that the target is a special child that >>> cannot be introspected by the generic block layer functions, and that >>> source is a filtered_rw_child.) >>> Therefore, we can also add bdrv_filtered_child() which returns that >>> one child (or NULL, if there is no filtered child). >>> >>> Also, many places in the current block layer should be skipping filters >>> (all filters or just the ones added implicitly, it depends) when going >>> through a block node chain. They do not do that currently, but this >>> patch makes them. >>> >>> One example for this is qemu-img map, which should skip filters and only >>> look at the COW elements in the graph. The change to iotest 204's >>> reference output shows how using blkdebug on top of a COW node used to >>> make qemu-img map disregard the rest of the backing chain, but with this >>> patch, the allocation in the base image is reported correctly. >>> >>> Furthermore, a note should be made that sometimes we do want to access >>> bs->backing directly. This is whenever the operation in question is not >>> about accessing the COW child, but the "backing" child, be it COW or >>> not. This is the case in functions such as bdrv_open_backing_file() or >>> whenever we have to deal with the special behavior of @backing as a >>> blockdev option, which is that it does not default to null like all >>> other child references do. >>> >>> Finally, the query functions (query-block and query-named-block-nodes) >>> are modified to return any filtered child under "backing", not just >>> bs->backing or COW children. This is so that filters do not interrupt >>> the reported backing chain. This changes the output of iotest 184, as >>> the throttled node now appears as a backing child. >>> >>> Signed-off-by: Max Reitz >>> --- >>> qapi/block-core.json | 4 + >>> include/block/block.h | 1 + >>> include/block/block_int.h | 40 +++++-- >>> block.c | 210 +++++++++++++++++++++++++++------ >>> block/backup.c | 8 +- >>> block/block-backend.c | 16 ++- >>> block/commit.c | 33 +++--- >>> block/io.c | 45 ++++--- >>> block/mirror.c | 21 ++-- >>> block/qapi.c | 30 +++-- >>> block/stream.c | 13 +- >>> blockdev.c | 88 +++++++++++--- >>> migration/block-dirty-bitmap.c | 4 +- >>> nbd/server.c | 6 +- >>> qemu-img.c | 29 ++--- >>> tests/qemu-iotests/184.out | 7 +- >>> tests/qemu-iotests/204.out | 1 + >>> 17 files changed, 411 insertions(+), 145 deletions(-) >> >> really huge... didn't you consider conversion file-by-file? >> >> [..] >> >>> diff --git a/block.c b/block.c >>> index 16615bc876..e8f6febda0 100644 >>> --- a/block.c >>> +++ b/block.c >> >> [..] >> >>> >>> @@ -3467,14 +3469,17 @@ static int bdrv_reopen_parse_backing(BDRVReopenState *reopen_state, >>> /* >>> * Find the "actual" backing file by skipping all links that point >>> * to an implicit node, if any (e.g. a commit filter node). >>> + * We cannot use any of the bdrv_skip_*() functions here because >>> + * those return the first explicit node, while we are looking for >>> + * its overlay here. >>> */ >>> overlay_bs = bs; >>> - while (backing_bs(overlay_bs) && backing_bs(overlay_bs)->implicit) { >>> - overlay_bs = backing_bs(overlay_bs); >>> + while (overlay_bs->backing && bdrv_filtered_bs(overlay_bs)->implicit) { >> >> So, you don't want to skip implicit filters with 'file' child? Then, why not to use >> child_bs(overlay_bs->backing), like in following if condition? > > On second thought, I actually think this version is wrong in the other way. > > There needs to be a bs with bs->backing != NULL and !bs->implicit > somewhere in the chain. (Actually, no, the bs->backing node is @bs) > We try to find that node. It doesn’t matter > what’s on top of it, though, If there are implicit node (which we try > to skip here), the user isn’t aware of them. Consequentially, it > doesn’t matter whether these implicit nodes use bs->backing or bs->file, > we just need to skip them. > > What is wrong is the “while (overlay_bs->backing ...)”. That needs to > be “while (bdrv_filtered_bs(overlay_bs) ...)”. I just saw my reply where I noticed this before... So nothing too new then. Max