* [Qemu-devel] [PATCH v5 00/42] block: Deal with filters
@ 2019-06-12 22:09 Max Reitz
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 01/42] block: Mark commit and mirror as filter drivers Max Reitz
                   ` (42 more replies)
  0 siblings, 43 replies; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:09 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

Hi,

When we introduced filters, we did it a bit casually.  Sure, we talked a
lot about them before, but that was mostly discussion about where
implicit filters should be added to the graph (note that we currently
only have two implicit filters, those being mirror and commit).  But in
the end, we really just designated some drivers filters (Quorum,
blkdebug, etc.) and added some specifically (throttle, COR), without
really looking through the block layer to see where issues might occur.

It turns out vast areas of the block layer just don’t know about filters
and cannot really handle them.  Many cases will work in practice; in
others, well, too bad: you cannot use some feature because some part
deep inside the block layer looks at your filters and thinks they are
format nodes.

This is one reason why this series is needed.  Over time (since v1), a
second reason has made its way in:

bs->file is not necessarily the place where a node’s data is stored.
qcow2 now has external data files, and currently there is no way for the
general block layer to know that the data is not stored in bs->file.
Right now, I do not think that has any real consequences (all functions
that need access to the actual data storage file should only do so as a
fallback if the driver does not provide some functionality, but qcow2
should provide it all), but it still shows that we need some way to let
the general block layer know about such data files.  (Also, I will need
this for v1 of my “Inquire images’ rotational info” series.)

I won’t go on and on about this series; I think the patches pretty
much speak for themselves by now.  If the cover letter gets too long,
nobody reads it anyway (see previous versions).


*** This series depends on some others. ***

Dependencies:
- [PATCH 0/4] block: Keep track of parent quiescing
- [PATCH 0/2] vl: Drain before (block) job cancel when quitting
- [PATCH v2 0/2] blockdev: Overlays are not snapshots

Based-on: <20190605161118.14544-1-mreitz@redhat.com>
Based-on: <20190612220839.1374-1-mreitz@redhat.com>
Based-on: <20190603202236.1342-1-mreitz@redhat.com>


v5:
- Split the huge patches 2 and 3 from the previous version into many
  smaller patches to maintain the potential reviewers’ sanity [Vladimir]

- Added support for compressed writes to the COR and throttle filter
  drivers to demonstrate how that looks, because the backup job needs to
  deal with filters that have such support

- Added differentiation between bdrv_storage_child(),
  bdrv_primary_child(), and bdrv_metadata_child()

- A whole lot of things Vladimir has noted

- Made the block jobs really work with filters.  In case of commit and
  stream, this now means that filters go away if they are between top
  and base.  I think that’s OK because it’s the user’s choice to include
  filters or not.  (They can move the filters around if they prefer a
  different result.)
  - This changes the “Add filter commit test cases” patch from checking
    that most things do not work to checking that they do

- Added the “blockdev: Fix active commit choice” patch because it turned
  out this became necessary after I allowed committing through and with
  filters.


Max Reitz (42):
  block: Mark commit and mirror as filter drivers
  copy-on-read: Support compressed writes
  throttle: Support compressed writes
  block: Add child access functions
  block: Add chain helper functions
  qcow2: Implement .bdrv_storage_child()
  block: *filtered_cow_child() for *has_zero_init()
  block: bdrv_set_backing_hd() is about bs->backing
  block: Include filters when freezing backing chain
  block: Use CAF in bdrv_is_encrypted()
  block: Add bdrv_supports_compressed_writes()
  block: Use bdrv_filtered_rw* where obvious
  block: Use CAFs in block status functions
  block: Use CAFs when working with backing chains
  block: Re-evaluate backing file handling in reopen
  block: Use child access functions when flushing
  block: Use CAFs in bdrv_refresh_limits()
  block: Use CAFs in bdrv_refresh_filename()
  block: Use CAF in bdrv_co_rw_vmstate()
  block/snapshot: Fall back to storage child
  block: Use CAFs for debug breakpoints
  block: Use CAFs in bdrv_get_allocated_file_size()
  blockdev: Use CAF in external_snapshot_prepare()
  block: Use child access functions for QAPI queries
  mirror: Deal with filters
  backup: Deal with filters
  commit: Deal with filters
  stream: Deal with filters
  nbd: Use CAF when looking for dirty bitmap
  qemu-img: Use child access functions
  block: Drop backing_bs()
  block: Make bdrv_get_cumulative_perm() public
  blockdev: Fix active commit choice
  block: Inline bdrv_co_block_status_from_*()
  block: Fix check_to_replace_node()
  iotests: Add tests for mirror @replaces loops
  block: Leave BDS.backing_file constant
  iotests: Let complete_and_wait() work with commit
  iotests: Add filter commit test cases
  iotests: Add filter mirror test cases
  iotests: Add test for commit in sub directory
  iotests: Test committing to overridden backing

 qapi/block-core.json          |   4 +
 include/block/block.h         |   2 +
 include/block/block_int.h     | 109 ++++---
 block.c                       | 523 +++++++++++++++++++++++++++++-----
 block/backup.c                |   9 +-
 block/blkdebug.c              |   7 +-
 block/blklogwrites.c          |   1 -
 block/block-backend.c         |  16 +-
 block/commit.c                | 100 +++++--
 block/copy-on-read.c          |  13 +-
 block/io.c                    | 115 ++++----
 block/mirror.c                | 113 ++++++--
 block/qapi.c                  |  42 +--
 block/qcow2.c                 |   9 +
 block/snapshot.c              |  74 +++--
 block/stream.c                |  23 +-
 block/throttle.c              |  11 +-
 blockdev.c                    | 139 +++++++--
 nbd/server.c                  |   6 +-
 qemu-img.c                    |  36 +--
 tests/qemu-iotests/020        |  36 +++
 tests/qemu-iotests/020.out    |  10 +
 tests/qemu-iotests/040        | 238 ++++++++++++++++
 tests/qemu-iotests/040.out    |   4 +-
 tests/qemu-iotests/041        | 270 +++++++++++++++++-
 tests/qemu-iotests/041.out    |   4 +-
 tests/qemu-iotests/184.out    |   7 +-
 tests/qemu-iotests/191.out    |   1 -
 tests/qemu-iotests/204.out    |   1 +
 tests/qemu-iotests/228        |   6 +-
 tests/qemu-iotests/228.out    |   6 +-
 tests/qemu-iotests/245        |   4 +-
 tests/qemu-iotests/iotests.py |  10 +-
 33 files changed, 1610 insertions(+), 339 deletions(-)

-- 
2.21.0




* [Qemu-devel] [PATCH v5 01/42] block: Mark commit and mirror as filter drivers
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
@ 2019-06-12 22:09 ` Max Reitz
  2019-06-13 10:47   ` Vladimir Sementsov-Ogievskiy
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 02/42] copy-on-read: Support compressed writes Max Reitz
                   ` (41 subsequent siblings)
  42 siblings, 1 reply; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:09 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

The commit and mirror block nodes are filters, so they should be marked
as such.  (Strictly speaking, BDS.is_filter's documentation states that
a filter's child must be bs->file.  A later patch in this series, “block:
Add child access functions”, will relax this restriction, however.)

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/commit.c | 2 ++
 block/mirror.c | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/block/commit.c b/block/commit.c
index c815def89a..f20a26fecd 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -256,6 +256,8 @@ static BlockDriver bdrv_commit_top = {
     .bdrv_co_block_status       = bdrv_co_block_status_from_backing,
     .bdrv_refresh_filename      = bdrv_commit_top_refresh_filename,
     .bdrv_child_perm            = bdrv_commit_top_child_perm,
+
+    .is_filter                  = true,
 };
 
 void commit_start(const char *job_id, BlockDriverState *bs,
diff --git a/block/mirror.c b/block/mirror.c
index f8bdb5b21b..4fa8f57c80 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -1480,6 +1480,8 @@ static BlockDriver bdrv_mirror_top = {
     .bdrv_co_block_status       = bdrv_co_block_status_from_backing,
     .bdrv_refresh_filename      = bdrv_mirror_top_refresh_filename,
     .bdrv_child_perm            = bdrv_mirror_top_child_perm,
+
+    .is_filter                  = true,
 };
 
 static void mirror_start_job(const char *job_id, BlockDriverState *bs,
-- 
2.21.0




* [Qemu-devel] [PATCH v5 02/42] copy-on-read: Support compressed writes
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 01/42] block: Mark commit and mirror as filter drivers Max Reitz
@ 2019-06-12 22:09 ` Max Reitz
  2019-06-13 10:49   ` Vladimir Sementsov-Ogievskiy
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 03/42] throttle: " Max Reitz
                   ` (40 subsequent siblings)
  42 siblings, 1 reply; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:09 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/copy-on-read.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/block/copy-on-read.c b/block/copy-on-read.c
index 53972b1da3..88e1c1f538 100644
--- a/block/copy-on-read.c
+++ b/block/copy-on-read.c
@@ -114,6 +114,16 @@ static int coroutine_fn cor_co_pdiscard(BlockDriverState *bs,
 }
 
 
+static int coroutine_fn cor_co_pwritev_compressed(BlockDriverState *bs,
+                                                  uint64_t offset,
+                                                  uint64_t bytes,
+                                                  QEMUIOVector *qiov)
+{
+    return bdrv_co_pwritev(bs->file, offset, bytes, qiov,
+                           BDRV_REQ_WRITE_COMPRESSED);
+}
+
+
 static void cor_eject(BlockDriverState *bs, bool eject_flag)
 {
     bdrv_eject(bs->file->bs, eject_flag);
@@ -146,6 +156,7 @@ static BlockDriver bdrv_copy_on_read = {
     .bdrv_co_pwritev                    = cor_co_pwritev,
     .bdrv_co_pwrite_zeroes              = cor_co_pwrite_zeroes,
     .bdrv_co_pdiscard                   = cor_co_pdiscard,
+    .bdrv_co_pwritev_compressed         = cor_co_pwritev_compressed,
 
     .bdrv_eject                         = cor_eject,
     .bdrv_lock_medium                   = cor_lock_medium,
-- 
2.21.0




* [Qemu-devel] [PATCH v5 03/42] throttle: Support compressed writes
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 01/42] block: Mark commit and mirror as filter drivers Max Reitz
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 02/42] copy-on-read: Support compressed writes Max Reitz
@ 2019-06-12 22:09 ` Max Reitz
  2019-06-13 10:51   ` Vladimir Sementsov-Ogievskiy
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 04/42] block: Add child access functions Max Reitz
                   ` (39 subsequent siblings)
  42 siblings, 1 reply; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:09 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/throttle.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/block/throttle.c b/block/throttle.c
index f64dcc27b9..de1b6bd7e8 100644
--- a/block/throttle.c
+++ b/block/throttle.c
@@ -152,6 +152,15 @@ static int coroutine_fn throttle_co_pdiscard(BlockDriverState *bs,
     return bdrv_co_pdiscard(bs->file, offset, bytes);
 }
 
+static int coroutine_fn throttle_co_pwritev_compressed(BlockDriverState *bs,
+                                                       uint64_t offset,
+                                                       uint64_t bytes,
+                                                       QEMUIOVector *qiov)
+{
+    return throttle_co_pwritev(bs, offset, bytes, qiov,
+                               BDRV_REQ_WRITE_COMPRESSED);
+}
+
 static int throttle_co_flush(BlockDriverState *bs)
 {
     return bdrv_co_flush(bs->file->bs);
@@ -250,6 +259,7 @@ static BlockDriver bdrv_throttle = {
 
     .bdrv_co_pwrite_zeroes              =   throttle_co_pwrite_zeroes,
     .bdrv_co_pdiscard                   =   throttle_co_pdiscard,
+    .bdrv_co_pwritev_compressed         =   throttle_co_pwritev_compressed,
 
     .bdrv_recurse_is_first_non_filter   =   throttle_recurse_is_first_non_filter,
 
-- 
2.21.0




* [Qemu-devel] [PATCH v5 04/42] block: Add child access functions
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
                   ` (2 preceding siblings ...)
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 03/42] throttle: " Max Reitz
@ 2019-06-12 22:09 ` Max Reitz
  2019-06-13 12:15   ` Vladimir Sementsov-Ogievskiy
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 05/42] block: Add chain helper functions Max Reitz
                   ` (38 subsequent siblings)
  42 siblings, 1 reply; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:09 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

There are BDS children that the general block layer code can access,
namely bs->file and bs->backing.  Since the introduction of filters and
external data files, their meaning is not quite clear.  bs->backing can
be a COW source, or it can be an R/W-filtered child; bs->file can be an
R/W-filtered child, it can be data and metadata storage, or it can be
just metadata storage.

This overloading really is not helpful.  This patch adds functions that
retrieve the correct child for each exact purpose.  Later patches in
this series will make use of them.  Doing so will allow us to handle
filter nodes and external data files in a meaningful way.
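
To illustrate the intended split, here is a minimal standalone sketch of
the selection rules.  It is not QEMU code: ToyNode and the toy_*()
functions are hypothetical stand-ins for BlockDriverState and the
accessors added below, and the sketch ignores the bs->drv == NULL case
and the driver-provided storage child.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for BlockDriverState; NOT the real QEMU type. */
typedef struct ToyNode ToyNode;
struct ToyNode {
    bool is_filter;    /* plays the role of bs->drv->is_filter */
    ToyNode *file;     /* plays the role of bs->file */
    ToyNode *backing;  /* plays the role of bs->backing */
};

/* COW source: only non-filter nodes have one, and it is "backing". */
static ToyNode *toy_filtered_cow(const ToyNode *n)
{
    return (n && !n->is_filter) ? n->backing : NULL;
}

/* R/W-filtered child: only filters have one; it is whichever link exists. */
static ToyNode *toy_filtered_rw(const ToyNode *n)
{
    if (!n || !n->is_filter) {
        return NULL;
    }
    /* A filter may use either link, but not both at once. */
    assert(!(n->backing && n->file));
    return n->backing ? n->backing : n->file;
}

/* Metadata child: filters have no metadata; other nodes use "file". */
static ToyNode *toy_metadata(const ToyNode *n)
{
    return (n && !n->is_filter) ? n->file : NULL;
}

/*
 * Graph used for the demo:
 *   filter --file--> fmt --file--> proto
 *                     \--backing--> base
 * Returns 1 if every accessor picks the expected child.
 */
int demo_child_access(void)
{
    ToyNode proto  = { false, NULL, NULL };
    ToyNode base   = { false, NULL, NULL };
    ToyNode fmt    = { false, &proto, &base };
    ToyNode filter = { true, &fmt, NULL };

    return toy_filtered_rw(&filter) == &fmt
        && toy_filtered_cow(&filter) == NULL
        && toy_metadata(&filter) == NULL
        && toy_filtered_cow(&fmt) == &base
        && toy_filtered_rw(&fmt) == NULL
        && toy_metadata(&fmt) == &proto;
}
```

The point of the sketch is that the same two pointers answer different
questions depending on is_filter, which is why callers should ask for
the purpose rather than for a field.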

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 include/block/block_int.h | 57 ++++++++++++++++++++--
 block.c                   | 99 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 153 insertions(+), 3 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index 58fca37ba3..7ce71623f8 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -90,9 +90,11 @@ struct BlockDriver {
     int instance_size;
 
     /* set to true if the BlockDriver is a block filter. Block filters pass
-     * certain callbacks that refer to data (see block.c) to their bs->file if
-     * the driver doesn't implement them. Drivers that do not wish to forward
-     * must implement them and return -ENOTSUP.
+     * certain callbacks that refer to data (see block.c) to their bs->file
+     * or bs->backing (whichever one exists) if the driver doesn't implement
+     * them. Drivers that do not wish to forward must implement them and return
+     * -ENOTSUP.
+     * Note that filters are not allowed to modify data.
      */
     bool is_filter;
     /* for snapshots block filter like Quorum can implement the
@@ -562,6 +564,13 @@ struct BlockDriver {
      * If this pointer is NULL, the array is considered empty.
      * "filename" and "driver" are always considered strong. */
     const char *const *strong_runtime_opts;
+
+    /**
+     * Return the data storage child, if there is exactly one.  If
+     * this function is not implemented, the block layer will assume
+     * bs->file to be this child.
+     */
+    BdrvChild *(*bdrv_storage_child)(BlockDriverState *bs);
 };
 
 typedef struct BlockLimits {
@@ -1249,4 +1258,46 @@ int coroutine_fn bdrv_co_copy_range_to(BdrvChild *src, uint64_t src_offset,
 
 int refresh_total_sectors(BlockDriverState *bs, int64_t hint);
 
+BdrvChild *bdrv_filtered_cow_child(BlockDriverState *bs);
+BdrvChild *bdrv_filtered_rw_child(BlockDriverState *bs);
+BdrvChild *bdrv_filtered_child(BlockDriverState *bs);
+BdrvChild *bdrv_metadata_child(BlockDriverState *bs);
+BdrvChild *bdrv_storage_child(BlockDriverState *bs);
+BdrvChild *bdrv_primary_child(BlockDriverState *bs);
+
+static inline BlockDriverState *child_bs(BdrvChild *child)
+{
+    return child ? child->bs : NULL;
+}
+
+static inline BlockDriverState *bdrv_filtered_cow_bs(BlockDriverState *bs)
+{
+    return child_bs(bdrv_filtered_cow_child(bs));
+}
+
+static inline BlockDriverState *bdrv_filtered_rw_bs(BlockDriverState *bs)
+{
+    return child_bs(bdrv_filtered_rw_child(bs));
+}
+
+static inline BlockDriverState *bdrv_filtered_bs(BlockDriverState *bs)
+{
+    return child_bs(bdrv_filtered_child(bs));
+}
+
+static inline BlockDriverState *bdrv_metadata_bs(BlockDriverState *bs)
+{
+    return child_bs(bdrv_metadata_child(bs));
+}
+
+static inline BlockDriverState *bdrv_storage_bs(BlockDriverState *bs)
+{
+    return child_bs(bdrv_storage_child(bs));
+}
+
+static inline BlockDriverState *bdrv_primary_bs(BlockDriverState *bs)
+{
+    return child_bs(bdrv_primary_child(bs));
+}
+
 #endif /* BLOCK_INT_H */
diff --git a/block.c b/block.c
index 6bc51e371f..724d8889a6 100644
--- a/block.c
+++ b/block.c
@@ -6395,3 +6395,102 @@ bool bdrv_can_store_new_dirty_bitmap(BlockDriverState *bs, const char *name,
 
     return drv->bdrv_can_store_new_dirty_bitmap(bs, name, granularity, errp);
 }
+
+/*
+ * Return the child that @bs acts as an overlay for, and from which data may be
+ * copied in COW or COR operations.  Usually this is the backing file.
+ */
+BdrvChild *bdrv_filtered_cow_child(BlockDriverState *bs)
+{
+    if (!bs || !bs->drv) {
+        return NULL;
+    }
+
+    if (bs->drv->is_filter) {
+        return NULL;
+    }
+
+    return bs->backing;
+}
+
+/*
+ * If @bs acts as a pass-through filter for one of its children,
+ * return that child.  "Pass-through" means that write operations to
+ * @bs are forwarded to that child instead of triggering COW.
+ */
+BdrvChild *bdrv_filtered_rw_child(BlockDriverState *bs)
+{
+    if (!bs || !bs->drv) {
+        return NULL;
+    }
+
+    if (!bs->drv->is_filter) {
+        return NULL;
+    }
+
+    /* Only one of @backing or @file may be used */
+    assert(!(bs->backing && bs->file));
+
+    return bs->backing ?: bs->file;
+}
+
+/*
+ * Return any filtered child, independently of how it reacts to write
+ * accesses and whether data is copied onto this BDS through COR.
+ */
+BdrvChild *bdrv_filtered_child(BlockDriverState *bs)
+{
+    BdrvChild *cow_child = bdrv_filtered_cow_child(bs);
+    BdrvChild *rw_child = bdrv_filtered_rw_child(bs);
+
+    /* There can only be one filtered child at a time */
+    assert(!(cow_child && rw_child));
+
+    return cow_child ?: rw_child;
+}
+
+/*
+ * Return the child that stores the metadata for this node.
+ */
+BdrvChild *bdrv_metadata_child(BlockDriverState *bs)
+{
+    if (!bs || !bs->drv) {
+        return NULL;
+    }
+
+    /* Filters do not have metadata */
+    if (bs->drv->is_filter) {
+        return NULL;
+    }
+
+    return bs->file;
+}
+
+/*
+ * Return the child that stores the data that is allocated on this
+ * node.  This may or may not include metadata.
+ */
+BdrvChild *bdrv_storage_child(BlockDriverState *bs)
+{
+    if (!bs || !bs->drv) {
+        return NULL;
+    }
+
+    if (bs->drv->bdrv_storage_child) {
+        return bs->drv->bdrv_storage_child(bs);
+    }
+
+    return bdrv_filtered_rw_child(bs) ?: bs->file;
+}
+
+/*
+ * Return the primary child of this node: For filters, that is the
+ * filtered child.  For other nodes, that is usually the child storing
+ * metadata.
+ * (A generally more helpful description is that this is (usually) the
+ * child that has the same filename as @bs.)
+ */
+BdrvChild *bdrv_primary_child(BlockDriverState *bs)
+{
+    return bdrv_filtered_rw_child(bs) ?: bs->file;
+}
-- 
2.21.0




* [Qemu-devel] [PATCH v5 05/42] block: Add chain helper functions
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
                   ` (3 preceding siblings ...)
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 04/42] block: Add child access functions Max Reitz
@ 2019-06-12 22:09 ` Max Reitz
  2019-06-13 12:26   ` Vladimir Sementsov-Ogievskiy
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 06/42] qcow2: Implement .bdrv_storage_child() Max Reitz
                   ` (37 subsequent siblings)
  42 siblings, 1 reply; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:09 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

Add some helper functions for skipping filters in a chain of block
nodes.
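
As a rough illustration of what "skipping filters" means, here is a
standalone sketch under simplifying assumptions.  ToyNode and the toy_*()
names are hypothetical, and a single "next" pointer stands in for both
the R/W-filtered child (on filters) and the COW backing child (on other
nodes).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for BlockDriverState; NOT the real QEMU type. */
typedef struct ToyNode ToyNode;
struct ToyNode {
    bool is_filter;  /* node is an R/W filter */
    bool implicit;   /* node was inserted implicitly (e.g. by a job) */
    ToyNode *next;   /* filtered child for filters, COW backing otherwise */
};

/* Walk down through filters; optionally stop at the first explicit one. */
static ToyNode *toy_skip_filters(ToyNode *n, bool stop_on_explicit)
{
    while (n && n->is_filter && n->next) {
        if (stop_on_explicit && !n->implicit) {
            break;
        }
        n = n->next;
    }
    return n;
}

/* First non-filter backing image of the first non-filter image. */
static ToyNode *toy_backing_chain_next(ToyNode *n)
{
    ToyNode *top = toy_skip_filters(n, false);
    ToyNode *cow = (top && !top->is_filter) ? top->next : NULL;

    return toy_skip_filters(cow, false);
}

/*
 * Chain: a (implicit filter) -> b (explicit filter) -> top (format)
 *          -> c (explicit filter) -> base (format)
 * Returns 1 if all helpers land on the expected node.
 */
int demo_chain_helpers(void)
{
    ToyNode base = { false, false, NULL };
    ToyNode c    = { true,  false, &base };
    ToyNode top  = { false, false, &c };
    ToyNode b    = { true,  false, &top };
    ToyNode a    = { true,  true,  &b };

    return toy_skip_filters(&a, true) == &b
        && toy_skip_filters(&a, false) == &top
        && toy_backing_chain_next(&a) == &base
        && toy_backing_chain_next(&base) == NULL;
}
```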

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 include/block/block_int.h |  3 +++
 block.c                   | 55 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 58 insertions(+)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index 7ce71623f8..875a33f255 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -1264,6 +1264,9 @@ BdrvChild *bdrv_filtered_child(BlockDriverState *bs);
 BdrvChild *bdrv_metadata_child(BlockDriverState *bs);
 BdrvChild *bdrv_storage_child(BlockDriverState *bs);
 BdrvChild *bdrv_primary_child(BlockDriverState *bs);
+BlockDriverState *bdrv_skip_implicit_filters(BlockDriverState *bs);
+BlockDriverState *bdrv_skip_rw_filters(BlockDriverState *bs);
+BlockDriverState *bdrv_backing_chain_next(BlockDriverState *bs);
 
 static inline BlockDriverState *child_bs(BdrvChild *child)
 {
diff --git a/block.c b/block.c
index 724d8889a6..be18130944 100644
--- a/block.c
+++ b/block.c
@@ -6494,3 +6494,58 @@ BdrvChild *bdrv_primary_child(BlockDriverState *bs)
 {
     return bdrv_filtered_rw_child(bs) ?: bs->file;
 }
+
+static BlockDriverState *bdrv_skip_filters(BlockDriverState *bs,
+                                           bool stop_on_explicit_filter)
+{
+    BdrvChild *filtered;
+
+    if (!bs) {
+        return NULL;
+    }
+
+    while (!(stop_on_explicit_filter && !bs->implicit)) {
+        filtered = bdrv_filtered_rw_child(bs);
+        if (!filtered) {
+            break;
+        }
+        bs = filtered->bs;
+    }
+    /*
+     * Note that this treats nodes with bs->drv == NULL as not being
+     * R/W filters (bs->drv == NULL should be replaced by something
+     * else anyway).
+     * The advantage of this behavior is that this function will thus
+     * always return a non-NULL value (given a non-NULL @bs).
+     */
+
+    return bs;
+}
+
+/*
+ * Return the first BDS that has not been added implicitly or that
+ * does not have an RW-filtered child down the chain starting from @bs
+ * (including @bs itself).
+ */
+BlockDriverState *bdrv_skip_implicit_filters(BlockDriverState *bs)
+{
+    return bdrv_skip_filters(bs, true);
+}
+
+/*
+ * Return the first BDS that does not have an RW-filtered child down
+ * the chain starting from @bs (including @bs itself).
+ */
+BlockDriverState *bdrv_skip_rw_filters(BlockDriverState *bs)
+{
+    return bdrv_skip_filters(bs, false);
+}
+
+/*
+ * For a backing chain, return the first non-filter backing image of
+ * the first non-filter image.
+ */
+BlockDriverState *bdrv_backing_chain_next(BlockDriverState *bs)
+{
+    return bdrv_skip_rw_filters(bdrv_filtered_cow_bs(bdrv_skip_rw_filters(bs)));
+}
-- 
2.21.0




* [Qemu-devel] [PATCH v5 06/42] qcow2: Implement .bdrv_storage_child()
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
                   ` (4 preceding siblings ...)
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 05/42] block: Add chain helper functions Max Reitz
@ 2019-06-12 22:09 ` Max Reitz
  2019-06-13 12:27   ` Vladimir Sementsov-Ogievskiy
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 07/42] block: *filtered_cow_child() for *has_zero_init() Max Reitz
                   ` (36 subsequent siblings)
  42 siblings, 1 reply; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:09 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/qcow2.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/block/qcow2.c b/block/qcow2.c
index 9396d490d5..57675c9416 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -5085,6 +5085,13 @@ void qcow2_signal_corruption(BlockDriverState *bs, bool fatal, int64_t offset,
     s->signaled_corruption = true;
 }
 
+static BdrvChild *qcow2_storage_child(BlockDriverState *bs)
+{
+    BDRVQcow2State *s = bs->opaque;
+
+    return s->data_file;
+}
+
 static QemuOptsList qcow2_create_opts = {
     .name = "qcow2-create-opts",
     .head = QTAILQ_HEAD_INITIALIZER(qcow2_create_opts.head),
@@ -5231,6 +5238,8 @@ BlockDriver bdrv_qcow2 = {
     .bdrv_reopen_bitmaps_rw = qcow2_reopen_bitmaps_rw,
     .bdrv_can_store_new_dirty_bitmap = qcow2_can_store_new_dirty_bitmap,
     .bdrv_remove_persistent_dirty_bitmap = qcow2_remove_persistent_dirty_bitmap,
+
+    .bdrv_storage_child = qcow2_storage_child,
 };
 
 static void bdrv_qcow2_init(void)
-- 
2.21.0




* [Qemu-devel] [PATCH v5 07/42] block: *filtered_cow_child() for *has_zero_init()
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
                   ` (5 preceding siblings ...)
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 06/42] qcow2: Implement .bdrv_storage_child() Max Reitz
@ 2019-06-12 22:09 ` Max Reitz
  2019-06-13 12:34   ` Vladimir Sementsov-Ogievskiy
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 08/42] block: bdrv_set_backing_hd() is about bs->backing Max Reitz
                   ` (35 subsequent siblings)
  42 siblings, 1 reply; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:09 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

bdrv_has_zero_init() and the related bdrv_unallocated_blocks_are_zero()
should use bdrv_filtered_cow_child() if they want to check whether the
given BDS has a COW backing file.
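
A small standalone sketch of why the plain bs->backing test is too broad.
It assumes a filter that attaches its child through the backing link
(patch 01 suggests commit and mirror do so); ToyNode and both helper
names are hypothetical, not QEMU API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for BlockDriverState; NOT the real QEMU type. */
typedef struct ToyNode {
    bool is_filter;           /* plays the role of bs->drv->is_filter */
    struct ToyNode *backing;  /* plays the role of bs->backing */
} ToyNode;

/* Old test: any backing link counts as a COW backing file. */
static bool toy_old_has_cow_backing(const ToyNode *n)
{
    return n->backing != NULL;
}

/* New test: only non-filter nodes can have a COW backing file. */
static bool toy_new_has_cow_backing(const ToyNode *n)
{
    return !n->is_filter && n->backing != NULL;
}

/* Returns 1 if the two tests differ exactly for the filter node. */
int demo_zero_init_check(void)
{
    ToyNode base    = { false, NULL };
    ToyNode overlay = { false, &base };  /* real COW overlay */
    ToyNode filter  = { true,  &base };  /* filter using the backing link */

    return toy_old_has_cow_backing(&overlay)
        && toy_new_has_cow_backing(&overlay)
        && toy_old_has_cow_backing(&filter)
        && !toy_new_has_cow_backing(&filter);
}
```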

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block.c b/block.c
index be18130944..64d6190984 100644
--- a/block.c
+++ b/block.c
@@ -4933,7 +4933,7 @@ int bdrv_has_zero_init(BlockDriverState *bs)
 
     /* If BS is a copy on write image, it is initialized to
        the contents of the base image, which may not be zeroes.  */
-    if (bs->backing) {
+    if (bdrv_filtered_cow_child(bs)) {
         return 0;
     }
     if (bs->drv->bdrv_has_zero_init) {
@@ -4951,7 +4951,7 @@ bool bdrv_unallocated_blocks_are_zero(BlockDriverState *bs)
 {
     BlockDriverInfo bdi;
 
-    if (bs->backing) {
+    if (bdrv_filtered_cow_child(bs)) {
         return false;
     }
 
-- 
2.21.0




* [Qemu-devel] [PATCH v5 08/42] block: bdrv_set_backing_hd() is about bs->backing
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
                   ` (6 preceding siblings ...)
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 07/42] block: *filtered_cow_child() for *has_zero_init() Max Reitz
@ 2019-06-12 22:09 ` Max Reitz
  2019-06-13 12:40   ` Vladimir Sementsov-Ogievskiy
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 09/42] block: Include filters when freezing backing chain Max Reitz
                   ` (34 subsequent siblings)
  42 siblings, 1 reply; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:09 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

bdrv_set_backing_hd() is a function that explicitly cares about the
bs->backing child.  Highlight that in its description and use
child_bs(bs->backing) instead of backing_bs(bs) to make it more obvious.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block.c b/block.c
index 64d6190984..8438b0699e 100644
--- a/block.c
+++ b/block.c
@@ -2417,7 +2417,7 @@ static bool bdrv_inherits_from_recursive(BlockDriverState *child,
 }
 
 /*
- * Sets the backing file link of a BDS. A new reference is created; callers
+ * Sets the bs->backing link of a BDS. A new reference is created; callers
  * which don't need their own reference any more must call bdrv_unref().
  */
 void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd,
@@ -2426,7 +2426,7 @@ void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd,
     bool update_inherits_from = bdrv_chain_contains(bs, backing_hd) &&
         bdrv_inherits_from_recursive(backing_hd, bs);
 
-    if (bdrv_is_backing_chain_frozen(bs, backing_bs(bs), errp)) {
+    if (bdrv_is_backing_chain_frozen(bs, child_bs(bs->backing), errp)) {
         return;
     }
 
-- 
2.21.0




* [Qemu-devel] [PATCH v5 09/42] block: Include filters when freezing backing chain
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
                   ` (7 preceding siblings ...)
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 08/42] block: bdrv_set_backing_hd() is about bs->backing Max Reitz
@ 2019-06-12 22:09 ` Max Reitz
  2019-06-13 13:04   ` Vladimir Sementsov-Ogievskiy
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 10/42] block: Use CAF in bdrv_is_encrypted() Max Reitz
                   ` (33 subsequent siblings)
  42 siblings, 1 reply; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:09 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

In order to make filters work in backing chains, the associated
functions must be able to deal with them and freeze all filter links, be
they COW or R/W filter links.

While at it, add some comments that note which functions require their
caller to ensure that a given child link is not frozen, and how the
callers do so.
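
The walk can be sketched in isolation as follows.  This is a toy model
under stated assumptions, not the code from the diff: ToyNode, ToyLink
and the toy_*() functions are hypothetical, and the loop carries an
extra NULL guard that the real function does not need because @base
must be reachable from @bs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct ToyNode ToyNode;

/* Toy stand-in for BdrvChild: a link to a node plus a frozen flag. */
typedef struct ToyLink {
    ToyNode *bs;
    bool frozen;
} ToyLink;

/* Toy stand-in for BlockDriverState; NOT the real QEMU type. */
struct ToyNode {
    bool is_filter;
    ToyLink *file;
    ToyLink *backing;
};

/* COW child for normal nodes, R/W-filtered child for filters. */
static ToyLink *toy_filtered_child(const ToyNode *n)
{
    if (!n) {
        return NULL;
    }
    if (n->is_filter) {
        return n->backing ? n->backing : n->file;
    }
    return n->backing;
}

/* True if any filtered link between bs and base is frozen. */
static bool toy_chain_frozen(ToyNode *bs, ToyNode *base)
{
    ToyNode *i;
    ToyLink *child;

    for (i = bs; i && i != base; i = child ? child->bs : NULL) {
        child = toy_filtered_child(i);
        if (child && child->frozen) {
            return true;
        }
    }
    return false;
}

/*
 * Chain: top --backing--> filter --file (frozen)--> base
 * A walk that only followed "backing" links would never see the
 * frozen file link below the filter.
 */
int demo_frozen_chain(void)
{
    ToyNode base           = { false, NULL, NULL };
    ToyLink filter_to_base = { &base, true };
    ToyNode filter         = { true, &filter_to_base, NULL };
    ToyLink top_to_filter  = { &filter, false };
    ToyNode top            = { false, NULL, &top_to_filter };

    bool frozen_seen = toy_chain_frozen(&top, &base);

    filter_to_base.frozen = false;
    bool thawed = !toy_chain_frozen(&top, &base);

    return frozen_seen && thawed;
}
```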

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block.c | 45 ++++++++++++++++++++++++++++++++-------------
 1 file changed, 32 insertions(+), 13 deletions(-)

diff --git a/block.c b/block.c
index 8438b0699e..45882a3470 100644
--- a/block.c
+++ b/block.c
@@ -2214,12 +2214,15 @@ static void bdrv_replace_child_noperm(BdrvChild *child,
  * If @new_bs is not NULL, bdrv_check_perm() must be called beforehand, as this
  * function uses bdrv_set_perm() to update the permissions according to the new
  * reference that @new_bs gets.
+ *
+ * Callers must ensure that child->frozen is false.
  */
 static void bdrv_replace_child(BdrvChild *child, BlockDriverState *new_bs)
 {
     BlockDriverState *old_bs = child->bs;
     uint64_t perm, shared_perm;
 
+    /* Asserts that child->frozen == false */
     bdrv_replace_child_noperm(child, new_bs);
 
     if (old_bs) {
@@ -2360,6 +2363,7 @@ static void bdrv_detach_child(BdrvChild *child)
     g_free(child);
 }
 
+/* Callers must ensure that child->frozen is false. */
 void bdrv_root_unref_child(BdrvChild *child)
 {
     BlockDriverState *child_bs;
@@ -2369,6 +2373,7 @@ void bdrv_root_unref_child(BdrvChild *child)
     bdrv_unref(child_bs);
 }
 
+/* Callers must ensure that child->frozen is false. */
 void bdrv_unref_child(BlockDriverState *parent, BdrvChild *child)
 {
     if (child == NULL) {
@@ -2435,6 +2440,7 @@ void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd,
     }
 
     if (bs->backing) {
+        /* Cannot be frozen, we checked that above */
         bdrv_unref_child(bs, bs->backing);
     }
 
@@ -3908,6 +3914,7 @@ static void bdrv_close(BlockDriverState *bs)
 
     if (bs->drv) {
         if (bs->drv->bdrv_close) {
+            /* Must unfreeze all children, so bdrv_unref_child() works */
             bs->drv->bdrv_close(bs);
         }
         bs->drv = NULL;
@@ -4281,17 +4288,20 @@ BlockDriverState *bdrv_find_base(BlockDriverState *bs)
  * Return true if at least one of the backing links between @bs and
  * @base is frozen. @errp is set if that's the case.
  * @base must be reachable from @bs, or NULL.
+ * (Filters are treated as normal elements of the backing chain.)
  */
 bool bdrv_is_backing_chain_frozen(BlockDriverState *bs, BlockDriverState *base,
                                   Error **errp)
 {
     BlockDriverState *i;
+    BdrvChild *child;
 
-    for (i = bs; i != base; i = backing_bs(i)) {
-        if (i->backing && i->backing->frozen) {
+    for (i = bs; i != base; i = child_bs(child)) {
+        child = bdrv_filtered_child(i);
+
+        if (child && child->frozen) {
             error_setg(errp, "Cannot change '%s' link from '%s' to '%s'",
-                       i->backing->name, i->node_name,
-                       backing_bs(i)->node_name);
+                       child->name, i->node_name, child->bs->node_name);
             return true;
         }
     }
@@ -4305,19 +4315,22 @@ bool bdrv_is_backing_chain_frozen(BlockDriverState *bs, BlockDriverState *base,
  * none of the links are modified.
  * @base must be reachable from @bs, or NULL.
  * Returns 0 on success. On failure returns < 0 and sets @errp.
+ * (Filters are treated as normal elements of the backing chain.)
  */
 int bdrv_freeze_backing_chain(BlockDriverState *bs, BlockDriverState *base,
                               Error **errp)
 {
     BlockDriverState *i;
+    BdrvChild *child;
 
     if (bdrv_is_backing_chain_frozen(bs, base, errp)) {
         return -EPERM;
     }
 
-    for (i = bs; i != base; i = backing_bs(i)) {
-        if (i->backing) {
-            i->backing->frozen = true;
+    for (i = bs; i != base; i = child_bs(child)) {
+        child = bdrv_filtered_child(i);
+        if (child) {
+            child->frozen = true;
         }
     }
 
@@ -4328,15 +4341,18 @@ int bdrv_freeze_backing_chain(BlockDriverState *bs, BlockDriverState *base,
  * Unfreeze all backing links between @bs and @base. The caller must
  * ensure that all links are frozen before using this function.
  * @base must be reachable from @bs, or NULL.
+ * (Filters are treated as normal elements of the backing chain.)
  */
 void bdrv_unfreeze_backing_chain(BlockDriverState *bs, BlockDriverState *base)
 {
     BlockDriverState *i;
+    BdrvChild *child;
 
-    for (i = bs; i != base; i = backing_bs(i)) {
-        if (i->backing) {
-            assert(i->backing->frozen);
-            i->backing->frozen = false;
+    for (i = bs; i != base; i = child_bs(child)) {
+        child = bdrv_filtered_child(i);
+        if (child) {
+            assert(child->frozen);
+            child->frozen = false;
         }
     }
 }
@@ -4438,8 +4454,11 @@ int bdrv_drop_intermediate(BlockDriverState *top, BlockDriverState *base,
             }
         }
 
-        /* Do the actual switch in the in-memory graph.
-         * Completes bdrv_check_update_perm() transaction internally. */
+        /*
+         * Do the actual switch in the in-memory graph.
+         * Completes bdrv_check_update_perm() transaction internally.
+         * c->frozen is false, we have checked that above.
+         */
         bdrv_ref(base);
         bdrv_replace_child(c, base);
         bdrv_unref(top);
-- 
2.21.0




* [Qemu-devel] [PATCH v5 10/42] block: Use CAF in bdrv_is_encrypted()
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
                   ` (8 preceding siblings ...)
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 09/42] block: Include filters when freezing backing chain Max Reitz
@ 2019-06-12 22:09 ` Max Reitz
  2019-06-13 13:16   ` Vladimir Sementsov-Ogievskiy
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 11/42] block: Add bdrv_supports_compressed_writes() Max Reitz
                   ` (32 subsequent siblings)
  42 siblings, 1 reply; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:09 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

bdrv_is_encrypted() should not only check the BDS's backing child, but
any filtered child: If a filter's child is encrypted, the filter node
itself naturally is encrypted, too.  Furthermore, we need to recurse
down the chain.

(CAF means child access function.)
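
The recursion described above can be sketched on a toy model.  The Node
type and node_is_encrypted() are hypothetical names for illustration
only; the sketch assumes each node has at most one filtered child (COW
or R/W).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for BlockDriverState. */
typedef struct Node {
    bool encrypted;          /* this node's own driver encrypts its data */
    struct Node *filtered;   /* COW or R/W filtered child, or NULL */
} Node;

/* A node counts as encrypted if it or anything it filters is encrypted. */
static bool node_is_encrypted(const Node *n)
{
    if (n->encrypted) {
        return true;
    }
    return n->filtered != NULL && node_is_encrypted(n->filtered);
}

/* overlay -> throttle (filter) -> luks; plain has no children. */
static bool encrypted_demo(void)
{
    Node luks = { true, NULL };
    Node throttle = { false, &luks };
    Node overlay = { false, &throttle };
    Node plain = { false, NULL };

    return node_is_encrypted(&throttle) &&
           node_is_encrypted(&overlay) &&
           !node_is_encrypted(&plain);
}
```

In this model, a check that only looks one level down would miss the
encrypted node below the filter when asked about the overlay.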

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/block.c b/block.c
index 45882a3470..567a0f82c8 100644
--- a/block.c
+++ b/block.c
@@ -4574,10 +4574,14 @@ bool bdrv_is_sg(BlockDriverState *bs)
 
 bool bdrv_is_encrypted(BlockDriverState *bs)
 {
-    if (bs->backing && bs->backing->bs->encrypted) {
+    BlockDriverState *filtered = bdrv_filtered_bs(bs);
+    if (bs->encrypted) {
         return true;
     }
-    return bs->encrypted;
+    if (filtered && bdrv_is_encrypted(filtered)) {
+        return true;
+    }
+    return false;
 }
 
 const char *bdrv_get_format_name(BlockDriverState *bs)
-- 
2.21.0




* [Qemu-devel] [PATCH v5 11/42] block: Add bdrv_supports_compressed_writes()
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
                   ` (9 preceding siblings ...)
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 10/42] block: Use CAF in bdrv_is_encrypted() Max Reitz
@ 2019-06-12 22:09 ` Max Reitz
  2019-06-13 13:29   ` Vladimir Sementsov-Ogievskiy
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 12/42] block: Use bdrv_filtered_rw* where obvious Max Reitz
                   ` (31 subsequent siblings)
  42 siblings, 1 reply; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:09 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

Filters cannot compress data themselves, but they still have to implement
.bdrv_co_pwritev_compressed(), or they cannot forward compressed
writes.  Therefore, checking whether
bs->drv->bdrv_co_pwritev_compressed is non-NULL is not sufficient to
know whether the node can actually handle compressed writes.  This
function looks down the filter chain to see whether there is a
non-filter that can actually convert the compressed writes into
compressed data (and thus normal writes).
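
A minimal sketch of that check on a toy model follows.  The Node type
and its fields are hypothetical; has_compressed_cb stands for "the
driver provides a compressed-write callback", and filtered_rw is
assumed to be non-NULL only for filter nodes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for BlockDriverState. */
typedef struct Node {
    bool has_compressed_cb;     /* driver provides a compressed-write callback */
    struct Node *filtered_rw;   /* R/W filtered child; non-NULL only for filters */
} Node;

static bool supports_compressed_writes(const Node *n)
{
    if (!n->has_compressed_cb) {
        return false;
    }
    if (n->filtered_rw) {
        /* A filter can only forward the write, so ask its child. */
        return supports_compressed_writes(n->filtered_rw);
    }
    return true;
}

static bool compressed_demo(void)
{
    Node format = { true, NULL };              /* can compress by itself */
    Node protocol = { false, NULL };           /* cannot compress */
    Node filter_on_format = { true, &format };
    Node filter_on_protocol = { true, &protocol };

    return supports_compressed_writes(&format) &&
           supports_compressed_writes(&filter_on_format) &&
           !supports_compressed_writes(&protocol) &&
           !supports_compressed_writes(&filter_on_protocol);
}
```

The last case is the one a plain non-NULL check would get wrong: the
filter has the callback, but nothing below it can compress.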

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 include/block/block.h |  1 +
 block.c               | 22 ++++++++++++++++++++++
 2 files changed, 23 insertions(+)

diff --git a/include/block/block.h b/include/block/block.h
index 687c03b275..7835c5b370 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -487,6 +487,7 @@ void bdrv_next_cleanup(BdrvNextIterator *it);
 
 BlockDriverState *bdrv_next_monitor_owned(BlockDriverState *bs);
 bool bdrv_is_encrypted(BlockDriverState *bs);
+bool bdrv_supports_compressed_writes(BlockDriverState *bs);
 void bdrv_iterate_format(void (*it)(void *opaque, const char *name),
                          void *opaque, bool read_only);
 const char *bdrv_get_node_name(const BlockDriverState *bs);
diff --git a/block.c b/block.c
index 567a0f82c8..97774b7b06 100644
--- a/block.c
+++ b/block.c
@@ -4584,6 +4584,28 @@ bool bdrv_is_encrypted(BlockDriverState *bs)
     return false;
 }
 
+/**
+ * Return whether the given node supports compressed writes.
+ */
+bool bdrv_supports_compressed_writes(BlockDriverState *bs)
+{
+    BlockDriverState *filtered = bdrv_filtered_rw_bs(bs);
+
+    if (!bs->drv || !bs->drv->bdrv_co_pwritev_compressed) {
+        return false;
+    }
+
+    if (filtered) {
+        /*
+         * Filters can only forward compressed writes, so we have to
+         * check the child.
+         */
+        return bdrv_supports_compressed_writes(filtered);
+    }
+
+    return true;
+}
+
 const char *bdrv_get_format_name(BlockDriverState *bs)
 {
     return bs->drv ? bs->drv->format_name : NULL;
-- 
2.21.0




* [Qemu-devel] [PATCH v5 12/42] block: Use bdrv_filtered_rw* where obvious
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
                   ` (10 preceding siblings ...)
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 11/42] block: Add bdrv_supports_compressed_writes() Max Reitz
@ 2019-06-12 22:09 ` Max Reitz
  2019-06-13 13:37   ` Vladimir Sementsov-Ogievskiy
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 13/42] block: Use CAFs in block status functions Max Reitz
                   ` (30 subsequent siblings)
  42 siblings, 1 reply; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:09 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

Places that use patterns like

    if (bs->drv->is_filter && bs->file) {
        ... something about bs->file->bs ...
    }

should be

    BlockDriverState *filtered = bdrv_filtered_rw_bs(bs);
    if (filtered) {
        ... something about @filtered ...
    }

instead.
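
For illustration, the resulting fallback order (driver callback first,
then the filtered child, then -ENOTSUP) can be sketched on a toy model.
Node, ProbeFn and probe_block_size() are hypothetical names, and
filtered_rw is assumed to be non-NULL only for filter nodes.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for BlockDriverState and a driver callback. */
typedef struct Node Node;
typedef int (*ProbeFn)(Node *n, int *block_size);

struct Node {
    ProbeFn probe;       /* optional driver callback */
    Node *filtered_rw;   /* R/W filtered child; non-NULL only for filters */
    int block_size;      /* value the demo callback reports */
};

static int probe_block_size(Node *n, int *block_size)
{
    if (n->probe) {
        return n->probe(n, block_size);
    } else if (n->filtered_rw) {
        /* Filters without their own callback defer to their child. */
        return probe_block_size(n->filtered_rw, block_size);
    }
    return -ENOTSUP;
}

/* Demo callback: report the node's configured block size. */
static int demo_probe(Node *n, int *block_size)
{
    *block_size = n->block_size;
    return 0;
}

/* Probe through a filter; returns the block size found, or -1. */
static int probe_demo(void)
{
    Node disk = { demo_probe, NULL, 4096 };
    Node filter = { NULL, &disk, 0 };
    int size = 0;

    if (probe_block_size(&filter, &size) != 0) {
        return -1;
    }
    return size;
}

/* Probe a node that has neither a callback nor a filtered child. */
static int probe_demo_unsupported(void)
{
    Node leaf = { NULL, NULL, 0 };
    int size = 0;

    return probe_block_size(&leaf, &size);
}
```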

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block.c    | 23 +++++++++++++++--------
 block/io.c |  5 +++--
 2 files changed, 18 insertions(+), 10 deletions(-)

diff --git a/block.c b/block.c
index 97774b7b06..11f37983d9 100644
--- a/block.c
+++ b/block.c
@@ -556,11 +556,12 @@ int bdrv_create_file(const char *filename, QemuOpts *opts, Error **errp)
 int bdrv_probe_blocksizes(BlockDriverState *bs, BlockSizes *bsz)
 {
     BlockDriver *drv = bs->drv;
+    BlockDriverState *filtered = bdrv_filtered_rw_bs(bs);
 
     if (drv && drv->bdrv_probe_blocksizes) {
         return drv->bdrv_probe_blocksizes(bs, bsz);
-    } else if (drv && drv->is_filter && bs->file) {
-        return bdrv_probe_blocksizes(bs->file->bs, bsz);
+    } else if (filtered) {
+        return bdrv_probe_blocksizes(filtered, bsz);
     }
 
     return -ENOTSUP;
@@ -575,11 +576,12 @@ int bdrv_probe_blocksizes(BlockDriverState *bs, BlockSizes *bsz)
 int bdrv_probe_geometry(BlockDriverState *bs, HDGeometry *geo)
 {
     BlockDriver *drv = bs->drv;
+    BlockDriverState *filtered = bdrv_filtered_rw_bs(bs);
 
     if (drv && drv->bdrv_probe_geometry) {
         return drv->bdrv_probe_geometry(bs, geo);
-    } else if (drv && drv->is_filter && bs->file) {
-        return bdrv_probe_geometry(bs->file->bs, geo);
+    } else if (filtered) {
+        return bdrv_probe_geometry(filtered, geo);
     }
 
     return -ENOTSUP;
@@ -4972,6 +4974,8 @@ int bdrv_has_zero_init_1(BlockDriverState *bs)
 
 int bdrv_has_zero_init(BlockDriverState *bs)
 {
+    BlockDriverState *filtered;
+
     if (!bs->drv) {
         return 0;
     }
@@ -4984,8 +4988,10 @@ int bdrv_has_zero_init(BlockDriverState *bs)
     if (bs->drv->bdrv_has_zero_init) {
         return bs->drv->bdrv_has_zero_init(bs);
     }
-    if (bs->file && bs->drv->is_filter) {
-        return bdrv_has_zero_init(bs->file->bs);
+
+    filtered = bdrv_filtered_rw_bs(bs);
+    if (filtered) {
+        return bdrv_has_zero_init(filtered);
     }
 
     /* safe default */
@@ -5030,8 +5036,9 @@ int bdrv_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
         return -ENOMEDIUM;
     }
     if (!drv->bdrv_get_info) {
-        if (bs->file && drv->is_filter) {
-            return bdrv_get_info(bs->file->bs, bdi);
+        BlockDriverState *filtered = bdrv_filtered_rw_bs(bs);
+        if (filtered) {
+            return bdrv_get_info(filtered, bdi);
         }
         return -ENOTSUP;
     }
diff --git a/block/io.c b/block/io.c
index 2408abffd9..73ade04834 100644
--- a/block/io.c
+++ b/block/io.c
@@ -3147,8 +3147,9 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, int64_t offset,
     }
 
     if (!drv->bdrv_co_truncate) {
-        if (bs->file && drv->is_filter) {
-            ret = bdrv_co_truncate(bs->file, offset, prealloc, errp);
+        BdrvChild *filtered = bdrv_filtered_rw_child(bs);
+        if (filtered) {
+            ret = bdrv_co_truncate(filtered, offset, prealloc, errp);
             goto out;
         }
         error_setg(errp, "Image format driver does not support resize");
-- 
2.21.0




* [Qemu-devel] [PATCH v5 13/42] block: Use CAFs in block status functions
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
                   ` (11 preceding siblings ...)
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 12/42] block: Use bdrv_filtered_rw* where obvious Max Reitz
@ 2019-06-12 22:09 ` Max Reitz
  2019-06-14 12:07   ` Vladimir Sementsov-Ogievskiy
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 14/42] block: Use CAFs when working with backing chains Max Reitz
                   ` (29 subsequent siblings)
  42 siblings, 1 reply; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:09 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

Use the child access functions in the block status inquiry functions as
appropriate.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/io.c | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/block/io.c b/block/io.c
index 73ade04834..53aabf86b5 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2150,11 +2150,12 @@ static int coroutine_fn bdrv_co_block_status(BlockDriverState *bs,
     if (ret & (BDRV_BLOCK_DATA | BDRV_BLOCK_ZERO)) {
         ret |= BDRV_BLOCK_ALLOCATED;
     } else if (want_zero) {
+        BlockDriverState *cow_bs = bdrv_filtered_cow_bs(bs);
+
         if (bdrv_unallocated_blocks_are_zero(bs)) {
             ret |= BDRV_BLOCK_ZERO;
-        } else if (bs->backing) {
-            BlockDriverState *bs2 = bs->backing->bs;
-            int64_t size2 = bdrv_getlength(bs2);
+        } else if (cow_bs) {
+            int64_t size2 = bdrv_getlength(cow_bs);
 
             if (size2 >= 0 && offset >= size2) {
                 ret |= BDRV_BLOCK_ZERO;
@@ -2220,7 +2221,7 @@ static int coroutine_fn bdrv_co_block_status_above(BlockDriverState *bs,
     bool first = true;
 
     assert(bs != base);
-    for (p = bs; p != base; p = backing_bs(p)) {
+    for (p = bs; p != base; p = bdrv_filtered_bs(p)) {
         ret = bdrv_co_block_status(p, want_zero, offset, bytes, pnum, map,
                                    file);
         if (ret < 0) {
@@ -2306,7 +2307,7 @@ int bdrv_block_status_above(BlockDriverState *bs, BlockDriverState *base,
 int bdrv_block_status(BlockDriverState *bs, int64_t offset, int64_t bytes,
                       int64_t *pnum, int64_t *map, BlockDriverState **file)
 {
-    return bdrv_block_status_above(bs, backing_bs(bs),
+    return bdrv_block_status_above(bs, bdrv_filtered_bs(bs),
                                    offset, bytes, pnum, map, file);
 }
 
@@ -2316,9 +2317,9 @@ int coroutine_fn bdrv_is_allocated(BlockDriverState *bs, int64_t offset,
     int ret;
     int64_t dummy;
 
-    ret = bdrv_common_block_status_above(bs, backing_bs(bs), false, offset,
-                                         bytes, pnum ? pnum : &dummy, NULL,
-                                         NULL);
+    ret = bdrv_common_block_status_above(bs, bdrv_filtered_bs(bs), false,
+                                         offset, bytes, pnum ? pnum : &dummy,
+                                         NULL, NULL);
     if (ret < 0) {
         return ret;
     }
@@ -2372,7 +2373,7 @@ int bdrv_is_allocated_above(BlockDriverState *top,
             n = pnum_inter;
         }
 
-        intermediate = backing_bs(intermediate);
+        intermediate = bdrv_filtered_bs(intermediate);
     }
 
     *pnum = n;
-- 
2.21.0




* [Qemu-devel] [PATCH v5 14/42] block: Use CAFs when working with backing chains
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
                   ` (12 preceding siblings ...)
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 13/42] block: Use CAFs in block status functions Max Reitz
@ 2019-06-12 22:09 ` Max Reitz
  2019-06-14 13:26   ` Vladimir Sementsov-Ogievskiy
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 15/42] block: Re-evaluate backing file handling in reopen Max Reitz
                   ` (28 subsequent siblings)
  42 siblings, 1 reply; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:09 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

Use child access functions when iterating through backing chains so
filters do not break the chain.
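
A rough sketch of such a filter-aware overlay search, on a toy model,
might look as follows.  The Node type is hypothetical, and the
semantics given to skip_rw_filters() and backing_chain_next() here are
assumptions made for the example ("first non-filter node at or below
@n" and "next non-filter node below @n"), not a copy of the real
helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for BlockDriverState. */
typedef struct Node {
    bool is_filter;       /* R/W filter node (e.g. throttle) */
    struct Node *child;   /* R/W child for filters, COW backing otherwise */
} Node;

/* Assumed semantics: first non-filter node at or below @n. */
static Node *skip_rw_filters(Node *n)
{
    while (n && n->is_filter) {
        n = n->child;
    }
    return n;
}

/* Assumed semantics: next non-filter node strictly below @n. */
static Node *backing_chain_next(Node *n)
{
    n = skip_rw_filters(n);
    return n ? skip_rw_filters(n->child) : NULL;
}

/* Find the non-filter node whose backing chain continues with @bs. */
static Node *find_overlay(Node *active, Node *bs)
{
    bs = skip_rw_filters(bs);
    active = skip_rw_filters(active);

    while (active) {
        Node *next = backing_chain_next(active);
        if (bs == next) {
            return active;
        }
        active = next;
    }
    return NULL;
}

/* Chain: top -> filter -> base; unrelated is not part of the chain. */
static bool overlay_demo(void)
{
    Node base = { false, NULL };
    Node filter = { true, &base };
    Node top = { false, &filter };
    Node unrelated = { false, NULL };

    return find_overlay(&top, &base) == &top &&
           find_overlay(&top, &filter) == &top &&
           find_overlay(&top, &unrelated) == NULL;
}
```

With a walk that only compares direct backing children, the filter
between top and base would hide base from the search.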

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block.c | 40 ++++++++++++++++++++++++++++------------
 1 file changed, 28 insertions(+), 12 deletions(-)

diff --git a/block.c b/block.c
index 11f37983d9..505b3e9a01 100644
--- a/block.c
+++ b/block.c
@@ -4261,7 +4261,8 @@ int bdrv_change_backing_file(BlockDriverState *bs,
 }
 
 /*
- * Finds the image layer in the chain that has 'bs' as its backing file.
+ * Finds the image layer in the chain that has 'bs' (or a filter on
+ * top of it) as its backing file.
  *
  * active is the current topmost image.
  *
@@ -4273,11 +4274,18 @@ int bdrv_change_backing_file(BlockDriverState *bs,
 BlockDriverState *bdrv_find_overlay(BlockDriverState *active,
                                     BlockDriverState *bs)
 {
-    while (active && bs != backing_bs(active)) {
-        active = backing_bs(active);
+    bs = bdrv_skip_rw_filters(bs);
+    active = bdrv_skip_rw_filters(active);
+
+    while (active) {
+        BlockDriverState *next = bdrv_backing_chain_next(active);
+        if (bs == next) {
+            return active;
+        }
+        active = next;
     }
 
-    return active;
+    return NULL;
 }
 
 /* Given a BDS, searches for the base layer. */
@@ -4421,9 +4429,7 @@ int bdrv_drop_intermediate(BlockDriverState *top, BlockDriverState *base,
      * other intermediate nodes have been dropped.
      * If 'top' is an implicit node (e.g. "commit_top") we should skip
      * it because no one inherits from it. We use explicit_top for that. */
-    while (explicit_top && explicit_top->implicit) {
-        explicit_top = backing_bs(explicit_top);
-    }
+    explicit_top = bdrv_skip_implicit_filters(explicit_top);
     update_inherits_from = bdrv_inherits_from_recursive(base, explicit_top);
 
     /* success - we can delete the intermediate states, and link top->base */
@@ -4902,7 +4908,7 @@ BlockDriverState *bdrv_lookup_bs(const char *device,
 bool bdrv_chain_contains(BlockDriverState *top, BlockDriverState *base)
 {
     while (top && top != base) {
-        top = backing_bs(top);
+        top = bdrv_filtered_bs(top);
     }
 
     return top != NULL;
@@ -5141,7 +5147,17 @@ BlockDriverState *bdrv_find_backing_image(BlockDriverState *bs,
 
     is_protocol = path_has_protocol(backing_file);
 
-    for (curr_bs = bs; curr_bs->backing; curr_bs = curr_bs->backing->bs) {
+    /*
+     * Being largely a legacy function, skip any filters here
+     * (because filters do not have normal filenames, so they cannot
+     * match anyway; and allowing json:{} filenames is a bit out of
+     * scope).
+     */
+    for (curr_bs = bdrv_skip_rw_filters(bs);
+         bdrv_filtered_cow_child(curr_bs) != NULL;
+         curr_bs = bdrv_backing_chain_next(curr_bs))
+    {
+        BlockDriverState *bs_below = bdrv_backing_chain_next(curr_bs);
 
         /* If either of the filename paths is actually a protocol, then
          * compare unmodified paths; otherwise make paths relative */
@@ -5149,7 +5165,7 @@ BlockDriverState *bdrv_find_backing_image(BlockDriverState *bs,
             char *backing_file_full_ret;
 
             if (strcmp(backing_file, curr_bs->backing_file) == 0) {
-                retval = curr_bs->backing->bs;
+                retval = bs_below;
                 break;
             }
             /* Also check against the full backing filename for the image */
@@ -5159,7 +5175,7 @@ BlockDriverState *bdrv_find_backing_image(BlockDriverState *bs,
                 bool equal = strcmp(backing_file, backing_file_full_ret) == 0;
                 g_free(backing_file_full_ret);
                 if (equal) {
-                    retval = curr_bs->backing->bs;
+                    retval = bs_below;
                     break;
                 }
             }
@@ -5185,7 +5201,7 @@ BlockDriverState *bdrv_find_backing_image(BlockDriverState *bs,
             g_free(filename_tmp);
 
             if (strcmp(backing_file_full, filename_full) == 0) {
-                retval = curr_bs->backing->bs;
+                retval = bs_below;
                 break;
             }
         }
-- 
2.21.0




* [Qemu-devel] [PATCH v5 15/42] block: Re-evaluate backing file handling in reopen
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
                   ` (13 preceding siblings ...)
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 14/42] block: Use CAFs when working with backing chains Max Reitz
@ 2019-06-12 22:09 ` Max Reitz
  2019-06-14 13:42   ` Vladimir Sementsov-Ogievskiy
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 16/42] block: Use child access functions when flushing Max Reitz
                   ` (27 subsequent siblings)
  42 siblings, 1 reply; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:09 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

Reopening a node's backing child needs a bit of special handling because
the "backing" child has different defaults than all other children
(among other things).  Adding filter support here is a bit more
difficult than just using the child access functions.  In fact, we often
have to directly use bs->backing because these functions are about the
"backing" child (which may or may not be the COW backing file).

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block.c | 36 +++++++++++++++++++++++++++++-------
 1 file changed, 29 insertions(+), 7 deletions(-)

diff --git a/block.c b/block.c
index 505b3e9a01..db2759c10d 100644
--- a/block.c
+++ b/block.c
@@ -3542,17 +3542,39 @@ static int bdrv_reopen_parse_backing(BDRVReopenState *reopen_state,
         }
     }
 
+    /*
+     * Ensure that @bs can really handle backing files, because we are
+     * about to give it one (or swap the existing one)
+     */
+    if (bs->drv->is_filter) {
+        /* Filters always have a file or a backing child */
+        if (!bs->backing) {
+            error_setg(errp, "'%s' is a %s filter node that does not support a "
+                       "backing child", bs->node_name, bs->drv->format_name);
+            return -EINVAL;
+        }
+    } else if (!bs->drv->supports_backing) {
+        error_setg(errp, "Driver '%s' of node '%s' does not support backing "
+                   "files", bs->drv->format_name, bs->node_name);
+        return -EINVAL;
+    }
+
     /*
      * Find the "actual" backing file by skipping all links that point
      * to an implicit node, if any (e.g. a commit filter node).
+     * We cannot use any of the bdrv_skip_*() functions here because
+     * those return the first explicit node, while we are looking for
+     * its overlay here.
      */
     overlay_bs = bs;
-    while (backing_bs(overlay_bs) && backing_bs(overlay_bs)->implicit) {
-        overlay_bs = backing_bs(overlay_bs);
+    while (bdrv_filtered_bs(overlay_bs) &&
+           bdrv_filtered_bs(overlay_bs)->implicit)
+    {
+        overlay_bs = bdrv_filtered_bs(overlay_bs);
     }
 
     /* If we want to replace the backing file we need some extra checks */
-    if (new_backing_bs != backing_bs(overlay_bs)) {
+    if (new_backing_bs != bdrv_filtered_bs(overlay_bs)) {
         /* Check for implicit nodes between bs and its backing file */
         if (bs != overlay_bs) {
             error_setg(errp, "Cannot change backing link if '%s' has "
@@ -3560,8 +3582,8 @@ static int bdrv_reopen_parse_backing(BDRVReopenState *reopen_state,
             return -EPERM;
         }
         /* Check if the backing link that we want to replace is frozen */
-        if (bdrv_is_backing_chain_frozen(overlay_bs, backing_bs(overlay_bs),
-                                         errp)) {
+        if (bdrv_is_backing_chain_frozen(overlay_bs,
+                                         child_bs(overlay_bs->backing), errp)) {
             return -EPERM;
         }
         reopen_state->replace_backing_bs = true;
@@ -3712,7 +3734,7 @@ int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue *queue,
      * its metadata. Otherwise the 'backing' option can be omitted.
      */
     if (drv->supports_backing && reopen_state->backing_missing &&
-        (backing_bs(reopen_state->bs) || reopen_state->bs->backing_file[0])) {
+        (reopen_state->bs->backing || reopen_state->bs->backing_file[0])) {
         error_setg(errp, "backing is missing for '%s'",
                    reopen_state->bs->node_name);
         ret = -EINVAL;
@@ -3857,7 +3879,7 @@ void bdrv_reopen_commit(BDRVReopenState *reopen_state)
      * from bdrv_set_backing_hd()) has the new values.
      */
     if (reopen_state->replace_backing_bs) {
-        BlockDriverState *old_backing_bs = backing_bs(bs);
+        BlockDriverState *old_backing_bs = child_bs(bs->backing);
         assert(!old_backing_bs || !old_backing_bs->implicit);
         /* Abort the permission update on the backing bs we're detaching */
         if (old_backing_bs) {
-- 
2.21.0




* [Qemu-devel] [PATCH v5 16/42] block: Use child access functions when flushing
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
                   ` (14 preceding siblings ...)
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 15/42] block: Re-evaluate backing file handling in reopen Max Reitz
@ 2019-06-12 22:09 ` Max Reitz
  2019-06-14 14:01   ` Vladimir Sementsov-Ogievskiy
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 17/42] block: Use CAFs in bdrv_refresh_limits() Max Reitz
                   ` (26 subsequent siblings)
  42 siblings, 1 reply; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:09 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

If the driver does not implement .bdrv_co_flush(), bdrv_co_flush()
itself has to flush the children of the given node.  In that case, it
should not flush just bs->file->bs, but both the child that stores data
and the one that stores metadata (if they are separate).

In any case, the BLKDBG_EVENT() should be emitted on the primary child,
because that is where a blkdebug node would be if there is any.
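
The child-flushing rule can be sketched on a toy model: flush the
storage child, flush the metadata child only if it is a different
node, and report the first error.  The Node type, its fields and both
functions are hypothetical names for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for BlockDriverState. */
typedef struct Node {
    struct Node *storage;    /* child holding guest data, or NULL */
    struct Node *metadata;   /* child holding metadata, or NULL */
    int flush_result;        /* what a flush of this node itself reports */
    int flush_count;         /* how often this node has been flushed */
} Node;

static int flush_node(Node *n);

/* Flush both children, each only once; the first error wins. */
static int flush_children(Node *n)
{
    int ret = 0;

    if (n->storage) {
        ret = flush_node(n->storage);
    }
    if (n->metadata && n->metadata != n->storage) {
        int ret_metadata = flush_node(n->metadata);
        if (!ret) {
            ret = ret_metadata;
        }
    }
    return ret;
}

static int flush_node(Node *n)
{
    n->flush_count++;
    if (n->flush_result < 0) {
        return n->flush_result;
    }
    return flush_children(n);
}

/* Data and metadata live in the same child: it is flushed once. */
static int flush_demo_shared(void)
{
    Node file = { NULL, NULL, 0, 0 };
    Node format = { &file, &file, 0, 0 };

    flush_node(&format);
    return file.flush_count;
}

/* Separate children: a failing data flush still flushes the metadata. */
static int flush_demo_error(void)
{
    Node data = { NULL, NULL, -EIO, 0 };
    Node meta = { NULL, NULL, 0, 0 };
    Node format = { &data, &meta, 0, 0 };
    int ret = flush_node(&format);

    return meta.flush_count == 1 ? ret : 0;
}
```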

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/io.c | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/block/io.c b/block/io.c
index 53aabf86b5..64408cf19a 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2533,6 +2533,8 @@ static void coroutine_fn bdrv_flush_co_entry(void *opaque)
 
 int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
 {
+    BdrvChild *primary_child = bdrv_primary_child(bs);
+    BlockDriverState *storage_bs, *metadata_bs;
     int current_gen;
     int ret = 0;
 
@@ -2562,7 +2564,7 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
     }
 
     /* Write back cached data to the OS even with cache=unsafe */
-    BLKDBG_EVENT(bs->file, BLKDBG_FLUSH_TO_OS);
+    BLKDBG_EVENT(primary_child, BLKDBG_FLUSH_TO_OS);
     if (bs->drv->bdrv_co_flush_to_os) {
         ret = bs->drv->bdrv_co_flush_to_os(bs);
         if (ret < 0) {
@@ -2580,7 +2582,7 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
         goto flush_parent;
     }
 
-    BLKDBG_EVENT(bs->file, BLKDBG_FLUSH_TO_DISK);
+    BLKDBG_EVENT(primary_child, BLKDBG_FLUSH_TO_DISK);
     if (!bs->drv) {
         /* bs->drv->bdrv_co_flush() might have ejected the BDS
          * (even in case of apparent success) */
@@ -2625,7 +2627,20 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
      * in the case of cache=unsafe, so there are no useless flushes.
      */
 flush_parent:
-    ret = bs->file ? bdrv_co_flush(bs->file->bs) : 0;
+    storage_bs = bdrv_storage_bs(bs);
+    metadata_bs = bdrv_metadata_bs(bs);
+
+    ret = 0;
+    if (storage_bs) {
+        ret = bdrv_co_flush(storage_bs);
+    }
+    if (metadata_bs && metadata_bs != storage_bs) {
+        int ret_metadata = bdrv_co_flush(metadata_bs);
+        if (!ret) {
+            ret = ret_metadata;
+        }
+    }
+
 out:
     /* Notify any pending flushes that we have completed */
     if (ret == 0) {
-- 
2.21.0




* [Qemu-devel] [PATCH v5 17/42] block: Use CAFs in bdrv_refresh_limits()
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
                   ` (15 preceding siblings ...)
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 16/42] block: Use child access functions when flushing Max Reitz
@ 2019-06-12 22:09 ` Max Reitz
  2019-06-14 15:04   ` Vladimir Sementsov-Ogievskiy
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 18/42] block: Use CAFs in bdrv_refresh_filename() Max Reitz
                   ` (25 subsequent siblings)
  42 siblings, 1 reply; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:09 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/io.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/block/io.c b/block/io.c
index 64408cf19a..659ea0c52a 100644
--- a/block/io.c
+++ b/block/io.c
@@ -151,6 +151,8 @@ static void bdrv_merge_limits(BlockLimits *dst, const BlockLimits *src)
 void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
 {
     BlockDriver *drv = bs->drv;
+    BlockDriverState *storage_bs = bdrv_storage_bs(bs);
+    BlockDriverState *cow_bs = bdrv_filtered_cow_bs(bs);
     Error *local_err = NULL;
 
     memset(&bs->bl, 0, sizeof(bs->bl));
@@ -164,13 +166,13 @@ void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
                                 drv->bdrv_aio_preadv) ? 1 : 512;
 
     /* Take some limits from the children as a default */
-    if (bs->file) {
-        bdrv_refresh_limits(bs->file->bs, &local_err);
+    if (storage_bs) {
+        bdrv_refresh_limits(storage_bs, &local_err);
         if (local_err) {
             error_propagate(errp, local_err);
             return;
         }
-        bdrv_merge_limits(&bs->bl, &bs->file->bs->bl);
+        bdrv_merge_limits(&bs->bl, &storage_bs->bl);
     } else {
         bs->bl.min_mem_alignment = 512;
         bs->bl.opt_mem_alignment = getpagesize();
@@ -179,13 +181,13 @@ void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
         bs->bl.max_iov = IOV_MAX;
     }
 
-    if (bs->backing) {
-        bdrv_refresh_limits(bs->backing->bs, &local_err);
+    if (cow_bs) {
+        bdrv_refresh_limits(cow_bs, &local_err);
         if (local_err) {
             error_propagate(errp, local_err);
             return;
         }
-        bdrv_merge_limits(&bs->bl, &bs->backing->bs->bl);
+        bdrv_merge_limits(&bs->bl, &cow_bs->bl);
     }
 
     /* Then let the driver override it */
-- 
2.21.0




* [Qemu-devel] [PATCH v5 18/42] block: Use CAFs in bdrv_refresh_filename()
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
                   ` (16 preceding siblings ...)
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 17/42] block: Use CAFs in bdrv_refresh_limits() Max Reitz
@ 2019-06-12 22:09 ` Max Reitz
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 19/42] block: Use CAF in bdrv_co_rw_vmstate() Max Reitz
                   ` (24 subsequent siblings)
  42 siblings, 0 replies; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:09 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

bdrv_refresh_filename() and the related bdrv_dirname() should look to
the primary child when they wish to copy the underlying file's
filename.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
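Not part of the patch, only an illustration for reviewers: a minimal
standalone model of the conditions under which the primary child's
filename is copied.  ToyNode and toy_inherit_filename() are made-up
names for this sketch, not QEMU API; the fields merely stand in for
drv->is_filter, drv->bdrv_file_open, bdrv_primary_bs(bs) and
generate_json_filename as used in the diff below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Toy stand-in for a block node; every field is an assumption. */
typedef struct ToyNode {
    bool is_filter;          /* models drv->is_filter */
    bool is_protocol;        /* models drv->bdrv_file_open != NULL */
    char exact_filename[64];
    struct ToyNode *primary; /* models bdrv_primary_bs(bs), may be NULL */
} ToyNode;

/*
 * Copy the primary child's filename into @bs only if all conditions
 * hold.  @needs_json models generate_json_filename.
 */
void toy_inherit_filename(ToyNode *bs, bool needs_json)
{
    ToyNode *child;

    assert(bs != NULL);
    child = bs->primary;

    /* Drop stale information first, as the real function does */
    memset(bs->exact_filename, 0, sizeof(bs->exact_filename));

    if (child && child->exact_filename[0] && child->is_protocol &&
        !bs->is_filter && !needs_json)
    {
        snprintf(bs->exact_filename, sizeof(bs->exact_filename), "%s",
                 child->exact_filename);
    }
}
```

In this model a format node on top of a protocol node inherits the
filename, whereas a filter on top of the same protocol node ends up
with an empty exact_filename unless its driver builds one itself.
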
 block.c | 29 +++++++++++++++++++++--------
 1 file changed, 21 insertions(+), 8 deletions(-)

diff --git a/block.c b/block.c
index db2759c10d..797bec0326 100644
--- a/block.c
+++ b/block.c
@@ -6280,6 +6280,7 @@ void bdrv_refresh_filename(BlockDriverState *bs)
 {
     BlockDriver *drv = bs->drv;
     BdrvChild *child;
+    BlockDriverState *primary_child_bs;
     QDict *opts;
     bool backing_overridden;
     bool generate_json_filename; /* Whether our default implementation should
@@ -6348,20 +6349,30 @@ void bdrv_refresh_filename(BlockDriverState *bs)
     qobject_unref(bs->full_open_options);
     bs->full_open_options = opts;
 
+    primary_child_bs = bdrv_primary_bs(bs);
+
     if (drv->bdrv_refresh_filename) {
         /* Obsolete information is of no use here, so drop the old file name
          * information before refreshing it */
         bs->exact_filename[0] = '\0';
 
         drv->bdrv_refresh_filename(bs);
-    } else if (bs->file) {
-        /* Try to reconstruct valid information from the underlying file */
+    } else if (primary_child_bs) {
+        /*
+         * Try to reconstruct valid information from the underlying
+         * file -- this only works for format nodes (filter nodes
+         * cannot be probed and as such must be selected by the user
+         * either through an options dict, or through a special
+         * filename which the filter driver must construct in its
+         * .bdrv_refresh_filename() implementation).
+         */
 
         bs->exact_filename[0] = '\0';
 
         /*
          * We can use the underlying file's filename if:
          * - it has a filename,
+         * - the current BDS is not a filter,
          * - the file is a protocol BDS, and
          * - opening that file (as this BDS's format) will automatically create
          *   the BDS tree we have right now, that is:
@@ -6370,11 +6381,11 @@ void bdrv_refresh_filename(BlockDriverState *bs)
          *   - no non-file child of this BDS has been overridden by the user
          *   Both of these conditions are represented by generate_json_filename.
          */
-        if (bs->file->bs->exact_filename[0] &&
-            bs->file->bs->drv->bdrv_file_open &&
-            !generate_json_filename)
+        if (primary_child_bs->exact_filename[0] &&
+            primary_child_bs->drv->bdrv_file_open &&
+            !drv->is_filter && !generate_json_filename)
         {
-            strcpy(bs->exact_filename, bs->file->bs->exact_filename);
+            strcpy(bs->exact_filename, primary_child_bs->exact_filename);
         }
     }
 
@@ -6391,6 +6402,7 @@ void bdrv_refresh_filename(BlockDriverState *bs)
 char *bdrv_dirname(BlockDriverState *bs, Error **errp)
 {
     BlockDriver *drv = bs->drv;
+    BlockDriverState *child_bs;
 
     if (!drv) {
         error_setg(errp, "Node '%s' is ejected", bs->node_name);
@@ -6401,8 +6413,9 @@ char *bdrv_dirname(BlockDriverState *bs, Error **errp)
         return drv->bdrv_dirname(bs, errp);
     }
 
-    if (bs->file) {
-        return bdrv_dirname(bs->file->bs, errp);
+    child_bs = bdrv_primary_bs(bs);
+    if (child_bs) {
+        return bdrv_dirname(child_bs, errp);
     }
 
     bdrv_refresh_filename(bs);
-- 
2.21.0




* [Qemu-devel] [PATCH v5 19/42] block: Use CAF in bdrv_co_rw_vmstate()
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
                   ` (17 preceding siblings ...)
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 18/42] block: Use CAFs in bdrv_refresh_filename() Max Reitz
@ 2019-06-12 22:09 ` Max Reitz
  2019-06-14 15:14   ` Vladimir Sementsov-Ogievskiy
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 20/42] block/snapshot: Fall back to storage child Max Reitz
                   ` (23 subsequent siblings)
  42 siblings, 1 reply; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:09 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

If a node whose driver does not provide VM state functions has a
metadata child, the VM state should probably go there; if it is a
filter, the VM state should probably go to its filtered child.  It
follows that we should generally go down to the primary child.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
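Not part of the patch, only an illustration: a small standalone model
of the fallback described above.  VmNode, toy_store() and
toy_save_vmstate() are hypothetical names; the "primary" field stands
in for whatever bdrv_primary_bs() would return.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

typedef struct VmNode VmNode;
struct VmNode {
    /* Models drv->bdrv_save_vmstate; NULL if the driver lacks it */
    int (*save_vmstate)(VmNode *bs, int value);
    VmNode *primary;    /* models bdrv_primary_bs(bs), may be NULL */
    int stored;         /* toy "VM state" payload */
};

/* A toy driver callback that simply remembers the value */
int toy_store(VmNode *bs, int value)
{
    bs->stored = value;
    return 0;
}

/* Use the driver's callback if present, else descend to the primary child */
int toy_save_vmstate(VmNode *bs, int value)
{
    assert(bs != NULL);

    if (bs->save_vmstate) {
        return bs->save_vmstate(bs, value);
    } else if (bs->primary) {
        return toy_save_vmstate(bs->primary, value);
    }
    return -ENOTSUP;
}
```

With a filter-like node stacked on a node that implements the
callback, the state lands in the lower node; a lone node without the
callback yields -ENOTSUP.
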
 block/io.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/block/io.c b/block/io.c
index 659ea0c52a..14f99e1c00 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2395,6 +2395,7 @@ bdrv_co_rw_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos,
                    bool is_read)
 {
     BlockDriver *drv = bs->drv;
+    BlockDriverState *child_bs = bdrv_primary_bs(bs);
     int ret = -ENOTSUP;
 
     bdrv_inc_in_flight(bs);
@@ -2407,8 +2408,8 @@ bdrv_co_rw_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos,
         } else {
             ret = drv->bdrv_save_vmstate(bs, qiov, pos);
         }
-    } else if (bs->file) {
-        ret = bdrv_co_rw_vmstate(bs->file->bs, qiov, pos, is_read);
+    } else if (child_bs) {
+        ret = bdrv_co_rw_vmstate(child_bs, qiov, pos, is_read);
     }
 
     bdrv_dec_in_flight(bs);
-- 
2.21.0




* [Qemu-devel] [PATCH v5 20/42] block/snapshot: Fall back to storage child
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
                   ` (18 preceding siblings ...)
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 19/42] block: Use CAF in bdrv_co_rw_vmstate() Max Reitz
@ 2019-06-12 22:09 ` Max Reitz
  2019-06-14 15:22   ` Vladimir Sementsov-Ogievskiy
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 21/42] block: Use CAFs for debug breakpoints Max Reitz
                   ` (22 subsequent siblings)
  42 siblings, 1 reply; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:09 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

If the top node's driver does not provide snapshot functionality and we
want to go down the chain, we should go towards the child which stores
the data, i.e. the storage child.

bdrv_snapshot_goto() becomes a bit weird because we may have to redirect
the actual child pointer, so it only works if the storage child is
bs->file or bs->backing (and then we have to find out which it is).

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
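Not part of the patch, only an illustration of the "find out which it
is" step in bdrv_snapshot_goto(): a standalone sketch that returns the
address of the child field referencing the storage node.  SnapNode,
SnapChild and toy_storage_child_slot() are hypothetical names; the
sketch checks each child pointer for NULL before dereferencing it.

```c
#include <assert.h>
#include <stddef.h>

typedef struct SnapNode SnapNode;

/* Toy stand-in for BdrvChild */
typedef struct SnapChild {
    SnapNode *bs;
} SnapChild;

/* Toy stand-in for BlockDriverState with only the two child fields */
struct SnapNode {
    SnapChild *file;
    SnapChild *backing;
};

/*
 * Return a pointer to the field (file or backing) whose child refers
 * to @storage, or NULL if neither field references it.
 */
SnapChild **toy_storage_child_slot(SnapNode *bs, SnapNode *storage)
{
    assert(bs != NULL && storage != NULL);

    if (bs->file && bs->file->bs == storage) {
        return &bs->file;
    } else if (bs->backing && bs->backing->bs == storage) {
        return &bs->backing;
    }
    return NULL;
}
```

A NULL result corresponds to the "storage child is not referenced by a
field in the BDS object" case, where the real function gives up with
-ENOTSUP.
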
 block/snapshot.c | 74 ++++++++++++++++++++++++++++++++++--------------
 1 file changed, 53 insertions(+), 21 deletions(-)

diff --git a/block/snapshot.c b/block/snapshot.c
index f2f48f926a..58cd667f3a 100644
--- a/block/snapshot.c
+++ b/block/snapshot.c
@@ -154,8 +154,9 @@ int bdrv_can_snapshot(BlockDriverState *bs)
     }
 
     if (!drv->bdrv_snapshot_create) {
-        if (bs->file != NULL) {
-            return bdrv_can_snapshot(bs->file->bs);
+        BlockDriverState *storage_bs = bdrv_storage_bs(bs);
+        if (storage_bs) {
+            return bdrv_can_snapshot(storage_bs);
         }
         return 0;
     }
@@ -167,14 +168,15 @@ int bdrv_snapshot_create(BlockDriverState *bs,
                          QEMUSnapshotInfo *sn_info)
 {
     BlockDriver *drv = bs->drv;
+    BlockDriverState *storage_bs = bdrv_storage_bs(bs);
     if (!drv) {
         return -ENOMEDIUM;
     }
     if (drv->bdrv_snapshot_create) {
         return drv->bdrv_snapshot_create(bs, sn_info);
     }
-    if (bs->file) {
-        return bdrv_snapshot_create(bs->file->bs, sn_info);
+    if (storage_bs) {
+        return bdrv_snapshot_create(storage_bs, sn_info);
     }
     return -ENOTSUP;
 }
@@ -184,6 +186,7 @@ int bdrv_snapshot_goto(BlockDriverState *bs,
                        Error **errp)
 {
     BlockDriver *drv = bs->drv;
+    BlockDriverState *storage_bs;
     int ret, open_ret;
 
     if (!drv) {
@@ -204,39 +207,66 @@ int bdrv_snapshot_goto(BlockDriverState *bs,
         return ret;
     }
 
-    if (bs->file) {
-        BlockDriverState *file;
-        QDict *options = qdict_clone_shallow(bs->options);
+    storage_bs = bdrv_storage_bs(bs);
+    if (storage_bs) {
+        QDict *options;
         QDict *file_options;
         Error *local_err = NULL;
+        bool is_backing_child;
+        BdrvChild **child_pointer;
+
+        /*
+         * Filters may reference the storage child through
+         * bs->backing.  We need to know whether we are dealing with
+         * bs->backing or bs->file, so we check it here.
+         */
+        if (bs->file && storage_bs == bs->file->bs) {
+            is_backing_child = false;
+            child_pointer = &bs->file;
+        } else if (bs->backing && storage_bs == bs->backing->bs) {
+            is_backing_child = true;
+            child_pointer = &bs->backing;
+        } else {
+            /*
+             * The storage child is not referenced by a field in the
+             * BDS object.  We cannot go on then.
+             */
+            error_setg(errp, "Block driver does not support snapshots");
+            return -ENOTSUP;
+        }
+
+        options = qdict_clone_shallow(bs->options);
 
-        file = bs->file->bs;
         /* Prevent it from getting deleted when detached from bs */
-        bdrv_ref(file);
+        bdrv_ref(storage_bs);
 
-        qdict_extract_subqdict(options, &file_options, "file.");
+        qdict_extract_subqdict(options, &file_options,
+                               is_backing_child ? "backing." : "file.");
         qobject_unref(file_options);
-        qdict_put_str(options, "file", bdrv_get_node_name(file));
+        qdict_put_str(options, is_backing_child ? "backing" : "file",
+                      bdrv_get_node_name(storage_bs));
 
         if (drv->bdrv_close) {
             drv->bdrv_close(bs);
         }
-        bdrv_unref_child(bs, bs->file);
-        bs->file = NULL;
 
-        ret = bdrv_snapshot_goto(file, snapshot_id, errp);
+        assert(storage_bs == (*child_pointer)->bs);
+        bdrv_unref_child(bs, *child_pointer);
+        *child_pointer = NULL;
+
+        ret = bdrv_snapshot_goto(storage_bs, snapshot_id, errp);
         open_ret = drv->bdrv_open(bs, options, bs->open_flags, &local_err);
         qobject_unref(options);
         if (open_ret < 0) {
-            bdrv_unref(file);
+            bdrv_unref(storage_bs);
             bs->drv = NULL;
             /* A bdrv_snapshot_goto() error takes precedence */
             error_propagate(errp, local_err);
             return ret < 0 ? ret : open_ret;
         }
 
-        assert(bs->file->bs == file);
-        bdrv_unref(file);
+        assert(storage_bs == (*child_pointer)->bs);
+        bdrv_unref(storage_bs);
         return ret;
     }
 
@@ -272,6 +302,7 @@ int bdrv_snapshot_delete(BlockDriverState *bs,
                          Error **errp)
 {
     BlockDriver *drv = bs->drv;
+    BlockDriverState *storage_bs = bdrv_storage_bs(bs);
     int ret;
 
     if (!drv) {
@@ -288,8 +319,8 @@ int bdrv_snapshot_delete(BlockDriverState *bs,
 
     if (drv->bdrv_snapshot_delete) {
         ret = drv->bdrv_snapshot_delete(bs, snapshot_id, name, errp);
-    } else if (bs->file) {
-        ret = bdrv_snapshot_delete(bs->file->bs, snapshot_id, name, errp);
+    } else if (storage_bs) {
+        ret = bdrv_snapshot_delete(storage_bs, snapshot_id, name, errp);
     } else {
         error_setg(errp, "Block format '%s' used by device '%s' "
                    "does not support internal snapshot deletion",
@@ -305,14 +336,15 @@ int bdrv_snapshot_list(BlockDriverState *bs,
                        QEMUSnapshotInfo **psn_info)
 {
     BlockDriver *drv = bs->drv;
+    BlockDriverState *storage_bs = bdrv_storage_bs(bs);
     if (!drv) {
         return -ENOMEDIUM;
     }
     if (drv->bdrv_snapshot_list) {
         return drv->bdrv_snapshot_list(bs, psn_info);
     }
-    if (bs->file) {
-        return bdrv_snapshot_list(bs->file->bs, psn_info);
+    if (storage_bs) {
+        return bdrv_snapshot_list(storage_bs, psn_info);
     }
     return -ENOTSUP;
 }
-- 
2.21.0




* [Qemu-devel] [PATCH v5 21/42] block: Use CAFs for debug breakpoints
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
                   ` (19 preceding siblings ...)
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 20/42] block/snapshot: Fall back to storage child Max Reitz
@ 2019-06-12 22:09 ` Max Reitz
  2019-06-14 15:29   ` Vladimir Sementsov-Ogievskiy
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 22/42] block: Use CAFs in bdrv_get_allocated_file_size() Max Reitz
                   ` (21 subsequent siblings)
  42 siblings, 1 reply; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:09 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

When looking for a blkdebug node (which implements debug breakpoints),
use bdrv_primary_bs() to iterate through the graph, because that is
where a blkdebug node would be.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/block.c b/block.c
index 797bec0326..11b7ba8cf6 100644
--- a/block.c
+++ b/block.c
@@ -5097,7 +5097,7 @@ int bdrv_debug_breakpoint(BlockDriverState *bs, const char *event,
                           const char *tag)
 {
     while (bs && bs->drv && !bs->drv->bdrv_debug_breakpoint) {
-        bs = bs->file ? bs->file->bs : NULL;
+        bs = bdrv_primary_bs(bs);
     }
 
     if (bs && bs->drv && bs->drv->bdrv_debug_breakpoint) {
@@ -5110,7 +5110,7 @@ int bdrv_debug_breakpoint(BlockDriverState *bs, const char *event,
 int bdrv_debug_remove_breakpoint(BlockDriverState *bs, const char *tag)
 {
     while (bs && bs->drv && !bs->drv->bdrv_debug_remove_breakpoint) {
-        bs = bs->file ? bs->file->bs : NULL;
+        bs = bdrv_primary_bs(bs);
     }
 
     if (bs && bs->drv && bs->drv->bdrv_debug_remove_breakpoint) {
@@ -5123,7 +5123,7 @@ int bdrv_debug_remove_breakpoint(BlockDriverState *bs, const char *tag)
 int bdrv_debug_resume(BlockDriverState *bs, const char *tag)
 {
     while (bs && (!bs->drv || !bs->drv->bdrv_debug_resume)) {
-        bs = bs->file ? bs->file->bs : NULL;
+        bs = bdrv_primary_bs(bs);
     }
 
     if (bs && bs->drv && bs->drv->bdrv_debug_resume) {
@@ -5136,7 +5136,7 @@ int bdrv_debug_resume(BlockDriverState *bs, const char *tag)
 bool bdrv_debug_is_suspended(BlockDriverState *bs, const char *tag)
 {
     while (bs && bs->drv && !bs->drv->bdrv_debug_is_suspended) {
-        bs = bs->file ? bs->file->bs : NULL;
+        bs = bdrv_primary_bs(bs);
     }
 
     if (bs && bs->drv && bs->drv->bdrv_debug_is_suspended) {
-- 
2.21.0




* [Qemu-devel] [PATCH v5 22/42] block: Use CAFs in bdrv_get_allocated_file_size()
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
                   ` (20 preceding siblings ...)
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 21/42] block: Use CAFs for debug breakpoints Max Reitz
@ 2019-06-12 22:09 ` Max Reitz
  2019-06-12 22:17   ` Max Reitz
  2019-06-14 15:41   ` Vladimir Sementsov-Ogievskiy
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 23/42] blockdev: Use CAF in external_snapshot_prepare() Max Reitz
                   ` (20 subsequent siblings)
  42 siblings, 2 replies; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:09 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
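Not part of the patch, only an illustration of the fallback in the
diff below: when the driver has no callback of its own, the allocated
size is the storage child's size plus, if it is a different node, the
metadata child's size.  SizeNode and toy_allocated_size() are
hypothetical names, and skipping a missing metadata child is an
assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct SizeNode SizeNode;
struct SizeNode {
    bool has_callback;   /* models drv->bdrv_get_allocated_file_size */
    int64_t reported;    /* value the toy callback returns (or -errno) */
    SizeNode *storage;   /* models bdrv_storage_bs(bs) */
    SizeNode *metadata;  /* models bdrv_metadata_bs(bs) */
};

int64_t toy_allocated_size(const SizeNode *bs)
{
    int64_t data_size, metadata_size = 0;

    assert(bs != NULL);

    if (bs->has_callback) {
        return bs->reported;
    }
    if (!bs->storage) {
        return -ENOTSUP;
    }

    data_size = toy_allocated_size(bs->storage);
    if (data_size < 0) {
        return data_size;
    }

    /* Count the metadata child only if it is a separate node */
    if (bs->metadata && bs->metadata != bs->storage) {
        metadata_size = toy_allocated_size(bs->metadata);
        if (metadata_size < 0) {
            return metadata_size;
        }
    }

    return data_size + metadata_size;
}
```

For a node with a 4096-byte data child and a separate 512-byte
metadata child this returns 4608; if both roles are played by the same
child, the size is counted once.
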
 block.c | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/block.c b/block.c
index 11b7ba8cf6..856d9b58be 100644
--- a/block.c
+++ b/block.c
@@ -4511,15 +4511,37 @@ exit:
 int64_t bdrv_get_allocated_file_size(BlockDriverState *bs)
 {
     BlockDriver *drv = bs->drv;
+    BlockDriverState *storage_bs, *metadata_bs;
+
     if (!drv) {
         return -ENOMEDIUM;
     }
+
     if (drv->bdrv_get_allocated_file_size) {
         return drv->bdrv_get_allocated_file_size(bs);
     }
-    if (bs->file) {
-        return bdrv_get_allocated_file_size(bs->file->bs);
+
+    storage_bs = bdrv_storage_bs(bs);
+    metadata_bs = bdrv_metadata_bs(bs);
+
+    if (storage_bs) {
+        int64_t data_size, metadata_size = 0;
+
+        data_size = bdrv_get_allocated_file_size(storage_bs);
+        if (data_size < 0) {
+            return data_size;
+        }
+
+        if (metadata_bs && storage_bs != metadata_bs) {
+            metadata_size = bdrv_get_allocated_file_size(metadata_bs);
+            if (metadata_size < 0) {
+                return metadata_size;
+            }
+        }
+
+        return data_size + metadata_size;
     }
+
     return -ENOTSUP;
 }
 
-- 
2.21.0




* [Qemu-devel] [PATCH v5 23/42] blockdev: Use CAF in external_snapshot_prepare()
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
                   ` (21 preceding siblings ...)
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 22/42] block: Use CAFs in bdrv_get_allocated_file_size() Max Reitz
@ 2019-06-12 22:09 ` Max Reitz
  2019-06-14 15:46   ` Vladimir Sementsov-Ogievskiy
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 24/42] block: Use child access functions for QAPI queries Max Reitz
                   ` (19 subsequent siblings)
  42 siblings, 1 reply; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:09 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

This allows us to differentiate between filters and nodes with COW
backing files: Filters cannot be used as overlays at all (for this
function).

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 blockdev.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/blockdev.c b/blockdev.c
index b5c0fd3c49..0f0cf0d9ae 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1665,7 +1665,12 @@ static void external_snapshot_prepare(BlkActionState *common,
         goto out;
     }
 
-    if (state->new_bs->backing != NULL) {
+    if (state->new_bs->drv->is_filter) {
+        error_setg(errp, "Filters cannot be used as overlays");
+        goto out;
+    }
+
+    if (bdrv_filtered_cow_child(state->new_bs)) {
         error_setg(errp, "The overlay already has a backing image");
         goto out;
     }
-- 
2.21.0




* [Qemu-devel] [PATCH v5 24/42] block: Use child access functions for QAPI queries
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
                   ` (22 preceding siblings ...)
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 23/42] blockdev: Use CAF in external_snapshot_prepare() Max Reitz
@ 2019-06-12 22:09 ` Max Reitz
  2019-06-18 12:06   ` Vladimir Sementsov-Ogievskiy
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 25/42] mirror: Deal with filters Max Reitz
                   ` (18 subsequent siblings)
  42 siblings, 1 reply; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:09 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

query-block and query-named-block-nodes now return any filtered child
under "backing", not just bs->backing or COW children.  This is so that
filters do not interrupt the reported backing chain.  This changes the
output for iotest 184, as the throttled node now appears as a backing
child.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
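Not part of the patch, only an illustration of why the iotest 184
output changes: once any filtered child counts as "backing", a
throttle node on top of null-co has a backing depth of 1 rather than
0.  ChainNode and toy_backing_file_depth() are hypothetical names; the
"filtered" field stands in for bdrv_filtered_bs(), and the skipping of
implicit filters is left out of this sketch.

```c
#include <assert.h>
#include <stddef.h>

typedef struct ChainNode ChainNode;
struct ChainNode {
    /* Models bdrv_filtered_bs(bs): COW backing node or filter child */
    ChainNode *filtered;
};

/* Count how many filtered children lie below @bs */
int toy_backing_file_depth(const ChainNode *bs)
{
    int depth = 0;

    assert(bs != NULL);

    while (bs->filtered) {
        depth++;
        bs = bs->filtered;
    }
    return depth;
}
```
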
 block/qapi.c               | 35 ++++++++++++++++++++---------------
 tests/qemu-iotests/184.out |  7 ++++++-
 2 files changed, 26 insertions(+), 16 deletions(-)

diff --git a/block/qapi.c b/block/qapi.c
index 0c13c86f4e..1fd2937abc 100644
--- a/block/qapi.c
+++ b/block/qapi.c
@@ -150,9 +150,13 @@ BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk,
             return NULL;
         }
 
-        if (bs0->drv && bs0->backing) {
+        if (bs0->drv && bdrv_filtered_child(bs0)) {
+            /*
+             * Put any filtered child here (for backwards compatibility to when
+             * we put bs0->backing here, which might be any filtered child).
+             */
             info->backing_file_depth++;
-            bs0 = bs0->backing->bs;
+            bs0 = bdrv_filtered_bs(bs0);
             (*p_image_info)->has_backing_image = true;
             p_image_info = &((*p_image_info)->backing_image);
         } else {
@@ -161,9 +165,8 @@ BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk,
 
         /* Skip automatically inserted nodes that the user isn't aware of for
          * query-block (blk != NULL), but not for query-named-block-nodes */
-        while (blk && bs0->drv && bs0->implicit) {
-            bs0 = backing_bs(bs0);
-            assert(bs0);
+        if (blk) {
+            bs0 = bdrv_skip_implicit_filters(bs0);
         }
     }
 
@@ -348,9 +351,9 @@ static void bdrv_query_info(BlockBackend *blk, BlockInfo **p_info,
     BlockDriverState *bs = blk_bs(blk);
     char *qdev;
 
-    /* Skip automatically inserted nodes that the user isn't aware of */
-    while (bs && bs->drv && bs->implicit) {
-        bs = backing_bs(bs);
+    if (bs) {
+        /* Skip automatically inserted nodes that the user isn't aware of */
+        bs = bdrv_skip_implicit_filters(bs);
     }
 
     info->device = g_strdup(blk_name(blk));
@@ -507,6 +510,7 @@ static void bdrv_query_blk_stats(BlockDeviceStats *ds, BlockBackend *blk)
 static BlockStats *bdrv_query_bds_stats(BlockDriverState *bs,
                                         bool blk_level)
 {
+    BlockDriverState *storage_bs, *cow_bs;
     BlockStats *s = NULL;
 
     s = g_malloc0(sizeof(*s));
@@ -519,9 +523,8 @@ static BlockStats *bdrv_query_bds_stats(BlockDriverState *bs,
     /* Skip automatically inserted nodes that the user isn't aware of in
      * a BlockBackend-level command. Stay at the exact node for a node-level
      * command. */
-    while (blk_level && bs->drv && bs->implicit) {
-        bs = backing_bs(bs);
-        assert(bs);
+    if (blk_level) {
+        bs = bdrv_skip_implicit_filters(bs);
     }
 
     if (bdrv_get_node_name(bs)[0]) {
@@ -531,14 +534,16 @@ static BlockStats *bdrv_query_bds_stats(BlockDriverState *bs,
 
     s->stats->wr_highest_offset = stat64_get(&bs->wr_highest_offset);
 
-    if (bs->file) {
+    storage_bs = bdrv_storage_bs(bs);
+    if (storage_bs) {
         s->has_parent = true;
-        s->parent = bdrv_query_bds_stats(bs->file->bs, blk_level);
+        s->parent = bdrv_query_bds_stats(storage_bs, blk_level);
     }
 
-    if (blk_level && bs->backing) {
+    cow_bs = bdrv_filtered_cow_bs(bs);
+    if (blk_level && cow_bs) {
         s->has_backing = true;
-        s->backing = bdrv_query_bds_stats(bs->backing->bs, blk_level);
+        s->backing = bdrv_query_bds_stats(cow_bs, blk_level);
     }
 
     return s;
diff --git a/tests/qemu-iotests/184.out b/tests/qemu-iotests/184.out
index 3deb3cfb94..1d61f7e224 100644
--- a/tests/qemu-iotests/184.out
+++ b/tests/qemu-iotests/184.out
@@ -27,6 +27,11 @@ Testing:
             "iops_rd": 0,
             "detect_zeroes": "off",
             "image": {
+                "backing-image": {
+                    "virtual-size": 1073741824,
+                    "filename": "null-co://",
+                    "format": "null-co"
+                },
                 "virtual-size": 1073741824,
                 "filename": "json:{\"throttle-group\": \"group0\", \"driver\": \"throttle\", \"file\": {\"driver\": \"null-co\"}}",
                 "format": "throttle"
@@ -34,7 +39,7 @@ Testing:
             "iops_wr": 0,
             "ro": false,
             "node-name": "throttle0",
-            "backing_file_depth": 0,
+            "backing_file_depth": 1,
             "drv": "throttle",
             "iops": 0,
             "bps_wr": 0,
-- 
2.21.0




* [Qemu-devel] [PATCH v5 25/42] mirror: Deal with filters
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
                   ` (23 preceding siblings ...)
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 24/42] block: Use child access functions for QAPI queries Max Reitz
@ 2019-06-12 22:09 ` Max Reitz
  2019-06-18 13:12   ` Vladimir Sementsov-Ogievskiy
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 26/42] backup: " Max Reitz
                   ` (17 subsequent siblings)
  42 siblings, 1 reply; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:09 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

This includes some permission limiting (for example, we only need to
take the RESIZE permission for active commits where the base is smaller
than the top).

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
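Not part of the patch, only an illustration of the permission limiting
mentioned above, as it appears in mirror_start_job() below.  The
TOY_PERM_* bit values, ToyPerms and toy_target_perms() are made up for
this sketch and do not match QEMU's BLK_PERM_* definitions; only the
decision logic is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values, chosen only for this sketch */
enum {
    TOY_PERM_CONSISTENT_READ = 1 << 0,
    TOY_PERM_WRITE           = 1 << 1,
    TOY_PERM_WRITE_UNCHANGED = 1 << 2,
    TOY_PERM_RESIZE          = 1 << 3,
    TOY_PERM_GRAPH_MOD       = 1 << 4,
};

typedef struct ToyPerms {
    uint64_t perm;      /* permissions taken on the target */
    uint64_t shared;    /* permissions shared with other users */
} ToyPerms;

/*
 * @target_is_backing: the target is in the source's chain (active commit)
 * @modifies_graph:    models backing_mode != MIRROR_LEAVE_BACKING_CHAIN
 */
ToyPerms toy_target_perms(bool target_is_backing, int64_t src_size,
                          int64_t target_size, bool modifies_graph)
{
    ToyPerms p = {
        .perm = TOY_PERM_WRITE,
        .shared = TOY_PERM_WRITE_UNCHANGED,
    };

    /* Size lookup errors are handled before this point in the patch */
    assert(src_size >= 0 && target_size >= 0);

    if (target_is_backing) {
        /* RESIZE is only needed if the base must grow */
        if (target_size < src_size) {
            p.perm |= TOY_PERM_RESIZE;
        }
        p.shared |= TOY_PERM_CONSISTENT_READ | TOY_PERM_WRITE |
                    TOY_PERM_GRAPH_MOD;
    }

    if (modifies_graph) {
        p.perm |= TOY_PERM_GRAPH_MOD;
    }

    return p;
}
```

In this model an active commit into an equally sized base takes only
WRITE (plus GRAPH_MOD if the graph is modified), while a base smaller
than the top additionally takes RESIZE.
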
 block/mirror.c | 110 +++++++++++++++++++++++++++++++++++++------------
 blockdev.c     |  47 +++++++++++++++++----
 2 files changed, 124 insertions(+), 33 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index 4fa8f57c80..3d767e3030 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -660,8 +660,10 @@ static int mirror_exit_common(Job *job)
                             &error_abort);
     if (!abort && s->backing_mode == MIRROR_SOURCE_BACKING_CHAIN) {
         BlockDriverState *backing = s->is_none_mode ? src : s->base;
-        if (backing_bs(target_bs) != backing) {
-            bdrv_set_backing_hd(target_bs, backing, &local_err);
+        BlockDriverState *unfiltered_target = bdrv_skip_rw_filters(target_bs);
+
+        if (bdrv_filtered_cow_bs(unfiltered_target) != backing) {
+            bdrv_set_backing_hd(unfiltered_target, backing, &local_err);
             if (local_err) {
                 error_report_err(local_err);
                 ret = -EPERM;
@@ -711,7 +713,7 @@ static int mirror_exit_common(Job *job)
     block_job_remove_all_bdrv(bjob);
     bdrv_child_try_set_perm(mirror_top_bs->backing, 0, BLK_PERM_ALL,
                             &error_abort);
-    bdrv_replace_node(mirror_top_bs, backing_bs(mirror_top_bs), &error_abort);
+    bdrv_replace_node(mirror_top_bs, mirror_top_bs->backing->bs, &error_abort);
 
     /* We just changed the BDS the job BB refers to (with either or both of the
      * bdrv_replace_node() calls), so switch the BB back so the cleanup does
@@ -757,6 +759,7 @@ static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
 {
     int64_t offset;
     BlockDriverState *base = s->base;
+    BlockDriverState *filtered_base;
     BlockDriverState *bs = s->mirror_top_bs->backing->bs;
     BlockDriverState *target_bs = blk_bs(s->target);
     int ret;
@@ -795,6 +798,9 @@ static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
         s->initial_zeroing_ongoing = false;
     }
 
+    /* Will be NULL if @base is not in @bs's chain */
+    filtered_base = bdrv_filtered_cow_bs(bdrv_find_overlay(bs, base));
+
     /* First part, loop on the sectors and initialize the dirty bitmap.  */
     for (offset = 0; offset < s->bdev_length; ) {
         /* Just to make sure we are not exceeding int limit. */
@@ -807,7 +813,7 @@ static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
             return 0;
         }
 
-        ret = bdrv_is_allocated_above(bs, base, offset, bytes, &count);
+        ret = bdrv_is_allocated_above(bs, filtered_base, offset, bytes, &count);
         if (ret < 0) {
             return ret;
         }
@@ -903,7 +909,7 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
     } else {
         s->target_cluster_size = BDRV_SECTOR_SIZE;
     }
-    if (backing_filename[0] && !target_bs->backing &&
+    if (backing_filename[0] && !bdrv_backing_chain_next(target_bs) &&
         s->granularity < s->target_cluster_size) {
         s->buf_size = MAX(s->buf_size, s->target_cluster_size);
         s->cow_bitmap = bitmap_new(length);
@@ -1083,8 +1089,9 @@ static void mirror_complete(Job *job, Error **errp)
     if (s->backing_mode == MIRROR_OPEN_BACKING_CHAIN) {
         int ret;
 
-        assert(!target->backing);
-        ret = bdrv_open_backing_file(target, NULL, "backing", errp);
+        assert(!bdrv_backing_chain_next(target));
+        ret = bdrv_open_backing_file(bdrv_skip_rw_filters(target), NULL,
+                                     "backing", errp);
         if (ret < 0) {
             return;
         }
@@ -1503,8 +1510,8 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs,
     MirrorBlockJob *s;
     MirrorBDSOpaque *bs_opaque;
     BlockDriverState *mirror_top_bs;
-    bool target_graph_mod;
     bool target_is_backing;
+    uint64_t target_perms, target_shared_perms;
     Error *local_err = NULL;
     int ret;
 
@@ -1523,7 +1530,7 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs,
         buf_size = DEFAULT_MIRROR_BUF_SIZE;
     }
 
-    if (bs == target) {
+    if (bdrv_skip_rw_filters(bs) == bdrv_skip_rw_filters(target)) {
         error_setg(errp, "Can't mirror node into itself");
         return;
     }
@@ -1583,15 +1590,42 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs,
      * In the case of active commit, things look a bit different, though,
      * because the target is an already populated backing file in active use.
      * We can allow anything except resize there.*/
+
+    target_perms = BLK_PERM_WRITE;
+    target_shared_perms = BLK_PERM_WRITE_UNCHANGED;
+
     target_is_backing = bdrv_chain_contains(bs, target);
-    target_graph_mod = (backing_mode != MIRROR_LEAVE_BACKING_CHAIN);
+    if (target_is_backing) {
+        int64_t bs_size, target_size;
+        bs_size = bdrv_getlength(bs);
+        if (bs_size < 0) {
+            error_setg_errno(errp, -bs_size,
+                             "Could not inquire top image size");
+            goto fail;
+        }
+
+        target_size = bdrv_getlength(target);
+        if (target_size < 0) {
+            error_setg_errno(errp, -target_size,
+                             "Could not inquire base image size");
+            goto fail;
+        }
+
+        if (target_size < bs_size) {
+            target_perms |= BLK_PERM_RESIZE;
+        }
+
+        target_shared_perms |= BLK_PERM_CONSISTENT_READ
+                            |  BLK_PERM_WRITE
+                            |  BLK_PERM_GRAPH_MOD;
+    }
+
+    if (backing_mode != MIRROR_LEAVE_BACKING_CHAIN) {
+        target_perms |= BLK_PERM_GRAPH_MOD;
+    }
+
     s->target = blk_new(s->common.job.aio_context,
-                        BLK_PERM_WRITE | BLK_PERM_RESIZE |
-                        (target_graph_mod ? BLK_PERM_GRAPH_MOD : 0),
-                        BLK_PERM_WRITE_UNCHANGED |
-                        (target_is_backing ? BLK_PERM_CONSISTENT_READ |
-                                             BLK_PERM_WRITE |
-                                             BLK_PERM_GRAPH_MOD : 0));
+                        target_perms, target_shared_perms);
     ret = blk_insert_bs(s->target, target, errp);
     if (ret < 0) {
         goto fail;
@@ -1641,15 +1675,39 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs,
     /* In commit_active_start() all intermediate nodes disappear, so
      * any jobs in them must be blocked */
     if (target_is_backing) {
-        BlockDriverState *iter;
-        for (iter = backing_bs(bs); iter != target; iter = backing_bs(iter)) {
-            /* XXX BLK_PERM_WRITE needs to be allowed so we don't block
-             * ourselves at s->base (if writes are blocked for a node, they are
-             * also blocked for its backing file). The other options would be a
-             * second filter driver above s->base (== target). */
+        BlockDriverState *iter, *filtered_target;
+        uint64_t iter_shared_perms;
+
+        /*
+         * The topmost node with
+         * bdrv_skip_rw_filters(filtered_target) == bdrv_skip_rw_filters(target)
+         */
+        filtered_target = bdrv_filtered_cow_bs(bdrv_find_overlay(bs, target));
+
+        assert(bdrv_skip_rw_filters(filtered_target) ==
+               bdrv_skip_rw_filters(target));
+
+        /*
+         * XXX BLK_PERM_WRITE needs to be allowed so we don't block
+         * ourselves at s->base (if writes are blocked for a node, they are
+         * also blocked for its backing file). The other options would be a
+         * second filter driver above s->base (== target).
+         */
+        iter_shared_perms = BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE;
+
+        for (iter = bdrv_filtered_bs(bs); iter != target;
+             iter = bdrv_filtered_bs(iter))
+        {
+            if (iter == filtered_target) {
+                /*
+                 * From here on, all nodes are filters on the base.
+                 * This allows us to share BLK_PERM_CONSISTENT_READ.
+                 */
+                iter_shared_perms |= BLK_PERM_CONSISTENT_READ;
+            }
+
             ret = block_job_add_bdrv(&s->common, "intermediate node", iter, 0,
-                                     BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE,
-                                     errp);
+                                     iter_shared_perms, errp);
             if (ret < 0) {
                 goto fail;
             }
@@ -1683,7 +1741,7 @@ fail:
 
     bdrv_child_try_set_perm(mirror_top_bs->backing, 0, BLK_PERM_ALL,
                             &error_abort);
-    bdrv_replace_node(mirror_top_bs, backing_bs(mirror_top_bs), &error_abort);
+    bdrv_replace_node(mirror_top_bs, mirror_top_bs->backing->bs, &error_abort);
 
     bdrv_unref(mirror_top_bs);
 }
@@ -1706,7 +1764,7 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
         return;
     }
     is_none_mode = mode == MIRROR_SYNC_MODE_NONE;
-    base = mode == MIRROR_SYNC_MODE_TOP ? backing_bs(bs) : NULL;
+    base = mode == MIRROR_SYNC_MODE_TOP ? bdrv_backing_chain_next(bs) : NULL;
     mirror_start_job(job_id, bs, creation_flags, target, replaces,
                      speed, granularity, buf_size, backing_mode,
                      on_source_error, on_target_error, unmap, NULL, NULL,
diff --git a/blockdev.c b/blockdev.c
index 0f0cf0d9ae..68e8d33447 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3777,7 +3777,7 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
         return;
     }
 
-    if (!bs->backing && sync == MIRROR_SYNC_MODE_TOP) {
+    if (!bdrv_backing_chain_next(bs) && sync == MIRROR_SYNC_MODE_TOP) {
         sync = MIRROR_SYNC_MODE_FULL;
     }
 
@@ -3826,7 +3826,7 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
 
 void qmp_drive_mirror(DriveMirror *arg, Error **errp)
 {
-    BlockDriverState *bs;
+    BlockDriverState *bs, *unfiltered_bs;
     BlockDriverState *source, *target_bs;
     AioContext *aio_context;
     BlockMirrorBackingMode backing_mode;
@@ -3835,6 +3835,7 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
     int flags;
     int64_t size;
     const char *format = arg->format;
+    const char *replaces_node_name = NULL;
     int ret;
 
     bs = qmp_get_root_bs(arg->device, errp);
@@ -3847,6 +3848,16 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
         return;
     }
 
+    /*
+     * If the user has not instructed us otherwise, we should let the
+     * block job run from @bs (thus taking into account all filters on
+     * it) but replace @unfiltered_bs when it finishes (thus not
+     * removing those filters).
+     * (And if there are any explicit filters, we should assume the
+     *  user knows how to use the @replaces option.)
+     */
+    unfiltered_bs = bdrv_skip_implicit_filters(bs);
+
     aio_context = bdrv_get_aio_context(bs);
     aio_context_acquire(aio_context);
 
@@ -3860,8 +3871,14 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
     }
 
     flags = bs->open_flags | BDRV_O_RDWR;
-    source = backing_bs(bs);
+    source = bdrv_filtered_cow_bs(unfiltered_bs);
     if (!source && arg->sync == MIRROR_SYNC_MODE_TOP) {
+        if (bdrv_filtered_bs(unfiltered_bs)) {
+            /* @unfiltered_bs is an explicit filter */
+            error_setg(errp, "Cannot perform sync=top mirror through an "
+                       "explicitly added filter node on the source");
+            goto out;
+        }
         arg->sync = MIRROR_SYNC_MODE_FULL;
     }
     if (arg->sync == MIRROR_SYNC_MODE_NONE) {
@@ -3880,6 +3897,9 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
                              " named node of the graph");
             goto out;
         }
+        replaces_node_name = arg->replaces;
+    } else if (unfiltered_bs != bs) {
+        replaces_node_name = unfiltered_bs->node_name;
     }
 
     if (arg->mode == NEW_IMAGE_MODE_ABSOLUTE_PATHS) {
@@ -3899,6 +3919,9 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
         bdrv_img_create(arg->target, format,
                         NULL, NULL, NULL, size, flags, false, &local_err);
     } else {
+        /* Implicit filters should not appear in the filename */
+        BlockDriverState *explicit_backing = bdrv_skip_implicit_filters(source);
+
         switch (arg->mode) {
         case NEW_IMAGE_MODE_EXISTING:
             break;
@@ -3906,8 +3929,8 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
             /* create new image with backing file */
             bdrv_refresh_filename(source);
             bdrv_img_create(arg->target, format,
-                            source->filename,
-                            source->drv->format_name,
+                            explicit_backing->filename,
+                            explicit_backing->drv->format_name,
                             NULL, size, flags, false, &local_err);
             break;
         default:
@@ -3943,7 +3966,7 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
     }
 
     blockdev_mirror_common(arg->has_job_id ? arg->job_id : NULL, bs, target_bs,
-                           arg->has_replaces, arg->replaces, arg->sync,
+                           !!replaces_node_name, replaces_node_name, arg->sync,
                            backing_mode, arg->has_speed, arg->speed,
                            arg->has_granularity, arg->granularity,
                            arg->has_buf_size, arg->buf_size,
@@ -3979,7 +4002,7 @@ void qmp_blockdev_mirror(bool has_job_id, const char *job_id,
                          bool has_auto_dismiss, bool auto_dismiss,
                          Error **errp)
 {
-    BlockDriverState *bs;
+    BlockDriverState *bs, *unfiltered_bs;
     BlockDriverState *target_bs;
     AioContext *aio_context;
     BlockMirrorBackingMode backing_mode = MIRROR_LEAVE_BACKING_CHAIN;
@@ -3991,6 +4014,16 @@ void qmp_blockdev_mirror(bool has_job_id, const char *job_id,
         return;
     }
 
+    /*
+     * Same as in qmp_drive_mirror(): We want to run the job from @bs,
+     * but we want to replace @unfiltered_bs on completion.
+     */
+    unfiltered_bs = bdrv_skip_implicit_filters(bs);
+    if (!has_replaces && unfiltered_bs != bs) {
+        replaces = unfiltered_bs->node_name;
+        has_replaces = true;
+    }
+
     target_bs = bdrv_lookup_bs(target, target, errp);
     if (!target_bs) {
         return;
-- 
2.21.0




* [Qemu-devel] [PATCH v5 26/42] backup: Deal with filters
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
                   ` (24 preceding siblings ...)
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 25/42] mirror: Deal with filters Max Reitz
@ 2019-06-12 22:09 ` Max Reitz
  2019-06-18 13:45   ` Vladimir Sementsov-Ogievskiy
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 27/42] commit: " Max Reitz
                   ` (16 subsequent siblings)
  42 siblings, 1 reply; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:09 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/backup.c |  9 +++++----
 blockdev.c     | 19 +++++++++++++++----
 2 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index 715e1d3be8..88435f883d 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -502,6 +502,7 @@ static int64_t backup_calculate_cluster_size(BlockDriverState *target,
 {
     int ret;
     BlockDriverInfo bdi;
+    bool target_does_cow = bdrv_backing_chain_next(target);
 
     /*
      * If there is no backing file on the target, we cannot rely on COW if our
@@ -509,7 +510,7 @@ static int64_t backup_calculate_cluster_size(BlockDriverState *target,
      * targets with a backing file, try to avoid COW if possible.
      */
     ret = bdrv_get_info(target, &bdi);
-    if (ret == -ENOTSUP && !target->backing) {
+    if (ret == -ENOTSUP && !target_does_cow) {
         /* Cluster size is not defined */
         warn_report("The target block device doesn't provide "
                     "information about the block size and it doesn't have a "
@@ -518,14 +519,14 @@ static int64_t backup_calculate_cluster_size(BlockDriverState *target,
                     "this default, the backup may be unusable",
                     BACKUP_CLUSTER_SIZE_DEFAULT);
         return BACKUP_CLUSTER_SIZE_DEFAULT;
-    } else if (ret < 0 && !target->backing) {
+    } else if (ret < 0 && !target_does_cow) {
         error_setg_errno(errp, -ret,
             "Couldn't determine the cluster size of the target image, "
             "which has no backing file");
         error_append_hint(errp,
             "Aborting, since this may create an unusable destination image\n");
         return ret;
-    } else if (ret < 0 && target->backing) {
+    } else if (ret < 0 && target_does_cow) {
         /* Not fatal; just trudge on ahead. */
         return BACKUP_CLUSTER_SIZE_DEFAULT;
     }
@@ -569,7 +570,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
         return NULL;
     }
 
-    if (compress && target->drv->bdrv_co_pwritev_compressed == NULL) {
+    if (compress && !bdrv_supports_compressed_writes(target)) {
         error_setg(errp, "Compression is not supported for this drive %s",
                    bdrv_get_device_name(target));
         return NULL;
diff --git a/blockdev.c b/blockdev.c
index 68e8d33447..605e7b0994 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3500,7 +3500,13 @@ static BlockJob *do_drive_backup(DriveBackup *backup, JobTxn *txn,
     /* See if we have a backing HD we can use to create our new image
      * on top of. */
     if (backup->sync == MIRROR_SYNC_MODE_TOP) {
-        source = backing_bs(bs);
+        /*
+         * Backup will not replace the source by the target, so none
+         * of the filters skipped here will be removed (in contrast to
+         * mirror).  Therefore, we can skip all of them when looking
+         * for the first COW relationship.
+         */
+        source = bdrv_filtered_cow_bs(bdrv_skip_rw_filters(bs));
         if (!source) {
             backup->sync = MIRROR_SYNC_MODE_FULL;
         }
@@ -3520,9 +3526,14 @@ static BlockJob *do_drive_backup(DriveBackup *backup, JobTxn *txn,
     if (backup->mode != NEW_IMAGE_MODE_EXISTING) {
         assert(backup->format);
         if (source) {
-            bdrv_refresh_filename(source);
-            bdrv_img_create(backup->target, backup->format, source->filename,
-                            source->drv->format_name, NULL,
+            /* Implicit filters should not appear in the filename */
+            BlockDriverState *explicit_backing =
+                bdrv_skip_implicit_filters(source);
+
+            bdrv_refresh_filename(explicit_backing);
+            bdrv_img_create(backup->target, backup->format,
+                            explicit_backing->filename,
+                            explicit_backing->drv->format_name, NULL,
                             size, flags, false, &local_err);
         } else {
             bdrv_img_create(backup->target, backup->format, NULL, NULL, NULL,
-- 
2.21.0




* [Qemu-devel] [PATCH v5 27/42] commit: Deal with filters
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
                   ` (25 preceding siblings ...)
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 26/42] backup: " Max Reitz
@ 2019-06-12 22:09 ` Max Reitz
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 28/42] stream: " Max Reitz
                   ` (15 subsequent siblings)
  42 siblings, 0 replies; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:09 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

This includes some permission limiting (for example, we only need to
take the RESIZE permission if the base is smaller than the top).
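
As a rough illustration of the permission limiting (a standalone sketch,
not the code from this patch): the PERM_* constants and
commit_base_perms() below are hypothetical stand-ins for QEMU's
BLK_PERM_* flags and for the logic added to commit_start(), and the flag
values are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for BLK_PERM_*; the values are illustrative only. */
enum {
    PERM_CONSISTENT_READ = 1u << 0,
    PERM_WRITE           = 1u << 1,
    PERM_RESIZE          = 1u << 3,
};

/*
 * Permissions the job takes on the base node.  RESIZE is requested only
 * when the base is smaller than the top and therefore has to grow.
 */
static uint64_t commit_base_perms(int64_t base_size, int64_t top_size)
{
    uint64_t perms = PERM_CONSISTENT_READ | PERM_WRITE;

    /* The caller is assumed to have handled length query errors already. */
    assert(base_size >= 0 && top_size >= 0);

    if (base_size < top_size) {
        perms |= PERM_RESIZE;
    }
    return perms;
}
```

With equal sizes, or a base larger than the top, the sketch never
requests the resize permission.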

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/block-backend.c | 16 ++++---
 block/commit.c        | 97 ++++++++++++++++++++++++++++++++-----------
 blockdev.c            |  6 ++-
 3 files changed, 87 insertions(+), 32 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index f5d9407d20..227a6951a0 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -2156,11 +2156,17 @@ int blk_commit_all(void)
         AioContext *aio_context = blk_get_aio_context(blk);
 
         aio_context_acquire(aio_context);
-        if (blk_is_inserted(blk) && blk->root->bs->backing) {
-            int ret = bdrv_commit(blk->root->bs);
-            if (ret < 0) {
-                aio_context_release(aio_context);
-                return ret;
+        if (blk_is_inserted(blk)) {
+            BlockDriverState *non_filter;
+
+            /* Legacy function, so skip implicit filters */
+            non_filter = bdrv_skip_implicit_filters(blk->root->bs);
+            if (bdrv_filtered_cow_child(non_filter)) {
+                int ret = bdrv_commit(non_filter);
+                if (ret < 0) {
+                    aio_context_release(aio_context);
+                    return ret;
+                }
             }
         }
         aio_context_release(aio_context);
diff --git a/block/commit.c b/block/commit.c
index f20a26fecd..ec5a8c8edf 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -112,7 +112,7 @@ static void commit_abort(Job *job)
      * something to base, the intermediate images aren't valid any more. */
     bdrv_child_try_set_perm(s->commit_top_bs->backing, 0, BLK_PERM_ALL,
                             &error_abort);
-    bdrv_replace_node(s->commit_top_bs, backing_bs(s->commit_top_bs),
+    bdrv_replace_node(s->commit_top_bs, s->commit_top_bs->backing->bs,
                       &error_abort);
 
     bdrv_unref(s->commit_top_bs);
@@ -137,6 +137,7 @@ static void commit_clean(Job *job)
 static int coroutine_fn commit_run(Job *job, Error **errp)
 {
     CommitBlockJob *s = container_of(job, CommitBlockJob, common.job);
+    BlockDriverState *filtered_base;
     int64_t offset;
     uint64_t delay_ns = 0;
     int ret = 0;
@@ -163,6 +164,9 @@ static int coroutine_fn commit_run(Job *job, Error **errp)
         }
     }
 
+    filtered_base = bdrv_filtered_cow_bs(bdrv_find_overlay(blk_bs(s->top),
+                                                           blk_bs(s->base)));
+
     buf = blk_blockalign(s->top, COMMIT_BUFFER_SIZE);
 
     for (offset = 0; offset < len; offset += n) {
@@ -176,7 +180,7 @@ static int coroutine_fn commit_run(Job *job, Error **errp)
             break;
         }
         /* Copy if allocated above the base */
-        ret = bdrv_is_allocated_above(blk_bs(s->top), blk_bs(s->base),
+        ret = bdrv_is_allocated_above(blk_bs(s->top), filtered_base,
                                       offset, COMMIT_BUFFER_SIZE, &n);
         copy = (ret == 1);
         trace_commit_one_iteration(s, offset, n, ret);
@@ -269,15 +273,35 @@ void commit_start(const char *job_id, BlockDriverState *bs,
     CommitBlockJob *s;
     BlockDriverState *iter;
     BlockDriverState *commit_top_bs = NULL;
+    BlockDriverState *filtered_base;
     Error *local_err = NULL;
+    int64_t base_size, top_size;
+    uint64_t perms, iter_shared_perms;
     int ret;
 
     assert(top != bs);
-    if (top == base) {
+    if (bdrv_skip_rw_filters(top) == bdrv_skip_rw_filters(base)) {
         error_setg(errp, "Invalid files for merge: top and base are the same");
         return;
     }
 
+    base_size = bdrv_getlength(base);
+    if (base_size < 0) {
+        error_setg_errno(errp, -base_size, "Could not inquire base image size");
+        return;
+    }
+
+    top_size = bdrv_getlength(top);
+    if (top_size < 0) {
+        error_setg_errno(errp, -top_size, "Could not inquire top image size");
+        return;
+    }
+
+    perms = BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE;
+    if (base_size < top_size) {
+        perms |= BLK_PERM_RESIZE;
+    }
+
     s = block_job_create(job_id, &commit_job_driver, NULL, bs, 0, BLK_PERM_ALL,
                          speed, creation_flags, NULL, NULL, errp);
     if (!s) {
@@ -313,17 +337,43 @@ void commit_start(const char *job_id, BlockDriverState *bs,
 
     s->commit_top_bs = commit_top_bs;
 
-    /* Block all nodes between top and base, because they will
-     * disappear from the chain after this operation. */
+    /*
+     * Block all nodes between top and base, because they will
+     * disappear from the chain after this operation.
+     * Note that this assumes that the user is fine with removing all
+     * nodes (including R/W filters) between top and base.  Ensuring
+     * this is the responsibility of the interface (i.e. whoever calls
+     * commit_start()).
+     */
     assert(bdrv_chain_contains(top, base));
-    for (iter = top; iter != base; iter = backing_bs(iter)) {
-        /* XXX BLK_PERM_WRITE needs to be allowed so we don't block ourselves
-         * at s->base (if writes are blocked for a node, they are also blocked
-         * for its backing file). The other options would be a second filter
-         * driver above s->base. */
+
+    /*
+     * The topmost node with
+     * bdrv_skip_rw_filters(filtered_base) == bdrv_skip_rw_filters(base)
+     */
+    filtered_base = bdrv_filtered_cow_bs(bdrv_find_overlay(top, base));
+
+    assert(bdrv_skip_rw_filters(filtered_base) == bdrv_skip_rw_filters(base));
+
+    /*
+     * XXX BLK_PERM_WRITE needs to be allowed so we don't block ourselves
+     * at s->base (if writes are blocked for a node, they are also blocked
+     * for its backing file). The other options would be a second filter
+     * driver above s->base.
+     */
+    iter_shared_perms = BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE;
+
+    for (iter = top; iter != base; iter = bdrv_filtered_bs(iter)) {
+        if (iter == filtered_base) {
+            /*
+             * From here on, all nodes are filters on the base.  This
+             * allows us to share BLK_PERM_CONSISTENT_READ.
+             */
+            iter_shared_perms |= BLK_PERM_CONSISTENT_READ;
+        }
+
         ret = block_job_add_bdrv(&s->common, "intermediate node", iter, 0,
-                                 BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE,
-                                 errp);
+                                 iter_shared_perms, errp);
         if (ret < 0) {
             goto fail;
         }
@@ -340,9 +390,7 @@ void commit_start(const char *job_id, BlockDriverState *bs,
     }
 
     s->base = blk_new(s->common.job.aio_context,
-                      BLK_PERM_CONSISTENT_READ
-                      | BLK_PERM_WRITE
-                      | BLK_PERM_RESIZE,
+                      perms,
                       BLK_PERM_CONSISTENT_READ
                       | BLK_PERM_GRAPH_MOD
                       | BLK_PERM_WRITE_UNCHANGED);
@@ -408,19 +456,22 @@ int bdrv_commit(BlockDriverState *bs)
     if (!drv)
         return -ENOMEDIUM;
 
-    if (!bs->backing) {
+    backing_file_bs = bdrv_filtered_cow_bs(bs);
+
+    if (!backing_file_bs) {
         return -ENOTSUP;
     }
 
     if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_COMMIT_SOURCE, NULL) ||
-        bdrv_op_is_blocked(bs->backing->bs, BLOCK_OP_TYPE_COMMIT_TARGET, NULL)) {
+        bdrv_op_is_blocked(backing_file_bs, BLOCK_OP_TYPE_COMMIT_TARGET, NULL))
+    {
         return -EBUSY;
     }
 
-    ro = bs->backing->bs->read_only;
+    ro = backing_file_bs->read_only;
 
     if (ro) {
-        if (bdrv_reopen_set_read_only(bs->backing->bs, false, NULL)) {
+        if (bdrv_reopen_set_read_only(backing_file_bs, false, NULL)) {
             return -EACCES;
         }
     }
@@ -436,8 +487,6 @@ int bdrv_commit(BlockDriverState *bs)
     }
 
     /* Insert commit_top block node above backing, so we can write to it */
-    backing_file_bs = backing_bs(bs);
-
     commit_top_bs = bdrv_new_open_driver(&bdrv_commit_top, NULL, BDRV_O_RDWR,
                                          &local_err);
     if (commit_top_bs == NULL) {
@@ -522,15 +571,13 @@ ro_cleanup:
     qemu_vfree(buf);
 
     blk_unref(backing);
-    if (backing_file_bs) {
-        bdrv_set_backing_hd(bs, backing_file_bs, &error_abort);
-    }
+    bdrv_set_backing_hd(bs, backing_file_bs, &error_abort);
     bdrv_unref(commit_top_bs);
     blk_unref(src);
 
     if (ro) {
         /* ignoring error return here */
-        bdrv_reopen_set_read_only(bs->backing->bs, true, NULL);
+        bdrv_reopen_set_read_only(backing_file_bs, true, NULL);
     }
 
     return ret;
diff --git a/blockdev.c b/blockdev.c
index 605e7b0994..5036d064d4 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1095,7 +1095,7 @@ void hmp_commit(Monitor *mon, const QDict *qdict)
             return;
         }
 
-        bs = blk_bs(blk);
+        bs = bdrv_skip_implicit_filters(blk_bs(blk));
         aio_context = bdrv_get_aio_context(bs);
         aio_context_acquire(aio_context);
 
@@ -3392,7 +3392,9 @@ void qmp_block_commit(bool has_job_id, const char *job_id, const char *device,
 
     assert(bdrv_get_aio_context(base_bs) == aio_context);
 
-    for (iter = top_bs; iter != backing_bs(base_bs); iter = backing_bs(iter)) {
+    for (iter = top_bs; iter != bdrv_filtered_bs(base_bs);
+         iter = bdrv_filtered_bs(iter))
+    {
         if (bdrv_op_is_blocked(iter, BLOCK_OP_TYPE_COMMIT_TARGET, errp)) {
             goto out;
         }
-- 
2.21.0




* [Qemu-devel] [PATCH v5 28/42] stream: Deal with filters
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
                   ` (26 preceding siblings ...)
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 27/42] commit: " Max Reitz
@ 2019-06-12 22:09 ` Max Reitz
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 29/42] nbd: Use CAF when looking for dirty bitmap Max Reitz
                   ` (14 subsequent siblings)
  42 siblings, 0 replies; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:09 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 qapi/block-core.json |  4 ++++
 block/stream.c       | 23 +++++++++++++++--------
 blockdev.c           |  2 +-
 3 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index df52a90736..a3c5298cf5 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2518,6 +2518,10 @@
 # On successful completion the image file is updated to drop the backing file
 # and the BLOCK_JOB_COMPLETED event is emitted.
 #
+# In case @device is a filter node, block-stream modifies the first non-filter
+# overlay node below it to point to @base (or to have no backing file if @base
+# was not specified) instead of modifying @device itself.
+#
 # @job-id: identifier for the newly-created block job. If
 #          omitted, the device name will be used. (Since 2.7)
 #
diff --git a/block/stream.c b/block/stream.c
index 1a906fd860..9271e1821a 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -63,6 +63,7 @@ static int stream_prepare(Job *job)
     StreamBlockJob *s = container_of(job, StreamBlockJob, common.job);
     BlockJob *bjob = &s->common;
     BlockDriverState *bs = blk_bs(bjob->blk);
+    BlockDriverState *unfiltered_bs = bdrv_skip_rw_filters(bs);
     BlockDriverState *base = s->base;
     Error *local_err = NULL;
     int ret = 0;
@@ -70,7 +71,7 @@ static int stream_prepare(Job *job)
     bdrv_unfreeze_backing_chain(bs, base);
     s->chain_frozen = false;
 
-    if (bs->backing) {
+    if (bdrv_filtered_cow_child(unfiltered_bs)) {
         const char *base_id = NULL, *base_fmt = NULL;
         if (base) {
             base_id = s->backing_file_str;
@@ -78,8 +79,8 @@ static int stream_prepare(Job *job)
                 base_fmt = base->drv->format_name;
             }
         }
-        ret = bdrv_change_backing_file(bs, base_id, base_fmt);
-        bdrv_set_backing_hd(bs, base, &local_err);
+        ret = bdrv_change_backing_file(unfiltered_bs, base_id, base_fmt);
+        bdrv_set_backing_hd(unfiltered_bs, base, &local_err);
         if (local_err) {
             error_report_err(local_err);
             return -EPERM;
@@ -110,7 +111,9 @@ static int coroutine_fn stream_run(Job *job, Error **errp)
     StreamBlockJob *s = container_of(job, StreamBlockJob, common.job);
     BlockBackend *blk = s->common.blk;
     BlockDriverState *bs = blk_bs(blk);
+    BlockDriverState *unfiltered_bs = bdrv_skip_rw_filters(bs);
     BlockDriverState *base = s->base;
+    BlockDriverState *filtered_base;
     int64_t len;
     int64_t offset = 0;
     uint64_t delay_ns = 0;
@@ -119,10 +122,12 @@ static int coroutine_fn stream_run(Job *job, Error **errp)
     int64_t n = 0; /* bytes */
     void *buf;
 
-    if (!bs->backing) {
+    if (!bdrv_filtered_cow_child(unfiltered_bs)) {
         goto out;
     }
 
+    filtered_base = bdrv_filtered_cow_bs(bdrv_find_overlay(bs, base));
+
     len = bdrv_getlength(bs);
     if (len < 0) {
         ret = len;
@@ -154,14 +159,14 @@ static int coroutine_fn stream_run(Job *job, Error **errp)
 
         copy = false;
 
-        ret = bdrv_is_allocated(bs, offset, STREAM_BUFFER_SIZE, &n);
+        ret = bdrv_is_allocated(unfiltered_bs, offset, STREAM_BUFFER_SIZE, &n);
         if (ret == 1) {
             /* Allocated in the top, no need to copy.  */
         } else if (ret >= 0) {
             /* Copy if allocated in the intermediate images.  Limit to the
              * known-unallocated area [offset, offset+n*BDRV_SECTOR_SIZE).  */
-            ret = bdrv_is_allocated_above(backing_bs(bs), base,
-                                          offset, n, &n);
+            ret = bdrv_is_allocated_above(bdrv_filtered_cow_bs(unfiltered_bs),
+                                          filtered_base, offset, n, &n);
 
             /* Finish early if end of backing file has been reached */
             if (ret == 0 && n == 0) {
@@ -266,7 +271,9 @@ void stream_start(const char *job_id, BlockDriverState *bs,
      * disappear from the chain after this operation. The streaming job reads
      * every block only once, assuming that it doesn't change, so block writes
      * and resizes. */
-    for (iter = backing_bs(bs); iter && iter != base; iter = backing_bs(iter)) {
+    for (iter = bdrv_filtered_bs(bs); iter && iter != base;
+         iter = bdrv_filtered_bs(iter))
+    {
         block_job_add_bdrv(&s->common, "intermediate node", iter, 0,
                            BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED,
                            &error_abort);
diff --git a/blockdev.c b/blockdev.c
index 5036d064d4..a464cabf9e 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3235,7 +3235,7 @@ void qmp_block_stream(bool has_job_id, const char *job_id, const char *device,
     }
 
     /* Check for op blockers in the whole chain between bs and base */
-    for (iter = bs; iter && iter != base_bs; iter = backing_bs(iter)) {
+    for (iter = bs; iter && iter != base_bs; iter = bdrv_filtered_bs(iter)) {
         if (bdrv_op_is_blocked(iter, BLOCK_OP_TYPE_STREAM, errp)) {
             goto out;
         }
-- 
2.21.0




* [Qemu-devel] [PATCH v5 29/42] nbd: Use CAF when looking for dirty bitmap
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
                   ` (27 preceding siblings ...)
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 28/42] stream: " Max Reitz
@ 2019-06-12 22:09 ` Max Reitz
  2019-06-18 13:58   ` Vladimir Sementsov-Ogievskiy
  2019-06-18 14:48   ` Eric Blake
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 30/42] qemu-img: Use child access functions Max Reitz
                   ` (13 subsequent siblings)
  42 siblings, 2 replies; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:09 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

When looking for a dirty bitmap to share, we should handle filters by
just including them in the search (so they do not break backing chains).
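
The search can be pictured with a toy model (a sketch only; Node and
find_bitmap_owner() are hypothetical and merely mimic the loop over
bdrv_find_dirty_bitmap() and bdrv_filtered_bs() in nbd_export_new()):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model of a block node; this is not QEMU's BlockDriverState. */
typedef struct Node {
    const char *bitmap_name;   /* NULL if the node owns no dirty bitmap */
    struct Node *filtered;     /* filtered or COW backing child, or NULL */
} Node;

/* Return the first node at or below @bs that owns a bitmap named @name. */
static Node *find_bitmap_owner(Node *bs, const char *name)
{
    assert(name != NULL);

    while (bs) {
        if (bs->bitmap_name && strcmp(bs->bitmap_name, name) == 0) {
            return bs;
        }
        /* Step through filters and backing links alike. */
        bs = bs->filtered;
    }
    return NULL;
}
```

Because the loop condition is the node pointer itself, reaching the end
of the chain without a match simply yields NULL.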

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 nbd/server.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index aeca3893fe..0d51d46b81 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -1508,13 +1508,13 @@ NBDExport *nbd_export_new(BlockDriverState *bs, uint64_t dev_offset,
     if (bitmap) {
         BdrvDirtyBitmap *bm = NULL;
 
-        while (true) {
+        while (bs) {
             bm = bdrv_find_dirty_bitmap(bs, bitmap);
-            if (bm != NULL || bs->backing == NULL) {
+            if (bm != NULL) {
                 break;
             }
 
-            bs = bs->backing->bs;
+            bs = bdrv_filtered_bs(bs);
         }
 
         if (bm == NULL) {
-- 
2.21.0




* [Qemu-devel] [PATCH v5 30/42] qemu-img: Use child access functions
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
                   ` (28 preceding siblings ...)
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 29/42] nbd: Use CAF when looking for dirty bitmap Max Reitz
@ 2019-06-12 22:09 ` Max Reitz
  2019-06-19  9:18   ` Vladimir Sementsov-Ogievskiy
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 31/42] block: Drop backing_bs() Max Reitz
                   ` (12 subsequent siblings)
  42 siblings, 1 reply; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:09 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

This changes iotest 204's output, because blkdebug on top of a COW node
used to make qemu-img map disregard the rest of the backing chain (the
backing chain was broken by the filter).  With this patch, the
allocation in the base image is reported correctly.
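
To illustrate why the filter used to break the chain, here is a toy
model (a sketch under simplifying assumptions; Img, next_by_backing()
and next_filtered() are hypothetical, and the real helpers such as
bdrv_filtered_bs() may treat more cases than this):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a node; this is not QEMU's BlockDriverState. */
typedef struct Img {
    bool is_filter;        /* e.g. a blkdebug-like node */
    struct Img *file;      /* child a filter passes requests to */
    struct Img *backing;   /* COW backing child of a format node */
} Img;

/* Old-style step: only ->backing counts, so a filter ends the chain. */
static Img *next_by_backing(Img *n)
{
    return n->backing;
}

/* Filter-aware step: follow the filtered child of a filter node. */
static Img *next_filtered(Img *n)
{
    return n->is_filter ? n->file : n->backing;
}

/* Number of nodes visited when walking down from @top using @step. */
static int chain_length(Img *top, Img *(*step)(Img *))
{
    int len = 0;

    assert(step != NULL);
    for (Img *n = top; n != NULL; n = step(n)) {
        len++;
    }
    return len;
}
```

For a filter on top of an overlay with one backing file, the old step
sees only the filter, while the filter-aware step reaches the base.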

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 qemu-img.c                 | 36 ++++++++++++++++++++----------------
 tests/qemu-iotests/204.out |  1 +
 2 files changed, 21 insertions(+), 16 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index 07b6e2a808..7bfa6e5d40 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -992,7 +992,7 @@ static int img_commit(int argc, char **argv)
     if (!blk) {
         return 1;
     }
-    bs = blk_bs(blk);
+    bs = bdrv_skip_implicit_filters(blk_bs(blk));
 
     qemu_progress_init(progress, 1.f);
     qemu_progress_print(0.f, 100);
@@ -1009,7 +1009,7 @@ static int img_commit(int argc, char **argv)
         /* This is different from QMP, which by default uses the deepest file in
          * the backing chain (i.e., the very base); however, the traditional
          * behavior of qemu-img commit is using the immediate backing file. */
-        base_bs = backing_bs(bs);
+        base_bs = bdrv_filtered_cow_bs(bs);
         if (!base_bs) {
             error_setg(&local_err, "Image does not have a backing file");
             goto done;
@@ -1626,19 +1626,18 @@ static int convert_iteration_sectors(ImgConvertState *s, int64_t sector_num)
 
     if (s->sector_next_status <= sector_num) {
         int64_t count = n * BDRV_SECTOR_SIZE;
+        BlockDriverState *src_bs = blk_bs(s->src[src_cur]);
+        BlockDriverState *base;
 
         if (s->target_has_backing) {
-
-            ret = bdrv_block_status(blk_bs(s->src[src_cur]),
-                                    (sector_num - src_cur_offset) *
-                                    BDRV_SECTOR_SIZE,
-                                    count, &count, NULL, NULL);
+            base = bdrv_backing_chain_next(src_bs);
         } else {
-            ret = bdrv_block_status_above(blk_bs(s->src[src_cur]), NULL,
-                                          (sector_num - src_cur_offset) *
-                                          BDRV_SECTOR_SIZE,
-                                          count, &count, NULL, NULL);
+            base = NULL;
         }
+        ret = bdrv_block_status_above(src_bs, base,
+                                      (sector_num - src_cur_offset) *
+                                      BDRV_SECTOR_SIZE,
+                                      count, &count, NULL, NULL);
         if (ret < 0) {
             error_report("error while reading block status of sector %" PRId64
                          ": %s", sector_num, strerror(-ret));
@@ -2439,7 +2438,8 @@ static int img_convert(int argc, char **argv)
          * s.target_backing_sectors has to be negative, which it will
          * be automatically).  The backing file length is used only
          * for optimizations, so such a case is not fatal. */
-        s.target_backing_sectors = bdrv_nb_sectors(out_bs->backing->bs);
+        s.target_backing_sectors =
+            bdrv_nb_sectors(bdrv_filtered_cow_bs(out_bs));
     } else {
         s.target_backing_sectors = -1;
     }
@@ -2802,6 +2802,7 @@ static int get_block_status(BlockDriverState *bs, int64_t offset,
 
     depth = 0;
     for (;;) {
+        bs = bdrv_skip_rw_filters(bs);
         ret = bdrv_block_status(bs, offset, bytes, &bytes, &map, &file);
         if (ret < 0) {
             return ret;
@@ -2810,7 +2811,7 @@ static int get_block_status(BlockDriverState *bs, int64_t offset,
         if (ret & (BDRV_BLOCK_ZERO|BDRV_BLOCK_DATA)) {
             break;
         }
-        bs = backing_bs(bs);
+        bs = bdrv_filtered_cow_bs(bs);
         if (bs == NULL) {
             ret = 0;
             break;
@@ -2949,7 +2950,7 @@ static int img_map(int argc, char **argv)
     if (!blk) {
         return 1;
     }
-    bs = blk_bs(blk);
+    bs = bdrv_skip_implicit_filters(blk_bs(blk));
 
     if (output_format == OFORMAT_HUMAN) {
         printf("%-16s%-16s%-16s%s\n", "Offset", "Length", "Mapped to", "File");
@@ -3165,6 +3166,7 @@ static int img_rebase(int argc, char **argv)
     uint8_t *buf_old = NULL;
     uint8_t *buf_new = NULL;
     BlockDriverState *bs = NULL, *prefix_chain_bs = NULL;
+    BlockDriverState *unfiltered_bs;
     char *filename;
     const char *fmt, *cache, *src_cache, *out_basefmt, *out_baseimg;
     int c, flags, src_flags, ret;
@@ -3299,6 +3301,8 @@ static int img_rebase(int argc, char **argv)
     }
     bs = blk_bs(blk);
 
+    unfiltered_bs = bdrv_skip_rw_filters(bs);
+
     if (out_basefmt != NULL) {
         if (bdrv_find_format(out_basefmt) == NULL) {
             error_report("Invalid format name: '%s'", out_basefmt);
@@ -3310,7 +3314,7 @@ static int img_rebase(int argc, char **argv)
     /* For safe rebasing we need to compare old and new backing file */
     if (!unsafe) {
         QDict *options = NULL;
-        BlockDriverState *base_bs = backing_bs(bs);
+        BlockDriverState *base_bs = bdrv_filtered_cow_bs(unfiltered_bs);
 
         if (base_bs) {
             blk_old_backing = blk_new(qemu_get_aio_context(),
@@ -3463,7 +3467,7 @@ static int img_rebase(int argc, char **argv)
                  * If cluster wasn't changed since prefix_chain, we don't need
                  * to take action
                  */
-                ret = bdrv_is_allocated_above(backing_bs(bs), prefix_chain_bs,
+                ret = bdrv_is_allocated_above(unfiltered_bs, prefix_chain_bs,
                                               offset, n, &n);
                 if (ret < 0) {
                     error_report("error while reading image metadata: %s",
diff --git a/tests/qemu-iotests/204.out b/tests/qemu-iotests/204.out
index f3a10fbe90..684774d763 100644
--- a/tests/qemu-iotests/204.out
+++ b/tests/qemu-iotests/204.out
@@ -59,5 +59,6 @@ Offset          Length          File
 0x900000        0x2400000       TEST_DIR/t.IMGFMT
 0x3c00000       0x1100000       TEST_DIR/t.IMGFMT
 0x6a00000       0x400000        TEST_DIR/t.IMGFMT
+0x6e00000       0x1200000       TEST_DIR/t.IMGFMT.base
 No errors were found on the image.
 *** done
-- 
2.21.0




* [Qemu-devel] [PATCH v5 31/42] block: Drop backing_bs()
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
                   ` (29 preceding siblings ...)
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 30/42] qemu-img: Use child access functions Max Reitz
@ 2019-06-12 22:09 ` Max Reitz
  2019-06-19  9:18   ` Vladimir Sementsov-Ogievskiy
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 32/42] block: Make bdrv_get_cumulative_perm() public Max Reitz
                   ` (11 subsequent siblings)
  42 siblings, 1 reply; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:09 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

We want to make it explicit where bs->backing is used, and we have done
so.  The old role of backing_bs() is now effectively taken by
bdrv_filtered_cow_bs().

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 include/block/block_int.h | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index 875a33f255..c0a05beec3 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -925,11 +925,6 @@ typedef enum BlockMirrorBackingMode {
     MIRROR_LEAVE_BACKING_CHAIN,
 } BlockMirrorBackingMode;
 
-static inline BlockDriverState *backing_bs(BlockDriverState *bs)
-{
-    return bs->backing ? bs->backing->bs : NULL;
-}
-
 
 /* Essential block drivers which must always be statically linked into qemu, and
  * which therefore can be accessed without using bdrv_find_format() */
-- 
2.21.0




* [Qemu-devel] [PATCH v5 32/42] block: Make bdrv_get_cumulative_perm() public
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
                   ` (30 preceding siblings ...)
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 31/42] block: Drop backing_bs() Max Reitz
@ 2019-06-12 22:09 ` Max Reitz
  2019-06-19  9:19   ` Vladimir Sementsov-Ogievskiy
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 33/42] blockdev: Fix active commit choice Max Reitz
                   ` (10 subsequent siblings)
  42 siblings, 1 reply; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:09 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

This is useful in other files, such as blockdev.c, for example to
determine whether a node can be written to.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 include/block/block_int.h | 3 +++
 block.c                   | 6 ++----
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index c0a05beec3..cfefb00104 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -1181,6 +1181,9 @@ void bdrv_root_unref_child(BdrvChild *child);
 int bdrv_child_try_set_perm(BdrvChild *c, uint64_t perm, uint64_t shared,
                             Error **errp);
 
+void bdrv_get_cumulative_perm(BlockDriverState *bs,
+                              uint64_t *perm, uint64_t *shared_perm);
+
 /* Default implementation for BlockDriver.bdrv_child_perm() that can be used by
  * block filters: Forward CONSISTENT_READ, WRITE, WRITE_UNCHANGED and RESIZE to
  * all children */
diff --git a/block.c b/block.c
index 856d9b58be..59d1d4b2b1 100644
--- a/block.c
+++ b/block.c
@@ -1711,8 +1711,6 @@ static int bdrv_child_check_perm(BdrvChild *c, BlockReopenQueue *q,
                                  GSList *ignore_children, Error **errp);
 static void bdrv_child_abort_perm_update(BdrvChild *c);
 static void bdrv_child_set_perm(BdrvChild *c, uint64_t perm, uint64_t shared);
-static void bdrv_get_cumulative_perm(BlockDriverState *bs, uint64_t *perm,
-                                     uint64_t *shared_perm);
 
 typedef struct BlockReopenQueueEntry {
      bool prepared;
@@ -1904,8 +1902,8 @@ static void bdrv_set_perm(BlockDriverState *bs, uint64_t cumulative_perms,
     }
 }
 
-static void bdrv_get_cumulative_perm(BlockDriverState *bs, uint64_t *perm,
-                                     uint64_t *shared_perm)
+void bdrv_get_cumulative_perm(BlockDriverState *bs,
+                              uint64_t *perm, uint64_t *shared_perm)
 {
     BdrvChild *c;
     uint64_t cumulative_perms = 0;
-- 
2.21.0




* [Qemu-devel] [PATCH v5 33/42] blockdev: Fix active commit choice
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
                   ` (31 preceding siblings ...)
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 32/42] block: Make bdrv_get_cumulative_perm() public Max Reitz
@ 2019-06-12 22:09 ` Max Reitz
  2019-06-19  9:31   ` Vladimir Sementsov-Ogievskiy
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 34/42] block: Inline bdrv_co_block_status_from_*() Max Reitz
                   ` (9 subsequent siblings)
  42 siblings, 1 reply; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:09 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

We have to perform an active commit whenever the top node has a parent
that has taken the WRITE permission on it.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
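Illustration only, not part of the patch: a minimal sketch of the
decision rule on a toy node type.  ToyNode, TOY_PERM_WRITE and the
toy_*() helpers are hypothetical; toy_skip_filters() only approximates
what bdrv_skip_rw_filters() is assumed to do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical permission bit, standing in for BLK_PERM_WRITE. */
#define TOY_PERM_WRITE (UINT64_C(1) << 1)

/* Toy stand-in for a block node; this is NOT QEMU's BlockDriverState. */
typedef struct ToyNode {
    bool is_filter;
    struct ToyNode *child;     /* filtered child or backing node */
} ToyNode;

/* Descend until the first node that is not a filter (or NULL). */
static const ToyNode *toy_skip_filters(const ToyNode *node)
{
    while (node && node->is_filter) {
        node = node->child;
    }
    return node;
}

/*
 * Active commit is chosen if some parent holds WRITE on @top, or if
 * @top and @active are the same node once filters are skipped.
 */
static bool toy_needs_active_commit(const ToyNode *active,
                                    const ToyNode *top,
                                    uint64_t top_perm)
{
    assert(active && top);

    return (top_perm & TOY_PERM_WRITE) ||
           toy_skip_filters(top) == toy_skip_filters(active);
}
```

The second condition keeps the historical behavior of always using
active commit for the top node, even if nobody writes to it.
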
 blockdev.c | 24 +++++++++++++++++++++---
 1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index a464cabf9e..5370f3b738 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3294,6 +3294,7 @@ void qmp_block_commit(bool has_job_id, const char *job_id, const char *device,
      */
     BlockdevOnError on_error = BLOCKDEV_ON_ERROR_REPORT;
     int job_flags = JOB_DEFAULT;
+    uint64_t top_perm, top_shared;
 
     if (!has_speed) {
         speed = 0;
@@ -3406,14 +3407,31 @@ void qmp_block_commit(bool has_job_id, const char *job_id, const char *device,
         goto out;
     }
 
-    if (top_bs == bs) {
+    /*
+     * Active commit is required if and only if someone has taken a
+     * WRITE permission on the top node.  Historically, we have always
+     * used active commit for top nodes, so continue that practice.
+     * (Active commit is never really wrong.)
+     */
+    bdrv_get_cumulative_perm(top_bs, &top_perm, &top_shared);
+    if (top_perm & BLK_PERM_WRITE ||
+        bdrv_skip_rw_filters(top_bs) == bdrv_skip_rw_filters(bs))
+    {
         if (has_backing_file) {
             error_setg(errp, "'backing-file' specified,"
                              " but 'top' is the active layer");
             goto out;
         }
-        commit_active_start(has_job_id ? job_id : NULL, bs, base_bs,
-                            job_flags, speed, on_error,
+        if (!has_job_id) {
+            /*
+             * Emulate here what block_job_create() does, because it
+             * is possible that @bs != @top_bs (the block job should
+             * be named after @bs, even if @top_bs is the actual
+             * source)
+             */
+            job_id = bdrv_get_device_name(bs);
+        }
+        commit_active_start(job_id, top_bs, base_bs, job_flags, speed, on_error,
                             filter_node_name, NULL, NULL, false, &local_err);
     } else {
         BlockDriverState *overlay_bs = bdrv_find_overlay(bs, top_bs);
-- 
2.21.0




* [Qemu-devel] [PATCH v5 34/42] block: Inline bdrv_co_block_status_from_*()
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
                   ` (32 preceding siblings ...)
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 33/42] blockdev: Fix active commit choice Max Reitz
@ 2019-06-12 22:09 ` Max Reitz
  2019-06-19  9:34   ` Vladimir Sementsov-Ogievskiy
  2019-06-21 13:39   ` Vladimir Sementsov-Ogievskiy
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 35/42] block: Fix check_to_replace_node() Max Reitz
                   ` (8 subsequent siblings)
  42 siblings, 2 replies; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:09 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

With bdrv_filtered_rw_bs(), we can easily handle this default filter
behavior in bdrv_co_block_status().

blkdebug has an additional assertion of its own, so it keeps its own
implementation; the body of bdrv_co_block_status_from_file() is simply
inlined there.

Suggested-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
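Illustration only, not part of the patch: a minimal sketch of the
dispatch described above on a toy node type.  All toy_*/TOY_* names
and flag values are hypothetical, and request alignment is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values; they only mirror the roles of QEMU's flags. */
#define TOY_BLOCK_DATA          0x01
#define TOY_BLOCK_OFFSET_VALID  0x04
#define TOY_BLOCK_RAW           0x08

typedef struct ToyNode ToyNode;

typedef int (*ToyStatusFn)(ToyNode *bs, int64_t offset, int64_t bytes,
                           int64_t *pnum, int64_t *map, ToyNode **file);

/* Toy stand-in for a block node; this is NOT QEMU's BlockDriverState. */
struct ToyNode {
    bool is_filter;
    ToyNode *filtered_child;   /* node this filter passes requests to */
    ToyStatusFn block_status;  /* driver callback, may be NULL */
};

static int toy_block_status(ToyNode *bs, int64_t offset, int64_t bytes,
                            int64_t *pnum, int64_t *map, ToyNode **file)
{
    assert(bs && pnum && map && file);

    if (bs->block_status) {
        /* The driver knows best */
        return bs->block_status(bs, offset, bytes, pnum, map, file);
    }

    if (bs->is_filter && bs->filtered_child) {
        /* Default for filters: defer to the filtered child 1:1 */
        *pnum = bytes;
        *map = offset;
        *file = bs->filtered_child;
        return TOY_BLOCK_RAW | TOY_BLOCK_OFFSET_VALID;
    }

    /* No callback and not a filter: treat the whole range as data */
    *pnum = bytes;
    *file = NULL;
    return TOY_BLOCK_DATA;
}
```

A filter without its own callback thus behaves as the removed
bdrv_co_block_status_from_*() helpers did, without each driver having
to name one of them.
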
 include/block/block_int.h | 22 -----------------
 block/blkdebug.c          |  7 ++++--
 block/blklogwrites.c      |  1 -
 block/commit.c            |  1 -
 block/copy-on-read.c      |  2 --
 block/io.c                | 51 +++++++++++++--------------------------
 block/mirror.c            |  1 -
 block/throttle.c          |  1 -
 8 files changed, 22 insertions(+), 64 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index cfefb00104..431fa38ea0 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -1203,28 +1203,6 @@ void bdrv_format_default_perms(BlockDriverState *bs, BdrvChild *c,
                                uint64_t perm, uint64_t shared,
                                uint64_t *nperm, uint64_t *nshared);
 
-/*
- * Default implementation for drivers to pass bdrv_co_block_status() to
- * their file.
- */
-int coroutine_fn bdrv_co_block_status_from_file(BlockDriverState *bs,
-                                                bool want_zero,
-                                                int64_t offset,
-                                                int64_t bytes,
-                                                int64_t *pnum,
-                                                int64_t *map,
-                                                BlockDriverState **file);
-/*
- * Default implementation for drivers to pass bdrv_co_block_status() to
- * their backing file.
- */
-int coroutine_fn bdrv_co_block_status_from_backing(BlockDriverState *bs,
-                                                   bool want_zero,
-                                                   int64_t offset,
-                                                   int64_t bytes,
-                                                   int64_t *pnum,
-                                                   int64_t *map,
-                                                   BlockDriverState **file);
 const char *bdrv_get_parent_name(const BlockDriverState *bs);
 void blk_dev_change_media_cb(BlockBackend *blk, bool load, Error **errp);
 bool blk_dev_has_removable_media(BlockBackend *blk);
diff --git a/block/blkdebug.c b/block/blkdebug.c
index efd9441625..7950ae729c 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -637,8 +637,11 @@ static int coroutine_fn blkdebug_co_block_status(BlockDriverState *bs,
                                                  BlockDriverState **file)
 {
     assert(QEMU_IS_ALIGNED(offset | bytes, bs->bl.request_alignment));
-    return bdrv_co_block_status_from_file(bs, want_zero, offset, bytes,
-                                          pnum, map, file);
+    assert(bs->file && bs->file->bs);
+    *pnum = bytes;
+    *map = offset;
+    *file = bs->file->bs;
+    return BDRV_BLOCK_RAW | BDRV_BLOCK_OFFSET_VALID;
 }
 
 static void blkdebug_close(BlockDriverState *bs)
diff --git a/block/blklogwrites.c b/block/blklogwrites.c
index eb2b4901a5..1eb4a5c613 100644
--- a/block/blklogwrites.c
+++ b/block/blklogwrites.c
@@ -518,7 +518,6 @@ static BlockDriver bdrv_blk_log_writes = {
     .bdrv_co_pwrite_zeroes  = blk_log_writes_co_pwrite_zeroes,
     .bdrv_co_flush_to_disk  = blk_log_writes_co_flush_to_disk,
     .bdrv_co_pdiscard       = blk_log_writes_co_pdiscard,
-    .bdrv_co_block_status   = bdrv_co_block_status_from_file,
 
     .is_filter              = true,
     .strong_runtime_opts    = blk_log_writes_strong_runtime_opts,
diff --git a/block/commit.c b/block/commit.c
index ec5a8c8edf..a5b58eadeb 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -257,7 +257,6 @@ static void bdrv_commit_top_child_perm(BlockDriverState *bs, BdrvChild *c,
 static BlockDriver bdrv_commit_top = {
     .format_name                = "commit_top",
     .bdrv_co_preadv             = bdrv_commit_top_preadv,
-    .bdrv_co_block_status       = bdrv_co_block_status_from_backing,
     .bdrv_refresh_filename      = bdrv_commit_top_refresh_filename,
     .bdrv_child_perm            = bdrv_commit_top_child_perm,
 
diff --git a/block/copy-on-read.c b/block/copy-on-read.c
index 88e1c1f538..5a292de000 100644
--- a/block/copy-on-read.c
+++ b/block/copy-on-read.c
@@ -161,8 +161,6 @@ static BlockDriver bdrv_copy_on_read = {
     .bdrv_eject                         = cor_eject,
     .bdrv_lock_medium                   = cor_lock_medium,
 
-    .bdrv_co_block_status               = bdrv_co_block_status_from_file,
-
     .bdrv_recurse_is_first_non_filter   = cor_recurse_is_first_non_filter,
 
     .has_variable_length                = true,
diff --git a/block/io.c b/block/io.c
index 14f99e1c00..0a832e30a3 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1998,36 +1998,6 @@ typedef struct BdrvCoBlockStatusData {
     bool done;
 } BdrvCoBlockStatusData;
 
-int coroutine_fn bdrv_co_block_status_from_file(BlockDriverState *bs,
-                                                bool want_zero,
-                                                int64_t offset,
-                                                int64_t bytes,
-                                                int64_t *pnum,
-                                                int64_t *map,
-                                                BlockDriverState **file)
-{
-    assert(bs->file && bs->file->bs);
-    *pnum = bytes;
-    *map = offset;
-    *file = bs->file->bs;
-    return BDRV_BLOCK_RAW | BDRV_BLOCK_OFFSET_VALID;
-}
-
-int coroutine_fn bdrv_co_block_status_from_backing(BlockDriverState *bs,
-                                                   bool want_zero,
-                                                   int64_t offset,
-                                                   int64_t bytes,
-                                                   int64_t *pnum,
-                                                   int64_t *map,
-                                                   BlockDriverState **file)
-{
-    assert(bs->backing && bs->backing->bs);
-    *pnum = bytes;
-    *map = offset;
-    *file = bs->backing->bs;
-    return BDRV_BLOCK_RAW | BDRV_BLOCK_OFFSET_VALID;
-}
-
 /*
  * Returns the allocation status of the specified sectors.
  * Drivers not implementing the functionality are assumed to not support
@@ -2068,6 +2038,7 @@ static int coroutine_fn bdrv_co_block_status(BlockDriverState *bs,
     BlockDriverState *local_file = NULL;
     int64_t aligned_offset, aligned_bytes;
     uint32_t align;
+    bool has_filtered_child;
 
     assert(pnum);
     *pnum = 0;
@@ -2093,7 +2064,8 @@ static int coroutine_fn bdrv_co_block_status(BlockDriverState *bs,
 
     /* Must be non-NULL or bdrv_getlength() would have failed */
     assert(bs->drv);
-    if (!bs->drv->bdrv_co_block_status) {
+    has_filtered_child = bs->drv->is_filter && bdrv_filtered_rw_child(bs);
+    if (!bs->drv->bdrv_co_block_status && !has_filtered_child) {
         *pnum = bytes;
         ret = BDRV_BLOCK_DATA | BDRV_BLOCK_ALLOCATED;
         if (offset + bytes == total_size) {
@@ -2114,9 +2086,20 @@ static int coroutine_fn bdrv_co_block_status(BlockDriverState *bs,
     aligned_offset = QEMU_ALIGN_DOWN(offset, align);
     aligned_bytes = ROUND_UP(offset + bytes, align) - aligned_offset;
 
-    ret = bs->drv->bdrv_co_block_status(bs, want_zero, aligned_offset,
-                                        aligned_bytes, pnum, &local_map,
-                                        &local_file);
+    if (bs->drv->bdrv_co_block_status) {
+        ret = bs->drv->bdrv_co_block_status(bs, want_zero, aligned_offset,
+                                            aligned_bytes, pnum, &local_map,
+                                            &local_file);
+    } else {
+        /* Default code for filters */
+
+        local_file = bdrv_filtered_rw_bs(bs);
+        assert(local_file);
+
+        *pnum = aligned_bytes;
+        local_map = aligned_offset;
+        ret = BDRV_BLOCK_RAW | BDRV_BLOCK_OFFSET_VALID;
+    }
     if (ret < 0) {
         *pnum = 0;
         goto out;
diff --git a/block/mirror.c b/block/mirror.c
index 3d767e3030..71bd7f7625 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -1484,7 +1484,6 @@ static BlockDriver bdrv_mirror_top = {
     .bdrv_co_pwrite_zeroes      = bdrv_mirror_top_pwrite_zeroes,
     .bdrv_co_pdiscard           = bdrv_mirror_top_pdiscard,
     .bdrv_co_flush              = bdrv_mirror_top_flush,
-    .bdrv_co_block_status       = bdrv_co_block_status_from_backing,
     .bdrv_refresh_filename      = bdrv_mirror_top_refresh_filename,
     .bdrv_child_perm            = bdrv_mirror_top_child_perm,
 
diff --git a/block/throttle.c b/block/throttle.c
index de1b6bd7e8..32ec56db0f 100644
--- a/block/throttle.c
+++ b/block/throttle.c
@@ -269,7 +269,6 @@ static BlockDriver bdrv_throttle = {
     .bdrv_reopen_prepare                =   throttle_reopen_prepare,
     .bdrv_reopen_commit                 =   throttle_reopen_commit,
     .bdrv_reopen_abort                  =   throttle_reopen_abort,
-    .bdrv_co_block_status               =   bdrv_co_block_status_from_file,
 
     .bdrv_co_drain_begin                =   throttle_co_drain_begin,
     .bdrv_co_drain_end                  =   throttle_co_drain_end,
-- 
2.21.0




* [Qemu-devel] [PATCH v5 35/42] block: Fix check_to_replace_node()
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
                   ` (33 preceding siblings ...)
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 34/42] block: Inline bdrv_co_block_status_from_*() Max Reitz
@ 2019-06-12 22:09 ` Max Reitz
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 36/42] iotests: Add tests for mirror @replaces loops Max Reitz
                   ` (7 subsequent siblings)
  42 siblings, 0 replies; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:09 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

Currently, check_to_replace_node() only allows mirror to replace a node
in the chain of the source node, and only if it is the first non-filter
node below the source.  Well, technically, the idea is that you can
exactly replace a quorum child by mirroring from quorum.

There are (probably) two reasons for this restriction:
(1) We do not want to create loops.
(2) @replaces and @device should have exactly the same content so
    replacing them does not cause visible data to change.

The current check has two issues:
(1) It is overly restrictive.  It is completely fine for @replaces to be
    a filter.
(2) It is not restrictive enough.  You can create loops with this as
    follows:

$ qemu-img create -f qcow2 /tmp/source.qcow2 64M
$ qemu-system-x86_64 -qmp stdio
{"execute": "qmp_capabilities"}
{"execute": "object-add",
 "arguments": {"qom-type": "throttle-group", "id": "tg0"}}
{"execute": "blockdev-add",
 "arguments": {
     "node-name": "source",
     "driver": "throttle",
     "throttle-group": "tg0",
     "file": {
         "node-name": "filtered",
         "driver": "qcow2",
         "file": {
             "driver": "file",
             "filename": "/tmp/source.qcow2"
         } } } }
{"execute": "drive-mirror",
 "arguments": {
     "job-id": "mirror",
     "device": "source",
     "target": "/tmp/target.qcow2",
     "format": "qcow2",
     "node-name": "target",
     "sync": "none",
     "replaces": "filtered"
 } }
{"execute": "block-job-complete", "arguments": {"device": "mirror"}}

qemu then crashes with a stack overflow because a loop has been created
(target's backing file is source, so when target replaces filtered, it
points to itself through source).

(blockdev-mirror can be broken similarly.)

So let us make the checks for the two conditions above explicit, which
makes the whole function exactly as restrictive as it needs to be.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
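Illustration only, not part of the patch: a minimal sketch of the two
explicit checks on a toy graph.  ToyNode, TOY_MAX_CHILDREN and the
toy_*() functions are hypothetical names; the children array stands in
for the BdrvChild list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_CHILDREN 4

/* Toy stand-in for a block node; this is NOT QEMU's BlockDriverState. */
typedef struct ToyNode {
    bool is_filter;
    /* Children, terminated by NULL unless all slots are used */
    struct ToyNode *children[TOY_MAX_CHILDREN];
} ToyNode;

/* True if @child is somewhere below @parent. */
static bool toy_is_child_of(const ToyNode *child, const ToyNode *parent)
{
    if (!parent) {
        return false;
    }
    for (size_t i = 0; i < TOY_MAX_CHILDREN && parent->children[i]; i++) {
        const ToyNode *c = parent->children[i];
        if (c == child || toy_is_child_of(child, c)) {
            return true;
        }
    }
    return false;
}

/* True if there are only filters in [@top, @base). */
static bool toy_is_filtered_child(const ToyNode *top, const ToyNode *base)
{
    if (!top) {
        return false;
    }
    if (top == base) {
        return true;
    }
    if (!top->is_filter) {
        return false;
    }
    for (size_t i = 0; i < TOY_MAX_CHILDREN && top->children[i]; i++) {
        if (toy_is_filtered_child(top->children[i], base)) {
            return true;
        }
    }
    return false;
}

/* Condition (1): no loop.  Condition (2): same visible content. */
static bool toy_may_replace(const ToyNode *source, const ToyNode *backing,
                            const ToyNode *to_replace)
{
    assert(source && to_replace);

    if (to_replace == backing || toy_is_child_of(to_replace, backing)) {
        return false;
    }
    return toy_is_filtered_child(source, to_replace) ||
           toy_is_filtered_child(to_replace, source);
}
```

For the reproducer above (sync=none, so the future backing node is the
source itself), "filtered" is a child of the backing node and the
request is rejected; with sync=full there is no backing node and the
same replacement is allowed.
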
 include/block/block.h |  1 +
 block.c               | 83 +++++++++++++++++++++++++++++++++++++++----
 blockdev.c            | 34 ++++++++++++++++--
 3 files changed, 110 insertions(+), 8 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index 7835c5b370..484c0af766 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -404,6 +404,7 @@ bool bdrv_is_first_non_filter(BlockDriverState *candidate);
 
 /* check if a named node can be replaced when doing drive-mirror */
 BlockDriverState *check_to_replace_node(BlockDriverState *parent_bs,
+                                        BlockDriverState *backing_bs,
                                         const char *node_name, Error **errp);
 
 /* async block I/O */
diff --git a/block.c b/block.c
index 59d1d4b2b1..e129869a7e 100644
--- a/block.c
+++ b/block.c
@@ -6142,7 +6142,59 @@ bool bdrv_is_first_non_filter(BlockDriverState *candidate)
     return false;
 }
 
+static bool is_child_of(BlockDriverState *child, BlockDriverState *parent)
+{
+    BdrvChild *c;
+
+    if (!parent) {
+        return false;
+    }
+
+    QLIST_FOREACH(c, &parent->children, next) {
+        if (c->bs == child || is_child_of(child, c->bs)) {
+            return true;
+        }
+    }
+
+    return false;
+}
+
+/*
+ * Return true if there are only filters in [@top, @base).  Note that
+ * this may include quorum (which bdrv_chain_contains() cannot
+ * handle).
+ */
+static bool is_filtered_child(BlockDriverState *top, BlockDriverState *base)
+{
+    BdrvChild *c;
+
+    if (!top) {
+        return false;
+    }
+
+    if (top == base) {
+        return true;
+    }
+
+    if (!top->drv->is_filter) {
+        return false;
+    }
+
+    QLIST_FOREACH(c, &top->children, next) {
+        if (is_filtered_child(c->bs, base)) {
+            return true;
+        }
+    }
+
+    return false;
+}
+
+/*
+ * @parent_bs is mirror's source BDS, @backing_bs is the BDS which
+ * will be attached to the target when mirror completes.
+ */
 BlockDriverState *check_to_replace_node(BlockDriverState *parent_bs,
+                                        BlockDriverState *backing_bs,
                                         const char *node_name, Error **errp)
 {
     BlockDriverState *to_replace_bs = bdrv_find_node(node_name);
@@ -6161,13 +6213,32 @@ BlockDriverState *check_to_replace_node(BlockDriverState *parent_bs,
         goto out;
     }
 
-    /* We don't want arbitrary node of the BDS chain to be replaced only the top
-     * most non filter in order to prevent data corruption.
-     * Another benefit is that this tests exclude backing files which are
-     * blocked by the backing blockers.
+    /*
+     * If to_replace_bs is (recursively) a child of backing_bs,
+     * replacing it may create a loop.  We cannot allow that.
      */
-    if (!bdrv_recurse_is_first_non_filter(parent_bs, to_replace_bs)) {
-        error_setg(errp, "Only top most non filter can be replaced");
+    if (to_replace_bs == backing_bs || is_child_of(to_replace_bs, backing_bs)) {
+        error_setg(errp, "Replacing this node would result in a loop");
+        to_replace_bs = NULL;
+        goto out;
+    }
+
+    /*
+     * Mirror is designed in such a way that when it completes, the
+     * source BDS is seamlessly replaced.  It is therefore not allowed
+     * to replace a BDS where this condition would be violated, as that
+     * would defeat the purpose of mirror and could lead to data
+     * corruption.
+     * Therefore, between parent_bs and to_replace_bs there may be
+     * only filters (and the one on top must be a filter, too), so
+     * their data always stays in sync and mirror can complete and
+     * replace to_replace_bs without any possible corruptions.
+     */
+    if (!is_filtered_child(parent_bs, to_replace_bs) &&
+        !is_filtered_child(to_replace_bs, parent_bs))
+    {
+        error_setg(errp, "The node to be replaced must be connected to the "
+                   "source through filter nodes only");
         to_replace_bs = NULL;
         goto out;
     }
diff --git a/blockdev.c b/blockdev.c
index 5370f3b738..6f9f75327e 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3813,7 +3813,7 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
     }
 
     if (has_replaces) {
-        BlockDriverState *to_replace_bs;
+        BlockDriverState *to_replace_bs, *backing_bs;
         AioContext *replace_aio_context;
         int64_t bs_size, replace_size;
 
@@ -3823,7 +3823,37 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
             return;
         }
 
-        to_replace_bs = check_to_replace_node(bs, replaces, errp);
+        if (backing_mode == MIRROR_SOURCE_BACKING_CHAIN ||
+            backing_mode == MIRROR_OPEN_BACKING_CHAIN)
+        {
+            /*
+             * While we do not quite know what OPEN_BACKING_CHAIN
+             * (used for mode=existing) will yield, it is probably
+             * best to restrict it exactly like SOURCE_BACKING_CHAIN,
+             * because that is our best guess.
+             */
+            switch (sync) {
+            case MIRROR_SYNC_MODE_FULL:
+                backing_bs = NULL;
+                break;
+
+            case MIRROR_SYNC_MODE_TOP:
+                backing_bs = bdrv_filtered_cow_bs(bdrv_skip_rw_filters(bs));
+                break;
+
+            case MIRROR_SYNC_MODE_NONE:
+                backing_bs = bs;
+                break;
+
+            default:
+                abort();
+            }
+        } else {
+            assert(backing_mode == MIRROR_LEAVE_BACKING_CHAIN);
+            backing_bs = bdrv_filtered_cow_bs(bdrv_skip_rw_filters(target));
+        }
+
+        to_replace_bs = check_to_replace_node(bs, backing_bs, replaces, errp);
         if (!to_replace_bs) {
             return;
         }
-- 
2.21.0




* [Qemu-devel] [PATCH v5 36/42] iotests: Add tests for mirror @replaces loops
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
                   ` (34 preceding siblings ...)
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 35/42] block: Fix check_to_replace_node() Max Reitz
@ 2019-06-12 22:09 ` Max Reitz
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 37/42] block: Leave BDS.backing_file constant Max Reitz
                   ` (6 subsequent siblings)
  42 siblings, 0 replies; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:09 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

This adds two tests for cases where our old check_to_replace_node()
function failed to detect that running the mirror job with the given
@replaces node would result in a cyclic graph.
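
To picture the kind of loop these tests provoke, here is a toy model in
Python.  It is only a sketch under assumptions: the graph is a plain dict
of node names, would_create_loop() is a made-up helper, and none of this
reflects how check_to_replace_node() is actually implemented.

```python
def would_create_loop(children, to_replace, target):
    """Toy check: does redirecting all edges from `to_replace` to
    `target` make `target` reachable from itself?

    `children` maps a node name to the list of its child node names.
    Hypothetical model only; not QEMU's real graph code.
    """
    # Graph after the replacement: every edge that pointed at
    # `to_replace` now points at `target` (edges owned by `target`
    # itself are left alone).
    new_children = {}
    for node, kids in children.items():
        if node == target:
            new_children[node] = list(kids)
        else:
            new_children[node] = [target if kid == to_replace else kid
                                  for kid in kids]

    # Depth-first search starting from target's children; reaching
    # `target` again means the graph has become cyclic.
    stack = list(new_children.get(target, []))
    seen = set()
    while stack:
        node = stack.pop()
        if node == target:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(new_children.get(node, []))
    return False


# drive-mirror case: sync=none makes @source the target's backing file
drive_mirror = {
    'source': ['filtered'],
    'filtered': ['file'],
    'target': ['source'],
}

# blockdev-mirror case: @target already has @middle as its backing file
blockdev_mirror = {
    'source': ['middle'],
    'middle': ['bottom'],
    'bottom': ['file'],
    'target': ['middle'],
}
```

In this model, replacing @filtered (first graph) or @bottom (second
graph) by @target closes a cycle, whereas replacing @source does not.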

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 tests/qemu-iotests/041     | 124 +++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/041.out |   4 +-
 2 files changed, 126 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/041 b/tests/qemu-iotests/041
index 26bf1701eb..0c1432f189 100755
--- a/tests/qemu-iotests/041
+++ b/tests/qemu-iotests/041
@@ -1067,5 +1067,129 @@ class TestOrphanedSource(iotests.QMPTestCase):
                              target='dest-ro')
         self.assert_qmp(result, 'error/class', 'GenericError')
 
+# Various tests for the @replaces option (independent of quorum)
+class TestReplaces(iotests.QMPTestCase):
+    def setUp(self):
+        self.vm = iotests.VM()
+        self.vm.launch()
+
+    def tearDown(self):
+        self.vm.shutdown()
+
+    def test_drive_mirror_loop(self):
+        qemu_img('create', '-f', iotests.imgfmt, test_img, '1M')
+
+        result = self.vm.qmp('object-add', qom_type='throttle-group', id='tg')
+        self.assert_qmp(result, 'return', {})
+
+        result = self.vm.qmp('blockdev-add', **{
+                    'node-name': 'source',
+                    'driver': 'throttle',
+                    'throttle-group': 'tg',
+                    'file': {
+                        'node-name': 'filtered',
+                        'driver': iotests.imgfmt,
+                        'file': {
+                            'driver': 'file',
+                            'filename': test_img
+                        }
+                    }
+                })
+        self.assert_qmp(result, 'return', {})
+
+        # Mirror from @source to @target in sync=none, so that @source
+        # will be @target's backing file; but replace @filtered.
+        # Then, @target's backing file will be @source, whose backing
+        # file is now @target instead of @filtered.  That is a loop.
+        # (But apart from the loop, replacing @filtered instead of
+        # @source is fine, because both are just filtered versions of
+        # each other.)
+        result = self.vm.qmp('drive-mirror',
+                             job_id='mirror',
+                             device='source',
+                             target=target_img,
+                             format=iotests.imgfmt,
+                             node_name='target',
+                             sync='none',
+                             replaces='filtered')
+        if 'error' in result:
+            # This is the correct result
+            self.assert_qmp(result, 'error/class', 'GenericError')
+        else:
+            # This is wrong, but let's run it to the bitter conclusion
+            self.complete_and_wait(drive='mirror')
+            # Fail for good measure, although qemu should have crashed
+            # anyway
+            self.fail('Loop creation was successful')
+
+        os.remove(test_img)
+        try:
+            os.remove(target_img)
+        except OSError:
+            pass
+
+    def test_blockdev_mirror_loop(self):
+        qemu_img('create', '-f', iotests.imgfmt, test_img, '1M')
+        qemu_img('create', '-f', iotests.imgfmt, target_img, '1M')
+
+        result = self.vm.qmp('object-add', qom_type='throttle-group', id='tg')
+        self.assert_qmp(result, 'return', {})
+
+        result = self.vm.qmp('blockdev-add', **{
+                    'node-name': 'source',
+                    'driver': 'throttle',
+                    'throttle-group': 'tg',
+                    'file': {
+                        'node-name': 'middle',
+                        'driver': 'throttle',
+                        'throttle-group': 'tg',
+                        'file': {
+                            'node-name': 'bottom',
+                            'driver': iotests.imgfmt,
+                            'file': {
+                                'driver': 'file',
+                                'filename': test_img
+                            }
+                        }
+                    }
+                })
+        self.assert_qmp(result, 'return', {})
+
+        result = self.vm.qmp('blockdev-add', **{
+                    'node-name': 'target',
+                    'driver': iotests.imgfmt,
+                    'file': {
+                        'driver': 'file',
+                        'filename': target_img
+                    },
+                    'backing': 'middle'
+                })
+
+        # Mirror from @source to @target.  With blockdev-mirror, the
+        # current (old) backing file is retained (which is @middle).
+        # By replacing @bottom, @middle's file will be @target, whose
+        # backing file is @middle again.  That is a loop.
+        # (But apart from the loop, replacing @bottom instead of
+        # @source is fine, because both are just filtered versions of
+        # each other.)
+        result = self.vm.qmp('blockdev-mirror',
+                             job_id='mirror',
+                             device='source',
+                             target='target',
+                             sync='full',
+                             replaces='bottom')
+        if 'error' in result:
+            # This is the correct result
+            self.assert_qmp(result, 'error/class', 'GenericError')
+        else:
+            # This is wrong, but let's run it to the bitter conclusion
+            self.complete_and_wait(drive='mirror')
+            # Fail for good measure, although qemu should have crashed
+            # anyway
+            self.fail('Loop creation was successful')
+
+        os.remove(test_img)
+        os.remove(target_img)
+
 if __name__ == '__main__':
     iotests.main(supported_fmts=['qcow2', 'qed'])
diff --git a/tests/qemu-iotests/041.out b/tests/qemu-iotests/041.out
index e071d0b261..2c448b4239 100644
--- a/tests/qemu-iotests/041.out
+++ b/tests/qemu-iotests/041.out
@@ -1,5 +1,5 @@
-........................................................................................
+..........................................................................................
 ----------------------------------------------------------------------
-Ran 88 tests
+Ran 90 tests
 
 OK
-- 
2.21.0




* [Qemu-devel] [PATCH v5 37/42] block: Leave BDS.backing_file constant
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
                   ` (35 preceding siblings ...)
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 36/42] iotests: Add tests for mirror @replaces loops Max Reitz
@ 2019-06-12 22:09 ` Max Reitz
  2019-06-12 22:10 ` [Qemu-devel] [PATCH v5 38/42] iotests: Let complete_and_wait() work with commit Max Reitz
                   ` (5 subsequent siblings)
  42 siblings, 0 replies; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:09 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

Parts of the block layer treat BDS.backing_file as if it were whatever
the image header says (i.e., if it is a relative path, it is relative to
the overlay); other parts treat it like a cache for
bs->backing->bs->filename (relative paths are relative to the CWD).
Since bs->backing->bs->filename already exists, let us make
BDS.backing_file mean the former.

Among other things, this now allows the user to specify a base when
using qemu-img to commit an image file in a directory that is not the
CWD (assuming everything uses relative filenames).

Before this patch:

$ ./qemu-img create -f qcow2 foo/bot.qcow2 1M
$ ./qemu-img create -f qcow2 -b bot.qcow2 foo/mid.qcow2
$ ./qemu-img create -f qcow2 -b mid.qcow2 foo/top.qcow2
$ ./qemu-img commit -b mid.qcow2 foo/top.qcow2
qemu-img: Did not find 'mid.qcow2' in the backing chain of 'foo/top.qcow2'
$ ./qemu-img commit -b foo/mid.qcow2 foo/top.qcow2
qemu-img: Did not find 'foo/mid.qcow2' in the backing chain of 'foo/top.qcow2'
$ ./qemu-img commit -b $PWD/foo/mid.qcow2 foo/top.qcow2
qemu-img: Did not find '[...]/foo/mid.qcow2' in the backing chain of 'foo/top.qcow2'

After this patch:

$ ./qemu-img commit -b mid.qcow2 foo/top.qcow2
Image committed.
$ ./qemu-img commit -b foo/mid.qcow2 foo/top.qcow2
qemu-img: Did not find 'foo/mid.qcow2' in the backing chain of 'foo/top.qcow2'
$ ./qemu-img commit -b $PWD/foo/mid.qcow2 foo/top.qcow2
Image committed.

With this change, bdrv_find_backing_image() must look at whether the
user has overridden a BDS's backing file.  If so, it can no longer use
bs->backing_file, but must instead compare the given filename against
the backing node's filename directly.
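
The lookup rule can be sketched roughly as follows.  This is a simplified
Python model, not the C code: the dict keys ('filename', 'backing_file',
'backing_overridden'), both helper names and the '/work' directory are
assumptions made for illustration, and it normalizes paths textually
instead of calling realpath() or handling protocol prefixes.

```python
import os.path


def resolve_against_overlay(overlay_filename, name, cwd):
    """Resolve `name` relative to the overlay's directory (absolute
    names are returned normalized but otherwise unchanged)."""
    overlay_dir = os.path.dirname(os.path.join(cwd, overlay_filename))
    return os.path.normpath(os.path.join(overlay_dir, name))


def find_backing_image(chain, wanted, cwd):
    """Walk `chain` (top to bottom) and return the node that the user
    referred to as `wanted`, or None if there is no such node."""
    for overlay, below in zip(chain, chain[1:]):
        if overlay.get('backing_overridden'):
            # The header string says nothing about the real backing
            # node, so compare against that node's filename directly.
            if wanted == below['filename']:
                return below
        else:
            # Both the header string and the user's string are
            # interpreted relative to the overlay's location.
            header = resolve_against_overlay(overlay['filename'],
                                             overlay['backing_file'], cwd)
            given = resolve_against_overlay(overlay['filename'],
                                            wanted, cwd)
            if header == given:
                return below
    return None


# The chain from the example above, with qemu-img run from /work
chain = [
    {'filename': 'foo/top.qcow2', 'backing_file': 'mid.qcow2'},
    {'filename': 'foo/mid.qcow2', 'backing_file': 'bot.qcow2'},
    {'filename': 'foo/bot.qcow2', 'backing_file': ''},
]
```

With this model, 'mid.qcow2' and '/work/foo/mid.qcow2' both resolve to
the middle image, while 'foo/mid.qcow2' is looked up as
foo/foo/mid.qcow2 and is not found, which matches the "after this patch"
output shown above.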

Note that this changes the QAPI output for a node's backing_file.  We
had very inconsistent output there (sometimes what the image header
said, sometimes the actual filename of the backing image).  This
inconsistent output was effectively useless, so we have to decide one
way or the other.  Considering that bs->backing_file usually at runtime
contained the path to the image relative to qemu's CWD (or absolute),
this patch changes QAPI's backing_file to always report the
bs->backing->bs->filename from now on.  If you want to receive the image
header information, you have to refer to full-backing-filename.

This necessitates a change to iotest 228.  The interesting information
it really wanted is the image header, and it can get that now, but it
has to use full-backing-filename instead of backing_file.  Because of
this patch's changes to bs->backing_file's behavior, we also need some
reference output changes.

Along with the changes to bs->backing_file, stop updating
BDS.backing_format in bdrv_backing_attach() as well.  This necessitates
a change to the reference output of iotest 191.

iotest 245 changes in behavior: With the backing node no longer
overriding the parent node's backing_file string, you can now omit the
@backing option when reopening a node with neither a default nor a
current backing file even if it used to have a backing node at some
point.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 include/block/block_int.h  | 19 ++++++++++++++-----
 block.c                    | 35 ++++++++++++++++++++++++++++-------
 block/qapi.c               |  7 ++++---
 tests/qemu-iotests/191.out |  1 -
 tests/qemu-iotests/228     |  6 +++---
 tests/qemu-iotests/228.out |  6 +++---
 tests/qemu-iotests/245     |  4 +++-
 7 files changed, 55 insertions(+), 23 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index 431fa38ea0..02b55cff91 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -777,11 +777,20 @@ struct BlockDriverState {
     bool walking_aio_notifiers; /* to make removal during iteration safe */
 
     char filename[PATH_MAX];
-    char backing_file[PATH_MAX]; /* if non zero, the image is a diff of
-                                    this file image */
-    /* The backing filename indicated by the image header; if we ever
-     * open this file, then this is replaced by the resulting BDS's
-     * filename (i.e. after a bdrv_refresh_filename() run). */
+    /*
+     * If not empty, this image is a diff in relation to backing_file.
+     * Note that this is the name given in the image header and
+     * therefore may or may not be equal to .backing->bs->filename.
+     * If this field contains a relative path, it is to be resolved
+     * relative to the overlay's location.
+     */
+    char backing_file[PATH_MAX];
+    /*
+     * The backing filename indicated by the image header.  Contrary
+     * to backing_file, if we ever open this file, auto_backing_file
+     * is replaced by the resulting BDS's filename (i.e. after a
+     * bdrv_refresh_filename() run).
+     */
     char auto_backing_file[PATH_MAX];
     char backing_format[16]; /* if non-zero and backing_file exists */
 
diff --git a/block.c b/block.c
index e129869a7e..a962e346ab 100644
--- a/block.c
+++ b/block.c
@@ -78,6 +78,8 @@ static BlockDriverState *bdrv_open_inherit(const char *filename,
                                            const BdrvChildRole *child_role,
                                            Error **errp);
 
+static bool bdrv_backing_overridden(BlockDriverState *bs);
+
 /* If non-zero, use only whitelisted block drivers */
 static int use_bdrv_whitelist;
 
@@ -1064,10 +1066,6 @@ static void bdrv_backing_attach(BdrvChild *c)
     bdrv_refresh_filename(backing_hd);
 
     parent->open_flags &= ~BDRV_O_NO_BACKING;
-    pstrcpy(parent->backing_file, sizeof(parent->backing_file),
-            backing_hd->filename);
-    pstrcpy(parent->backing_format, sizeof(parent->backing_format),
-            backing_hd->drv ? backing_hd->drv->format_name : "");
 
     bdrv_op_block_all(backing_hd, parent->backing_blocker);
     /* Otherwise we won't be able to commit or stream */
@@ -5177,6 +5175,7 @@ BlockDriverState *bdrv_find_backing_image(BlockDriverState *bs,
     char *backing_file_full = NULL;
     char *filename_tmp = NULL;
     int is_protocol = 0;
+    bool filenames_refreshed = false;
     BlockDriverState *curr_bs = NULL;
     BlockDriverState *retval = NULL;
 
@@ -5201,9 +5200,31 @@ BlockDriverState *bdrv_find_backing_image(BlockDriverState *bs,
     {
         BlockDriverState *bs_below = bdrv_backing_chain_next(curr_bs);
 
-        /* If either of the filename paths is actually a protocol, then
-         * compare unmodified paths; otherwise make paths relative */
-        if (is_protocol || path_has_protocol(curr_bs->backing_file)) {
+        if (bdrv_backing_overridden(curr_bs)) {
+            /*
+             * If the backing file was overridden, we can only compare
+             * directly against the backing node's filename.
+             */
+
+            if (!filenames_refreshed) {
+                /*
+                 * This will automatically refresh all of the
+                 * filenames in the rest of the backing chain, so we
+                 * only need to do this once.
+                 */
+                bdrv_refresh_filename(bs_below);
+                filenames_refreshed = true;
+            }
+
+            if (strcmp(backing_file, bs_below->filename) == 0) {
+                retval = bs_below;
+                break;
+            }
+        } else if (is_protocol || path_has_protocol(curr_bs->backing_file)) {
+            /*
+             * If either of the filename paths is actually a protocol, then
+             * compare unmodified paths; otherwise make paths relative.
+             */
             char *backing_file_full_ret;
 
             if (strcmp(backing_file, curr_bs->backing_file) == 0) {
diff --git a/block/qapi.c b/block/qapi.c
index 1fd2937abc..3586c09516 100644
--- a/block/qapi.c
+++ b/block/qapi.c
@@ -44,7 +44,7 @@ BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk,
                                         BlockDriverState *bs, Error **errp)
 {
     ImageInfo **p_image_info;
-    BlockDriverState *bs0;
+    BlockDriverState *bs0, *backing;
     BlockDeviceInfo *info;
 
     if (!bs->drv) {
@@ -73,9 +73,10 @@ BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk,
         info->node_name = g_strdup(bs->node_name);
     }
 
-    if (bs->backing_file[0]) {
+    backing = bdrv_filtered_cow_bs(bs);
+    if (backing) {
         info->has_backing_file = true;
-        info->backing_file = g_strdup(bs->backing_file);
+        info->backing_file = g_strdup(backing->filename);
     }
 
     info->detect_zeroes = bs->detect_zeroes;
diff --git a/tests/qemu-iotests/191.out b/tests/qemu-iotests/191.out
index 3fc92bb56e..0b3c216b0c 100644
--- a/tests/qemu-iotests/191.out
+++ b/tests/qemu-iotests/191.out
@@ -605,7 +605,6 @@ wrote 65536/65536 bytes at offset 1048576
                     "backing-filename": "TEST_DIR/t.IMGFMT.base",
                     "dirty-flag": false
                 },
-                "backing-filename-format": "IMGFMT",
                 "virtual-size": 67108864,
                 "filename": "TEST_DIR/t.IMGFMT.ovl3",
                 "cluster-size": 65536,
diff --git a/tests/qemu-iotests/228 b/tests/qemu-iotests/228
index 9a50afd205..a1f3187212 100755
--- a/tests/qemu-iotests/228
+++ b/tests/qemu-iotests/228
@@ -34,7 +34,7 @@ def log_node_info(node):
 
     log('bs->filename: ' + node['image']['filename'],
         filters=[filter_testfiles, filter_imgfmt])
-    log('bs->backing_file: ' + node['backing_file'],
+    log('bs->backing_file: ' + node['image']['full-backing-filename'],
         filters=[filter_testfiles, filter_imgfmt])
 
     if 'backing-image' in node['image']:
@@ -70,8 +70,8 @@ with iotests.FilePath('base.img') as base_img_path, \
                 },
                 filters=[filter_qmp_testfiles, filter_qmp_imgfmt])
 
-    # Filename should be plain, and the backing filename should not
-    # contain the "file:" prefix
+    # Filename should be plain, and the backing node filename should
+    # not contain the "file:" prefix
     log_node_info(vm.node_info('node0'))
 
     vm.qmp_log('blockdev-del', node_name='node0')
diff --git a/tests/qemu-iotests/228.out b/tests/qemu-iotests/228.out
index 4217df24fe..8c82009abe 100644
--- a/tests/qemu-iotests/228.out
+++ b/tests/qemu-iotests/228.out
@@ -4,7 +4,7 @@
 {"return": {}}
 
 bs->filename: TEST_DIR/PID-top.img
-bs->backing_file: TEST_DIR/PID-base.img
+bs->backing_file: file:TEST_DIR/PID-base.img
 bs->backing->bs->filename: TEST_DIR/PID-base.img
 
 {"execute": "blockdev-del", "arguments": {"node-name": "node0"}}
@@ -41,7 +41,7 @@ bs->backing->bs->filename: TEST_DIR/PID-base.img
 {"return": {}}
 
 bs->filename: TEST_DIR/PID-top.img
-bs->backing_file: TEST_DIR/PID-base.img
+bs->backing_file: file:TEST_DIR/PID-base.img
 bs->backing->bs->filename: TEST_DIR/PID-base.img
 
 {"execute": "blockdev-del", "arguments": {"node-name": "node0"}}
@@ -55,7 +55,7 @@ bs->backing->bs->filename: TEST_DIR/PID-base.img
 {"return": {}}
 
 bs->filename: json:{"backing": {"driver": "null-co"}, "driver": "IMGFMT", "file": {"driver": "file", "filename": "TEST_DIR/PID-top.img"}}
-bs->backing_file: null-co://
+bs->backing_file: TEST_DIR/PID-base.img
 bs->backing->bs->filename: null-co://
 
 {"execute": "blockdev-del", "arguments": {"node-name": "node0"}}
diff --git a/tests/qemu-iotests/245 b/tests/qemu-iotests/245
index 349b94aace..80c3e1b92d 100644
--- a/tests/qemu-iotests/245
+++ b/tests/qemu-iotests/245
@@ -722,7 +722,9 @@ class TestBlockdevReopen(iotests.QMPTestCase):
 
         # Detach hd2 from hd0.
         self.reopen(opts, {'backing': None})
-        self.reopen(opts, {}, "backing is missing for 'hd0'")
+
+        # Without a backing file, we can omit 'backing' again
+        self.reopen(opts)
 
         # Remove both hd0 and hd2
         result = self.vm.qmp('blockdev-del', conv_keys = True, node_name = 'hd0')
-- 
2.21.0




* [Qemu-devel] [PATCH v5 38/42] iotests: Let complete_and_wait() work with commit
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
                   ` (36 preceding siblings ...)
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 37/42] block: Leave BDS.backing_file constant Max Reitz
@ 2019-06-12 22:10 ` Max Reitz
  2019-06-12 22:10 ` [Qemu-devel] [PATCH v5 39/42] iotests: Add filter commit test cases Max Reitz
                   ` (4 subsequent siblings)
  42 siblings, 0 replies; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:10 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

complete_and_wait() and wait_ready() currently only work for mirror
jobs.  Let them work for active commit jobs, too.
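
The idea of accepting either job type can be sketched with a small,
self-contained matcher.  This is an illustration only: event_matches()
and first_matching() are hypothetical helpers written for this example,
and they are not claimed to mirror how iotests.py implements its event
waiting.

```python
def event_matches(event, pattern):
    """Return True if every key in `pattern` is present in `event`
    with an equal value; nested dicts are compared recursively."""
    if isinstance(pattern, dict):
        return (isinstance(event, dict) and
                all(key in event and event_matches(event[key], value)
                    for key, value in pattern.items()))
    return event == pattern


def first_matching(events, wanted):
    """Return the first event that matches any (name, pattern) pair
    in `wanted`, or None if no event matches."""
    for event in events:
        for name, pattern in wanted:
            if event.get('event') == name and \
               event_matches(event, pattern):
                return event
    return None


# Accept BLOCK_JOB_READY from either a mirror or a commit job
wanted = [
    ('BLOCK_JOB_READY', {'data': {'type': 'mirror', 'device': 'job0'}}),
    ('BLOCK_JOB_READY', {'data': {'type': 'commit', 'device': 'job0'}}),
]
```

Listing one pattern per job type keeps each pattern a plain subset match,
at the cost of repeating the event name.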

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 tests/qemu-iotests/iotests.py | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index dc77d3fba0..55066d62bb 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -697,8 +697,12 @@ class QMPTestCase(unittest.TestCase):
 
     def wait_ready(self, drive='drive0'):
         '''Wait until a block job BLOCK_JOB_READY event'''
-        f = {'data': {'type': 'mirror', 'device': drive } }
-        event = self.vm.event_wait(name='BLOCK_JOB_READY', match=f)
+        event = self.vm.events_wait([
+                ('BLOCK_JOB_READY',
+                 {'data': {'type': 'mirror', 'device': drive } }),
+                ('BLOCK_JOB_READY',
+                 {'data': {'type': 'commit', 'device': drive } })
+            ])
 
     def wait_ready_and_cancel(self, drive='drive0'):
         self.wait_ready(drive=drive)
@@ -716,7 +720,7 @@ class QMPTestCase(unittest.TestCase):
         self.assert_qmp(result, 'return', {})
 
         event = self.wait_until_completed(drive=drive)
-        self.assert_qmp(event, 'data/type', 'mirror')
+        self.assertTrue(event['data']['type'] in ['mirror', 'commit'])
 
     def pause_wait(self, job_id='job0'):
         with Timeout(1, "Timeout waiting for job to pause"):
-- 
2.21.0




* [Qemu-devel] [PATCH v5 39/42] iotests: Add filter commit test cases
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
                   ` (37 preceding siblings ...)
  2019-06-12 22:10 ` [Qemu-devel] [PATCH v5 38/42] iotests: Let complete_and_wait() work with commit Max Reitz
@ 2019-06-12 22:10 ` Max Reitz
  2019-06-12 22:10 ` [Qemu-devel] [PATCH v5 40/42] iotests: Add filter mirror " Max Reitz
                   ` (3 subsequent siblings)
  42 siblings, 0 replies; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:10 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

This patch adds some tests on how commit copes with filter nodes.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 tests/qemu-iotests/040     | 177 +++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/040.out |   4 +-
 2 files changed, 179 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/040 b/tests/qemu-iotests/040
index aa0b1847e3..31c2a8da3b 100755
--- a/tests/qemu-iotests/040
+++ b/tests/qemu-iotests/040
@@ -432,5 +432,182 @@ class TestReopenOverlay(ImageCommitTestCase):
     def test_reopen_overlay(self):
         self.run_commit_test(self.img1, self.img0)
 
+class TestCommitWithFilters(iotests.QMPTestCase):
+    img0 = os.path.join(iotests.test_dir, '0.img')
+    img1 = os.path.join(iotests.test_dir, '1.img')
+    img2 = os.path.join(iotests.test_dir, '2.img')
+    img3 = os.path.join(iotests.test_dir, '3.img')
+
+    def setUp(self):
+        qemu_img('create', '-f', iotests.imgfmt, self.img0, '64M')
+        qemu_img('create', '-f', iotests.imgfmt, self.img1, '64M')
+        qemu_img('create', '-f', iotests.imgfmt, self.img2, '64M')
+        qemu_img('create', '-f', iotests.imgfmt, self.img3, '64M')
+
+        qemu_io('-f', iotests.imgfmt, '-c', 'write -P 1 0M 1M', self.img0)
+        qemu_io('-f', iotests.imgfmt, '-c', 'write -P 2 1M 1M', self.img1)
+        qemu_io('-f', iotests.imgfmt, '-c', 'write -P 3 2M 1M', self.img2)
+        qemu_io('-f', iotests.imgfmt, '-c', 'write -P 4 3M 1M', self.img3)
+
+        # Distributions of the patterns in the files; this is checked
+        # by tearDown() and should be changed by the test cases as is
+        # necessary
+        self.pattern_files = [self.img0, self.img1, self.img2, self.img3]
+
+        self.vm = iotests.VM()
+        self.vm.launch()
+        self.has_quit = False
+
+        result = self.vm.qmp('object-add', qom_type='throttle-group', id='tg')
+        self.assert_qmp(result, 'return', {})
+
+        result = self.vm.qmp('blockdev-add', **{
+                'node-name': 'top-filter',
+                'driver': 'throttle',
+                'throttle-group': 'tg',
+                'file': {
+                    'node-name': 'cow-3',
+                    'driver': iotests.imgfmt,
+                    'file': {
+                        'driver': 'file',
+                        'filename': self.img3
+                    },
+                    'backing': {
+                        'node-name': 'cow-2',
+                        'driver': iotests.imgfmt,
+                        'file': {
+                            'driver': 'file',
+                            'filename': self.img2
+                        },
+                        'backing': {
+                            'node-name': 'cow-1',
+                            'driver': iotests.imgfmt,
+                            'file': {
+                                'driver': 'file',
+                                'filename': self.img1
+                            },
+                            'backing': {
+                                'node-name': 'bottom-filter',
+                                'driver': 'throttle',
+                                'throttle-group': 'tg',
+                                'file': {
+                                    'node-name': 'cow-0',
+                                    'driver': iotests.imgfmt,
+                                    'file': {
+                                        'driver': 'file',
+                                        'filename': self.img0
+                                    }
+                                }
+                            }
+                        }
+                    }
+                }
+            })
+        self.assert_qmp(result, 'return', {})
+
+    def tearDown(self):
+        self.vm.shutdown(has_quit=self.has_quit)
+
+        for index in range(len(self.pattern_files)):
+            result = qemu_io('-f', iotests.imgfmt,
+                             '-c', 'read -P %i %iM 1M' % (index + 1, index),
+                             self.pattern_files[index])
+            self.assertFalse('Pattern verification failed' in result)
+
+        os.remove(self.img3)
+        os.remove(self.img2)
+        os.remove(self.img1)
+        os.remove(self.img0)
+
+    # Filters make for funny filenames, so we cannot just use
+    # self.imgX to get them
+    def get_filename(self, node):
+        return self.vm.node_info(node)['image']['filename']
+
+    def test_filterless_commit(self):
+        self.assert_no_active_block_jobs()
+        result = self.vm.qmp('block-commit',
+                             job_id='commit',
+                             device='top-filter',
+                             top_node='cow-2',
+                             base_node='cow-1')
+        self.assert_qmp(result, 'return', {})
+        self.wait_until_completed(drive='commit')
+
+        self.assertIsNotNone(self.vm.node_info('cow-3'))
+        self.assertIsNone(self.vm.node_info('cow-2'))
+        self.assertIsNotNone(self.vm.node_info('cow-1'))
+
+        # 2 has been committed into 1
+        self.pattern_files[2] = self.img1
+
+    def test_commit_through_filter(self):
+        self.assert_no_active_block_jobs()
+        result = self.vm.qmp('block-commit',
+                             job_id='commit',
+                             device='top-filter',
+                             top_node='cow-1',
+                             base_node='cow-0')
+        self.assert_qmp(result, 'return', {})
+        self.wait_until_completed(drive='commit')
+
+        self.assertIsNotNone(self.vm.node_info('cow-2'))
+        self.assertIsNone(self.vm.node_info('cow-1'))
+        self.assertIsNone(self.vm.node_info('bottom-filter'))
+        self.assertIsNotNone(self.vm.node_info('cow-0'))
+
+        # 1 has been committed into 0
+        self.pattern_files[1] = self.img0
+
+    def test_filtered_active_commit_with_filter(self):
+        # Add a device, so the commit job finds a parent it can change
+        # to point to the base node (so we can test that top-filter is
+        # dropped from the graph)
+        result = self.vm.qmp('device_add', id='drv0', driver='virtio-blk',
+                             drive='top-filter')
+        self.assert_qmp(result, 'return', {})
+
+        # Try to release our reference to top-filter; that should not
+        # work because drv0 uses it
+        result = self.vm.qmp('blockdev-del', node_name='top-filter')
+        self.assert_qmp(result, 'error/class', 'GenericError')
+        self.assert_qmp(result, 'error/desc', 'Node top-filter is in use')
+
+        self.assert_no_active_block_jobs()
+        result = self.vm.qmp('block-commit',
+                             job_id='commit',
+                             device='top-filter',
+                             base_node='cow-2')
+        self.assert_qmp(result, 'return', {})
+        self.complete_and_wait(drive='commit')
+
+        # Try to release our reference to top-filter again
+        result = self.vm.qmp('blockdev-del', node_name='top-filter')
+        self.assert_qmp(result, 'return', {})
+
+        self.assertIsNone(self.vm.node_info('top-filter'))
+        self.assertIsNone(self.vm.node_info('cow-3'))
+        self.assertIsNotNone(self.vm.node_info('cow-2'))
+
+        # 3 has been committed into 2
+        self.pattern_files[3] = self.img2
+
+    def test_filtered_active_commit_without_filter(self):
+        self.assert_no_active_block_jobs()
+        result = self.vm.qmp('block-commit',
+                             job_id='commit',
+                             device='top-filter',
+                             top_node='cow-3',
+                             base_node='cow-2')
+        self.assert_qmp(result, 'return', {})
+        self.complete_and_wait(drive='commit')
+
+        self.assertIsNotNone(self.vm.node_info('top-filter'))
+        self.assertIsNone(self.vm.node_info('cow-3'))
+        self.assertIsNotNone(self.vm.node_info('cow-2'))
+
+        # 3 has been committed into 2
+        self.pattern_files[3] = self.img2
+
 if __name__ == '__main__':
     iotests.main(supported_fmts=['qcow2', 'qed'])
diff --git a/tests/qemu-iotests/040.out b/tests/qemu-iotests/040.out
index 220a5fa82c..fe58934d7a 100644
--- a/tests/qemu-iotests/040.out
+++ b/tests/qemu-iotests/040.out
@@ -1,5 +1,5 @@
-...............................................
+...................................................
 ----------------------------------------------------------------------
-Ran 47 tests
+Ran 51 tests
 
 OK
-- 
2.21.0




* [Qemu-devel] [PATCH v5 40/42] iotests: Add filter mirror test cases
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
                   ` (38 preceding siblings ...)
  2019-06-12 22:10 ` [Qemu-devel] [PATCH v5 39/42] iotests: Add filter commit test cases Max Reitz
@ 2019-06-12 22:10 ` Max Reitz
  2019-06-12 22:10 ` [Qemu-devel] [PATCH v5 41/42] iotests: Add test for commit in sub directory Max Reitz
                   ` (2 subsequent siblings)
  42 siblings, 0 replies; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:10 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

This patch adds some test cases for how mirroring relates to filters.  One
of them tests what happens when you mirror off a filtered COW node, two
others use the mirror filter node as basically our only example of an
implicitly created filter node so far (besides the commit filter).

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 tests/qemu-iotests/041     | 146 ++++++++++++++++++++++++++++++++++++-
 tests/qemu-iotests/041.out |   4 +-
 2 files changed, 147 insertions(+), 3 deletions(-)

diff --git a/tests/qemu-iotests/041 b/tests/qemu-iotests/041
index 0c1432f189..c2b5299f62 100755
--- a/tests/qemu-iotests/041
+++ b/tests/qemu-iotests/041
@@ -20,8 +20,9 @@
 
 import time
 import os
+import json
 import iotests
-from iotests import qemu_img, qemu_io
+from iotests import qemu_img, qemu_img_pipe, qemu_io
 
 backing_img = os.path.join(iotests.test_dir, 'backing.img')
 target_backing_img = os.path.join(iotests.test_dir, 'target-backing.img')
@@ -1191,5 +1192,148 @@ class TestReplaces(iotests.QMPTestCase):
         os.remove(test_img)
         os.remove(target_img)
 
+# Tests for mirror with filters (and how the mirror filter behaves, as
+# an example for an implicit filter)
+class TestFilters(iotests.QMPTestCase):
+    def setUp(self):
+        qemu_img('create', '-f', iotests.imgfmt, backing_img, '1M')
+        qemu_img('create', '-f', iotests.imgfmt, '-b', backing_img, test_img)
+        qemu_img('create', '-f', iotests.imgfmt, '-b', backing_img, target_img)
+
+        qemu_io('-c', 'write -P 1 0 512k', backing_img)
+        qemu_io('-c', 'write -P 2 512k 512k', test_img)
+
+        self.vm = iotests.VM()
+        self.vm.launch()
+
+        result = self.vm.qmp('blockdev-add', **{
+                                'node-name': 'target',
+                                'driver': iotests.imgfmt,
+                                'file': {
+                                    'driver': 'file',
+                                    'filename': target_img
+                                },
+                                'backing': None
+                            })
+        self.assert_qmp(result, 'return', {})
+
+        self.filterless_chain = {
+                'node-name': 'source',
+                'driver': iotests.imgfmt,
+                'file': {
+                    'driver': 'file',
+                    'filename': test_img
+                },
+                'backing': {
+                    'node-name': 'backing',
+                    'driver': iotests.imgfmt,
+                    'file': {
+                        'driver': 'file',
+                        'filename': backing_img
+                    }
+                }
+            }
+
+    def tearDown(self):
+        self.vm.shutdown()
+
+        os.remove(test_img)
+        os.remove(target_img)
+        os.remove(backing_img)
+
+    def test_cor(self):
+        result = self.vm.qmp('blockdev-add', **{
+                                'node-name': 'filter',
+                                'driver': 'copy-on-read',
+                                'file': self.filterless_chain
+                            })
+        self.assert_qmp(result, 'return', {})
+
+        result = self.vm.qmp('blockdev-mirror',
+                             job_id='mirror',
+                             device='filter',
+                             target='target',
+                             sync='top')
+        self.assert_qmp(result, 'return', {})
+
+        self.complete_and_wait('mirror')
+
+        self.vm.qmp('blockdev-del', node_name='target')
+
+        target_map = qemu_img_pipe('map', '--output=json', target_img)
+        target_map = json.loads(target_map)
+
+        assert target_map[0]['start'] == 0
+        assert target_map[0]['length'] == 512 * 1024
+        assert target_map[0]['depth'] == 1
+
+        assert target_map[1]['start'] == 512 * 1024
+        assert target_map[1]['length'] == 512 * 1024
+        assert target_map[1]['depth'] == 0
+
+    def test_implicit_mirror_filter(self):
+        result = self.vm.qmp('blockdev-add', **self.filterless_chain)
+        self.assert_qmp(result, 'return', {})
+
+        # We need this so we can query from above the mirror node
+        result = self.vm.qmp('device_add',
+                             driver='virtio-blk',
+                             id='virtio',
+                             bus='pci.0',
+                             drive='source')
+        self.assert_qmp(result, 'return', {})
+
+        result = self.vm.qmp('blockdev-mirror',
+                             job_id='mirror',
+                             device='source',
+                             target='target',
+                             sync='top')
+        self.assert_qmp(result, 'return', {})
+
+        # The mirror filter is now an implicit node, so it should be
+        # invisible when querying the backing chain
+        device_info = self.vm.qmp('query-block')['return'][0]
+        assert device_info['qdev'] == '/machine/peripheral/virtio/virtio-backend'
+
+        assert device_info['inserted']['node-name'] == 'source'
+
+        image_info = device_info['inserted']['image']
+        assert image_info['filename'] == test_img
+        assert image_info['backing-image']['filename'] == backing_img
+
+        self.complete_and_wait('mirror')
+
+    def test_explicit_mirror_filter(self):
+        # Same test as above, but this time we give the mirror filter
+        # a node-name so it will not be invisible
+        result = self.vm.qmp('blockdev-add', **self.filterless_chain)
+        self.assert_qmp(result, 'return', {})
+
+        # We need this so we can query from above the mirror node
+        result = self.vm.qmp('device_add',
+                             driver='virtio-blk',
+                             id='virtio',
+                             bus='pci.0',
+                             drive='source')
+        self.assert_qmp(result, 'return', {})
+
+        result = self.vm.qmp('blockdev-mirror',
+                             job_id='mirror',
+                             device='source',
+                             target='target',
+                             sync='top',
+                             filter_node_name='mirror-filter')
+        self.assert_qmp(result, 'return', {})
+
+        # With a node-name given to it, the mirror filter should now
+        # be visible
+        device_info = self.vm.qmp('query-block')['return'][0]
+        assert device_info['qdev'] == '/machine/peripheral/virtio/virtio-backend'
+
+        assert device_info['inserted']['node-name'] == 'mirror-filter'
+
+        self.complete_and_wait('mirror')
+
+
 if __name__ == '__main__':
     iotests.main(supported_fmts=['qcow2', 'qed'])
diff --git a/tests/qemu-iotests/041.out b/tests/qemu-iotests/041.out
index 2c448b4239..ffc779b4d1 100644
--- a/tests/qemu-iotests/041.out
+++ b/tests/qemu-iotests/041.out
@@ -1,5 +1,5 @@
-..........................................................................................
+.............................................................................................
 ----------------------------------------------------------------------
-Ran 90 tests
+Ran 93 tests
 
 OK
-- 
2.21.0
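
For readers following test_cor in the patch above: its assertions treat the
`qemu-img map --output=json` result as a list of extents and look only at
`start`, `length` and `depth`.  A minimal sketch of that check follows.  The
sample is hand-written to match what the test expects, it is not captured
qemu-img output, and the helper name `check_top_only_copy` is made up for
illustration.

```python
import json

# Hand-written sample shaped like the fields test_cor inspects; this is
# not captured qemu-img output, and real entries carry more keys.
SAMPLE_MAP = json.dumps([
    {"start": 0, "length": 512 * 1024, "depth": 1},
    {"start": 512 * 1024, "length": 512 * 1024, "depth": 0},
])


def check_top_only_copy(map_json, half=512 * 1024):
    """Return True if only the second half is allocated in the mapped image.

    depth 0 means the extent is found in the image that was mapped,
    depth 1 means it is found in that image's backing file.
    """
    extents = json.loads(map_json)
    if len(extents) != 2:
        return False
    first, second = extents
    return (
        (first["start"], first["length"], first["depth"]) == (0, half, 1)
        and (second["start"], second["length"], second["depth"])
        == (half, half, 0)
    )
```

With sync='top', only the data written to the top image (the second 512k in
setUp) should end up in the target, which is why the first half is expected
at depth 1.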




* [Qemu-devel] [PATCH v5 41/42] iotests: Add test for commit in sub directory
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
                   ` (39 preceding siblings ...)
  2019-06-12 22:10 ` [Qemu-devel] [PATCH v5 40/42] iotests: Add filter mirror " Max Reitz
@ 2019-06-12 22:10 ` Max Reitz
  2019-06-12 22:10 ` [Qemu-devel] [PATCH v5 42/42] iotests: Test committing to overridden backing Max Reitz
  2019-06-13 15:28 ` [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Vladimir Sementsov-Ogievskiy
  42 siblings, 0 replies; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:10 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

Add a test for committing an overlay in a sub directory to one of the
images in its backing chain, using both relative and absolute filenames.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 tests/qemu-iotests/020     | 36 ++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/020.out | 10 ++++++++++
 2 files changed, 46 insertions(+)

diff --git a/tests/qemu-iotests/020 b/tests/qemu-iotests/020
index 6b0ebb37d2..94633c3a32 100755
--- a/tests/qemu-iotests/020
+++ b/tests/qemu-iotests/020
@@ -31,6 +31,11 @@ _cleanup()
 	_cleanup_test_img
     rm -f "$TEST_IMG.base"
     rm -f "$TEST_IMG.orig"
+
+    rm -f "$TEST_DIR/subdir/t.$IMGFMT.base"
+    rm -f "$TEST_DIR/subdir/t.$IMGFMT.mid"
+    rm -f "$TEST_DIR/subdir/t.$IMGFMT"
+    rmdir "$TEST_DIR/subdir" &> /dev/null
 }
 trap "_cleanup; exit \$status" 0 1 2 3 15
 
@@ -133,6 +138,37 @@ $QEMU_IO -c 'writev 0 64k' "$TEST_IMG" | _filter_qemu_io
 $QEMU_IMG commit "$TEST_IMG"
 _cleanup
 
+
+echo
+echo 'Testing commit in sub-directory with relative filenames'
+echo
+
+pushd "$TEST_DIR" > /dev/null
+
+mkdir subdir
+
+TEST_IMG="subdir/t.$IMGFMT.base" _make_test_img 1M
+TEST_IMG="subdir/t.$IMGFMT.mid" _make_test_img -b "t.$IMGFMT.base"
+TEST_IMG="subdir/t.$IMGFMT" _make_test_img -b "t.$IMGFMT.mid"
+
+# Should work
+$QEMU_IMG commit -b "t.$IMGFMT.mid" "subdir/t.$IMGFMT"
+
+# Might theoretically work, but does not in practice (we have to
+# decide between this and the above; and since we always represent
+# backing file names as relative to the overlay, we go for the above)
+$QEMU_IMG commit -b "subdir/t.$IMGFMT.mid" "subdir/t.$IMGFMT" 2>&1 | \
+    _filter_imgfmt
+
+# This should work as well
+$QEMU_IMG commit -b "$TEST_DIR/subdir/t.$IMGFMT.mid" "subdir/t.$IMGFMT"
+
+popd > /dev/null
+
+# Now let's try with just absolute filenames
+$QEMU_IMG commit -b "$TEST_DIR/subdir/t.$IMGFMT.mid" \
+    "$TEST_DIR/subdir/t.$IMGFMT"
+
 # success, all done
 echo "*** done"
 rm -f $seq.full
diff --git a/tests/qemu-iotests/020.out b/tests/qemu-iotests/020.out
index 4b722b2dd0..228c37dded 100644
--- a/tests/qemu-iotests/020.out
+++ b/tests/qemu-iotests/020.out
@@ -1094,4 +1094,14 @@ Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576 backing_file=json:{'driv
 wrote 65536/65536 bytes at offset 0
 64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 qemu-img: Block job failed: No space left on device
+
+Testing commit in sub-directory with relative filenames
+
+Formatting 'subdir/t.IMGFMT.base', fmt=IMGFMT size=1048576
+Formatting 'subdir/t.IMGFMT.mid', fmt=IMGFMT size=1048576 backing_file=t.IMGFMT.base
+Formatting 'subdir/t.IMGFMT', fmt=IMGFMT size=1048576 backing_file=t.IMGFMT.mid
+Image committed.
+qemu-img: Did not find 'subdir/t.IMGFMT.mid' in the backing chain of 'subdir/t.IMGFMT'
+Image committed.
+Image committed.
 *** done
-- 
2.21.0
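
The four commit invocations in the test above can be read through a
simplified model in which a backing file name, whether stored in the overlay
or given with -b, is interpreted relative to the overlay's directory.  The
sketch below only illustrates that rule; it is not the lookup code qemu
uses, and `resolve_backing`, `names_backing` and the `/test` directory are
assumptions made for the example.

```python
import posixpath


def resolve_backing(overlay, name, cwd):
    """Resolve a backing file name relative to the overlay's directory.

    An absolute name is returned as-is (normalized); a relative overlay
    path is first anchored at the working directory cwd.
    """
    overlay_abs = posixpath.join(cwd, overlay)
    overlay_dir = posixpath.dirname(overlay_abs)
    return posixpath.normpath(posixpath.join(overlay_dir, name))


def names_backing(overlay, stored_backing, requested, cwd):
    """Simplified check: does `requested` name the overlay's backing file?"""
    return (resolve_backing(overlay, requested, cwd)
            == resolve_backing(overlay, stored_backing, cwd))
```

In this model, "t.IMGFMT.mid" and the absolute path both match the backing
file of "subdir/t.IMGFMT", while "subdir/t.IMGFMT.mid" resolves to a path
under subdir/subdir and does not, which lines up with the expected output
in 020.out.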




* [Qemu-devel] [PATCH v5 42/42] iotests: Test committing to overridden backing
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
                   ` (40 preceding siblings ...)
  2019-06-12 22:10 ` [Qemu-devel] [PATCH v5 41/42] iotests: Add test for commit in sub directory Max Reitz
@ 2019-06-12 22:10 ` Max Reitz
  2019-06-13 15:28 ` [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Vladimir Sementsov-Ogievskiy
  42 siblings, 0 replies; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:10 UTC (permalink / raw)
  To: qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel, Max Reitz

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 tests/qemu-iotests/040     | 61 ++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/040.out |  4 +--
 2 files changed, 63 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/040 b/tests/qemu-iotests/040
index 31c2a8da3b..5cbbd30ee3 100755
--- a/tests/qemu-iotests/040
+++ b/tests/qemu-iotests/040
@@ -609,5 +609,66 @@ class TestCommitWithFilters(iotests.QMPTestCase):
         # 3 has been committed into 2
         self.pattern_files[3] = self.img2
 
+class TestCommitWithOverriddenBacking(iotests.QMPTestCase):
+    img_base_a = os.path.join(iotests.test_dir, 'base_a.img')
+    img_base_b = os.path.join(iotests.test_dir, 'base_b.img')
+    img_top = os.path.join(iotests.test_dir, 'top.img')
+
+    def setUp(self):
+        qemu_img('create', '-f', iotests.imgfmt, self.img_base_a, '1M')
+        qemu_img('create', '-f', iotests.imgfmt, self.img_base_b, '1M')
+        qemu_img('create', '-f', iotests.imgfmt, '-b', self.img_base_a,
+                 self.img_top)
+
+        self.vm = iotests.VM()
+        self.vm.launch()
+
+        # Use base_b instead of base_a as the backing of top
+        result = self.vm.qmp('blockdev-add', **{
+                                'node-name': 'top',
+                                'driver': iotests.imgfmt,
+                                'file': {
+                                    'driver': 'file',
+                                    'filename': self.img_top
+                                },
+                                'backing': {
+                                    'node-name': 'base',
+                                    'driver': iotests.imgfmt,
+                                    'file': {
+                                        'driver': 'file',
+                                        'filename': self.img_base_b
+                                    }
+                                }
+                            })
+        self.assert_qmp(result, 'return', {})
+
+    def tearDown(self):
+        self.vm.shutdown()
+        os.remove(self.img_top)
+        os.remove(self.img_base_a)
+        os.remove(self.img_base_b)
+
+    def test_commit_to_a(self):
+        # Try committing to base_a (which should fail, as top's
+        # backing image is base_b instead)
+        result = self.vm.qmp('block-commit',
+                             job_id='commit',
+                             device='top',
+                             base=self.img_base_a)
+        self.assert_qmp(result, 'error/class', 'GenericError')
+
+    def test_commit_to_b(self):
+        # Try committing to base_b (which should work, since that is
+        # actually top's backing image)
+        result = self.vm.qmp('block-commit',
+                             job_id='commit',
+                             device='top',
+                             base=self.img_base_b)
+        self.assert_qmp(result, 'return', {})
+
+        self.vm.event_wait('BLOCK_JOB_READY')
+        self.vm.qmp('block-job-complete', device='commit')
+        self.vm.event_wait('BLOCK_JOB_COMPLETED')
+
 if __name__ == '__main__':
     iotests.main(supported_fmts=['qcow2', 'qed'])
diff --git a/tests/qemu-iotests/040.out b/tests/qemu-iotests/040.out
index fe58934d7a..499af0e2ff 100644
--- a/tests/qemu-iotests/040.out
+++ b/tests/qemu-iotests/040.out
@@ -1,5 +1,5 @@
-...................................................
+.....................................................
 ----------------------------------------------------------------------
-Ran 51 tests
+Ran 53 tests
 
 OK
-- 
2.21.0
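
One way to read the two test cases above: block-commit looks for its base
in the backing chain as it exists at runtime, not in the backing file name
recorded in the image header.  The sketch below is a toy model of that
idea only.  `Node` and `find_base_by_filename` are hypothetical names, the
paths are placeholders, and the real lookup in qemu is more involved.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class Node:
    filename: str
    # Runtime backing link as set up by blockdev-add, which may differ
    # from the backing file name stored in the image header.
    backing: Optional["Node"] = None


def find_base_by_filename(top, filename):
    """Walk the runtime backing chain below `top`.

    Returns the node opened from `filename`, or None if no node in the
    chain was opened from it.
    """
    node = top.backing
    while node is not None:
        if node.filename == filename:
            return node
        node = node.backing
    return None
```

With top opened on base_b, a lookup for base_a finds nothing (the case
where the test expects a GenericError), while a lookup for base_b returns
the backing node.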




* Re: [Qemu-devel] [PATCH v5 22/42] block: Use CAFs in bdrv_get_allocated_file_size()
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 22/42] block: Use CAFs in bdrv_get_allocated_file_size() Max Reitz
@ 2019-06-12 22:17   ` Max Reitz
  2019-06-14 15:41   ` Vladimir Sementsov-Ogievskiy
  1 sibling, 0 replies; 113+ messages in thread
From: Max Reitz @ 2019-06-12 22:17 UTC (permalink / raw)
  To: qemu-block; +Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel


On 13.06.19 00:09, Max Reitz wrote:
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  block.c | 26 ++++++++++++++++++++++++--
>  1 file changed, 24 insertions(+), 2 deletions(-)
> 
> diff --git a/block.c b/block.c
> index 11b7ba8cf6..856d9b58be 100644
> --- a/block.c
> +++ b/block.c
> @@ -4511,15 +4511,37 @@ exit:
>  int64_t bdrv_get_allocated_file_size(BlockDriverState *bs)
>  {
>      BlockDriver *drv = bs->drv;
> +    BlockDriverState *storage_bs, *metadata_bs;
> +
>      if (!drv) {
>          return -ENOMEDIUM;
>      }
> +
>      if (drv->bdrv_get_allocated_file_size) {
>          return drv->bdrv_get_allocated_file_size(bs);
>      }
> -    if (bs->file) {
> -        return bdrv_get_allocated_file_size(bs->file->bs);
> +
> +    storage_bs = bdrv_storage_bs(bs);
> +    metadata_bs = bdrv_metadata_bs(bs);
> +
> +    if (storage_bs) {
> +        int64_t data_size, metadata_size = 0;
> +
> +        data_size = bdrv_get_allocated_file_size(storage_bs);
> +        if (data_size < 0) {
> +            return data_size;
> +        }
> +
> +        if (storage_bs != metadata_bs) {

Let this be a lesson to you: If you run all tests, then prepare to send
the series and just change “a minor thing”, you really should rerun the
tests.  Well, I should have, at least.

That should read “if (metadata_bs && storage_bs != metadata_bs) {”.
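
To illustrate why the extra NULL check matters, here is a simplified model
in Python rather than the block.c code itself.  `Node`, `own_size` and
`allocated_size` are names made up for this sketch; `own_size` stands in
for a driver that implements its own allocated-size callback.

```python
import errno
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class Node:
    own_size: Optional[int] = None      # what a driver callback would report
    storage: Optional["Node"] = None    # stand-in for bdrv_storage_bs()
    metadata: Optional["Node"] = None   # stand-in for bdrv_metadata_bs()


def allocated_size(node):
    """Return the allocated size, or a negative errno value on failure."""
    if node.own_size is not None:
        return node.own_size
    if node.storage is None:
        return -errno.ENOTSUP

    data_size = allocated_size(node.storage)
    if data_size < 0:
        return data_size

    metadata_size = 0
    # The corrected condition: only recurse into a metadata node that
    # exists and differs from the storage node.
    if node.metadata is not None and node.metadata is not node.storage:
        metadata_size = allocated_size(node.metadata)
        if metadata_size < 0:
            return metadata_size

    return data_size + metadata_size
```

Without the first half of the condition, a node that has a storage child
but no metadata child would recurse into a missing node.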

(Damn.  Why did I only remember to do so literally five minutes after
sending the series?)

Max

> +            metadata_size = bdrv_get_allocated_file_size(metadata_bs);
> +            if (metadata_size < 0) {
> +                return metadata_size;
> +            }
> +        }
> +
> +        return data_size + metadata_size;
>      }
> +
>      return -ENOTSUP;
>  }
>  
> 





* Re: [Qemu-devel] [PATCH v5 01/42] block: Mark commit and mirror as filter drivers
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 01/42] block: Mark commit and mirror as filter drivers Max Reitz
@ 2019-06-13 10:47   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-06-13 10:47 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

13.06.2019 1:09, Max Reitz wrote:
> The commit and mirror block nodes are filters, so they should be marked
> as such.  (Strictly speaking, BDS.is_filter's documentation states that
> a filter's child must be bs->file.  The following patch will relax this
> restriction, however.)
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> Reviewed-by: Alberto Garcia <berto@igalia.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

> ---
>   block/commit.c | 2 ++
>   block/mirror.c | 2 ++
>   2 files changed, 4 insertions(+)
> 
> diff --git a/block/commit.c b/block/commit.c
> index c815def89a..f20a26fecd 100644
> --- a/block/commit.c
> +++ b/block/commit.c
> @@ -256,6 +256,8 @@ static BlockDriver bdrv_commit_top = {
>       .bdrv_co_block_status       = bdrv_co_block_status_from_backing,
>       .bdrv_refresh_filename      = bdrv_commit_top_refresh_filename,
>       .bdrv_child_perm            = bdrv_commit_top_child_perm,
> +
> +    .is_filter                  = true,
>   };
>   
>   void commit_start(const char *job_id, BlockDriverState *bs,
> diff --git a/block/mirror.c b/block/mirror.c
> index f8bdb5b21b..4fa8f57c80 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -1480,6 +1480,8 @@ static BlockDriver bdrv_mirror_top = {
>       .bdrv_co_block_status       = bdrv_co_block_status_from_backing,
>       .bdrv_refresh_filename      = bdrv_mirror_top_refresh_filename,
>       .bdrv_child_perm            = bdrv_mirror_top_child_perm,
> +
> +    .is_filter                  = true,
>   };
>   
>   static void mirror_start_job(const char *job_id, BlockDriverState *bs,
> 


-- 
Best regards,
Vladimir


* Re: [Qemu-devel] [PATCH v5 02/42] copy-on-read: Support compressed writes
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 02/42] copy-on-read: Support compressed writes Max Reitz
@ 2019-06-13 10:49   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-06-13 10:49 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

13.06.2019 1:09, Max Reitz wrote:
> Signed-off-by: Max Reitz <mreitz@redhat.com>

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

> ---
>   block/copy-on-read.c | 11 +++++++++++
>   1 file changed, 11 insertions(+)
> 
> diff --git a/block/copy-on-read.c b/block/copy-on-read.c
> index 53972b1da3..88e1c1f538 100644
> --- a/block/copy-on-read.c
> +++ b/block/copy-on-read.c
> @@ -114,6 +114,16 @@ static int coroutine_fn cor_co_pdiscard(BlockDriverState *bs,
>   }
>   
>   
> +static int coroutine_fn cor_co_pwritev_compressed(BlockDriverState *bs,
> +                                                  uint64_t offset,
> +                                                  uint64_t bytes,
> +                                                  QEMUIOVector *qiov)
> +{
> +    return bdrv_co_pwritev(bs->file, offset, bytes, qiov,
> +                           BDRV_REQ_WRITE_COMPRESSED);
> +}

Hmm, it might be better to handle compression support by checking the supported
flags.

> +
> +
>   static void cor_eject(BlockDriverState *bs, bool eject_flag)
>   {
>       bdrv_eject(bs->file->bs, eject_flag);
> @@ -146,6 +156,7 @@ static BlockDriver bdrv_copy_on_read = {
>       .bdrv_co_pwritev                    = cor_co_pwritev,
>       .bdrv_co_pwrite_zeroes              = cor_co_pwrite_zeroes,
>       .bdrv_co_pdiscard                   = cor_co_pdiscard,
> +    .bdrv_co_pwritev_compressed         = cor_co_pwritev_compressed,
>   
>       .bdrv_eject                         = cor_eject,
>       .bdrv_lock_medium                   = cor_lock_medium,
> 


-- 
Best regards,
Vladimir


* Re: [Qemu-devel] [PATCH v5 03/42] throttle: Support compressed writes
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 03/42] throttle: " Max Reitz
@ 2019-06-13 10:51   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-06-13 10:51 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

13.06.2019 1:09, Max Reitz wrote:
> Signed-off-by: Max Reitz <mreitz@redhat.com>

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

> ---
>   block/throttle.c | 10 ++++++++++
>   1 file changed, 10 insertions(+)
> 
> diff --git a/block/throttle.c b/block/throttle.c
> index f64dcc27b9..de1b6bd7e8 100644
> --- a/block/throttle.c
> +++ b/block/throttle.c
> @@ -152,6 +152,15 @@ static int coroutine_fn throttle_co_pdiscard(BlockDriverState *bs,
>       return bdrv_co_pdiscard(bs->file, offset, bytes);
>   }
>   
> +static int coroutine_fn throttle_co_pwritev_compressed(BlockDriverState *bs,
> +                                                       uint64_t offset,
> +                                                       uint64_t bytes,
> +                                                       QEMUIOVector *qiov)
> +{
> +    return throttle_co_pwritev(bs, offset, bytes, qiov,
> +                               BDRV_REQ_WRITE_COMPRESSED);
> +}
> +
>   static int throttle_co_flush(BlockDriverState *bs)
>   {
>       return bdrv_co_flush(bs->file->bs);
> @@ -250,6 +259,7 @@ static BlockDriver bdrv_throttle = {
>   
>       .bdrv_co_pwrite_zeroes              =   throttle_co_pwrite_zeroes,
>       .bdrv_co_pdiscard                   =   throttle_co_pdiscard,
> +    .bdrv_co_pwritev_compressed         =   throttle_co_pwritev_compressed,
>   
>       .bdrv_recurse_is_first_non_filter   =   throttle_recurse_is_first_non_filter,
>   
> 


-- 
Best regards,
Vladimir


* Re: [Qemu-devel] [PATCH v5 04/42] block: Add child access functions
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 04/42] block: Add child access functions Max Reitz
@ 2019-06-13 12:15   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-06-13 12:15 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

13.06.2019 1:09, Max Reitz wrote:
> There are BDS children that the general block layer code can access,
> namely bs->file and bs->backing.  Since the introduction of filters and
> external data files, their meaning is not quite clear.  bs->backing can
> be a COW source, or it can be an R/W-filtered child; bs->file can be an
> R/W-filtered child, it can be data and metadata storage, or it can be
> just metadata storage.
> 
> This overloading really is not helpful.  This patch adds functions that
> retrieve the correct child for each exact purpose.  Later patches in
> this series will make use of them.  Doing so will allow us to handle
> filter nodes and external data files in a meaningful way.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

> ---
>   include/block/block_int.h | 57 ++++++++++++++++++++--
>   block.c                   | 99 +++++++++++++++++++++++++++++++++++++++
>   2 files changed, 153 insertions(+), 3 deletions(-)
> 
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index 58fca37ba3..7ce71623f8 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h

[..]

>   
>   typedef struct BlockLimits {
> @@ -1249,4 +1258,46 @@ int coroutine_fn bdrv_co_copy_range_to(BdrvChild *src, uint64_t src_offset,
>   
>   int refresh_total_sectors(BlockDriverState *bs, int64_t hint);
>   
> +BdrvChild *bdrv_filtered_cow_child(BlockDriverState *bs);
> +BdrvChild *bdrv_filtered_rw_child(BlockDriverState *bs);
> +BdrvChild *bdrv_filtered_child(BlockDriverState *bs);
> +BdrvChild *bdrv_metadata_child(BlockDriverState *bs);
> +BdrvChild *bdrv_storage_child(BlockDriverState *bs);
> +BdrvChild *bdrv_primary_child(BlockDriverState *bs);
> +

Wow! Such a big family :)

I'd like to put them into a table, just to make it easier for me to keep them all in mind.
If you want, you may include it here as a comment, though it is difficult to keep it under 80 columns.
I think I'll modify it after reviewing the following patches.

+--------------------+----------------------------+-------------------------------+-------------------------------+
| child              | description                | filter node                   | format node                   |
+--------------------+----------------------------+-------------------------------+-------------------------------+
| filtered_cow_child | for COW/COR                | NULL                          | bs->backing                   |
+--------------------+----------------------------+-------------------------------+-------------------------------+
| filtered_rw_child  | for IO pass-through        | bs->backing or bs->file       | NULL                          |
|                    |                            | (only one may exist)          |                               |
+--------------------+----------------------------+-------------------------------+-------------------------------+
| filtered_child     | one of the previous        |                               |                               |
|                    | for extended backing       | filtered_rw_child             | filtered_cow_child            |
|                    | chain                      |                               |                               |
+--------------------+----------------------------+-------------------------------+-------------------------------+
| metadata_child     | where metadata is stored   | NULL                          | bs->file                      |
+--------------------+----------------------------+-------------------------------+-------------------------------+
| storage_child      | where actual guest visible | bs->drv->bdrv_storage_child() | bs->drv->bdrv_storage_child() |
|                    | data is stored             | or filtered_rw_child          | or bs->file                   |
+--------------------+----------------------------+-------------------------------+-------------------------------+
| primary_child      | don't know yet             | filtered_rw_child             | bs->file                      |
+--------------------+----------------------------+-------------------------------+-------------------------------+
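
The table can also be written down as a small executable model.  This is a
transcription of the table above into Python, not of the C functions in
the patch; `Node` and its `driver_storage` field (a stand-in for a
driver's .bdrv_storage_child() hook) are assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class Node:
    name: str
    is_filter: bool = False
    file: Optional["Node"] = None
    backing: Optional["Node"] = None
    # Stand-in for a driver's .bdrv_storage_child() hook; None means
    # the driver does not provide one.
    driver_storage: Optional["Node"] = None


def filtered_cow_child(n):
    return None if n.is_filter else n.backing


def filtered_rw_child(n):
    if not n.is_filter:
        return None
    return n.backing or n.file  # only one of the two may exist


def filtered_child(n):
    return filtered_rw_child(n) or filtered_cow_child(n)


def metadata_child(n):
    return None if n.is_filter else n.file


def storage_child(n):
    if n.driver_storage is not None:
        return n.driver_storage
    return filtered_rw_child(n) if n.is_filter else n.file


def primary_child(n):
    return filtered_rw_child(n) or n.file
```

For a format node with an external data file, this gives different nodes
for metadata_child and storage_child; for a filter, every accessor except
filtered_cow_child and metadata_child returns the filtered child.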




-- 
Best regards,
Vladimir


* Re: [Qemu-devel] [PATCH v5 05/42] block: Add chain helper functions
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 05/42] block: Add chain helper functions Max Reitz
@ 2019-06-13 12:26   ` Vladimir Sementsov-Ogievskiy
  2019-06-13 12:33     ` Max Reitz
  0 siblings, 1 reply; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-06-13 12:26 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

13.06.2019 1:09, Max Reitz wrote:
> Add some helper functions for skipping filters in a chain of block
> nodes.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>   include/block/block_int.h |  3 +++
>   block.c                   | 55 +++++++++++++++++++++++++++++++++++++++
>   2 files changed, 58 insertions(+)
> 
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index 7ce71623f8..875a33f255 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -1264,6 +1264,9 @@ BdrvChild *bdrv_filtered_child(BlockDriverState *bs);
>   BdrvChild *bdrv_metadata_child(BlockDriverState *bs);
>   BdrvChild *bdrv_storage_child(BlockDriverState *bs);
>   BdrvChild *bdrv_primary_child(BlockDriverState *bs);
> +BlockDriverState *bdrv_skip_implicit_filters(BlockDriverState *bs);
> +BlockDriverState *bdrv_skip_rw_filters(BlockDriverState *bs);
> +BlockDriverState *bdrv_backing_chain_next(BlockDriverState *bs);
>   
>   static inline BlockDriverState *child_bs(BdrvChild *child)
>   {
> diff --git a/block.c b/block.c
> index 724d8889a6..be18130944 100644
> --- a/block.c
> +++ b/block.c
> @@ -6494,3 +6494,58 @@ BdrvChild *bdrv_primary_child(BlockDriverState *bs)
>   {
>       return bdrv_filtered_rw_child(bs) ?: bs->file;
>   }
> +
> +static BlockDriverState *bdrv_skip_filters(BlockDriverState *bs,
> +                                           bool stop_on_explicit_filter)
> +{
> +    BdrvChild *filtered;
> +
> +    if (!bs) {
> +        return NULL;
> +    }
> +
> +    while (!(stop_on_explicit_filter && !bs->implicit)) {
> +        filtered = bdrv_filtered_rw_child(bs);
> +        if (!filtered) {
> +            break;
> +        }
> +        bs = filtered->bs;
> +    }
> +    /*
> +     * Note that this treats nodes with bs->drv == NULL as not being
> +     * R/W filters (bs->drv == NULL should be replaced by something
> +     * else anyway).

After "bs->drv == NULL" I would add "as well as filters without
filtered_rw child".

> +     * The advantage of this behavior is that this function will thus
> +     * always return a non-NULL value (given a non-NULL @bs).

And this advantage follows from the case I have added above, not from the
bs->drv one.

> +     */
> +
> +    return bs;
> +}
> +
> +/*
> + * Return the first BDS that has not been added implicitly or that
> + * does not have an RW-filtered child down the chain starting from @bs
> + * (including @bs itself).
> + */
> +BlockDriverState *bdrv_skip_implicit_filters(BlockDriverState *bs)
> +{
> +    return bdrv_skip_filters(bs, true);
> +}
> +
> +/*
> + * Return the first BDS that does not have an RW-filtered child down
> + * the chain starting from @bs (including @bs itself).
> + */
> +BlockDriverState *bdrv_skip_rw_filters(BlockDriverState *bs)
> +{
> +    return bdrv_skip_filters(bs, false);
> +}
> +
> +/*
> + * For a backing chain, return the first non-filter backing image of
> + * the first non-filter image.
> + */
> +BlockDriverState *bdrv_backing_chain_next(BlockDriverState *bs)
> +{
> +    return bdrv_skip_rw_filters(bdrv_filtered_cow_bs(bdrv_skip_rw_filters(bs)));
> +}
> 


Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [Qemu-devel] [PATCH v5 06/42] qcow2: Implement .bdrv_storage_child()
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 06/42] qcow2: Implement .bdrv_storage_child() Max Reitz
@ 2019-06-13 12:27   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-06-13 12:27 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

13.06.2019 1:09, Max Reitz wrote:
> Signed-off-by: Max Reitz <mreitz@redhat.com>


Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

> ---
>   block/qcow2.c | 9 +++++++++
>   1 file changed, 9 insertions(+)
> 
> diff --git a/block/qcow2.c b/block/qcow2.c
> index 9396d490d5..57675c9416 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -5085,6 +5085,13 @@ void qcow2_signal_corruption(BlockDriverState *bs, bool fatal, int64_t offset,
>       s->signaled_corruption = true;
>   }
>   
> +static BdrvChild *qcow2_storage_child(BlockDriverState *bs)
> +{
> +    BDRVQcow2State *s = bs->opaque;
> +
> +    return s->data_file;
> +}
> +
>   static QemuOptsList qcow2_create_opts = {
>       .name = "qcow2-create-opts",
>       .head = QTAILQ_HEAD_INITIALIZER(qcow2_create_opts.head),
> @@ -5231,6 +5238,8 @@ BlockDriver bdrv_qcow2 = {
>       .bdrv_reopen_bitmaps_rw = qcow2_reopen_bitmaps_rw,
>       .bdrv_can_store_new_dirty_bitmap = qcow2_can_store_new_dirty_bitmap,
>       .bdrv_remove_persistent_dirty_bitmap = qcow2_remove_persistent_dirty_bitmap,
> +
> +    .bdrv_storage_child = qcow2_storage_child,
>   };
>   
>   static void bdrv_qcow2_init(void)
> 


-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [Qemu-devel] [PATCH v5 05/42] block: Add chain helper functions
  2019-06-13 12:26   ` Vladimir Sementsov-Ogievskiy
@ 2019-06-13 12:33     ` Max Reitz
  2019-06-13 12:39       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 113+ messages in thread
From: Max Reitz @ 2019-06-13 12:33 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block; +Cc: Kevin Wolf, qemu-devel



On 13.06.19 14:26, Vladimir Sementsov-Ogievskiy wrote:
> 13.06.2019 1:09, Max Reitz wrote:
>> Add some helper functions for skipping filters in a chain of block
>> nodes.
>>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> ---
>>   include/block/block_int.h |  3 +++
>>   block.c                   | 55 +++++++++++++++++++++++++++++++++++++++
>>   2 files changed, 58 insertions(+)
>>
>> diff --git a/include/block/block_int.h b/include/block/block_int.h
>> index 7ce71623f8..875a33f255 100644
>> --- a/include/block/block_int.h
>> +++ b/include/block/block_int.h
>> @@ -1264,6 +1264,9 @@ BdrvChild *bdrv_filtered_child(BlockDriverState *bs);
>>   BdrvChild *bdrv_metadata_child(BlockDriverState *bs);
>>   BdrvChild *bdrv_storage_child(BlockDriverState *bs);
>>   BdrvChild *bdrv_primary_child(BlockDriverState *bs);
>> +BlockDriverState *bdrv_skip_implicit_filters(BlockDriverState *bs);
>> +BlockDriverState *bdrv_skip_rw_filters(BlockDriverState *bs);
>> +BlockDriverState *bdrv_backing_chain_next(BlockDriverState *bs);
>>   
>>   static inline BlockDriverState *child_bs(BdrvChild *child)
>>   {
>> diff --git a/block.c b/block.c
>> index 724d8889a6..be18130944 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -6494,3 +6494,58 @@ BdrvChild *bdrv_primary_child(BlockDriverState *bs)
>>   {
>>       return bdrv_filtered_rw_child(bs) ?: bs->file;
>>   }
>> +
>> +static BlockDriverState *bdrv_skip_filters(BlockDriverState *bs,
>> +                                           bool stop_on_explicit_filter)
>> +{
>> +    BdrvChild *filtered;
>> +
>> +    if (!bs) {
>> +        return NULL;
>> +    }
>> +
>> +    while (!(stop_on_explicit_filter && !bs->implicit)) {
>> +        filtered = bdrv_filtered_rw_child(bs);
>> +        if (!filtered) {
>> +            break;
>> +        }
>> +        bs = filtered->bs;
>> +    }
>> +    /*
>> +     * Note that this treats nodes with bs->drv == NULL
> 
> as well as filters without filtered_rw child

A filter always must have a filtered_rw child, though.  So I don’t quite
understand what you mean here...

Max

>   as not being
>> +     * R/W filters (bs->drv == NULL should be replaced by something
>> +     * else anyway).
>> +     * The advantage of this behavior is that this function will thus
>> +     * always return a non-NULL value (given a non-NULL @bs).
> 
> and this is the advantage of what I've written, not about bs->drv.
> 
>> +     */
>> +
>> +    return bs;
>> +}
>> +
>> +/*
>> + * Return the first BDS that has not been added implicitly or that
>> + * does not have an RW-filtered child down the chain starting from @bs
>> + * (including @bs itself).
>> + */
>> +BlockDriverState *bdrv_skip_implicit_filters(BlockDriverState *bs)
>> +{
>> +    return bdrv_skip_filters(bs, true);
>> +}
>> +
>> +/*
>> + * Return the first BDS that does not have an RW-filtered child down
>> + * the chain starting from @bs (including @bs itself).
>> + */
>> +BlockDriverState *bdrv_skip_rw_filters(BlockDriverState *bs)
>> +{
>> +    return bdrv_skip_filters(bs, false);
>> +}
>> +
>> +/*
>> + * For a backing chain, return the first non-filter backing image of
>> + * the first non-filter image.
>> + */
>> +BlockDriverState *bdrv_backing_chain_next(BlockDriverState *bs)
>> +{
>> +    return bdrv_skip_rw_filters(bdrv_filtered_cow_bs(bdrv_skip_rw_filters(bs)));
>> +}
>>
> 
> 
> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> 




^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [Qemu-devel] [PATCH v5 07/42] block: *filtered_cow_child() for *has_zero_init()
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 07/42] block: *filtered_cow_child() for *has_zero_init() Max Reitz
@ 2019-06-13 12:34   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-06-13 12:34 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

13.06.2019 1:09, Max Reitz wrote:
> bdrv_has_zero_init() and the related bdrv_unallocated_blocks_are_zero()
> should use bdrv_filtered_cow_child() if they want to check whether the
> given BDS has a COW backing file.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>   block.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/block.c b/block.c
> index be18130944..64d6190984 100644
> --- a/block.c
> +++ b/block.c
> @@ -4933,7 +4933,7 @@ int bdrv_has_zero_init(BlockDriverState *bs)
>   
>       /* If BS is a copy on write image, it is initialized to
>          the contents of the base image, which may not be zeroes.  */
> -    if (bs->backing) {
> +    if (bdrv_filtered_cow_child(bs)) {
>           return 0;
>       }

Hmm, if you are fixing bdrv_has_zero_init() around filters, I'd prefer to fix the whole
function, converting the following too:
     if (bs->file && bs->drv->is_filter) {
         return bdrv_has_zero_init(bs->file->bs);
     }


But it's not a real problem:

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
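
For illustration only, a minimal toy sketch of what a fully converted check could look like. ToyBs and toy_has_zero_init() are hypothetical stand-ins, not QEMU code; the two child pointers play the role of what bdrv_filtered_cow_child() and an R/W-filtered child lookup would return, and the real function's handling of bs->drv and the driver callback is simplified to a single field.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for BlockDriverState (hypothetical, not QEMU code). */
typedef struct ToyBs {
    struct ToyBs *cow_child;  /* COW backing child, NULL if none */
    struct ToyBs *rw_child;   /* R/W-filtered child, set for filters only */
    int driver_zero_init;     /* what the driver itself would report */
} ToyBs;

static int toy_has_zero_init(const ToyBs *bs)
{
    /* A COW overlay starts out with the contents of its backing image. */
    if (bs->cow_child) {
        return 0;
    }
    /* A filter simply reports whatever its filtered child reports. */
    if (bs->rw_child) {
        return toy_has_zero_init(bs->rw_child);
    }
    return bs->driver_zero_init;
}

/* Builds a few small graphs and checks the expected answers. */
static int toy_has_zero_init_demo(void)
{
    ToyBs base = { NULL, NULL, 1 };
    ToyBs overlay = { &base, NULL, 1 };
    ToyBs filter_on_base = { NULL, &base, 0 };
    ToyBs filter_on_overlay = { NULL, &overlay, 0 };

    assert(toy_has_zero_init(&base) == 1);
    assert(toy_has_zero_init(&overlay) == 0);
    assert(toy_has_zero_init(&filter_on_base) == 1);
    assert(toy_has_zero_init(&filter_on_overlay) == 0);
    return 0;
}
```

In this model a filter never answers for itself, regardless of which child link it uses.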

>       if (bs->drv->bdrv_has_zero_init) {
> @@ -4951,7 +4951,7 @@ bool bdrv_unallocated_blocks_are_zero(BlockDriverState *bs)
>   {
>       BlockDriverInfo bdi;
>   
> -    if (bs->backing) {
> +    if (bdrv_filtered_cow_child(bs)) {
>           return false;
>       }
>   
> 


-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [Qemu-devel] [PATCH v5 05/42] block: Add chain helper functions
  2019-06-13 12:33     ` Max Reitz
@ 2019-06-13 12:39       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-06-13 12:39 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

13.06.2019 15:33, Max Reitz wrote:
> On 13.06.19 14:26, Vladimir Sementsov-Ogievskiy wrote:
>> 13.06.2019 1:09, Max Reitz wrote:
>>> Add some helper functions for skipping filters in a chain of block
>>> nodes.
>>>
>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>>> ---
>>>    include/block/block_int.h |  3 +++
>>>    block.c                   | 55 +++++++++++++++++++++++++++++++++++++++
>>>    2 files changed, 58 insertions(+)
>>>
>>> diff --git a/include/block/block_int.h b/include/block/block_int.h
>>> index 7ce71623f8..875a33f255 100644
>>> --- a/include/block/block_int.h
>>> +++ b/include/block/block_int.h
>>> @@ -1264,6 +1264,9 @@ BdrvChild *bdrv_filtered_child(BlockDriverState *bs);
>>>    BdrvChild *bdrv_metadata_child(BlockDriverState *bs);
>>>    BdrvChild *bdrv_storage_child(BlockDriverState *bs);
>>>    BdrvChild *bdrv_primary_child(BlockDriverState *bs);
>>> +BlockDriverState *bdrv_skip_implicit_filters(BlockDriverState *bs);
>>> +BlockDriverState *bdrv_skip_rw_filters(BlockDriverState *bs);
>>> +BlockDriverState *bdrv_backing_chain_next(BlockDriverState *bs);
>>>    
>>>    static inline BlockDriverState *child_bs(BdrvChild *child)
>>>    {
>>> diff --git a/block.c b/block.c
>>> index 724d8889a6..be18130944 100644
>>> --- a/block.c
>>> +++ b/block.c
>>> @@ -6494,3 +6494,58 @@ BdrvChild *bdrv_primary_child(BlockDriverState *bs)
>>>    {
>>>        return bdrv_filtered_rw_child(bs) ?: bs->file;
>>>    }
>>> +
>>> +static BlockDriverState *bdrv_skip_filters(BlockDriverState *bs,
>>> +                                           bool stop_on_explicit_filter)
>>> +{
>>> +    BdrvChild *filtered;
>>> +
>>> +    if (!bs) {
>>> +        return NULL;
>>> +    }
>>> +
>>> +    while (!(stop_on_explicit_filter && !bs->implicit)) {
>>> +        filtered = bdrv_filtered_rw_child(bs);
>>> +        if (!filtered) {
>>> +            break;
>>> +        }
>>> +        bs = filtered->bs;
>>> +    }
>>> +    /*
>>> +     * Note that this treats nodes with bs->drv == NULL
>>
>> as well as filters without filtered_rw child
> 
> A filter always must have a filtered_rw child, though.  So I don’t quite
> understand what you mean here...
> 
> Max
> 
>>    as not being
>>> +     * R/W filters (bs->drv == NULL should be replaced by something
>>> +     * else anyway).
>>> +     * The advantage of this behavior is that this function will thus
>>> +     * always return a non-NULL value (given a non-NULL @bs).
>>
>> and this is the advantage of what I've written, not about bs->drv.

I mean that this advantage seems unrelated to the bs->drv == NULL case,
as even with bs->drv == NULL we could still follow bs->backing or bs->file.

But I don't really care, my r-b stands anyway.
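
To make the point about the return value concrete, here is a minimal toy sketch of the loop under discussion. ToyNode and toy_skip_filters() are hypothetical stand-ins, not QEMU code; the `filtered` pointer models what bdrv_filtered_rw_child() would return. It only shows that the loop stops at the last node it could reach, so a node without a filtered child is returned as-is.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for BlockDriverState (hypothetical, not QEMU code). */
typedef struct ToyNode {
    bool is_filter;            /* node is an R/W filter */
    bool implicit;             /* node was inserted implicitly */
    struct ToyNode *filtered;  /* R/W-filtered child, NULL if none */
} ToyNode;

/* Mirrors the structure of the loop in the quoted bdrv_skip_filters(). */
static ToyNode *toy_skip_filters(ToyNode *n, bool stop_on_explicit_filter)
{
    if (!n) {
        return NULL;
    }
    while (!(stop_on_explicit_filter && !n->implicit)) {
        ToyNode *next = n->is_filter ? n->filtered : NULL;
        if (!next) {
            break;  /* not a filter, or nothing to descend into */
        }
        n = next;
    }
    return n;  /* never NULL for a non-NULL argument */
}

/* Chain: implicit filter -> explicit filter -> format node. */
static int toy_skip_filters_demo(void)
{
    ToyNode fmt = { false, false, NULL };
    ToyNode explicit_f = { true, false, &fmt };
    ToyNode implicit_f = { true, true, &explicit_f };
    ToyNode childless = { true, false, NULL };

    assert(toy_skip_filters(&implicit_f, false) == &fmt);
    assert(toy_skip_filters(&implicit_f, true) == &explicit_f);
    assert(toy_skip_filters(&childless, false) == &childless);
    assert(toy_skip_filters(NULL, false) == NULL);
    return 0;
}
```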

>>
>>> +     */
>>> +
>>> +    return bs;
>>> +}
>>> +
>>> +/*
>>> + * Return the first BDS that has not been added implicitly or that
>>> + * does not have an RW-filtered child down the chain starting from @bs
>>> + * (including @bs itself).
>>> + */
>>> +BlockDriverState *bdrv_skip_implicit_filters(BlockDriverState *bs)
>>> +{
>>> +    return bdrv_skip_filters(bs, true);
>>> +}
>>> +
>>> +/*
>>> + * Return the first BDS that does not have an RW-filtered child down
>>> + * the chain starting from @bs (including @bs itself).
>>> + */
>>> +BlockDriverState *bdrv_skip_rw_filters(BlockDriverState *bs)
>>> +{
>>> +    return bdrv_skip_filters(bs, false);
>>> +}
>>> +
>>> +/*
>>> + * For a backing chain, return the first non-filter backing image of
>>> + * the first non-filter image.
>>> + */
>>> +BlockDriverState *bdrv_backing_chain_next(BlockDriverState *bs)
>>> +{
>>> +    return bdrv_skip_rw_filters(bdrv_filtered_cow_bs(bdrv_skip_rw_filters(bs)));
>>> +}
>>>
>>
>>
>> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>
> 
> 


-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [Qemu-devel] [PATCH v5 08/42] block: bdrv_set_backing_hd() is about bs->backing
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 08/42] block: bdrv_set_backing_hd() is about bs->backing Max Reitz
@ 2019-06-13 12:40   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-06-13 12:40 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

13.06.2019 1:09, Max Reitz wrote:
> bdrv_set_backing_hd() is a function that explicitly cares about the
> bs->backing child.  Highlight that in its description and use
> child_bs(bs->backing) instead of backing_bs(bs) to make it more obvious.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

> ---
>   block.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/block.c b/block.c
> index 64d6190984..8438b0699e 100644
> --- a/block.c
> +++ b/block.c
> @@ -2417,7 +2417,7 @@ static bool bdrv_inherits_from_recursive(BlockDriverState *child,
>   }
>   
>   /*
> - * Sets the backing file link of a BDS. A new reference is created; callers
> + * Sets the bs->backing link of a BDS. A new reference is created; callers
>    * which don't need their own reference any more must call bdrv_unref().
>    */
>   void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd,
> @@ -2426,7 +2426,7 @@ void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd,
>       bool update_inherits_from = bdrv_chain_contains(bs, backing_hd) &&
>           bdrv_inherits_from_recursive(backing_hd, bs);
>   
> -    if (bdrv_is_backing_chain_frozen(bs, backing_bs(bs), errp)) {
> +    if (bdrv_is_backing_chain_frozen(bs, child_bs(bs->backing), errp)) {
>           return;
>       }
>   
> 


-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [Qemu-devel] [PATCH v5 09/42] block: Include filters when freezing backing chain
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 09/42] block: Include filters when freezing backing chain Max Reitz
@ 2019-06-13 13:04   ` Vladimir Sementsov-Ogievskiy
  2019-06-13 14:05     ` Max Reitz
  0 siblings, 1 reply; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-06-13 13:04 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

13.06.2019 1:09, Max Reitz wrote:
> In order to make filters work in backing chains, the associated
> functions must be able to deal with them and freeze all filter links, be
> they COW or R/W filter links.
> 
> While at it, add some comments that note which functions require their
> caller to ensure that a given child link is not frozen, and how the
> callers do so.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>   block.c | 45 ++++++++++++++++++++++++++++++++-------------
>   1 file changed, 32 insertions(+), 13 deletions(-)
> 
> diff --git a/block.c b/block.c
> index 8438b0699e..45882a3470 100644
> --- a/block.c
> +++ b/block.c
> @@ -2214,12 +2214,15 @@ static void bdrv_replace_child_noperm(BdrvChild *child,
>    * If @new_bs is not NULL, bdrv_check_perm() must be called beforehand, as this
>    * function uses bdrv_set_perm() to update the permissions according to the new
>    * reference that @new_bs gets.
> + *
> + * Callers must ensure that child->frozen is false.
>    */
>   static void bdrv_replace_child(BdrvChild *child, BlockDriverState *new_bs)
>   {
>       BlockDriverState *old_bs = child->bs;
>       uint64_t perm, shared_perm;
>   
> +    /* Asserts that child->frozen == false */
>       bdrv_replace_child_noperm(child, new_bs);
>   
>       if (old_bs) {
> @@ -2360,6 +2363,7 @@ static void bdrv_detach_child(BdrvChild *child)
>       g_free(child);
>   }
>   
> +/* Callers must ensure that child->frozen is false. */

Is such a comment better than a one-line assertion at the start of the function body?

>   void bdrv_root_unref_child(BdrvChild *child)
>   {
>       BlockDriverState *child_bs;
> @@ -2369,6 +2373,7 @@ void bdrv_root_unref_child(BdrvChild *child)
>       bdrv_unref(child_bs);
>   }
>   
> +/* Callers must ensure that child->frozen is false. */
>   void bdrv_unref_child(BlockDriverState *parent, BdrvChild *child)
>   {
>       if (child == NULL) {
> @@ -2435,6 +2440,7 @@ void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd,
>       }
>   
>       if (bs->backing) {
> +        /* Cannot be frozen, we checked that above */
>           bdrv_unref_child(bs, bs->backing);
>       }
>   
> @@ -3908,6 +3914,7 @@ static void bdrv_close(BlockDriverState *bs)
>   
>       if (bs->drv) {
>           if (bs->drv->bdrv_close) {
> +            /* Must unfreeze all children, so bdrv_unref_child() works */
>               bs->drv->bdrv_close(bs);
>           }
>           bs->drv = NULL;
> @@ -4281,17 +4288,20 @@ BlockDriverState *bdrv_find_base(BlockDriverState *bs)
>    * Return true if at least one of the backing links between @bs and
>    * @base is frozen. @errp is set if that's the case.
>    * @base must be reachable from @bs, or NULL.
> + * (Filters are treated as normal elements of the backing chain.)
>    */
>   bool bdrv_is_backing_chain_frozen(BlockDriverState *bs, BlockDriverState *base,
>                                     Error **errp)
>   {
>       BlockDriverState *i;
> +    BdrvChild *child;
>   
> -    for (i = bs; i != base; i = backing_bs(i)) {
> -        if (i->backing && i->backing->frozen) {
> +    for (i = bs; i != base; i = child_bs(child)) {
> +        child = bdrv_filtered_child(i);
> +
> +        if (child && child->frozen) {
>               error_setg(errp, "Cannot change '%s' link from '%s' to '%s'",
> -                       i->backing->name, i->node_name,
> -                       backing_bs(i)->node_name);
> +                       child->name, i->node_name, child->bs->node_name);
>               return true;
>           }
>       }
> @@ -4305,19 +4315,22 @@ bool bdrv_is_backing_chain_frozen(BlockDriverState *bs, BlockDriverState *base,
>    * none of the links are modified.
>    * @base must be reachable from @bs, or NULL.
>    * Returns 0 on success. On failure returns < 0 and sets @errp.
> + * (Filters are treated as normal elements of the backing chain.)
>    */
>   int bdrv_freeze_backing_chain(BlockDriverState *bs, BlockDriverState *base,
>                                 Error **errp)
>   {
>       BlockDriverState *i;
> +    BdrvChild *child;
>   
>       if (bdrv_is_backing_chain_frozen(bs, base, errp)) {
>           return -EPERM;
>       }
>   
> -    for (i = bs; i != base; i = backing_bs(i)) {
> -        if (i->backing) {
> -            i->backing->frozen = true;
> +    for (i = bs; i != base; i = child_bs(child)) {
> +        child = bdrv_filtered_child(i);
> +        if (child) {
> +            child->frozen = true;
>           }
>       }
>   
> @@ -4328,15 +4341,18 @@ int bdrv_freeze_backing_chain(BlockDriverState *bs, BlockDriverState *base,
>    * Unfreeze all backing links between @bs and @base. The caller must
>    * ensure that all links are frozen before using this function.
>    * @base must be reachable from @bs, or NULL.
> + * (Filters are treated as normal elements of the backing chain.)
>    */
>   void bdrv_unfreeze_backing_chain(BlockDriverState *bs, BlockDriverState *base)
>   {
>       BlockDriverState *i;
> +    BdrvChild *child;
>   
> -    for (i = bs; i != base; i = backing_bs(i)) {
> -        if (i->backing) {
> -            assert(i->backing->frozen);
> -            i->backing->frozen = false;
> +    for (i = bs; i != base; i = child_bs(child)) {
> +        child = bdrv_filtered_child(i);
> +        if (child) {
> +            assert(child->frozen);
> +            child->frozen = false;
>           }
>       }
>   }
> @@ -4438,8 +4454,11 @@ int bdrv_drop_intermediate(BlockDriverState *top, BlockDriverState *base,
>               }
>           }
>   
> -        /* Do the actual switch in the in-memory graph.
> -         * Completes bdrv_check_update_perm() transaction internally. */
> +        /*
> +         * Do the actual switch in the in-memory graph.
> +         * Completes bdrv_check_update_perm() transaction internally.
> +         * c->frozen is false, we have checked that above.
> +         */
>           bdrv_ref(base);
>           bdrv_replace_child(c, base);
>           bdrv_unref(top);
> 

Hmm, OK, it's better than it was, so:
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

But I have one thought: we check that frozen is false at some point, and then
do some logic around this child. What guarantees that someone else will not
set frozen=true in the meantime, for example during a yield? So, do we need some kind of
child_mutex, or something like it?
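
For reference, the check-then-act sequence in question can be modelled with a small toy sketch. ToyNode and the toy_* functions are hypothetical stand-ins, not QEMU code: each node has at most one filtered child (COW or R/W) and `child_frozen` plays the role of BdrvChild.frozen. As in the quoted functions, @base must be reachable from @bs, or NULL. The sketch is single-threaded and says nothing about whether extra locking is needed.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a node plus the link to its filtered child. */
typedef struct ToyNode {
    struct ToyNode *child;  /* filtered child, NULL at the end of the chain */
    bool child_frozen;      /* models BdrvChild.frozen for that link */
} ToyNode;

/* True if any link between @bs and @base is frozen. */
static bool toy_is_chain_frozen(ToyNode *bs, ToyNode *base)
{
    for (ToyNode *i = bs; i != base; i = i->child) {
        if (i->child && i->child_frozen) {
            return true;
        }
    }
    return false;
}

/* All-or-nothing: refuse if any link is already frozen. */
static int toy_freeze_chain(ToyNode *bs, ToyNode *base)
{
    if (toy_is_chain_frozen(bs, base)) {
        return -EPERM;
    }
    for (ToyNode *i = bs; i != base; i = i->child) {
        if (i->child) {
            i->child_frozen = true;
        }
    }
    return 0;
}

/* Caller must have frozen every link beforehand. */
static void toy_unfreeze_chain(ToyNode *bs, ToyNode *base)
{
    for (ToyNode *i = bs; i != base; i = i->child) {
        if (i->child) {
            assert(i->child_frozen);
            i->child_frozen = false;
        }
    }
}

/* Chain: top -> filter -> base. */
static int toy_freeze_demo(void)
{
    ToyNode base = { NULL, false };
    ToyNode filter = { &base, false };
    ToyNode top = { &filter, false };

    assert(toy_freeze_chain(&top, &base) == 0);
    assert(top.child_frozen && filter.child_frozen);
    assert(toy_freeze_chain(&top, &base) == -EPERM);
    toy_unfreeze_chain(&top, &base);
    assert(!toy_is_chain_frozen(&top, &base));
    return 0;
}
```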

-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [Qemu-devel] [PATCH v5 10/42] block: Use CAF in bdrv_is_encrypted()
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 10/42] block: Use CAF in bdrv_is_encrypted() Max Reitz
@ 2019-06-13 13:16   ` Vladimir Sementsov-Ogievskiy
  2019-06-13 14:15     ` Max Reitz
  0 siblings, 1 reply; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-06-13 13:16 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

13.06.2019 1:09, Max Reitz wrote:
> bdrv_is_encrypted() should not only check the BDS's backing child, but
> any filtered child: If a filter's child is encrypted, the filter node
> itself naturally is encrypted, too.  Furthermore, we need to recurse
> down the chain.
> 
> (CAF means child access function.)

Hmm, so if only one node in the backing chain is encrypted, all overlays,
filters or not, are considered encrypted too? Even if all the data is in the top
node and is not encrypted?

I checked that the function is used only for reporting through
bdrv_query_image_info(), which is called from bdrv_block_device_info() (which
loops through the backing files), and from collect_image_info_list(), which loops
through the backing files if @chain=true.

And collect_image_info_list() is used only in img_info(), where @chain mirrors the
--backing-chain parameter.

So, wouldn't it be more correct to return exactly bs->encrypted in this function? It would
give more accurate and informative results when querying the whole chain.
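
The difference between the two behaviours can be shown with a small toy sketch. ToyBs and both functions are hypothetical stand-ins, not QEMU code; `filtered` models any filtered child, COW or R/W.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for BlockDriverState (hypothetical, not QEMU code). */
typedef struct ToyBs {
    bool encrypted;          /* this node itself is encrypted */
    struct ToyBs *filtered;  /* filtered child (COW or R/W), NULL if none */
} ToyBs;

/* Behaviour of the patch: encrypted if anything below is encrypted. */
static bool toy_is_encrypted_recursive(const ToyBs *bs)
{
    if (bs->encrypted) {
        return true;
    }
    return bs->filtered && toy_is_encrypted_recursive(bs->filtered);
}

/* Suggested alternative: report only the node's own state. */
static bool toy_is_encrypted_local(const ToyBs *bs)
{
    return bs->encrypted;
}

/* Unencrypted overlay on top of an encrypted base. */
static int toy_encrypted_demo(void)
{
    ToyBs base = { true, NULL };
    ToyBs top = { false, &base };

    assert(toy_is_encrypted_recursive(&top));
    assert(!toy_is_encrypted_local(&top));
    assert(toy_is_encrypted_local(&base));
    return 0;
}
```

With the local variant, a caller that already walks the chain gets one answer per node; with the recursive variant, every node above the encrypted one reports true.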


> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>   block.c | 8 ++++++--
>   1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/block.c b/block.c
> index 45882a3470..567a0f82c8 100644
> --- a/block.c
> +++ b/block.c
> @@ -4574,10 +4574,14 @@ bool bdrv_is_sg(BlockDriverState *bs)
>   
>   bool bdrv_is_encrypted(BlockDriverState *bs)
>   {
> -    if (bs->backing && bs->backing->bs->encrypted) {
> +    BlockDriverState *filtered = bdrv_filtered_bs(bs);
> +    if (bs->encrypted) {
>           return true;
>       }
> -    return bs->encrypted;
> +    if (filtered && bdrv_is_encrypted(filtered)) {
> +        return true;
> +    }
> +    return false;
>   }
>   
>   const char *bdrv_get_format_name(BlockDriverState *bs)
> 


-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [Qemu-devel] [PATCH v5 11/42] block: Add bdrv_supports_compressed_writes()
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 11/42] block: Add bdrv_supports_compressed_writes() Max Reitz
@ 2019-06-13 13:29   ` Vladimir Sementsov-Ogievskiy
  2019-06-13 14:19     ` Max Reitz
  0 siblings, 1 reply; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-06-13 13:29 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

13.06.2019 1:09, Max Reitz wrote:
> Filters cannot compress data themselves but they have to implement
> .bdrv_co_pwritev_compressed() still (or they cannot forward compressed
> writes).  Therefore, checking whether
> bs->drv->bdrv_co_pwritev_compressed is non-NULL is not sufficient to
> know whether the node can actually handle compressed writes.  This
> function looks down the filter chain to see whether there is a
> non-filter that can actually convert the compressed writes into
> compressed data (and thus normal writes).

Why not use this function in the places (only 2-3, as I remember) where
we check for bs->drv->bdrv_co_pwritev_compressed? That would be a complete fix
for the described problem.
(Hmm, OK, other new APIs are added separately too; for some reason those don't
confuse me and this one does.)

On the other hand (this is the second time I've thought about it during review), could
we handle compression entirely through flags?
We have the supported_write_flags feature, which should be used for all these checks.
And maybe we could drop .bdrv_co_pwritev_compressed altogether.

But if you want to keep it as is, it's OK too:
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
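
As a rough illustration of the check the commit message describes, here is a toy sketch. ToyBs and toy_supports_compressed_writes() are hypothetical stand-ins, not QEMU code; `has_compressed_cb` models a non-NULL .bdrv_co_pwritev_compressed and `rw_filtered` models the R/W-filtered child.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for BlockDriverState (hypothetical, not QEMU code). */
typedef struct ToyBs {
    bool has_compressed_cb;     /* driver implements the compressed-write callback */
    struct ToyBs *rw_filtered;  /* R/W-filtered child, set for filters only */
} ToyBs;

static bool toy_supports_compressed_writes(const ToyBs *bs)
{
    if (!bs->has_compressed_cb) {
        return false;
    }
    if (bs->rw_filtered) {
        /* A filter can only forward, so the answer comes from below. */
        return toy_supports_compressed_writes(bs->rw_filtered);
    }
    return true;
}

static int toy_compressed_demo(void)
{
    ToyBs compressing_fmt = { true, NULL };
    ToyBs plain_fmt = { false, NULL };
    ToyBs filter_ok = { true, &compressing_fmt };
    ToyBs filter_over_plain = { true, &plain_fmt };
    ToyBs filter_without_cb = { false, &compressing_fmt };

    assert(toy_supports_compressed_writes(&filter_ok));
    assert(!toy_supports_compressed_writes(&filter_over_plain));
    assert(!toy_supports_compressed_writes(&filter_without_cb));
    return 0;
}
```

The second case is the one a plain callback check would get wrong: the filter has the callback, but nothing below it can compress.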

> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>   include/block/block.h |  1 +
>   block.c               | 22 ++++++++++++++++++++++
>   2 files changed, 23 insertions(+)
> 
> diff --git a/include/block/block.h b/include/block/block.h
> index 687c03b275..7835c5b370 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h
> @@ -487,6 +487,7 @@ void bdrv_next_cleanup(BdrvNextIterator *it);
>   
>   BlockDriverState *bdrv_next_monitor_owned(BlockDriverState *bs);
>   bool bdrv_is_encrypted(BlockDriverState *bs);
> +bool bdrv_supports_compressed_writes(BlockDriverState *bs);
>   void bdrv_iterate_format(void (*it)(void *opaque, const char *name),
>                            void *opaque, bool read_only);
>   const char *bdrv_get_node_name(const BlockDriverState *bs);
> diff --git a/block.c b/block.c
> index 567a0f82c8..97774b7b06 100644
> --- a/block.c
> +++ b/block.c
> @@ -4584,6 +4584,28 @@ bool bdrv_is_encrypted(BlockDriverState *bs)
>       return false;
>   }
>   
> +/**
> + * Return whether the given node supports compressed writes.
> + */
> +bool bdrv_supports_compressed_writes(BlockDriverState *bs)
> +{
> +    BlockDriverState *filtered = bdrv_filtered_rw_bs(bs);
> +
> +    if (!bs->drv || !bs->drv->bdrv_co_pwritev_compressed) {
> +        return false;
> +    }
> +
> +    if (filtered) {
> +        /*
> +         * Filters can only forward compressed writes, so we have to
> +         * check the child.
> +         */
> +        return bdrv_supports_compressed_writes(filtered);
> +    }
> +
> +    return true;
> +}
> +
>   const char *bdrv_get_format_name(BlockDriverState *bs)
>   {
>       return bs->drv ? bs->drv->format_name : NULL;
> 


-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [Qemu-devel] [PATCH v5 12/42] block: Use bdrv_filtered_rw* where obvious
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 12/42] block: Use bdrv_filtered_rw* where obvious Max Reitz
@ 2019-06-13 13:37   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-06-13 13:37 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

13.06.2019 1:09, Max Reitz wrote:
> Places that use patterns like
> 
>      if (bs->drv->is_filter && bs->file) {
>          ... something about bs->file->bs ...
>      }
> 
> should be
> 
>      BlockDriverState *filtered = bdrv_filtered_rw_bs(bs);
>      if (filtered) {
>          ... something about @filtered ...
>      }
> 
> instead.

Hmm, in other words: support filters with a backing child in all places where only file-based
filters are supported now, as we don't want to make any semantic difference between these two
types of filters.

> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
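
A toy sketch of why the helper-based pattern covers more cases than the open-coded one. ToyBs and toy_filtered_rw_bs() are hypothetical stand-ins, not QEMU code, and the rule "take whichever of file/backing is set" is an assumption made for this sketch only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for BlockDriverState (hypothetical, not QEMU code). */
typedef struct ToyBs {
    bool is_filter;
    struct ToyBs *file;     /* child attached as "file", NULL if none */
    struct ToyBs *backing;  /* child attached as "backing", NULL if none */
} ToyBs;

/* Returns the node a filter passes R/W requests to, or NULL for non-filters. */
static ToyBs *toy_filtered_rw_bs(ToyBs *bs)
{
    if (!bs || !bs->is_filter) {
        return NULL;
    }
    return bs->file ? bs->file : bs->backing;
}

static int toy_filtered_rw_demo(void)
{
    ToyBs data = { false, NULL, NULL };
    ToyBs file_filter = { true, &data, NULL };
    ToyBs backing_filter = { true, NULL, &data };

    /* The helper handles both kinds of filter the same way. */
    assert(toy_filtered_rw_bs(&file_filter) == &data);
    assert(toy_filtered_rw_bs(&backing_filter) == &data);

    /* The old open-coded check misses the backing-based filter. */
    assert(!(backing_filter.is_filter && backing_filter.file));

    assert(toy_filtered_rw_bs(&data) == NULL);
    return 0;
}
```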



-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [Qemu-devel] [PATCH v5 09/42] block: Include filters when freezing backing chain
  2019-06-13 13:04   ` Vladimir Sementsov-Ogievskiy
@ 2019-06-13 14:05     ` Max Reitz
  0 siblings, 0 replies; 113+ messages in thread
From: Max Reitz @ 2019-06-13 14:05 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block; +Cc: Kevin Wolf, qemu-devel



On 13.06.19 15:04, Vladimir Sementsov-Ogievskiy wrote:
> 13.06.2019 1:09, Max Reitz wrote:
>> In order to make filters work in backing chains, the associated
>> functions must be able to deal with them and freeze all filter links, be
>> they COW or R/W filter links.
>>
>> While at it, add some comments that note which functions require their
>> caller to ensure that a given child link is not frozen, and how the
>> callers do so.
>>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> ---
>>   block.c | 45 ++++++++++++++++++++++++++++++++-------------
>>   1 file changed, 32 insertions(+), 13 deletions(-)
>>
>> diff --git a/block.c b/block.c
>> index 8438b0699e..45882a3470 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -2214,12 +2214,15 @@ static void bdrv_replace_child_noperm(BdrvChild *child,
>>    * If @new_bs is not NULL, bdrv_check_perm() must be called beforehand, as this
>>    * function uses bdrv_set_perm() to update the permissions according to the new
>>    * reference that @new_bs gets.
>> + *
>> + * Callers must ensure that child->frozen is false.
>>    */
>>   static void bdrv_replace_child(BdrvChild *child, BlockDriverState *new_bs)
>>   {
>>       BlockDriverState *old_bs = child->bs;
>>       uint64_t perm, shared_perm;
>>   
>> +    /* Asserts that child->frozen == false */
>>       bdrv_replace_child_noperm(child, new_bs);
>>   
>>       if (old_bs) {
>> @@ -2360,6 +2363,7 @@ static void bdrv_detach_child(BdrvChild *child)
>>       g_free(child);
>>   }
>>   
>> +/* Callers must ensure that child->frozen is false. */
> 
> Is such a comment better than a one-line assertion at the start of the function body?

Well, there already is an assertion; it is in
bdrv_replace_child_noperm().  I personally prefer reading comments to
reading assertions.

>>   void bdrv_root_unref_child(BdrvChild *child)
>>   {
>>       BlockDriverState *child_bs;
>> @@ -2369,6 +2373,7 @@ void bdrv_root_unref_child(BdrvChild *child)
>>       bdrv_unref(child_bs);
>>   }
>>   
>> +/* Callers must ensure that child->frozen is false. */
>>   void bdrv_unref_child(BlockDriverState *parent, BdrvChild *child)
>>   {
>>       if (child == NULL) {
>> @@ -2435,6 +2440,7 @@ void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd,
>>       }
>>   
>>       if (bs->backing) {
>> +        /* Cannot be frozen, we checked that above */
>>           bdrv_unref_child(bs, bs->backing);
>>       }
>>   
>> @@ -3908,6 +3914,7 @@ static void bdrv_close(BlockDriverState *bs)
>>   
>>       if (bs->drv) {
>>           if (bs->drv->bdrv_close) {
>> +            /* Must unfreeze all children, so bdrv_unref_child() works */
>>               bs->drv->bdrv_close(bs);
>>           }
>>           bs->drv = NULL;
>> @@ -4281,17 +4288,20 @@ BlockDriverState *bdrv_find_base(BlockDriverState *bs)
>>    * Return true if at least one of the backing links between @bs and
>>    * @base is frozen. @errp is set if that's the case.
>>    * @base must be reachable from @bs, or NULL.
>> + * (Filters are treated as normal elements of the backing chain.)
>>    */
>>   bool bdrv_is_backing_chain_frozen(BlockDriverState *bs, BlockDriverState *base,
>>                                     Error **errp)
>>   {
>>       BlockDriverState *i;
>> +    BdrvChild *child;
>>   
>> -    for (i = bs; i != base; i = backing_bs(i)) {
>> -        if (i->backing && i->backing->frozen) {
>> +    for (i = bs; i != base; i = child_bs(child)) {
>> +        child = bdrv_filtered_child(i);
>> +
>> +        if (child && child->frozen) {
>>               error_setg(errp, "Cannot change '%s' link from '%s' to '%s'",
>> -                       i->backing->name, i->node_name,
>> -                       backing_bs(i)->node_name);
>> +                       child->name, i->node_name, child->bs->node_name);
>>               return true;
>>           }
>>       }
>> @@ -4305,19 +4315,22 @@ bool bdrv_is_backing_chain_frozen(BlockDriverState *bs, BlockDriverState *base,
>>    * none of the links are modified.
>>    * @base must be reachable from @bs, or NULL.
>>    * Returns 0 on success. On failure returns < 0 and sets @errp.
>> + * (Filters are treated as normal elements of the backing chain.)
>>    */
>>   int bdrv_freeze_backing_chain(BlockDriverState *bs, BlockDriverState *base,
>>                                 Error **errp)
>>   {
>>       BlockDriverState *i;
>> +    BdrvChild *child;
>>   
>>       if (bdrv_is_backing_chain_frozen(bs, base, errp)) {
>>           return -EPERM;
>>       }
>>   
>> -    for (i = bs; i != base; i = backing_bs(i)) {
>> -        if (i->backing) {
>> -            i->backing->frozen = true;
>> +    for (i = bs; i != base; i = child_bs(child)) {
>> +        child = bdrv_filtered_child(i);
>> +        if (child) {
>> +            child->frozen = true;
>>           }
>>       }
>>   
>> @@ -4328,15 +4341,18 @@ int bdrv_freeze_backing_chain(BlockDriverState *bs, BlockDriverState *base,
>>    * Unfreeze all backing links between @bs and @base. The caller must
>>    * ensure that all links are frozen before using this function.
>>    * @base must be reachable from @bs, or NULL.
>> + * (Filters are treated as normal elements of the backing chain.)
>>    */
>>   void bdrv_unfreeze_backing_chain(BlockDriverState *bs, BlockDriverState *base)
>>   {
>>       BlockDriverState *i;
>> +    BdrvChild *child;
>>   
>> -    for (i = bs; i != base; i = backing_bs(i)) {
>> -        if (i->backing) {
>> -            assert(i->backing->frozen);
>> -            i->backing->frozen = false;
>> +    for (i = bs; i != base; i = child_bs(child)) {
>> +        child = bdrv_filtered_child(i);
>> +        if (child) {
>> +            assert(child->frozen);
>> +            child->frozen = false;
>>           }
>>       }
>>   }
>> @@ -4438,8 +4454,11 @@ int bdrv_drop_intermediate(BlockDriverState *top, BlockDriverState *base,
>>               }
>>           }
>>   
>> -        /* Do the actual switch in the in-memory graph.
>> -         * Completes bdrv_check_update_perm() transaction internally. */
>> +        /*
>> +         * Do the actual switch in the in-memory graph.
>> +         * Completes bdrv_check_update_perm() transaction internally.
>> +         * c->frozen is false, we have checked that above.
>> +         */
>>           bdrv_ref(base);
>>           bdrv_replace_child(c, base);
>>           bdrv_unref(top);
>>
> 
> Hmm, OK, it's better than it was, so:
> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> 
> But I have one thought: we check that frozen is false at some point, and then
> do some logic around this child. Where is the guarantee that someone else will not
> set frozen=true during some yield, for example? So, do we need some kind of child_mutex,
> or something like that?

The guarantee is that the caller does not do anything that could cause
the child link to become frozen between checking whether it is and
performing an operation that relies on it not being frozen.

Freezing (currently) only happens when starting block jobs, which can
only happen because of monitor commands.  Even in
bdrv_drop_intermediate(), which has quite a bit of code between checking
the frozen status and calling bdrv_replace_child(), I don’t see how a
new block job could sneak in.
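
To make the check-then-act pattern concrete, here is a toy model of the
freeze helpers from the hunks quoted above.  It is only a sketch, not QEMU
code: Node, Link, next_node(), chain_is_frozen(), freeze_chain() and
unfreeze_chain() are hypothetical, simplified stand-ins for
BlockDriverState, BdrvChild and the bdrv_*_backing_chain() functions.  The
assumption it illustrates is the one stated above: nothing that could
freeze a link runs between the check and the operation relying on it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for BlockDriverState and BdrvChild. */
typedef struct Node Node;

typedef struct Link {
    Node *bs;     /* node this link points to */
    bool frozen;  /* set while something relies on the link staying put */
} Link;

struct Node {
    Link *filtered;  /* backing or filtered child link, NULL at the end */
};

/* Step from a node to the next one down the chain (NULL at the end). */
static Node *next_node(Node *n)
{
    return n->filtered ? n->filtered->bs : NULL;
}

/* True if any link between @bs and @base is frozen. */
static bool chain_is_frozen(Node *bs, Node *base)
{
    for (Node *i = bs; i != NULL && i != base; i = next_node(i)) {
        if (i->filtered && i->filtered->frozen) {
            return true;
        }
    }
    return false;
}

/* Check first, then freeze every link; nothing may run in between. */
static int freeze_chain(Node *bs, Node *base)
{
    if (chain_is_frozen(bs, base)) {
        return -EPERM;
    }
    for (Node *i = bs; i != NULL && i != base; i = next_node(i)) {
        if (i->filtered) {
            i->filtered->frozen = true;
        }
    }
    return 0;
}

/* Undo freeze_chain(); every link must currently be frozen. */
static void unfreeze_chain(Node *bs, Node *base)
{
    for (Node *i = bs; i != NULL && i != base; i = next_node(i)) {
        if (i->filtered) {
            assert(i->filtered->frozen);
            i->filtered->frozen = false;
        }
    }
}
```

In this model a second freeze_chain() on an already frozen chain fails with
-EPERM, which is the only protection there is; there is no lock, so the
caller is responsible for not yielding to anything that starts a job.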

Max



* Re: [Qemu-devel] [PATCH v5 10/42] block: Use CAF in bdrv_is_encrypted()
  2019-06-13 13:16   ` Vladimir Sementsov-Ogievskiy
@ 2019-06-13 14:15     ` Max Reitz
  0 siblings, 0 replies; 113+ messages in thread
From: Max Reitz @ 2019-06-13 14:15 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block; +Cc: Kevin Wolf, qemu-devel


On 13.06.19 15:16, Vladimir Sementsov-Ogievskiy wrote:
> 13.06.2019 1:09, Max Reitz wrote:
>> bdrv_is_encrypted() should not only check the BDS's backing child, but
>> any filtered child: If a filter's child is encrypted, the filter node
>> itself naturally is encrypted, too.  Furthermore, we need to recurse
>> down the chain.
>>
>> (CAF means child access function.)
> 
> Hmm, so if only one node in the backing chain is encrypted, all overlays,
> filters or not, are considered encrypted too? Even if all the data is in the top
> node and is not encrypted?
> 
> Checked that the function is used only for reporting through
> bdrv_query_image_info, which is called from bdrv_block_device_info() (which
> loops through backings), and from collect_image_info_list(), which loops through
> backings if @chain=true.
> 
> And collect_image_info_list() is used only in img_info(), where @chain mirrors
> the --backing-chain parameter.
> 
> So, isn't it more correct to return exactly bs->encrypted in this function? It will
> give more correct and informative results for queries for the whole chain.

Hm.  Maybe? :-)

I personally feel more comfortable reporting more devices as being
encrypted than fewer.  The description of @encrypted in @BlockDeviceInfo
is vague enough that we can just “make it more precise”.

You’re right, it does sound more useful.
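
The two options under discussion can be contrasted in a small sketch.  This
is not the patch itself: EncNode and both helper names are hypothetical,
and the single child pointer stands in for whatever bdrv_filtered_child()
would return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a BlockDriverState in a backing chain. */
typedef struct EncNode {
    bool encrypted;         /* this node's own data is encrypted */
    struct EncNode *child;  /* filtered or backing child, NULL at the end */
} EncNode;

/* Option A: one encrypted node anywhere below marks everything above it. */
static bool is_encrypted_recursive(const EncNode *n)
{
    for (; n != NULL; n = n->child) {
        if (n->encrypted) {
            return true;
        }
    }
    return false;
}

/* Option B: report only the node's own flag and let the caller walk. */
static bool is_encrypted_own(const EncNode *n)
{
    return n->encrypted;
}
```

With an unencrypted overlay on an encrypted base, option A reports both
nodes as encrypted, while option B reports only the base.  A caller that
already loops over the chain gets one answer per node with option B.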

Max



* Re: [Qemu-devel] [PATCH v5 11/42] block: Add bdrv_supports_compressed_writes()
  2019-06-13 13:29   ` Vladimir Sementsov-Ogievskiy
@ 2019-06-13 14:19     ` Max Reitz
  0 siblings, 0 replies; 113+ messages in thread
From: Max Reitz @ 2019-06-13 14:19 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block; +Cc: Kevin Wolf, qemu-devel


On 13.06.19 15:29, Vladimir Sementsov-Ogievskiy wrote:
> 13.06.2019 1:09, Max Reitz wrote:
>> Filters cannot compress data themselves but they have to implement
>> .bdrv_co_pwritev_compressed() still (or they cannot forward compressed
>> writes).  Therefore, checking whether
>> bs->drv->bdrv_co_pwritev_compressed is non-NULL is not sufficient to
>> know whether the node can actually handle compressed writes.  This
>> function looks down the filter chain to see whether there is a
>> non-filter that can actually convert the compressed writes into
>> compressed data (and thus normal writes).
> 
> Why not use this function in the places (as I remember, only 2-3 cases) where
> we check for bs->drv->bdrv_co_pwritev_compressed? It would be a complete fix
> for the described problem.

Well, bdrv_driver_pwritev_compressed() doesn’t really care, it will find
out sooner or later anyway (while being passed down the chain).  This is
only really important for the backup job, which will use this function
as of patch 26.  (It isn’t important before 26, because using filters
with backup generally is a gamble before that patch.)
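
As a rough sketch of the idea described in the commit message (not the
actual implementation): CwNode, its fields, and
supports_compressed_writes() are hypothetical names, with
has_compressed_write_cb standing in for
bs->drv->bdrv_co_pwritev_compressed being non-NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a node and the relevant bits of its driver. */
typedef struct CwNode {
    bool is_filter;                /* node only forwards requests */
    bool has_compressed_write_cb;  /* driver implements compressed writes */
    struct CwNode *filtered;       /* filtered child, NULL for non-filters */
} CwNode;

/*
 * Walk down through filters.  Every filter on the way must be able to
 * forward compressed writes, and the first non-filter must be able to
 * turn them into compressed data.
 */
static bool supports_compressed_writes(const CwNode *n)
{
    while (n != NULL) {
        if (!n->has_compressed_write_cb) {
            return false;
        }
        if (!n->is_filter) {
            return true;
        }
        n = n->filtered;
    }
    return false;
}
```

Checking only the top node's callback would wrongly accept a forwarding
filter placed on top of a format that cannot compress.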

> (hmm, ok, other new APIs are added separately too; for some reason they don't
> confuse me and this one does)
>
> On the other hand (the second time I think about it during review), could
> we handle compression through flags completely?
> We have the supported_write_flags feature, which should be used for all these checks.
> And maybe we could drop .bdrv_co_pwritev_compressed altogether.

We probably could, yes.  I just felt like this wasn’t the time to do it.
O:-)

> But if you want to keep it as is, it's OK too:
> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

Thanks for reviewing!

Max



* Re: [Qemu-devel] [PATCH v5 00/42] block: Deal with filters
  2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
                   ` (41 preceding siblings ...)
  2019-06-12 22:10 ` [Qemu-devel] [PATCH v5 42/42] iotests: Test committing to overridden backing Max Reitz
@ 2019-06-13 15:28 ` Vladimir Sementsov-Ogievskiy
  2019-06-13 16:12   ` Max Reitz
  42 siblings, 1 reply; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-06-13 15:28 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

13.06.2019 1:09, Max Reitz wrote:
> Hi,
> 
> When we introduced filters, we did it a bit casually.  Sure, we talked a
> lot about them before, but that was mostly discussion about where
> implicit filters should be added to the graph (note that we currently
> only have two implicit filters, those being mirror and commit).  But in
> the end, we really just designated some drivers filters (Quorum,
> blkdebug, etc.) and added some specifically (throttle, COR), without
> really looking through the block layer to see where issues might occur.
> 
> It turns out vast areas of the block layer just don’t know about filters
> and cannot really handle them.  Many cases will work in practice, in
> others, well, too bad, you cannot use some feature because some part
> deep inside the block layer looks at your filters and thinks they are
> format nodes.
> 
> This is one reason why this series is needed.  Over time (since v1), a
> second reason has made its way in:
> 
> bs->file is not necessarily the place where a node’s data is stored.
> qcow2 now has external data files, and currently there is no way for the
> general block layer to know that the data is not stored in bs->file.
> Right now, I do not think that has any real consequences (all functions
> that need access to the actual data storage file should only do so as a
> fallback if the driver does not provide some functionality, but qcow2
> should provide it all), but it still shows that we need some way to let
> the general block layer know about such data files.  (Also, I will need
> this for v1 of my “Inquire images’ rotational info” series.)
> 
> I won’t go on and on about this series now, I think the patches pretty
> much speak for themselves now.  If the cover letter gets too long,
> nobody reads it anyway (see previous versions).
> 
> 
> *** This series depends on some others. ***
> 
> Dependencies:
> - [PATCH 0/4] block: Keep track of parent quiescing
> - [PATCH 0/2] vl: Drain before (block) job cancel when quitting
> - [PATCH v2 0/2] blockdev: Overlays are not snapshots
> 
> Based-on: <20190605161118.14544-1-mreitz@redhat.com>
> Based-on: <20190612220839.1374-1-mreitz@redhat.com>
> Based-on: <20190603202236.1342-1-mreitz@redhat.com>

Could you please export a branch?

> 
> 
> v5:
> - Split the huge patches 2 and 3 from the previous version into many
>    smaller patches to maintain the potential reviewers’ sanity [Vladimir]

Thank you! In spite of the frightening number of patches, reviewing became a lot
simpler.

> 
> - Added support for compressed writes to the COR and throttle filter
>    drivers to demonstrate how that looks, because the backup job needs to
>    deal with filters that have such support
> 
> - Added differentiation between bdrv_storage_child(),
>    bdrv_primary_child(), and bdrv_metadata_child()
> 
> - A whole lot of things Vladimir has noted
> 
> - Made the block jobs really work with filters.  In case of commit and
>    stream, this now means that filters go away if they are between top
>    and base.  I think that’s OK because it’s the user’s choice to include
>    filters or not.  (They can move the filters around if they prefer a
>    different result.)
>    - This changes the “Add filter commit test cases” from checking that
>      most things do not work to checking that they do
> 
> - Added the “blockdev: Fix active commit choice” patch because it turned
>    out this became necessary after I allowed committing through and with
>    filters.
> 
> 
> Max Reitz (42):
>    block: Mark commit and mirror as filter drivers
>    copy-on-read: Support compressed writes
>    throttle: Support compressed writes
>    block: Add child access functions
>    block: Add chain helper functions
>    qcow2: Implement .bdrv_storage_child()
>    block: *filtered_cow_child() for *has_zero_init()
>    block: bdrv_set_backing_hd() is about bs->backing
>    block: Include filters when freezing backing chain
>    block: Use CAF in bdrv_is_encrypted()
>    block: Add bdrv_supports_compressed_writes()
>    block: Use bdrv_filtered_rw* where obvious
>    block: Use CAFs in block status functions
>    block: Use CAFs when working with backing chains
>    block: Re-evaluate backing file handling in reopen
>    block: Use child access functions when flushing
>    block: Use CAFs in bdrv_refresh_limits()
>    block: Use CAFs in bdrv_refresh_filename()
>    block: Use CAF in bdrv_co_rw_vmstate()
>    block/snapshot: Fall back to storage child
>    block: Use CAFs for debug breakpoints
>    block: Use CAFs in bdrv_get_allocated_file_size()
>    blockdev: Use CAF in external_snapshot_prepare()
>    block: Use child access functions for QAPI queries
>    mirror: Deal with filters
>    backup: Deal with filters
>    commit: Deal with filters
>    stream: Deal with filters
>    nbd: Use CAF when looking for dirty bitmap
>    qemu-img: Use child access functions
>    block: Drop backing_bs()
>    block: Make bdrv_get_cumulative_perm() public
>    blockdev: Fix active commit choice
>    block: Inline bdrv_co_block_status_from_*()
>    block: Fix check_to_replace_node()
>    iotests: Add tests for mirror @replaces loops
>    block: Leave BDS.backing_file constant
>    iotests: Let complete_and_wait() work with commit
>    iotests: Add filter commit test cases
>    iotests: Add filter mirror test cases
>    iotests: Add test for commit in sub directory
>    iotests: Test committing to overridden backing
> 
>   qapi/block-core.json          |   4 +
>   include/block/block.h         |   2 +
>   include/block/block_int.h     | 109 ++++---
>   block.c                       | 523 +++++++++++++++++++++++++++++-----
>   block/backup.c                |   9 +-
>   block/blkdebug.c              |   7 +-
>   block/blklogwrites.c          |   1 -
>   block/block-backend.c         |  16 +-
>   block/commit.c                | 100 +++++--
>   block/copy-on-read.c          |  13 +-
>   block/io.c                    | 115 ++++----
>   block/mirror.c                | 113 ++++++--
>   block/qapi.c                  |  42 +--
>   block/qcow2.c                 |   9 +
>   block/snapshot.c              |  74 +++--
>   block/stream.c                |  23 +-
>   block/throttle.c              |  11 +-
>   blockdev.c                    | 139 +++++++--
>   nbd/server.c                  |   6 +-
>   qemu-img.c                    |  36 +--
>   tests/qemu-iotests/020        |  36 +++
>   tests/qemu-iotests/020.out    |  10 +
>   tests/qemu-iotests/040        | 238 ++++++++++++++++
>   tests/qemu-iotests/040.out    |   4 +-
>   tests/qemu-iotests/041        | 270 +++++++++++++++++-
>   tests/qemu-iotests/041.out    |   4 +-
>   tests/qemu-iotests/184.out    |   7 +-
>   tests/qemu-iotests/191.out    |   1 -
>   tests/qemu-iotests/204.out    |   1 +
>   tests/qemu-iotests/228        |   6 +-
>   tests/qemu-iotests/228.out    |   6 +-
>   tests/qemu-iotests/245        |   4 +-
>   tests/qemu-iotests/iotests.py |  10 +-
>   33 files changed, 1610 insertions(+), 339 deletions(-)
> 


-- 
Best regards,
Vladimir

* Re: [Qemu-devel] [PATCH v5 00/42] block: Deal with filters
  2019-06-13 15:28 ` [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Vladimir Sementsov-Ogievskiy
@ 2019-06-13 16:12   ` Max Reitz
  0 siblings, 0 replies; 113+ messages in thread
From: Max Reitz @ 2019-06-13 16:12 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block; +Cc: Kevin Wolf, qemu-devel


On 13.06.19 17:28, Vladimir Sementsov-Ogievskiy wrote:
> 13.06.2019 1:09, Max Reitz wrote:
>> Hi,
>>
>> When we introduced filters, we did it a bit casually.  Sure, we talked a
>> lot about them before, but that was mostly discussion about where
>> implicit filters should be added to the graph (note that we currently
>> only have two implicit filters, those being mirror and commit).  But in
>> the end, we really just designated some drivers filters (Quorum,
>> blkdebug, etc.) and added some specifically (throttle, COR), without
>> really looking through the block layer to see where issues might occur.
>>
>> It turns out vast areas of the block layer just don’t know about filters
>> and cannot really handle them.  Many cases will work in practice, in
>> others, well, too bad, you cannot use some feature because some part
>> deep inside the block layer looks at your filters and thinks they are
>> format nodes.
>>
>> This is one reason why this series is needed.  Over time (since v1), a
>> second reason has made its way in:
>>
>> bs->file is not necessarily the place where a node’s data is stored.
>> qcow2 now has external data files, and currently there is no way for the
>> general block layer to know that the data is not stored in bs->file.
>> Right now, I do not think that has any real consequences (all functions
>> that need access to the actual data storage file should only do so as a
>> fallback if the driver does not provide some functionality, but qcow2
>> should provide it all), but it still shows that we need some way to let
>> the general block layer know about such data files.  (Also, I will need
>> this for v1 of my “Inquire images’ rotational info” series.)
>>
>> I won’t go on and on about this series now, I think the patches pretty
>> much speak for themselves now.  If the cover letter gets too long,
>> nobody reads it anyway (see previous versions).
>>
>>
>> *** This series depends on some others. ***
>>
>> Dependencies:
>> - [PATCH 0/4] block: Keep track of parent quiescing
>> - [PATCH 0/2] vl: Drain before (block) job cancel when quitting
>> - [PATCH v2 0/2] blockdev: Overlays are not snapshots
>>
>> Based-on: <20190605161118.14544-1-mreitz@redhat.com>
>> Based-on: <20190612220839.1374-1-mreitz@redhat.com>
>> Based-on: <20190603202236.1342-1-mreitz@redhat.com>
> 
> Could you please export a branch?

Sure:

https://git.xanclic.moe/XanClic/qemu child-access-functions-v5
Or:
https://github.com/XanClic/qemu child-access-functions-v5


(And the base branch is:

https://git.xanclic.moe/XanClic/qemu child-access-functions-base
https://github.com/XanClic/qemu child-access-functions-base
)

>> v5:
>> - Split the huge patches 2 and 3 from the previous version into many
>>    smaller patches to maintain the potential reviewers’ sanity [Vladimir]
> 
> Thank you! In spite of the frightening number of patches, reviewing became a lot
> simpler.

I had hoped making it exactly 42 patches would make it a bit more welcoming.

Again, thanks a lot for reviewing!

Max



* Re: [Qemu-devel] [PATCH v5 13/42] block: Use CAFs in block status functions
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 13/42] block: Use CAFs in block status functions Max Reitz
@ 2019-06-14 12:07   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-06-14 12:07 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

13.06.2019 1:09, Max Reitz wrote:
> Use the child access functions in the block status inquiry functions as
> appropriate.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>


Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

-- 
Best regards,
Vladimir

* Re: [Qemu-devel] [PATCH v5 14/42] block: Use CAFs when working with backing chains
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 14/42] block: Use CAFs when working with backing chains Max Reitz
@ 2019-06-14 13:26   ` Vladimir Sementsov-Ogievskiy
  2019-06-14 13:50     ` Max Reitz
  0 siblings, 1 reply; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-06-14 13:26 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

13.06.2019 1:09, Max Reitz wrote:
> Use child access functions when iterating through backing chains so
> filters do not break the chain.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>   block.c | 40 ++++++++++++++++++++++++++++------------
>   1 file changed, 28 insertions(+), 12 deletions(-)
> 
> diff --git a/block.c b/block.c
> index 11f37983d9..505b3e9a01 100644
> --- a/block.c
> +++ b/block.c
> @@ -4261,7 +4261,8 @@ int bdrv_change_backing_file(BlockDriverState *bs,
>   }
>   
>   /*
> - * Finds the image layer in the chain that has 'bs' as its backing file.
> + * Finds the image layer in the chain that has 'bs' (or a filter on
> + * top of it) as its backing file.
>    *
>    * active is the current topmost image.
>    *
> @@ -4273,11 +4274,18 @@ int bdrv_change_backing_file(BlockDriverState *bs,
>   BlockDriverState *bdrv_find_overlay(BlockDriverState *active,
>                                       BlockDriverState *bs)
>   {
> -    while (active && bs != backing_bs(active)) {
> -        active = backing_bs(active);
> +    bs = bdrv_skip_rw_filters(bs);
> +    active = bdrv_skip_rw_filters(active);
> +
> +    while (active) {
> +        BlockDriverState *next = bdrv_backing_chain_next(active);
> +        if (bs == next) {
> +            return active;
> +        }
> +        active = next;
>       }
>   
> -    return active;
> +    return NULL;
>   }

The semantics of this function have changed.
It is used in two places:
1. From bdrv_find_base() with @bs=NULL. That should be unchanged, as I hope we will never have
    a filter node at the bottom of a valid chain.

2. From qmp_block_commit(), only to check an op blocker... hmmm. I really don't understand
why we check BLOCK_OP_TYPE_COMMIT_TARGET on the top_bs overlay. The top_bs overlay is outside the job,
so what is this check for?


>   
>   /* Given a BDS, searches for the base layer. */
> @@ -4421,9 +4429,7 @@ int bdrv_drop_intermediate(BlockDriverState *top, BlockDriverState *base,
>        * other intermediate nodes have been dropped.
>        * If 'top' is an implicit node (e.g. "commit_top") we should skip
>        * it because no one inherits from it. We use explicit_top for that. */
> -    while (explicit_top && explicit_top->implicit) {
> -        explicit_top = backing_bs(explicit_top);
> -    }
> +    explicit_top = bdrv_skip_implicit_filters(explicit_top);
>       update_inherits_from = bdrv_inherits_from_recursive(base, explicit_top);
>   
>       /* success - we can delete the intermediate states, and link top->base */
> @@ -4902,7 +4908,7 @@ BlockDriverState *bdrv_lookup_bs(const char *device,
>   bool bdrv_chain_contains(BlockDriverState *top, BlockDriverState *base)
>   {
>       while (top && top != base) {
> -        top = backing_bs(top);
> +        top = bdrv_filtered_bs(top);
>       }
>   
>       return top != NULL;
> @@ -5141,7 +5147,17 @@ BlockDriverState *bdrv_find_backing_image(BlockDriverState *bs,
>   
>       is_protocol = path_has_protocol(backing_file);
>   
> -    for (curr_bs = bs; curr_bs->backing; curr_bs = curr_bs->backing->bs) {
> +    /*
> +     * Being largely a legacy function, skip any filters here
> +     * (because filters do not have normal filenames, so they cannot
> +     * match anyway; and allowing json:{} filenames is a bit out of
> +     * scope).
> +     */
> +    for (curr_bs = bdrv_skip_rw_filters(bs);
> +         bdrv_filtered_cow_child(curr_bs) != NULL;
> +         curr_bs = bdrv_backing_chain_next(curr_bs))
> +    {
> +        BlockDriverState *bs_below = bdrv_backing_chain_next(curr_bs);
>   
>           /* If either of the filename paths is actually a protocol, then
>            * compare unmodified paths; otherwise make paths relative */
> @@ -5149,7 +5165,7 @@ BlockDriverState *bdrv_find_backing_image(BlockDriverState *bs,
>               char *backing_file_full_ret;
>   
>               if (strcmp(backing_file, curr_bs->backing_file) == 0) {

hmm, interesting, what does bs->backing_file mean now? It's rather strange to store such a field in the
BDS when we have the backing link anyway.

> -                retval = curr_bs->backing->bs;
> +                retval = bs_below;
>                   break;
>               }
>               /* Also check against the full backing filename for the image */
> @@ -5159,7 +5175,7 @@ BlockDriverState *bdrv_find_backing_image(BlockDriverState *bs,
>                   bool equal = strcmp(backing_file, backing_file_full_ret) == 0;
>                   g_free(backing_file_full_ret);
>                   if (equal) {
> -                    retval = curr_bs->backing->bs;
> +                    retval = bs_below;
>                       break;
>                   }
>               }
> @@ -5185,7 +5201,7 @@ BlockDriverState *bdrv_find_backing_image(BlockDriverState *bs,
>               g_free(filename_tmp);
>   
>               if (strcmp(backing_file_full, filename_full) == 0) {
> -                retval = curr_bs->backing->bs;
> +                retval = bs_below;
>                   break;
>               }
>           }
> 


-- 
Best regards,
Vladimir

* Re: [Qemu-devel] [PATCH v5 15/42] block: Re-evaluate backing file handling in reopen
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 15/42] block: Re-evaluate backing file handling in reopen Max Reitz
@ 2019-06-14 13:42   ` Vladimir Sementsov-Ogievskiy
  2019-06-14 15:52     ` Max Reitz
  0 siblings, 1 reply; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-06-14 13:42 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

13.06.2019 1:09, Max Reitz wrote:
> Reopening a node's backing child needs a bit of special handling because
> the "backing" child has different defaults than all other children
> (among other things).  Adding filter support here is a bit more
> difficult than just using the child access functions.  In fact, we often
> have to directly use bs->backing because these functions are about the
> "backing" child (which may or may not be the COW backing file).
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>   block.c | 36 +++++++++++++++++++++++++++++-------
>   1 file changed, 29 insertions(+), 7 deletions(-)
> 
> diff --git a/block.c b/block.c
> index 505b3e9a01..db2759c10d 100644
> --- a/block.c
> +++ b/block.c
> @@ -3542,17 +3542,39 @@ static int bdrv_reopen_parse_backing(BDRVReopenState *reopen_state,
>           }
>       }
>   
> +    /*
> +     * Ensure that @bs can really handle backing files, because we are
> +     * about to give it one (or swap the existing one)
> +     */
> +    if (bs->drv->is_filter) {
> +        /* Filters always have a file or a backing child */
> +        if (!bs->backing) {
> +            error_setg(errp, "'%s' is a %s filter node that does not support a "
> +                       "backing child", bs->node_name, bs->drv->format_name);
> +            return -EINVAL;
> +        }
> +    } else if (!bs->drv->supports_backing) {
> +        error_setg(errp, "Driver '%s' of node '%s' does not support backing "
> +                   "files", bs->drv->format_name, bs->node_name);
> +        return -EINVAL;
> +    }

hmm, shouldn't we have these checks for overlay_bs?

> +
>       /*
>        * Find the "actual" backing file by skipping all links that point
>        * to an implicit node, if any (e.g. a commit filter node).
> +     * We cannot use any of the bdrv_skip_*() functions here because
> +     * those return the first explicit node, while we are looking for
> +     * its overlay here.
>        */
>       overlay_bs = bs;
> -    while (backing_bs(overlay_bs) && backing_bs(overlay_bs)->implicit) {
> -        overlay_bs = backing_bs(overlay_bs);
> +    while (bdrv_filtered_bs(overlay_bs) &&
> +           bdrv_filtered_bs(overlay_bs)->implicit)
> +    {
> +        overlay_bs = bdrv_filtered_bs(overlay_bs);
>       }

Here, overlay_bs may be some filter with a file child ..

>   
>       /* If we want to replace the backing file we need some extra checks */
> -    if (new_backing_bs != backing_bs(overlay_bs)) {
> +    if (new_backing_bs != bdrv_filtered_bs(overlay_bs)) {
>           /* Check for implicit nodes between bs and its backing file */
>           if (bs != overlay_bs) {
>               error_setg(errp, "Cannot change backing link if '%s' has "
> @@ -3560,8 +3582,8 @@ static int bdrv_reopen_parse_backing(BDRVReopenState *reopen_state,
>               return -EPERM;
>           }
>           /* Check if the backing link that we want to replace is frozen */
> -        if (bdrv_is_backing_chain_frozen(overlay_bs, backing_bs(overlay_bs),
> -                                         errp)) {
> +        if (bdrv_is_backing_chain_frozen(overlay_bs,
> +                                         child_bs(overlay_bs->backing), errp)) {

.. and here we are doing the wrong thing, as it doesn't have a backing child.

Aha, you rely on the fact that we currently don't have implicit filters with a file child. Then, should
we add an assertion for this?

>               return -EPERM;
>           }
>           reopen_state->replace_backing_bs = true;
> @@ -3712,7 +3734,7 @@ int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue *queue,
>        * its metadata. Otherwise the 'backing' option can be omitted.
>        */
>       if (drv->supports_backing && reopen_state->backing_missing &&
> -        (backing_bs(reopen_state->bs) || reopen_state->bs->backing_file[0])) {
> +        (reopen_state->bs->backing || reopen_state->bs->backing_file[0])) {
>           error_setg(errp, "backing is missing for '%s'",
>                      reopen_state->bs->node_name);
>           ret = -EINVAL;
> @@ -3857,7 +3879,7 @@ void bdrv_reopen_commit(BDRVReopenState *reopen_state)
>        * from bdrv_set_backing_hd()) has the new values.
>        */
>       if (reopen_state->replace_backing_bs) {
> -        BlockDriverState *old_backing_bs = backing_bs(bs);
> +        BlockDriverState *old_backing_bs = child_bs(bs->backing);
>           assert(!old_backing_bs || !old_backing_bs->implicit);
>           /* Abort the permission update on the backing bs we're detaching */
>           if (old_backing_bs) {
> 


-- 
Best regards,
Vladimir

* Re: [Qemu-devel] [PATCH v5 14/42] block: Use CAFs when working with backing chains
  2019-06-14 13:26   ` Vladimir Sementsov-Ogievskiy
@ 2019-06-14 13:50     ` Max Reitz
  2019-06-14 14:31       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 113+ messages in thread
From: Max Reitz @ 2019-06-14 13:50 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block; +Cc: Kevin Wolf, qemu-devel


On 14.06.19 15:26, Vladimir Sementsov-Ogievskiy wrote:
> 13.06.2019 1:09, Max Reitz wrote:
>> Use child access functions when iterating through backing chains so
>> filters do not break the chain.
>>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> ---
>>   block.c | 40 ++++++++++++++++++++++++++++------------
>>   1 file changed, 28 insertions(+), 12 deletions(-)
>>
>> diff --git a/block.c b/block.c
>> index 11f37983d9..505b3e9a01 100644
>> --- a/block.c
>> +++ b/block.c

[...]

>> @@ -4273,11 +4274,18 @@ int bdrv_change_backing_file(BlockDriverState *bs,
>>   BlockDriverState *bdrv_find_overlay(BlockDriverState *active,
>>                                       BlockDriverState *bs)
>>   {
>> -    while (active && bs != backing_bs(active)) {
>> -        active = backing_bs(active);
>> +    bs = bdrv_skip_rw_filters(bs);
>> +    active = bdrv_skip_rw_filters(active);
>> +
>> +    while (active) {
>> +        BlockDriverState *next = bdrv_backing_chain_next(active);
>> +        if (bs == next) {
>> +            return active;
>> +        }
>> +        active = next;
>>       }
>>   
>> -    return active;
>> +    return NULL;
>>   }
> 
> Semantics changed for this function.
> It is used in two places
> 1. from bdrv_find_base with @bs=NULL, it should be unchanged, as I hope we will never have
>     a filter node at the bottom of some valid chain
> 
> 2. from qmp_block_commit, only to check op-blocker... hmmm. I really don't understand,
> why do we check BLOCK_OP_TYPE_COMMIT_TARGET on top_bs overlay.. top_bs overlay is out of the job,
> what is this check for?

There is a loop before this check which checks that the same blocker is
not set on any nodes between top and base (both inclusive).  I guess
non-active commit checks the node above @top, too, because its backing
file will change.

>>   /* Given a BDS, searches for the base layer. */

[...]

>> @@ -5149,7 +5165,7 @@ BlockDriverState *bdrv_find_backing_image(BlockDriverState *bs,
>>               char *backing_file_full_ret;
>>   
>>               if (strcmp(backing_file, curr_bs->backing_file) == 0) {
> 
> hmm, interesting, what bs->backing_file now means? It's strange enough to store such field on
> bds, when we have backing link anyway..

Patch 37 has you covered. :-)

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [Qemu-devel] [PATCH v5 16/42] block: Use child access functions when flushing
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 16/42] block: Use child access functions when flushing Max Reitz
@ 2019-06-14 14:01   ` Vladimir Sementsov-Ogievskiy
  2019-06-14 15:55     ` Max Reitz
  0 siblings, 1 reply; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-06-14 14:01 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

13.06.2019 1:09, Max Reitz wrote:
> If the driver does not support .bdrv_co_flush() so bdrv_co_flush()
> itself has to flush the children of the given node, it should not flush
> just bs->file->bs, but in fact both the child that stores data, and the
> one that stores metadata (if they are separate).
> 
> In any case, the BLKDBG_EVENT() should be emitted on the primary child,
> because that is where a blkdebug node would be if there is any.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>   block/io.c | 21 ++++++++++++++++++---
>   1 file changed, 18 insertions(+), 3 deletions(-)
> 
> diff --git a/block/io.c b/block/io.c
> index 53aabf86b5..64408cf19a 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -2533,6 +2533,8 @@ static void coroutine_fn bdrv_flush_co_entry(void *opaque)
>   
>   int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
>   {
> +    BdrvChild *primary_child = bdrv_primary_child(bs);
> +    BlockDriverState *storage_bs, *metadata_bs;
>       int current_gen;
>       int ret = 0;
>   
> @@ -2562,7 +2564,7 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
>       }
>   
>       /* Write back cached data to the OS even with cache=unsafe */
> -    BLKDBG_EVENT(bs->file, BLKDBG_FLUSH_TO_OS);
> +    BLKDBG_EVENT(primary_child, BLKDBG_FLUSH_TO_OS);
>       if (bs->drv->bdrv_co_flush_to_os) {
>           ret = bs->drv->bdrv_co_flush_to_os(bs);
>           if (ret < 0) {
> @@ -2580,7 +2582,7 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
>           goto flush_parent;
>       }
>   
> -    BLKDBG_EVENT(bs->file, BLKDBG_FLUSH_TO_DISK);
> +    BLKDBG_EVENT(primary_child, BLKDBG_FLUSH_TO_DISK);
>       if (!bs->drv) {
>           /* bs->drv->bdrv_co_flush() might have ejected the BDS
>            * (even in case of apparent success) */
> @@ -2625,7 +2627,20 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
>        * in the case of cache=unsafe, so there are no useless flushes.
>        */
>   flush_parent:
> -    ret = bs->file ? bdrv_co_flush(bs->file->bs) : 0;
> +    storage_bs = bdrv_storage_bs(bs);
> +    metadata_bs = bdrv_metadata_bs(bs);
> +
> +    ret = 0;
> +    if (storage_bs) {
> +        ret = bdrv_co_flush(storage_bs);
> +    }
> +    if (metadata_bs && metadata_bs != storage_bs) {
> +        int ret_metadata = bdrv_co_flush(metadata_bs);
> +        if (!ret) {
> +            ret = ret_metadata;
> +        }
> +    }
> +
>   out:
>       /* Notify any pending flushes that we have completed */
>       if (ret == 0) {
> 

Hmm, I'm not sure that, just because one driver decided to store data and metadata separately,
we need to support flushing both of them in generic code. If at some point qcow2 decides to store
part of its metadata in a third child, we would not flush it here either?

Shouldn't we instead loop through the children and flush them all? And I'd s/flush_parent/flush_children,
as the current label name is rather weird.
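
To make the suggestion concrete, here is a minimal sketch of "flush every
child and keep the first error". It is a toy model, not QEMU code: ToyNode,
toy_flush_all() and the fixed-size children array are hypothetical names
invented for this illustration, and a real version would also have to avoid
flushing the same node twice when two children point to it.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_CHILDREN 4

/* Toy stand-in for a block node; this is NOT QEMU's BlockDriverState. */
typedef struct ToyNode {
    struct ToyNode *children[TOY_MAX_CHILDREN];
    size_t nb_children;
    int flush_result;   /* 0 on success, negative errno-style value on error */
    int flush_count;    /* how many times this node has been flushed */
} ToyNode;

/*
 * Flush @node, then recurse into all of its children.  Every child is
 * visited even if an earlier flush failed; the first error encountered
 * is the one that is returned.
 */
static int toy_flush_all(ToyNode *node)
{
    int ret;
    size_t i;

    assert(node && node->nb_children <= TOY_MAX_CHILDREN);

    node->flush_count++;
    ret = node->flush_result;

    for (i = 0; i < node->nb_children; i++) {
        int child_ret = toy_flush_all(node->children[i]);
        if (ret == 0) {
            ret = child_ret;
        }
    }

    return ret;
}
```

With such a loop, a driver that grows a third child would be covered without
touching the generic code again.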

-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [Qemu-devel] [PATCH v5 14/42] block: Use CAFs when working with backing chains
  2019-06-14 13:50     ` Max Reitz
@ 2019-06-14 14:31       ` Vladimir Sementsov-Ogievskiy
  2019-06-14 16:02         ` Max Reitz
  0 siblings, 1 reply; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-06-14 14:31 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

14.06.2019 16:50, Max Reitz wrote:
> On 14.06.19 15:26, Vladimir Sementsov-Ogievskiy wrote:
>> 13.06.2019 1:09, Max Reitz wrote:
>>> Use child access functions when iterating through backing chains so
>>> filters do not break the chain.
>>>
>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>>> ---
>>>    block.c | 40 ++++++++++++++++++++++++++++------------
>>>    1 file changed, 28 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/block.c b/block.c
>>> index 11f37983d9..505b3e9a01 100644
>>> --- a/block.c
>>> +++ b/block.c
> 
> [...]
> 
>>> @@ -4273,11 +4274,18 @@ int bdrv_change_backing_file(BlockDriverState *bs,
>>>    BlockDriverState *bdrv_find_overlay(BlockDriverState *active,
>>>                                        BlockDriverState *bs)
>>>    {
>>> -    while (active && bs != backing_bs(active)) {
>>> -        active = backing_bs(active);
>>> +    bs = bdrv_skip_rw_filters(bs);
>>> +    active = bdrv_skip_rw_filters(active);
>>> +
>>> +    while (active) {
>>> +        BlockDriverState *next = bdrv_backing_chain_next(active);
>>> +        if (bs == next) {
>>> +            return active;
>>> +        }
>>> +        active = next;
>>>        }
>>>    
>>> -    return active;
>>> +    return NULL;
>>>    }
>>
>> Semantics changed for this function.
>> It is used in two places
>> 1. from bdrv_find_base with @bs=NULL, it should be unchanged, as I hope we will never have
>>      a filter node at the bottom of some valid chain
>>
>> 2. from qmp_block_commit, only to check op-blocker... hmmm. I really don't understand,
>> why do we check BLOCK_OP_TYPE_COMMIT_TARGET on top_bs overlay.. top_bs overlay is out of the job,
>> what is this check for?
> 
> There is a loop before this check which checks that the same blocker is
> not set on any nodes between top and base (both inclusive).  I guess
> non-active commit checks the node above @top, too, because its backing
> file will change.

So in this case the frozen chain works better.

> 
>>>    /* Given a BDS, searches for the base layer. */
> 
> [...]
> 
>>> @@ -5149,7 +5165,7 @@ BlockDriverState *bdrv_find_backing_image(BlockDriverState *bs,
>>>                char *backing_file_full_ret;
>>>    
>>>                if (strcmp(backing_file, curr_bs->backing_file) == 0) {
>>
>> hmm, interesting, what bs->backing_file now means? It's strange enough to store such field on
>> bds, when we have backing link anyway..
> 
> Patch 37 has you covered. :-)
> 

Hmm, if only it had removed this field, but it doesn't :)

So, we end up with some object called "overlay", but it is not an overlay of bs; it is the overlay of
the first non-implicit filtered node in the backing chain of bs. It may be found by the bdrv_find_overlay()
helper (which is almost unused and may be safely dropped), and the filename of this "overlay" is stored in
the bs->backing_file string variable, keeping in mind that bs->backing is a pointer to the backing child of
bs, which is a completely different thing?

Oh, no, everything related to filename-based backing chain logic is not for me o_O. If something doesn't
work with filename-based logic, users should use node-names. And I'd prefer to deprecate filename-based
interfaces altogether.
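
For reference, the filter-skipping lookup from the bdrv_find_overlay() hunk
quoted above can be modelled roughly as follows. This is only a sketch under
assumptions: ToyNode and the toy_*() helpers are hypothetical, each node has a
single "below" link, and toy_skip_filters()/toy_chain_next() merely guess at
what bdrv_skip_rw_filters() and bdrv_backing_chain_next() do in the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a block node; this is NOT QEMU's BlockDriverState. */
typedef struct ToyNode {
    const char *name;
    bool is_filter;
    struct ToyNode *below;  /* filtered child or COW backing child, if any */
} ToyNode;

/* Starting at @node, return the first node that is not a filter (or NULL). */
static ToyNode *toy_skip_filters(ToyNode *node)
{
    while (node && node->is_filter) {
        node = node->below;
    }
    return node;
}

/* Return the next non-filter node strictly below @node (or NULL). */
static ToyNode *toy_chain_next(ToyNode *node)
{
    assert(node);
    return toy_skip_filters(node->below);
}

/*
 * Find the non-filter node in the chain starting at @active whose next
 * non-filter node is @bs.  Filters on either side are skipped first.
 * Returns NULL if @bs is not found below @active.
 */
static ToyNode *toy_find_overlay(ToyNode *active, ToyNode *bs)
{
    bs = toy_skip_filters(bs);
    active = toy_skip_filters(active);

    while (active) {
        ToyNode *next = toy_chain_next(active);
        if (bs == next) {
            return active;
        }
        active = next;
    }

    return NULL;
}
```

In this model, asking for the overlay of a filter gives the same answer as
asking for the overlay of the node below that filter, and passing bs=NULL
still yields the bottom node of the chain.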

-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [Qemu-devel] [PATCH v5 17/42] block: Use CAFs in bdrv_refresh_limits()
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 17/42] block: Use CAFs in bdrv_refresh_limits() Max Reitz
@ 2019-06-14 15:04   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-06-14 15:04 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

13.06.2019 1:09, Max Reitz wrote:
> Signed-off-by: Max Reitz<mreitz@redhat.com>


Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [Qemu-devel] [PATCH v5 19/42] block: Use CAF in bdrv_co_rw_vmstate()
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 19/42] block: Use CAF in bdrv_co_rw_vmstate() Max Reitz
@ 2019-06-14 15:14   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-06-14 15:14 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

13.06.2019 1:09, Max Reitz wrote:
> If a node whose driver does not provide VM state functions has a
> metadata child, the VM state should probably go there; if it is a
> filter, the VM state should probably go there.  It follows that we
> should generally go down to the primary child.

Hmm, as I understand it, vmstate is something stored in the file and invisible to the actual user
of the file, which may be the guest or a format node. So it doesn't actually matter in which
child it is stored; it should be transparent to the parent. Maybe the right
thing is to loop through the children and use the first one which supports storing vmstate.

But I'm OK with this too.

(hmm, you assume that vmstate should go to the metadata child,
but the only format which has separate metadata and storage children stores vmstate in
the storage child)
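
A rough sketch of the "first child which supports vmstate" idea, for
illustration only. ToyNode, has_vmstate_ops and toy_find_vmstate_node() are
hypothetical names; the real code would check the driver's VM state callbacks
instead of a boolean flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_CHILDREN 4

/* Toy stand-in for a block node; this is NOT QEMU's BlockDriverState. */
typedef struct ToyNode {
    bool has_vmstate_ops;   /* models "driver provides VM state functions" */
    struct ToyNode *children[TOY_MAX_CHILDREN];
    size_t nb_children;
} ToyNode;

/*
 * Depth-first search for the first node at or below @node that can store
 * VM state.  Children are tried in array order.  Returns NULL if no node
 * in the subtree supports it.
 */
static ToyNode *toy_find_vmstate_node(ToyNode *node)
{
    size_t i;

    if (!node) {
        return NULL;
    }
    if (node->has_vmstate_ops) {
        return node;
    }

    assert(node->nb_children <= TOY_MAX_CHILDREN);
    for (i = 0; i < node->nb_children; i++) {
        ToyNode *found = toy_find_vmstate_node(node->children[i]);
        if (found) {
            return found;
        }
    }

    return NULL;
}
```

The result depends on the child order, which is one argument for a fixed
rule such as "always use the primary child".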

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>


> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>   block/io.c | 5 +++--
>   1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/block/io.c b/block/io.c
> index 659ea0c52a..14f99e1c00 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -2395,6 +2395,7 @@ bdrv_co_rw_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos,
>                      bool is_read)
>   {
>       BlockDriver *drv = bs->drv;
> +    BlockDriverState *child_bs = bdrv_primary_bs(bs);
>       int ret = -ENOTSUP;
>   
>       bdrv_inc_in_flight(bs);
> @@ -2407,8 +2408,8 @@ bdrv_co_rw_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos,
>           } else {
>               ret = drv->bdrv_save_vmstate(bs, qiov, pos);
>           }
> -    } else if (bs->file) {
> -        ret = bdrv_co_rw_vmstate(bs->file->bs, qiov, pos, is_read);
> +    } else if (child_bs) {
> +        ret = bdrv_co_rw_vmstate(child_bs, qiov, pos, is_read);
>       }
>   
>       bdrv_dec_in_flight(bs);
> 


-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [Qemu-devel] [PATCH v5 20/42] block/snapshot: Fall back to storage child
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 20/42] block/snapshot: Fall back to storage child Max Reitz
@ 2019-06-14 15:22   ` Vladimir Sementsov-Ogievskiy
  2019-06-14 16:10     ` Max Reitz
  0 siblings, 1 reply; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-06-14 15:22 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

13.06.2019 1:09, Max Reitz wrote:
> If the top node's driver does not provide snapshot functionality and we
> want to go down the chain, we should go towards the child which stores
> the data, i.e. the storage child.
> 
> bdrv_snapshot_goto() becomes a bit weird because we may have to redirect
> the actual child pointer, so it only works if the storage child is
> bs->file or bs->backing (and then we have to find out which it is).
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>   block/snapshot.c | 74 ++++++++++++++++++++++++++++++++++--------------
>   1 file changed, 53 insertions(+), 21 deletions(-)
> 
> diff --git a/block/snapshot.c b/block/snapshot.c
> index f2f48f926a..58cd667f3a 100644
> --- a/block/snapshot.c
> +++ b/block/snapshot.c
> @@ -154,8 +154,9 @@ int bdrv_can_snapshot(BlockDriverState *bs)
>       }
>   
>       if (!drv->bdrv_snapshot_create) {
> -        if (bs->file != NULL) {
> -            return bdrv_can_snapshot(bs->file->bs);
> +        BlockDriverState *storage_bs = bdrv_storage_bs(bs);
> +        if (storage_bs) {
> +            return bdrv_can_snapshot(storage_bs);
>           }
>           return 0;
>       }

Hmm, is it correct at all to do a snapshot when the top format node doesn't support it,
the metadata child doesn't support it, and the storage child does? Taking snapshots of
the storage child seems useless, as the data file must be in sync with the metadata.


> @@ -167,14 +168,15 @@ int bdrv_snapshot_create(BlockDriverState *bs,
>                            QEMUSnapshotInfo *sn_info)
>   {
>       BlockDriver *drv = bs->drv;
> +    BlockDriverState *storage_bs = bdrv_storage_bs(bs);
>       if (!drv) {
>           return -ENOMEDIUM;
>       }
>       if (drv->bdrv_snapshot_create) {
>           return drv->bdrv_snapshot_create(bs, sn_info);
>       }
> -    if (bs->file) {
> -        return bdrv_snapshot_create(bs->file->bs, sn_info);
> +    if (storage_bs) {
> +        return bdrv_snapshot_create(storage_bs, sn_info);
>       }
>       return -ENOTSUP;
>   }
> @@ -184,6 +186,7 @@ int bdrv_snapshot_goto(BlockDriverState *bs,
>                          Error **errp)
>   {
>       BlockDriver *drv = bs->drv;
> +    BlockDriverState *storage_bs;
>       int ret, open_ret;
>   
>       if (!drv) {
> @@ -204,39 +207,66 @@ int bdrv_snapshot_goto(BlockDriverState *bs,
>           return ret;
>       }
>   
> -    if (bs->file) {
> -        BlockDriverState *file;
> -        QDict *options = qdict_clone_shallow(bs->options);
> +    storage_bs = bdrv_storage_bs(bs);
> +    if (storage_bs) {
> +        QDict *options;
>           QDict *file_options;
>           Error *local_err = NULL;
> +        bool is_backing_child;
> +        BdrvChild **child_pointer;
> +
> +        /*
> +         * Filters may reference the storage child through
> +         * bs->backing.  We need to know whether we are dealing with
> +         * bs->backing or bs->file, so we check it here.
> +         */
> +        if (storage_bs == bs->file->bs) {
> +            is_backing_child = false;
> +            child_pointer = &bs->file;
> +        } else if (storage_bs == bs->backing->bs) {
> +            is_backing_child = true;
> +            child_pointer = &bs->backing;
> +        } else {
> +            /*
> +             * The storage child is not referenced by a field in the
> +             * BDS object.  We cannot go on then.
> +             */
> +            error_setg(errp, "Block driver does not support snapshots");
> +            return -ENOTSUP;
> +        }
> +
> +        options = qdict_clone_shallow(bs->options);
>   
> -        file = bs->file->bs;
>           /* Prevent it from getting deleted when detached from bs */
> -        bdrv_ref(file);
> +        bdrv_ref(storage_bs);
>   
> -        qdict_extract_subqdict(options, &file_options, "file.");
> +        qdict_extract_subqdict(options, &file_options,
> +                               is_backing_child ? "backing." : "file.");
>           qobject_unref(file_options);
> -        qdict_put_str(options, "file", bdrv_get_node_name(file));
> +        qdict_put_str(options, is_backing_child ? "backing" : "file",
> +                      bdrv_get_node_name(storage_bs));
>   
>           if (drv->bdrv_close) {
>               drv->bdrv_close(bs);
>           }
> -        bdrv_unref_child(bs, bs->file);
> -        bs->file = NULL;
>   
> -        ret = bdrv_snapshot_goto(file, snapshot_id, errp);
> +        assert(storage_bs == (*child_pointer)->bs);
> +        bdrv_unref_child(bs, *child_pointer);
> +        *child_pointer = NULL;
> +
> +        ret = bdrv_snapshot_goto(storage_bs, snapshot_id, errp);
>           open_ret = drv->bdrv_open(bs, options, bs->open_flags, &local_err);
>           qobject_unref(options);
>           if (open_ret < 0) {
> -            bdrv_unref(file);
> +            bdrv_unref(storage_bs);
>               bs->drv = NULL;
>               /* A bdrv_snapshot_goto() error takes precedence */
>               error_propagate(errp, local_err);
>               return ret < 0 ? ret : open_ret;
>           }
>   
> -        assert(bs->file->bs == file);
> -        bdrv_unref(file);
> +        assert(storage_bs == (*child_pointer)->bs);
> +        bdrv_unref(storage_bs);
>           return ret;
>       }
>   
> @@ -272,6 +302,7 @@ int bdrv_snapshot_delete(BlockDriverState *bs,
>                            Error **errp)
>   {
>       BlockDriver *drv = bs->drv;
> +    BlockDriverState *storage_bs = bdrv_storage_bs(bs);
>       int ret;
>   
>       if (!drv) {
> @@ -288,8 +319,8 @@ int bdrv_snapshot_delete(BlockDriverState *bs,
>   
>       if (drv->bdrv_snapshot_delete) {
>           ret = drv->bdrv_snapshot_delete(bs, snapshot_id, name, errp);
> -    } else if (bs->file) {
> -        ret = bdrv_snapshot_delete(bs->file->bs, snapshot_id, name, errp);
> +    } else if (storage_bs) {
> +        ret = bdrv_snapshot_delete(storage_bs, snapshot_id, name, errp);
>       } else {
>           error_setg(errp, "Block format '%s' used by device '%s' "
>                      "does not support internal snapshot deletion",
> @@ -305,14 +336,15 @@ int bdrv_snapshot_list(BlockDriverState *bs,
>                          QEMUSnapshotInfo **psn_info)
>   {
>       BlockDriver *drv = bs->drv;
> +    BlockDriverState *storage_bs = bdrv_storage_bs(bs);
>       if (!drv) {
>           return -ENOMEDIUM;
>       }
>       if (drv->bdrv_snapshot_list) {
>           return drv->bdrv_snapshot_list(bs, psn_info);
>       }
> -    if (bs->file) {
> -        return bdrv_snapshot_list(bs->file->bs, psn_info);
> +    if (storage_bs) {
> +        return bdrv_snapshot_list(storage_bs, psn_info);
>       }
>       return -ENOTSUP;
>   }
> 


-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [Qemu-devel] [PATCH v5 21/42] block: Use CAFs for debug breakpoints
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 21/42] block: Use CAFs for debug breakpoints Max Reitz
@ 2019-06-14 15:29   ` Vladimir Sementsov-Ogievskiy
  2019-06-14 16:12     ` Max Reitz
  0 siblings, 1 reply; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-06-14 15:29 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

13.06.2019 1:09, Max Reitz wrote:
> When looking for a blkdebug node (which implements debug breakpoints),
> use bdrv_primary_bs() to iterate through the graph, because that is
> where a blkdebug node would be.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>

Honestly, I don't know why blkdebug is always searched for along the ->file sequence, but this
patch obviously supports backing-based filters for blkdebug scenarios, which I need
for my backup-top series (it has a corresponding patch, which is not needed
if this goes first).


Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>


> ---
>   block.c | 8 ++++----
>   1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/block.c b/block.c
> index 797bec0326..11b7ba8cf6 100644
> --- a/block.c
> +++ b/block.c
> @@ -5097,7 +5097,7 @@ int bdrv_debug_breakpoint(BlockDriverState *bs, const char *event,
>                             const char *tag)
>   {
>       while (bs && bs->drv && !bs->drv->bdrv_debug_breakpoint) {
> -        bs = bs->file ? bs->file->bs : NULL;
> +        bs = bdrv_primary_bs(bs);
>       }
>   
>       if (bs && bs->drv && bs->drv->bdrv_debug_breakpoint) {
> @@ -5110,7 +5110,7 @@ int bdrv_debug_breakpoint(BlockDriverState *bs, const char *event,
>   int bdrv_debug_remove_breakpoint(BlockDriverState *bs, const char *tag)
>   {
>       while (bs && bs->drv && !bs->drv->bdrv_debug_remove_breakpoint) {
> -        bs = bs->file ? bs->file->bs : NULL;
> +        bs = bdrv_primary_bs(bs);
>       }
>   
>       if (bs && bs->drv && bs->drv->bdrv_debug_remove_breakpoint) {
> @@ -5123,7 +5123,7 @@ int bdrv_debug_remove_breakpoint(BlockDriverState *bs, const char *tag)
>   int bdrv_debug_resume(BlockDriverState *bs, const char *tag)
>   {
>       while (bs && (!bs->drv || !bs->drv->bdrv_debug_resume)) {
> -        bs = bs->file ? bs->file->bs : NULL;
> +        bs = bdrv_primary_bs(bs);
>       }
>   
>       if (bs && bs->drv && bs->drv->bdrv_debug_resume) {
> @@ -5136,7 +5136,7 @@ int bdrv_debug_resume(BlockDriverState *bs, const char *tag)
>   bool bdrv_debug_is_suspended(BlockDriverState *bs, const char *tag)
>   {
>       while (bs && bs->drv && !bs->drv->bdrv_debug_is_suspended) {
> -        bs = bs->file ? bs->file->bs : NULL;
> +        bs = bdrv_primary_bs(bs);
>       }
>   
>       if (bs && bs->drv && bs->drv->bdrv_debug_is_suspended) {
> 


-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [Qemu-devel] [PATCH v5 22/42] block: Use CAFs in bdrv_get_allocated_file_size()
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 22/42] block: Use CAFs in bdrv_get_allocated_file_size() Max Reitz
  2019-06-12 22:17   ` Max Reitz
@ 2019-06-14 15:41   ` Vladimir Sementsov-Ogievskiy
  2019-06-14 16:15     ` Max Reitz
  1 sibling, 1 reply; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-06-14 15:41 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

13.06.2019 1:09, Max Reitz wrote:
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>   block.c | 26 ++++++++++++++++++++++++--
>   1 file changed, 24 insertions(+), 2 deletions(-)
> 
> diff --git a/block.c b/block.c
> index 11b7ba8cf6..856d9b58be 100644
> --- a/block.c
> +++ b/block.c
> @@ -4511,15 +4511,37 @@ exit:
>   int64_t bdrv_get_allocated_file_size(BlockDriverState *bs)
>   {
>       BlockDriver *drv = bs->drv;
> +    BlockDriverState *storage_bs, *metadata_bs;
> +
>       if (!drv) {
>           return -ENOMEDIUM;
>       }
> +
>       if (drv->bdrv_get_allocated_file_size) {
>           return drv->bdrv_get_allocated_file_size(bs);
>       }
> -    if (bs->file) {
> -        return bdrv_get_allocated_file_size(bs->file->bs);
> +
> +    storage_bs = bdrv_storage_bs(bs);
> +    metadata_bs = bdrv_metadata_bs(bs);
> +
> +    if (storage_bs) {
> +        int64_t data_size, metadata_size = 0;
> +
> +        data_size = bdrv_get_allocated_file_size(storage_bs);
> +        if (data_size < 0) {
> +            return data_size;
> +        }
> +
> +        if (storage_bs != metadata_bs) {
> +            metadata_size = bdrv_get_allocated_file_size(metadata_bs);
> +            if (metadata_size < 0) {
> +                return metadata_size;
> +            }
> +        }
> +
> +        return data_size + metadata_size;
>       }
> +
>       return -ENOTSUP;
>   }
>   
> 

Again, I dislike nailing down the fresh new feature of separate metadata and storage children
in the generic block layer, as it's simple to imagine a driver which needs three or more
children to store all its data and metadata.

Isn't it better to loop through all children by default and sum all their allocated sizes?

Hmm, but we want to exclude backing, yes? Still, we may ignore it while iterating.
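
Something along these lines, as a sketch only. ToyChild, ToyRole and
toy_sum_allocated() are made-up names for this example; the role enum is just
one hypothetical way of telling a backing child apart from the others.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical child roles; only "backing" is treated specially here. */
typedef enum {
    TOY_ROLE_DATA,
    TOY_ROLE_METADATA,
    TOY_ROLE_BACKING,
} ToyRole;

/* Toy stand-in for a child link; this is NOT QEMU's BdrvChild. */
typedef struct ToyChild {
    ToyRole role;
    int64_t allocated;  /* allocated size in bytes, or negative errno-style error */
} ToyChild;

/*
 * Sum the allocated sizes of all children except backing children.
 * The first negative (error) value seen on a non-backing child is
 * returned as-is.
 */
static int64_t toy_sum_allocated(const ToyChild *children, size_t n)
{
    int64_t total = 0;
    size_t i;

    assert(children || n == 0);

    for (i = 0; i < n; i++) {
        if (children[i].role == TOY_ROLE_BACKING) {
            continue;   /* the backing file is not part of this image */
        }
        if (children[i].allocated < 0) {
            return children[i].allocated;
        }
        total += children[i].allocated;
    }

    return total;
}
```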

-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [Qemu-devel] [PATCH v5 23/42] blockdev: Use CAF in external_snapshot_prepare()
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 23/42] blockdev: Use CAF in external_snapshot_prepare() Max Reitz
@ 2019-06-14 15:46   ` Vladimir Sementsov-Ogievskiy
  2019-06-14 16:20     ` Max Reitz
  0 siblings, 1 reply; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-06-14 15:46 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

13.06.2019 1:09, Max Reitz wrote:
> This allows us to differentiate between filters and nodes with COW
> backing files: Filters cannot be used as overlays at all (for this
> function).
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>

An overlay created by a snapshot operation is assumed to consume all following writes,
and its filtered child becomes read-only. A filter works in a completely different
way.

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

[hmm, I'm starting to like using the term "filtered child" when I talk about this thing.
  Have you thought about renaming backing chain to filtered chain?]

> ---
>   blockdev.c | 7 ++++++-
>   1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/blockdev.c b/blockdev.c
> index b5c0fd3c49..0f0cf0d9ae 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -1665,7 +1665,12 @@ static void external_snapshot_prepare(BlkActionState *common,
>           goto out;
>       }
>   
> -    if (state->new_bs->backing != NULL) {
> +    if (state->new_bs->drv->is_filter) {
> +        error_setg(errp, "Filters cannot be used as overlays");
> +        goto out;
> +    }
> +
> +    if (bdrv_filtered_cow_child(state->new_bs)) {
>           error_setg(errp, "The overlay already has a backing image");
>           goto out;
>       }
> 


-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [Qemu-devel] [PATCH v5 15/42] block: Re-evaluate backing file handling in reopen
  2019-06-14 13:42   ` Vladimir Sementsov-Ogievskiy
@ 2019-06-14 15:52     ` Max Reitz
  2019-06-14 16:43       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 113+ messages in thread
From: Max Reitz @ 2019-06-14 15:52 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block; +Cc: Kevin Wolf, qemu-devel


[-- Attachment #1.1: Type: text/plain, Size: 6889 bytes --]

On 14.06.19 15:42, Vladimir Sementsov-Ogievskiy wrote:
> 13.06.2019 1:09, Max Reitz wrote:
>> Reopening a node's backing child needs a bit of special handling because
>> the "backing" child has different defaults than all other children
>> (among other things).  Adding filter support here is a bit more
>> difficult than just using the child access functions.  In fact, we often
>> have to directly use bs->backing because these functions are about the
>> "backing" child (which may or may not be the COW backing file).
>>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> ---
>>   block.c | 36 +++++++++++++++++++++++++++++-------
>>   1 file changed, 29 insertions(+), 7 deletions(-)
>>
>> diff --git a/block.c b/block.c
>> index 505b3e9a01..db2759c10d 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -3542,17 +3542,39 @@ static int bdrv_reopen_parse_backing(BDRVReopenState *reopen_state,
>>           }
>>       }
>>   
>> +    /*
>> +     * Ensure that @bs can really handle backing files, because we are
>> +     * about to give it one (or swap the existing one)
>> +     */
>> +    if (bs->drv->is_filter) {
>> +        /* Filters always have a file or a backing child */
>> +        if (!bs->backing) {
>> +            error_setg(errp, "'%s' is a %s filter node that does not support a "
>> +                       "backing child", bs->node_name, bs->drv->format_name);
>> +            return -EINVAL;
>> +        }
>> +    } else if (!bs->drv->supports_backing) {
>> +        error_setg(errp, "Driver '%s' of node '%s' does not support backing "
>> +                   "files", bs->drv->format_name, bs->node_name);
>> +        return -EINVAL;
>> +    }
> 
> hmm, shouldn't we have these checks for overlay_bs?

I think this is correct here because this is the only node the user has
control over, so this is the only one we can reasonably complain about.

And I do think it is reasonable to complain about.

>> +
>>       /*
>>        * Find the "actual" backing file by skipping all links that point
>>        * to an implicit node, if any (e.g. a commit filter node).
>> +     * We cannot use any of the bdrv_skip_*() functions here because
>> +     * those return the first explicit node, while we are looking for
>> +     * its overlay here.
>>        */
>>       overlay_bs = bs;
>> -    while (backing_bs(overlay_bs) && backing_bs(overlay_bs)->implicit) {
>> -        overlay_bs = backing_bs(overlay_bs);
>> +    while (bdrv_filtered_bs(overlay_bs) &&
>> +           bdrv_filtered_bs(overlay_bs)->implicit)
>> +    {
>> +        overlay_bs = bdrv_filtered_bs(overlay_bs);
>>       }
> 
> here, overlay_bs may be some filter with file child ..
> 
>>   
>>       /* If we want to replace the backing file we need some extra checks */
>> -    if (new_backing_bs != backing_bs(overlay_bs)) {
>> +    if (new_backing_bs != bdrv_filtered_bs(overlay_bs)) {
>>           /* Check for implicit nodes between bs and its backing file */
>>           if (bs != overlay_bs) {
>>               error_setg(errp, "Cannot change backing link if '%s' has "
>> @@ -3560,8 +3582,8 @@ static int bdrv_reopen_parse_backing(BDRVReopenState *reopen_state,
>>               return -EPERM;
>>           }
>>           /* Check if the backing link that we want to replace is frozen */
>> -        if (bdrv_is_backing_chain_frozen(overlay_bs, backing_bs(overlay_bs),
>> -                                         errp)) {
>> +        if (bdrv_is_backing_chain_frozen(overlay_bs,
>> +                                         child_bs(overlay_bs->backing), errp)) {
> 
> .. and here we are doing wrong thing, as it don't have backing child
> 
> Aha, you use the fact that we now don't have implicit filters with file child. Then, should
> we add an assertion for this?

No, that wasn’t my intention.  The real reason is that all of this is a
mess.

Here is the full context:

>     overlay_bs = bs;
>     while (bdrv_filtered_bs(overlay_bs) &&
>            bdrv_filtered_bs(overlay_bs)->implicit)
>     {
>         overlay_bs = bdrv_filtered_bs(overlay_bs);
>     }
> 
>     /* If we want to replace the backing file we need some extra checks */
>     if (new_backing_bs != bdrv_filtered_bs(overlay_bs)) {
>         /* Check for implicit nodes between bs and its backing file */
>         if (bs != overlay_bs) {
>             error_setg(errp, "Cannot change backing link if '%s' has "
>                        "an implicit backing file", bs->node_name);
>             return -EPERM;
>         }
>         /* Check if the backing link that we want to replace is frozen */
>         if (bdrv_is_backing_chain_frozen(overlay_bs,
>                                          child_bs(overlay_bs->backing), errp)) {
>             return -EPERM;
>         }

Note the “Check for implicit nodes” thing.  If we get to the frozen
check, we have already confirmed that overlay_bs == bs, so we then know
that overlay_bs->backing works.

I can add an additional comment to make that clearer.  It took me
quite a bit of digging to figure that out again...

(The reason for the loop is that we want to be able to recognize when
the user tries to not change the backing file.  In that case, we don’t
have to do anything, but because the user doesn’t know about implicit
nodes, we have to skip them in order to check whether the user actually
doesn’t want to change anything.)
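
As a rough sketch of that logic (toy types only, not the actual
BlockDriverState; every Toy*/toy_* name here is made up for illustration,
and the real code goes through bdrv_filtered_bs() as quoted above):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a block node; NOT QEMU's BlockDriverState. */
typedef struct ToyNode {
    const char *name;
    bool implicit;             /* inserted by qemu, invisible to the user */
    struct ToyNode *filtered;  /* filtered or COW backing child, or NULL */
} ToyNode;

typedef enum {
    TOY_UNCHANGED,  /* user names the backing node that is already there */
    TOY_REPLACE,    /* bs's own backing link is the one to be replaced */
    TOY_EPERM,      /* implicit nodes sit between bs and its backing file */
} ToyBackingCheck;

/* Follow links for as long as they point to an implicit node. */
static ToyNode *toy_skip_implicit_links(ToyNode *bs)
{
    ToyNode *overlay = bs;

    assert(bs);
    while (overlay->filtered && overlay->filtered->implicit) {
        overlay = overlay->filtered;
    }
    return overlay;
}

static ToyBackingCheck toy_check_new_backing(ToyNode *bs, ToyNode *new_backing)
{
    ToyNode *overlay = toy_skip_implicit_links(bs);

    if (new_backing == overlay->filtered) {
        /* From the user's point of view, nothing changes */
        return TOY_UNCHANGED;
    }
    if (bs != overlay) {
        return TOY_EPERM;
    }
    /* overlay == bs from here on, so looking at bs's own link is fine */
    return TOY_REPLACE;
}
```

With a chain top -> (implicit commit filter) -> base, asking for base is a
no-op, asking for anything else is refused, and only a node without
implicit nodes below it ever reaches the frozen check.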

Max

>>               return -EPERM;
>>           }
>>           reopen_state->replace_backing_bs = true;
>> @@ -3712,7 +3734,7 @@ int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue *queue,
>>        * its metadata. Otherwise the 'backing' option can be omitted.
>>        */
>>       if (drv->supports_backing && reopen_state->backing_missing &&
>> -        (backing_bs(reopen_state->bs) || reopen_state->bs->backing_file[0])) {
>> +        (reopen_state->bs->backing || reopen_state->bs->backing_file[0])) {
>>           error_setg(errp, "backing is missing for '%s'",
>>                      reopen_state->bs->node_name);
>>           ret = -EINVAL;
>> @@ -3857,7 +3879,7 @@ void bdrv_reopen_commit(BDRVReopenState *reopen_state)
>>        * from bdrv_set_backing_hd()) has the new values.
>>        */
>>       if (reopen_state->replace_backing_bs) {
>> -        BlockDriverState *old_backing_bs = backing_bs(bs);
>> +        BlockDriverState *old_backing_bs = child_bs(bs->backing);
>>           assert(!old_backing_bs || !old_backing_bs->implicit);
>>           /* Abort the permission update on the backing bs we're detaching */
>>           if (old_backing_bs) {
>>
> 
> 




* Re: [Qemu-devel] [PATCH v5 16/42] block: Use child access functions when flushing
  2019-06-14 14:01   ` Vladimir Sementsov-Ogievskiy
@ 2019-06-14 15:55     ` Max Reitz
  0 siblings, 0 replies; 113+ messages in thread
From: Max Reitz @ 2019-06-14 15:55 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block; +Cc: Kevin Wolf, qemu-devel



On 14.06.19 16:01, Vladimir Sementsov-Ogievskiy wrote:
> 13.06.2019 1:09, Max Reitz wrote:
>> If the driver does not support .bdrv_co_flush() so bdrv_co_flush()
>> itself has to flush the children of the given node, it should not flush
>> just bs->file->bs, but in fact both the child that stores data, and the
>> one that stores metadata (if they are separate).
>>
>> In any case, the BLKDBG_EVENT() should be emitted on the primary child,
>> because that is where a blkdebug node would be if there is any.
>>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> ---
>>   block/io.c | 21 ++++++++++++++++++---
>>   1 file changed, 18 insertions(+), 3 deletions(-)
>>
>> diff --git a/block/io.c b/block/io.c
>> index 53aabf86b5..64408cf19a 100644
>> --- a/block/io.c
>> +++ b/block/io.c
>> @@ -2533,6 +2533,8 @@ static void coroutine_fn bdrv_flush_co_entry(void *opaque)
>>   
>>   int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
>>   {
>> +    BdrvChild *primary_child = bdrv_primary_child(bs);
>> +    BlockDriverState *storage_bs, *metadata_bs;
>>       int current_gen;
>>       int ret = 0;
>>   
>> @@ -2562,7 +2564,7 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
>>       }
>>   
>>       /* Write back cached data to the OS even with cache=unsafe */
>> -    BLKDBG_EVENT(bs->file, BLKDBG_FLUSH_TO_OS);
>> +    BLKDBG_EVENT(primary_child, BLKDBG_FLUSH_TO_OS);
>>       if (bs->drv->bdrv_co_flush_to_os) {
>>           ret = bs->drv->bdrv_co_flush_to_os(bs);
>>           if (ret < 0) {
>> @@ -2580,7 +2582,7 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
>>           goto flush_parent;
>>       }
>>   
>> -    BLKDBG_EVENT(bs->file, BLKDBG_FLUSH_TO_DISK);
>> +    BLKDBG_EVENT(primary_child, BLKDBG_FLUSH_TO_DISK);
>>       if (!bs->drv) {
>>           /* bs->drv->bdrv_co_flush() might have ejected the BDS
>>            * (even in case of apparent success) */
>> @@ -2625,7 +2627,20 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
>>        * in the case of cache=unsafe, so there are no useless flushes.
>>        */
>>   flush_parent:
>> -    ret = bs->file ? bdrv_co_flush(bs->file->bs) : 0;
>> +    storage_bs = bdrv_storage_bs(bs);
>> +    metadata_bs = bdrv_metadata_bs(bs);
>> +
>> +    ret = 0;
>> +    if (storage_bs) {
>> +        ret = bdrv_co_flush(storage_bs);
>> +    }
>> +    if (metadata_bs && metadata_bs != storage_bs) {
>> +        int ret_metadata = bdrv_co_flush(metadata_bs);
>> +        if (!ret) {
>> +            ret = ret_metadata;
>> +        }
>> +    }
>> +
>>   out:
>>       /* Notify any pending flushes that we have completed */
>>       if (ret == 0) {
>>
> 
> Hmm, I'm not sure that, just because one driver decided to store data and metadata separately,
> we need to support flushing them both in generic code. If at some point qcow2 decides to store part
> of its metadata in a third child, we will not flush it here either?
> 
> Shouldn't we instead loop through the children and flush them all? And I'd s/flush_parent/flush_children, as
> the current name is rather weird.

That sounds good.  Well, we only need to flush the ones the driver has
taken a WRITE permission on, but yes.
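
Roughly like this, as a sketch with toy types (ToyNode, ToyChild and
TOY_PERM_WRITE are placeholders I made up here, not the real BdrvChild
or BLK_PERM_WRITE definitions):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PERM_WRITE (1u << 1)  /* hypothetical permission bit */

typedef struct ToyNode ToyNode;

typedef struct ToyChild {
    ToyNode *bs;
    uint32_t perm;  /* permissions the parent has taken on this child */
} ToyChild;

struct ToyNode {
    ToyChild *children;
    size_t n_children;
    int flush_result;  /* what flushing this node itself returns */
    int flush_count;   /* how often this node has been flushed */
};

/*
 * Flush @bs, then every child it may have written to.  The first error
 * is kept, but the remaining children are still flushed.
 */
static int toy_flush(ToyNode *bs)
{
    int ret;

    assert(bs);
    bs->flush_count++;
    ret = bs->flush_result;

    for (size_t i = 0; i < bs->n_children; i++) {
        ToyChild *c = &bs->children[i];
        int child_ret;

        if (!(c->perm & TOY_PERM_WRITE)) {
            continue;  /* never written through this link, nothing to flush */
        }
        child_ret = toy_flush(c->bs);
        if (!ret) {
            ret = child_ret;
        }
    }
    return ret;
}
```

A read-only backing child is skipped that way, while a third metadata
child would be picked up without the generic code knowing about it.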

Max



* Re: [Qemu-devel] [PATCH v5 14/42] block: Use CAFs when working with backing chains
  2019-06-14 14:31       ` Vladimir Sementsov-Ogievskiy
@ 2019-06-14 16:02         ` Max Reitz
  2019-06-14 16:39           ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 113+ messages in thread
From: Max Reitz @ 2019-06-14 16:02 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block; +Cc: Kevin Wolf, qemu-devel



On 14.06.19 16:31, Vladimir Sementsov-Ogievskiy wrote:
> 14.06.2019 16:50, Max Reitz wrote:
>> On 14.06.19 15:26, Vladimir Sementsov-Ogievskiy wrote:
>>> 13.06.2019 1:09, Max Reitz wrote:
>>>> Use child access functions when iterating through backing chains so
>>>> filters do not break the chain.
>>>>
>>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>>>> ---
>>>>    block.c | 40 ++++++++++++++++++++++++++++------------
>>>>    1 file changed, 28 insertions(+), 12 deletions(-)
>>>>
>>>> diff --git a/block.c b/block.c
>>>> index 11f37983d9..505b3e9a01 100644
>>>> --- a/block.c
>>>> +++ b/block.c
>>
>> [...]
>>
>>>> @@ -4273,11 +4274,18 @@ int bdrv_change_backing_file(BlockDriverState *bs,
>>>>    BlockDriverState *bdrv_find_overlay(BlockDriverState *active,
>>>>                                        BlockDriverState *bs)
>>>>    {
>>>> -    while (active && bs != backing_bs(active)) {
>>>> -        active = backing_bs(active);
>>>> +    bs = bdrv_skip_rw_filters(bs);
>>>> +    active = bdrv_skip_rw_filters(active);
>>>> +
>>>> +    while (active) {
>>>> +        BlockDriverState *next = bdrv_backing_chain_next(active);
>>>> +        if (bs == next) {
>>>> +            return active;
>>>> +        }
>>>> +        active = next;
>>>>        }
>>>>    
>>>> -    return active;
>>>> +    return NULL;
>>>>    }
>>>
>>> Semantics changed for this function.
>>> It is used in two places
>>> 1. from bdrv_find_base with @bs=NULL, it should be unchanged, as I hope we will never have
>>>      filter node as a bottom of some valid chain
>>>
>>> 2. from qmp_block_commit, only to check op-blocker... hmmm. I really don't understand,
>>> why do we check BLOCK_OP_TYPE_COMMIT_TARGET on top_bs overlay.. top_bs overlay is out of the job,
>>> what is this check for?
>>
>> There is a loop before this check which checks that the same blocker is
>> not set on any nodes between top and base (both inclusive).  I guess
>> non-active commit checks the node above @top, too, because its backing
>> file will change.
> 
> So in this case frozen chain works better.

Perhaps.  The op blockers are in this weird state anyway.  I don’t think
we even need them any more, because the permissions were intended to
replace them (they were originally called “fine-grained op blockers”, I
seem to remember).

I dare not touch them.

>>>>    /* Given a BDS, searches for the base layer. */
>>
>> [...]
>>
>>>> @@ -5149,7 +5165,7 @@ BlockDriverState *bdrv_find_backing_image(BlockDriverState *bs,
>>>>                char *backing_file_full_ret;
>>>>    
>>>>                if (strcmp(backing_file, curr_bs->backing_file) == 0) {
>>>
>>> hmm, interesting, what bs->backing_file now means? It's strange enough to store such field on
>>> bds, when we have backing link anyway..
>>
>> Patch 37 has you covered. :-)
>>
> 
> Hmm, if it has removed this field, but it doesn't)

Because it’s needed.  (Just not in the current form, but that’s what 37
is for.)

> So, we end up with some object called "overlay", but it is not an overlay of bs, it's the overlay of the
> first non-implicit filtered node in bs's backing chain. It may be found by the bdrv_find_overlay() helper (which is
> almost unused and may be safely dropped), and the filename of this "overlay" is stored in the bs->backing_file string
> variable, keeping in mind that bs->backing is a pointer to the backing child of bs, which is a completely different thing?

I don’t quite see what you mean.  There is no “overlay” in this function.

> Oh, no, everything related to filename-based backing chain logic is not for me o_O. If something doesn't work
> with filename-based logic users should use node-names..

In theory yes, but that isn’t an option for qemu-img commit, for example.

> And I'd prefer to deprecate filename-based interfaces altogether.

Me too.

https://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg04878.html

:-/

Max



* Re: [Qemu-devel] [PATCH v5 20/42] block/snapshot: Fall back to storage child
  2019-06-14 15:22   ` Vladimir Sementsov-Ogievskiy
@ 2019-06-14 16:10     ` Max Reitz
  2019-06-14 16:47       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 113+ messages in thread
From: Max Reitz @ 2019-06-14 16:10 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block; +Cc: Kevin Wolf, qemu-devel



On 14.06.19 17:22, Vladimir Sementsov-Ogievskiy wrote:
> 13.06.2019 1:09, Max Reitz wrote:
>> If the top node's driver does not provide snapshot functionality and we
>> want to go down the chain, we should go towards the child which stores
>> the data, i.e. the storage child.
>>
>> bdrv_snapshot_goto() becomes a bit weird because we may have to redirect
>> the actual child pointer, so it only works if the storage child is
>> bs->file or bs->backing (and then we have to find out which it is).
>>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> ---
>>   block/snapshot.c | 74 ++++++++++++++++++++++++++++++++++--------------
>>   1 file changed, 53 insertions(+), 21 deletions(-)
>>
>> diff --git a/block/snapshot.c b/block/snapshot.c
>> index f2f48f926a..58cd667f3a 100644
>> --- a/block/snapshot.c
>> +++ b/block/snapshot.c
>> @@ -154,8 +154,9 @@ int bdrv_can_snapshot(BlockDriverState *bs)
>>       }
>>   
>>       if (!drv->bdrv_snapshot_create) {
>> -        if (bs->file != NULL) {
>> -            return bdrv_can_snapshot(bs->file->bs);
>> +        BlockDriverState *storage_bs = bdrv_storage_bs(bs);
>> +        if (storage_bs) {
>> +            return bdrv_can_snapshot(storage_bs);
>>           }
>>           return 0;
>>       }
> 
> Hmm, is it correct at all to do a snapshot when the top format node doesn't support it,
> the metadata child doesn't support it, and the storage child does? Snapshotting only the
> storage child seems useless, as the data file must be in sync with the metadata.

You’re right.

That’s actually a bug already.  VMDK can store data in multiple
children, but it does not support snapshots.  So if you store such a
split VMDK disk on an RBD volume, it is possible that just the
descriptor file is snapshotted, but nothing else.

Hmmm.  I think the best way is to check whether there is exactly one
child that is not the bdrv_filtered_cow_child().  If so, we can go down
to it and snapshot it.  Otherwise, the node does not support snapshots.
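
Something along these lines, as a sketch on toy types (the Toy*/toy_*
names and the is_cow_backing flag are invented for illustration; the real
check would compare against bdrv_filtered_cow_child()):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct ToyNode ToyNode;

typedef struct ToyChild {
    ToyNode *bs;
    bool is_cow_backing;  /* stands in for "is the filtered COW child" */
} ToyChild;

struct ToyNode {
    ToyChild *children;
    size_t n_children;
    bool can_snapshot;  /* driver implements snapshots itself */
};

/*
 * Return the only child that is not the COW backing child, or NULL if
 * there is no such child or more than one.
 */
static ToyNode *toy_snapshot_fallback_child(const ToyNode *bs)
{
    ToyNode *found = NULL;

    assert(bs);
    for (size_t i = 0; i < bs->n_children; i++) {
        const ToyChild *c = &bs->children[i];

        if (c->is_cow_backing) {
            continue;
        }
        if (found) {
            return NULL;  /* data is spread over several children */
        }
        found = c->bs;
    }
    return found;
}

static bool toy_can_snapshot(const ToyNode *bs)
{
    ToyNode *fallback;

    if (bs->can_snapshot) {
        return true;
    }
    fallback = toy_snapshot_fallback_child(bs);
    return fallback ? toy_can_snapshot(fallback) : false;
}
```

A split VMDK with a descriptor and an extent child would then report no
snapshot support, even if both sit on volumes that could be snapshotted
individually.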

Max



* Re: [Qemu-devel] [PATCH v5 21/42] block: Use CAFs for debug breakpoints
  2019-06-14 15:29   ` Vladimir Sementsov-Ogievskiy
@ 2019-06-14 16:12     ` Max Reitz
  2019-06-14 20:28       ` Eric Blake
  0 siblings, 1 reply; 113+ messages in thread
From: Max Reitz @ 2019-06-14 16:12 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block; +Cc: Kevin Wolf, qemu-devel



On 14.06.19 17:29, Vladimir Sementsov-Ogievskiy wrote:
> 13.06.2019 1:09, Max Reitz wrote:
>> When looking for a blkdebug node (which implements debug breakpoints),
>> use bdrv_primary_bs() to iterate through the graph, because that is
>> where a blkdebug node would be.
>>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
> 
> Honestly, I don't know why blkdebug is always searched for in the ->file sequence,

Usually, blkdebug is just above the protocol node.  So

$format --file--> $protocol

becomes

$format --file--> blkdebug --file--> $protocol

This is why the existing code generally looks for blkdebug under the
->file link.
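
As a toy illustration of that walk (ToyNode and toy_find_blkdebug() are
made-up names; "primary" stands in for whatever bdrv_primary_bs() would
return, which for the layout above is the ->file child):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy node: just a driver name and the primary child below it. */
typedef struct ToyNode {
    const char *driver;
    struct ToyNode *primary;  /* NULL for a protocol node */
} ToyNode;

/*
 * Walk down the primary children starting at @bs and return the first
 * blkdebug node, or NULL if the chain does not contain one.
 */
static ToyNode *toy_find_blkdebug(ToyNode *bs)
{
    while (bs && strcmp(bs->driver, "blkdebug") != 0) {
        bs = bs->primary;
    }
    return bs;
}
```

Starting from $format, the walk passes through the format node and stops
at blkdebug right above $protocol; without a blkdebug node it runs off
the end of the chain and returns NULL.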

Max



* Re: [Qemu-devel] [PATCH v5 22/42] block: Use CAFs in bdrv_get_allocated_file_size()
  2019-06-14 15:41   ` Vladimir Sementsov-Ogievskiy
@ 2019-06-14 16:15     ` Max Reitz
  0 siblings, 0 replies; 113+ messages in thread
From: Max Reitz @ 2019-06-14 16:15 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block; +Cc: Kevin Wolf, qemu-devel



On 14.06.19 17:41, Vladimir Sementsov-Ogievskiy wrote:
> 13.06.2019 1:09, Max Reitz wrote:
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> ---
>>   block.c | 26 ++++++++++++++++++++++++--
>>   1 file changed, 24 insertions(+), 2 deletions(-)
>>
>> diff --git a/block.c b/block.c
>> index 11b7ba8cf6..856d9b58be 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -4511,15 +4511,37 @@ exit:
>>   int64_t bdrv_get_allocated_file_size(BlockDriverState *bs)
>>   {
>>       BlockDriver *drv = bs->drv;
>> +    BlockDriverState *storage_bs, *metadata_bs;
>> +
>>       if (!drv) {
>>           return -ENOMEDIUM;
>>       }
>> +
>>       if (drv->bdrv_get_allocated_file_size) {
>>           return drv->bdrv_get_allocated_file_size(bs);
>>       }
>> -    if (bs->file) {
>> -        return bdrv_get_allocated_file_size(bs->file->bs);
>> +
>> +    storage_bs = bdrv_storage_bs(bs);
>> +    metadata_bs = bdrv_metadata_bs(bs);
>> +
>> +    if (storage_bs) {
>> +        int64_t data_size, metadata_size = 0;
>> +
>> +        data_size = bdrv_get_allocated_file_size(storage_bs);
>> +        if (data_size < 0) {
>> +            return data_size;
>> +        }
>> +
>> +        if (storage_bs != metadata_bs) {
>> +            metadata_size = bdrv_get_allocated_file_size(metadata_bs);
>> +            if (metadata_size < 0) {
>> +                return metadata_size;
>> +            }
>> +        }
>> +
>> +        return data_size + metadata_size;
>>       }
>> +
>>       return -ENOTSUP;
>>   }
>>   
>>
> 
> Again, I dislike hard-wiring the brand-new feature of separate metadata and storage children
> into the generic block layer, as it's simple to imagine a driver which needs three or more
> children to store all its data and metadata..

Yes, we have that, it’s VMDK.

> Isn't it better to loop through all children by default and sum up their allocated sizes?
> 
> Hmm, but we want to exclude backing, yes? Still, we can ignore it while iterating.

I want to object that there could be drivers with children, other than
COW backing files, that should not count towards their allocated size.
But I actually cannot imagine a reasonable scenario.  (The only reason
why COW backing files should be excluded is that they are generally
listed separately.)

So, yes, that sounds good.
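
A sketch of what that could look like, again on toy types (all Toy*/toy_*
names, the is_cow_backing flag and the TOY_ENOTSUP value are assumptions
for illustration, not existing block-layer definitions):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_ENOTSUP (-95)  /* placeholder for -ENOTSUP */

typedef struct ToyNode ToyNode;

typedef struct ToyChild {
    ToyNode *bs;
    bool is_cow_backing;  /* COW backing files are listed separately */
} ToyChild;

struct ToyNode {
    bool reports_size;      /* driver implements the callback itself */
    int64_t reported_size;  /* its result; negative means error */
    ToyChild *children;
    size_t n_children;
};

/*
 * Ask the driver first; otherwise sum up all children except the COW
 * backing child.  The first error from a child is returned as-is.
 */
static int64_t toy_allocated_size(const ToyNode *bs)
{
    int64_t sum = 0;
    bool counted = false;

    assert(bs);
    if (bs->reports_size) {
        return bs->reported_size;
    }

    for (size_t i = 0; i < bs->n_children; i++) {
        const ToyChild *c = &bs->children[i];
        int64_t size;

        if (c->is_cow_backing) {
            continue;
        }
        size = toy_allocated_size(c->bs);
        if (size < 0) {
            return size;
        }
        sum += size;
        counted = true;
    }

    return counted ? sum : TOY_ENOTSUP;
}
```

This covers the qcow2 data-file case and VMDK's multiple extents with the
same loop, and a node that only has a backing child still gets -ENOTSUP.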

Max



* Re: [Qemu-devel] [PATCH v5 23/42] blockdev: Use CAF in external_snapshot_prepare()
  2019-06-14 15:46   ` Vladimir Sementsov-Ogievskiy
@ 2019-06-14 16:20     ` Max Reitz
  2019-06-14 16:58       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 113+ messages in thread
From: Max Reitz @ 2019-06-14 16:20 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block; +Cc: Kevin Wolf, qemu-devel



On 14.06.19 17:46, Vladimir Sementsov-Ogievskiy wrote:
> 13.06.2019 1:09, Max Reitz wrote:
>> This allows us to differentiate between filters and nodes with COW
>> backing files: Filters cannot be used as overlays at all (for this
>> function).
>>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
> 
> The overlay created in a snapshot operation is assumed to consume subsequent writes,
> and its filtered child becomes read-only. A filter works in a completely different
> way.
> 
> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> 
> [hmm, I'm starting to like the term "filtered child" when I talk about this thing.
>   Have you thought about renaming backing chain to filtered chain?]

Hm.  There are backing chains and there are backing chains.  There are
qemu-internal backing chains that consist of a healthy mix of filters
and COW overlays, and then there are the more high-level backing chains
the user actually manages, where only the overlays are important.

I think it would make sense to rename the “qemu-internal backing chains”
to “filter chains” or something.  But that makes it sound a bit like it
would only mean R/W filters...  Maybe just “chain”?

Actually, the only functions I find are is_backing_chain_frozen & Co,
and they could simply become is_chain_frozen.  Is there anything else?

Max



* Re: [Qemu-devel] [PATCH v5 14/42] block: Use CAFs when working with backing chains
  2019-06-14 16:02         ` Max Reitz
@ 2019-06-14 16:39           ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-06-14 16:39 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

14.06.2019 19:02, Max Reitz wrote:
> On 14.06.19 16:31, Vladimir Sementsov-Ogievskiy wrote:
>> 14.06.2019 16:50, Max Reitz wrote:
>>> On 14.06.19 15:26, Vladimir Sementsov-Ogievskiy wrote:
>>>> 13.06.2019 1:09, Max Reitz wrote:
>>>>> Use child access functions when iterating through backing chains so
>>>>> filters do not break the chain.
>>>>>
>>>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>>>>> ---
>>>>>     block.c | 40 ++++++++++++++++++++++++++++------------
>>>>>     1 file changed, 28 insertions(+), 12 deletions(-)
>>>>>
>>>>> diff --git a/block.c b/block.c
>>>>> index 11f37983d9..505b3e9a01 100644
>>>>> --- a/block.c
>>>>> +++ b/block.c
>>>
>>> [...]
>>>
>>>>> @@ -4273,11 +4274,18 @@ int bdrv_change_backing_file(BlockDriverState *bs,
>>>>>     BlockDriverState *bdrv_find_overlay(BlockDriverState *active,
>>>>>                                         BlockDriverState *bs)
>>>>>     {
>>>>> -    while (active && bs != backing_bs(active)) {
>>>>> -        active = backing_bs(active);
>>>>> +    bs = bdrv_skip_rw_filters(bs);
>>>>> +    active = bdrv_skip_rw_filters(active);
>>>>> +
>>>>> +    while (active) {
>>>>> +        BlockDriverState *next = bdrv_backing_chain_next(active);
>>>>> +        if (bs == next) {
>>>>> +            return active;
>>>>> +        }
>>>>> +        active = next;
>>>>>         }
>>>>>     
>>>>> -    return active;
>>>>> +    return NULL;
>>>>>     }
>>>>
>>>> Semantics changed for this function.
>>>> It is used in two places
>>>> 1. from bdrv_find_base with @bs=NULL, it should be unchanged, as I hope we will never have
>>>>       filter node as a bottom of some valid chain
>>>>
>>>> 2. from qmp_block_commit, only to check op-blocker... hmmm. I really don't understand,
>>>> why do we check BLOCK_OP_TYPE_COMMIT_TARGET on top_bs overlay.. top_bs overlay is out of the job,
>>>> what is this check for?
>>>
>>> There is a loop before this check which checks that the same blocker is
>>> not set on any nodes between top and base (both inclusive).  I guess
>>> non-active commit checks the node above @top, too, because its backing
>>> file will change.
>>
>> So in this case frozen chain works better.
> 
> Perhaps.  The op blockers are in this weird state anyway.  I don’t think
> we even need them any more, because the permissions were intended to
> replace them (they were originally called “fine-grained op blockers”, I
> seem to remember).
> 
> I dare not touch them.
> 
>>>>>     /* Given a BDS, searches for the base layer. */
>>>
>>> [...]
>>>
>>>>> @@ -5149,7 +5165,7 @@ BlockDriverState *bdrv_find_backing_image(BlockDriverState *bs,
>>>>>                 char *backing_file_full_ret;
>>>>>     
>>>>>                 if (strcmp(backing_file, curr_bs->backing_file) == 0) {
>>>>
>>>> hmm, interesting, what bs->backing_file now means? It's strange enough to store such field on
>>>> bds, when we have backing link anyway..
>>>
>>> Patch 37 has you covered. :-)
>>>
>>
>> Hmm, if it has removed this field, but it doesn't)
> 
> Because it’s needed.  (Just not in the current form, but that’s what 37
> is for.)
> 
>> So, we end up with some object called "overlay", but it is not an overlay of bs, it's the overlay of the
>> first non-implicit filtered node in bs's backing chain. It may be found by the bdrv_find_overlay() helper (which is
>> almost unused and may be safely dropped), and the filename of this "overlay" is stored in the bs->backing_file string
>> variable, keeping in mind that bs->backing is a pointer to the backing child of bs, which is a completely different thing?
> 
> I don’t quite see what you mean.  There is no “overlay” in this function.

Hmm, sorry, I had the overlay from the next patch in mind..

> 
>> Oh, no, everything related to filename-based backing chain logic is not for me o_O. If something doesn't work
>> with filename-based logic users should use node-names..
> 
> In theory yes, but that isn’t an option for qemu-img commit, for example.

And if something doesn't work with qemu-img, users should use a qemu process in stopped state. And I'd prefer to
deprecate most of qemu-img :) Actually, we at Virtuozzo already go this way for some things. The QMP interface is
a lot more useful than qemu-img, and it's always simpler to maintain and develop one thing than two.

> 
>> And I'd prefer to deprecate filename-based interfaces altogether.
> 
> Me too.
> 
> https://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg04878.html
> 
> :-/
> 

Really sad..


-- 
Best regards,
Vladimir


* Re: [Qemu-devel] [PATCH v5 15/42] block: Re-evaluate backing file handling in reopen
  2019-06-14 15:52     ` Max Reitz
@ 2019-06-14 16:43       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-06-14 16:43 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

14.06.2019 18:52, Max Reitz wrote:
> On 14.06.19 15:42, Vladimir Sementsov-Ogievskiy wrote:
>> 13.06.2019 1:09, Max Reitz wrote:
>>> Reopening a node's backing child needs a bit of special handling because
>>> the "backing" child has different defaults than all other children
>>> (among other things).  Adding filter support here is a bit more
>>> difficult than just using the child access functions.  In fact, we often
>>> have to directly use bs->backing because these functions are about the
>>> "backing" child (which may or may not be the COW backing file).
>>>
>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>>> ---
>>>    block.c | 36 +++++++++++++++++++++++++++++-------
>>>    1 file changed, 29 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/block.c b/block.c
>>> index 505b3e9a01..db2759c10d 100644
>>> --- a/block.c
>>> +++ b/block.c
>>> @@ -3542,17 +3542,39 @@ static int bdrv_reopen_parse_backing(BDRVReopenState *reopen_state,
>>>            }
>>>        }
>>>    
>>> +    /*
>>> +     * Ensure that @bs can really handle backing files, because we are
>>> +     * about to give it one (or swap the existing one)
>>> +     */
>>> +    if (bs->drv->is_filter) {
>>> +        /* Filters always have a file or a backing child */
>>> +        if (!bs->backing) {
>>> +            error_setg(errp, "'%s' is a %s filter node that does not support a "
>>> +                       "backing child", bs->node_name, bs->drv->format_name);
>>> +            return -EINVAL;
>>> +        }
>>> +    } else if (!bs->drv->supports_backing) {
>>> +        error_setg(errp, "Driver '%s' of node '%s' does not support backing "
>>> +                   "files", bs->drv->format_name, bs->node_name);
>>> +        return -EINVAL;
>>> +    }
>>
>> hmm, shouldn't we have these checks for overlay_bs?
> 
> I think this is correct here because this is the only node the user has
> control over, so this is the only one we can reasonably complain about.
> 
> And I do think it is reasonable to complain about.
> 
>>> +
>>>        /*
>>>         * Find the "actual" backing file by skipping all links that point
>>>         * to an implicit node, if any (e.g. a commit filter node).
>>> +     * We cannot use any of the bdrv_skip_*() functions here because
>>> +     * those return the first explicit node, while we are looking for
>>> +     * its overlay here.
>>>         */
>>>        overlay_bs = bs;
>>> -    while (backing_bs(overlay_bs) && backing_bs(overlay_bs)->implicit) {
>>> -        overlay_bs = backing_bs(overlay_bs);
>>> +    while (bdrv_filtered_bs(overlay_bs) &&
>>> +           bdrv_filtered_bs(overlay_bs)->implicit)
>>> +    {
>>> +        overlay_bs = bdrv_filtered_bs(overlay_bs);
>>>        }
>>
>> here, overlay_bs may be some filter with file child ..
>>
>>>    
>>>        /* If we want to replace the backing file we need some extra checks */
>>> -    if (new_backing_bs != backing_bs(overlay_bs)) {
>>> +    if (new_backing_bs != bdrv_filtered_bs(overlay_bs)) {
>>>            /* Check for implicit nodes between bs and its backing file */
>>>            if (bs != overlay_bs) {
>>>                error_setg(errp, "Cannot change backing link if '%s' has "
>>> @@ -3560,8 +3582,8 @@ static int bdrv_reopen_parse_backing(BDRVReopenState *reopen_state,
>>>                return -EPERM;
>>>            }
>>>            /* Check if the backing link that we want to replace is frozen */
>>> -        if (bdrv_is_backing_chain_frozen(overlay_bs, backing_bs(overlay_bs),
>>> -                                         errp)) {
>>> +        if (bdrv_is_backing_chain_frozen(overlay_bs,
>>> +                                         child_bs(overlay_bs->backing), errp)) {
>>
>> .. and here we are doing the wrong thing, as it doesn't have a backing child
>>
>> Aha, you rely on the fact that we currently don't have implicit filters with a file child.
>> Then, should we add an assertion for this?
> 
> No, that wasn’t my intention.  The real reason is that all of this is a
> mess.
> 
> Here is the full context:
> 
>>      overlay_bs = bs;
>>      while (bdrv_filtered_bs(overlay_bs) &&
>>             bdrv_filtered_bs(overlay_bs)->implicit)
>>      {
>>          overlay_bs = bdrv_filtered_bs(overlay_bs);
>>      }
>>
>>      /* If we want to replace the backing file we need some extra checks */
>>      if (new_backing_bs != bdrv_filtered_bs(overlay_bs)) {
>>          /* Check for implicit nodes between bs and its backing file */
>>          if (bs != overlay_bs) {
>>              error_setg(errp, "Cannot change backing link if '%s' has "
>>                         "an implicit backing file", bs->node_name);
>>              return -EPERM;
>>          }
>>          /* Check if the backing link that we want to replace is frozen */
>>          if (bdrv_is_backing_chain_frozen(overlay_bs,
>>                                           child_bs(overlay_bs->backing), errp)) {
>>              return -EPERM;
>>          }
> 
> Note the “Check for implicit nodes” thing.  If we get to the frozen
> check, we have already confirmed that overlay_bs == bs, so we then know
> that overlay_bs->backing works.
> 
> I can add an additional comment to make that more clear.  It took myself
> quite a bit of digging to figure that out again...

Aha, I see it. A comment would be good.

> 
> (The reason for the loop is that we want to be able to recognize when
> the user tries to not change the backing file.  In that case, we don’t
> have to do anything, but because the user doesn’t know about implicit
> nodes, we have to skip them in order to check whether the user actually
> doesn’t want to change anything.)
> 
> Max
> 
>>>                return -EPERM;
>>>            }
>>>            reopen_state->replace_backing_bs = true;
>>> @@ -3712,7 +3734,7 @@ int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue *queue,
>>>         * its metadata. Otherwise the 'backing' option can be omitted.
>>>         */
>>>        if (drv->supports_backing && reopen_state->backing_missing &&
>>> -        (backing_bs(reopen_state->bs) || reopen_state->bs->backing_file[0])) {
>>> +        (reopen_state->bs->backing || reopen_state->bs->backing_file[0])) {
>>>            error_setg(errp, "backing is missing for '%s'",
>>>                       reopen_state->bs->node_name);
>>>            ret = -EINVAL;
>>> @@ -3857,7 +3879,7 @@ void bdrv_reopen_commit(BDRVReopenState *reopen_state)
>>>         * from bdrv_set_backing_hd()) has the new values.
>>>         */
>>>        if (reopen_state->replace_backing_bs) {
>>> -        BlockDriverState *old_backing_bs = backing_bs(bs);
>>> +        BlockDriverState *old_backing_bs = child_bs(bs->backing);
>>>            assert(!old_backing_bs || !old_backing_bs->implicit);
>>>            /* Abort the permission update on the backing bs we're detaching */
>>>            if (old_backing_bs) {
>>>
>>
>>
> 
> 


-- 
Best regards,
Vladimir


* Re: [Qemu-devel] [PATCH v5 20/42] block/snapshot: Fall back to storage child
  2019-06-14 16:10     ` Max Reitz
@ 2019-06-14 16:47       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-06-14 16:47 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

14.06.2019 19:10, Max Reitz wrote:
> On 14.06.19 17:22, Vladimir Sementsov-Ogievskiy wrote:
>> 13.06.2019 1:09, Max Reitz wrote:
>>> If the top node's driver does not provide snapshot functionality and we
>>> want to go down the chain, we should go towards the child which stores
>>> the data, i.e. the storage child.
>>>
>>> bdrv_snapshot_goto() becomes a bit weird because we may have to redirect
>>> the actual child pointer, so it only works if the storage child is
>>> bs->file or bs->backing (and then we have to find out which it is).
>>>
>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>>> ---
>>>    block/snapshot.c | 74 ++++++++++++++++++++++++++++++++++--------------
>>>    1 file changed, 53 insertions(+), 21 deletions(-)
>>>
>>> diff --git a/block/snapshot.c b/block/snapshot.c
>>> index f2f48f926a..58cd667f3a 100644
>>> --- a/block/snapshot.c
>>> +++ b/block/snapshot.c
>>> @@ -154,8 +154,9 @@ int bdrv_can_snapshot(BlockDriverState *bs)
>>>        }
>>>    
>>>        if (!drv->bdrv_snapshot_create) {
>>> -        if (bs->file != NULL) {
>>> -            return bdrv_can_snapshot(bs->file->bs);
>>> +        BlockDriverState *storage_bs = bdrv_storage_bs(bs);
>>> +        if (storage_bs) {
>>> +            return bdrv_can_snapshot(storage_bs);
>>>            }
>>>            return 0;
>>>        }
>>
>> Hmm is it correct at all doing a snapshot, when top format node doesn't support it,
>> metadata child doesn't support it and storage child supports? Doing snapshots of
>> storage child seems useless, as data file must be in sync with metadata.
> 
> You’re right.
> 
> That’s actually a bug already.  VMDK can store data in multiple
> children, but it does not support snapshots.  So if you store such a
> split VMDK disk on an RBD volume, it is possible that just the
> descriptor file is snapshotted, but nothing else.
> 
> Hmmm.  I think the best way is to check whether there is exactly one
> child that is not the bdrv_filtered_cow_child().  If so, we can go down
> to it and snapshot it.  Otherwise, the node does not support snapshots.
> 

Does anything prevent a format node from storing something not in a bdrv child but somewhere separate?

Maybe the safest way is to fall back only for filter nodes.
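
To illustrate the "fall back only for filter nodes" idea, here is a minimal sketch on a toy graph. ToyNode, toy_can_snapshot() and toy_can_snapshot_example() are hypothetical names made up for this sketch; they are not QEMU's BlockDriverState or bdrv_can_snapshot(), and the real code would have to go through the child access functions of this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy node for illustration; not QEMU's BlockDriverState. */
typedef struct ToyNode {
    bool is_filter;            /* passes data through; stores nothing itself */
    bool driver_can_snapshot;  /* the driver implements snapshots itself */
    struct ToyNode *child;     /* filtered/storage child, or NULL */
} ToyNode;

/*
 * Only filters delegate the question to their child.  A format node
 * whose driver lacks snapshot support refuses outright, since it may
 * keep data somewhere a snapshot of a single child would miss.
 */
static bool toy_can_snapshot(const ToyNode *node)
{
    while (node && !node->driver_can_snapshot && node->is_filter) {
        node = node->child;
    }
    return node && node->driver_can_snapshot;
}

/* Example: a filter on RBD may fall back, a VMDK-like format node may not. */
static void toy_can_snapshot_example(void)
{
    ToyNode rbd      = { false, true,  NULL };
    ToyNode throttle = { true,  false, &rbd };
    ToyNode vmdk     = { false, false, &rbd };

    assert(toy_can_snapshot(&rbd));
    assert(toy_can_snapshot(&throttle));
    assert(!toy_can_snapshot(&vmdk));
}
```

With this rule the split-VMDK-on-RBD case from above would simply report "no snapshot support" instead of snapshotting only the descriptor file.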


-- 
Best regards,
Vladimir

* Re: [Qemu-devel] [PATCH v5 23/42] blockdev: Use CAF in external_snapshot_prepare()
  2019-06-14 16:20     ` Max Reitz
@ 2019-06-14 16:58       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-06-14 16:58 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

14.06.2019 19:20, Max Reitz wrote:
> On 14.06.19 17:46, Vladimir Sementsov-Ogievskiy wrote:
>> 13.06.2019 1:09, Max Reitz wrote:
>>> This allows us to differentiate between filters and nodes with COW
>>> backing files: Filters cannot be used as overlays at all (for this
>>> function).
>>>
>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>>
>> Overlay created in snapshot operation assumed to consume following writes
>> and it's filtered child becomes readonly.. And filter works in completely another
>> way.
>>
>> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>
>> [hmm, I start to like using "filtered child" collocation when I say about this thing.
>>    didn't you think about renaming backing chain to filtered chain?]
> 
> Hm.  There are backing chains and there are backing chains.  There are
> qemu-internal backing chains that consist of a healthy mix of filters
> and COW overlays, and then there are the more high-level backing chains
> the user actually manages, where only the overlays are important.
> 
> I think it would make sense to rename the “qemu-internal backing chains"
> to “filter chains” or something.  But that makes it sound a bit like it
> would only mean R/W filters...  Maybe just “chain”?
> 
> Actually, the only functions I find are is_backing_chain_frozen & Co,
> and they could simply become is_chain_frozen.  Is there anything else?

"Chain" is too general; maybe "blockchain"? :)))

Seriously, though, one more reason to rename it is your
bdrv_backing_chain_next(), which is about the user-visible backing chain and
differs from the frozen-chain-related functions.

However, I don't think this series is a good place for the renaming;
it's rather big already.

-- 
Best regards,
Vladimir

* Re: [Qemu-devel] [PATCH v5 21/42] block: Use CAFs for debug breakpoints
  2019-06-14 16:12     ` Max Reitz
@ 2019-06-14 20:28       ` Eric Blake
  0 siblings, 0 replies; 113+ messages in thread
From: Eric Blake @ 2019-06-14 20:28 UTC (permalink / raw)
  To: Max Reitz, Vladimir Sementsov-Ogievskiy, qemu-block
  Cc: Kevin Wolf, qemu-devel


On 6/14/19 11:12 AM, Max Reitz wrote:
> On 14.06.19 17:29, Vladimir Sementsov-Ogievskiy wrote:
>> 13.06.2019 1:09, Max Reitz wrote:
>>> When looking for a blkdebug node (which implements debug breakpoints),
>>> use bdrv_primary_bs() to iterate through the graph, because that is
>>> where a blkdebug node would be.
>>>
>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>>
>> Honestly, don't know why blkdebug is always searched in ->file sequence,
> 
> Usually, blkdebug is just above the protocol node.  So
> 
> $format --file--> $protocol
> 
> becomes
> 
> $format --file--> blkdebug --file--> $protocol
> 
> This is why the existing code generally looks for blkdebug under the
> ->file link.

blkdebug is an interesting beast; there are use cases for both:

blkdebug -> qcow2 -> file

for debugging only guest-visible actions, and

qcow2 -> blkdebug -> file

for debugging specific qcow2 metadata actions.
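
As a rough sketch of why a walk along the primary child finds blkdebug in either layout: the types and names below (ToyNode, toy_find_blkdebug(), toy_blkdebug_example()) are invented for illustration and are not QEMU's; the "primary" pointer stands in for whatever bdrv_primary_bs() would return.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy node for illustration; not QEMU's BlockDriverState. */
typedef struct ToyNode {
    const char *driver;       /* e.g. "qcow2", "blkdebug", "file" */
    struct ToyNode *primary;  /* the one child the node mainly uses, or NULL */
} ToyNode;

/* Follow the primary child from @top until a blkdebug node turns up. */
static ToyNode *toy_find_blkdebug(ToyNode *top)
{
    ToyNode *node;

    for (node = top; node; node = node->primary) {
        assert(node->driver);
        if (strcmp(node->driver, "blkdebug") == 0) {
            return node;
        }
    }
    return NULL;
}

/* Both layouts from the mail above are found by the same walk. */
static void toy_blkdebug_example(void)
{
    /* blkdebug -> qcow2 -> file */
    ToyNode file_a  = { "file", NULL };
    ToyNode qcow2_a = { "qcow2", &file_a };
    ToyNode dbg_a   = { "blkdebug", &qcow2_a };

    /* qcow2 -> blkdebug -> file */
    ToyNode file_b  = { "file", NULL };
    ToyNode dbg_b   = { "blkdebug", &file_b };
    ToyNode qcow2_b = { "qcow2", &dbg_b };

    assert(toy_find_blkdebug(&dbg_a) == &dbg_a);
    assert(toy_find_blkdebug(&qcow2_b) == &dbg_b);
    assert(toy_find_blkdebug(&file_a) == NULL);
}
```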


-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org


* Re: [Qemu-devel] [PATCH v5 24/42] block: Use child access functions for QAPI queries
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 24/42] block: Use child access functions for QAPI queries Max Reitz
@ 2019-06-18 12:06   ` Vladimir Sementsov-Ogievskiy
  2019-06-18 14:22     ` Max Reitz
  0 siblings, 1 reply; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-06-18 12:06 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

13.06.2019 1:09, Max Reitz wrote:
> query-block and query-named-block-nodes now return any filtered child
> under "backing", not just bs->backing or COW children.  This is so that
> filters do not interrupt the reported backing chain.  This changes the
> output for iotest 184, as the throttled node now appears as a backing
> child.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>   block/qapi.c               | 35 ++++++++++++++++++++---------------
>   tests/qemu-iotests/184.out |  7 ++++++-
>   2 files changed, 26 insertions(+), 16 deletions(-)
> 
> diff --git a/block/qapi.c b/block/qapi.c
> index 0c13c86f4e..1fd2937abc 100644
> --- a/block/qapi.c
> +++ b/block/qapi.c
> @@ -150,9 +150,13 @@ BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk,
>               return NULL;
>           }
>   
> -        if (bs0->drv && bs0->backing) {
> +        if (bs0->drv && bdrv_filtered_child(bs0)) {
> +            /*
> +             * Put any filtered child here (for backwards compatibility to when
> +             * we put bs0->backing here, which might be any filtered child).
> +             */
>               info->backing_file_depth++;
> -            bs0 = bs0->backing->bs;
> +            bs0 = bdrv_filtered_bs(bs0);


So here we report a filter's filtered child as "backing" ...

>               (*p_image_info)->has_backing_image = true;
>               p_image_info = &((*p_image_info)->backing_image);
>           } else {
> @@ -161,9 +165,8 @@ BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk,
>   
>           /* Skip automatically inserted nodes that the user isn't aware of for
>            * query-block (blk != NULL), but not for query-named-block-nodes */
> -        while (blk && bs0->drv && bs0->implicit) {
> -            bs0 = backing_bs(bs0);
> -            assert(bs0);
> +        if (blk) {
> +            bs0 = bdrv_skip_implicit_filters(bs0);
>           }
>       }
>   
> @@ -348,9 +351,9 @@ static void bdrv_query_info(BlockBackend *blk, BlockInfo **p_info,
>       BlockDriverState *bs = blk_bs(blk);
>       char *qdev;
>   
> -    /* Skip automatically inserted nodes that the user isn't aware of */
> -    while (bs && bs->drv && bs->implicit) {
> -        bs = backing_bs(bs);
> +    if (bs) {
> +        /* Skip automatically inserted nodes that the user isn't aware of */
> +        bs = bdrv_skip_implicit_filters(bs);
>       }
>   
>       info->device = g_strdup(blk_name(blk));
> @@ -507,6 +510,7 @@ static void bdrv_query_blk_stats(BlockDeviceStats *ds, BlockBackend *blk)
>   static BlockStats *bdrv_query_bds_stats(BlockDriverState *bs,
>                                           bool blk_level)
>   {
> +    BlockDriverState *storage_bs, *cow_bs;
>       BlockStats *s = NULL;
>   
>       s = g_malloc0(sizeof(*s));
> @@ -519,9 +523,8 @@ static BlockStats *bdrv_query_bds_stats(BlockDriverState *bs,
>       /* Skip automatically inserted nodes that the user isn't aware of in
>        * a BlockBackend-level command. Stay at the exact node for a node-level
>        * command. */
> -    while (blk_level && bs->drv && bs->implicit) {
> -        bs = backing_bs(bs);
> -        assert(bs);
> +    if (blk_level) {
> +        bs = bdrv_skip_implicit_filters(bs);
>       }
>   
>       if (bdrv_get_node_name(bs)[0]) {
> @@ -531,14 +534,16 @@ static BlockStats *bdrv_query_bds_stats(BlockDriverState *bs,
>   
>       s->stats->wr_highest_offset = stat64_get(&bs->wr_highest_offset);
>   
> -    if (bs->file) {
> +    storage_bs = bdrv_storage_bs(bs);
> +    if (storage_bs) {
>           s->has_parent = true;
> -        s->parent = bdrv_query_bds_stats(bs->file->bs, blk_level);
> +        s->parent = bdrv_query_bds_stats(storage_bs, blk_level);

... and here not as "backing" but as "parent".

Shouldn't we report a filter's child as "backing" here too, for consistency?

anyway:

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

>       }
>   
> -    if (blk_level && bs->backing) {
> +    cow_bs = bdrv_filtered_cow_bs(bs);
> +    if (blk_level && cow_bs) {
>           s->has_backing = true;
> -        s->backing = bdrv_query_bds_stats(bs->backing->bs, blk_level);
> +        s->backing = bdrv_query_bds_stats(cow_bs, blk_level);
>       }
>   
>       return s;
> diff --git a/tests/qemu-iotests/184.out b/tests/qemu-iotests/184.out
> index 3deb3cfb94..1d61f7e224 100644
> --- a/tests/qemu-iotests/184.out
> +++ b/tests/qemu-iotests/184.out
> @@ -27,6 +27,11 @@ Testing:
>               "iops_rd": 0,
>               "detect_zeroes": "off",
>               "image": {
> +                "backing-image": {
> +                    "virtual-size": 1073741824,
> +                    "filename": "null-co://",
> +                    "format": "null-co"
> +                },
>                   "virtual-size": 1073741824,
>                   "filename": "json:{\"throttle-group\": \"group0\", \"driver\": \"throttle\", \"file\": {\"driver\": \"null-co\"}}",
>                   "format": "throttle"
> @@ -34,7 +39,7 @@ Testing:
>               "iops_wr": 0,
>               "ro": false,
>               "node-name": "throttle0",
> -            "backing_file_depth": 0,
> +            "backing_file_depth": 1,
>               "drv": "throttle",
>               "iops": 0,
>               "bps_wr": 0,
> 


-- 
Best regards,
Vladimir

* Re: [Qemu-devel] [PATCH v5 25/42] mirror: Deal with filters
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 25/42] mirror: Deal with filters Max Reitz
@ 2019-06-18 13:12   ` Vladimir Sementsov-Ogievskiy
  2019-06-18 14:47     ` Max Reitz
  0 siblings, 1 reply; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-06-18 13:12 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

13.06.2019 1:09, Max Reitz wrote:
> This includes some permission limiting (for example, we only need to
> take the RESIZE permission for active commits where the base is smaller
> than the top).
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>

Hmm, unfortunately I'm far from familiar with the block/mirror mechanics and interfaces :(

Still, some comments below.

> ---
>   block/mirror.c | 110 +++++++++++++++++++++++++++++++++++++------------
>   blockdev.c     |  47 +++++++++++++++++----
>   2 files changed, 124 insertions(+), 33 deletions(-)
> 
> diff --git a/block/mirror.c b/block/mirror.c
> index 4fa8f57c80..3d767e3030 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -660,8 +660,10 @@ static int mirror_exit_common(Job *job)
>                               &error_abort);
>       if (!abort && s->backing_mode == MIRROR_SOURCE_BACKING_CHAIN) {
>           BlockDriverState *backing = s->is_none_mode ? src : s->base;
> -        if (backing_bs(target_bs) != backing) {
> -            bdrv_set_backing_hd(target_bs, backing, &local_err);
> +        BlockDriverState *unfiltered_target = bdrv_skip_rw_filters(target_bs);
> +
> +        if (bdrv_filtered_cow_bs(unfiltered_target) != backing) {
> +            bdrv_set_backing_hd(unfiltered_target, backing, &local_err);
>               if (local_err) {
>                   error_report_err(local_err);
>                   ret = -EPERM;
> @@ -711,7 +713,7 @@ static int mirror_exit_common(Job *job)
>       block_job_remove_all_bdrv(bjob);
>       bdrv_child_try_set_perm(mirror_top_bs->backing, 0, BLK_PERM_ALL,
>                               &error_abort);
> -    bdrv_replace_node(mirror_top_bs, backing_bs(mirror_top_bs), &error_abort);
> +    bdrv_replace_node(mirror_top_bs, mirror_top_bs->backing->bs, &error_abort);
>   
>       /* We just changed the BDS the job BB refers to (with either or both of the
>        * bdrv_replace_node() calls), so switch the BB back so the cleanup does
> @@ -757,6 +759,7 @@ static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
>   {
>       int64_t offset;
>       BlockDriverState *base = s->base;
> +    BlockDriverState *filtered_base;
>       BlockDriverState *bs = s->mirror_top_bs->backing->bs;
>       BlockDriverState *target_bs = blk_bs(s->target);
>       int ret;
> @@ -795,6 +798,9 @@ static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
>           s->initial_zeroing_ongoing = false;
>       }
>   
> +    /* Will be NULL if @base is not in @bs's chain */

Should we assert that it is not NULL?
Hmm, so this is the way to "skip filters in reverse, starting from the base", yes? Is it worth adding a comment?

> +    filtered_base = bdrv_filtered_cow_bs(bdrv_find_overlay(bs, base));
> +
>       /* First part, loop on the sectors and initialize the dirty bitmap.  */
>       for (offset = 0; offset < s->bdev_length; ) {
>           /* Just to make sure we are not exceeding int limit. */
> @@ -807,7 +813,7 @@ static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
>               return 0;
>           }
>   
> -        ret = bdrv_is_allocated_above(bs, base, offset, bytes, &count);
> +        ret = bdrv_is_allocated_above(bs, filtered_base, offset, bytes, &count);
>           if (ret < 0) {
>               return ret;
>           }
> @@ -903,7 +909,7 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
>       } else {
>           s->target_cluster_size = BDRV_SECTOR_SIZE;
>       }
> -    if (backing_filename[0] && !target_bs->backing &&
> +    if (backing_filename[0] && !bdrv_backing_chain_next(target_bs) &&
>           s->granularity < s->target_cluster_size) {
>           s->buf_size = MAX(s->buf_size, s->target_cluster_size);
>           s->cow_bitmap = bitmap_new(length);
> @@ -1083,8 +1089,9 @@ static void mirror_complete(Job *job, Error **errp)
>       if (s->backing_mode == MIRROR_OPEN_BACKING_CHAIN) {
>           int ret;
>   
> -        assert(!target->backing);
> -        ret = bdrv_open_backing_file(target, NULL, "backing", errp);
> +        assert(!bdrv_backing_chain_next(target));
> +        ret = bdrv_open_backing_file(bdrv_skip_rw_filters(target), NULL,
> +                                     "backing", errp);
>           if (ret < 0) {
>               return;
>           }
> @@ -1503,8 +1510,8 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs,
>       MirrorBlockJob *s;
>       MirrorBDSOpaque *bs_opaque;
>       BlockDriverState *mirror_top_bs;
> -    bool target_graph_mod;
>       bool target_is_backing;
> +    uint64_t target_perms, target_shared_perms;
>       Error *local_err = NULL;
>       int ret;
>   
> @@ -1523,7 +1530,7 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs,
>           buf_size = DEFAULT_MIRROR_BUF_SIZE;
>       }
>   
> -    if (bs == target) {
> +    if (bdrv_skip_rw_filters(bs) == bdrv_skip_rw_filters(target)) {
>           error_setg(errp, "Can't mirror node into itself");
>           return;
>       }
> @@ -1583,15 +1590,42 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs,
>        * In the case of active commit, things look a bit different, though,
>        * because the target is an already populated backing file in active use.
>        * We can allow anything except resize there.*/
> +
> +    target_perms = BLK_PERM_WRITE;
> +    target_shared_perms = BLK_PERM_WRITE_UNCHANGED;
> +
>       target_is_backing = bdrv_chain_contains(bs, target);

Don't you want to skip filters here? The actual target node may be in the backing chain
but have separate filters above it.
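
A small sketch of the case I mean, on a toy graph. All names here (ToyNode, toy_skip_filters(), toy_chain_contains(), toy_target_is_backing()) are hypothetical stand-ins for this illustration and not the QEMU functions; the toy "filtered" pointer plays the role of the filtered child.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy node for illustration; not QEMU's BlockDriverState. */
typedef struct ToyNode {
    bool is_filter;            /* true for a pass-through filter node */
    struct ToyNode *filtered;  /* COW backing node or filter child, or NULL */
} ToyNode;

/* Go down through filters to the first non-filter node (or NULL). */
static ToyNode *toy_skip_filters(ToyNode *node)
{
    while (node && node->is_filter) {
        node = node->filtered;
    }
    return node;
}

/* Is @base reachable from @top by following the filtered children? */
static bool toy_chain_contains(ToyNode *top, ToyNode *base)
{
    while (top && top != base) {
        top = top->filtered;
    }
    return top != NULL;
}

/* Compare against the unfiltered target instead of the target itself. */
static bool toy_target_is_backing(ToyNode *bs, ToyNode *target)
{
    ToyNode *unfiltered = toy_skip_filters(target);

    assert(unfiltered); /* the toy model expects a non-filter at the bottom */
    return toy_chain_contains(bs, unfiltered);
}
```

If the user passes a filter that sits on the base but is not itself part of the source's chain, the plain containment check says "no", while the check on the unfiltered node says "yes".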


> -    target_graph_mod = (backing_mode != MIRROR_LEAVE_BACKING_CHAIN);
> +    if (target_is_backing) {
> +        int64_t bs_size, target_size;
> +        bs_size = bdrv_getlength(bs);
> +        if (bs_size < 0) {
> +            error_setg_errno(errp, -bs_size,
> +                             "Could not inquire top image size");
> +            goto fail;
> +        }
> +
> +        target_size = bdrv_getlength(target);
> +        if (target_size < 0) {
> +            error_setg_errno(errp, -target_size,
> +                             "Could not inquire base image size");
> +            goto fail;
> +        }
> +
> +        if (target_size < bs_size) {
> +            target_perms |= BLK_PERM_RESIZE;
> +        }
> +
> +        target_shared_perms |= BLK_PERM_CONSISTENT_READ
> +                            |  BLK_PERM_WRITE
> +                            |  BLK_PERM_GRAPH_MOD;
> +    }
> +
> +    if (backing_mode != MIRROR_LEAVE_BACKING_CHAIN) {
> +        target_perms |= BLK_PERM_GRAPH_MOD;
> +    }
> +
>       s->target = blk_new(s->common.job.aio_context,
> -                        BLK_PERM_WRITE | BLK_PERM_RESIZE |
> -                        (target_graph_mod ? BLK_PERM_GRAPH_MOD : 0),
> -                        BLK_PERM_WRITE_UNCHANGED |
> -                        (target_is_backing ? BLK_PERM_CONSISTENT_READ |
> -                                             BLK_PERM_WRITE |
> -                                             BLK_PERM_GRAPH_MOD : 0));
> +                        target_perms, target_shared_perms);
>       ret = blk_insert_bs(s->target, target, errp);
>       if (ret < 0) {
>           goto fail;
> @@ -1641,15 +1675,39 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs,
>       /* In commit_active_start() all intermediate nodes disappear, so
>        * any jobs in them must be blocked */

Hmm, this is preexisting, but s/jobs/nodes/ here.

>       if (target_is_backing) {
> -        BlockDriverState *iter;
> -        for (iter = backing_bs(bs); iter != target; iter = backing_bs(iter)) {
> -            /* XXX BLK_PERM_WRITE needs to be allowed so we don't block
> -             * ourselves at s->base (if writes are blocked for a node, they are
> -             * also blocked for its backing file). The other options would be a
> -             * second filter driver above s->base (== target). */
> +        BlockDriverState *iter, *filtered_target;
> +        uint64_t iter_shared_perms;
> +
> +        /*
> +         * The topmost node with
> +         * bdrv_skip_rw_filters(filtered_target) == bdrv_skip_rw_filters(target)
> +         */
> +        filtered_target = bdrv_filtered_cow_bs(bdrv_find_overlay(bs, target));
> +
> +        assert(bdrv_skip_rw_filters(filtered_target) ==
> +               bdrv_skip_rw_filters(target));
> +
> +        /*
> +         * XXX BLK_PERM_WRITE needs to be allowed so we don't block
> +         * ourselves at s->base (if writes are blocked for a node, they are
> +         * also blocked for its backing file). The other options would be a
> +         * second filter driver above s->base (== target).
> +         */
> +        iter_shared_perms = BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE;
> +
> +        for (iter = bdrv_filtered_bs(bs); iter != target;
> +             iter = bdrv_filtered_bs(iter))
> +        {
> +            if (iter == filtered_target) {
> +                /*
> +                 * From here on, all nodes are filters on the base.
> +                 * This allows us to share BLK_PERM_CONSISTENT_READ.
> +                 */
> +                iter_shared_perms |= BLK_PERM_CONSISTENT_READ;
> +            }
> +
>               ret = block_job_add_bdrv(&s->common, "intermediate node", iter, 0,
> -                                     BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE,
> -                                     errp);
> +                                     iter_shared_perms, errp);
>               if (ret < 0) {
>                   goto fail;
>               }
> @@ -1683,7 +1741,7 @@ fail:
>   
>       bdrv_child_try_set_perm(mirror_top_bs->backing, 0, BLK_PERM_ALL,
>                               &error_abort);
> -    bdrv_replace_node(mirror_top_bs, backing_bs(mirror_top_bs), &error_abort);
> +    bdrv_replace_node(mirror_top_bs, mirror_top_bs->backing->bs, &error_abort);
>   
>       bdrv_unref(mirror_top_bs);
>   }
> @@ -1706,7 +1764,7 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
>           return;
>       }
>       is_none_mode = mode == MIRROR_SYNC_MODE_NONE;
> -    base = mode == MIRROR_SYNC_MODE_TOP ? backing_bs(bs) : NULL;
> +    base = mode == MIRROR_SYNC_MODE_TOP ? bdrv_backing_chain_next(bs) : NULL;
>       mirror_start_job(job_id, bs, creation_flags, target, replaces,
>                        speed, granularity, buf_size, backing_mode,
>                        on_source_error, on_target_error, unmap, NULL, NULL,
> diff --git a/blockdev.c b/blockdev.c
> index 0f0cf0d9ae..68e8d33447 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -3777,7 +3777,7 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
>           return;
>       }
>   
> -    if (!bs->backing && sync == MIRROR_SYNC_MODE_TOP) {
> +    if (!bdrv_backing_chain_next(bs) && sync == MIRROR_SYNC_MODE_TOP) {
>           sync = MIRROR_SYNC_MODE_FULL;
>       }
>   
> @@ -3826,7 +3826,7 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
>   
>   void qmp_drive_mirror(DriveMirror *arg, Error **errp)
>   {
> -    BlockDriverState *bs;
> +    BlockDriverState *bs, *unfiltered_bs;
>       BlockDriverState *source, *target_bs;
>       AioContext *aio_context;
>       BlockMirrorBackingMode backing_mode;
> @@ -3835,6 +3835,7 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
>       int flags;
>       int64_t size;
>       const char *format = arg->format;
> +    const char *replaces_node_name = NULL;
>       int ret;
>   
>       bs = qmp_get_root_bs(arg->device, errp);
> @@ -3847,6 +3848,16 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
>           return;
>       }
>   
> +    /*
> +     * If the user has not instructed us otherwise, we should let the
> +     * block job run from @bs (thus taking into account all filters on
> +     * it) but replace @unfiltered_bs when it finishes (thus not
> +     * removing those filters).
> +     * (And if there are any explicit filters, we should assume the
> +     *  user knows how to use the @replaces option.)
> +     */
> +    unfiltered_bs = bdrv_skip_implicit_filters(bs);
> +
>       aio_context = bdrv_get_aio_context(bs);
>       aio_context_acquire(aio_context);
>   
> @@ -3860,8 +3871,14 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
>       }
>   
>       flags = bs->open_flags | BDRV_O_RDWR;
> -    source = backing_bs(bs);
> +    source = bdrv_filtered_cow_bs(unfiltered_bs);
>       if (!source && arg->sync == MIRROR_SYNC_MODE_TOP) {
> +        if (bdrv_filtered_bs(unfiltered_bs)) {
> +            /* @unfiltered_bs is an explicit filter */
> +            error_setg(errp, "Cannot perform sync=top mirror through an "
> +                       "explicitly added filter node on the source");
> +            goto out;
> +        }
>           arg->sync = MIRROR_SYNC_MODE_FULL;
>       }
>       if (arg->sync == MIRROR_SYNC_MODE_NONE) {
> @@ -3880,6 +3897,9 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
>                                " named node of the graph");
>               goto out;
>           }
> +        replaces_node_name = arg->replaces;
> +    } else if (unfiltered_bs != bs) {
> +        replaces_node_name = unfiltered_bs->node_name;
>       }
>   
>       if (arg->mode == NEW_IMAGE_MODE_ABSOLUTE_PATHS) {
> @@ -3899,6 +3919,9 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
>           bdrv_img_create(arg->target, format,
>                           NULL, NULL, NULL, size, flags, false, &local_err);
>       } else {
> +        /* Implicit filters should not appear in the filename */
> +        BlockDriverState *explicit_backing = bdrv_skip_implicit_filters(source);
> +
>           switch (arg->mode) {
>           case NEW_IMAGE_MODE_EXISTING:
>               break;
> @@ -3906,8 +3929,8 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
>               /* create new image with backing file */
>               bdrv_refresh_filename(source);
>               bdrv_img_create(arg->target, format,
> -                            source->filename,
> -                            source->drv->format_name,
> +                            explicit_backing->filename,
> +                            explicit_backing->drv->format_name,
>                               NULL, size, flags, false, &local_err);
>               break;
>           default:
> @@ -3943,7 +3966,7 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
>       }
>   
>       blockdev_mirror_common(arg->has_job_id ? arg->job_id : NULL, bs, target_bs,
> -                           arg->has_replaces, arg->replaces, arg->sync,
> +                           !!replaces_node_name, replaces_node_name, arg->sync,
>                              backing_mode, arg->has_speed, arg->speed,
>                              arg->has_granularity, arg->granularity,
>                              arg->has_buf_size, arg->buf_size,
> @@ -3979,7 +4002,7 @@ void qmp_blockdev_mirror(bool has_job_id, const char *job_id,
>                            bool has_auto_dismiss, bool auto_dismiss,
>                            Error **errp)
>   {
> -    BlockDriverState *bs;
> +    BlockDriverState *bs, *unfiltered_bs;
>       BlockDriverState *target_bs;
>       AioContext *aio_context;
>       BlockMirrorBackingMode backing_mode = MIRROR_LEAVE_BACKING_CHAIN;
> @@ -3991,6 +4014,16 @@ void qmp_blockdev_mirror(bool has_job_id, const char *job_id,
>           return;
>       }
>   
> +    /*
> +     * Same as in qmp_drive_mirror(): We want to run the job from @bs,
> +     * but we want to replace @unfiltered_bs on completion.
> +     */
> +    unfiltered_bs = bdrv_skip_implicit_filters(bs);
> +    if (!has_replaces && unfiltered_bs != bs) {
> +        replaces = unfiltered_bs->node_name;
> +        has_replaces = true;
> +    }
> +
>       target_bs = bdrv_lookup_bs(target, target, errp);
>       if (!target_bs) {
>           return;
> 


-- 
Best regards,
Vladimir

* Re: [Qemu-devel] [PATCH v5 26/42] backup: Deal with filters
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 26/42] backup: " Max Reitz
@ 2019-06-18 13:45   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-06-18 13:45 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

13.06.2019 1:09, Max Reitz wrote:
> Signed-off-by: Max Reitz<mreitz@redhat.com>

I'm not sure about completeness (hmm, I'm afraid neither assurance nor
completeness is possible for filters yet).

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

-- 
Best regards,
Vladimir

* Re: [Qemu-devel] [PATCH v5 29/42] nbd: Use CAF when looking for dirty bitmap
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 29/42] nbd: Use CAF when looking for dirty bitmap Max Reitz
@ 2019-06-18 13:58   ` Vladimir Sementsov-Ogievskiy
  2019-06-18 14:48   ` Eric Blake
  1 sibling, 0 replies; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-06-18 13:58 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

13.06.2019 1:09, Max Reitz wrote:
> When looking for a dirty bitmap to share, we should handle filters by
> just including them in the search (so they do not break backing chains).
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>   nbd/server.c | 6 +++---
>   1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/nbd/server.c b/nbd/server.c
> index aeca3893fe..0d51d46b81 100644
> --- a/nbd/server.c
> +++ b/nbd/server.c
> @@ -1508,13 +1508,13 @@ NBDExport *nbd_export_new(BlockDriverState *bs, uint64_t dev_offset,
>       if (bitmap) {
>           BdrvDirtyBitmap *bm = NULL;
>   
> -        while (true) {
> +        while (bs) {
>               bm = bdrv_find_dirty_bitmap(bs, bitmap);
> -            if (bm != NULL || bs->backing == NULL) {
> +            if (bm != NULL) {
>                   break;
>               }
>   
> -            bs = bs->backing->bs;
> +            bs = bdrv_filtered_bs(bs);
>           }
>   
>           if (bm == NULL) {
> 

Hmm, I'm a bit confused by the fact that we reuse bs for another purpose (it was my idea, but a bad one).
It seems safe here, as the only subsequent use of bs seems to want exactly the bs containing the bitmap, so it's
OK. It may be worth adding, next to "BdrvDirtyBitmap *bm", an additional "BlockDriverState *bm_bs = bs" and
operating on that instead.
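
Roughly what I have in mind, as a toy sketch rather than the real nbd/server.c code. ToyNode and toy_find_bitmap_node() are made-up names, each toy node holds at most one bitmap, and the "filtered" pointer stands in for bdrv_filtered_bs().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy node for illustration; not QEMU's BlockDriverState. */
typedef struct ToyNode {
    const char *bitmap_name;   /* name of the node's single bitmap, or NULL */
    struct ToyNode *filtered;  /* COW backing node or filter child, or NULL */
} ToyNode;

/*
 * Return the node in @bs's chain that owns the bitmap called @name,
 * or NULL if there is none.  The walk uses a separate iterator
 * (bm_bs), so @bs keeps pointing at the exported node.
 */
static ToyNode *toy_find_bitmap_node(ToyNode *bs, const char *name)
{
    ToyNode *bm_bs;

    assert(name);
    for (bm_bs = bs; bm_bs; bm_bs = bm_bs->filtered) {
        if (bm_bs->bitmap_name && strcmp(bm_bs->bitmap_name, name) == 0) {
            return bm_bs;
        }
    }
    return NULL;
}
```

The caller can then keep using bs for the export and bm_bs only for the bitmap.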


Anyway:

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>



-- 
Best regards,
Vladimir

* Re: [Qemu-devel] [PATCH v5 24/42] block: Use child access functions for QAPI queries
  2019-06-18 12:06   ` Vladimir Sementsov-Ogievskiy
@ 2019-06-18 14:22     ` Max Reitz
  0 siblings, 0 replies; 113+ messages in thread
From: Max Reitz @ 2019-06-18 14:22 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block; +Cc: Kevin Wolf, qemu-devel


On 18.06.19 14:06, Vladimir Sementsov-Ogievskiy wrote:
> 13.06.2019 1:09, Max Reitz wrote:
>> query-block and query-named-block-nodes now return any filtered child
>> under "backing", not just bs->backing or COW children.  This is so that
>> filters do not interrupt the reported backing chain.  This changes the
>> output for iotest 184, as the throttled node now appears as a backing
>> child.
>>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> ---
>>   block/qapi.c               | 35 ++++++++++++++++++++---------------
>>   tests/qemu-iotests/184.out |  7 ++++++-
>>   2 files changed, 26 insertions(+), 16 deletions(-)
>>
>> diff --git a/block/qapi.c b/block/qapi.c
>> index 0c13c86f4e..1fd2937abc 100644
>> --- a/block/qapi.c
>> +++ b/block/qapi.c
>> @@ -150,9 +150,13 @@ BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk,
>>               return NULL;
>>           }
>>   
>> -        if (bs0->drv && bs0->backing) {
>> +        if (bs0->drv && bdrv_filtered_child(bs0)) {
>> +            /*
>> +             * Put any filtered child here (for backwards compatibility to when
>> +             * we put bs0->backing here, which might be any filtered child).
>> +             */
>>               info->backing_file_depth++;
>> -            bs0 = bs0->backing->bs;
>> +            bs0 = bdrv_filtered_bs(bs0);
> 
> 
> so, here we report all filtered filter children as backing ...
> 
>>               (*p_image_info)->has_backing_image = true;
>>               p_image_info = &((*p_image_info)->backing_image);
>>           } else {
>> @@ -161,9 +165,8 @@ BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk,
>>   
>>           /* Skip automatically inserted nodes that the user isn't aware of for
>>            * query-block (blk != NULL), but not for query-named-block-nodes */
>> -        while (blk && bs0->drv && bs0->implicit) {
>> -            bs0 = backing_bs(bs0);
>> -            assert(bs0);
>> +        if (blk) {
>> +            bs0 = bdrv_skip_implicit_filters(bs0);
>>           }
>>       }
>>   
>> @@ -348,9 +351,9 @@ static void bdrv_query_info(BlockBackend *blk, BlockInfo **p_info,
>>       BlockDriverState *bs = blk_bs(blk);
>>       char *qdev;
>>   
>> -    /* Skip automatically inserted nodes that the user isn't aware of */
>> -    while (bs && bs->drv && bs->implicit) {
>> -        bs = backing_bs(bs);
>> +    if (bs) {
>> +        /* Skip automatically inserted nodes that the user isn't aware of */
>> +        bs = bdrv_skip_implicit_filters(bs);
>>       }
>>   
>>       info->device = g_strdup(blk_name(blk));
>> @@ -507,6 +510,7 @@ static void bdrv_query_blk_stats(BlockDeviceStats *ds, BlockBackend *blk)
>>   static BlockStats *bdrv_query_bds_stats(BlockDriverState *bs,
>>                                           bool blk_level)
>>   {
>> +    BlockDriverState *storage_bs, *cow_bs;
>>       BlockStats *s = NULL;
>>   
>>       s = g_malloc0(sizeof(*s));
>> @@ -519,9 +523,8 @@ static BlockStats *bdrv_query_bds_stats(BlockDriverState *bs,
>>       /* Skip automatically inserted nodes that the user isn't aware of in
>>        * a BlockBackend-level command. Stay at the exact node for a node-level
>>        * command. */
>> -    while (blk_level && bs->drv && bs->implicit) {
>> -        bs = backing_bs(bs);
>> -        assert(bs);
>> +    if (blk_level) {
>> +        bs = bdrv_skip_implicit_filters(bs);
>>       }
>>   
>>       if (bdrv_get_node_name(bs)[0]) {
>> @@ -531,14 +534,16 @@ static BlockStats *bdrv_query_bds_stats(BlockDriverState *bs,
>>   
>>       s->stats->wr_highest_offset = stat64_get(&bs->wr_highest_offset);
>>   
>> -    if (bs->file) {
>> +    storage_bs = bdrv_storage_bs(bs);
>> +    if (storage_bs) {
>>           s->has_parent = true;
>> -        s->parent = bdrv_query_bds_stats(bs->file->bs, blk_level);
>> +        s->parent = bdrv_query_bds_stats(storage_bs, blk_level);
> 
> ... and here not as "backing" but as "parent"
> 
> Shouldn't we report filter-child as backing here too, for consistency?

My idea here was that compatibility is not that important, it’s just
stats, after all.  Well, wr_highest_offset is kind of important, but
that doesn’t matter for a backing file.

So I decided to implement here what actually makes sense.  But maybe I
was actually wrong about “what makes sense”.  It doesn’t make much sense
to interrupt the backing chain here because of a filter...  Maybe the
s->backing field should always be the next non-implicit element in the
backing chain?  (So that may be duplicated in s->parent and s->backing,
but actually, so what.)
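
A toy model of that idea, as a sketch only.  ToyNode and both helpers
are hypothetical stand-ins, not the real BlockDriverState or the
bdrv_skip_implicit_filters() from this series, and "next non-implicit
element" is read here as "take the child, then skip implicit filters":

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a block node; this is NOT QEMU's BlockDriverState. */
typedef struct ToyNode {
    const char *name;
    bool is_filter;        /* node forwards data from exactly one child */
    bool implicit;         /* node was inserted automatically (e.g. by a job) */
    struct ToyNode *child; /* filtered child or COW backing child, or NULL */
} ToyNode;

/* Walk down past automatically inserted filters the user is not aware of. */
static ToyNode *toy_skip_implicit_filters(ToyNode *n)
{
    while (n && n->is_filter && n->implicit) {
        n = n->child;
    }
    return n;
}

/*
 * Hypothetical rule for the stats "backing" field: report the next
 * element below @n that is not an implicit filter, so that neither
 * explicit nor implicit filters interrupt the reported chain.
 */
static ToyNode *toy_stats_backing(ToyNode *n)
{
    assert(n != NULL);
    return toy_skip_implicit_filters(n->child);
}
```

With a chain top -> (implicit job filter) -> throttle -> base, this
reports throttle as the backing of top, and base as the backing of
throttle.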

Max

> anyway:
> 
> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> 
>>       }
>>   
>> -    if (blk_level && bs->backing) {
>> +    cow_bs = bdrv_filtered_cow_bs(bs);
>> +    if (blk_level && cow_bs) {
>>           s->has_backing = true;
>> -        s->backing = bdrv_query_bds_stats(bs->backing->bs, blk_level);
>> +        s->backing = bdrv_query_bds_stats(cow_bs, blk_level);
>>       }
>>   
>>       return s;
>> diff --git a/tests/qemu-iotests/184.out b/tests/qemu-iotests/184.out
>> index 3deb3cfb94..1d61f7e224 100644
>> --- a/tests/qemu-iotests/184.out
>> +++ b/tests/qemu-iotests/184.out
>> @@ -27,6 +27,11 @@ Testing:
>>               "iops_rd": 0,
>>               "detect_zeroes": "off",
>>               "image": {
>> +                "backing-image": {
>> +                    "virtual-size": 1073741824,
>> +                    "filename": "null-co://",
>> +                    "format": "null-co"
>> +                },
>>                   "virtual-size": 1073741824,
>>                   "filename": "json:{\"throttle-group\": \"group0\", \"driver\": \"throttle\", \"file\": {\"driver\": \"null-co\"}}",
>>                   "format": "throttle"
>> @@ -34,7 +39,7 @@ Testing:
>>               "iops_wr": 0,
>>               "ro": false,
>>               "node-name": "throttle0",
>> -            "backing_file_depth": 0,
>> +            "backing_file_depth": 1,
>>               "drv": "throttle",
>>               "iops": 0,
>>               "bps_wr": 0,
>>
> 
> 



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [Qemu-devel] [PATCH v5 25/42] mirror: Deal with filters
  2019-06-18 13:12   ` Vladimir Sementsov-Ogievskiy
@ 2019-06-18 14:47     ` Max Reitz
  2019-06-18 14:55       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 113+ messages in thread
From: Max Reitz @ 2019-06-18 14:47 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block; +Cc: Kevin Wolf, qemu-devel


On 18.06.19 15:12, Vladimir Sementsov-Ogievskiy wrote:
> 13.06.2019 1:09, Max Reitz wrote:
>> This includes some permission limiting (for example, we only need to
>> take the RESIZE permission for active commits where the base is smaller
>> than the top).
>>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
> 
> ohm, unfortunately I'm far from knowing block/mirror mechanics and interfaces :(.
> 
> still some comments below.
> 
>> ---
>>   block/mirror.c | 110 +++++++++++++++++++++++++++++++++++++------------
>>   blockdev.c     |  47 +++++++++++++++++----
>>   2 files changed, 124 insertions(+), 33 deletions(-)
>>
>> diff --git a/block/mirror.c b/block/mirror.c
>> index 4fa8f57c80..3d767e3030 100644
>> --- a/block/mirror.c
>> +++ b/block/mirror.c
>> @@ -660,8 +660,10 @@ static int mirror_exit_common(Job *job)
>>                               &error_abort);
>>       if (!abort && s->backing_mode == MIRROR_SOURCE_BACKING_CHAIN) {
>>           BlockDriverState *backing = s->is_none_mode ? src : s->base;
>> -        if (backing_bs(target_bs) != backing) {
>> -            bdrv_set_backing_hd(target_bs, backing, &local_err);
>> +        BlockDriverState *unfiltered_target = bdrv_skip_rw_filters(target_bs);
>> +
>> +        if (bdrv_filtered_cow_bs(unfiltered_target) != backing) {
>> +            bdrv_set_backing_hd(unfiltered_target, backing, &local_err);
>>               if (local_err) {
>>                   error_report_err(local_err);
>>                   ret = -EPERM;
>> @@ -711,7 +713,7 @@ static int mirror_exit_common(Job *job)
>>       block_job_remove_all_bdrv(bjob);
>>       bdrv_child_try_set_perm(mirror_top_bs->backing, 0, BLK_PERM_ALL,
>>                               &error_abort);
>> -    bdrv_replace_node(mirror_top_bs, backing_bs(mirror_top_bs), &error_abort);
>> +    bdrv_replace_node(mirror_top_bs, mirror_top_bs->backing->bs, &error_abort);
>>   
>>       /* We just changed the BDS the job BB refers to (with either or both of the
>>        * bdrv_replace_node() calls), so switch the BB back so the cleanup does
>> @@ -757,6 +759,7 @@ static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
>>   {
>>       int64_t offset;
>>       BlockDriverState *base = s->base;
>> +    BlockDriverState *filtered_base;
>>       BlockDriverState *bs = s->mirror_top_bs->backing->bs;
>>       BlockDriverState *target_bs = blk_bs(s->target);
>>       int ret;
>> @@ -795,6 +798,9 @@ static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
>>           s->initial_zeroing_ongoing = false;
>>       }
>>   
>> +    /* Will be NULL if @base is not in @bs's chain */
> 
> Should we assert that not NULL?

Well, but it can be NULL.  It is only non-NULL for active commit.

> Hmm, so this is the way to "skip filters reverse from the base", yes? Worth add a comment?

We need this because bdrv_is_allocated() will report everything as
allocated in a filter if it is allocated in its filtered child.  So if
we use @base in bdrv_is_allocated_above() and there is a filter on top
of it, bdrv_is_allocated_above() will report everything as allocated
that is allocated in @base (which we do not want).

Therefore, we need to go to the topmost R/W filter on top of @base, so
that bdrv_is_allocated_above() actually starts at the first COW chain
element above @base.

As for the comment, I thought the name “filtered base” would suffice,
but sure.

(“@filtered_base is the topmost node in the @bs-@base chain that is
connected to @base only through filters” or something; plus the
explanation why we need it.)
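
To illustrate, here is a toy model of how such a filtered base could be
located.  It is a sketch under assumptions: ToyNode and the toy_*()
helpers are made up for this example and only mimic what
bdrv_skip_rw_filters(), bdrv_find_overlay() and bdrv_filtered_cow_bs()
are described as doing in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a block node; this is NOT QEMU's BlockDriverState. */
typedef struct ToyNode {
    const char *name;
    bool is_filter;        /* true for an R/W filter such as throttle */
    struct ToyNode *child; /* filtered child (filter) or COW backing (format) */
} ToyNode;

/* Skip R/W filters downwards until a non-filter node (or NULL) is reached. */
static ToyNode *toy_skip_rw_filters(ToyNode *n)
{
    while (n && n->is_filter) {
        n = n->child;
    }
    return n;
}

/*
 * Find the non-filter node in @top's chain whose backing, after skipping
 * filters, is @base.  Returns NULL if @base is not in the chain.
 */
static ToyNode *toy_find_overlay(ToyNode *top, ToyNode *base)
{
    ToyNode *n;

    for (n = toy_skip_rw_filters(top); n; n = toy_skip_rw_filters(n->child)) {
        if (toy_skip_rw_filters(n->child) == base) {
            return n;
        }
    }
    return NULL;
}

/*
 * The "filtered base": the direct COW child of @base's overlay.  This is
 * either @base itself or the topmost filter stacked on top of @base.
 */
static ToyNode *toy_filtered_base(ToyNode *top, ToyNode *base)
{
    ToyNode *overlay;

    assert(top != NULL);
    overlay = toy_find_overlay(top, base);
    return overlay ? overlay->child : NULL;
}
```

For a chain top -> throttle -> base, toy_filtered_base(top, base) yields
the throttle node, so an "allocated above" walk that stops there never
descends into the filter and never sees base's allocation through it.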

>> +    filtered_base = bdrv_filtered_cow_bs(bdrv_find_overlay(bs, base));
>> +
>>       /* First part, loop on the sectors and initialize the dirty bitmap.  */
>>       for (offset = 0; offset < s->bdev_length; ) {
>>           /* Just to make sure we are not exceeding int limit. */

[...]

>> @@ -1583,15 +1590,42 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs,
>>        * In the case of active commit, things look a bit different, though,
>>        * because the target is an already populated backing file in active use.
>>        * We can allow anything except resize there.*/
>> +
>> +    target_perms = BLK_PERM_WRITE;
>> +    target_shared_perms = BLK_PERM_WRITE_UNCHANGED;
>> +
>>       target_is_backing = bdrv_chain_contains(bs, target);
> 
> don't you want skip filters here? actual target node may be in backing chain, but has separate
> filters above it

I don’t quite understand.  bdrv_chain_contains() iterates over the
filter chain, so it shouldn’t matter whether there are filters above
target or not.

[...]

>> @@ -1641,15 +1675,39 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs,
>>       /* In commit_active_start() all intermediate nodes disappear, so
>>        * any jobs in them must be blocked */
> 
> hmm, preexisting, it s/jobs/nodes/

I think the idea was that no other jobs may be run on intermediate
nodes.  (But by now it’s no longer just about jobs, so yes, should be
s/jobs/nodes/.  I don’t know whether I should squeeze that in here, though.)

Max


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [Qemu-devel] [PATCH v5 29/42] nbd: Use CAF when looking for dirty bitmap
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 29/42] nbd: Use CAF when looking for dirty bitmap Max Reitz
  2019-06-18 13:58   ` Vladimir Sementsov-Ogievskiy
@ 2019-06-18 14:48   ` Eric Blake
  1 sibling, 0 replies; 113+ messages in thread
From: Eric Blake @ 2019-06-18 14:48 UTC (permalink / raw)
  To: Max Reitz, qemu-block
  Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-devel


On 6/12/19 5:09 PM, Max Reitz wrote:
> When looking for a dirty bitmap to share, we should handle filters by
> just including them in the search (so they do not break backing chains).
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  nbd/server.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)

Reviewed-by: Eric Blake <eblake@redhat.com>

> 
> diff --git a/nbd/server.c b/nbd/server.c
> index aeca3893fe..0d51d46b81 100644
> --- a/nbd/server.c
> +++ b/nbd/server.c
> @@ -1508,13 +1508,13 @@ NBDExport *nbd_export_new(BlockDriverState *bs, uint64_t dev_offset,
>      if (bitmap) {
>          BdrvDirtyBitmap *bm = NULL;
>  
> -        while (true) {
> +        while (bs) {
>              bm = bdrv_find_dirty_bitmap(bs, bitmap);
> -            if (bm != NULL || bs->backing == NULL) {
> +            if (bm != NULL) {
>                  break;
>              }
>  
> -            bs = bs->backing->bs;
> +            bs = bdrv_filtered_bs(bs);
>          }
>  
>          if (bm == NULL) {
> 
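
The loop in the quoted hunk can be modelled with a toy chain.  This is
only a sketch: ToyNode, its single bitmap_name field and
toy_find_bitmap_owner() are assumptions for illustration and do not
reflect the real BdrvDirtyBitmap API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a block node; this is NOT QEMU's BlockDriverState. */
typedef struct ToyNode {
    const char *name;
    const char *bitmap_name; /* at most one bitmap per toy node, or NULL */
    struct ToyNode *child;   /* filtered child or COW backing child */
} ToyNode;

/*
 * Walk down the chain, following filter links and COW links alike, and
 * return the first node that owns a bitmap called @bitmap (NULL if none).
 */
static ToyNode *toy_find_bitmap_owner(ToyNode *bs, const char *bitmap)
{
    assert(bitmap != NULL);
    while (bs) {
        if (bs->bitmap_name && strcmp(bs->bitmap_name, bitmap) == 0) {
            return bs;
        }
        bs = bs->child;
    }
    return NULL;
}
```

Because the walk does not care which kind of child link it follows, a
filter sitting between the exported node and the node owning the bitmap
no longer ends the search early.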

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [Qemu-devel] [PATCH v5 25/42] mirror: Deal with filters
  2019-06-18 14:47     ` Max Reitz
@ 2019-06-18 14:55       ` Vladimir Sementsov-Ogievskiy
  2019-06-18 15:20         ` Max Reitz
  0 siblings, 1 reply; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-06-18 14:55 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

18.06.2019 17:47, Max Reitz wrote:
> On 18.06.19 15:12, Vladimir Sementsov-Ogievskiy wrote:
>> 13.06.2019 1:09, Max Reitz wrote:
>>> This includes some permission limiting (for example, we only need to
>>> take the RESIZE permission for active commits where the base is smaller
>>> than the top).
>>>
>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>>
>> ohm, unfortunately I'm far from knowing block/mirror mechanics and interfaces :(.
>>
>> still some comments below.
>>
>>> ---
>>>    block/mirror.c | 110 +++++++++++++++++++++++++++++++++++++------------
>>>    blockdev.c     |  47 +++++++++++++++++----
>>>    2 files changed, 124 insertions(+), 33 deletions(-)
>>>
>>> diff --git a/block/mirror.c b/block/mirror.c
>>> index 4fa8f57c80..3d767e3030 100644
>>> --- a/block/mirror.c
>>> +++ b/block/mirror.c
>>> @@ -660,8 +660,10 @@ static int mirror_exit_common(Job *job)
>>>                                &error_abort);
>>>        if (!abort && s->backing_mode == MIRROR_SOURCE_BACKING_CHAIN) {
>>>            BlockDriverState *backing = s->is_none_mode ? src : s->base;
>>> -        if (backing_bs(target_bs) != backing) {
>>> -            bdrv_set_backing_hd(target_bs, backing, &local_err);
>>> +        BlockDriverState *unfiltered_target = bdrv_skip_rw_filters(target_bs);
>>> +
>>> +        if (bdrv_filtered_cow_bs(unfiltered_target) != backing) {
>>> +            bdrv_set_backing_hd(unfiltered_target, backing, &local_err);
>>>                if (local_err) {
>>>                    error_report_err(local_err);
>>>                    ret = -EPERM;
>>> @@ -711,7 +713,7 @@ static int mirror_exit_common(Job *job)
>>>        block_job_remove_all_bdrv(bjob);
>>>        bdrv_child_try_set_perm(mirror_top_bs->backing, 0, BLK_PERM_ALL,
>>>                                &error_abort);
>>> -    bdrv_replace_node(mirror_top_bs, backing_bs(mirror_top_bs), &error_abort);
>>> +    bdrv_replace_node(mirror_top_bs, mirror_top_bs->backing->bs, &error_abort);
>>>    
>>>        /* We just changed the BDS the job BB refers to (with either or both of the
>>>         * bdrv_replace_node() calls), so switch the BB back so the cleanup does
>>> @@ -757,6 +759,7 @@ static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
>>>    {
>>>        int64_t offset;
>>>        BlockDriverState *base = s->base;
>>> +    BlockDriverState *filtered_base;
>>>        BlockDriverState *bs = s->mirror_top_bs->backing->bs;
>>>        BlockDriverState *target_bs = blk_bs(s->target);
>>>        int ret;
>>> @@ -795,6 +798,9 @@ static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
>>>            s->initial_zeroing_ongoing = false;
>>>        }
>>>    
>>> +    /* Will be NULL if @base is not in @bs's chain */
>>
>> Should we assert that not NULL?
> 
> Well, but it can be NULL.  It is only non-NULL for active commit.
> 
>> Hmm, so this is the way to "skip filters reverse from the base", yes? Worth add a comment?
> 
> We need this because bdrv_is_allocated() will report everything as
> allocated in a filter if it is allocated in its filtered child.  So if
> we use @base in bdrv_is_allocated_above() and there is a filter on top
> of it, bdrv_is_allocated_above() will report everything as allocated
> that is allocated in @base (which we do not want).
> 
> Therefore, we need to go to the topmost R/W filter on top of @base, so
> that bdrv_is_allocated_above() actually starts at the first COW chain
> element above @base.
> 
> As for the comment, I thought the name “filtered base” would suffice,
> but sure.
> 
> (“@filtered_base is the topmost node in the @bs-@base chain that is
> connected to @base only through filters” or something; plus the
> explanation why we need it.)
> 
>>> +    filtered_base = bdrv_filtered_cow_bs(bdrv_find_overlay(bs, base));
>>> +
>>>        /* First part, loop on the sectors and initialize the dirty bitmap.  */
>>>        for (offset = 0; offset < s->bdev_length; ) {
>>>            /* Just to make sure we are not exceeding int limit. */
> 
> [...]
> 
>>> @@ -1583,15 +1590,42 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs,
>>>         * In the case of active commit, things look a bit different, though,
>>>         * because the target is an already populated backing file in active use.
>>>         * We can allow anything except resize there.*/
>>> +
>>> +    target_perms = BLK_PERM_WRITE;
>>> +    target_shared_perms = BLK_PERM_WRITE_UNCHANGED;
>>> +
>>>        target_is_backing = bdrv_chain_contains(bs, target);
>>
>> don't you want skip filters here? actual target node may be in backing chain, but has separate
>> filters above it
> 
> I don’t quite understand.  bdrv_chain_contains() iterates over the
> filter chain, so it shouldn’t matter whether there are filters above
> target or not.
> 
> [...]


I just imagine something like this:

bs
  |
...  target node (it's filter)
  |  /
  v v
base (unfiltered target)


> 
>>> @@ -1641,15 +1675,39 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs,
>>>        /* In commit_active_start() all intermediate nodes disappear, so
>>>         * any jobs in them must be blocked */
>>
>> hmm, preexisting, it s/jobs/nodes/
> 
> I think the idea was that no other jobs may be run on intermediate
> nodes.  (But by now it’s no longer just about jobs, so yes, should be
> s/jobs/nodes/.  I don’t know whether I should squeeze that in here, though.)
> 
> Max
> 


-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [Qemu-devel] [PATCH v5 25/42] mirror: Deal with filters
  2019-06-18 14:55       ` Vladimir Sementsov-Ogievskiy
@ 2019-06-18 15:20         ` Max Reitz
  0 siblings, 0 replies; 113+ messages in thread
From: Max Reitz @ 2019-06-18 15:20 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block; +Cc: Kevin Wolf, qemu-devel


On 18.06.19 16:55, Vladimir Sementsov-Ogievskiy wrote:
> 18.06.2019 17:47, Max Reitz wrote:
>> On 18.06.19 15:12, Vladimir Sementsov-Ogievskiy wrote:
>>> 13.06.2019 1:09, Max Reitz wrote:
>>>> This includes some permission limiting (for example, we only need to
>>>> take the RESIZE permission for active commits where the base is smaller
>>>> than the top).
>>>>
>>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>>>
>>> ohm, unfortunately I'm far from knowing block/mirror mechanics and interfaces :(.
>>>
>>> still some comments below.
>>>
>>>> ---
>>>>    block/mirror.c | 110 +++++++++++++++++++++++++++++++++++++------------
>>>>    blockdev.c     |  47 +++++++++++++++++----
>>>>    2 files changed, 124 insertions(+), 33 deletions(-)
>>>>
>>>> diff --git a/block/mirror.c b/block/mirror.c
>>>> index 4fa8f57c80..3d767e3030 100644
>>>> --- a/block/mirror.c
>>>> +++ b/block/mirror.c
>>>> @@ -660,8 +660,10 @@ static int mirror_exit_common(Job *job)
>>>>                                &error_abort);
>>>>        if (!abort && s->backing_mode == MIRROR_SOURCE_BACKING_CHAIN) {
>>>>            BlockDriverState *backing = s->is_none_mode ? src : s->base;
>>>> -        if (backing_bs(target_bs) != backing) {
>>>> -            bdrv_set_backing_hd(target_bs, backing, &local_err);
>>>> +        BlockDriverState *unfiltered_target = bdrv_skip_rw_filters(target_bs);
>>>> +
>>>> +        if (bdrv_filtered_cow_bs(unfiltered_target) != backing) {
>>>> +            bdrv_set_backing_hd(unfiltered_target, backing, &local_err);
>>>>                if (local_err) {
>>>>                    error_report_err(local_err);
>>>>                    ret = -EPERM;
>>>> @@ -711,7 +713,7 @@ static int mirror_exit_common(Job *job)
>>>>        block_job_remove_all_bdrv(bjob);
>>>>        bdrv_child_try_set_perm(mirror_top_bs->backing, 0, BLK_PERM_ALL,
>>>>                                &error_abort);
>>>> -    bdrv_replace_node(mirror_top_bs, backing_bs(mirror_top_bs), &error_abort);
>>>> +    bdrv_replace_node(mirror_top_bs, mirror_top_bs->backing->bs, &error_abort);
>>>>    
>>>>        /* We just changed the BDS the job BB refers to (with either or both of the
>>>>         * bdrv_replace_node() calls), so switch the BB back so the cleanup does
>>>> @@ -757,6 +759,7 @@ static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
>>>>    {
>>>>        int64_t offset;
>>>>        BlockDriverState *base = s->base;
>>>> +    BlockDriverState *filtered_base;
>>>>        BlockDriverState *bs = s->mirror_top_bs->backing->bs;
>>>>        BlockDriverState *target_bs = blk_bs(s->target);
>>>>        int ret;
>>>> @@ -795,6 +798,9 @@ static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
>>>>            s->initial_zeroing_ongoing = false;
>>>>        }
>>>>    
>>>> +    /* Will be NULL if @base is not in @bs's chain */
>>>
>>> Should we assert that not NULL?
>>
>> Well, but it can be NULL.  It is only non-NULL for active commit.
>>
>>> Hmm, so this is the way to "skip filters reverse from the base", yes? Worth add a comment?
>>
>> We need this because bdrv_is_allocated() will report everything as
>> allocated in a filter if it is allocated in its filtered child.  So if
>> we use @base in bdrv_is_allocated_above() and there is a filter on top
>> of it, bdrv_is_allocated_above() will report everything as allocated
>> that is allocated in @base (which we do not want).
>>
>> Therefore, we need to go to the topmost R/W filter on top of @base, so
>> that bdrv_is_allocated_above() actually starts at the first COW chain
>> element above @base.
>>
>> As for the comment, I thought the name “filtered base” would suffice,
>> but sure.
>>
>> (“@filtered_base is the topmost node in the @bs-@base chain that is
>> connected to @base only through filters” or something; plus the
>> explanation why we need it.)
>>
>>>> +    filtered_base = bdrv_filtered_cow_bs(bdrv_find_overlay(bs, base));
>>>> +
>>>>        /* First part, loop on the sectors and initialize the dirty bitmap.  */
>>>>        for (offset = 0; offset < s->bdev_length; ) {
>>>>            /* Just to make sure we are not exceeding int limit. */
>>
>> [...]
>>
>>>> @@ -1583,15 +1590,42 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs,
>>>>         * In the case of active commit, things look a bit different, though,
>>>>         * because the target is an already populated backing file in active use.
>>>>         * We can allow anything except resize there.*/
>>>> +
>>>> +    target_perms = BLK_PERM_WRITE;
>>>> +    target_shared_perms = BLK_PERM_WRITE_UNCHANGED;
>>>> +
>>>>        target_is_backing = bdrv_chain_contains(bs, target);
>>>
>>> don't you want skip filters here? actual target node may be in backing chain, but has separate
>>> filters above it
>>
>> I don’t quite understand.  bdrv_chain_contains() iterates over the
>> filter chain, so it shouldn’t matter whether there are filters above
>> target or not.
>>
>> [...]
> 
> 
> I just imagine something like this:
> 
> bs
>   |
> ...  target node (it's filter)
>   |  /
>   v v
> base (unfiltered target)

Well, that’s just broken.  Good point.

Hm.  Can this actually be made to work?  The filter_target search could
be amended (by looking for the overlay of bdrv_skip_rw_filters(target)).
The loop to block intermediate nodes, though...  That would require some
more modifications.  We’d probably also need two loops, one from bs to
bdrv_skip_rw_filters(target), and one from the target to
bdrv_skip_rw_filters(target).

All in all I think it’s best to just forbid this case for now.  (We can
try something like that again for blockdev-copy in the future(TM).)  So
I’ll just check whether bdrv_skip_rw_filters(target) is in the chain,
and if so (but target_is_backing is false), return an error.
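
Roughly, as a toy sketch (ToyNode, the toy_*() helpers and the result
enum are hypothetical names for this example; the real check would use
bdrv_chain_contains() and bdrv_skip_rw_filters() and set an Error):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a block node; this is NOT QEMU's BlockDriverState. */
typedef struct ToyNode {
    const char *name;
    bool is_filter;        /* true for an R/W filter */
    struct ToyNode *child; /* filtered child or COW backing child */
} ToyNode;

typedef enum ToyTargetCheck {
    TOY_TARGET_SEPARATE, /* target is unrelated to the source chain */
    TOY_TARGET_BACKING,  /* target itself is in the source chain */
    TOY_TARGET_REJECTED, /* only the node below target's filters is */
} ToyTargetCheck;

/* True if @needle is reachable from @top via any kind of child link. */
static bool toy_chain_contains(const ToyNode *top, const ToyNode *needle)
{
    const ToyNode *n;

    for (n = top; n; n = n->child) {
        if (n == needle) {
            return true;
        }
    }
    return false;
}

/* Skip R/W filters downwards until a non-filter node (or NULL) is reached. */
static ToyNode *toy_skip_rw_filters(ToyNode *n)
{
    while (n && n->is_filter) {
        n = n->child;
    }
    return n;
}

/*
 * Reject a target that is a side filter on top of a node in @bs's chain:
 * the target is not in the chain, but its unfiltered node is.
 */
static ToyTargetCheck toy_check_target(ToyNode *bs, ToyNode *target)
{
    assert(bs != NULL && target != NULL);
    if (toy_chain_contains(bs, target)) {
        return TOY_TARGET_BACKING;
    }
    if (toy_chain_contains(bs, toy_skip_rw_filters(target))) {
        return TOY_TARGET_REJECTED;
    }
    return TOY_TARGET_SEPARATE;
}
```

In the diagram above, bs reaches base directly while the target is a
filter that also points at base, so this check would return
TOY_TARGET_REJECTED for it.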

Max

>>>> @@ -1641,15 +1675,39 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs,
>>>>        /* In commit_active_start() all intermediate nodes disappear, so
>>>>         * any jobs in them must be blocked */
>>>
>>> hmm, preexisting, it s/jobs/nodes/
>>
>> I think the idea was that no other jobs may be run on intermediate
>> nodes.  (But by now it’s no longer just about jobs, so yes, should be
>> s/jobs/nodes/.  I don’t know whether I should squeeze that in here, though.)
>>
>> Max
>>
> 
> 



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [Qemu-devel] [PATCH v5 30/42] qemu-img: Use child access functions
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 30/42] qemu-img: Use child access functions Max Reitz
@ 2019-06-19  9:18   ` Vladimir Sementsov-Ogievskiy
  2019-06-19 15:49     ` Max Reitz
  0 siblings, 1 reply; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-06-19  9:18 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

13.06.2019 1:09, Max Reitz wrote:
> This changes iotest 204's output, because blkdebug on top of a COW node
> used to make qemu-img map disregard the rest of the backing chain (the
> backing chain was broken by the filter).  With this patch, the
> allocation in the base image is reported correctly.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>   qemu-img.c                 | 36 ++++++++++++++++++++----------------
>   tests/qemu-iotests/204.out |  1 +
>   2 files changed, 21 insertions(+), 16 deletions(-)
> 
> diff --git a/qemu-img.c b/qemu-img.c
> index 07b6e2a808..7bfa6e5d40 100644
> --- a/qemu-img.c
> +++ b/qemu-img.c
> @@ -992,7 +992,7 @@ static int img_commit(int argc, char **argv)
>       if (!blk) {
>           return 1;
>       }
> -    bs = blk_bs(blk);
> +    bs = bdrv_skip_implicit_filters(blk_bs(blk));

If the filename is a json: description of an explicit filter over a normal
node, bs will be the explicit filter ...

>   
>       qemu_progress_init(progress, 1.f);
>       qemu_progress_print(0.f, 100);
> @@ -1009,7 +1009,7 @@ static int img_commit(int argc, char **argv)
>           /* This is different from QMP, which by default uses the deepest file in
>            * the backing chain (i.e., the very base); however, the traditional
>            * behavior of qemu-img commit is using the immediate backing file. */
> -        base_bs = backing_bs(bs);
> +        base_bs = bdrv_filtered_cow_bs(bs);
>           if (!base_bs) {

and here we'll fail.

>               error_setg(&local_err, "Image does not have a backing file");
>               goto done;
> @@ -1626,19 +1626,18 @@ static int convert_iteration_sectors(ImgConvertState *s, int64_t sector_num)
>   
>       if (s->sector_next_status <= sector_num) {
>           int64_t count = n * BDRV_SECTOR_SIZE;
> +        BlockDriverState *src_bs = blk_bs(s->src[src_cur]);
> +        BlockDriverState *base;
>   
>           if (s->target_has_backing) {
> -
> -            ret = bdrv_block_status(blk_bs(s->src[src_cur]),
> -                                    (sector_num - src_cur_offset) *
> -                                    BDRV_SECTOR_SIZE,
> -                                    count, &count, NULL, NULL);
> +            base = bdrv_backing_chain_next(src_bs);

As you described in another patch, won't we report data that is allocated in base as allocated here,
because the filters above base are counted?

Hmm. Now I wonder why filters don't report everything as unallocated. Wouldn't that be more correct
than falling through to the child?

>           } else {
> -            ret = bdrv_block_status_above(blk_bs(s->src[src_cur]), NULL,
> -                                          (sector_num - src_cur_offset) *
> -                                          BDRV_SECTOR_SIZE,
> -                                          count, &count, NULL, NULL);
> +            base = NULL;
>           }
> +        ret = bdrv_block_status_above(src_bs, base,
> +                                      (sector_num - src_cur_offset) *
> +                                      BDRV_SECTOR_SIZE,
> +                                      count, &count, NULL, NULL);
>           if (ret < 0) {
>               error_report("error while reading block status of sector %" PRId64
>                            ": %s", sector_num, strerror(-ret));
> @@ -2439,7 +2438,8 @@ static int img_convert(int argc, char **argv)
>            * s.target_backing_sectors has to be negative, which it will
>            * be automatically).  The backing file length is used only
>            * for optimizations, so such a case is not fatal. */
> -        s.target_backing_sectors = bdrv_nb_sectors(out_bs->backing->bs);
> +        s.target_backing_sectors =
> +            bdrv_nb_sectors(bdrv_filtered_cow_bs(out_bs));
>       } else {
>           s.target_backing_sectors = -1;
>       }
> @@ -2802,6 +2802,7 @@ static int get_block_status(BlockDriverState *bs, int64_t offset,
>   
>       depth = 0;
>       for (;;) {
> +        bs = bdrv_skip_rw_filters(bs);
>           ret = bdrv_block_status(bs, offset, bytes, &bytes, &map, &file);
>           if (ret < 0) {
>               return ret;
> @@ -2810,7 +2811,7 @@ static int get_block_status(BlockDriverState *bs, int64_t offset,
>           if (ret & (BDRV_BLOCK_ZERO|BDRV_BLOCK_DATA)) {
>               break;
>           }
> -        bs = backing_bs(bs);
> +        bs = bdrv_filtered_cow_bs(bs);
>           if (bs == NULL) {
>               ret = 0;
>               break;
> @@ -2949,7 +2950,7 @@ static int img_map(int argc, char **argv)
>       if (!blk) {
>           return 1;
>       }
> -    bs = blk_bs(blk);
> +    bs = bdrv_skip_implicit_filters(blk_bs(blk));

Hmm, another thought about implicit filters: how could they be here in qemu-img, if the only implicit
filters are job filters? Did you prepare this for implicit COR? But we discussed with Kevin that we'd
better deprecate the copy-on-read option.

So, if implicit filters exist for compatibility, we'll have them only in block jobs, and there seems to
be no reason to support them in qemu-img.

Also, in block jobs, we could stop creating implicit filters above any filter nodes, and stop creating any
filter nodes above implicit filters. This would still support old scenarios, but give a very simple and
well-defined scope for where implicit filters are used and how to work with them. What do you think?

>   
>       if (output_format == OFORMAT_HUMAN) {
>           printf("%-16s%-16s%-16s%s\n", "Offset", "Length", "Mapped to", "File");
> @@ -3165,6 +3166,7 @@ static int img_rebase(int argc, char **argv)
>       uint8_t *buf_old = NULL;
>       uint8_t *buf_new = NULL;
>       BlockDriverState *bs = NULL, *prefix_chain_bs = NULL;
> +    BlockDriverState *unfiltered_bs;
>       char *filename;
>       const char *fmt, *cache, *src_cache, *out_basefmt, *out_baseimg;
>       int c, flags, src_flags, ret;
> @@ -3299,6 +3301,8 @@ static int img_rebase(int argc, char **argv)
>       }
>       bs = blk_bs(blk);
>   
> +    unfiltered_bs = bdrv_skip_rw_filters(bs);
> +
>       if (out_basefmt != NULL) {
>           if (bdrv_find_format(out_basefmt) == NULL) {
>               error_report("Invalid format name: '%s'", out_basefmt);
> @@ -3310,7 +3314,7 @@ static int img_rebase(int argc, char **argv)
>       /* For safe rebasing we need to compare old and new backing file */
>       if (!unsafe) {
>           QDict *options = NULL;
> -        BlockDriverState *base_bs = backing_bs(bs);
> +        BlockDriverState *base_bs = bdrv_filtered_cow_bs(unfiltered_bs);
>   
>           if (base_bs) {
>               blk_old_backing = blk_new(qemu_get_aio_context(),
> @@ -3463,7 +3467,7 @@ static int img_rebase(int argc, char **argv)
>                    * If cluster wasn't changed since prefix_chain, we don't need
>                    * to take action
>                    */
> -                ret = bdrv_is_allocated_above(backing_bs(bs), prefix_chain_bs,
> +                ret = bdrv_is_allocated_above(unfiltered_bs, prefix_chain_bs,
>                                                 offset, n, &n);
>                   if (ret < 0) {
>                       error_report("error while reading image metadata: %s",
> diff --git a/tests/qemu-iotests/204.out b/tests/qemu-iotests/204.out
> index f3a10fbe90..684774d763 100644
> --- a/tests/qemu-iotests/204.out
> +++ b/tests/qemu-iotests/204.out
> @@ -59,5 +59,6 @@ Offset          Length          File
>   0x900000        0x2400000       TEST_DIR/t.IMGFMT
>   0x3c00000       0x1100000       TEST_DIR/t.IMGFMT
>   0x6a00000       0x400000        TEST_DIR/t.IMGFMT
> +0x6e00000       0x1200000       TEST_DIR/t.IMGFMT.base
>   No errors were found on the image.
>   *** done
> 


-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [Qemu-devel] [PATCH v5 31/42] block: Drop backing_bs()
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 31/42] block: Drop backing_bs() Max Reitz
@ 2019-06-19  9:18   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-06-19  9:18 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

13.06.2019 1:09, Max Reitz wrote:
> We want to make it explicit where bs->backing is used, and we have done
> so.  The old role of backing_bs() is now effectively taken by
> bdrv_filtered_cow_bs().
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>


Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>


> ---
>   include/block/block_int.h | 5 -----
>   1 file changed, 5 deletions(-)
> 
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index 875a33f255..c0a05beec3 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -925,11 +925,6 @@ typedef enum BlockMirrorBackingMode {
>       MIRROR_LEAVE_BACKING_CHAIN,
>   } BlockMirrorBackingMode;
>   
> -static inline BlockDriverState *backing_bs(BlockDriverState *bs)
> -{
> -    return bs->backing ? bs->backing->bs : NULL;
> -}
> -
>   
>   /* Essential block drivers which must always be statically linked into qemu, and
>    * which therefore can be accessed without using bdrv_find_format() */
> 
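
For readers following along, here is a minimal sketch of why the replacement
is not a pure rename. It uses toy types, not QEMU's real structures: Node,
Child, old_backing_bs() and toy_filtered_cow_bs() are names made up for this
illustration. old_backing_bs() copies the body removed in the diff above. The
behavior of toy_filtered_cow_bs() is an assumption about how
bdrv_filtered_cow_bs() is meant to work (a filter has no COW child, even if
it keeps its filtered node in ->backing, as commit_top and mirror_top do).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for BlockDriverState and BdrvChild; only the fields used here. */
typedef struct Node Node;
typedef struct Child {
    Node *bs;
} Child;
struct Node {
    bool is_filter;   /* models bs->drv->is_filter */
    Child *backing;   /* models bs->backing */
};

/* Same body as the backing_bs() helper removed by this patch. */
static Node *old_backing_bs(Node *bs)
{
    assert(bs != NULL);
    return bs->backing ? bs->backing->bs : NULL;
}

/*
 * ASSUMPTION: only non-filter nodes have a COW child.  A filter that
 * stores its filtered node in ->backing yields NULL here.
 */
static Node *toy_filtered_cow_bs(Node *bs)
{
    if (bs == NULL || bs->is_filter) {
        return NULL;
    }
    return bs->backing ? bs->backing->bs : NULL;
}
```

Under that assumption the two helpers agree on format nodes and differ only
on filters that use ->backing, which is exactly the case the series wants to
make explicit.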


-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [Qemu-devel] [PATCH v5 32/42] block: Make bdrv_get_cumulative_perm() public
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 32/42] block: Make bdrv_get_cumulative_perm() public Max Reitz
@ 2019-06-19  9:19   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-06-19  9:19 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

13.06.2019 1:09, Max Reitz wrote:
> This is useful in other files like blockdev.c to determine for example
> whether a node can be written to or not.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>


Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>


> ---
>   include/block/block_int.h | 3 +++
>   block.c                   | 6 ++----
>   2 files changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index c0a05beec3..cfefb00104 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -1181,6 +1181,9 @@ void bdrv_root_unref_child(BdrvChild *child);
>   int bdrv_child_try_set_perm(BdrvChild *c, uint64_t perm, uint64_t shared,
>                               Error **errp);
>   
> +void bdrv_get_cumulative_perm(BlockDriverState *bs,
> +                              uint64_t *perm, uint64_t *shared_perm);
> +
>   /* Default implementation for BlockDriver.bdrv_child_perm() that can be used by
>    * block filters: Forward CONSISTENT_READ, WRITE, WRITE_UNCHANGED and RESIZE to
>    * all children */
> diff --git a/block.c b/block.c
> index 856d9b58be..59d1d4b2b1 100644
> --- a/block.c
> +++ b/block.c
> @@ -1711,8 +1711,6 @@ static int bdrv_child_check_perm(BdrvChild *c, BlockReopenQueue *q,
>                                    GSList *ignore_children, Error **errp);
>   static void bdrv_child_abort_perm_update(BdrvChild *c);
>   static void bdrv_child_set_perm(BdrvChild *c, uint64_t perm, uint64_t shared);
> -static void bdrv_get_cumulative_perm(BlockDriverState *bs, uint64_t *perm,
> -                                     uint64_t *shared_perm);
>   
>   typedef struct BlockReopenQueueEntry {
>        bool prepared;
> @@ -1904,8 +1902,8 @@ static void bdrv_set_perm(BlockDriverState *bs, uint64_t cumulative_perms,
>       }
>   }
>   
> -static void bdrv_get_cumulative_perm(BlockDriverState *bs, uint64_t *perm,
> -                                     uint64_t *shared_perm)
> +void bdrv_get_cumulative_perm(BlockDriverState *bs,
> +                              uint64_t *perm, uint64_t *shared_perm)
>   {
>       BdrvChild *c;
>       uint64_t cumulative_perms = 0;
> 


-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [Qemu-devel] [PATCH v5 33/42] blockdev: Fix active commit choice
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 33/42] blockdev: Fix active commit choice Max Reitz
@ 2019-06-19  9:31   ` Vladimir Sementsov-Ogievskiy
  2019-06-19 15:59     ` Max Reitz
  0 siblings, 1 reply; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-06-19  9:31 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

13.06.2019 1:09, Max Reitz wrote:
> We have to perform an active commit whenever the top node has a parent
> that has taken the WRITE permission on it.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>   blockdev.c | 24 +++++++++++++++++++++---
>   1 file changed, 21 insertions(+), 3 deletions(-)
> 
> diff --git a/blockdev.c b/blockdev.c
> index a464cabf9e..5370f3b738 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -3294,6 +3294,7 @@ void qmp_block_commit(bool has_job_id, const char *job_id, const char *device,
>        */
>       BlockdevOnError on_error = BLOCKDEV_ON_ERROR_REPORT;
>       int job_flags = JOB_DEFAULT;
> +    uint64_t top_perm, top_shared;
>   
>       if (!has_speed) {
>           speed = 0;
> @@ -3406,14 +3407,31 @@ void qmp_block_commit(bool has_job_id, const char *job_id, const char *device,
>           goto out;
>       }
>   
> -    if (top_bs == bs) {
> +    /*
> +     * Active commit is required if and only if someone has taken a
> +     * WRITE permission on the top node.  Historically, we have always
> +     * used active commit for top nodes, so continue that practice.
> +     * (Active commit is never really wrong.)

Hmm, if we start an active commit when nobody has write access, then
we leave a possibility for someone to get this access during the commit. And during
a passive commit, write access is blocked. So maybe the right way is to always do an
active commit? Benefits:
1. One code path. It shouldn't be worse when there are no writers: without guest writes,
the mirror code shouldn't work worse than passive commit; if it does, it should be fixed.
2. Write access is possible if the user needs it during the commit.
3. I'm sure that active commit (mirror code) actually works faster, as it uses
async requests and smarter handling of block status.

> +     */
> +    bdrv_get_cumulative_perm(top_bs, &top_perm, &top_shared);
> +    if (top_perm & BLK_PERM_WRITE ||
> +        bdrv_skip_rw_filters(top_bs) == bdrv_skip_rw_filters(bs))
> +    {
>           if (has_backing_file) {
>               error_setg(errp, "'backing-file' specified,"
>                                " but 'top' is the active layer");
>               goto out;
>           }
> -        commit_active_start(has_job_id ? job_id : NULL, bs, base_bs,
> -                            job_flags, speed, on_error,
> +        if (!has_job_id) {
> +            /*
> +             * Emulate here what block_job_create() does, because it
> +             * is possible that @bs != @top_bs (the block job should
> +             * be named after @bs, even if @top_bs is the actual
> +             * source)
> +             */
> +            job_id = bdrv_get_device_name(bs);
> +        }
> +        commit_active_start(job_id, top_bs, base_bs, job_flags, speed, on_error,
>                               filter_node_name, NULL, NULL, false, &local_err);
>       } else {
>           BlockDriverState *overlay_bs = bdrv_find_overlay(bs, top_bs);
> 
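
The condition in the hunk above can be restated as a small standalone
sketch. Everything here is a toy model: Node, TOY_PERM_WRITE,
toy_skip_rw_filters() and toy_needs_active_commit() are hypothetical names,
parent_perm stands in for what bdrv_get_cumulative_perm() returns in @perm,
and the filter-skipping loop is an assumption about bdrv_skip_rw_filters().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit value; the real flag is BLK_PERM_WRITE. */
#define TOY_PERM_WRITE (UINT64_C(1) << 1)

typedef struct Node Node;
struct Node {
    bool is_filter;        /* models bs->drv->is_filter */
    Node *filtered;        /* child a filter passes reads and writes to */
    uint64_t parent_perm;  /* union of permissions taken by all parents */
};

/* ASSUMPTION: skip downwards past every R/W filter to the first real node. */
static Node *toy_skip_rw_filters(Node *bs)
{
    while (bs != NULL && bs->is_filter && bs->filtered != NULL) {
        bs = bs->filtered;
    }
    return bs;
}

/*
 * Mirrors the patch's condition: active commit if some parent writes to
 * top_bs, or if top_bs and bs are the same node once filters are skipped.
 */
static bool toy_needs_active_commit(Node *bs, Node *top_bs)
{
    assert(bs != NULL && top_bs != NULL);
    if (top_bs->parent_perm & TOY_PERM_WRITE) {
        return true;
    }
    return toy_skip_rw_filters(top_bs) == toy_skip_rw_filters(bs);
}
```

In this model, a throttle filter sitting on top of the active layer still
selects active commit, and so does an intermediate node that some other
parent writes to.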


-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [Qemu-devel] [PATCH v5 34/42] block: Inline bdrv_co_block_status_from_*()
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 34/42] block: Inline bdrv_co_block_status_from_*() Max Reitz
@ 2019-06-19  9:34   ` Vladimir Sementsov-Ogievskiy
  2019-06-19 16:01     ` Max Reitz
  2019-06-19 16:07     ` Max Reitz
  2019-06-21 13:39   ` Vladimir Sementsov-Ogievskiy
  1 sibling, 2 replies; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-06-19  9:34 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

13.06.2019 1:09, Max Reitz wrote:
> With bdrv_filtered_rw_bs(), we can easily handle this default filter
> behavior in bdrv_co_block_status().
> 
> blkdebug wants to have an additional assertion, so it keeps its own
> implementation, except bdrv_co_block_status_from_file() needs to be
> inlined there.
> 
> Suggested-by: Eric Blake <eblake@redhat.com>
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>   include/block/block_int.h | 22 -----------------
>   block/blkdebug.c          |  7 ++++--
>   block/blklogwrites.c      |  1 -
>   block/commit.c            |  1 -
>   block/copy-on-read.c      |  2 --
>   block/io.c                | 51 +++++++++++++--------------------------
>   block/mirror.c            |  1 -
>   block/throttle.c          |  1 -
>   8 files changed, 22 insertions(+), 64 deletions(-)
> 
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index cfefb00104..431fa38ea0 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -1203,28 +1203,6 @@ void bdrv_format_default_perms(BlockDriverState *bs, BdrvChild *c,
>                                  uint64_t perm, uint64_t shared,
>                                  uint64_t *nperm, uint64_t *nshared);
>   
> -/*
> - * Default implementation for drivers to pass bdrv_co_block_status() to
> - * their file.
> - */
> -int coroutine_fn bdrv_co_block_status_from_file(BlockDriverState *bs,
> -                                                bool want_zero,
> -                                                int64_t offset,
> -                                                int64_t bytes,
> -                                                int64_t *pnum,
> -                                                int64_t *map,
> -                                                BlockDriverState **file);
> -/*
> - * Default implementation for drivers to pass bdrv_co_block_status() to
> - * their backing file.
> - */
> -int coroutine_fn bdrv_co_block_status_from_backing(BlockDriverState *bs,
> -                                                   bool want_zero,
> -                                                   int64_t offset,
> -                                                   int64_t bytes,
> -                                                   int64_t *pnum,
> -                                                   int64_t *map,
> -                                                   BlockDriverState **file);
>   const char *bdrv_get_parent_name(const BlockDriverState *bs);
>   void blk_dev_change_media_cb(BlockBackend *blk, bool load, Error **errp);
>   bool blk_dev_has_removable_media(BlockBackend *blk);
> diff --git a/block/blkdebug.c b/block/blkdebug.c
> index efd9441625..7950ae729c 100644
> --- a/block/blkdebug.c
> +++ b/block/blkdebug.c
> @@ -637,8 +637,11 @@ static int coroutine_fn blkdebug_co_block_status(BlockDriverState *bs,
>                                                    BlockDriverState **file)
>   {
>       assert(QEMU_IS_ALIGNED(offset | bytes, bs->bl.request_alignment));
> -    return bdrv_co_block_status_from_file(bs, want_zero, offset, bytes,
> -                                          pnum, map, file);
> +    assert(bs->file && bs->file->bs);
> +    *pnum = bytes;
> +    *map = offset;
> +    *file = bs->file->bs;
> +    return BDRV_BLOCK_RAW | BDRV_BLOCK_OFFSET_VALID;
>   }
>   
>   static void blkdebug_close(BlockDriverState *bs)
> diff --git a/block/blklogwrites.c b/block/blklogwrites.c
> index eb2b4901a5..1eb4a5c613 100644
> --- a/block/blklogwrites.c
> +++ b/block/blklogwrites.c
> @@ -518,7 +518,6 @@ static BlockDriver bdrv_blk_log_writes = {
>       .bdrv_co_pwrite_zeroes  = blk_log_writes_co_pwrite_zeroes,
>       .bdrv_co_flush_to_disk  = blk_log_writes_co_flush_to_disk,
>       .bdrv_co_pdiscard       = blk_log_writes_co_pdiscard,
> -    .bdrv_co_block_status   = bdrv_co_block_status_from_file,
>   
>       .is_filter              = true,
>       .strong_runtime_opts    = blk_log_writes_strong_runtime_opts,
> diff --git a/block/commit.c b/block/commit.c
> index ec5a8c8edf..a5b58eadeb 100644
> --- a/block/commit.c
> +++ b/block/commit.c
> @@ -257,7 +257,6 @@ static void bdrv_commit_top_child_perm(BlockDriverState *bs, BdrvChild *c,
>   static BlockDriver bdrv_commit_top = {
>       .format_name                = "commit_top",
>       .bdrv_co_preadv             = bdrv_commit_top_preadv,
> -    .bdrv_co_block_status       = bdrv_co_block_status_from_backing,
>       .bdrv_refresh_filename      = bdrv_commit_top_refresh_filename,
>       .bdrv_child_perm            = bdrv_commit_top_child_perm,
>   
> diff --git a/block/copy-on-read.c b/block/copy-on-read.c
> index 88e1c1f538..5a292de000 100644
> --- a/block/copy-on-read.c
> +++ b/block/copy-on-read.c
> @@ -161,8 +161,6 @@ static BlockDriver bdrv_copy_on_read = {
>       .bdrv_eject                         = cor_eject,
>       .bdrv_lock_medium                   = cor_lock_medium,
>   
> -    .bdrv_co_block_status               = bdrv_co_block_status_from_file,
> -
>       .bdrv_recurse_is_first_non_filter   = cor_recurse_is_first_non_filter,
>   
>       .has_variable_length                = true,
> diff --git a/block/io.c b/block/io.c
> index 14f99e1c00..0a832e30a3 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -1998,36 +1998,6 @@ typedef struct BdrvCoBlockStatusData {
>       bool done;
>   } BdrvCoBlockStatusData;
>   
> -int coroutine_fn bdrv_co_block_status_from_file(BlockDriverState *bs,
> -                                                bool want_zero,
> -                                                int64_t offset,
> -                                                int64_t bytes,
> -                                                int64_t *pnum,
> -                                                int64_t *map,
> -                                                BlockDriverState **file)
> -{
> -    assert(bs->file && bs->file->bs);
> -    *pnum = bytes;
> -    *map = offset;
> -    *file = bs->file->bs;
> -    return BDRV_BLOCK_RAW | BDRV_BLOCK_OFFSET_VALID;
> -}
> -
> -int coroutine_fn bdrv_co_block_status_from_backing(BlockDriverState *bs,
> -                                                   bool want_zero,
> -                                                   int64_t offset,
> -                                                   int64_t bytes,
> -                                                   int64_t *pnum,
> -                                                   int64_t *map,
> -                                                   BlockDriverState **file)
> -{
> -    assert(bs->backing && bs->backing->bs);
> -    *pnum = bytes;
> -    *map = offset;
> -    *file = bs->backing->bs;
> -    return BDRV_BLOCK_RAW | BDRV_BLOCK_OFFSET_VALID;
> -}
> -
>   /*
>    * Returns the allocation status of the specified sectors.
>    * Drivers not implementing the functionality are assumed to not support
> @@ -2068,6 +2038,7 @@ static int coroutine_fn bdrv_co_block_status(BlockDriverState *bs,
>       BlockDriverState *local_file = NULL;
>       int64_t aligned_offset, aligned_bytes;
>       uint32_t align;
> +    bool has_filtered_child;
>   
>       assert(pnum);
>       *pnum = 0;
> @@ -2093,7 +2064,8 @@ static int coroutine_fn bdrv_co_block_status(BlockDriverState *bs,
>   
>       /* Must be non-NULL or bdrv_getlength() would have failed */
>       assert(bs->drv);
> -    if (!bs->drv->bdrv_co_block_status) {
> +    has_filtered_child = bs->drv->is_filter && bdrv_filtered_rw_child(bs);
> +    if (!bs->drv->bdrv_co_block_status && !has_filtered_child) {
>           *pnum = bytes;
>           ret = BDRV_BLOCK_DATA | BDRV_BLOCK_ALLOCATED;
>           if (offset + bytes == total_size) {
> @@ -2114,9 +2086,20 @@ static int coroutine_fn bdrv_co_block_status(BlockDriverState *bs,
>       aligned_offset = QEMU_ALIGN_DOWN(offset, align);
>       aligned_bytes = ROUND_UP(offset + bytes, align) - aligned_offset;
>   
> -    ret = bs->drv->bdrv_co_block_status(bs, want_zero, aligned_offset,
> -                                        aligned_bytes, pnum, &local_map,
> -                                        &local_file);
> +    if (bs->drv->bdrv_co_block_status) {
> +        ret = bs->drv->bdrv_co_block_status(bs, want_zero, aligned_offset,
> +                                            aligned_bytes, pnum, &local_map,
> +                                            &local_file);
> +    } else {
> +        /* Default code for filters */
> +
> +        local_file = bdrv_filtered_rw_bs(bs);
> +        assert(local_file);
> +
> +        *pnum = aligned_bytes;
> +        local_map = aligned_offset;
> +        ret = BDRV_BLOCK_RAW | BDRV_BLOCK_OFFSET_VALID;

I'm now in a little doubt:

What is the real difference between RAW for filters and UNALLOCATED for qcow2 (when we
should look at the backing file)?

> +    }
>       if (ret < 0) {
>           *pnum = 0;
>           goto out;
> diff --git a/block/mirror.c b/block/mirror.c
> index 3d767e3030..71bd7f7625 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -1484,7 +1484,6 @@ static BlockDriver bdrv_mirror_top = {
>       .bdrv_co_pwrite_zeroes      = bdrv_mirror_top_pwrite_zeroes,
>       .bdrv_co_pdiscard           = bdrv_mirror_top_pdiscard,
>       .bdrv_co_flush              = bdrv_mirror_top_flush,
> -    .bdrv_co_block_status       = bdrv_co_block_status_from_backing,
>       .bdrv_refresh_filename      = bdrv_mirror_top_refresh_filename,
>       .bdrv_child_perm            = bdrv_mirror_top_child_perm,
>   
> diff --git a/block/throttle.c b/block/throttle.c
> index de1b6bd7e8..32ec56db0f 100644
> --- a/block/throttle.c
> +++ b/block/throttle.c
> @@ -269,7 +269,6 @@ static BlockDriver bdrv_throttle = {
>       .bdrv_reopen_prepare                =   throttle_reopen_prepare,
>       .bdrv_reopen_commit                 =   throttle_reopen_commit,
>       .bdrv_reopen_abort                  =   throttle_reopen_abort,
> -    .bdrv_co_block_status               =   bdrv_co_block_status_from_file,
>   
>       .bdrv_co_drain_begin                =   throttle_co_drain_begin,
>       .bdrv_co_drain_end                  =   throttle_co_drain_end,
> 
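
To make the RAW question concrete, here is a minimal sketch of what the new
default branch hands back for a filter. It is a toy model, not the block
layer: Node, the TOY_BLOCK_* values and toy_filter_block_status() are
hypothetical, and the real code uses aligned_offset/aligned_bytes and
BDRV_BLOCK_RAW | BDRV_BLOCK_OFFSET_VALID. As far as I understand it, RAW asks
the generic code to recompute the answer from the returned node at the
returned offset, rather than to treat the range as a hole in this layer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values; the real ones are the BDRV_BLOCK_* constants. */
enum {
    TOY_BLOCK_RAW          = 1 << 0,
    TOY_BLOCK_OFFSET_VALID = 1 << 1,
};

typedef struct Node Node;
struct Node {
    bool is_filter;   /* models bs->drv->is_filter */
    Node *filtered;   /* models bdrv_filtered_rw_bs(bs) */
};

/*
 * Default block status for a filter without its own callback: claim the
 * whole range, map it 1:1, and point the caller at the filtered child.
 */
static int toy_filter_block_status(Node *bs, int64_t offset, int64_t bytes,
                                   int64_t *pnum, int64_t *map, Node **file)
{
    assert(bs != NULL && bs->is_filter && bs->filtered != NULL);
    *pnum = bytes;
    *map = offset;
    *file = bs->filtered;
    return TOY_BLOCK_RAW | TOY_BLOCK_OFFSET_VALID;
}
```

The filter itself never claims the data as allocated or unallocated; it only
redirects the query, which is where it differs from a qcow2 layer reporting
an unallocated cluster.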


-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [Qemu-devel] [PATCH v5 30/42] qemu-img: Use child access functions
  2019-06-19  9:18   ` Vladimir Sementsov-Ogievskiy
@ 2019-06-19 15:49     ` Max Reitz
  2019-06-21 13:15       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 113+ messages in thread
From: Max Reitz @ 2019-06-19 15:49 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block; +Cc: Kevin Wolf, qemu-devel


[-- Attachment #1.1: Type: text/plain, Size: 4755 bytes --]

On 19.06.19 11:18, Vladimir Sementsov-Ogievskiy wrote:
> 13.06.2019 1:09, Max Reitz wrote:
>> This changes iotest 204's output, because blkdebug on top of a COW node
>> used to make qemu-img map disregard the rest of the backing chain (the
>> backing chain was broken by the filter).  With this patch, the
>> allocation in the base image is reported correctly.
>>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> ---
>>   qemu-img.c                 | 36 ++++++++++++++++++++----------------
>>   tests/qemu-iotests/204.out |  1 +
>>   2 files changed, 21 insertions(+), 16 deletions(-)
>>
>> diff --git a/qemu-img.c b/qemu-img.c
>> index 07b6e2a808..7bfa6e5d40 100644
>> --- a/qemu-img.c
>> +++ b/qemu-img.c
>> @@ -992,7 +992,7 @@ static int img_commit(int argc, char **argv)
>>       if (!blk) {
>>           return 1;
>>       }
>> -    bs = blk_bs(blk);
>> +    bs = bdrv_skip_implicit_filters(blk_bs(blk));
> 
> > If the filename is JSON describing an explicit filter over a normal node, bs will be
> > the explicit filter ...
> 
>>   
>>       qemu_progress_init(progress, 1.f);
>>       qemu_progress_print(0.f, 100);
>> @@ -1009,7 +1009,7 @@ static int img_commit(int argc, char **argv)
>>           /* This is different from QMP, which by default uses the deepest file in
>>            * the backing chain (i.e., the very base); however, the traditional
>>            * behavior of qemu-img commit is using the immediate backing file. */
>> -        base_bs = backing_bs(bs);
>> +        base_bs = bdrv_filtered_cow_bs(bs);
>>           if (!base_bs) {
> 
> and here we'll fail.

Right, will change to bdrv_backing_chain_next().

>>               error_setg(&local_err, "Image does not have a backing file");
>>               goto done;
>> @@ -1626,19 +1626,18 @@ static int convert_iteration_sectors(ImgConvertState *s, int64_t sector_num)
>>   
>>       if (s->sector_next_status <= sector_num) {
>>           int64_t count = n * BDRV_SECTOR_SIZE;
>> +        BlockDriverState *src_bs = blk_bs(s->src[src_cur]);
>> +        BlockDriverState *base;
>>   
>>           if (s->target_has_backing) {
>> -
>> -            ret = bdrv_block_status(blk_bs(s->src[src_cur]),
>> -                                    (sector_num - src_cur_offset) *
>> -                                    BDRV_SECTOR_SIZE,
>> -                                    count, &count, NULL, NULL);
>> +            base = bdrv_backing_chain_next(src_bs);
> 
> > As you described in another patch, won't we report data allocated in base as allocated here,
> > because we count the filters above base?

Damn, yes.  So

    base = bdrv_filtered_cow_bs(bdrv_skip_rw_filters(src_bs));

I suppose.
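
A minimal sketch of the difference between the two choices of base, on a toy
chain src (format) -> filter -> base (format). All names are hypothetical,
and two things are assumptions rather than facts from this thread: that
bdrv_backing_chain_next() skips filters both above and below the COW link,
and that a status walk "above base" visits every node from the top down to,
but excluding, base, with filters passing the query through to their child.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct Node Node;
struct Node {
    bool is_filter;   /* models bs->drv->is_filter */
    bool allocated;   /* format nodes: does this layer hold the cluster? */
    Node *child;      /* filtered child for filters, COW backing otherwise */
};

static Node *toy_skip_rw_filters(Node *bs)
{
    while (bs != NULL && bs->is_filter) {
        bs = bs->child;
    }
    return bs;
}

static Node *toy_filtered_cow_bs(Node *bs)
{
    return (bs != NULL && !bs->is_filter) ? bs->child : NULL;
}

/* ASSUMPTION: skip filters, follow the COW link, skip filters again. */
static Node *toy_backing_chain_next(Node *bs)
{
    return toy_skip_rw_filters(toy_filtered_cow_bs(toy_skip_rw_filters(bs)));
}

/* Status of a single node; a filter answers with its child's status. */
static bool toy_node_allocated(Node *bs)
{
    bs = toy_skip_rw_filters(bs);
    return bs != NULL && bs->allocated;
}

/* Walk from top down to, but excluding, base. */
static bool toy_allocated_above(Node *top, Node *base)
{
    for (Node *p = top; p != NULL && p != base; p = p->child) {
        if (toy_node_allocated(p)) {
            return true;
        }
    }
    return false;
}
```

With base taken from toy_backing_chain_next(), the walk still visits the
filter, which forwards base's allocation; with the composed form, the walk
stops before the filter and reports only what src itself holds.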

> Hmm. Now I wonder: why don't filters report everything as unallocated? Wouldn't that be more correct
> than falling through to the child?

I don’t know, actually.  Maybe, maybe not.

>>           } else {
>> -            ret = bdrv_block_status_above(blk_bs(s->src[src_cur]), NULL,
>> -                                          (sector_num - src_cur_offset) *
>> -                                          BDRV_SECTOR_SIZE,
>> -                                          count, &count, NULL, NULL);
>> +            base = NULL;
>>           }
>> +        ret = bdrv_block_status_above(src_bs, base,
>> +                                      (sector_num - src_cur_offset) *
>> +                                      BDRV_SECTOR_SIZE,
>> +                                      count, &count, NULL, NULL);
>>           if (ret < 0) {
>>               error_report("error while reading block status of sector %" PRId64
>>                            ": %s", sector_num, strerror(-ret));

[...]

>> @@ -2949,7 +2950,7 @@ static int img_map(int argc, char **argv)
>>       if (!blk) {
>>           return 1;
>>       }
>> -    bs = blk_bs(blk);
>> +    bs = bdrv_skip_implicit_filters(blk_bs(blk));
> 
> Hmm, another thought about implicit filters: how could they be here in qemu-img?

Hm, I don’t think they can.

> That is, if the only implicit filters are
> job filters. Did you prepare this for implicit COR? But we discussed with Kevin that we'd better deprecate
> the copy-on-read option...
> 
> So, if implicit filters exist only for compatibility, we'll have them only in block jobs. So there seems to be
> no reason to support them in qemu-img.

Seems reasonable, yes.

> Also, in block jobs, we could stop creating implicit filters above any filter nodes, and also stop creating any
> filter nodes above implicit filters. This would still support the old scenarios, but give a very simple and
> well-defined scope for using implicit filters and how to work with them. What do you think?

Hm, in what way would that make things simpler?

Max



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [Qemu-devel] [PATCH v5 33/42] blockdev: Fix active commit choice
  2019-06-19  9:31   ` Vladimir Sementsov-Ogievskiy
@ 2019-06-19 15:59     ` Max Reitz
  2019-06-21 13:26       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 113+ messages in thread
From: Max Reitz @ 2019-06-19 15:59 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block; +Cc: Kevin Wolf, qemu-devel


[-- Attachment #1.1: Type: text/plain, Size: 3530 bytes --]

On 19.06.19 11:31, Vladimir Sementsov-Ogievskiy wrote:
> 13.06.2019 1:09, Max Reitz wrote:
>> We have to perform an active commit whenever the top node has a parent
>> that has taken the WRITE permission on it.
>>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> ---
>>   blockdev.c | 24 +++++++++++++++++++++---
>>   1 file changed, 21 insertions(+), 3 deletions(-)
>>
>> diff --git a/blockdev.c b/blockdev.c
>> index a464cabf9e..5370f3b738 100644
>> --- a/blockdev.c
>> +++ b/blockdev.c
>> @@ -3294,6 +3294,7 @@ void qmp_block_commit(bool has_job_id, const char *job_id, const char *device,
>>        */
>>       BlockdevOnError on_error = BLOCKDEV_ON_ERROR_REPORT;
>>       int job_flags = JOB_DEFAULT;
>> +    uint64_t top_perm, top_shared;
>>   
>>       if (!has_speed) {
>>           speed = 0;
>> @@ -3406,14 +3407,31 @@ void qmp_block_commit(bool has_job_id, const char *job_id, const char *device,
>>           goto out;
>>       }
>>   
>> -    if (top_bs == bs) {
>> +    /*
>> +     * Active commit is required if and only if someone has taken a
>> +     * WRITE permission on the top node.  Historically, we have always
>> +     * used active commit for top nodes, so continue that practice.
>> +     * (Active commit is never really wrong.)
> 
> > Hmm, if we start an active commit when nobody has write access, then
> > we leave a possibility for someone to get this access during the commit.

Isn’t that blocked by the commit filter?  If it isn’t, it should be.

> And during
> a passive commit, write access is blocked. So maybe the right way is to always do an
> active commit? Benefits:
> 1. One code path. It shouldn't be worse when there are no writers: without guest writes,
> the mirror code shouldn't work worse than passive commit; if it does, it should be fixed.
> 2. Write access is possible if the user needs it during the commit.
> 3. I'm sure that active commit (mirror code) actually works faster, as it uses
> async requests and smarter handling of block status.

Disadvantage: Something may break because the basic commit job does not
emit BLOCK_JOB_READY and thus does not require block-job-complete.

Technically everything should expect jobs to potentially emit
BLOCK_JOB_READY, but who knows.  I think we’d want at least a
deprecation period.

Max

>> +     */
>> +    bdrv_get_cumulative_perm(top_bs, &top_perm, &top_shared);
>> +    if (top_perm & BLK_PERM_WRITE ||
>> +        bdrv_skip_rw_filters(top_bs) == bdrv_skip_rw_filters(bs))
>> +    {
>>           if (has_backing_file) {
>>               error_setg(errp, "'backing-file' specified,"
>>                                " but 'top' is the active layer");
>>               goto out;
>>           }
>> -        commit_active_start(has_job_id ? job_id : NULL, bs, base_bs,
>> -                            job_flags, speed, on_error,
>> +        if (!has_job_id) {
>> +            /*
>> +             * Emulate here what block_job_create() does, because it
>> +             * is possible that @bs != @top_bs (the block job should
>> +             * be named after @bs, even if @top_bs is the actual
>> +             * source)
>> +             */
>> +            job_id = bdrv_get_device_name(bs);
>> +        }
>> +        commit_active_start(job_id, top_bs, base_bs, job_flags, speed, on_error,
>>                               filter_node_name, NULL, NULL, false, &local_err);
>>       } else {
>>           BlockDriverState *overlay_bs = bdrv_find_overlay(bs, top_bs);
>>
> 
> 




^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [Qemu-devel] [PATCH v5 34/42] block: Inline bdrv_co_block_status_from_*()
  2019-06-19  9:34   ` Vladimir Sementsov-Ogievskiy
@ 2019-06-19 16:01     ` Max Reitz
  2019-06-19 16:07     ` Max Reitz
  1 sibling, 0 replies; 113+ messages in thread
From: Max Reitz @ 2019-06-19 16:01 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block; +Cc: Kevin Wolf, qemu-devel


[-- Attachment #1.1: Type: text/plain, Size: 2335 bytes --]

On 19.06.19 11:34, Vladimir Sementsov-Ogievskiy wrote:
> 13.06.2019 1:09, Max Reitz wrote:
>> With bdrv_filtered_rw_bs(), we can easily handle this default filter
>> behavior in bdrv_co_block_status().
>>
>> blkdebug wants to have an additional assertion, so it keeps its own
>> implementation, except bdrv_co_block_status_from_file() needs to be
>> inlined there.
>>
>> Suggested-by: Eric Blake <eblake@redhat.com>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> ---
>>   include/block/block_int.h | 22 -----------------
>>   block/blkdebug.c          |  7 ++++--
>>   block/blklogwrites.c      |  1 -
>>   block/commit.c            |  1 -
>>   block/copy-on-read.c      |  2 --
>>   block/io.c                | 51 +++++++++++++--------------------------
>>   block/mirror.c            |  1 -
>>   block/throttle.c          |  1 -
>>   8 files changed, 22 insertions(+), 64 deletions(-)

[...]

>> diff --git a/block/io.c b/block/io.c
>> index 14f99e1c00..0a832e30a3 100644
>> --- a/block/io.c
>> +++ b/block/io.c

[...]

>> @@ -2114,9 +2086,20 @@ static int coroutine_fn bdrv_co_block_status(BlockDriverState *bs,
>>       aligned_offset = QEMU_ALIGN_DOWN(offset, align);
>>       aligned_bytes = ROUND_UP(offset + bytes, align) - aligned_offset;
>>   
>> -    ret = bs->drv->bdrv_co_block_status(bs, want_zero, aligned_offset,
>> -                                        aligned_bytes, pnum, &local_map,
>> -                                        &local_file);
>> +    if (bs->drv->bdrv_co_block_status) {
>> +        ret = bs->drv->bdrv_co_block_status(bs, want_zero, aligned_offset,
>> +                                            aligned_bytes, pnum, &local_map,
>> +                                            &local_file);
>> +    } else {
>> +        /* Default code for filters */
>> +
>> +        local_file = bdrv_filtered_rw_bs(bs);
>> +        assert(local_file);
>> +
>> +        *pnum = aligned_bytes;
>> +        local_map = aligned_offset;
>> +        ret = BDRV_BLOCK_RAW | BDRV_BLOCK_OFFSET_VALID;
> 
> I'm now in a little doubt:
> 
> What is the real difference between RAW for filters and UNALLOCATED for qcow2 (when we
> should look at the backing file)?

Maybe none, but I don’t think diving down that rabbit hole is going to
make this series shorter.

Max



^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [Qemu-devel] [PATCH v5 34/42] block: Inline bdrv_co_block_status_from_*()
  2019-06-19  9:34   ` Vladimir Sementsov-Ogievskiy
  2019-06-19 16:01     ` Max Reitz
@ 2019-06-19 16:07     ` Max Reitz
  1 sibling, 0 replies; 113+ messages in thread
From: Max Reitz @ 2019-06-19 16:07 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block; +Cc: Kevin Wolf, qemu-devel


[-- Attachment #1.1: Type: text/plain, Size: 2336 bytes --]

On 19.06.19 11:34, Vladimir Sementsov-Ogievskiy wrote:
> 13.06.2019 1:09, Max Reitz wrote:
>> With bdrv_filtered_rw_bs(), we can easily handle this default filter
>> behavior in bdrv_co_block_status().
>>
>> blkdebug wants to have an additional assertion, so it keeps its own
>> implementation, except bdrv_co_block_status_from_file() needs to be
>> inlined there.
>>
>> Suggested-by: Eric Blake <eblake@redhat.com>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> ---
>>   include/block/block_int.h | 22 -----------------
>>   block/blkdebug.c          |  7 ++++--
>>   block/blklogwrites.c      |  1 -
>>   block/commit.c            |  1 -
>>   block/copy-on-read.c      |  2 --
>>   block/io.c                | 51 +++++++++++++--------------------------
>>   block/mirror.c            |  1 -
>>   block/throttle.c          |  1 -
>>   8 files changed, 22 insertions(+), 64 deletions(-)

[...]

>> diff --git a/block/io.c b/block/io.c
>> index 14f99e1c00..0a832e30a3 100644
>> --- a/block/io.c
>> +++ b/block/io.c

[...]

>> @@ -2114,9 +2086,20 @@ static int coroutine_fn bdrv_co_block_status(BlockDriverState *bs,
>>       aligned_offset = QEMU_ALIGN_DOWN(offset, align);
>>       aligned_bytes = ROUND_UP(offset + bytes, align) - aligned_offset;
>>   
>> -    ret = bs->drv->bdrv_co_block_status(bs, want_zero, aligned_offset,
>> -                                        aligned_bytes, pnum, &local_map,
>> -                                        &local_file);
>> +    if (bs->drv->bdrv_co_block_status) {
>> +        ret = bs->drv->bdrv_co_block_status(bs, want_zero, aligned_offset,
>> +                                            aligned_bytes, pnum, &local_map,
>> +                                            &local_file);
>> +    } else {
>> +        /* Default code for filters */
>> +
>> +        local_file = bdrv_filtered_rw_bs(bs);
>> +        assert(local_file);
>> +
>> +        *pnum = aligned_bytes;
>> +        local_map = aligned_offset;
>> +        ret = BDRV_BLOCK_RAW | BDRV_BLOCK_OFFSET_VALID;
> 
> I'm now in a little doubt:
> 
> What is the real difference between RAW for filters and UNALLOCATED for qcow2 (when we
> should look at the backing file)?

Maybe none, but I don’t think diving down that rabbit hole is going to
make this series shorter.

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [Qemu-devel] [PATCH v5 30/42] qemu-img: Use child access functions
  2019-06-19 15:49     ` Max Reitz
@ 2019-06-21 13:15       ` Vladimir Sementsov-Ogievskiy
  2019-07-24  9:54         ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-06-21 13:15 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

19.06.2019 18:49, Max Reitz wrote:
> On 19.06.19 11:18, Vladimir Sementsov-Ogievskiy wrote:
>> 13.06.2019 1:09, Max Reitz wrote:
>>> This changes iotest 204's output, because blkdebug on top of a COW node
>>> used to make qemu-img map disregard the rest of the backing chain (the
>>> backing chain was broken by the filter).  With this patch, the
>>> allocation in the base image is reported correctly.
>>>
>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>>> ---
>>>    qemu-img.c                 | 36 ++++++++++++++++++++----------------
>>>    tests/qemu-iotests/204.out |  1 +
>>>    2 files changed, 21 insertions(+), 16 deletions(-)
>>>
>>> diff --git a/qemu-img.c b/qemu-img.c
>>> index 07b6e2a808..7bfa6e5d40 100644
>>> --- a/qemu-img.c
>>> +++ b/qemu-img.c
>>> @@ -992,7 +992,7 @@ static int img_commit(int argc, char **argv)
>>>        if (!blk) {
>>>            return 1;
>>>        }
>>> -    bs = blk_bs(blk);
>>> +    bs = bdrv_skip_implicit_filters(blk_bs(blk));
>>
>> if filename is json, describing explicit filter over normal node, bs will be
>> explicit filter ...
>>
>>>    
>>>        qemu_progress_init(progress, 1.f);
>>>        qemu_progress_print(0.f, 100);
>>> @@ -1009,7 +1009,7 @@ static int img_commit(int argc, char **argv)
>>>            /* This is different from QMP, which by default uses the deepest file in
>>>             * the backing chain (i.e., the very base); however, the traditional
>>>             * behavior of qemu-img commit is using the immediate backing file. */
>>> -        base_bs = backing_bs(bs);
>>> +        base_bs = bdrv_filtered_cow_bs(bs);
>>>            if (!base_bs) {
>>
>> and here we'll fail.
> 
> Right, will change to bdrv_backing_chain_next().
> 
>>>                error_setg(&local_err, "Image does not have a backing file");
>>>                goto done;
>>> @@ -1626,19 +1626,18 @@ static int convert_iteration_sectors(ImgConvertState *s, int64_t sector_num)
>>>    
>>>        if (s->sector_next_status <= sector_num) {
>>>            int64_t count = n * BDRV_SECTOR_SIZE;
>>> +        BlockDriverState *src_bs = blk_bs(s->src[src_cur]);
>>> +        BlockDriverState *base;
>>>    
>>>            if (s->target_has_backing) {
>>> -
>>> -            ret = bdrv_block_status(blk_bs(s->src[src_cur]),
>>> -                                    (sector_num - src_cur_offset) *
>>> -                                    BDRV_SECTOR_SIZE,
>>> -                                    count, &count, NULL, NULL);
>>> +            base = bdrv_backing_chain_next(src_bs);
>>
>> As you described in another patch, will not we here get allocated in base as allocated, because of
>> counting filters above base?
> 
> Damn, yes.  So
> 
>      base = bdrv_filtered_cow_bs(bdrv_skip_rw_filters(src_bs));
> 
> I suppose.
> 
>> Hmm. I now think, why filters don't report everything as unallocated, would not it be more correct
>> than fallthrough to child?
> 
> I don’t know, actually.  Maybe, maybe not.
> 
>>>            } else {
>>> -            ret = bdrv_block_status_above(blk_bs(s->src[src_cur]), NULL,
>>> -                                          (sector_num - src_cur_offset) *
>>> -                                          BDRV_SECTOR_SIZE,
>>> -                                          count, &count, NULL, NULL);
>>> +            base = NULL;
>>>            }
>>> +        ret = bdrv_block_status_above(src_bs, base,
>>> +                                      (sector_num - src_cur_offset) *
>>> +                                      BDRV_SECTOR_SIZE,
>>> +                                      count, &count, NULL, NULL);
>>>            if (ret < 0) {
>>>                error_report("error while reading block status of sector %" PRId64
>>>                             ": %s", sector_num, strerror(-ret));
> 
> [...]
> 
>>> @@ -2949,7 +2950,7 @@ static int img_map(int argc, char **argv)
>>>        if (!blk) {
>>>            return 1;
>>>        }
>>> -    bs = blk_bs(blk);
>>> +    bs = bdrv_skip_implicit_filters(blk_bs(blk));
>>
>> Hmm, another thought about implicit filters, how they could be here in qemu-img?
> 
> Hm, I don’t think they can.
> 
>> If implicit are only
>> job filters. Do you prepared it for implicit COR? But we discussed with Kevin that we'd better deprecate
>> copy-on-read option..
>>
>> So, if implicit filters are for compatibility, we'll have them only in block-jobs. So, seems no reason to support
>> them in qemu-img.
> 
> Seems reasonable, yes.
> 
>> Also, in block-jobs, we can abandon creating implicit filters above any filter nodes, as well as abandon creating any
>> filter nodes above implicit filters. This will still support old scenarios, but gives very simple and well defined scope
>> of using implicit filters and how to work with them. What do you think?
> 
> Hm, in what way would that make things simpler?
> 

This question was on my mind while I was finishing this paragraph) At least such a restriction answers the question of where
new filters should be added: above or below implicit filters. With such a restriction we can always have only one implicit filter
over a non-filter node, and above it there should be an explicit filter or a non-filter node. But this needs a huge amount of work for a small
benefit, so forget it)
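
To make the two candidate choices of "base" quoted above easier to compare, here is a minimal sketch on a toy graph. Everything in it is an assumption for illustration: Node is not the real BlockDriverState, and skip_rw_filters(), filtered_cow() and backing_chain_next() only mimic what I understand bdrv_skip_rw_filters(), bdrv_filtered_cow_bs() and bdrv_backing_chain_next() to do in this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for BlockDriverState; NOT the real QEMU structure. */
typedef struct Node {
    const char *name;
    bool is_filter;         /* true for throttle- or blkdebug-like nodes */
    struct Node *filtered;  /* child that a filter passes R/W through to */
    struct Node *cow;       /* COW backing child of a format node */
} Node;

/* Walk down through R/W filters to the first non-filter node (or NULL). */
static Node *skip_rw_filters(Node *n)
{
    while (n && n->is_filter) {
        n = n->filtered;
    }
    return n;
}

/* Return the direct COW child, which may itself be a filter, or NULL. */
static Node *filtered_cow(Node *n)
{
    return n ? n->cow : NULL;
}

/*
 * Assumed semantics of "next node in the backing chain": skip filters on
 * top, follow one COW link, then skip filters again.
 */
static Node *backing_chain_next(Node *n)
{
    return skip_rw_filters(filtered_cow(skip_rw_filters(n)));
}
```

For a chain top-filter -> top -> mid-filter -> base, filtered_cow(skip_rw_filters(top-filter)) stops at mid-filter, while backing_chain_next(top-filter) goes on to base. In this model that is why the second form leaves the filter above base inside the range that a block-status query "above base" would inspect.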


-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [Qemu-devel] [PATCH v5 33/42] blockdev: Fix active commit choice
  2019-06-19 15:59     ` Max Reitz
@ 2019-06-21 13:26       ` Vladimir Sementsov-Ogievskiy
  2019-07-24  9:54         ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-06-21 13:26 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

19.06.2019 18:59, Max Reitz wrote:
> On 19.06.19 11:31, Vladimir Sementsov-Ogievskiy wrote:
>> 13.06.2019 1:09, Max Reitz wrote:
>>> We have to perform an active commit whenever the top node has a parent
>>> that has taken the WRITE permission on it.
>>>
>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>>> ---
>>>    blockdev.c | 24 +++++++++++++++++++++---
>>>    1 file changed, 21 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/blockdev.c b/blockdev.c
>>> index a464cabf9e..5370f3b738 100644
>>> --- a/blockdev.c
>>> +++ b/blockdev.c
>>> @@ -3294,6 +3294,7 @@ void qmp_block_commit(bool has_job_id, const char *job_id, const char *device,
>>>         */
>>>        BlockdevOnError on_error = BLOCKDEV_ON_ERROR_REPORT;
>>>        int job_flags = JOB_DEFAULT;
>>> +    uint64_t top_perm, top_shared;
>>>    
>>>        if (!has_speed) {
>>>            speed = 0;
>>> @@ -3406,14 +3407,31 @@ void qmp_block_commit(bool has_job_id, const char *job_id, const char *device,
>>>            goto out;
>>>        }
>>>    
>>> -    if (top_bs == bs) {
>>> +    /*
>>> +     * Active commit is required if and only if someone has taken a
>>> +     * WRITE permission on the top node.  Historically, we have always
>>> +     * used active commit for top nodes, so continue that practice.
>>> +     * (Active commit is never really wrong.)
>>
>> Hmm, if we start active commit when nobody has write access, than
>> we leave a possibility to someone to get this access during commit.
> 
> Isn’t that blocked by the commit filter?  If it isn’t, it should be.
> 
>> And during
>> passive commit write access is blocked. So, may be right way is do active commit
>> always? Benefits:
>> 1. One code path. and it shouldn't be worse when no writers, without guest writes
>> mirror code shouldn't work worse than passive commit, if it is, it should be fixed.
>> 2. Possibility of write access if user needs it during commit
>> 3. I'm sure that active commit (mirror code) actually works faster, as it uses
>> async requests and smarter handling of block status.
> 
> Disadvantage: Something may break because the basic commit job does not
> emit BLOCK_JOB_READY and thus does not require block-job-complete.
> 
> Technically everything should expect jobs to potentially emit
> BLOCK_JOB_READY, but who knows.  I think we’d want at least a
> deprecation period.
> 
> Max

OK, so this is for the future. Then:

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
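
For reference, the condition in the hunk quoted below can be sketched on a toy model. This is only an illustration under stated assumptions: Node, PERM_WRITE and needs_active_commit() are hypothetical names, cumulative_perm stands in for what bdrv_get_cumulative_perm() returns, and the bit value chosen for PERM_WRITE is arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative bit only; stands in for BLK_PERM_WRITE. */
#define PERM_WRITE (UINT64_C(1) << 1)

/* Toy stand-in for BlockDriverState; NOT the real QEMU structure. */
typedef struct Node {
    bool is_filter;            /* true for R/W filter nodes */
    struct Node *filtered;     /* child that a filter passes through to */
    uint64_t cumulative_perm;  /* union of permissions taken by all parents */
} Node;

/* Walk down through R/W filters to the first non-filter node (or NULL). */
static Node *skip_rw_filters(Node *n)
{
    while (n && n->is_filter) {
        n = n->filtered;
    }
    return n;
}

/*
 * Active commit is chosen if some parent holds WRITE on the top node, or
 * if top_bs and bs are the same node once filters are skipped.
 */
static bool needs_active_commit(Node *top_bs, Node *bs)
{
    assert(top_bs && bs);
    return (top_bs->cumulative_perm & PERM_WRITE) ||
           skip_rw_filters(top_bs) == skip_rw_filters(bs);
}
```

With a throttle-like filter sitting on the active layer, the second operand still selects active commit even when nobody has taken WRITE on top_bs itself, which matches the "historically, we have always used active commit for top nodes" part of the comment.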

> 
>>> +     */
>>> +    bdrv_get_cumulative_perm(top_bs, &top_perm, &top_shared);
>>> +    if (top_perm & BLK_PERM_WRITE ||
>>> +        bdrv_skip_rw_filters(top_bs) == bdrv_skip_rw_filters(bs))
>>> +    {
>>>            if (has_backing_file) {
>>>                error_setg(errp, "'backing-file' specified,"
>>>                                 " but 'top' is the active layer");
>>>                goto out;
>>>            }
>>> -        commit_active_start(has_job_id ? job_id : NULL, bs, base_bs,
>>> -                            job_flags, speed, on_error,
>>> +        if (!has_job_id) {
>>> +            /*
>>> +             * Emulate here what block_job_create() does, because it
>>> +             * is possible that @bs != @top_bs (the block job should
>>> +             * be named after @bs, even if @top_bs is the actual
>>> +             * source)
>>> +             */
>>> +            job_id = bdrv_get_device_name(bs);
>>> +        }
>>> +        commit_active_start(job_id, top_bs, base_bs, job_flags, speed, on_error,
>>>                                filter_node_name, NULL, NULL, false, &local_err);
>>>        } else {
>>>            BlockDriverState *overlay_bs = bdrv_find_overlay(bs, top_bs);
>>>
>>
>>
> 
> 


-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [Qemu-devel] [PATCH v5 34/42] block: Inline bdrv_co_block_status_from_*()
  2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 34/42] block: Inline bdrv_co_block_status_from_*() Max Reitz
  2019-06-19  9:34   ` Vladimir Sementsov-Ogievskiy
@ 2019-06-21 13:39   ` Vladimir Sementsov-Ogievskiy
  2019-07-24  9:54     ` Vladimir Sementsov-Ogievskiy
  1 sibling, 1 reply; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-06-21 13:39 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

13.06.2019 1:09, Max Reitz wrote:
> With bdrv_filtered_rw_bs(), we can easily handle this default filter
> behavior in bdrv_co_block_status().
> 
> blkdebug wants to have an additional assertion, so it keeps its own
> implementation, except bdrv_co_block_status_from_file() needs to be
> inlined there.
> 
> Suggested-by: Eric Blake<eblake@redhat.com>
> Signed-off-by: Max Reitz<mreitz@redhat.com>

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [Qemu-devel] [PATCH v5 30/42] qemu-img: Use child access functions
  2019-06-21 13:15       ` Vladimir Sementsov-Ogievskiy
@ 2019-07-24  9:54         ` Vladimir Sementsov-Ogievskiy
  2019-07-25 16:34           ` Max Reitz
  0 siblings, 1 reply; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-07-24  9:54 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

21.06.2019 16:15, Vladimir Sementsov-Ogievskiy wrote:
> 19.06.2019 18:49, Max Reitz wrote:
>> On 19.06.19 11:18, Vladimir Sementsov-Ogievskiy wrote:
>>> 13.06.2019 1:09, Max Reitz wrote:
>>>> This changes iotest 204's output, because blkdebug on top of a COW node
>>>> used to make qemu-img map disregard the rest of the backing chain (the
>>>> backing chain was broken by the filter).  With this patch, the
>>>> allocation in the base image is reported correctly.
>>>>
>>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>>>> ---
>>>>    qemu-img.c                 | 36 ++++++++++++++++++++----------------
>>>>    tests/qemu-iotests/204.out |  1 +
>>>>    2 files changed, 21 insertions(+), 16 deletions(-)
>>>>
>>>> diff --git a/qemu-img.c b/qemu-img.c
>>>> index 07b6e2a808..7bfa6e5d40 100644
>>>> --- a/qemu-img.c
>>>> +++ b/qemu-img.c
>>>> @@ -992,7 +992,7 @@ static int img_commit(int argc, char **argv)
>>>>        if (!blk) {
>>>>            return 1;
>>>>        }
>>>> -    bs = blk_bs(blk);
>>>> +    bs = bdrv_skip_implicit_filters(blk_bs(blk));
>>>
>>> if filename is json, describing explicit filter over normal node, bs will be
>>> explicit filter ...
>>>
>>>>        qemu_progress_init(progress, 1.f);
>>>>        qemu_progress_print(0.f, 100);
>>>> @@ -1009,7 +1009,7 @@ static int img_commit(int argc, char **argv)
>>>>            /* This is different from QMP, which by default uses the deepest file in
>>>>             * the backing chain (i.e., the very base); however, the traditional
>>>>             * behavior of qemu-img commit is using the immediate backing file. */
>>>> -        base_bs = backing_bs(bs);
>>>> +        base_bs = bdrv_filtered_cow_bs(bs);
>>>>            if (!base_bs) {
>>>
>>> and here we'll fail.
>>
>> Right, will change to bdrv_backing_chain_next().
>>
>>>>                error_setg(&local_err, "Image does not have a backing file");
>>>>                goto done;
>>>> @@ -1626,19 +1626,18 @@ static int convert_iteration_sectors(ImgConvertState *s, int64_t sector_num)
>>>>        if (s->sector_next_status <= sector_num) {
>>>>            int64_t count = n * BDRV_SECTOR_SIZE;
>>>> +        BlockDriverState *src_bs = blk_bs(s->src[src_cur]);
>>>> +        BlockDriverState *base;
>>>>            if (s->target_has_backing) {
>>>> -
>>>> -            ret = bdrv_block_status(blk_bs(s->src[src_cur]),
>>>> -                                    (sector_num - src_cur_offset) *
>>>> -                                    BDRV_SECTOR_SIZE,
>>>> -                                    count, &count, NULL, NULL);
>>>> +            base = bdrv_backing_chain_next(src_bs);
>>>
>>> As you described in another patch, will not we here get allocated in base as allocated, because of
>>> counting filters above base?
>>
>> Damn, yes.  So
>>
>>      base = bdrv_filtered_cow_bs(bdrv_skip_rw_filters(src_bs));
>>
>> I suppose.
>>
>>> Hmm. I now think, why filters don't report everything as unallocated, would not it be more correct
>>> than fallthrough to child?
>>
>> I don’t know, actually.  Maybe, maybe not.
>>
>>>>            } else {
>>>> -            ret = bdrv_block_status_above(blk_bs(s->src[src_cur]), NULL,
>>>> -                                          (sector_num - src_cur_offset) *
>>>> -                                          BDRV_SECTOR_SIZE,
>>>> -                                          count, &count, NULL, NULL);
>>>> +            base = NULL;
>>>>            }
>>>> +        ret = bdrv_block_status_above(src_bs, base,
>>>> +                                      (sector_num - src_cur_offset) *
>>>> +                                      BDRV_SECTOR_SIZE,
>>>> +                                      count, &count, NULL, NULL);
>>>>            if (ret < 0) {
>>>>                error_report("error while reading block status of sector %" PRId64
>>>>                             ": %s", sector_num, strerror(-ret));
>>
>> [...]
>>
>>>> @@ -2949,7 +2950,7 @@ static int img_map(int argc, char **argv)
>>>>        if (!blk) {
>>>>            return 1;
>>>>        }
>>>> -    bs = blk_bs(blk);
>>>> +    bs = bdrv_skip_implicit_filters(blk_bs(blk));
>>>
>>> Hmm, another thought about implicit filters, how they could be here in qemu-img?
>>
>> Hm, I don’t think they can.
>>
>>> If implicit are only
>>> job filters. Do you prepared it for implicit COR? But we discussed with Kevin that we'd better deprecate
>>> copy-on-read option..
>>>
>>> So, if implicit filters are for compatibility, we'll have them only in block-jobs. So, seems no reason to support
>>> them in qemu-img.
>>
>> Seems reasonable, yes.
>>
>>> Also, in block-jobs, we can abandon creating implicit filters above any filter nodes, as well as abandon creating any
>>> filter nodes above implicit filters. This will still support old scenarios, but gives very simple and well defined scope
>>> of using implicit filters and how to work with them. What do you think?
>>
>> Hm, in what way would that make things simpler?
>>
> 
> This question was on my mind while I was finishing this paragraph) At least such a restriction answers the question of where
> new filters should be added: above or below implicit filters. With such a restriction we can always have only one implicit filter
> over a non-filter node, and above it there should be an explicit filter or a non-filter node. But this needs a huge amount of work for a small
> benefit, so forget it)
> 
> 

Strange, this mail was automatically returned to me. Did you receive it?

-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [Qemu-devel] [PATCH v5 33/42] blockdev: Fix active commit choice
  2019-06-21 13:26       ` Vladimir Sementsov-Ogievskiy
@ 2019-07-24  9:54         ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-07-24  9:54 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

21.06.2019 16:26, Vladimir Sementsov-Ogievskiy wrote:
> 19.06.2019 18:59, Max Reitz wrote:
>> On 19.06.19 11:31, Vladimir Sementsov-Ogievskiy wrote:
>>> 13.06.2019 1:09, Max Reitz wrote:
>>>> We have to perform an active commit whenever the top node has a parent
>>>> that has taken the WRITE permission on it.
>>>>
>>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>>>> ---
>>>>    blockdev.c | 24 +++++++++++++++++++++---
>>>>    1 file changed, 21 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/blockdev.c b/blockdev.c
>>>> index a464cabf9e..5370f3b738 100644
>>>> --- a/blockdev.c
>>>> +++ b/blockdev.c
>>>> @@ -3294,6 +3294,7 @@ void qmp_block_commit(bool has_job_id, const char *job_id, const char *device,
>>>>         */
>>>>        BlockdevOnError on_error = BLOCKDEV_ON_ERROR_REPORT;
>>>>        int job_flags = JOB_DEFAULT;
>>>> +    uint64_t top_perm, top_shared;
>>>>        if (!has_speed) {
>>>>            speed = 0;
>>>> @@ -3406,14 +3407,31 @@ void qmp_block_commit(bool has_job_id, const char *job_id, const char *device,
>>>>            goto out;
>>>>        }
>>>> -    if (top_bs == bs) {
>>>> +    /*
>>>> +     * Active commit is required if and only if someone has taken a
>>>> +     * WRITE permission on the top node.  Historically, we have always
>>>> +     * used active commit for top nodes, so continue that practice.
>>>> +     * (Active commit is never really wrong.)
>>>
>>> Hmm, if we start active commit when nobody has write access, than
>>> we leave a possibility to someone to get this access during commit.
>>
>> Isn’t that blocked by the commit filter?  If it isn’t, it should be.
>>
>>> And during
>>> passive commit write access is blocked. So, may be right way is do active commit
>>> always? Benefits:
>>> 1. One code path. and it shouldn't be worse when no writers, without guest writes
>>> mirror code shouldn't work worse than passive commit, if it is, it should be fixed.
>>> 2. Possibility of write access if user needs it during commit
>>> 3. I'm sure that active commit (mirror code) actually works faster, as it uses
>>> async requests and smarter handling of block status.
>>
>> Disadvantage: Something may break because the basic commit job does not
>> emit BLOCK_JOB_READY and thus does not require block-job-complete.
>>
>> Technically everything should expect jobs to potentially emit
>> BLOCK_JOB_READY, but who knows.  I think we’d want at least a
>> deprecation period.
>>
>> Max
> 
> OK, so this is for the future. Then:
> 
> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

Strange, this mail was automatically returned to me. Did you receive it?

> 
>>
>>>> +     */
>>>> +    bdrv_get_cumulative_perm(top_bs, &top_perm, &top_shared);
>>>> +    if (top_perm & BLK_PERM_WRITE ||
>>>> +        bdrv_skip_rw_filters(top_bs) == bdrv_skip_rw_filters(bs))
>>>> +    {
>>>>            if (has_backing_file) {
>>>>                error_setg(errp, "'backing-file' specified,"
>>>>                                 " but 'top' is the active layer");
>>>>                goto out;
>>>>            }
>>>> -        commit_active_start(has_job_id ? job_id : NULL, bs, base_bs,
>>>> -                            job_flags, speed, on_error,
>>>> +        if (!has_job_id) {
>>>> +            /*
>>>> +             * Emulate here what block_job_create() does, because it
>>>> +             * is possible that @bs != @top_bs (the block job should
>>>> +             * be named after @bs, even if @top_bs is the actual
>>>> +             * source)
>>>> +             */
>>>> +            job_id = bdrv_get_device_name(bs);
>>>> +        }
>>>> +        commit_active_start(job_id, top_bs, base_bs, job_flags, speed, on_error,
>>>>                                filter_node_name, NULL, NULL, false, &local_err);
>>>>        } else {
>>>>            BlockDriverState *overlay_bs = bdrv_find_overlay(bs, top_bs);
>>>>
>>>
>>>
>>
>>
> 
> 


-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [Qemu-devel] [PATCH v5 34/42] block: Inline bdrv_co_block_status_from_*()
  2019-06-21 13:39   ` Vladimir Sementsov-Ogievskiy
@ 2019-07-24  9:54     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-07-24  9:54 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

21.06.2019 16:39, Vladimir Sementsov-Ogievskiy wrote:
> 13.06.2019 1:09, Max Reitz wrote:
>> With bdrv_filtered_rw_bs(), we can easily handle this default filter
>> behavior in bdrv_co_block_status().
>>
>> blkdebug wants to have an additional assertion, so it keeps its own
>> implementation, except bdrv_co_block_status_from_file() needs to be
>> inlined there.
>>
>> Suggested-by: Eric Blake<eblake@redhat.com>
>> Signed-off-by: Max Reitz<mreitz@redhat.com>
> 
> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> 

Strange, this mail was automatically returned to me. Did you receive it?


-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [Qemu-devel] [PATCH v5 30/42] qemu-img: Use child access functions
  2019-07-24  9:54         ` Vladimir Sementsov-Ogievskiy
@ 2019-07-25 16:34           ` Max Reitz
  2019-07-26 13:44             ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 113+ messages in thread
From: Max Reitz @ 2019-07-25 16:34 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block; +Cc: Kevin Wolf, qemu-devel


[-- Attachment #1.1: Type: text/plain, Size: 6672 bytes --]

On 24.07.19 11:54, Vladimir Sementsov-Ogievskiy wrote:
> 21.06.2019 16:15, Vladimir Sementsov-Ogievskiy wrote:
>> 19.06.2019 18:49, Max Reitz wrote:
>>> On 19.06.19 11:18, Vladimir Sementsov-Ogievskiy wrote:
>>>> 13.06.2019 1:09, Max Reitz wrote:
>>>>> This changes iotest 204's output, because blkdebug on top of a COW node
>>>>> used to make qemu-img map disregard the rest of the backing chain (the
>>>>> backing chain was broken by the filter).  With this patch, the
>>>>> allocation in the base image is reported correctly.
>>>>>
>>>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>>>>> ---
>>>>>    qemu-img.c                 | 36 ++++++++++++++++++++----------------
>>>>>    tests/qemu-iotests/204.out |  1 +
>>>>>    2 files changed, 21 insertions(+), 16 deletions(-)
>>>>>
>>>>> diff --git a/qemu-img.c b/qemu-img.c
>>>>> index 07b6e2a808..7bfa6e5d40 100644
>>>>> --- a/qemu-img.c
>>>>> +++ b/qemu-img.c
>>>>> @@ -992,7 +992,7 @@ static int img_commit(int argc, char **argv)
>>>>>        if (!blk) {
>>>>>            return 1;
>>>>>        }
>>>>> -    bs = blk_bs(blk);
>>>>> +    bs = bdrv_skip_implicit_filters(blk_bs(blk));
>>>>
>>>> if filename is json, describing explicit filter over normal node, bs will be
>>>> explicit filter ...
>>>>
>>>>>        qemu_progress_init(progress, 1.f);
>>>>>        qemu_progress_print(0.f, 100);
>>>>> @@ -1009,7 +1009,7 @@ static int img_commit(int argc, char **argv)
>>>>>            /* This is different from QMP, which by default uses the deepest file in
>>>>>             * the backing chain (i.e., the very base); however, the traditional
>>>>>             * behavior of qemu-img commit is using the immediate backing file. */
>>>>> -        base_bs = backing_bs(bs);
>>>>> +        base_bs = bdrv_filtered_cow_bs(bs);
>>>>>            if (!base_bs) {
>>>>
>>>> and here we'll fail.
>>>
>>> Right, will change to bdrv_backing_chain_next().
>>>
>>>>>                error_setg(&local_err, "Image does not have a backing file");
>>>>>                goto done;
>>>>> @@ -1626,19 +1626,18 @@ static int convert_iteration_sectors(ImgConvertState *s, int64_t sector_num)
>>>>>        if (s->sector_next_status <= sector_num) {
>>>>>            int64_t count = n * BDRV_SECTOR_SIZE;
>>>>> +        BlockDriverState *src_bs = blk_bs(s->src[src_cur]);
>>>>> +        BlockDriverState *base;
>>>>>            if (s->target_has_backing) {
>>>>> -
>>>>> -            ret = bdrv_block_status(blk_bs(s->src[src_cur]),
>>>>> -                                    (sector_num - src_cur_offset) *
>>>>> -                                    BDRV_SECTOR_SIZE,
>>>>> -                                    count, &count, NULL, NULL);
>>>>> +            base = bdrv_backing_chain_next(src_bs);
>>>>
>>>> As you described in another patch, will not we here get allocated in base as allocated, because of
>>>> counting filters above base?
>>>
>>> Damn, yes.  So
>>>
>>>      base = bdrv_filtered_cow_bs(bdrv_skip_rw_filters(src_bs));
>>>
>>> I suppose.
>>>
>>>> Hmm. I now think, why filters don't report everything as unallocated, would not it be more correct
>>>> than fallthrough to child?
>>>
>>> I don’t know, actually.  Maybe, maybe not.
>>>
>>>>>            } else {
>>>>> -            ret = bdrv_block_status_above(blk_bs(s->src[src_cur]), NULL,
>>>>> -                                          (sector_num - src_cur_offset) *
>>>>> -                                          BDRV_SECTOR_SIZE,
>>>>> -                                          count, &count, NULL, NULL);
>>>>> +            base = NULL;
>>>>>            }
>>>>> +        ret = bdrv_block_status_above(src_bs, base,
>>>>> +                                      (sector_num - src_cur_offset) *
>>>>> +                                      BDRV_SECTOR_SIZE,
>>>>> +                                      count, &count, NULL, NULL);
>>>>>            if (ret < 0) {
>>>>>                error_report("error while reading block status of sector %" PRId64
>>>>>                             ": %s", sector_num, strerror(-ret));
>>>
>>> [...]
>>>
>>>>> @@ -2949,7 +2950,7 @@ static int img_map(int argc, char **argv)
>>>>>        if (!blk) {
>>>>>            return 1;
>>>>>        }
>>>>> -    bs = blk_bs(blk);
>>>>> +    bs = bdrv_skip_implicit_filters(blk_bs(blk));
>>>>
>>>> Hmm, another thought about implicit filters, how they could be here in qemu-img?
>>>
>>> Hm, I don’t think they can.
>>>
>>>> If implicit are only
>>>> job filters. Do you prepared it for implicit COR? But we discussed with Kevin that we'd better deprecate
>>>> copy-on-read option..
>>>>
>>>> So, if implicit filters are for compatibility, we'll have them only in block-jobs. So, seems no reason to support
>>>> them in qemu-img.
>>>
>>> Seems reasonable, yes.
>>>
>>>> Also, in block-jobs, we can abandon creating implicit filters above any filter nodes, as well as abandon creating any
>>>> filter nodes above implicit filters. This will still support old scenarios, but gives very simple and well defined scope
>>>> of using implicit filters and how to work with them. What do you think?
>>>
>>> Hm, in what way would that make things simpler?
>>>
>>
>> This question was on my mind while I was finishing this paragraph) At least such a restriction answers the question of where
>> new filters should be added: above or below implicit filters. With such a restriction we can always have only one implicit filter
>> over a non-filter node, and above it there should be an explicit filter or a non-filter node. But this needs a huge amount of work for a small
>> benefit, so forget it)

OK.  I should have read the last part first, then I could have replied
sooner. :-)

> Strange, this mail was automatically returned to me. Did you receive it?

No, I didn’t.  (Nor any of the other mails you resent.)  Weird.

Also, welcome back, congratulations, and all the best to your family! :-)

Max


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [Qemu-devel] [PATCH v5 30/42] qemu-img: Use child access functions
  2019-07-25 16:34           ` Max Reitz
@ 2019-07-26 13:44             ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 113+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2019-07-26 13:44 UTC (permalink / raw)
  To: Max Reitz, qemu-block; +Cc: Kevin Wolf, qemu-devel

25.07.2019 19:34, Max Reitz wrote:
> On 24.07.19 11:54, Vladimir Sementsov-Ogievskiy wrote:
>> 21.06.2019 16:15, Vladimir Sementsov-Ogievskiy wrote:
>>> 19.06.2019 18:49, Max Reitz wrote:
>>>> On 19.06.19 11:18, Vladimir Sementsov-Ogievskiy wrote:
>>>>> 13.06.2019 1:09, Max Reitz wrote:
>>>>>> This changes iotest 204's output, because blkdebug on top of a COW node
>>>>>> used to make qemu-img map disregard the rest of the backing chain (the
>>>>>> backing chain was broken by the filter).  With this patch, the
>>>>>> allocation in the base image is reported correctly.
>>>>>>
>>>>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>>>>>> ---
>>>>>>     qemu-img.c                 | 36 ++++++++++++++++++++----------------
>>>>>>     tests/qemu-iotests/204.out |  1 +
>>>>>>     2 files changed, 21 insertions(+), 16 deletions(-)
>>>>>>
>>>>>> diff --git a/qemu-img.c b/qemu-img.c
>>>>>> index 07b6e2a808..7bfa6e5d40 100644
>>>>>> --- a/qemu-img.c
>>>>>> +++ b/qemu-img.c
>>>>>> @@ -992,7 +992,7 @@ static int img_commit(int argc, char **argv)
>>>>>>         if (!blk) {
>>>>>>             return 1;
>>>>>>         }
>>>>>> -    bs = blk_bs(blk);
>>>>>> +    bs = bdrv_skip_implicit_filters(blk_bs(blk));
>>>>>
>>>>> If filename is JSON describing an explicit filter over a normal node, bs will be
>>>>> the explicit filter ...
>>>>>
>>>>>>         qemu_progress_init(progress, 1.f);
>>>>>>         qemu_progress_print(0.f, 100);
>>>>>> @@ -1009,7 +1009,7 @@ static int img_commit(int argc, char **argv)
>>>>>>             /* This is different from QMP, which by default uses the deepest file in
>>>>>>              * the backing chain (i.e., the very base); however, the traditional
>>>>>>              * behavior of qemu-img commit is using the immediate backing file. */
>>>>>> -        base_bs = backing_bs(bs);
>>>>>> +        base_bs = bdrv_filtered_cow_bs(bs);
>>>>>>             if (!base_bs) {
>>>>>
>>>>> and here we'll fail.
>>>>
>>>> Right, will change to bdrv_backing_chain_next().
>>>>
>>>>>>                 error_setg(&local_err, "Image does not have a backing file");
>>>>>>                 goto done;
>>>>>> @@ -1626,19 +1626,18 @@ static int convert_iteration_sectors(ImgConvertState *s, int64_t sector_num)
>>>>>>         if (s->sector_next_status <= sector_num) {
>>>>>>             int64_t count = n * BDRV_SECTOR_SIZE;
>>>>>> +        BlockDriverState *src_bs = blk_bs(s->src[src_cur]);
>>>>>> +        BlockDriverState *base;
>>>>>>             if (s->target_has_backing) {
>>>>>> -
>>>>>> -            ret = bdrv_block_status(blk_bs(s->src[src_cur]),
>>>>>> -                                    (sector_num - src_cur_offset) *
>>>>>> -                                    BDRV_SECTOR_SIZE,
>>>>>> -                                    count, &count, NULL, NULL);
>>>>>> +            base = bdrv_backing_chain_next(src_bs);
>>>>>
>>>>> As you described in another patch, won't data allocated in base be reported as allocated here, because
>>>>> the filters above base are counted?
>>>>
>>>> Damn, yes.  So
>>>>
>>>>       base = bdrv_filtered_cow_bs(bdrv_skip_rw_filters(src_bs));
>>>>
>>>> I suppose.
>>>>
>>>>> Hmm. Now I wonder why filters don't report everything as unallocated; wouldn't that be more correct
>>>>> than falling through to the child?
>>>>
>>>> I don’t know, actually.  Maybe, maybe not.
>>>>
>>>>>>             } else {
>>>>>> -            ret = bdrv_block_status_above(blk_bs(s->src[src_cur]), NULL,
>>>>>> -                                          (sector_num - src_cur_offset) *
>>>>>> -                                          BDRV_SECTOR_SIZE,
>>>>>> -                                          count, &count, NULL, NULL);
>>>>>> +            base = NULL;
>>>>>>             }
>>>>>> +        ret = bdrv_block_status_above(src_bs, base,
>>>>>> +                                      (sector_num - src_cur_offset) *
>>>>>> +                                      BDRV_SECTOR_SIZE,
>>>>>> +                                      count, &count, NULL, NULL);
>>>>>>             if (ret < 0) {
>>>>>>                 error_report("error while reading block status of sector %" PRId64
>>>>>>                              ": %s", sector_num, strerror(-ret));
>>>>
>>>> [...]
>>>>
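A minimal sketch of the base-node choice discussed above for convert_iteration_sectors(), using a toy node type rather than QEMU's BlockDriverState. All toy_* names are hypothetical stand-ins. They only assume the behaviour described in this thread: skipping R/W filters yields the first non-filter node, the filtered COW child is the immediate backing child (which may itself be a filter), and the chain-next helper skips filters on both sides. For a chain top -> filter -> base, the chain-next helper returns base, which leaves the filter inside the range a status query would cover, whereas the COW child of the filter-skipped top is the filter itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a block node; this is NOT QEMU's BlockDriverState. */
typedef struct ToyNode {
    const char *name;
    bool is_filter;            /* true for R/W filter nodes */
    struct ToyNode *filtered;  /* child that a filter forwards requests to */
    struct ToyNode *backing;   /* COW backing child of a non-filter node */
} ToyNode;

/* Walk down through filters; return the first non-filter node (or NULL). */
ToyNode *toy_skip_rw_filters(ToyNode *bs)
{
    while (bs && bs->is_filter) {
        bs = bs->filtered;
    }
    return bs;
}

/* Return the immediate COW child, which may itself be a filter. */
ToyNode *toy_filtered_cow_bs(ToyNode *bs)
{
    return (bs && !bs->is_filter) ? bs->backing : NULL;
}

/* Next non-filter node in the backing chain (assumed semantics). */
ToyNode *toy_backing_chain_next(ToyNode *bs)
{
    return toy_skip_rw_filters(toy_filtered_cow_bs(toy_skip_rw_filters(bs)));
}

/* Example chain: top (COW node) -> filter -> base */
static ToyNode toy_base   = { "base",   false, NULL,      NULL };
static ToyNode toy_filter = { "filter", true,  &toy_base, NULL };
static ToyNode toy_top    = { "top",    false, NULL,      &toy_filter };

/* Check which node each candidate expression selects as the lower bound. */
void toy_check_example(void)
{
    /* Skips the filter above base, so the filter stays in the queried range. */
    assert(toy_backing_chain_next(&toy_top) == &toy_base);

    /* Stops at the filter, so nothing that forwards to base is queried. */
    assert(toy_filtered_cow_bs(toy_skip_rw_filters(&toy_top)) == &toy_filter);
}
```

In this model, passing the filter rather than base as the lower bound keeps every node that merely forwards to base out of the queried range, which appears to be the motivation for the change suggested above.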
>>>>>> @@ -2949,7 +2950,7 @@ static int img_map(int argc, char **argv)
>>>>>>         if (!blk) {
>>>>>>             return 1;
>>>>>>         }
>>>>>> -    bs = blk_bs(blk);
>>>>>> +    bs = bdrv_skip_implicit_filters(blk_bs(blk));
>>>>>
>>>>> Hmm, another thought about implicit filters: how could they be here in qemu-img?
>>>>
>>>> Hm, I don’t think they can.
>>>>
>>>>> If implicit filters are only
>>>>> job filters, that is. Did you prepare this for an implicit COR? But we discussed with Kevin that we'd better deprecate
>>>>> the copy-on-read option..
>>>>>
>>>>> So, if implicit filters exist only for compatibility, we'll have them only in block jobs, so there seems to be no reason to support
>>>>> them in qemu-img.
>>>>
>>>> Seems reasonable, yes.
>>>>
>>>>> Also, in block jobs, we could stop creating implicit filters above any filter nodes, and stop creating any
>>>>> filter nodes above implicit filters. This would still support old scenarios, but would give a very simple and well-defined scope
>>>>> for using implicit filters and working with them. What do you think?
>>>>
>>>> Hm, in what way would that make things simpler?
>>>>
>>>
>>> This question was on my mind while I was finishing this paragraph. :) At least such a restriction answers the question of where
>>> new filters should be added: above or below implicit filters. With such a restriction, we can only ever have one implicit filter
>>> over a non-filter node, and above it there must be an explicit filter or a non-filter node. But this needs a huge amount of work for a small
>>> benefit, so forget it. :)
> 
> OK.  I should have read the last part first, then I could have replied
> sooner. :-)
> 
>> Strange, this mail was automatically returned to me. Did you receive it?
> 
> No, I didn’t.  (Nor any of the other mails you resent.)  Weird.

Interesting that it reached the mailing list and is present in the archive.

> 
> Also, welcome back, congratulations, and all the best to your family! :-)
> 


Thank you!


-- 
Best regards,
Vladimir



Thread overview: 113+ messages
2019-06-12 22:09 [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Max Reitz
2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 01/42] block: Mark commit and mirror as filter drivers Max Reitz
2019-06-13 10:47   ` Vladimir Sementsov-Ogievskiy
2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 02/42] copy-on-read: Support compressed writes Max Reitz
2019-06-13 10:49   ` Vladimir Sementsov-Ogievskiy
2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 03/42] throttle: " Max Reitz
2019-06-13 10:51   ` Vladimir Sementsov-Ogievskiy
2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 04/42] block: Add child access functions Max Reitz
2019-06-13 12:15   ` Vladimir Sementsov-Ogievskiy
2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 05/42] block: Add chain helper functions Max Reitz
2019-06-13 12:26   ` Vladimir Sementsov-Ogievskiy
2019-06-13 12:33     ` Max Reitz
2019-06-13 12:39       ` Vladimir Sementsov-Ogievskiy
2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 06/42] qcow2: Implement .bdrv_storage_child() Max Reitz
2019-06-13 12:27   ` Vladimir Sementsov-Ogievskiy
2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 07/42] block: *filtered_cow_child() for *has_zero_init() Max Reitz
2019-06-13 12:34   ` Vladimir Sementsov-Ogievskiy
2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 08/42] block: bdrv_set_backing_hd() is about bs->backing Max Reitz
2019-06-13 12:40   ` Vladimir Sementsov-Ogievskiy
2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 09/42] block: Include filters when freezing backing chain Max Reitz
2019-06-13 13:04   ` Vladimir Sementsov-Ogievskiy
2019-06-13 14:05     ` Max Reitz
2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 10/42] block: Use CAF in bdrv_is_encrypted() Max Reitz
2019-06-13 13:16   ` Vladimir Sementsov-Ogievskiy
2019-06-13 14:15     ` Max Reitz
2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 11/42] block: Add bdrv_supports_compressed_writes() Max Reitz
2019-06-13 13:29   ` Vladimir Sementsov-Ogievskiy
2019-06-13 14:19     ` Max Reitz
2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 12/42] block: Use bdrv_filtered_rw* where obvious Max Reitz
2019-06-13 13:37   ` Vladimir Sementsov-Ogievskiy
2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 13/42] block: Use CAFs in block status functions Max Reitz
2019-06-14 12:07   ` Vladimir Sementsov-Ogievskiy
2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 14/42] block: Use CAFs when working with backing chains Max Reitz
2019-06-14 13:26   ` Vladimir Sementsov-Ogievskiy
2019-06-14 13:50     ` Max Reitz
2019-06-14 14:31       ` Vladimir Sementsov-Ogievskiy
2019-06-14 16:02         ` Max Reitz
2019-06-14 16:39           ` Vladimir Sementsov-Ogievskiy
2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 15/42] block: Re-evaluate backing file handling in reopen Max Reitz
2019-06-14 13:42   ` Vladimir Sementsov-Ogievskiy
2019-06-14 15:52     ` Max Reitz
2019-06-14 16:43       ` Vladimir Sementsov-Ogievskiy
2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 16/42] block: Use child access functions when flushing Max Reitz
2019-06-14 14:01   ` Vladimir Sementsov-Ogievskiy
2019-06-14 15:55     ` Max Reitz
2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 17/42] block: Use CAFs in bdrv_refresh_limits() Max Reitz
2019-06-14 15:04   ` Vladimir Sementsov-Ogievskiy
2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 18/42] block: Use CAFs in bdrv_refresh_filename() Max Reitz
2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 19/42] block: Use CAF in bdrv_co_rw_vmstate() Max Reitz
2019-06-14 15:14   ` Vladimir Sementsov-Ogievskiy
2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 20/42] block/snapshot: Fall back to storage child Max Reitz
2019-06-14 15:22   ` Vladimir Sementsov-Ogievskiy
2019-06-14 16:10     ` Max Reitz
2019-06-14 16:47       ` Vladimir Sementsov-Ogievskiy
2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 21/42] block: Use CAFs for debug breakpoints Max Reitz
2019-06-14 15:29   ` Vladimir Sementsov-Ogievskiy
2019-06-14 16:12     ` Max Reitz
2019-06-14 20:28       ` Eric Blake
2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 22/42] block: Use CAFs in bdrv_get_allocated_file_size() Max Reitz
2019-06-12 22:17   ` Max Reitz
2019-06-14 15:41   ` Vladimir Sementsov-Ogievskiy
2019-06-14 16:15     ` Max Reitz
2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 23/42] blockdev: Use CAF in external_snapshot_prepare() Max Reitz
2019-06-14 15:46   ` Vladimir Sementsov-Ogievskiy
2019-06-14 16:20     ` Max Reitz
2019-06-14 16:58       ` Vladimir Sementsov-Ogievskiy
2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 24/42] block: Use child access functions for QAPI queries Max Reitz
2019-06-18 12:06   ` Vladimir Sementsov-Ogievskiy
2019-06-18 14:22     ` Max Reitz
2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 25/42] mirror: Deal with filters Max Reitz
2019-06-18 13:12   ` Vladimir Sementsov-Ogievskiy
2019-06-18 14:47     ` Max Reitz
2019-06-18 14:55       ` Vladimir Sementsov-Ogievskiy
2019-06-18 15:20         ` Max Reitz
2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 26/42] backup: " Max Reitz
2019-06-18 13:45   ` Vladimir Sementsov-Ogievskiy
2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 27/42] commit: " Max Reitz
2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 28/42] stream: " Max Reitz
2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 29/42] nbd: Use CAF when looking for dirty bitmap Max Reitz
2019-06-18 13:58   ` Vladimir Sementsov-Ogievskiy
2019-06-18 14:48   ` Eric Blake
2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 30/42] qemu-img: Use child access functions Max Reitz
2019-06-19  9:18   ` Vladimir Sementsov-Ogievskiy
2019-06-19 15:49     ` Max Reitz
2019-06-21 13:15       ` Vladimir Sementsov-Ogievskiy
2019-07-24  9:54         ` Vladimir Sementsov-Ogievskiy
2019-07-25 16:34           ` Max Reitz
2019-07-26 13:44             ` Vladimir Sementsov-Ogievskiy
2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 31/42] block: Drop backing_bs() Max Reitz
2019-06-19  9:18   ` Vladimir Sementsov-Ogievskiy
2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 32/42] block: Make bdrv_get_cumulative_perm() public Max Reitz
2019-06-19  9:19   ` Vladimir Sementsov-Ogievskiy
2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 33/42] blockdev: Fix active commit choice Max Reitz
2019-06-19  9:31   ` Vladimir Sementsov-Ogievskiy
2019-06-19 15:59     ` Max Reitz
2019-06-21 13:26       ` Vladimir Sementsov-Ogievskiy
2019-07-24  9:54         ` Vladimir Sementsov-Ogievskiy
2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 34/42] block: Inline bdrv_co_block_status_from_*() Max Reitz
2019-06-19  9:34   ` Vladimir Sementsov-Ogievskiy
2019-06-19 16:01     ` Max Reitz
2019-06-19 16:07     ` Max Reitz
2019-06-21 13:39   ` Vladimir Sementsov-Ogievskiy
2019-07-24  9:54     ` Vladimir Sementsov-Ogievskiy
2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 35/42] block: Fix check_to_replace_node() Max Reitz
2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 36/42] iotests: Add tests for mirror @replaces loops Max Reitz
2019-06-12 22:09 ` [Qemu-devel] [PATCH v5 37/42] block: Leave BDS.backing_file constant Max Reitz
2019-06-12 22:10 ` [Qemu-devel] [PATCH v5 38/42] iotests: Let complete_and_wait() work with commit Max Reitz
2019-06-12 22:10 ` [Qemu-devel] [PATCH v5 39/42] iotests: Add filter commit test cases Max Reitz
2019-06-12 22:10 ` [Qemu-devel] [PATCH v5 40/42] iotests: Add filter mirror " Max Reitz
2019-06-12 22:10 ` [Qemu-devel] [PATCH v5 41/42] iotests: Add test for commit in sub directory Max Reitz
2019-06-12 22:10 ` [Qemu-devel] [PATCH v5 42/42] iotests: Test committing to overridden backing Max Reitz
2019-06-13 15:28 ` [Qemu-devel] [PATCH v5 00/42] block: Deal with filters Vladimir Sementsov-Ogievskiy
2019-06-13 16:12   ` Max Reitz
