All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v14 00/14] Drop in_use from BlockDriverState and enable point-in-time snapshot exporting over NBD
@ 2014-02-19 13:42 Fam Zheng
  2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 01/14] block: Add BlockOpType enum Fam Zheng
                   ` (13 more replies)
  0 siblings, 14 replies; 33+ messages in thread
From: Fam Zheng @ 2014-02-19 13:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, benoit.canet, rjones, armbru, imain, stefanha, pbonzini

This series adds for point-in-time snapshot NBD exporting based on
blockdev-backup (variant of drive-backup with existing device as target).

We get a thin point-in-time snapshot by COW mechanism of drive-backup, and
export it through built in NBD server. The steps are as below:

 1. (SHELL) qemu-img create -f qcow2 BACKUP.qcow2 <source size here>

    (Alternatively we can use -o backing_file=RUNNING-VM.img to omit explicitly
    providing the size by ourselves, but it's risky because RUNNING-VM.qcow2 is
    used r/w by guest. Whether or not setting backing file in the image file
    doesn't matter, as we are going to override the backing hd in the next
    step)

 2. (QMP) blockdev-add backing=source-drive file.driver=file file.filename=BACKUP.qcow2 id=target0 if=none driver=qcow2

    (where source-drive is the running BlockDriverState name for
    RUNNING-VM.img. This patch implements "backing=" option to override
    backing_hd for added drive)

 3. (QMP) blockdev-backup device=source-drive sync=none target=target0

    (this is the QMP command introduced by this series, which use a named
    device as target of drive-backup)

 4. (QMP) nbd-server-add device=target0

When image fleecing done:

 1. (QMP) block-job-cancel device=source-drive

 2. (HMP) drive_del target0

 3. (SHELL) rm BACKUP.qcow2

v14: Address Benoit's comments and rebase to kevin's block tree:

    02: Explicit initialize op_blockers with QLIST_INIT.
    05: Fix sizeof(bs->backing_format) passed to pstrcpy().


Fam Zheng (14):
  block: Add BlockOpType enum
  block: Introduce op_blockers to BlockDriverState
  block: Replace in_use with operation blocker
  block: Move op_blocker check from block_job_create to its caller
  block: Add bdrv_set_backing_hd()
  block: Add backing_blocker in BlockDriverState
  block: Parse "backing" option to reference existing BDS
  block: Support dropping active in bdrv_drop_intermediate
  stream: Use bdrv_drop_intermediate and drop close_unused_images
  qmp: Add command 'blockdev-backup'
  block: Allow backup on referenced named BlockDriverState
  block: Add blockdev-backup to transaction
  qemu-iotests: Test blockdev-backup in 055
  qemu-iotests: Image fleecing test case 081

 block-migration.c               |   7 +-
 block.c                         | 310 +++++++++++++++++++++++++++-------------
 block/backup.c                  |  26 ++++
 block/commit.c                  |   1 +
 block/stream.c                  |  30 +---
 blockdev.c                      | 118 +++++++++++++--
 blockjob.c                      |  14 +-
 hw/block/dataplane/virtio-blk.c |  19 ++-
 include/block/block.h           |  29 +++-
 include/block/block_int.h       |   9 +-
 include/block/blockjob.h        |   3 +
 qapi-schema.json                |  50 +++++++
 qmp-commands.hx                 |  44 ++++++
 tests/qemu-iotests/055          | 275 +++++++++++++++++++++++++++++------
 tests/qemu-iotests/055.out      |   4 +-
 tests/qemu-iotests/081          |  99 +++++++++++++
 tests/qemu-iotests/081.out      |   5 +
 tests/qemu-iotests/group        |   1 +
 18 files changed, 842 insertions(+), 202 deletions(-)
 create mode 100755 tests/qemu-iotests/081
 create mode 100644 tests/qemu-iotests/081.out

-- 
1.8.5.4

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH v14 01/14] block: Add BlockOpType enum
  2014-02-19 13:42 [Qemu-devel] [PATCH v14 00/14] Drop in_use from BlockDriverState and enable point-in-time snapshot exporting over NBD Fam Zheng
@ 2014-02-19 13:42 ` Fam Zheng
  2014-02-19 15:25   ` Benoît Canet
  2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 02/14] block: Introduce op_blockers to BlockDriverState Fam Zheng
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 33+ messages in thread
From: Fam Zheng @ 2014-02-19 13:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, benoit.canet, rjones, armbru, imain, stefanha, pbonzini

This adds the enum of all the operations that can be taken on a block
device.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 include/block/block.h | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/include/block/block.h b/include/block/block.h
index 780f48b..8820735 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -154,6 +154,25 @@ typedef struct BDRVReopenState {
     void *opaque;
 } BDRVReopenState;
 
+/*
+ * Block operation types
+ */
+typedef enum BlockOpType {
+    BLOCK_OP_TYPE_BACKUP_SOURCE,
+    BLOCK_OP_TYPE_BACKUP_TARGET,
+    BLOCK_OP_TYPE_CHANGE,
+    BLOCK_OP_TYPE_COMMIT,
+    BLOCK_OP_TYPE_DATAPLANE,
+    BLOCK_OP_TYPE_DRIVE_DEL,
+    BLOCK_OP_TYPE_EJECT,
+    BLOCK_OP_TYPE_EXTERNAL_SNAPSHOT,
+    BLOCK_OP_TYPE_INTERNAL_SNAPSHOT,
+    BLOCK_OP_TYPE_INTERNAL_SNAPSHOT_DELETE,
+    BLOCK_OP_TYPE_MIRROR,
+    BLOCK_OP_TYPE_RESIZE,
+    BLOCK_OP_TYPE_STREAM,
+    BLOCK_OP_TYPE_MAX,
+} BlockOpType;
 
 void bdrv_iostatus_enable(BlockDriverState *bs);
 void bdrv_iostatus_reset(BlockDriverState *bs);
-- 
1.8.5.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH v14 02/14] block: Introduce op_blockers to BlockDriverState
  2014-02-19 13:42 [Qemu-devel] [PATCH v14 00/14] Drop in_use from BlockDriverState and enable point-in-time snapshot exporting over NBD Fam Zheng
  2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 01/14] block: Add BlockOpType enum Fam Zheng
@ 2014-02-19 13:42 ` Fam Zheng
  2014-02-19 15:26   ` Benoît Canet
  2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 03/14] block: Replace in_use with operation blocker Fam Zheng
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 33+ messages in thread
From: Fam Zheng @ 2014-02-19 13:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, benoit.canet, rjones, armbru, imain, stefanha, pbonzini

BlockDriverState.op_blockers is an array of lists with BLOCK_OP_TYPE_MAX
elements. Each list is a list of blockers of an operation type
(BlockOpType), that marks this BDS as currently blocked for a certain
type of operation with reason errors stored in the list. The rule of
usage is:

 * BDS user who wants to take an operation should check if there's any
   blocker of the type with bdrv_op_is_blocked().

 * BDS user who wants to block certain types of operation, should call
   bdrv_op_block (or bdrv_op_block_all to block all types of operations,
   which is similar to the existing bdrv_set_in_use()).

 * A blocker is only referenced by op_blockers, so the lifecycle is
   managed by caller, and shouldn't be lost until unblock, so typically
   a caller does these:

   - Allocate a blocker with error_setg or similar, call bdrv_op_block()
     to block some operations.
   - Hold the blocker, do his job.
   - Unblock operations that it blocked, with the same reason pointer
     passed to bdrv_op_unblock().
   - Release the blocker with error_free().

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block.c                   | 75 +++++++++++++++++++++++++++++++++++++++++++++++
 include/block/block.h     |  7 +++++
 include/block/block_int.h |  5 ++++
 3 files changed, 87 insertions(+)

diff --git a/block.c b/block.c
index 80310fe..485d4f0 100644
--- a/block.c
+++ b/block.c
@@ -335,6 +335,7 @@ void bdrv_register(BlockDriver *bdrv)
 BlockDriverState *bdrv_new(const char *device_name)
 {
     BlockDriverState *bs;
+    int i;
 
     bs = g_malloc0(sizeof(BlockDriverState));
     QLIST_INIT(&bs->dirty_bitmaps);
@@ -342,6 +343,9 @@ BlockDriverState *bdrv_new(const char *device_name)
     if (device_name[0] != '\0') {
         QTAILQ_INSERT_TAIL(&bdrv_states, bs, device_list);
     }
+    for (i = 0; i < BLOCK_OP_TYPE_MAX; i++) {
+        QLIST_INIT(&bs->op_blockers[i]);
+    }
     bdrv_iostatus_disable(bs);
     notifier_list_init(&bs->close_notifiers);
     notifier_with_return_list_init(&bs->before_write_notifiers);
@@ -1851,6 +1855,8 @@ static void bdrv_move_feature_fields(BlockDriverState *bs_dest,
      * We do want to swap name but don't want to swap linked list entries
      */
     bs_dest->node_list   = bs_src->node_list;
+    memcpy(bs_dest->op_blockers, bs_src->op_blockers,
+           sizeof(bs_dest->op_blockers));
 }
 
 /*
@@ -5130,6 +5136,75 @@ void bdrv_unref(BlockDriverState *bs)
     }
 }
 
+struct BdrvOpBlocker {
+    Error *reason;
+    QLIST_ENTRY(BdrvOpBlocker) list;
+};
+
+bool bdrv_op_is_blocked(BlockDriverState *bs, BlockOpType op, Error **errp)
+{
+    BdrvOpBlocker *blocker;
+    assert((int) op >= 0 && op < BLOCK_OP_TYPE_MAX);
+    if (!QLIST_EMPTY(&bs->op_blockers[op])) {
+        blocker = QLIST_FIRST(&bs->op_blockers[op]);
+        if (errp) {
+            *errp = error_copy(blocker->reason);
+        }
+        return true;
+    }
+    return false;
+}
+
+void bdrv_op_block(BlockDriverState *bs, BlockOpType op, Error *reason)
+{
+    BdrvOpBlocker *blocker;
+    assert((int) op >= 0 && op < BLOCK_OP_TYPE_MAX);
+
+    blocker = g_malloc0(sizeof(BdrvOpBlocker));
+    blocker->reason = reason;
+    QLIST_INSERT_HEAD(&bs->op_blockers[op], blocker, list);
+}
+
+void bdrv_op_unblock(BlockDriverState *bs, BlockOpType op, Error *reason)
+{
+    BdrvOpBlocker *blocker, *next;
+    assert((int) op >= 0 && op < BLOCK_OP_TYPE_MAX);
+    QLIST_FOREACH_SAFE(blocker, &bs->op_blockers[op], list, next) {
+        if (blocker->reason == reason) {
+            QLIST_REMOVE(blocker, list);
+            g_free(blocker);
+        }
+    }
+}
+
+void bdrv_op_block_all(BlockDriverState *bs, Error *reason)
+{
+    int i;
+    for (i = 0; i < BLOCK_OP_TYPE_MAX; i++) {
+        bdrv_op_block(bs, i, reason);
+    }
+}
+
+void bdrv_op_unblock_all(BlockDriverState *bs, Error *reason)
+{
+    int i;
+    for (i = 0; i < BLOCK_OP_TYPE_MAX; i++) {
+        bdrv_op_unblock(bs, i, reason);
+    }
+}
+
+bool bdrv_op_blocker_is_empty(BlockDriverState *bs)
+{
+    int i;
+
+    for (i = 0; i < BLOCK_OP_TYPE_MAX; i++) {
+        if (!QLIST_EMPTY(&bs->op_blockers[i])) {
+            return false;
+        }
+    }
+    return true;
+}
+
 void bdrv_set_in_use(BlockDriverState *bs, int in_use)
 {
     assert(bs->in_use != in_use);
diff --git a/include/block/block.h b/include/block/block.h
index 8820735..4874e2a 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -474,6 +474,13 @@ void bdrv_unref(BlockDriverState *bs);
 void bdrv_set_in_use(BlockDriverState *bs, int in_use);
 int bdrv_in_use(BlockDriverState *bs);
 
+bool bdrv_op_is_blocked(BlockDriverState *bs, BlockOpType op, Error **errp);
+void bdrv_op_block(BlockDriverState *bs, BlockOpType op, Error *reason);
+void bdrv_op_unblock(BlockDriverState *bs, BlockOpType op, Error *reason);
+void bdrv_op_block_all(BlockDriverState *bs, Error *reason);
+void bdrv_op_unblock_all(BlockDriverState *bs, Error *reason);
+bool bdrv_op_blocker_is_empty(BlockDriverState *bs);
+
 #ifdef CONFIG_LINUX_AIO
 int raw_get_aio_fd(BlockDriverState *bs);
 #else
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 0bcf1c9..4e558d0 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -270,6 +270,8 @@ typedef struct BlockLimits {
     size_t opt_mem_alignment;
 } BlockLimits;
 
+typedef struct BdrvOpBlocker BdrvOpBlocker;
+
 /*
  * Note: the function bdrv_append() copies and swaps contents of
  * BlockDriverStates, so if you add new fields to this struct, please
@@ -361,6 +363,9 @@ struct BlockDriverState {
 
     QLIST_HEAD(, BdrvTrackedRequest) tracked_requests;
 
+    /* operation blockers */
+    QLIST_HEAD(, BdrvOpBlocker) op_blockers[BLOCK_OP_TYPE_MAX];
+
     /* long-running background operation */
     BlockJob *job;
 
-- 
1.8.5.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH v14 03/14] block: Replace in_use with operation blocker
  2014-02-19 13:42 [Qemu-devel] [PATCH v14 00/14] Drop in_use from BlockDriverState and enable point-in-time snapshot exporting over NBD Fam Zheng
  2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 01/14] block: Add BlockOpType enum Fam Zheng
  2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 02/14] block: Introduce op_blockers to BlockDriverState Fam Zheng
@ 2014-02-19 13:42 ` Fam Zheng
  2014-02-19 15:26   ` Benoît Canet
  2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 04/14] block: Move op_blocker check from block_job_create to its caller Fam Zheng
                   ` (10 subsequent siblings)
  13 siblings, 1 reply; 33+ messages in thread
From: Fam Zheng @ 2014-02-19 13:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, benoit.canet, rjones, armbru, imain, stefanha, pbonzini

This drops BlockDriverState.in_use with op_blockers:

  - Call bdrv_op_block_all in place of bdrv_set_in_use(bs, 1).
  - Call bdrv_op_unblock_all in place of bdrv_set_in_use(bs, 0).
  - Check bdrv_op_is_blocked() in place of bdrv_in_use(bs).
    The specific types are used, e.g. in place of starting block backup,
    bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_BACKUP, ...).
  - Check bdrv_op_blocker_is_empty() in place of assert(!bs->in_use).

Note: there is only bdrv_op_block_all and bdrv_op_unblock_all callers at
this moment. So although the checks are specific to op types, this
changes can still be seen as identical logic with previously with
in_use. The difference is error message are improved because of blocker
error info.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block-migration.c               |  7 +++++--
 block.c                         | 24 +++++++-----------------
 blockdev.c                      | 15 ++++++---------
 blockjob.c                      | 14 +++++++++-----
 hw/block/dataplane/virtio-blk.c | 19 ++++++++++++-------
 include/block/block.h           |  2 --
 include/block/block_int.h       |  1 -
 include/block/blockjob.h        |  3 +++
 8 files changed, 42 insertions(+), 43 deletions(-)

diff --git a/block-migration.c b/block-migration.c
index 897fdba..bf9a25f 100644
--- a/block-migration.c
+++ b/block-migration.c
@@ -59,6 +59,7 @@ typedef struct BlkMigDevState {
     unsigned long *aio_bitmap;
     int64_t completed_sectors;
     BdrvDirtyBitmap *dirty_bitmap;
+    Error *blocker;
 } BlkMigDevState;
 
 typedef struct BlkMigBlock {
@@ -346,7 +347,8 @@ static void init_blk_migration_it(void *opaque, BlockDriverState *bs)
         bmds->completed_sectors = 0;
         bmds->shared_base = block_mig_state.shared_base;
         alloc_aio_bitmap(bmds);
-        bdrv_set_in_use(bs, 1);
+        error_setg(&bmds->blocker, "block device is in use by migration");
+        bdrv_op_block_all(bs, bmds->blocker);
         bdrv_ref(bs);
 
         block_mig_state.total_sector_sum += sectors;
@@ -584,7 +586,8 @@ static void blk_mig_cleanup(void)
     blk_mig_lock();
     while ((bmds = QSIMPLEQ_FIRST(&block_mig_state.bmds_list)) != NULL) {
         QSIMPLEQ_REMOVE_HEAD(&block_mig_state.bmds_list, entry);
-        bdrv_set_in_use(bmds->bs, 0);
+        bdrv_op_unblock_all(bmds->bs, bmds->blocker);
+        error_free(bmds->blocker);
         bdrv_unref(bmds->bs);
         g_free(bmds->aio_bitmap);
         g_free(bmds);
diff --git a/block.c b/block.c
index 485d4f0..b5cb0c7 100644
--- a/block.c
+++ b/block.c
@@ -1843,7 +1843,6 @@ static void bdrv_move_feature_fields(BlockDriverState *bs_dest,
     bs_dest->refcnt             = bs_src->refcnt;
 
     /* job */
-    bs_dest->in_use             = bs_src->in_use;
     bs_dest->job                = bs_src->job;
 
     /* keep the same entry in bdrv_states */
@@ -1880,7 +1879,7 @@ void bdrv_swap(BlockDriverState *bs_new, BlockDriverState *bs_old)
     assert(QLIST_EMPTY(&bs_new->dirty_bitmaps));
     assert(bs_new->job == NULL);
     assert(bs_new->dev == NULL);
-    assert(bs_new->in_use == 0);
+    assert(bdrv_op_blocker_is_empty(bs_new));
     assert(bs_new->io_limits_enabled == false);
     assert(!throttle_have_timer(&bs_new->throttle_state));
 
@@ -1899,7 +1898,7 @@ void bdrv_swap(BlockDriverState *bs_new, BlockDriverState *bs_old)
     /* Check a few fields that should remain attached to the device */
     assert(bs_new->dev == NULL);
     assert(bs_new->job == NULL);
-    assert(bs_new->in_use == 0);
+    assert(bdrv_op_blocker_is_empty(bs_new));
     assert(bs_new->io_limits_enabled == false);
     assert(!throttle_have_timer(&bs_new->throttle_state));
 
@@ -1936,7 +1935,7 @@ static void bdrv_delete(BlockDriverState *bs)
 {
     assert(!bs->dev);
     assert(!bs->job);
-    assert(!bs->in_use);
+    assert(bdrv_op_blocker_is_empty(bs));
     assert(!bs->refcnt);
     assert(QLIST_EMPTY(&bs->dirty_bitmaps));
 
@@ -2118,7 +2117,8 @@ int bdrv_commit(BlockDriverState *bs)
         return -ENOTSUP;
     }
 
-    if (bdrv_in_use(bs) || bdrv_in_use(bs->backing_hd)) {
+    if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_COMMIT, NULL) ||
+        bdrv_op_is_blocked(bs->backing_hd, BLOCK_OP_TYPE_COMMIT, NULL)) {
         return -EBUSY;
     }
 
@@ -3353,8 +3353,9 @@ int bdrv_truncate(BlockDriverState *bs, int64_t offset)
         return -ENOTSUP;
     if (bs->read_only)
         return -EACCES;
-    if (bdrv_in_use(bs))
+    if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_RESIZE, NULL)) {
         return -EBUSY;
+    }
     ret = drv->bdrv_truncate(bs, offset);
     if (ret == 0) {
         ret = refresh_total_sectors(bs, offset >> BDRV_SECTOR_BITS);
@@ -5205,17 +5206,6 @@ bool bdrv_op_blocker_is_empty(BlockDriverState *bs)
     return true;
 }
 
-void bdrv_set_in_use(BlockDriverState *bs, int in_use)
-{
-    assert(bs->in_use != in_use);
-    bs->in_use = in_use;
-}
-
-int bdrv_in_use(BlockDriverState *bs)
-{
-    return bs->in_use;
-}
-
 void bdrv_iostatus_enable(BlockDriverState *bs)
 {
     bs->iostatus_enabled = true;
diff --git a/blockdev.c b/blockdev.c
index f083612..53dce70 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1289,8 +1289,8 @@ static void external_snapshot_prepare(BlkTransactionState *common,
         return;
     }
 
-    if (bdrv_in_use(state->old_bs)) {
-        error_set(errp, QERR_DEVICE_IN_USE, device);
+    if (bdrv_op_is_blocked(state->old_bs,
+                           BLOCK_OP_TYPE_EXTERNAL_SNAPSHOT, errp)) {
         return;
     }
 
@@ -1512,8 +1512,7 @@ exit:
 
 static void eject_device(BlockDriverState *bs, int force, Error **errp)
 {
-    if (bdrv_in_use(bs)) {
-        error_set(errp, QERR_DEVICE_IN_USE, bdrv_get_device_name(bs));
+    if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_EJECT, errp)) {
         return;
     }
     if (!bdrv_dev_has_removable_media(bs)) {
@@ -1719,7 +1718,7 @@ int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data)
         qerror_report(QERR_DEVICE_NOT_FOUND, id);
         return -1;
     }
-    if (bdrv_in_use(bs)) {
+    if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_DRIVE_DEL, NULL)) {
         qerror_report(QERR_DEVICE_IN_USE, id);
         return -1;
     }
@@ -1972,8 +1971,7 @@ void qmp_drive_backup(const char *device, const char *target,
         }
     }
 
-    if (bdrv_in_use(bs)) {
-        error_set(errp, QERR_DEVICE_IN_USE, device);
+    if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_BACKUP_SOURCE, errp)) {
         return;
     }
 
@@ -2106,8 +2104,7 @@ void qmp_drive_mirror(const char *device, const char *target,
         }
     }
 
-    if (bdrv_in_use(bs)) {
-        error_set(errp, QERR_DEVICE_IN_USE, device);
+    if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_MIRROR, errp)) {
         return;
     }
 
diff --git a/blockjob.c b/blockjob.c
index 9e5fd5c..f1ff036 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -41,14 +41,16 @@ void *block_job_create(const BlockJobDriver *driver, BlockDriverState *bs,
 {
     BlockJob *job;
 
-    if (bs->job || bdrv_in_use(bs)) {
+    if (bs->job || !bdrv_op_blocker_is_empty(bs)) {
         error_set(errp, QERR_DEVICE_IN_USE, bdrv_get_device_name(bs));
         return NULL;
     }
     bdrv_ref(bs);
-    bdrv_set_in_use(bs, 1);
-
     job = g_malloc0(driver->instance_size);
+    error_setg(&job->blocker, "block device is in use by block job: %s",
+               BlockJobType_lookup[driver->job_type]);
+    bdrv_op_block_all(bs, job->blocker);
+
     job->driver        = driver;
     job->bs            = bs;
     job->cb            = cb;
@@ -63,8 +65,9 @@ void *block_job_create(const BlockJobDriver *driver, BlockDriverState *bs,
         block_job_set_speed(job, speed, &local_err);
         if (error_is_set(&local_err)) {
             bs->job = NULL;
+            bdrv_op_unblock_all(bs, job->blocker);
+            error_free(job->blocker);
             g_free(job);
-            bdrv_set_in_use(bs, 0);
             error_propagate(errp, local_err);
             return NULL;
         }
@@ -79,8 +82,9 @@ void block_job_completed(BlockJob *job, int ret)
     assert(bs->job == job);
     job->cb(job->opaque, ret);
     bs->job = NULL;
+    bdrv_op_unblock_all(bs, job->blocker);
+    error_free(job->blocker);
     g_free(job);
-    bdrv_set_in_use(bs, 0);
 }
 
 void block_job_set_speed(BlockJob *job, int64_t speed, Error **errp)
diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 2237edb..cee0aa5 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -69,6 +69,9 @@ struct VirtIOBlockDataPlane {
                                              queue */
 
     unsigned int num_reqs;
+
+    /* Operation blocker on BDS */
+    Error *blocker;
 };
 
 /* Raise an interrupt to signal guest, if necessary */
@@ -368,6 +371,7 @@ void virtio_blk_data_plane_create(VirtIODevice *vdev, VirtIOBlkConf *blk,
 {
     VirtIOBlockDataPlane *s;
     int fd;
+    Error *local_err = NULL;
 
     *dataplane = NULL;
 
@@ -390,9 +394,10 @@ void virtio_blk_data_plane_create(VirtIODevice *vdev, VirtIOBlkConf *blk,
     /* If dataplane is (re-)enabled while the guest is running there could be
      * block jobs that can conflict.
      */
-    if (bdrv_in_use(blk->conf.bs)) {
-        error_setg(errp,
-                   "cannot start dataplane thread while device is in use");
+    if (bdrv_op_is_blocked(blk->conf.bs, BLOCK_OP_TYPE_DATAPLANE, &local_err)) {
+        error_report("cannot start dataplane thread: %s",
+                      error_get_pretty(local_err));
+        error_free(local_err);
         return;
     }
 
@@ -407,9 +412,8 @@ void virtio_blk_data_plane_create(VirtIODevice *vdev, VirtIOBlkConf *blk,
     s->vdev = vdev;
     s->fd = fd;
     s->blk = blk;
-
-    /* Prevent block operations that conflict with data plane thread */
-    bdrv_set_in_use(blk->conf.bs, 1);
+    error_setg(&s->blocker, "block device is in use by data plane");
+    bdrv_op_block_all(blk->conf.bs, s->blocker);
 
     *dataplane = s;
 }
@@ -421,7 +425,8 @@ void virtio_blk_data_plane_destroy(VirtIOBlockDataPlane *s)
     }
 
     virtio_blk_data_plane_stop(s);
-    bdrv_set_in_use(s->blk->conf.bs, 0);
+    bdrv_op_unblock_all(s->blk->conf.bs, s->blocker);
+    error_free(s->blocker);
     g_free(s);
 }
 
diff --git a/include/block/block.h b/include/block/block.h
index 4874e2a..a46f70a 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -471,8 +471,6 @@ void bdrv_disable_copy_on_read(BlockDriverState *bs);
 
 void bdrv_ref(BlockDriverState *bs);
 void bdrv_unref(BlockDriverState *bs);
-void bdrv_set_in_use(BlockDriverState *bs, int in_use);
-int bdrv_in_use(BlockDriverState *bs);
 
 bool bdrv_op_is_blocked(BlockDriverState *bs, BlockOpType op, Error **errp);
 void bdrv_op_block(BlockDriverState *bs, BlockOpType op, Error *reason);
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 4e558d0..1d3f76f 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -359,7 +359,6 @@ struct BlockDriverState {
     QTAILQ_ENTRY(BlockDriverState) device_list;
     QLIST_HEAD(, BdrvDirtyBitmap) dirty_bitmaps;
     int refcnt;
-    int in_use; /* users other than guest access, eg. block migration */
 
     QLIST_HEAD(, BdrvTrackedRequest) tracked_requests;
 
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index d76de62..c0a7875 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -106,6 +106,9 @@ struct BlockJob {
     /** The completion function that will be called when the job completes.  */
     BlockDriverCompletionFunc *cb;
 
+    /** Block other operations when block job is running */
+    Error *blocker;
+
     /** The opaque value that is passed to the completion function.  */
     void *opaque;
 };
-- 
1.8.5.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH v14 04/14] block: Move op_blocker check from block_job_create to its caller
  2014-02-19 13:42 [Qemu-devel] [PATCH v14 00/14] Drop in_use from BlockDriverState and enable point-in-time snapshot exporting over NBD Fam Zheng
                   ` (2 preceding siblings ...)
  2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 03/14] block: Replace in_use with operation blocker Fam Zheng
@ 2014-02-19 13:42 ` Fam Zheng
  2014-02-19 15:28   ` Benoît Canet
  2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 05/14] block: Add bdrv_set_backing_hd() Fam Zheng
                   ` (9 subsequent siblings)
  13 siblings, 1 reply; 33+ messages in thread
From: Fam Zheng @ 2014-02-19 13:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, benoit.canet, rjones, armbru, imain, stefanha, pbonzini

It makes no sense to check for "any" blocker on bs, we are here only
because of the mechanical conversion from in_use to op_blockers. Remove
it now, and let the callers check specific operation types. Backup and
mirror already have it, add checker to stream and commit.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 blockdev.c | 8 ++++++++
 blockjob.c | 2 +-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/blockdev.c b/blockdev.c
index 53dce70..306aad5 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1840,6 +1840,10 @@ void qmp_block_stream(const char *device, bool has_base,
         return;
     }
 
+    if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_STREAM, errp)) {
+        return;
+    }
+
     if (base) {
         base_bs = bdrv_find_backing_image(bs, base);
         if (base_bs == NULL) {
@@ -1880,6 +1884,10 @@ void qmp_block_commit(const char *device,
         return;
     }
 
+    if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_COMMIT, errp)) {
+        return;
+    }
+
     /* default top_bs is the active layer */
     top_bs = bs;
 
diff --git a/blockjob.c b/blockjob.c
index f1ff036..21e21c0 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -41,7 +41,7 @@ void *block_job_create(const BlockJobDriver *driver, BlockDriverState *bs,
 {
     BlockJob *job;
 
-    if (bs->job || !bdrv_op_blocker_is_empty(bs)) {
+    if (bs->job) {
         error_set(errp, QERR_DEVICE_IN_USE, bdrv_get_device_name(bs));
         return NULL;
     }
-- 
1.8.5.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH v14 05/14] block: Add bdrv_set_backing_hd()
  2014-02-19 13:42 [Qemu-devel] [PATCH v14 00/14] Drop in_use from BlockDriverState and enable point-in-time snapshot exporting over NBD Fam Zheng
                   ` (3 preceding siblings ...)
  2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 04/14] block: Move op_blocker check from block_job_create to its caller Fam Zheng
@ 2014-02-19 13:42 ` Fam Zheng
  2014-02-19 15:27   ` Benoît Canet
  2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 06/14] block: Add backing_blocker in BlockDriverState Fam Zheng
                   ` (8 subsequent siblings)
  13 siblings, 1 reply; 33+ messages in thread
From: Fam Zheng @ 2014-02-19 13:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, benoit.canet, rjones, armbru, imain, stefanha, pbonzini

This is the common but non-trivial steps to assign or change the
backing_hd of BDS.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block.c               | 34 +++++++++++++++++++++++++++-------
 include/block/block.h |  1 +
 2 files changed, 28 insertions(+), 7 deletions(-)

diff --git a/block.c b/block.c
index b5cb0c7..dec44d4 100644
--- a/block.c
+++ b/block.c
@@ -1041,6 +1041,26 @@ fail:
     return ret;
 }
 
+void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd)
+{
+    if (bs->backing_hd) {
+        bdrv_unref(bs->backing_hd);
+    }
+
+    bs->backing_hd = backing_hd;
+    if (!backing_hd) {
+        bs->backing_file[0] = '\0';
+        bs->backing_format[0] = '\0';
+        goto out;
+    }
+    pstrcpy(bs->backing_file, sizeof(bs->backing_file), backing_hd->filename);
+    pstrcpy(bs->backing_format, sizeof(bs->backing_format),
+            backing_hd->drv ? backing_hd->drv->format_name : "");
+    bdrv_ref(bs->backing_hd);
+out:
+    bdrv_refresh_limits(bs);
+}
+
 /*
  * Opens the backing file for a BlockDriverState if not yet open
  *
@@ -1054,6 +1074,7 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict *options, Error **errp)
     char backing_filename[PATH_MAX];
     int back_flags, ret;
     BlockDriver *back_drv = NULL;
+    BlockDriverState *backing_hd;
     Error *local_err = NULL;
 
     if (bs->backing_hd != NULL) {
@@ -1077,6 +1098,8 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict *options, Error **errp)
                                        sizeof(backing_filename));
     }
 
+    backing_hd = bdrv_new("");
+
     if (bs->backing_format[0] != '\0') {
         back_drv = bdrv_find_format(bs->backing_format);
     }
@@ -1085,23 +1108,20 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict *options, Error **errp)
     back_flags = bs->open_flags & ~(BDRV_O_RDWR | BDRV_O_SNAPSHOT |
                                     BDRV_O_COPY_ON_READ);
 
-    assert(bs->backing_hd == NULL);
-    ret = bdrv_open(&bs->backing_hd,
+    ret = bdrv_open(&backing_hd,
                     *backing_filename ? backing_filename : NULL, NULL, options,
                     back_flags, back_drv, &local_err);
     if (ret < 0) {
         bs->backing_hd = NULL;
+        bdrv_unref(backing_hd);
+        backing_hd = NULL;
         bs->open_flags |= BDRV_O_NO_BACKING;
         error_setg(errp, "Could not open backing file: %s",
                    error_get_pretty(local_err));
         error_free(local_err);
         return ret;
     }
-
-    if (bs->backing_hd->file) {
-        pstrcpy(bs->backing_file, sizeof(bs->backing_file),
-                bs->backing_hd->file->filename);
-    }
+    bdrv_set_backing_hd(bs, backing_hd);
 
     /* Recalculate the BlockLimits with the backing file */
     bdrv_refresh_limits(bs);
diff --git a/include/block/block.h b/include/block/block.h
index a46f70a..ee1582d 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -208,6 +208,7 @@ int bdrv_parse_discard_flags(const char *mode, int *flags);
 int bdrv_open_image(BlockDriverState **pbs, const char *filename,
                     QDict *options, const char *bdref_key, int flags,
                     bool allow_none, Error **errp);
+void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd);
 int bdrv_open_backing_file(BlockDriverState *bs, QDict *options, Error **errp);
 int bdrv_open(BlockDriverState **pbs, const char *filename,
               const char *reference, QDict *options, int flags,
-- 
1.8.5.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH v14 06/14] block: Add backing_blocker in BlockDriverState
  2014-02-19 13:42 [Qemu-devel] [PATCH v14 00/14] Drop in_use from BlockDriverState and enable point-in-time snapshot exporting over NBD Fam Zheng
                   ` (4 preceding siblings ...)
  2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 05/14] block: Add bdrv_set_backing_hd() Fam Zheng
@ 2014-02-19 13:42 ` Fam Zheng
  2014-02-19 15:32   ` Benoît Canet
  2014-02-19 21:17   ` Jeff Cody
  2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 07/14] block: Parse "backing" option to reference existing BDS Fam Zheng
                   ` (7 subsequent siblings)
  13 siblings, 2 replies; 33+ messages in thread
From: Fam Zheng @ 2014-02-19 13:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, benoit.canet, rjones, armbru, imain, stefanha, pbonzini

This makes use of op_blocker and blocks all the operations except for
commit target, on each BlockDriverState->backing_hd.

The asserts for op_blocker in bdrv_swap are removed because with this
change, the target of block commit has at least the backing blocker of
its child, so the assertion is not true. Callers should do their check.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block.c                   | 19 +++++++++++++++----
 include/block/block_int.h |  3 +++
 2 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/block.c b/block.c
index dec44d4..95d8c1f 100644
--- a/block.c
+++ b/block.c
@@ -1044,19 +1044,33 @@ fail:
 void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd)
 {
     if (bs->backing_hd) {
+        assert(error_is_set(&bs->backing_blocker));
+        bdrv_op_unblock_all(bs->backing_hd, bs->backing_blocker);
         bdrv_unref(bs->backing_hd);
+    } else if (backing_hd) {
+        error_setg(&bs->backing_blocker,
+                   "device is used as backing hd of '%s'",
+                   bs->device_name);
     }
 
     bs->backing_hd = backing_hd;
     if (!backing_hd) {
         bs->backing_file[0] = '\0';
         bs->backing_format[0] = '\0';
+        if (error_is_set(&bs->backing_blocker)) {
+            error_free(bs->backing_blocker);
+        }
         goto out;
     }
     pstrcpy(bs->backing_file, sizeof(bs->backing_file), backing_hd->filename);
     pstrcpy(bs->backing_format, sizeof(bs->backing_format),
             backing_hd->drv ? backing_hd->drv->format_name : "");
     bdrv_ref(bs->backing_hd);
+
+    bdrv_op_block_all(bs->backing_hd, bs->backing_blocker);
+    /* Otherwise we won't be able to commit due to check in bdrv_commit */
+    bdrv_op_unblock(bs->backing_hd, BLOCK_OP_TYPE_COMMIT,
+                    bs->backing_blocker);
 out:
     bdrv_refresh_limits(bs);
 }
@@ -1696,8 +1710,7 @@ void bdrv_close(BlockDriverState *bs)
 
     if (bs->drv) {
         if (bs->backing_hd) {
-            bdrv_unref(bs->backing_hd);
-            bs->backing_hd = NULL;
+            bdrv_set_backing_hd(bs, NULL);
         }
         bs->drv->bdrv_close(bs);
         g_free(bs->opaque);
@@ -1899,7 +1912,6 @@ void bdrv_swap(BlockDriverState *bs_new, BlockDriverState *bs_old)
     assert(QLIST_EMPTY(&bs_new->dirty_bitmaps));
     assert(bs_new->job == NULL);
     assert(bs_new->dev == NULL);
-    assert(bdrv_op_blocker_is_empty(bs_new));
     assert(bs_new->io_limits_enabled == false);
     assert(!throttle_have_timer(&bs_new->throttle_state));
 
@@ -1918,7 +1930,6 @@ void bdrv_swap(BlockDriverState *bs_new, BlockDriverState *bs_old)
     /* Check a few fields that should remain attached to the device */
     assert(bs_new->dev == NULL);
     assert(bs_new->job == NULL);
-    assert(bdrv_op_blocker_is_empty(bs_new));
     assert(bs_new->io_limits_enabled == false);
     assert(!throttle_have_timer(&bs_new->throttle_state));
 
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 1d3f76f..1f4f78b 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -369,6 +369,9 @@ struct BlockDriverState {
     BlockJob *job;
 
     QDict *options;
+
+    /* The error object in use for blocking operations on backing_hd */
+    Error *backing_blocker;
 };
 
 int get_tmp_filename(char *filename, int size);
-- 
1.8.5.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH v14 07/14] block: Parse "backing" option to reference existing BDS
  2014-02-19 13:42 [Qemu-devel] [PATCH v14 00/14] Drop in_use from BlockDriverState and enable point-in-time snapshot exporting over NBD Fam Zheng
                   ` (5 preceding siblings ...)
  2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 06/14] block: Add backing_blocker in BlockDriverState Fam Zheng
@ 2014-02-19 13:42 ` Fam Zheng
  2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 08/14] block: Support dropping active in bdrv_drop_intermediate Fam Zheng
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 33+ messages in thread
From: Fam Zheng @ 2014-02-19 13:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, benoit.canet, rjones, armbru, imain, stefanha, pbonzini

Now it's safe to allow reference for backing_hd in the interface.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block.c | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/block.c b/block.c
index 95d8c1f..a2bf24c 100644
--- a/block.c
+++ b/block.c
@@ -1395,12 +1395,34 @@ int bdrv_open(BlockDriverState **pbs, const char *filename,
     /* If there is a backing file, use it */
     if ((flags & BDRV_O_NO_BACKING) == 0) {
         QDict *backing_options;
+        const char *backing_name;
+        BlockDriverState *backing_hd;
 
+        backing_name = qdict_get_try_str(options, "backing");
         qdict_extract_subqdict(options, &backing_options, "backing.");
-        ret = bdrv_open_backing_file(bs, backing_options, &local_err);
-        if (ret < 0) {
+
+        if (backing_name && qdict_size(backing_options)) {
+            error_setg(&local_err,
+                       "Option \"backing\" and \"backing.*\" cannot be "
+                       "used together");
+            ret = -EINVAL;
             goto close_and_fail;
         }
+        if (backing_name) {
+            backing_hd = bdrv_find(backing_name);
+            if (!backing_hd) {
+                error_set(&local_err, QERR_DEVICE_NOT_FOUND, backing_name);
+                ret = -ENOENT;
+                goto close_and_fail;
+            }
+            qdict_del(options, "backing");
+            bdrv_set_backing_hd(bs, backing_hd);
+        } else {
+            ret = bdrv_open_backing_file(bs, backing_options, &local_err);
+            if (ret < 0) {
+                goto close_and_fail;
+            }
+        }
     }
 
 done:
-- 
1.8.5.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH v14 08/14] block: Support dropping active in bdrv_drop_intermediate
  2014-02-19 13:42 [Qemu-devel] [PATCH v14 00/14] Drop in_use from BlockDriverState and enable point-in-time snapshot exporting over NBD Fam Zheng
                   ` (6 preceding siblings ...)
  2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 07/14] block: Parse "backing" option to reference existing BDS Fam Zheng
@ 2014-02-19 13:42 ` Fam Zheng
  2014-02-19 15:34   ` Benoît Canet
  2014-02-19 21:22   ` Jeff Cody
  2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 09/14] stream: Use bdrv_drop_intermediate and drop close_unused_images Fam Zheng
                   ` (5 subsequent siblings)
  13 siblings, 2 replies; 33+ messages in thread
From: Fam Zheng @ 2014-02-19 13:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, benoit.canet, rjones, armbru, imain, stefanha, pbonzini

Dropping intermediate could be useful both for commit and stream, and
BDS refcnt plus bdrv_swap could do most of the job nicely. It also needs
to work with op blockers.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block.c        | 146 +++++++++++++++++++++++++--------------------------------
 block/commit.c |   1 +
 2 files changed, 66 insertions(+), 81 deletions(-)

diff --git a/block.c b/block.c
index a2bf24c..cf41f3d 100644
--- a/block.c
+++ b/block.c
@@ -2485,115 +2485,99 @@ BlockDriverState *bdrv_find_overlay(BlockDriverState *active,
     return overlay;
 }
 
-typedef struct BlkIntermediateStates {
-    BlockDriverState *bs;
-    QSIMPLEQ_ENTRY(BlkIntermediateStates) entry;
-} BlkIntermediateStates;
-
-
 /*
- * Drops images above 'base' up to and including 'top', and sets the image
- * above 'top' to have base as its backing file.
+ * Drops images above 'base' up to and including 'top', and sets new 'base'
+ * as backing_hd of top_overlay (the image orignally has 'top' as backing
+ * file). top_overlay may be NULL if 'top' is active, no such update needed.
+ * Requires that the top_overlay to 'top' is opened r/w.
  *
- * Requires that the overlay to 'top' is opened r/w, so that the backing file
- * information in 'bs' can be properly updated.
+ * 1) This will convert the following chain:
  *
- * E.g., this will convert the following chain:
- * bottom <- base <- intermediate <- top <- active
+ *     ... <- base <- ... <- top <- overlay <-... <- active
  *
  * to
  *
- * bottom <- base <- active
+ *     ... <- base <- overlay <- active
  *
- * It is allowed for bottom==base, in which case it converts:
+ * 2) It is allowed for bottom==base, in which case it converts:
  *
- * base <- intermediate <- top <- active
+ *     base <- ... <- top <- overlay <- ... <- active
  *
  * to
  *
- * base <- active
+ *     base <- overlay <- active
+ *
+ * 2) It also allows active==top, in which case it converts:
+ *
+ *     ... <- base <- ... <- top (active)
+ *
+ * to
+ *
+ *     ... <- base == active == top
+ *
+ * i.e. only base and lower remains: *top == *base when return.
+ *
+ * 3) If base==NULL, it will drop all the BDS below overlay and set its
+ * backing_hd to NULL. I.e.:
+ *
+ *     base(NULL) <- ... <- overlay <- ... <- active
+ *
+ * to
  *
- * Error conditions:
- *  if active == top, that is considered an error
+ *     overlay <- ... <- active
  *
  */
 int bdrv_drop_intermediate(BlockDriverState *active, BlockDriverState *top,
                            BlockDriverState *base)
 {
-    BlockDriverState *intermediate;
-    BlockDriverState *base_bs = NULL;
-    BlockDriverState *new_top_bs = NULL;
-    BlkIntermediateStates *intermediate_state, *next;
-    int ret = -EIO;
-
-    QSIMPLEQ_HEAD(states_to_delete, BlkIntermediateStates) states_to_delete;
-    QSIMPLEQ_INIT(&states_to_delete);
-
-    if (!top->drv || !base->drv) {
-        goto exit;
-    }
-
-    new_top_bs = bdrv_find_overlay(active, top);
+    BlockDriverState *drop_start, *overlay;
+    int ret = -EINVAL;
 
-    if (new_top_bs == NULL) {
-        /* we could not find the image above 'top', this is an error */
+    if (!top->drv || (base && !base->drv)) {
         goto exit;
     }
-
-    /* special case of new_top_bs->backing_hd already pointing to base - nothing
-     * to do, no intermediate images */
-    if (new_top_bs->backing_hd == base) {
+    if (top == base) {
         ret = 0;
-        goto exit;
-    }
-
-    intermediate = top;
-
-    /* now we will go down through the list, and add each BDS we find
-     * into our deletion queue, until we hit the 'base'
-     */
-    while (intermediate) {
-        intermediate_state = g_malloc0(sizeof(BlkIntermediateStates));
-        intermediate_state->bs = intermediate;
-        QSIMPLEQ_INSERT_TAIL(&states_to_delete, intermediate_state, entry);
-
-        if (intermediate->backing_hd == base) {
-            base_bs = intermediate->backing_hd;
-            break;
+    } else if (top == active) {
+        assert(base);
+        drop_start = active->backing_hd;
+        bdrv_swap(active, base);
+        base->backing_hd = NULL;
+        bdrv_unref(drop_start);
+        ret = 0;
+    } else {
+        /* If there's an overlay, its backing_hd points to top's BDS now,
+         * the top image is dropped but this BDS structure is kept and swapped
+         * with base, this way we keep the pointers valid after dropping top */
+        overlay = bdrv_find_overlay(active, top);
+        if (!overlay) {
+            goto exit;
+        }
+        if (base) {
+            ret = bdrv_change_backing_file(overlay, base->filename,
+                                           base->drv->format_name);
+        } else {
+            ret = bdrv_change_backing_file(overlay, NULL, NULL);
+        }
+        if (ret) {
+            goto exit;
+        }
+        if (base) {
+            drop_start = top->backing_hd;
+            bdrv_swap(top, base);
+            /* Break the loop formed by bdrv_swap */
+            bdrv_set_backing_hd(base, NULL);
+        } else {
+            bdrv_set_backing_hd(overlay, NULL);
+            drop_start = top;
         }
-        intermediate = intermediate->backing_hd;
-    }
-    if (base_bs == NULL) {
-        /* something went wrong, we did not end at the base. safely
-         * unravel everything, and exit with error */
-        goto exit;
-    }
-
-    /* success - we can delete the intermediate states, and link top->base */
-    ret = bdrv_change_backing_file(new_top_bs, base_bs->filename,
-                                   base_bs->drv ? base_bs->drv->format_name : "");
-    if (ret) {
-        goto exit;
-    }
-    new_top_bs->backing_hd = base_bs;
-
-    bdrv_refresh_limits(new_top_bs);
 
-    QSIMPLEQ_FOREACH_SAFE(intermediate_state, &states_to_delete, entry, next) {
-        /* so that bdrv_close() does not recursively close the chain */
-        intermediate_state->bs->backing_hd = NULL;
-        bdrv_unref(intermediate_state->bs);
+        bdrv_unref(drop_start);
     }
-    ret = 0;
-
 exit:
-    QSIMPLEQ_FOREACH_SAFE(intermediate_state, &states_to_delete, entry, next) {
-        g_free(intermediate_state);
-    }
     return ret;
 }
 
-
 static int bdrv_check_byte_request(BlockDriverState *bs, int64_t offset,
                                    size_t size)
 {
diff --git a/block/commit.c b/block/commit.c
index acec4ac..b10eb79 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -142,6 +142,7 @@ wait:
     if (!block_job_is_cancelled(&s->common) && sector_num == end) {
         /* success */
         ret = bdrv_drop_intermediate(active, top, base);
+        base = top;
     }
 
 exit_free_buf:
-- 
1.8.5.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH v14 09/14] stream: Use bdrv_drop_intermediate and drop close_unused_images
  2014-02-19 13:42 [Qemu-devel] [PATCH v14 00/14] Drop in_use from BlockDriverState and enable point-in-time snapshot exporting over NBD Fam Zheng
                   ` (7 preceding siblings ...)
  2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 08/14] block: Support dropping active in bdrv_drop_intermediate Fam Zheng
@ 2014-02-19 13:42 ` Fam Zheng
  2014-02-19 21:23   ` Jeff Cody
  2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 10/14] qmp: Add command 'blockdev-backup' Fam Zheng
                   ` (4 subsequent siblings)
  13 siblings, 1 reply; 33+ messages in thread
From: Fam Zheng @ 2014-02-19 13:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, benoit.canet, rjones, armbru, imain, stefanha, pbonzini

This reuses the new bdrv_drop_intermediate.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block/stream.c | 30 +-----------------------------
 1 file changed, 1 insertion(+), 29 deletions(-)

diff --git a/block/stream.c b/block/stream.c
index dd0b4ac..9cdcf0e 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -51,34 +51,6 @@ static int coroutine_fn stream_populate(BlockDriverState *bs,
     return bdrv_co_copy_on_readv(bs, sector_num, nb_sectors, &qiov);
 }
 
-static void close_unused_images(BlockDriverState *top, BlockDriverState *base,
-                                const char *base_id)
-{
-    BlockDriverState *intermediate;
-    intermediate = top->backing_hd;
-
-    /* Must assign before bdrv_delete() to prevent traversing dangling pointer
-     * while we delete backing image instances.
-     */
-    top->backing_hd = base;
-
-    while (intermediate) {
-        BlockDriverState *unused;
-
-        /* reached base */
-        if (intermediate == base) {
-            break;
-        }
-
-        unused = intermediate;
-        intermediate = intermediate->backing_hd;
-        unused->backing_hd = NULL;
-        bdrv_unref(unused);
-    }
-
-    bdrv_refresh_limits(top);
-}
-
 static void coroutine_fn stream_run(void *opaque)
 {
     StreamBlockJob *s = opaque;
@@ -192,7 +164,7 @@ wait:
             }
         }
         ret = bdrv_change_backing_file(bs, base_id, base_fmt);
-        close_unused_images(bs, base, base_id);
+        bdrv_drop_intermediate(bs, bs->backing_hd, base);
     }
 
     qemu_vfree(buf);
-- 
1.8.5.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH v14 10/14] qmp: Add command 'blockdev-backup'
  2014-02-19 13:42 [Qemu-devel] [PATCH v14 00/14] Drop in_use from BlockDriverState and enable point-in-time snapshot exporting over NBD Fam Zheng
                   ` (8 preceding siblings ...)
  2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 09/14] stream: Use bdrv_drop_intermediate and drop close_unused_images Fam Zheng
@ 2014-02-19 13:42 ` Fam Zheng
  2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 11/14] block: Allow backup on referenced named BlockDriverState Fam Zheng
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 33+ messages in thread
From: Fam Zheng @ 2014-02-19 13:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, benoit.canet, rjones, armbru, imain, stefanha, pbonzini

Similar to drive-backup, but this command uses a device id as target
instead of creating/opening an image file.

Also add blocker on target bs, since the target is also a named device
now.

Add check and report error for bs == target which became possible but is
an illegal case with introduction of blockdev-backup.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block/backup.c   | 26 ++++++++++++++++++++++++++
 blockdev.c       | 47 +++++++++++++++++++++++++++++++++++++++++++++++
 qapi-schema.json | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
 qmp-commands.hx  | 44 ++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 166 insertions(+)

diff --git a/block/backup.c b/block/backup.c
index 15a2e55..ea46340 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -344,6 +344,7 @@ static void coroutine_fn backup_run(void *opaque)
     hbitmap_free(job->bitmap);
 
     bdrv_iostatus_disable(target);
+    bdrv_op_unblock_all(target, job->common.blocker);
     bdrv_unref(target);
 
     block_job_completed(&job->common, ret);
@@ -362,6 +363,11 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
     assert(target);
     assert(cb);
 
+    if (bs == target) {
+        error_setg(errp, "Source and target cannot be the same");
+        return;
+    }
+
     if ((on_source_error == BLOCKDEV_ON_ERROR_STOP ||
          on_source_error == BLOCKDEV_ON_ERROR_ENOSPC) &&
         !bdrv_iostatus_is_enabled(bs)) {
@@ -369,6 +375,24 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
         return;
     }
 
+    if (!bdrv_is_inserted(bs)) {
+        error_set(errp, QERR_DEVICE_HAS_NO_MEDIUM, bs->device_name);
+        return;
+    }
+
+    if (!bdrv_is_inserted(target)) {
+        error_set(errp, QERR_DEVICE_HAS_NO_MEDIUM, target->device_name);
+        return;
+    }
+
+    if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_BACKUP_SOURCE, errp)) {
+        return;
+    }
+
+    if (bdrv_op_is_blocked(target, BLOCK_OP_TYPE_BACKUP_TARGET, errp)) {
+        return;
+    }
+
     len = bdrv_getlength(bs);
     if (len < 0) {
         error_setg_errno(errp, -len, "unable to get length for '%s'",
@@ -382,6 +406,8 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
         return;
     }
 
+    bdrv_op_block_all(target, job->common.blocker);
+
     job->on_source_error = on_source_error;
     job->on_target_error = on_target_error;
     job->target = target;
diff --git a/blockdev.c b/blockdev.c
index 306aad5..890cfea 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1963,6 +1963,8 @@ void qmp_drive_backup(const char *device, const char *target,
         return;
     }
 
+    /* Although backup_run has this check too, we need to use bs->drv below, so
+     * do an early check redundantly. */
     if (!bdrv_is_inserted(bs)) {
         error_set(errp, QERR_DEVICE_HAS_NO_MEDIUM, device);
         return;
@@ -1979,6 +1981,7 @@ void qmp_drive_backup(const char *device, const char *target,
         }
     }
 
+    /* Early check to avoid creating target */
     if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_BACKUP_SOURCE, errp)) {
         return;
     }
@@ -2041,6 +2044,50 @@ BlockDeviceInfoList *qmp_query_named_block_nodes(Error **errp)
     return bdrv_named_nodes_list();
 }
 
+void qmp_blockdev_backup(const char *device, const char *target,
+                         enum MirrorSyncMode sync,
+                         bool has_speed, int64_t speed,
+                         bool has_on_source_error,
+                         BlockdevOnError on_source_error,
+                         bool has_on_target_error,
+                         BlockdevOnError on_target_error,
+                         Error **errp)
+{
+    BlockDriverState *bs;
+    BlockDriverState *target_bs;
+    Error *local_err = NULL;
+
+    if (!has_speed) {
+        speed = 0;
+    }
+    if (!has_on_source_error) {
+        on_source_error = BLOCKDEV_ON_ERROR_REPORT;
+    }
+    if (!has_on_target_error) {
+        on_target_error = BLOCKDEV_ON_ERROR_REPORT;
+    }
+
+    bs = bdrv_find(device);
+    if (!bs) {
+        error_set(errp, QERR_DEVICE_NOT_FOUND, device);
+        return;
+    }
+
+    target_bs = bdrv_find(target);
+    if (!target_bs) {
+        error_set(errp, QERR_DEVICE_NOT_FOUND, target);
+        return;
+    }
+
+    bdrv_ref(target_bs);
+    backup_start(bs, target_bs, speed, sync, on_source_error, on_target_error,
+                 block_job_cb, bs, &local_err);
+    if (local_err != NULL) {
+        bdrv_unref(target_bs);
+        error_propagate(errp, local_err);
+    }
+}
+
 #define DEFAULT_MIRROR_BUF_SIZE   (10 << 20)
 
 void qmp_drive_mirror(const char *device, const char *target,
diff --git a/qapi-schema.json b/qapi-schema.json
index 7cfb5e5..7b0be86 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1840,6 +1840,40 @@
             '*on-target-error': 'BlockdevOnError' } }
 
 ##
+# @BlockdevBackup
+#
+# @device: the name of the device which should be copied.
+#
+# @target: the name of the backup target device.
+#
+# @sync: what parts of the disk image should be copied to the destination
+#        (all the disk, only the sectors allocated in the topmost image, or
+#        only new I/O).
+#
+# @speed: #optional the maximum speed, in bytes per second.
+#
+# @on-source-error: #optional the action to take on an error on the source,
+#                   default 'report'.  'stop' and 'enospc' can only be used
+#                   if the block device supports io-status (see BlockInfo).
+#
+# @on-target-error: #optional the action to take on an error on the target,
+#                   default 'report' (no limitations, since this applies to
+#                   a different block device than @device).
+#
+# Note that @on-source-error and @on-target-error only affect background I/O.
+# If an error occurs during a guest write request, the device's rerror/werror
+# actions will be used.
+#
+# Since: 2.0
+##
+{ 'type': 'BlockdevBackup',
+  'data': { 'device': 'str', 'target': 'str',
+            'sync': 'MirrorSyncMode',
+            '*speed': 'int',
+            '*on-source-error': 'BlockdevOnError',
+            '*on-target-error': 'BlockdevOnError' } }
+
+##
 # @Abort
 #
 # This action can be used to test transaction failure.
@@ -2048,6 +2082,21 @@
 { 'command': 'query-named-block-nodes', 'returns': [ 'BlockDeviceInfo' ] }
 
 ##
+# @blockdev-backup
+#
+# Block device version for drive-backup command. Use existing device as target
+# of backup.
+#
+# For the arguments, see the documentation of BlockdevBackup.
+#
+# Returns: Nothing on success.
+#          If @device or @target is not a valid block device, DeviceNotFound.
+#
+# Since 2.0
+##
+{ 'command': 'blockdev-backup', 'data': 'BlockdevBackup' }
+
+##
 # @drive-mirror
 #
 # Start mirroring a block device's writes to a new destination.
diff --git a/qmp-commands.hx b/qmp-commands.hx
index cce6b81..c3ee46a 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -1054,6 +1054,50 @@ Example:
                                                "sync": "full",
                                                "target": "backup.img" } }
 <- { "return": {} }
+
+EQMP
+
+    {
+        .name       = "blockdev-backup",
+        .args_type  = "sync:s,device:B,target:B,speed:i?,"
+                      "on-source-error:s?,on-target-error:s?",
+        .mhandler.cmd_new = qmp_marshal_input_blockdev_backup,
+    },
+
+SQMP
+blockdev-backup
+------------
+
+The device version of drive-backup: this command takes a existing named device
+as backup target.
+
+Arguments:
+
+- "device": the name of the device which should be copied.
+            (json-string)
+- "target": the target of the new image. If the file exists, or if it is a
+            device, the existing file/device will be used as the new
+            destination.  If it does not exist, a new file will be created.
+            (json-string)
+- "sync": what parts of the disk image should be copied to the destination;
+          possibilities include "full" for all the disk, "top" for only the
+          sectors allocated in the topmost image, or "none" to only replicate
+          new I/O (MirrorSyncMode).
+- "speed": the maximum speed, in bytes per second (json-int, optional)
+- "on-source-error": the action to take on an error on the source, default
+                     'report'.  'stop' and 'enospc' can only be used
+                     if the block device supports io-status.
+                     (BlockdevOnError, optional)
+- "on-target-error": the action to take on an error on the target, default
+                     'report' (no limitations, since this applies to
+                     a different block device than device).
+                     (BlockdevOnError, optional)
+
+Example:
+-> { "execute": "blockdev-backup", "arguments": { "device": "src-id",
+                                                  "target": "tgt-id" } }
+<- { "return": {} }
+
 EQMP
 
     {
-- 
1.8.5.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH v14 11/14] block: Allow backup on referenced named BlockDriverState
  2014-02-19 13:42 [Qemu-devel] [PATCH v14 00/14] Drop in_use from BlockDriverState and enable point-in-time snapshot exporting over NBD Fam Zheng
                   ` (9 preceding siblings ...)
  2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 10/14] qmp: Add command 'blockdev-backup' Fam Zheng
@ 2014-02-19 13:42 ` Fam Zheng
  2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 12/14] block: Add blockdev-backup to transaction Fam Zheng
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 33+ messages in thread
From: Fam Zheng @ 2014-02-19 13:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, benoit.canet, rjones, armbru, imain, stefanha, pbonzini

Drive backup is a read only operation on source bs. We want to allow
this specific case to enable image-fleecing. Note that when
image-fleecing job starts, the job still add its blocker to source bs,
and any other operation on it will be blocked by that.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/block.c b/block.c
index cf41f3d..1af43b9 100644
--- a/block.c
+++ b/block.c
@@ -1071,6 +1071,8 @@ void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd)
     /* Otherwise we won't be able to commit due to check in bdrv_commit */
     bdrv_op_unblock(bs->backing_hd, BLOCK_OP_TYPE_COMMIT,
                     bs->backing_blocker);
+    bdrv_op_unblock(bs->backing_hd, BLOCK_OP_TYPE_BACKUP_SOURCE,
+                    bs->backing_blocker);
 out:
     bdrv_refresh_limits(bs);
 }
-- 
1.8.5.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH v14 12/14] block: Add blockdev-backup to transaction
  2014-02-19 13:42 [Qemu-devel] [PATCH v14 00/14] Drop in_use from BlockDriverState and enable point-in-time snapshot exporting over NBD Fam Zheng
                   ` (10 preceding siblings ...)
  2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 11/14] block: Allow backup on referenced named BlockDriverState Fam Zheng
@ 2014-02-19 13:42 ` Fam Zheng
  2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 13/14] qemu-iotests: Test blockdev-backup in 055 Fam Zheng
  2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 14/14] qemu-iotests: Image fleecing test case 081 Fam Zheng
  13 siblings, 0 replies; 33+ messages in thread
From: Fam Zheng @ 2014-02-19 13:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, benoit.canet, rjones, armbru, imain, stefanha, pbonzini

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 blockdev.c       | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
 qapi-schema.json |  1 +
 2 files changed, 49 insertions(+)

diff --git a/blockdev.c b/blockdev.c
index 890cfea..81446f5 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1405,6 +1405,49 @@ static void drive_backup_abort(BlkTransactionState *common)
     }
 }
 
+typedef struct BlockdevBackupState {
+    BlkTransactionState common;
+    BlockDriverState *bs;
+    BlockJob *job;
+} BlockdevBackupState;
+
+static void blockdev_backup_prepare(BlkTransactionState *common, Error **errp)
+{
+    BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, common);
+    BlockdevBackup *backup;
+    Error *local_err = NULL;
+
+    assert(common->action->kind == TRANSACTION_ACTION_KIND_BLOCKDEV_BACKUP);
+    backup = common->action->blockdev_backup;
+
+    qmp_blockdev_backup(backup->device, backup->target,
+                        backup->sync,
+                        backup->has_speed, backup->speed,
+                        backup->has_on_source_error, backup->on_source_error,
+                        backup->has_on_target_error, backup->on_target_error,
+                        &local_err);
+    if (error_is_set(&local_err)) {
+        error_propagate(errp, local_err);
+        state->bs = NULL;
+        state->job = NULL;
+        return;
+    }
+
+    state->bs = bdrv_find(backup->device);
+    state->job = state->bs->job;
+}
+
+static void blockdev_backup_abort(BlkTransactionState *common)
+{
+    BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, common);
+    BlockDriverState *bs = state->bs;
+
+    /* Only cancel if it's the job we started */
+    if (bs && bs->job && bs->job == state->job) {
+        block_job_cancel_sync(bs->job);
+    }
+}
+
 static void abort_prepare(BlkTransactionState *common, Error **errp)
 {
     error_setg(errp, "Transaction aborted using Abort action");
@@ -1427,6 +1470,11 @@ static const BdrvActionOps actions[] = {
         .prepare = drive_backup_prepare,
         .abort = drive_backup_abort,
     },
+    [TRANSACTION_ACTION_KIND_BLOCKDEV_BACKUP] = {
+        .instance_size = sizeof(BlockdevBackupState),
+        .prepare = blockdev_backup_prepare,
+        .abort = blockdev_backup_abort,
+    },
     [TRANSACTION_ACTION_KIND_ABORT] = {
         .instance_size = sizeof(BlkTransactionState),
         .prepare = abort_prepare,
diff --git a/qapi-schema.json b/qapi-schema.json
index 7b0be86..a00f4d5 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1893,6 +1893,7 @@
   'data': {
        'blockdev-snapshot-sync': 'BlockdevSnapshot',
        'drive-backup': 'DriveBackup',
+       'blockdev-backup': 'BlockdevBackup',
        'abort': 'Abort',
        'blockdev-snapshot-internal-sync': 'BlockdevSnapshotInternal'
    } }
-- 
1.8.5.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH v14 13/14] qemu-iotests: Test blockdev-backup in 055
  2014-02-19 13:42 [Qemu-devel] [PATCH v14 00/14] Drop in_use from BlockDriverState and enable point-in-time snapshot exporting over NBD Fam Zheng
                   ` (11 preceding siblings ...)
  2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 12/14] block: Add blockdev-backup to transaction Fam Zheng
@ 2014-02-19 13:42 ` Fam Zheng
  2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 14/14] qemu-iotests: Image fleecing test case 081 Fam Zheng
  13 siblings, 0 replies; 33+ messages in thread
From: Fam Zheng @ 2014-02-19 13:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, benoit.canet, rjones, armbru, imain, stefanha, pbonzini

This applies cases on drive-backup on blockdev-backup, except cases with
target format and mode.

Also add a case to check source == target.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 tests/qemu-iotests/055     | 275 ++++++++++++++++++++++++++++++++++++++-------
 tests/qemu-iotests/055.out |   4 +-
 2 files changed, 235 insertions(+), 44 deletions(-)

diff --git a/tests/qemu-iotests/055 b/tests/qemu-iotests/055
index 451b67d..1fab088 100755
--- a/tests/qemu-iotests/055
+++ b/tests/qemu-iotests/055
@@ -1,6 +1,6 @@
 #!/usr/bin/env python
 #
-# Tests for drive-backup
+# Tests for drive-backup and blockdev-backup
 #
 # Copyright (C) 2013 Red Hat, Inc.
 #
@@ -27,6 +27,7 @@ from iotests import qemu_img, qemu_io
 
 test_img = os.path.join(iotests.test_dir, 'test.img')
 target_img = os.path.join(iotests.test_dir, 'target.img')
+blockdev_target_img = os.path.join(iotests.test_dir, 'blockdev-target.img')
 
 class TestSingleDrive(iotests.QMPTestCase):
     image_len = 64 * 1024 * 1024 # MB
@@ -38,34 +39,48 @@ class TestSingleDrive(iotests.QMPTestCase):
         qemu_io('-c', 'write -P0xd5 1M 32k', test_img)
         qemu_io('-c', 'write -P0xdc 32M 124k', test_img)
         qemu_io('-c', 'write -P0xdc 67043328 64k', test_img)
+        qemu_img('create', '-f', iotests.imgfmt, blockdev_target_img, str(TestSingleDrive.image_len))
 
-        self.vm = iotests.VM().add_drive(test_img)
+        self.vm = iotests.VM().add_drive(test_img).add_drive(blockdev_target_img)
         self.vm.launch()
 
     def tearDown(self):
         self.vm.shutdown()
         os.remove(test_img)
+        os.remove(blockdev_target_img)
         try:
             os.remove(target_img)
         except OSError:
             pass
 
-    def test_cancel(self):
+    def do_test_cancel(self, test_drive_backup):
         self.assert_no_active_block_jobs()
 
-        result = self.vm.qmp('drive-backup', device='drive0',
-                             target=target_img, sync='full')
+        if test_drive_backup:
+            result = self.vm.qmp('drive-backup', device='drive0',
+                                 target=target_img, sync='full')
+        else:
+            result = self.vm.qmp('blockdev-backup', device='drive0',
+                                 target='drive1', sync='full')
         self.assert_qmp(result, 'return', {})
 
         event = self.cancel_and_wait()
         self.assert_qmp(event, 'data/type', 'backup')
 
-    def test_pause(self):
+    def test_cancel(self):
+        self.do_test_cancel(True)
+        self.do_test_cancel(False)
+
+    def do_test_pause(self, test_drive_backup):
         self.assert_no_active_block_jobs()
 
         self.vm.pause_drive('drive0')
-        result = self.vm.qmp('drive-backup', device='drive0',
-                             target=target_img, sync='full')
+        if test_drive_backup:
+            result = self.vm.qmp('drive-backup', device='drive0',
+                                 target=target_img, sync='full')
+        else:
+            result = self.vm.qmp('blockdev-backup', device='drive0',
+                                 target='drive1', sync='full')
         self.assert_qmp(result, 'return', {})
 
         result = self.vm.qmp('block-job-pause', device='drive0')
@@ -86,14 +101,28 @@ class TestSingleDrive(iotests.QMPTestCase):
         self.wait_until_completed()
 
         self.vm.shutdown()
-        self.assertTrue(iotests.compare_images(test_img, target_img),
-                        'target image does not match source after backup')
+        if test_drive_backup:
+            self.assertTrue(iotests.compare_images(test_img, target_img),
+                            'target image does not match source after backup')
+        else:
+            self.assertTrue(iotests.compare_images(test_img, blockdev_target_img),
+                            'target image does not match source after backup')
+
+    def test_pause_drive_backup(self):
+        self.do_test_pause(True)
+
+    def test_pause_blockdev_backup(self):
+        self.do_test_pause(False)
 
     def test_medium_not_found(self):
         result = self.vm.qmp('drive-backup', device='ide1-cd0',
                              target=target_img, sync='full')
         self.assert_qmp(result, 'error/class', 'GenericError')
 
+        result = self.vm.qmp('blockdev-backup', device='ide1-cd0',
+                             target='drive1', sync='full')
+        self.assert_qmp(result, 'error/class', 'GenericError')
+
     def test_image_not_found(self):
         result = self.vm.qmp('drive-backup', device='drive0',
                              target=target_img, sync='full', mode='existing')
@@ -110,26 +139,56 @@ class TestSingleDrive(iotests.QMPTestCase):
                              target=target_img, sync='full')
         self.assert_qmp(result, 'error/class', 'DeviceNotFound')
 
+        result = self.vm.qmp('blockdev-backup', device='nonexistent',
+                             target='drive0', sync='full')
+        self.assert_qmp(result, 'error/class', 'DeviceNotFound')
+
+        result = self.vm.qmp('blockdev-backup', device='drive0',
+                             target='nonexistent', sync='full')
+        self.assert_qmp(result, 'error/class', 'DeviceNotFound')
+
+        result = self.vm.qmp('blockdev-backup', device='nonexistent',
+                             target='nonexistent', sync='full')
+        self.assert_qmp(result, 'error/class', 'DeviceNotFound')
+
+    def test_target_is_source(self):
+        result = self.vm.qmp('blockdev-backup', device='drive0',
+                             target='drive0', sync='full')
+        self.assert_qmp(result, 'error/class', 'GenericError')
+
 class TestSetSpeed(iotests.QMPTestCase):
     image_len = 80 * 1024 * 1024 # MB
 
     def setUp(self):
         qemu_img('create', '-f', iotests.imgfmt, test_img, str(TestSetSpeed.image_len))
         qemu_io('-c', 'write -P1 0 512', test_img)
-        self.vm = iotests.VM().add_drive(test_img)
+        qemu_img('create', '-f', iotests.imgfmt, blockdev_target_img, str(TestSingleDrive.image_len))
+
+        self.vm = iotests.VM().add_drive(test_img).add_drive(blockdev_target_img)
         self.vm.launch()
 
     def tearDown(self):
         self.vm.shutdown()
         os.remove(test_img)
-        os.remove(target_img)
+        try:
+            os.remove(blockdev_target_img)
+        except OSError:
+            pass
+        try:
+            os.remove(target_img)
+        except OSError:
+            pass
 
-    def test_set_speed(self):
+    def do_test_set_speed(self, test_drive_backup):
         self.assert_no_active_block_jobs()
 
         self.vm.pause_drive('drive0')
-        result = self.vm.qmp('drive-backup', device='drive0',
-                             target=target_img, sync='full')
+        if test_drive_backup:
+            result = self.vm.qmp('drive-backup', device='drive0',
+                                 target=target_img, sync='full')
+        else:
+            result = self.vm.qmp('blockdev-backup', device='drive0',
+                                 target='drive1', sync='full')
         self.assert_qmp(result, 'return', {})
 
         # Default speed is 0
@@ -148,10 +207,14 @@ class TestSetSpeed(iotests.QMPTestCase):
         event = self.cancel_and_wait(resume=True)
         self.assert_qmp(event, 'data/type', 'backup')
 
-        # Check setting speed in drive-backup works
+        # Check setting speed option works
         self.vm.pause_drive('drive0')
-        result = self.vm.qmp('drive-backup', device='drive0',
-                             target=target_img, sync='full', speed=4*1024*1024)
+        if test_drive_backup:
+            result = self.vm.qmp('drive-backup', device='drive0',
+                                 target=target_img, sync='full', speed=4*1024*1024)
+        else:
+            result = self.vm.qmp('blockdev-backup', device='drive0',
+                                 target='drive1', sync='full', speed=4*1024*1024)
         self.assert_qmp(result, 'return', {})
 
         result = self.vm.qmp('query-block-jobs')
@@ -161,18 +224,32 @@ class TestSetSpeed(iotests.QMPTestCase):
         event = self.cancel_and_wait(resume=True)
         self.assert_qmp(event, 'data/type', 'backup')
 
-    def test_set_speed_invalid(self):
+    def test_set_speed_drive_backup(self):
+        self.do_test_set_speed(True)
+
+    def test_set_speed_blockdev_backup(self):
+        self.do_test_set_speed(False)
+
+    def do_test_set_speed_invalid(self, test_drive_backup):
         self.assert_no_active_block_jobs()
 
-        result = self.vm.qmp('drive-backup', device='drive0',
-                             target=target_img, sync='full', speed=-1)
+        if test_drive_backup:
+            result = self.vm.qmp('drive-backup', device='drive0',
+                                 target=target_img, sync='full', speed=-1)
+        else:
+            result = self.vm.qmp('blockdev-backup', device='drive0',
+                                 target='drive1', sync='full', speed=-1)
         self.assert_qmp(result, 'error/class', 'GenericError')
 
         self.assert_no_active_block_jobs()
 
         self.vm.pause_drive('drive0')
-        result = self.vm.qmp('drive-backup', device='drive0',
-                             target=target_img, sync='full')
+        if test_drive_backup:
+            result = self.vm.qmp('drive-backup', device='drive0',
+                                 target=target_img, sync='full')
+        else:
+            result = self.vm.qmp('blockdev-backup', device='drive0',
+                                 target='drive1', sync='full')
         self.assert_qmp(result, 'return', {})
 
         result = self.vm.qmp('block-job-set-speed', device='drive0', speed=-1)
@@ -181,6 +258,12 @@ class TestSetSpeed(iotests.QMPTestCase):
         event = self.cancel_and_wait(resume=True)
         self.assert_qmp(event, 'data/type', 'backup')
 
+    def test_set_speed_invalid_drive_backup(self):
+        self.do_test_set_speed_invalid(True)
+
+    def test_set_speed_invalid_blockdev_backup(self):
+        self.do_test_set_speed_invalid(False)
+
 class TestSingleTransaction(iotests.QMPTestCase):
     image_len = 64 * 1024 * 1024 # MB
 
@@ -190,44 +273,71 @@ class TestSingleTransaction(iotests.QMPTestCase):
         qemu_io('-c', 'write -P0xd5 1M 32k', test_img)
         qemu_io('-c', 'write -P0xdc 32M 124k', test_img)
         qemu_io('-c', 'write -P0xdc 67043328 64k', test_img)
+        qemu_img('create', '-f', iotests.imgfmt, blockdev_target_img, str(TestSingleDrive.image_len))
 
-        self.vm = iotests.VM().add_drive(test_img)
+        self.vm = iotests.VM().add_drive(test_img).add_drive(blockdev_target_img)
         self.vm.launch()
 
     def tearDown(self):
         self.vm.shutdown()
         os.remove(test_img)
+        os.remove(blockdev_target_img)
         try:
             os.remove(target_img)
         except OSError:
             pass
 
-    def test_cancel(self):
+    def do_test_cancel(self, test_drive_backup):
         self.assert_no_active_block_jobs()
 
-        result = self.vm.qmp('transaction', actions=[{
-                'type': 'drive-backup',
-                'data': { 'device': 'drive0',
-                          'target': target_img,
-                          'sync': 'full' },
-            }
-        ])
+        if test_drive_backup:
+            result = self.vm.qmp('transaction', actions=[{
+                    'type': 'drive-backup',
+                    'data': { 'device': 'drive0',
+                              'target': target_img,
+                              'sync': 'full' },
+                }
+            ])
+        else:
+            result = self.vm.qmp('transaction', actions=[{
+                    'type': 'blockdev-backup',
+                    'data': { 'device': 'drive0',
+                              'target': 'drive1',
+                              'sync': 'full' },
+                }
+            ])
+
         self.assert_qmp(result, 'return', {})
 
         event = self.cancel_and_wait()
         self.assert_qmp(event, 'data/type', 'backup')
 
-    def test_pause(self):
+    def test_cancel_drive_backup(self):
+        self.do_test_cancel(True)
+
+    def test_cancel_blockdev_backup(self):
+        self.do_test_cancel(False)
+
+    def do_test_pause(self, test_drive_backup):
         self.assert_no_active_block_jobs()
 
         self.vm.pause_drive('drive0')
-        result = self.vm.qmp('transaction', actions=[{
-                'type': 'drive-backup',
-                'data': { 'device': 'drive0',
-                          'target': target_img,
-                          'sync': 'full' },
-            }
-        ])
+        if test_drive_backup:
+            result = self.vm.qmp('transaction', actions=[{
+                    'type': 'drive-backup',
+                    'data': { 'device': 'drive0',
+                              'target': target_img,
+                              'sync': 'full' },
+                }
+            ])
+        else:
+            result = self.vm.qmp('transaction', actions=[{
+                    'type': 'blockdev-backup',
+                    'data': { 'device': 'drive0',
+                              'target': 'drive1',
+                              'sync': 'full' },
+                }
+            ])
         self.assert_qmp(result, 'return', {})
 
         result = self.vm.qmp('block-job-pause', device='drive0')
@@ -248,8 +358,18 @@ class TestSingleTransaction(iotests.QMPTestCase):
         self.wait_until_completed()
 
         self.vm.shutdown()
-        self.assertTrue(iotests.compare_images(test_img, target_img),
-                        'target image does not match source after backup')
+        if test_drive_backup:
+            self.assertTrue(iotests.compare_images(test_img, target_img),
+                            'target image does not match source after backup')
+        else:
+            self.assertTrue(iotests.compare_images(test_img, blockdev_target_img),
+                            'target image does not match source after backup')
+
+    def test_pause_drive_backup(self):
+        self.do_test_pause(True)
+
+    def test_pause_blockdev_backup(self):
+        self.do_test_pause(False)
 
     def test_medium_not_found(self):
         result = self.vm.qmp('transaction', actions=[{
@@ -260,6 +380,14 @@ class TestSingleTransaction(iotests.QMPTestCase):
             }
         ])
         self.assert_qmp(result, 'error/class', 'GenericError')
+        result = self.vm.qmp('transaction', actions=[{
+                'type': 'blockdev-backup',
+                'data': { 'device': 'ide1-cd0',
+                          'target': 'drive1',
+                          'sync': 'full' },
+            }
+        ])
+        self.assert_qmp(result, 'error/class', 'GenericError')
 
     def test_image_not_found(self):
         result = self.vm.qmp('transaction', actions=[{
@@ -283,6 +411,43 @@ class TestSingleTransaction(iotests.QMPTestCase):
         ])
         self.assert_qmp(result, 'error/class', 'DeviceNotFound')
 
+        result = self.vm.qmp('transaction', actions=[{
+                'type': 'blockdev-backup',
+                'data': { 'device': 'nonexistent',
+                          'target': 'drive1',
+                          'sync': 'full' },
+            }
+        ])
+        self.assert_qmp(result, 'error/class', 'DeviceNotFound')
+
+        result = self.vm.qmp('transaction', actions=[{
+                'type': 'blockdev-backup',
+                'data': { 'device': 'drive0',
+                          'target': 'nonexistent',
+                          'sync': 'full' },
+            }
+        ])
+        self.assert_qmp(result, 'error/class', 'DeviceNotFound')
+
+        result = self.vm.qmp('transaction', actions=[{
+                'type': 'blockdev-backup',
+                'data': { 'device': 'nonexistent',
+                          'target': 'nonexistent',
+                          'sync': 'full' },
+            }
+        ])
+        self.assert_qmp(result, 'error/class', 'DeviceNotFound')
+
+    def test_target_is_source(self):
+        result = self.vm.qmp('transaction', actions=[{
+                'type': 'blockdev-backup',
+                'data': { 'device': 'drive0',
+                          'target': 'drive0',
+                          'sync': 'full' },
+            }
+        ])
+        self.assert_qmp(result, 'error/class', 'GenericError')
+
     def test_abort(self):
         result = self.vm.qmp('transaction', actions=[{
                 'type': 'drive-backup',
@@ -298,5 +463,31 @@ class TestSingleTransaction(iotests.QMPTestCase):
         self.assert_qmp(result, 'error/class', 'GenericError')
         self.assert_no_active_block_jobs()
 
+        result = self.vm.qmp('transaction', actions=[{
+                'type': 'blockdev-backup',
+                'data': { 'device': 'nonexistent',
+                          'target': 'drive1',
+                          'sync': 'full' },
+            }, {
+                'type': 'Abort',
+                'data': {},
+            }
+        ])
+        self.assert_qmp(result, 'error/class', 'GenericError')
+        self.assert_no_active_block_jobs()
+
+        result = self.vm.qmp('transaction', actions=[{
+                'type': 'blockdev-backup',
+                'data': { 'device': 'drive0',
+                          'target': 'nonexistent',
+                          'sync': 'full' },
+            }, {
+                'type': 'Abort',
+                'data': {},
+            }
+        ])
+        self.assert_qmp(result, 'error/class', 'GenericError')
+        self.assert_no_active_block_jobs()
+
 if __name__ == '__main__':
     iotests.main(supported_fmts=['raw', 'qcow2'])
diff --git a/tests/qemu-iotests/055.out b/tests/qemu-iotests/055.out
index 6323079..c6a10f8 100644
--- a/tests/qemu-iotests/055.out
+++ b/tests/qemu-iotests/055.out
@@ -1,5 +1,5 @@
-..............
+.....................
 ----------------------------------------------------------------------
-Ran 14 tests
+Ran 21 tests
 
 OK
-- 
1.8.5.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH v14 14/14] qemu-iotests: Image fleecing test case 081
  2014-02-19 13:42 [Qemu-devel] [PATCH v14 00/14] Drop in_use from BlockDriverState and enable point-in-time snapshot exporting over NBD Fam Zheng
                   ` (12 preceding siblings ...)
  2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 13/14] qemu-iotests: Test blockdev-backup in 055 Fam Zheng
@ 2014-02-19 13:42 ` Fam Zheng
  13 siblings, 0 replies; 33+ messages in thread
From: Fam Zheng @ 2014-02-19 13:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, benoit.canet, rjones, armbru, imain, stefanha, pbonzini

This tests the workflow of creating a lightweight point-in-time snapshot
with blockdev-backup command and export it with built-in NBD server.

It's tested that after the snapshot it created, writing to the original
device doesn't change data that can be read from target with NBD.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 tests/qemu-iotests/081     | 99 ++++++++++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/081.out |  5 +++
 tests/qemu-iotests/group   |  1 +
 3 files changed, 105 insertions(+)
 create mode 100755 tests/qemu-iotests/081
 create mode 100644 tests/qemu-iotests/081.out

diff --git a/tests/qemu-iotests/081 b/tests/qemu-iotests/081
new file mode 100755
index 0000000..8be32d7
--- /dev/null
+++ b/tests/qemu-iotests/081
@@ -0,0 +1,99 @@
+#!/usr/bin/env python
+#
+# Tests for image fleecing (point in time snapshot export to NBD)
+#
+# Copyright (C) 2014 Red Hat, Inc.
+#
+# Based on 055.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+import time
+import os
+import iotests
+from iotests import qemu_img, qemu_io
+
+test_img = os.path.join(iotests.test_dir, 'test.img')
+target_img = os.path.join(iotests.test_dir, 'target.img')
+nbd_sock = os.path.join(iotests.test_dir, 'nbd.sock')
+
+class TestImageFleecing(iotests.QMPTestCase):
+    image_len = 64 * 1024 * 1024 # MB
+
+    def setUp(self):
+        # Write data to the image so we can compare later
+        qemu_img('create', '-f', iotests.imgfmt, test_img, str(TestImageFleecing.image_len))
+        self.patterns = [
+                ("0x5d", "0", "64k"),
+                ("0xd5", "1M", "64k"),
+                ("0xdc", "32M", "64k"),
+                ("0xdc", "67043328", "64k")]
+
+        for p in self.patterns:
+            qemu_io('-c', 'write -P%s %s %s' % p, test_img)
+
+        qemu_img('create', '-f', iotests.imgfmt, target_img, str(TestImageFleecing.image_len))
+
+        self.vm = iotests.VM().add_drive(test_img)
+        self.vm.launch()
+
+        self.overwrite_patterns = [
+                ("0xa0", "0", "64k"),
+                ("0x0a", "1M", "64k"),
+                ("0x55", "32M", "64k"),
+                ("0x56", "67043328", "64k")]
+
+        self.nbd_uri = "nbd+unix:///drive1?socket=%s" % nbd_sock
+
+    def tearDown(self):
+        self.vm.shutdown()
+        os.remove(test_img)
+        os.remove(target_img)
+
+    def verify_patterns(self):
+        for p in self.patterns:
+            self.assertEqual(-1, qemu_io(self.nbd_uri, '-c', 'read -P%s %s %s' % p).find("verification failed"),
+                             "Failed to verify pattern: %s %s %s" % p)
+
+    def test_image_fleecing(self):
+        result = self.vm.qmp("blockdev-add", **{"options": {
+            "driver": "qcow2",
+            "id": "drive1",
+            "file": {
+                "driver": "file",
+                "filename": target_img,
+                },
+            "backing": "drive0",
+            }})
+        self.assert_qmp(result, 'return', {})
+        result = self.vm.qmp("nbd-server-start", **{"addr": { "type": "unix", "data": { "path": nbd_sock } } })
+        self.assert_qmp(result, 'return', {})
+        result = self.vm.qmp("blockdev-backup", device="drive0", target="drive1", sync="none")
+        self.assert_qmp(result, 'return', {})
+        result = self.vm.qmp("nbd-server-add", device="drive1")
+        self.assert_qmp(result, 'return', {})
+
+        self.verify_patterns()
+
+        for p in self.overwrite_patterns:
+            self.vm.hmp_qemu_io("drive0", "write -P%s %s %s" % p)
+
+        self.verify_patterns()
+
+        self.cancel_and_wait(resume=True)
+        self.assert_no_active_block_jobs()
+
+if __name__ == '__main__':
+    iotests.main(supported_fmts=['raw', 'qcow2'])
diff --git a/tests/qemu-iotests/081.out b/tests/qemu-iotests/081.out
new file mode 100644
index 0000000..ae1213e
--- /dev/null
+++ b/tests/qemu-iotests/081.out
@@ -0,0 +1,5 @@
+.
+----------------------------------------------------------------------
+Ran 1 tests
+
+OK
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index d8be74a..fe5d764 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -83,3 +83,4 @@
 074 rw auto
 077 rw auto
 079 rw auto
+081 rw auto
-- 
1.8.5.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH v14 01/14] block: Add BlockOpType enum
  2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 01/14] block: Add BlockOpType enum Fam Zheng
@ 2014-02-19 15:25   ` Benoît Canet
  0 siblings, 0 replies; 33+ messages in thread
From: Benoît Canet @ 2014-02-19 15:25 UTC (permalink / raw)
  To: Fam Zheng
  Cc: kwolf, benoit.canet, armbru, qemu-devel, rjones, imain, stefanha,
	pbonzini

The Wednesday 19 Feb 2014 à 21:42:18 (+0800), Fam Zheng wrote :
> This adds the enum of all the operations that can be taken on a block
> device.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  include/block/block.h | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/include/block/block.h b/include/block/block.h
> index 780f48b..8820735 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h
> @@ -154,6 +154,25 @@ typedef struct BDRVReopenState {
>      void *opaque;
>  } BDRVReopenState;
>  
> +/*
> + * Block operation types
> + */
> +typedef enum BlockOpType {
> +    BLOCK_OP_TYPE_BACKUP_SOURCE,
> +    BLOCK_OP_TYPE_BACKUP_TARGET,
> +    BLOCK_OP_TYPE_CHANGE,
> +    BLOCK_OP_TYPE_COMMIT,
> +    BLOCK_OP_TYPE_DATAPLANE,
> +    BLOCK_OP_TYPE_DRIVE_DEL,
> +    BLOCK_OP_TYPE_EJECT,
> +    BLOCK_OP_TYPE_EXTERNAL_SNAPSHOT,
> +    BLOCK_OP_TYPE_INTERNAL_SNAPSHOT,
> +    BLOCK_OP_TYPE_INTERNAL_SNAPSHOT_DELETE,
> +    BLOCK_OP_TYPE_MIRROR,
> +    BLOCK_OP_TYPE_RESIZE,
> +    BLOCK_OP_TYPE_STREAM,
> +    BLOCK_OP_TYPE_MAX,
> +} BlockOpType;
>  
>  void bdrv_iostatus_enable(BlockDriverState *bs);
>  void bdrv_iostatus_reset(BlockDriverState *bs);
> -- 
> 1.8.5.4
> 
Reviewed-by: Benoit Canet <benoit@irqsave.net>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH v14 02/14] block: Introduce op_blockers to BlockDriverState
  2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 02/14] block: Introduce op_blockers to BlockDriverState Fam Zheng
@ 2014-02-19 15:26   ` Benoît Canet
  0 siblings, 0 replies; 33+ messages in thread
From: Benoît Canet @ 2014-02-19 15:26 UTC (permalink / raw)
  To: Fam Zheng
  Cc: kwolf, benoit.canet, armbru, qemu-devel, rjones, imain, stefanha,
	pbonzini

The Wednesday 19 Feb 2014 à 21:42:19 (+0800), Fam Zheng wrote :
> BlockDriverState.op_blockers is an array of lists with BLOCK_OP_TYPE_MAX
> elements. Each list is a list of blockers of an operation type
> (BlockOpType), that marks this BDS as currently blocked for a certain
> type of operation with reason errors stored in the list. The rule of
> usage is:
> 
>  * BDS user who wants to take an operation should check if there's any
>    blocker of the type with bdrv_op_is_blocked().
> 
>  * BDS user who wants to block certain types of operation, should call
>    bdrv_op_block (or bdrv_op_block_all to block all types of operations,
>    which is similar to the existing bdrv_set_in_use()).
> 
>  * A blocker is only referenced by op_blockers, so the lifecycle is
>    managed by caller, and shouldn't be lost until unblock, so typically
>    a caller does these:
> 
>    - Allocate a blocker with error_setg or similar, call bdrv_op_block()
>      to block some operations.
>    - Hold the blocker, do his job.
>    - Unblock operations that it blocked, with the same reason pointer
>      passed to bdrv_op_unblock().
>    - Release the blocker with error_free().
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  block.c                   | 75 +++++++++++++++++++++++++++++++++++++++++++++++
>  include/block/block.h     |  7 +++++
>  include/block/block_int.h |  5 ++++
>  3 files changed, 87 insertions(+)
> 
> diff --git a/block.c b/block.c
> index 80310fe..485d4f0 100644
> --- a/block.c
> +++ b/block.c
> @@ -335,6 +335,7 @@ void bdrv_register(BlockDriver *bdrv)
>  BlockDriverState *bdrv_new(const char *device_name)
>  {
>      BlockDriverState *bs;
> +    int i;
>  
>      bs = g_malloc0(sizeof(BlockDriverState));
>      QLIST_INIT(&bs->dirty_bitmaps);
> @@ -342,6 +343,9 @@ BlockDriverState *bdrv_new(const char *device_name)
>      if (device_name[0] != '\0') {
>          QTAILQ_INSERT_TAIL(&bdrv_states, bs, device_list);
>      }
> +    for (i = 0; i < BLOCK_OP_TYPE_MAX; i++) {
> +        QLIST_INIT(&bs->op_blockers[i]);
> +    }
>      bdrv_iostatus_disable(bs);
>      notifier_list_init(&bs->close_notifiers);
>      notifier_with_return_list_init(&bs->before_write_notifiers);
> @@ -1851,6 +1855,8 @@ static void bdrv_move_feature_fields(BlockDriverState *bs_dest,
>       * We do want to swap name but don't want to swap linked list entries
>       */
>      bs_dest->node_list   = bs_src->node_list;
> +    memcpy(bs_dest->op_blockers, bs_src->op_blockers,
> +           sizeof(bs_dest->op_blockers));
>  }
>  
>  /*
> @@ -5130,6 +5136,75 @@ void bdrv_unref(BlockDriverState *bs)
>      }
>  }
>  
> +struct BdrvOpBlocker {
> +    Error *reason;
> +    QLIST_ENTRY(BdrvOpBlocker) list;
> +};
> +
> +bool bdrv_op_is_blocked(BlockDriverState *bs, BlockOpType op, Error **errp)
> +{
> +    BdrvOpBlocker *blocker;
> +    assert((int) op >= 0 && op < BLOCK_OP_TYPE_MAX);
> +    if (!QLIST_EMPTY(&bs->op_blockers[op])) {
> +        blocker = QLIST_FIRST(&bs->op_blockers[op]);
> +        if (errp) {
> +            *errp = error_copy(blocker->reason);
> +        }
> +        return true;
> +    }
> +    return false;
> +}
> +
> +void bdrv_op_block(BlockDriverState *bs, BlockOpType op, Error *reason)
> +{
> +    BdrvOpBlocker *blocker;
> +    assert((int) op >= 0 && op < BLOCK_OP_TYPE_MAX);
> +
> +    blocker = g_malloc0(sizeof(BdrvOpBlocker));
> +    blocker->reason = reason;
> +    QLIST_INSERT_HEAD(&bs->op_blockers[op], blocker, list);
> +}
> +
> +void bdrv_op_unblock(BlockDriverState *bs, BlockOpType op, Error *reason)
> +{
> +    BdrvOpBlocker *blocker, *next;
> +    assert((int) op >= 0 && op < BLOCK_OP_TYPE_MAX);
> +    QLIST_FOREACH_SAFE(blocker, &bs->op_blockers[op], list, next) {
> +        if (blocker->reason == reason) {
> +            QLIST_REMOVE(blocker, list);
> +            g_free(blocker);
> +        }
> +    }
> +}
> +
> +void bdrv_op_block_all(BlockDriverState *bs, Error *reason)
> +{
> +    int i;
> +    for (i = 0; i < BLOCK_OP_TYPE_MAX; i++) {
> +        bdrv_op_block(bs, i, reason);
> +    }
> +}
> +
> +void bdrv_op_unblock_all(BlockDriverState *bs, Error *reason)
> +{
> +    int i;
> +    for (i = 0; i < BLOCK_OP_TYPE_MAX; i++) {
> +        bdrv_op_unblock(bs, i, reason);
> +    }
> +}
> +
> +bool bdrv_op_blocker_is_empty(BlockDriverState *bs)
> +{
> +    int i;
> +
> +    for (i = 0; i < BLOCK_OP_TYPE_MAX; i++) {
> +        if (!QLIST_EMPTY(&bs->op_blockers[i])) {
> +            return false;
> +        }
> +    }
> +    return true;
> +}
> +
>  void bdrv_set_in_use(BlockDriverState *bs, int in_use)
>  {
>      assert(bs->in_use != in_use);
> diff --git a/include/block/block.h b/include/block/block.h
> index 8820735..4874e2a 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h
> @@ -474,6 +474,13 @@ void bdrv_unref(BlockDriverState *bs);
>  void bdrv_set_in_use(BlockDriverState *bs, int in_use);
>  int bdrv_in_use(BlockDriverState *bs);
>  
> +bool bdrv_op_is_blocked(BlockDriverState *bs, BlockOpType op, Error **errp);
> +void bdrv_op_block(BlockDriverState *bs, BlockOpType op, Error *reason);
> +void bdrv_op_unblock(BlockDriverState *bs, BlockOpType op, Error *reason);
> +void bdrv_op_block_all(BlockDriverState *bs, Error *reason);
> +void bdrv_op_unblock_all(BlockDriverState *bs, Error *reason);
> +bool bdrv_op_blocker_is_empty(BlockDriverState *bs);
> +
>  #ifdef CONFIG_LINUX_AIO
>  int raw_get_aio_fd(BlockDriverState *bs);
>  #else
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index 0bcf1c9..4e558d0 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -270,6 +270,8 @@ typedef struct BlockLimits {
>      size_t opt_mem_alignment;
>  } BlockLimits;
>  
> +typedef struct BdrvOpBlocker BdrvOpBlocker;
> +
>  /*
>   * Note: the function bdrv_append() copies and swaps contents of
>   * BlockDriverStates, so if you add new fields to this struct, please
> @@ -361,6 +363,9 @@ struct BlockDriverState {
>  
>      QLIST_HEAD(, BdrvTrackedRequest) tracked_requests;
>  
> +    /* operation blockers */
> +    QLIST_HEAD(, BdrvOpBlocker) op_blockers[BLOCK_OP_TYPE_MAX];
> +
>      /* long-running background operation */
>      BlockJob *job;
>  
> -- 
> 1.8.5.4
> 
Reviewed-by: Benoit Canet <benoit@irqsave.net>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH v14 03/14] block: Replace in_use with operation blocker
  2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 03/14] block: Replace in_use with operation blocker Fam Zheng
@ 2014-02-19 15:26   ` Benoît Canet
  0 siblings, 0 replies; 33+ messages in thread
From: Benoît Canet @ 2014-02-19 15:26 UTC (permalink / raw)
  To: Fam Zheng
  Cc: kwolf, benoit.canet, armbru, qemu-devel, rjones, imain, stefanha,
	pbonzini

The Wednesday 19 Feb 2014 à 21:42:20 (+0800), Fam Zheng wrote :
> This drops BlockDriverState.in_use with op_blockers:
> 
>   - Call bdrv_op_block_all in place of bdrv_set_in_use(bs, 1).
>   - Call bdrv_op_unblock_all in place of bdrv_set_in_use(bs, 0).
>   - Check bdrv_op_is_blocked() in place of bdrv_in_use(bs).
>     The specific types are used, e.g. in place of starting block backup,
>     bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_BACKUP, ...).
>   - Check bdrv_op_blocker_is_empty() in place of assert(!bs->in_use).
> 
> Note: there is only bdrv_op_block_all and bdrv_op_unblock_all callers at
> this moment. So although the checks are specific to op types, this
> changes can still be seen as identical logic with previously with
> in_use. The difference is error message are improved because of blocker
> error info.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  block-migration.c               |  7 +++++--
>  block.c                         | 24 +++++++-----------------
>  blockdev.c                      | 15 ++++++---------
>  blockjob.c                      | 14 +++++++++-----
>  hw/block/dataplane/virtio-blk.c | 19 ++++++++++++-------
>  include/block/block.h           |  2 --
>  include/block/block_int.h       |  1 -
>  include/block/blockjob.h        |  3 +++
>  8 files changed, 42 insertions(+), 43 deletions(-)
> 
> diff --git a/block-migration.c b/block-migration.c
> index 897fdba..bf9a25f 100644
> --- a/block-migration.c
> +++ b/block-migration.c
> @@ -59,6 +59,7 @@ typedef struct BlkMigDevState {
>      unsigned long *aio_bitmap;
>      int64_t completed_sectors;
>      BdrvDirtyBitmap *dirty_bitmap;
> +    Error *blocker;
>  } BlkMigDevState;
>  
>  typedef struct BlkMigBlock {
> @@ -346,7 +347,8 @@ static void init_blk_migration_it(void *opaque, BlockDriverState *bs)
>          bmds->completed_sectors = 0;
>          bmds->shared_base = block_mig_state.shared_base;
>          alloc_aio_bitmap(bmds);
> -        bdrv_set_in_use(bs, 1);
> +        error_setg(&bmds->blocker, "block device is in use by migration");
> +        bdrv_op_block_all(bs, bmds->blocker);
>          bdrv_ref(bs);
>  
>          block_mig_state.total_sector_sum += sectors;
> @@ -584,7 +586,8 @@ static void blk_mig_cleanup(void)
>      blk_mig_lock();
>      while ((bmds = QSIMPLEQ_FIRST(&block_mig_state.bmds_list)) != NULL) {
>          QSIMPLEQ_REMOVE_HEAD(&block_mig_state.bmds_list, entry);
> -        bdrv_set_in_use(bmds->bs, 0);
> +        bdrv_op_unblock_all(bmds->bs, bmds->blocker);
> +        error_free(bmds->blocker);
>          bdrv_unref(bmds->bs);
>          g_free(bmds->aio_bitmap);
>          g_free(bmds);
> diff --git a/block.c b/block.c
> index 485d4f0..b5cb0c7 100644
> --- a/block.c
> +++ b/block.c
> @@ -1843,7 +1843,6 @@ static void bdrv_move_feature_fields(BlockDriverState *bs_dest,
>      bs_dest->refcnt             = bs_src->refcnt;
>  
>      /* job */
> -    bs_dest->in_use             = bs_src->in_use;
>      bs_dest->job                = bs_src->job;
>  
>      /* keep the same entry in bdrv_states */
> @@ -1880,7 +1879,7 @@ void bdrv_swap(BlockDriverState *bs_new, BlockDriverState *bs_old)
>      assert(QLIST_EMPTY(&bs_new->dirty_bitmaps));
>      assert(bs_new->job == NULL);
>      assert(bs_new->dev == NULL);
> -    assert(bs_new->in_use == 0);
> +    assert(bdrv_op_blocker_is_empty(bs_new));
>      assert(bs_new->io_limits_enabled == false);
>      assert(!throttle_have_timer(&bs_new->throttle_state));
>  
> @@ -1899,7 +1898,7 @@ void bdrv_swap(BlockDriverState *bs_new, BlockDriverState *bs_old)
>      /* Check a few fields that should remain attached to the device */
>      assert(bs_new->dev == NULL);
>      assert(bs_new->job == NULL);
> -    assert(bs_new->in_use == 0);
> +    assert(bdrv_op_blocker_is_empty(bs_new));
>      assert(bs_new->io_limits_enabled == false);
>      assert(!throttle_have_timer(&bs_new->throttle_state));
>  
> @@ -1936,7 +1935,7 @@ static void bdrv_delete(BlockDriverState *bs)
>  {
>      assert(!bs->dev);
>      assert(!bs->job);
> -    assert(!bs->in_use);
> +    assert(bdrv_op_blocker_is_empty(bs));
>      assert(!bs->refcnt);
>      assert(QLIST_EMPTY(&bs->dirty_bitmaps));
>  
> @@ -2118,7 +2117,8 @@ int bdrv_commit(BlockDriverState *bs)
>          return -ENOTSUP;
>      }
>  
> -    if (bdrv_in_use(bs) || bdrv_in_use(bs->backing_hd)) {
> +    if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_COMMIT, NULL) ||
> +        bdrv_op_is_blocked(bs->backing_hd, BLOCK_OP_TYPE_COMMIT, NULL)) {
>          return -EBUSY;
>      }
>  
> @@ -3353,8 +3353,9 @@ int bdrv_truncate(BlockDriverState *bs, int64_t offset)
>          return -ENOTSUP;
>      if (bs->read_only)
>          return -EACCES;
> -    if (bdrv_in_use(bs))
> +    if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_RESIZE, NULL)) {
>          return -EBUSY;
> +    }
>      ret = drv->bdrv_truncate(bs, offset);
>      if (ret == 0) {
>          ret = refresh_total_sectors(bs, offset >> BDRV_SECTOR_BITS);
> @@ -5205,17 +5206,6 @@ bool bdrv_op_blocker_is_empty(BlockDriverState *bs)
>      return true;
>  }
>  
> -void bdrv_set_in_use(BlockDriverState *bs, int in_use)
> -{
> -    assert(bs->in_use != in_use);
> -    bs->in_use = in_use;
> -}
> -
> -int bdrv_in_use(BlockDriverState *bs)
> -{
> -    return bs->in_use;
> -}
> -
>  void bdrv_iostatus_enable(BlockDriverState *bs)
>  {
>      bs->iostatus_enabled = true;
> diff --git a/blockdev.c b/blockdev.c
> index f083612..53dce70 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -1289,8 +1289,8 @@ static void external_snapshot_prepare(BlkTransactionState *common,
>          return;
>      }
>  
> -    if (bdrv_in_use(state->old_bs)) {
> -        error_set(errp, QERR_DEVICE_IN_USE, device);
> +    if (bdrv_op_is_blocked(state->old_bs,
> +                           BLOCK_OP_TYPE_EXTERNAL_SNAPSHOT, errp)) {
>          return;
>      }
>  
> @@ -1512,8 +1512,7 @@ exit:
>  
>  static void eject_device(BlockDriverState *bs, int force, Error **errp)
>  {
> -    if (bdrv_in_use(bs)) {
> -        error_set(errp, QERR_DEVICE_IN_USE, bdrv_get_device_name(bs));
> +    if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_EJECT, errp)) {
>          return;
>      }
>      if (!bdrv_dev_has_removable_media(bs)) {
> @@ -1719,7 +1718,7 @@ int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data)
>          qerror_report(QERR_DEVICE_NOT_FOUND, id);
>          return -1;
>      }
> -    if (bdrv_in_use(bs)) {
> +    if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_DRIVE_DEL, NULL)) {
>          qerror_report(QERR_DEVICE_IN_USE, id);
>          return -1;
>      }
> @@ -1972,8 +1971,7 @@ void qmp_drive_backup(const char *device, const char *target,
>          }
>      }
>  
> -    if (bdrv_in_use(bs)) {
> -        error_set(errp, QERR_DEVICE_IN_USE, device);
> +    if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_BACKUP_SOURCE, errp)) {
>          return;
>      }
>  
> @@ -2106,8 +2104,7 @@ void qmp_drive_mirror(const char *device, const char *target,
>          }
>      }
>  
> -    if (bdrv_in_use(bs)) {
> -        error_set(errp, QERR_DEVICE_IN_USE, device);
> +    if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_MIRROR, errp)) {
>          return;
>      }
>  
> diff --git a/blockjob.c b/blockjob.c
> index 9e5fd5c..f1ff036 100644
> --- a/blockjob.c
> +++ b/blockjob.c
> @@ -41,14 +41,16 @@ void *block_job_create(const BlockJobDriver *driver, BlockDriverState *bs,
>  {
>      BlockJob *job;
>  
> -    if (bs->job || bdrv_in_use(bs)) {
> +    if (bs->job || !bdrv_op_blocker_is_empty(bs)) {
>          error_set(errp, QERR_DEVICE_IN_USE, bdrv_get_device_name(bs));
>          return NULL;
>      }
>      bdrv_ref(bs);
> -    bdrv_set_in_use(bs, 1);
> -
>      job = g_malloc0(driver->instance_size);
> +    error_setg(&job->blocker, "block device is in use by block job: %s",
> +               BlockJobType_lookup[driver->job_type]);
> +    bdrv_op_block_all(bs, job->blocker);
> +
>      job->driver        = driver;
>      job->bs            = bs;
>      job->cb            = cb;
> @@ -63,8 +65,9 @@ void *block_job_create(const BlockJobDriver *driver, BlockDriverState *bs,
>          block_job_set_speed(job, speed, &local_err);
>          if (error_is_set(&local_err)) {
>              bs->job = NULL;
> +            bdrv_op_unblock_all(bs, job->blocker);
> +            error_free(job->blocker);
>              g_free(job);
> -            bdrv_set_in_use(bs, 0);
>              error_propagate(errp, local_err);
>              return NULL;
>          }
> @@ -79,8 +82,9 @@ void block_job_completed(BlockJob *job, int ret)
>      assert(bs->job == job);
>      job->cb(job->opaque, ret);
>      bs->job = NULL;
> +    bdrv_op_unblock_all(bs, job->blocker);
> +    error_free(job->blocker);
>      g_free(job);
> -    bdrv_set_in_use(bs, 0);
>  }
>  
>  void block_job_set_speed(BlockJob *job, int64_t speed, Error **errp)
> diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
> index 2237edb..cee0aa5 100644
> --- a/hw/block/dataplane/virtio-blk.c
> +++ b/hw/block/dataplane/virtio-blk.c
> @@ -69,6 +69,9 @@ struct VirtIOBlockDataPlane {
>                                               queue */
>  
>      unsigned int num_reqs;
> +
> +    /* Operation blocker on BDS */
> +    Error *blocker;
>  };
>  
>  /* Raise an interrupt to signal guest, if necessary */
> @@ -368,6 +371,7 @@ void virtio_blk_data_plane_create(VirtIODevice *vdev, VirtIOBlkConf *blk,
>  {
>      VirtIOBlockDataPlane *s;
>      int fd;
> +    Error *local_err = NULL;
>  
>      *dataplane = NULL;
>  
> @@ -390,9 +394,10 @@ void virtio_blk_data_plane_create(VirtIODevice *vdev, VirtIOBlkConf *blk,
>      /* If dataplane is (re-)enabled while the guest is running there could be
>       * block jobs that can conflict.
>       */
> -    if (bdrv_in_use(blk->conf.bs)) {
> -        error_setg(errp,
> -                   "cannot start dataplane thread while device is in use");
> +    if (bdrv_op_is_blocked(blk->conf.bs, BLOCK_OP_TYPE_DATAPLANE, &local_err)) {
> +        error_report("cannot start dataplane thread: %s",
> +                      error_get_pretty(local_err));
> +        error_free(local_err);
>          return;
>      }
>  
> @@ -407,9 +412,8 @@ void virtio_blk_data_plane_create(VirtIODevice *vdev, VirtIOBlkConf *blk,
>      s->vdev = vdev;
>      s->fd = fd;
>      s->blk = blk;
> -
> -    /* Prevent block operations that conflict with data plane thread */
> -    bdrv_set_in_use(blk->conf.bs, 1);
> +    error_setg(&s->blocker, "block device is in use by data plane");
> +    bdrv_op_block_all(blk->conf.bs, s->blocker);
>  
>      *dataplane = s;
>  }
> @@ -421,7 +425,8 @@ void virtio_blk_data_plane_destroy(VirtIOBlockDataPlane *s)
>      }
>  
>      virtio_blk_data_plane_stop(s);
> -    bdrv_set_in_use(s->blk->conf.bs, 0);
> +    bdrv_op_unblock_all(s->blk->conf.bs, s->blocker);
> +    error_free(s->blocker);
>      g_free(s);
>  }
>  
> diff --git a/include/block/block.h b/include/block/block.h
> index 4874e2a..a46f70a 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h
> @@ -471,8 +471,6 @@ void bdrv_disable_copy_on_read(BlockDriverState *bs);
>  
>  void bdrv_ref(BlockDriverState *bs);
>  void bdrv_unref(BlockDriverState *bs);
> -void bdrv_set_in_use(BlockDriverState *bs, int in_use);
> -int bdrv_in_use(BlockDriverState *bs);
>  
>  bool bdrv_op_is_blocked(BlockDriverState *bs, BlockOpType op, Error **errp);
>  void bdrv_op_block(BlockDriverState *bs, BlockOpType op, Error *reason);
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index 4e558d0..1d3f76f 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -359,7 +359,6 @@ struct BlockDriverState {
>      QTAILQ_ENTRY(BlockDriverState) device_list;
>      QLIST_HEAD(, BdrvDirtyBitmap) dirty_bitmaps;
>      int refcnt;
> -    int in_use; /* users other than guest access, eg. block migration */
>  
>      QLIST_HEAD(, BdrvTrackedRequest) tracked_requests;
>  
> diff --git a/include/block/blockjob.h b/include/block/blockjob.h
> index d76de62..c0a7875 100644
> --- a/include/block/blockjob.h
> +++ b/include/block/blockjob.h
> @@ -106,6 +106,9 @@ struct BlockJob {
>      /** The completion function that will be called when the job completes.  */
>      BlockDriverCompletionFunc *cb;
>  
> +    /** Block other operations when block job is running */
> +    Error *blocker;
> +
>      /** The opaque value that is passed to the completion function.  */
>      void *opaque;
>  };
> -- 
> 1.8.5.4
> 
Reviewed-by: Benoit Canet <benoit@irqsave.net>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH v14 05/14] block: Add bdrv_set_backing_hd()
  2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 05/14] block: Add bdrv_set_backing_hd() Fam Zheng
@ 2014-02-19 15:27   ` Benoît Canet
  0 siblings, 0 replies; 33+ messages in thread
From: Benoît Canet @ 2014-02-19 15:27 UTC (permalink / raw)
  To: Fam Zheng
  Cc: kwolf, benoit.canet, armbru, qemu-devel, rjones, imain, stefanha,
	pbonzini

The Wednesday 19 Feb 2014 à 21:42:22 (+0800), Fam Zheng wrote :
> This is the common but non-trivial steps to assign or change the
> backing_hd of BDS.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  block.c               | 34 +++++++++++++++++++++++++++-------
>  include/block/block.h |  1 +
>  2 files changed, 28 insertions(+), 7 deletions(-)
> 
> diff --git a/block.c b/block.c
> index b5cb0c7..dec44d4 100644
> --- a/block.c
> +++ b/block.c
> @@ -1041,6 +1041,26 @@ fail:
>      return ret;
>  }
>  
> +void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd)
> +{
> +    if (bs->backing_hd) {
> +        bdrv_unref(bs->backing_hd);
> +    }
> +
> +    bs->backing_hd = backing_hd;
> +    if (!backing_hd) {
> +        bs->backing_file[0] = '\0';
> +        bs->backing_format[0] = '\0';
> +        goto out;
> +    }
> +    pstrcpy(bs->backing_file, sizeof(bs->backing_file), backing_hd->filename);
> +    pstrcpy(bs->backing_format, sizeof(bs->backing_format),
> +            backing_hd->drv ? backing_hd->drv->format_name : "");
> +    bdrv_ref(bs->backing_hd);
> +out:
> +    bdrv_refresh_limits(bs);
> +}
> +
>  /*
>   * Opens the backing file for a BlockDriverState if not yet open
>   *
> @@ -1054,6 +1074,7 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict *options, Error **errp)
>      char backing_filename[PATH_MAX];
>      int back_flags, ret;
>      BlockDriver *back_drv = NULL;
> +    BlockDriverState *backing_hd;
>      Error *local_err = NULL;
>  
>      if (bs->backing_hd != NULL) {
> @@ -1077,6 +1098,8 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict *options, Error **errp)
>                                         sizeof(backing_filename));
>      }
>  
> +    backing_hd = bdrv_new("");
> +
>      if (bs->backing_format[0] != '\0') {
>          back_drv = bdrv_find_format(bs->backing_format);
>      }
> @@ -1085,23 +1108,20 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict *options, Error **errp)
>      back_flags = bs->open_flags & ~(BDRV_O_RDWR | BDRV_O_SNAPSHOT |
>                                      BDRV_O_COPY_ON_READ);
>  
> -    assert(bs->backing_hd == NULL);
> -    ret = bdrv_open(&bs->backing_hd,
> +    ret = bdrv_open(&backing_hd,
>                      *backing_filename ? backing_filename : NULL, NULL, options,
>                      back_flags, back_drv, &local_err);
>      if (ret < 0) {
>          bs->backing_hd = NULL;
> +        bdrv_unref(backing_hd);
> +        backing_hd = NULL;
>          bs->open_flags |= BDRV_O_NO_BACKING;
>          error_setg(errp, "Could not open backing file: %s",
>                     error_get_pretty(local_err));
>          error_free(local_err);
>          return ret;
>      }
> -
> -    if (bs->backing_hd->file) {
> -        pstrcpy(bs->backing_file, sizeof(bs->backing_file),
> -                bs->backing_hd->file->filename);
> -    }
> +    bdrv_set_backing_hd(bs, backing_hd);
>  
>      /* Recalculate the BlockLimits with the backing file */
>      bdrv_refresh_limits(bs);
> diff --git a/include/block/block.h b/include/block/block.h
> index a46f70a..ee1582d 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h
> @@ -208,6 +208,7 @@ int bdrv_parse_discard_flags(const char *mode, int *flags);
>  int bdrv_open_image(BlockDriverState **pbs, const char *filename,
>                      QDict *options, const char *bdref_key, int flags,
>                      bool allow_none, Error **errp);
> +void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd);
>  int bdrv_open_backing_file(BlockDriverState *bs, QDict *options, Error **errp);
>  int bdrv_open(BlockDriverState **pbs, const char *filename,
>                const char *reference, QDict *options, int flags,
> -- 
> 1.8.5.4
> 
Reviewed-by: Benoit Canet <benoit@irqsave.net>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH v14 04/14] block: Move op_blocker check from block_job_create to its caller
  2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 04/14] block: Move op_blocker check from block_job_create to its caller Fam Zheng
@ 2014-02-19 15:28   ` Benoît Canet
  0 siblings, 0 replies; 33+ messages in thread
From: Benoît Canet @ 2014-02-19 15:28 UTC (permalink / raw)
  To: Fam Zheng
  Cc: kwolf, benoit.canet, rjones, qemu-devel, armbru, imain, stefanha,
	pbonzini

The Wednesday 19 Feb 2014 à 21:42:21 (+0800), Fam Zheng wrote :
> It makes no sense to check for "any" blocker on bs, we are here only
> because of the mechanical conversion from in_use to op_blockers. Remove
> it now, and let the callers check specific operation types. Backup and
> mirror already have it, add checker to stream and commit.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  blockdev.c | 8 ++++++++
>  blockjob.c | 2 +-
>  2 files changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/blockdev.c b/blockdev.c
> index 53dce70..306aad5 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -1840,6 +1840,10 @@ void qmp_block_stream(const char *device, bool has_base,
>          return;
>      }
>  
> +    if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_STREAM, errp)) {
> +        return;
> +    }
> +
>      if (base) {
>          base_bs = bdrv_find_backing_image(bs, base);
>          if (base_bs == NULL) {
> @@ -1880,6 +1884,10 @@ void qmp_block_commit(const char *device,
>          return;
>      }
>  
> +    if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_COMMIT, errp)) {
> +        return;
> +    }
> +
>      /* default top_bs is the active layer */
>      top_bs = bs;
>  
> diff --git a/blockjob.c b/blockjob.c
> index f1ff036..21e21c0 100644
> --- a/blockjob.c
> +++ b/blockjob.c
> @@ -41,7 +41,7 @@ void *block_job_create(const BlockJobDriver *driver, BlockDriverState *bs,
>  {
>      BlockJob *job;
>  
> -    if (bs->job || !bdrv_op_blocker_is_empty(bs)) {
> +    if (bs->job) {
>          error_set(errp, QERR_DEVICE_IN_USE, bdrv_get_device_name(bs));
>          return NULL;
>      }
> -- 
> 1.8.5.4
> 
> 
Reviewed-by: Benoit Canet <benoit@irqsave.net>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH v14 06/14] block: Add backing_blocker in BlockDriverState
  2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 06/14] block: Add backing_blocker in BlockDriverState Fam Zheng
@ 2014-02-19 15:32   ` Benoît Canet
  2014-02-19 21:17   ` Jeff Cody
  1 sibling, 0 replies; 33+ messages in thread
From: Benoît Canet @ 2014-02-19 15:32 UTC (permalink / raw)
  To: Fam Zheng
  Cc: kwolf, benoit.canet, armbru, qemu-devel, rjones, imain, stefanha,
	pbonzini

The Wednesday 19 Feb 2014 à 21:42:23 (+0800), Fam Zheng wrote :
> This makes use of op_blocker and blocks all the operations except for
> commit target, on each BlockDriverState->backing_hd.
> 
> The asserts for op_blocker in bdrv_swap are removed because with this
> change, the target of block commit has at least the backing blocker of
> its child, so the assertion is not true. Callers should do their check.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  block.c                   | 19 +++++++++++++++----
>  include/block/block_int.h |  3 +++
>  2 files changed, 18 insertions(+), 4 deletions(-)
> 
> diff --git a/block.c b/block.c
> index dec44d4..95d8c1f 100644
> --- a/block.c
> +++ b/block.c
> @@ -1044,19 +1044,33 @@ fail:
>  void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd)
>  {
>      if (bs->backing_hd) {
> +        assert(error_is_set(&bs->backing_blocker));
> +        bdrv_op_unblock_all(bs->backing_hd, bs->backing_blocker);
>          bdrv_unref(bs->backing_hd);
> +    } else if (backing_hd) {
> +        error_setg(&bs->backing_blocker,
> +                   "device is used as backing hd of '%s'",
> +                   bs->device_name);
>      }
>  
>      bs->backing_hd = backing_hd;
>      if (!backing_hd) {
>          bs->backing_file[0] = '\0';
>          bs->backing_format[0] = '\0';
> +        if (error_is_set(&bs->backing_blocker)) {
> +            error_free(bs->backing_blocker);
> +        }
>          goto out;
>      }
>      pstrcpy(bs->backing_file, sizeof(bs->backing_file), backing_hd->filename);
>      pstrcpy(bs->backing_format, sizeof(bs->backing_format),
>              backing_hd->drv ? backing_hd->drv->format_name : "");
>      bdrv_ref(bs->backing_hd);
> +
> +    bdrv_op_block_all(bs->backing_hd, bs->backing_blocker);
> +    /* Otherwise we won't be able to commit due to check in bdrv_commit */
> +    bdrv_op_unblock(bs->backing_hd, BLOCK_OP_TYPE_COMMIT,
> +                    bs->backing_blocker);
>  out:
>      bdrv_refresh_limits(bs);
>  }
> @@ -1696,8 +1710,7 @@ void bdrv_close(BlockDriverState *bs)
>  
>      if (bs->drv) {
>          if (bs->backing_hd) {
> -            bdrv_unref(bs->backing_hd);
> -            bs->backing_hd = NULL;
> +            bdrv_set_backing_hd(bs, NULL);
>          }
>          bs->drv->bdrv_close(bs);
>          g_free(bs->opaque);
> @@ -1899,7 +1912,6 @@ void bdrv_swap(BlockDriverState *bs_new, BlockDriverState *bs_old)
>      assert(QLIST_EMPTY(&bs_new->dirty_bitmaps));
>      assert(bs_new->job == NULL);
>      assert(bs_new->dev == NULL);
> -    assert(bdrv_op_blocker_is_empty(bs_new));
>      assert(bs_new->io_limits_enabled == false);
>      assert(!throttle_have_timer(&bs_new->throttle_state));
>  
> @@ -1918,7 +1930,6 @@ void bdrv_swap(BlockDriverState *bs_new, BlockDriverState *bs_old)
>      /* Check a few fields that should remain attached to the device */
>      assert(bs_new->dev == NULL);
>      assert(bs_new->job == NULL);
> -    assert(bdrv_op_blocker_is_empty(bs_new));
>      assert(bs_new->io_limits_enabled == false);
>      assert(!throttle_have_timer(&bs_new->throttle_state));
>  
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index 1d3f76f..1f4f78b 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -369,6 +369,9 @@ struct BlockDriverState {
>      BlockJob *job;
>  
>      QDict *options;
> +
> +    /* The error object in use for blocking operations on backing_hd */
> +    Error *backing_blocker;
>  };
>  
>  int get_tmp_filename(char *filename, int size);
> -- 
> 1.8.5.4
> 
Reviewed-by: Benoit Canet <benoit@irqsave.net>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH v14 08/14] block: Support dropping active in bdrv_drop_intermediate
  2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 08/14] block: Support dropping active in bdrv_drop_intermediate Fam Zheng
@ 2014-02-19 15:34   ` Benoît Canet
  2014-02-19 21:22   ` Jeff Cody
  1 sibling, 0 replies; 33+ messages in thread
From: Benoît Canet @ 2014-02-19 15:34 UTC (permalink / raw)
  To: Fam Zheng
  Cc: kwolf, benoit.canet, rjones, qemu-devel, armbru, imain, stefanha,
	pbonzini

The Wednesday 19 Feb 2014 à 21:42:25 (+0800), Fam Zheng wrote :
> Dropping intermediate could be useful both for commit and stream, and
> BDS refcnt plus bdrv_swap could do most of the job nicely. It also needs
> to work with op blockers.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  block.c        | 146 +++++++++++++++++++++++++--------------------------------
>  block/commit.c |   1 +
>  2 files changed, 66 insertions(+), 81 deletions(-)
> 
> diff --git a/block.c b/block.c
> index a2bf24c..cf41f3d 100644
> --- a/block.c
> +++ b/block.c
> @@ -2485,115 +2485,99 @@ BlockDriverState *bdrv_find_overlay(BlockDriverState *active,
>      return overlay;
>  }
>  
> -typedef struct BlkIntermediateStates {
> -    BlockDriverState *bs;
> -    QSIMPLEQ_ENTRY(BlkIntermediateStates) entry;
> -} BlkIntermediateStates;
> -
> -
>  /*
> - * Drops images above 'base' up to and including 'top', and sets the image
> - * above 'top' to have base as its backing file.
> + * Drops images above 'base' up to and including 'top', and sets new 'base'
> + * as backing_hd of top_overlay (the image orignally has 'top' as backing
> + * file). top_overlay may be NULL if 'top' is active, no such update needed.
> + * Requires that the top_overlay to 'top' is opened r/w.
>   *
> - * Requires that the overlay to 'top' is opened r/w, so that the backing file
> - * information in 'bs' can be properly updated.
> + * 1) This will convert the following chain:
>   *
> - * E.g., this will convert the following chain:
> - * bottom <- base <- intermediate <- top <- active
> + *     ... <- base <- ... <- top <- overlay <-... <- active
>   *
>   * to
>   *
> - * bottom <- base <- active
> + *     ... <- base <- overlay <- active
>   *
> - * It is allowed for bottom==base, in which case it converts:
> + * 2) It is allowed for bottom==base, in which case it converts:
>   *
> - * base <- intermediate <- top <- active
> + *     base <- ... <- top <- overlay <- ... <- active
>   *
>   * to
>   *
> - * base <- active
> + *     base <- overlay <- active
> + *
> + * 2) It also allows active==top, in which case it converts:
> + *
> + *     ... <- base <- ... <- top (active)
> + *
> + * to
> + *
> + *     ... <- base == active == top
> + *
> + * i.e. only base and lower remains: *top == *base when return.
> + *
> + * 3) If base==NULL, it will drop all the BDS below overlay and set its
> + * backing_hd to NULL. I.e.:
> + *
> + *     base(NULL) <- ... <- overlay <- ... <- active
> + *
> + * to
>   *
> - * Error conditions:
> - *  if active == top, that is considered an error
> + *     overlay <- ... <- active
>   *
>   */
>  int bdrv_drop_intermediate(BlockDriverState *active, BlockDriverState *top,
>                             BlockDriverState *base)
>  {
> -    BlockDriverState *intermediate;
> -    BlockDriverState *base_bs = NULL;
> -    BlockDriverState *new_top_bs = NULL;
> -    BlkIntermediateStates *intermediate_state, *next;
> -    int ret = -EIO;
> -
> -    QSIMPLEQ_HEAD(states_to_delete, BlkIntermediateStates) states_to_delete;
> -    QSIMPLEQ_INIT(&states_to_delete);
> -
> -    if (!top->drv || !base->drv) {
> -        goto exit;
> -    }
> -
> -    new_top_bs = bdrv_find_overlay(active, top);
> +    BlockDriverState *drop_start, *overlay;
> +    int ret = -EINVAL;
>  
> -    if (new_top_bs == NULL) {
> -        /* we could not find the image above 'top', this is an error */
> +    if (!top->drv || (base && !base->drv)) {
>          goto exit;
>      }
> -
> -    /* special case of new_top_bs->backing_hd already pointing to base - nothing
> -     * to do, no intermediate images */
> -    if (new_top_bs->backing_hd == base) {
> +    if (top == base) {
>          ret = 0;
> -        goto exit;
> -    }
> -
> -    intermediate = top;
> -
> -    /* now we will go down through the list, and add each BDS we find
> -     * into our deletion queue, until we hit the 'base'
> -     */
> -    while (intermediate) {
> -        intermediate_state = g_malloc0(sizeof(BlkIntermediateStates));
> -        intermediate_state->bs = intermediate;
> -        QSIMPLEQ_INSERT_TAIL(&states_to_delete, intermediate_state, entry);
> -
> -        if (intermediate->backing_hd == base) {
> -            base_bs = intermediate->backing_hd;
> -            break;
> +    } else if (top == active) {
> +        assert(base);
> +        drop_start = active->backing_hd;
> +        bdrv_swap(active, base);
> +        base->backing_hd = NULL;
> +        bdrv_unref(drop_start);
> +        ret = 0;
> +    } else {
> +        /* If there's an overlay, its backing_hd points to top's BDS now,
> +         * the top image is dropped but this BDS structure is kept and swapped
> +         * with base, this way we keep the pointers valid after dropping top */
> +        overlay = bdrv_find_overlay(active, top);
> +        if (!overlay) {
> +            goto exit;
> +        }
> +        if (base) {
> +            ret = bdrv_change_backing_file(overlay, base->filename,
> +                                           base->drv->format_name);
> +        } else {
> +            ret = bdrv_change_backing_file(overlay, NULL, NULL);
> +        }
> +        if (ret) {
> +            goto exit;
> +        }
> +        if (base) {
> +            drop_start = top->backing_hd;
> +            bdrv_swap(top, base);
> +            /* Break the loop formed by bdrv_swap */
> +            bdrv_set_backing_hd(base, NULL);
> +        } else {
> +            bdrv_set_backing_hd(overlay, NULL);
> +            drop_start = top;
>          }
> -        intermediate = intermediate->backing_hd;
> -    }
> -    if (base_bs == NULL) {
> -        /* something went wrong, we did not end at the base. safely
> -         * unravel everything, and exit with error */
> -        goto exit;
> -    }
> -
> -    /* success - we can delete the intermediate states, and link top->base */
> -    ret = bdrv_change_backing_file(new_top_bs, base_bs->filename,
> -                                   base_bs->drv ? base_bs->drv->format_name : "");
> -    if (ret) {
> -        goto exit;
> -    }
> -    new_top_bs->backing_hd = base_bs;
> -
> -    bdrv_refresh_limits(new_top_bs);
>  
> -    QSIMPLEQ_FOREACH_SAFE(intermediate_state, &states_to_delete, entry, next) {
> -        /* so that bdrv_close() does not recursively close the chain */
> -        intermediate_state->bs->backing_hd = NULL;
> -        bdrv_unref(intermediate_state->bs);
> +        bdrv_unref(drop_start);
>      }
> -    ret = 0;
> -
>  exit:
> -    QSIMPLEQ_FOREACH_SAFE(intermediate_state, &states_to_delete, entry, next) {
> -        g_free(intermediate_state);
> -    }
>      return ret;
>  }
>  
> -
>  static int bdrv_check_byte_request(BlockDriverState *bs, int64_t offset,
>                                     size_t size)
>  {
> diff --git a/block/commit.c b/block/commit.c
> index acec4ac..b10eb79 100644
> --- a/block/commit.c
> +++ b/block/commit.c
> @@ -142,6 +142,7 @@ wait:
>      if (!block_job_is_cancelled(&s->common) && sector_num == end) {
>          /* success */
>          ret = bdrv_drop_intermediate(active, top, base);
> +        base = top;
>      }
>  
>  exit_free_buf:
> -- 
> 1.8.5.4
> 
> 

This is where I hit peter principle in the reviewing of this series.
Will stop reviewing here and let someone stronger than me continue.

Best regards

Benoît

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH v14 06/14] block: Add backing_blocker in BlockDriverState
  2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 06/14] block: Add backing_blocker in BlockDriverState Fam Zheng
  2014-02-19 15:32   ` Benoît Canet
@ 2014-02-19 21:17   ` Jeff Cody
  2014-02-20  5:01     ` Fam Zheng
  1 sibling, 1 reply; 33+ messages in thread
From: Jeff Cody @ 2014-02-19 21:17 UTC (permalink / raw)
  To: Fam Zheng
  Cc: kwolf, benoit.canet, rjones, qemu-devel, armbru, imain, stefanha,
	pbonzini

On Wed, Feb 19, 2014 at 09:42:23PM +0800, Fam Zheng wrote:
> This makes use of op_blocker and blocks all the operations except for
> commit target, on each BlockDriverState->backing_hd.
> 
> The asserts for op_blocker in bdrv_swap are removed because with this
> change, the target of block commit has at least the backing blocker of
> its child, so the assertion is not true. Callers should do their check.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  block.c                   | 19 +++++++++++++++----
>  include/block/block_int.h |  3 +++
>  2 files changed, 18 insertions(+), 4 deletions(-)
> 
> diff --git a/block.c b/block.c
> index dec44d4..95d8c1f 100644
> --- a/block.c
> +++ b/block.c
> @@ -1044,19 +1044,33 @@ fail:
>  void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd)
>  {
>      if (bs->backing_hd) {
> +        assert(error_is_set(&bs->backing_blocker));

When I run block-commit, on either the active or non-active layer, I
get an assertion here.  The qemu-iotests do not catch it, and I
presume it is because happens a couple of seconds or so after the
success message is returned over QMP.

> +        bdrv_op_unblock_all(bs->backing_hd, bs->backing_blocker);
>          bdrv_unref(bs->backing_hd);
> +    } else if (backing_hd) {
> +        error_setg(&bs->backing_blocker,
> +                   "device is used as backing hd of '%s'",
> +                   bs->device_name);
>      }
>  
>      bs->backing_hd = backing_hd;
>      if (!backing_hd) {
>          bs->backing_file[0] = '\0';
>          bs->backing_format[0] = '\0';
> +        if (error_is_set(&bs->backing_blocker)) {
> +            error_free(bs->backing_blocker);
> +        }
>          goto out;
>      }
>      pstrcpy(bs->backing_file, sizeof(bs->backing_file), backing_hd->filename);
>      pstrcpy(bs->backing_format, sizeof(bs->backing_format),
>              backing_hd->drv ? backing_hd->drv->format_name : "");
>      bdrv_ref(bs->backing_hd);
> +
> +    bdrv_op_block_all(bs->backing_hd, bs->backing_blocker);
> +    /* Otherwise we won't be able to commit due to check in bdrv_commit */
> +    bdrv_op_unblock(bs->backing_hd, BLOCK_OP_TYPE_COMMIT,
> +                    bs->backing_blocker);
>  out:
>      bdrv_refresh_limits(bs);
>  }
> @@ -1696,8 +1710,7 @@ void bdrv_close(BlockDriverState *bs)
>  
>      if (bs->drv) {
>          if (bs->backing_hd) {
> -            bdrv_unref(bs->backing_hd);
> -            bs->backing_hd = NULL;
> +            bdrv_set_backing_hd(bs, NULL);
>          }
>          bs->drv->bdrv_close(bs);
>          g_free(bs->opaque);
> @@ -1899,7 +1912,6 @@ void bdrv_swap(BlockDriverState *bs_new, BlockDriverState *bs_old)
>      assert(QLIST_EMPTY(&bs_new->dirty_bitmaps));
>      assert(bs_new->job == NULL);
>      assert(bs_new->dev == NULL);
> -    assert(bdrv_op_blocker_is_empty(bs_new));
>      assert(bs_new->io_limits_enabled == false);
>      assert(!throttle_have_timer(&bs_new->throttle_state));
>  
> @@ -1918,7 +1930,6 @@ void bdrv_swap(BlockDriverState *bs_new, BlockDriverState *bs_old)
>      /* Check a few fields that should remain attached to the device */
>      assert(bs_new->dev == NULL);
>      assert(bs_new->job == NULL);
> -    assert(bdrv_op_blocker_is_empty(bs_new));
>      assert(bs_new->io_limits_enabled == false);
>      assert(!throttle_have_timer(&bs_new->throttle_state));
>  
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index 1d3f76f..1f4f78b 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -369,6 +369,9 @@ struct BlockDriverState {
>      BlockJob *job;
>  
>      QDict *options;
> +
> +    /* The error object in use for blocking operations on backing_hd */
> +    Error *backing_blocker;
>  };
>  
>  int get_tmp_filename(char *filename, int size);
> -- 
> 1.8.5.4
> 
> 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH v14 08/14] block: Support dropping active in bdrv_drop_intermediate
  2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 08/14] block: Support dropping active in bdrv_drop_intermediate Fam Zheng
  2014-02-19 15:34   ` Benoît Canet
@ 2014-02-19 21:22   ` Jeff Cody
  2014-02-19 23:24     ` Jeff Cody
  1 sibling, 1 reply; 33+ messages in thread
From: Jeff Cody @ 2014-02-19 21:22 UTC (permalink / raw)
  To: Fam Zheng
  Cc: kwolf, benoit.canet, rjones, qemu-devel, armbru, imain, stefanha,
	pbonzini

On Wed, Feb 19, 2014 at 09:42:25PM +0800, Fam Zheng wrote:
> Dropping intermediate could be useful both for commit and stream, and
> BDS refcnt plus bdrv_swap could do most of the job nicely. It also needs
> to work with op blockers.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  block.c        | 146 +++++++++++++++++++++++++--------------------------------
>  block/commit.c |   1 +
>  2 files changed, 66 insertions(+), 81 deletions(-)
> 
> diff --git a/block.c b/block.c
> index a2bf24c..cf41f3d 100644
> --- a/block.c
> +++ b/block.c
> @@ -2485,115 +2485,99 @@ BlockDriverState *bdrv_find_overlay(BlockDriverState *active,
>      return overlay;
>  }
>  
> -typedef struct BlkIntermediateStates {
> -    BlockDriverState *bs;
> -    QSIMPLEQ_ENTRY(BlkIntermediateStates) entry;
> -} BlkIntermediateStates;
> -
> -
>  /*
> - * Drops images above 'base' up to and including 'top', and sets the image
> - * above 'top' to have base as its backing file.
> + * Drops images above 'base' up to and including 'top', and sets new 'base'
> + * as backing_hd of top_overlay (the image orignally has 'top' as backing

What is 'top_overlay'?  Do you mean "top's overlay" by this?

> + * file). top_overlay may be NULL if 'top' is active, no such update needed.
> + * Requires that the top_overlay to 'top' is opened r/w.
>   *
> - * Requires that the overlay to 'top' is opened r/w, so that the backing file
> - * information in 'bs' can be properly updated.
> + * 1) This will convert the following chain:
>   *
> - * E.g., this will convert the following chain:
> - * bottom <- base <- intermediate <- top <- active
> + *     ... <- base <- ... <- top <- overlay <-... <- active
>   *
>   * to
>   *
> - * bottom <- base <- active
> + *     ... <- base <- overlay <- active
>   *
> - * It is allowed for bottom==base, in which case it converts:
> + * 2) It is allowed for bottom==base, in which case it converts:
>   *
> - * base <- intermediate <- top <- active
> + *     base <- ... <- top <- overlay <- ... <- active
>   *
>   * to
>   *
> - * base <- active
> + *     base <- overlay <- active
> + *
> + * 2) It also allows active==top, in which case it converts:
> + *
> + *     ... <- base <- ... <- top (active)
> + *
> + * to
> + *
> + *     ... <- base == active == top
> + *
> + * i.e. only base and lower remains: *top == *base when return.
> + *
> + * 3) If base==NULL, it will drop all the BDS below overlay and set its
> + * backing_hd to NULL. I.e.:
> + *
> + *     base(NULL) <- ... <- overlay <- ... <- active
> + *
> + * to
>   *
> - * Error conditions:
> - *  if active == top, that is considered an error
> + *     overlay <- ... <- active
>   *
>   */
>  int bdrv_drop_intermediate(BlockDriverState *active, BlockDriverState *top,
>                             BlockDriverState *base)

With the active case, we aren't necessarily really just dropping
intermediate images anymore. Maybe we should rename this function now to
'bdrv_rebase_chain()'?

>  {
> -    BlockDriverState *intermediate;
> -    BlockDriverState *base_bs = NULL;
> -    BlockDriverState *new_top_bs = NULL;
> -    BlkIntermediateStates *intermediate_state, *next;
> -    int ret = -EIO;
> -
> -    QSIMPLEQ_HEAD(states_to_delete, BlkIntermediateStates) states_to_delete;
> -    QSIMPLEQ_INIT(&states_to_delete);
> -
> -    if (!top->drv || !base->drv) {
> -        goto exit;
> -    }
> -
> -    new_top_bs = bdrv_find_overlay(active, top);
> +    BlockDriverState *drop_start, *overlay;
> +    int ret = -EINVAL;
>  
> -    if (new_top_bs == NULL) {
> -        /* we could not find the image above 'top', this is an error */
> +    if (!top->drv || (base && !base->drv)) {
>          goto exit;
>      }
> -
> -    /* special case of new_top_bs->backing_hd already pointing to base - nothing
> -     * to do, no intermediate images */
> -    if (new_top_bs->backing_hd == base) {
> +    if (top == base) {
>          ret = 0;
> -        goto exit;
> -    }
> -
> -    intermediate = top;
> -
> -    /* now we will go down through the list, and add each BDS we find
> -     * into our deletion queue, until we hit the 'base'
> -     */
> -    while (intermediate) {
> -        intermediate_state = g_malloc0(sizeof(BlkIntermediateStates));
> -        intermediate_state->bs = intermediate;
> -        QSIMPLEQ_INSERT_TAIL(&states_to_delete, intermediate_state, entry);
> -
> -        if (intermediate->backing_hd == base) {
> -            base_bs = intermediate->backing_hd;
> -            break;
> +    } else if (top == active) {
> +        assert(base);
> +        drop_start = active->backing_hd;
> +        bdrv_swap(active, base);
> +        base->backing_hd = NULL;
> +        bdrv_unref(drop_start);
> +        ret = 0;

This now orphans everything between active->backing_hd and the
original base, without performing a bdrv_unref/delete on them.

> +    } else {
> +        /* If there's an overlay, its backing_hd points to top's BDS now,
> +         * the top image is dropped but this BDS structure is kept and swapped
> +         * with base, this way we keep the pointers valid after dropping top */
> +        overlay = bdrv_find_overlay(active, top);
> +        if (!overlay) {
> +            goto exit;
> +        }


> +        if (base) {
> +            ret = bdrv_change_backing_file(overlay, base->filename,
> +                                           base->drv->format_name);
> +        } else {
> +            ret = bdrv_change_backing_file(overlay, NULL, NULL);
> +        }
> +        if (ret) {
> +            goto exit;
> +        }
> +        if (base) {
> +            drop_start = top->backing_hd;
> +            bdrv_swap(top, base);
> +            /* Break the loop formed by bdrv_swap */
> +            bdrv_set_backing_hd(base, NULL);

And in the non-active case here, everything between top->backing_hd
and the original base is orphaned as well.  These should all be
explicitly unreferenced.

Also, side effect:
Caller needs to beware now that base and top are now swapped [1].

> +        } else {
> +            bdrv_set_backing_hd(overlay, NULL);
> +            drop_start = top;

Again, everything between top and the original base is orphaned, but
should be cleaned up.

Caller does not have to worry about base and top being swapped [1].


>          }
> -        intermediate = intermediate->backing_hd;
> -    }
> -    if (base_bs == NULL) {
> -        /* something went wrong, we did not end at the base. safely
> -         * unravel everything, and exit with error */
> -        goto exit;
> -    }
> -
> -    /* success - we can delete the intermediate states, and link top->base */
> -    ret = bdrv_change_backing_file(new_top_bs, base_bs->filename,
> -                                   base_bs->drv ? base_bs->drv->format_name : "");
> -    if (ret) {
> -        goto exit;
> -    }
> -    new_top_bs->backing_hd = base_bs;
> -
> -    bdrv_refresh_limits(new_top_bs);
>  
> -    QSIMPLEQ_FOREACH_SAFE(intermediate_state, &states_to_delete, entry, next) {
> -        /* so that bdrv_close() does not recursively close the chain */
> -        intermediate_state->bs->backing_hd = NULL;
> -        bdrv_unref(intermediate_state->bs);
> +        bdrv_unref(drop_start);

We will get an assertion here.  In the non-active case, the backing_hd
is explicitly set to NULL via bdrv_set_backing_hd().  That function
will call bdrv_unref() on the same BDS that drop_start was assigned,
so we have a double call to bdrv_unref().

>      }
> -    ret = 0;
> -
>  exit:
> -    QSIMPLEQ_FOREACH_SAFE(intermediate_state, &states_to_delete, entry, next) {
> -        g_free(intermediate_state);
> -    }
>      return ret;
>  }
>  
> -
>  static int bdrv_check_byte_request(BlockDriverState *bs, int64_t offset,
>                                     size_t size)
>  {
> diff --git a/block/commit.c b/block/commit.c
> index acec4ac..b10eb79 100644
> --- a/block/commit.c
> +++ b/block/commit.c
> @@ -142,6 +142,7 @@ wait:
>      if (!block_job_is_cancelled(&s->common) && sector_num == end) {
>          /* success */
>          ret = bdrv_drop_intermediate(active, top, base);
> +        base = top;

This is where it is highlighted to me how odd it is to use the side
effects of bdrv_swap() in bdrv_drop_intermediate() for the non-active
layer case.

The function bdrv_drop_intermediate() is now actually pretty complex
and tricky to use, with side effects that the caller needs to beware
of, that change depending on the nature of the arguments passed.

[1] Side affects, depending on active, top, and base:

      active = top |  base = NULL  |   side effect
    -----------------------------------------------
(A) false          |     false     |  top and base are swapped
(B) false          |     true      |  none
(C) true           |     false     |  top and base are swapped
(D) true           |     true      |  assert()


Case (C) is reasonable, because active and base need to be swapped,
and top == active.  It is expected almost by definition.

Case (A) is a bit odd, especially in light of case (B).


>      }
>  
>  exit_free_buf:


Further down, out of the context of this patch, we have:


 exit_restore_reopen:
     /* restore base open flags here if appropriate (e.g., change the base back
      * to r/o). These reopens do not need to be atomic, since we won't abort
      * even on failure here */
     if (s->base_flags != bdrv_get_flags(base)) {
         bdrv_reopen(base, s->base_flags, NULL);
     }

OK, 'base' is the one we want to operate on now, that was set to
'top', which has the contents of the old 'base'.


     overlay_bs = bdrv_find_overlay(active, top);

Will we find the right overlay here?  I think now overlay_bs will
always be NULL, so we won't restore the r/o flags (if set) for the
overlay of the original 'top'.

     if (overlay_bs && s->orig_overlay_flags != bdrv_get_flags(overlay_bs)) {
         bdrv_reopen(overlay_bs, s->orig_overlay_flags, NULL);
     }






> -- 
> 1.8.5.4
> 
> 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH v14 09/14] stream: Use bdrv_drop_intermediate and drop close_unused_images
  2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 09/14] stream: Use bdrv_drop_intermediate and drop close_unused_images Fam Zheng
@ 2014-02-19 21:23   ` Jeff Cody
  0 siblings, 0 replies; 33+ messages in thread
From: Jeff Cody @ 2014-02-19 21:23 UTC (permalink / raw)
  To: Fam Zheng
  Cc: kwolf, benoit.canet, rjones, qemu-devel, armbru, imain, stefanha,
	pbonzini

On Wed, Feb 19, 2014 at 09:42:26PM +0800, Fam Zheng wrote:
> This reuses the new bdrv_drop_intermediate.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  block/stream.c | 30 +-----------------------------
>  1 file changed, 1 insertion(+), 29 deletions(-)
> 
> diff --git a/block/stream.c b/block/stream.c
> index dd0b4ac..9cdcf0e 100644
> --- a/block/stream.c
> +++ b/block/stream.c
> @@ -51,34 +51,6 @@ static int coroutine_fn stream_populate(BlockDriverState *bs,
>      return bdrv_co_copy_on_readv(bs, sector_num, nb_sectors, &qiov);
>  }
>  
> -static void close_unused_images(BlockDriverState *top, BlockDriverState *base,
> -                                const char *base_id)
> -{
> -    BlockDriverState *intermediate;
> -    intermediate = top->backing_hd;
> -
> -    /* Must assign before bdrv_delete() to prevent traversing dangling pointer
> -     * while we delete backing image instances.
> -     */
> -    top->backing_hd = base;
> -
> -    while (intermediate) {
> -        BlockDriverState *unused;
> -
> -        /* reached base */
> -        if (intermediate == base) {
> -            break;
> -        }
> -
> -        unused = intermediate;
> -        intermediate = intermediate->backing_hd;
> -        unused->backing_hd = NULL;
> -        bdrv_unref(unused);
> -    }
> -
> -    bdrv_refresh_limits(top);
> -}
> -
>  static void coroutine_fn stream_run(void *opaque)
>  {
>      StreamBlockJob *s = opaque;
> @@ -192,7 +164,7 @@ wait:
>              }
>          }


>          ret = bdrv_change_backing_file(bs, base_id, base_fmt);

This is redundant if bdrv_drop_intermediate() is used, and can be
removed.

> -        close_unused_images(bs, base, base_id);
> +        bdrv_drop_intermediate(bs, bs->backing_hd, base);

Previously, if base == NULL and base_id == NULL, that meant that we
were streaming from the very first backing file all the way up to the
active layer.  When close_unused_images() was called with base ==
NULL, the BDS chain was flattened and the active layer became all that
was left.

By calling bdrv_drop_intermediate() with base == NULL, we get the same
behavior, so that is OK.

What about if base != NULL here?  It still works OK, but now 'base' is
really 'bs->backing_hd'.  But I don't think we care anymore in this
function, or in the streaming block job, so I think it is OK.



>      }
>  
>      qemu_vfree(buf);
> -- 
> 1.8.5.4
> 
> 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH v14 08/14] block: Support dropping active in bdrv_drop_intermediate
  2014-02-19 21:22   ` Jeff Cody
@ 2014-02-19 23:24     ` Jeff Cody
  2014-02-20  4:37       ` Fam Zheng
  0 siblings, 1 reply; 33+ messages in thread
From: Jeff Cody @ 2014-02-19 23:24 UTC (permalink / raw)
  To: Fam Zheng
  Cc: kwolf, benoit.canet, rjones, qemu-devel, armbru, imain, stefanha,
	pbonzini

On Wed, Feb 19, 2014 at 04:22:30PM -0500, Jeff Cody wrote:
> On Wed, Feb 19, 2014 at 09:42:25PM +0800, Fam Zheng wrote:
> > Dropping intermediate could be useful both for commit and stream, and
> > BDS refcnt plus bdrv_swap could do most of the job nicely. It also needs
> > to work with op blockers.
> > 
> > Signed-off-by: Fam Zheng <famz@redhat.com>
> > ---
> >  block.c        | 146 +++++++++++++++++++++++++--------------------------------
> >  block/commit.c |   1 +
> >  2 files changed, 66 insertions(+), 81 deletions(-)
> > 
> > diff --git a/block.c b/block.c
> > index a2bf24c..cf41f3d 100644
> > --- a/block.c
> > +++ b/block.c
> > @@ -2485,115 +2485,99 @@ BlockDriverState *bdrv_find_overlay(BlockDriverState *active,
> >      return overlay;
> >  }
> >  
> > -typedef struct BlkIntermediateStates {
> > -    BlockDriverState *bs;
> > -    QSIMPLEQ_ENTRY(BlkIntermediateStates) entry;
> > -} BlkIntermediateStates;
> > -
> > -
> >  /*
> > - * Drops images above 'base' up to and including 'top', and sets the image
> > - * above 'top' to have base as its backing file.
> > + * Drops images above 'base' up to and including 'top', and sets new 'base'
> > + * as backing_hd of top_overlay (the image orignally has 'top' as backing
> 
> What is 'top_overlay'?  Do you mean "top's overlay" by this?
> 
> > + * file). top_overlay may be NULL if 'top' is active, no such update needed.
> > + * Requires that the top_overlay to 'top' is opened r/w.
> >   *
> > - * Requires that the overlay to 'top' is opened r/w, so that the backing file
> > - * information in 'bs' can be properly updated.
> > + * 1) This will convert the following chain:
> >   *
> > - * E.g., this will convert the following chain:
> > - * bottom <- base <- intermediate <- top <- active
> > + *     ... <- base <- ... <- top <- overlay <-... <- active
> >   *
> >   * to
> >   *
> > - * bottom <- base <- active
> > + *     ... <- base <- overlay <- active
> >   *
> > - * It is allowed for bottom==base, in which case it converts:
> > + * 2) It is allowed for bottom==base, in which case it converts:
> >   *
> > - * base <- intermediate <- top <- active
> > + *     base <- ... <- top <- overlay <- ... <- active
> >   *
> >   * to
> >   *
> > - * base <- active
> > + *     base <- overlay <- active
> > + *
> > + * 2) It also allows active==top, in which case it converts:
> > + *
> > + *     ... <- base <- ... <- top (active)
> > + *
> > + * to
> > + *
> > + *     ... <- base == active == top
> > + *
> > + * i.e. only base and lower remains: *top == *base when return.
> > + *
> > + * 3) If base==NULL, it will drop all the BDS below overlay and set its
> > + * backing_hd to NULL. I.e.:
> > + *
> > + *     base(NULL) <- ... <- overlay <- ... <- active
> > + *
> > + * to
> >   *
> > - * Error conditions:
> > - *  if active == top, that is considered an error
> > + *     overlay <- ... <- active
> >   *
> >   */
> >  int bdrv_drop_intermediate(BlockDriverState *active, BlockDriverState *top,
> >                             BlockDriverState *base)
> 
> With the active case, we aren't necessarily really just dropping
> intermediate images anymore. Maybe we should rename this function now to
> 'bdrv_rebase_chain()'?
> 
> >  {
> > -    BlockDriverState *intermediate;
> > -    BlockDriverState *base_bs = NULL;
> > -    BlockDriverState *new_top_bs = NULL;
> > -    BlkIntermediateStates *intermediate_state, *next;
> > -    int ret = -EIO;
> > -
> > -    QSIMPLEQ_HEAD(states_to_delete, BlkIntermediateStates) states_to_delete;
> > -    QSIMPLEQ_INIT(&states_to_delete);
> > -
> > -    if (!top->drv || !base->drv) {
> > -        goto exit;
> > -    }
> > -
> > -    new_top_bs = bdrv_find_overlay(active, top);
> > +    BlockDriverState *drop_start, *overlay;
> > +    int ret = -EINVAL;
> >  
> > -    if (new_top_bs == NULL) {
> > -        /* we could not find the image above 'top', this is an error */
> > +    if (!top->drv || (base && !base->drv)) {
> >          goto exit;
> >      }
> > -
> > -    /* special case of new_top_bs->backing_hd already pointing to base - nothing
> > -     * to do, no intermediate images */
> > -    if (new_top_bs->backing_hd == base) {
> > +    if (top == base) {
> >          ret = 0;
> > -        goto exit;
> > -    }
> > -
> > -    intermediate = top;
> > -
> > -    /* now we will go down through the list, and add each BDS we find
> > -     * into our deletion queue, until we hit the 'base'
> > -     */
> > -    while (intermediate) {
> > -        intermediate_state = g_malloc0(sizeof(BlkIntermediateStates));
> > -        intermediate_state->bs = intermediate;
> > -        QSIMPLEQ_INSERT_TAIL(&states_to_delete, intermediate_state, entry);
> > -
> > -        if (intermediate->backing_hd == base) {
> > -            base_bs = intermediate->backing_hd;
> > -            break;
> > +    } else if (top == active) {
> > +        assert(base);
> > +        drop_start = active->backing_hd;
> > +        bdrv_swap(active, base);
> > +        base->backing_hd = NULL;
> > +        bdrv_unref(drop_start);
> > +        ret = 0;
> 
> This now orphans everything between active->backing_hd and the
> original base, without performing a bdrv_unref/delete on them.
>

Actually, nevermind on that - bdrv_delete() will end up recursively
closing the chain, until it reaches a NULL backing_hd.

So we have something like this:


    |||<-- ([base]) <-- [B] <-- [A] <----- ([active])

bdrv_swap(active,base):

    ------ ([active]) <-- [B] <-- [A]  |||-- ([base])
    |                              ^
    |                              |
    --------------------------------

base->backing_hd = NULL:

    |||<-- ([active]) <-- [B] <-- [A]  |||-- ([base])


bdrv_unref(drop_start) will then close from [A] down until it reaches
a NULL backing_hd, and all we will be left is:

  |||-- ([base])


So we are OK.



> > +    } else {
> > +        /* If there's an overlay, its backing_hd points to top's BDS now,
> > +         * the top image is dropped but this BDS structure is kept and swapped
> > +         * with base, this way we keep the pointers valid after dropping top */
> > +        overlay = bdrv_find_overlay(active, top);
> > +        if (!overlay) {
> > +            goto exit;
> > +        }
> 
> 
> > +        if (base) {
> > +            ret = bdrv_change_backing_file(overlay, base->filename,
> > +                                           base->drv->format_name);
> > +        } else {
> > +            ret = bdrv_change_backing_file(overlay, NULL, NULL);
> > +        }
> > +        if (ret) {
> > +            goto exit;
> > +        }
> > +        if (base) {
> > +            drop_start = top->backing_hd;
> > +            bdrv_swap(top, base);
> > +            /* Break the loop formed by bdrv_swap */
> > +            bdrv_set_backing_hd(base, NULL);
> 
> And in the non-active case here, everything between top->backing_hd
> and the original base is orphaned as well.  These should all be
> explicitly unreferenced.

Same here, bdrv_unref() will eventually go through the chain, starting
from top->backing_hd.  But this is a problem; won't we end up in a
loop then?

Take this chain:

drop_start = [A]

    |||-- ([base]) <-- [B] <--- [A] <--- ([top]) <--- [active]


bdrv_swap(top, base):

    -- [B] <-- [A] <-- ([top])    |||--- ([base]) <-- [active]
    |                    ^
    |                    |
    ---------------------

Then we call bdrv_unref(drop_start (or bdrv_set_backing_hd() does),
and we end up with:

bdrv_unref(A)
    bdrv_unref(B)
        bdrv_unref(top)
            bdrv_unref(A) <--- assert
                .....
            

So I think we want this line:

> > +            bdrv_set_backing_hd(base, NULL);

To be:

> > +            bdrv_set_backing_hd(top, NULL);


Right?  Or, just set top->backing_hd = NULL, so we get:

    -- [B] <-- [A]   |||-- ([top])    |||--- ([base]) <-- [active]
    |                         ^
    |                         |
    ---------------------------

bdrv_unref(A)
    bdrv_unref(B)
        bdrv_unref(top)


Which leaves:

    |||--- ([base]) <-- [active]


So this part above still needs addressing, I think.

> 
> Also, side effect:
> Caller needs to beware now that base and top are now swapped [1].
> 
> > +        } else {
> > +            bdrv_set_backing_hd(overlay, NULL);
> > +            drop_start = top;
> 
> Again, everything between top and the original base is orphaned, but
> should be cleaned up.
>
> Caller does not have to worry about base and top being swapped [1].
>

This should be fine, I think.


I think everything else I mentioned below this point is still
relevant, however.

> 
> >          }
> > -        intermediate = intermediate->backing_hd;
> > -    }
> > -    if (base_bs == NULL) {
> > -        /* something went wrong, we did not end at the base. safely
> > -         * unravel everything, and exit with error */
> > -        goto exit;
> > -    }
> > -
> > -    /* success - we can delete the intermediate states, and link top->base */
> > -    ret = bdrv_change_backing_file(new_top_bs, base_bs->filename,
> > -                                   base_bs->drv ? base_bs->drv->format_name : "");
> > -    if (ret) {
> > -        goto exit;
> > -    }
> > -    new_top_bs->backing_hd = base_bs;
> > -
> > -    bdrv_refresh_limits(new_top_bs);
> >  
> > -    QSIMPLEQ_FOREACH_SAFE(intermediate_state, &states_to_delete, entry, next) {
> > -        /* so that bdrv_close() does not recursively close the chain */
> > -        intermediate_state->bs->backing_hd = NULL;
> > -        bdrv_unref(intermediate_state->bs);
> > +        bdrv_unref(drop_start);
> 
> We will get an assertion here.  In the non-active case, the backing_hd
> is explicitly set to NULL via bdrv_set_backing_hd().  That function
> will call bdrv_unref() on the same BDS that drop_start was assigned,
> so we have a double call to bdrv_unref().
> 
> >      }
> > -    ret = 0;
> > -
> >  exit:
> > -    QSIMPLEQ_FOREACH_SAFE(intermediate_state, &states_to_delete, entry, next) {
> > -        g_free(intermediate_state);
> > -    }
> >      return ret;
> >  }
> >  
> > -
> >  static int bdrv_check_byte_request(BlockDriverState *bs, int64_t offset,
> >                                     size_t size)
> >  {
> > diff --git a/block/commit.c b/block/commit.c
> > index acec4ac..b10eb79 100644
> > --- a/block/commit.c
> > +++ b/block/commit.c
> > @@ -142,6 +142,7 @@ wait:
> >      if (!block_job_is_cancelled(&s->common) && sector_num == end) {
> >          /* success */
> >          ret = bdrv_drop_intermediate(active, top, base);
> > +        base = top;
> 
> This is where it is highlighted to me how odd it is to use the side
> effects of bdrv_swap() in bdrv_drop_intermediate() for the non-active
> layer case.
> 
> The function bdrv_drop_intermediate() is now actually pretty complex
> and tricky to use, with side effects that the caller needs to beware
> of, that change depending on the nature of the arguments passed.
> 
> [1] Side affects, depending on active, top, and base:
> 
>       active = top |  base = NULL  |   side effect
>     -----------------------------------------------
> (A) false          |     false     |  top and base are swapped
> (B) false          |     true      |  none
> (C) true           |     false     |  top and base are swapped
> (D) true           |     true      |  assert()
> 
> 
> Case (C) is reasonable, because active and base need to be swapped,
> and top == active.  It is expected almost by definition.
> 
> Case (A) is a bit odd, especially in light of case (B).
> 
> 
> >      }
> >  
> >  exit_free_buf:
> 
> 
> Further down, out of the context of this patch, we have:
> 
> 
>  exit_restore_reopen:
>      /* restore base open flags here if appropriate (e.g., change the base back
>       * to r/o). These reopens do not need to be atomic, since we won't abort
>       * even on failure here */
>      if (s->base_flags != bdrv_get_flags(base)) {
>          bdrv_reopen(base, s->base_flags, NULL);
>      }
> 
> OK, 'base' is the one we want to operate on now, that was set to
> 'top', which has the contents of the old 'base'.
> 
> 
>      overlay_bs = bdrv_find_overlay(active, top);
> 
> Will we find the right overlay here?  I think now overlay_bs will
> always be NULL, so we won't restore the r/o flags (if set) for the
> overlay of the original 'top'.
> 
>      if (overlay_bs && s->orig_overlay_flags != bdrv_get_flags(overlay_bs)) {
>          bdrv_reopen(overlay_bs, s->orig_overlay_flags, NULL);
>      }
> 
> 
> 
> 
> 
> 
> > -- 
> > 1.8.5.4
> > 
> > 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH v14 08/14] block: Support dropping active in bdrv_drop_intermediate
  2014-02-19 23:24     ` Jeff Cody
@ 2014-02-20  4:37       ` Fam Zheng
  2014-02-20  5:57         ` Jeff Cody
  0 siblings, 1 reply; 33+ messages in thread
From: Fam Zheng @ 2014-02-20  4:37 UTC (permalink / raw)
  To: Jeff Cody
  Cc: kwolf, benoit.canet, rjones, qemu-devel, armbru, imain, stefanha,
	pbonzini

On Wed, 02/19 18:24, Jeff Cody wrote:
> On Wed, Feb 19, 2014 at 04:22:30PM -0500, Jeff Cody wrote:
> > On Wed, Feb 19, 2014 at 09:42:25PM +0800, Fam Zheng wrote:
> > > Dropping intermediate could be useful both for commit and stream, and
> > > BDS refcnt plus bdrv_swap could do most of the job nicely. It also needs
> > > to work with op blockers.
> > > 
> > > Signed-off-by: Fam Zheng <famz@redhat.com>
> > > ---
> > >  block.c        | 146 +++++++++++++++++++++++++--------------------------------
> > >  block/commit.c |   1 +
> > >  2 files changed, 66 insertions(+), 81 deletions(-)
> > > 
> > > diff --git a/block.c b/block.c
> > > index a2bf24c..cf41f3d 100644
> > > --- a/block.c
> > > +++ b/block.c
> > > @@ -2485,115 +2485,99 @@ BlockDriverState *bdrv_find_overlay(BlockDriverState *active,
> > >      return overlay;
> > >  }
> > >  
> > > -typedef struct BlkIntermediateStates {
> > > -    BlockDriverState *bs;
> > > -    QSIMPLEQ_ENTRY(BlkIntermediateStates) entry;
> > > -} BlkIntermediateStates;
> > > -
> > > -
> > >  /*
> > > - * Drops images above 'base' up to and including 'top', and sets the image
> > > - * above 'top' to have base as its backing file.
> > > + * Drops images above 'base' up to and including 'top', and sets new 'base'
> > > + * as backing_hd of top_overlay (the image orignally has 'top' as backing
> > 
> > What is 'top_overlay'?  Do you mean "top's overlay" by this?

Yes, as noted in the parenthesis.

> > 
> > > + * file). top_overlay may be NULL if 'top' is active, no such update needed.
> > > + * Requires that the top_overlay to 'top' is opened r/w.
> > >   *
> > > - * Requires that the overlay to 'top' is opened r/w, so that the backing file
> > > - * information in 'bs' can be properly updated.
> > > + * 1) This will convert the following chain:
> > >   *
> > > - * E.g., this will convert the following chain:
> > > - * bottom <- base <- intermediate <- top <- active
> > > + *     ... <- base <- ... <- top <- overlay <-... <- active
> > >   *
> > >   * to
> > >   *
> > > - * bottom <- base <- active
> > > + *     ... <- base <- overlay <- active
> > >   *
> > > - * It is allowed for bottom==base, in which case it converts:
> > > + * 2) It is allowed for bottom==base, in which case it converts:
> > >   *
> > > - * base <- intermediate <- top <- active
> > > + *     base <- ... <- top <- overlay <- ... <- active
> > >   *
> > >   * to
> > >   *
> > > - * base <- active
> > > + *     base <- overlay <- active
> > > + *
> > > + * 2) It also allows active==top, in which case it converts:
> > > + *
> > > + *     ... <- base <- ... <- top (active)
> > > + *
> > > + * to
> > > + *
> > > + *     ... <- base == active == top
> > > + *
> > > + * i.e. only base and lower remains: *top == *base when return.
> > > + *
> > > + * 3) If base==NULL, it will drop all the BDS below overlay and set its
> > > + * backing_hd to NULL. I.e.:
> > > + *
> > > + *     base(NULL) <- ... <- overlay <- ... <- active
> > > + *
> > > + * to
> > >   *
> > > - * Error conditions:
> > > - *  if active == top, that is considered an error
> > > + *     overlay <- ... <- active
> > >   *
> > >   */
> > >  int bdrv_drop_intermediate(BlockDriverState *active, BlockDriverState *top,
> > >                             BlockDriverState *base)
> > 
> > With the active case, we aren't necessarily really just dropping
> > intermediate images anymore. Maybe we should rename this function now to
> > 'bdrv_rebase_chain()'?
> > 
> > >  {
> > > -    BlockDriverState *intermediate;
> > > -    BlockDriverState *base_bs = NULL;
> > > -    BlockDriverState *new_top_bs = NULL;
> > > -    BlkIntermediateStates *intermediate_state, *next;
> > > -    int ret = -EIO;
> > > -
> > > -    QSIMPLEQ_HEAD(states_to_delete, BlkIntermediateStates) states_to_delete;
> > > -    QSIMPLEQ_INIT(&states_to_delete);
> > > -
> > > -    if (!top->drv || !base->drv) {
> > > -        goto exit;
> > > -    }
> > > -
> > > -    new_top_bs = bdrv_find_overlay(active, top);
> > > +    BlockDriverState *drop_start, *overlay;
> > > +    int ret = -EINVAL;
> > >  
> > > -    if (new_top_bs == NULL) {
> > > -        /* we could not find the image above 'top', this is an error */
> > > +    if (!top->drv || (base && !base->drv)) {
> > >          goto exit;
> > >      }
> > > -
> > > -    /* special case of new_top_bs->backing_hd already pointing to base - nothing
> > > -     * to do, no intermediate images */
> > > -    if (new_top_bs->backing_hd == base) {
> > > +    if (top == base) {
> > >          ret = 0;
> > > -        goto exit;
> > > -    }
> > > -
> > > -    intermediate = top;
> > > -
> > > -    /* now we will go down through the list, and add each BDS we find
> > > -     * into our deletion queue, until we hit the 'base'
> > > -     */
> > > -    while (intermediate) {
> > > -        intermediate_state = g_malloc0(sizeof(BlkIntermediateStates));
> > > -        intermediate_state->bs = intermediate;
> > > -        QSIMPLEQ_INSERT_TAIL(&states_to_delete, intermediate_state, entry);
> > > -
> > > -        if (intermediate->backing_hd == base) {
> > > -            base_bs = intermediate->backing_hd;
> > > -            break;
> > > +    } else if (top == active) {
> > > +        assert(base);
> > > +        drop_start = active->backing_hd;
> > > +        bdrv_swap(active, base);
> > > +        base->backing_hd = NULL;
> > > +        bdrv_unref(drop_start);
> > > +        ret = 0;
> > > +    } else {
> > > +        /* If there's an overlay, its backing_hd points to top's BDS now,
> > > +         * the top image is dropped but this BDS structure is kept and swapped
> > > +         * with base, this way we keep the pointers valid after dropping top */
> > > +        overlay = bdrv_find_overlay(active, top);
> > > +        if (!overlay) {
> > > +            goto exit;
> > > +        }
> > > +        if (base) {
> > > +            ret = bdrv_change_backing_file(overlay, base->filename,
> > > +                                           base->drv->format_name);
> > > +        } else {
> > > +            ret = bdrv_change_backing_file(overlay, NULL, NULL);
> > > +        }
> > > +        if (ret) {
> > > +            goto exit;
> > > +        }
> > > +        if (base) {
> > > +            drop_start = top->backing_hd;
> > > +            bdrv_swap(top, base);
> > > +            /* Break the loop formed by bdrv_swap */
> > > +            bdrv_set_backing_hd(base, NULL);
> > 
> > And in the non-active case here, everything between top->backing_hd
> > and the original base is orphaned as well.  These should all be
> > explicitly unreferenced.
> 
> Same here, bdrv_unref() will eventually go through the chain, starting
> from top->backing_hd.  But this is a problem; won't we end up in a
> loop then?

Although the content is swapped, the pointer is not:

(I presume your "[base]" and "[top]" are denoting content, not pointer)

> 
> Take this chain:
> 
> drop_start = [A]
> 
>     |||-- ([base]) <-- [B] <--- [A] <--- ([top]) <--- [active]
               ^                              ^
               |                              |
              base                           top
> 
> 
> bdrv_swap(top, base):
> 
>     -- [B] <-- [A] <-- ([top])    |||--- ([base]) <-- [active]
                            ^                 ^
                            |                 |
                           base               top
>     |                    ^
>     |                    |
>     ---------------------
> 
> Then we call bdrv_unref(drop_start (or bdrv_set_backing_hd() does),
> and we end up with:
> 
> bdrv_unref(A)
>     bdrv_unref(B)
>         bdrv_unref(top)
>             bdrv_unref(A) <--- assert
>                 .....
>             
> 
> So I think we want this line:
> 
> > > +            bdrv_set_backing_hd(base, NULL);

so, this breaks the chain,

> 
> To be:
> 
> > > +            bdrv_set_backing_hd(top, NULL);

This will lose track of original base's backing_hd.

So I think we are OK here.

But I find that a fix is needed in bdrv_set_backing_hd to handle the rebase
correctly.

> 
> 
> Right?  Or, just set top->backing_hd = NULL, so we get:
> 
>     -- [B] <-- [A]   |||-- ([top])    |||--- ([base]) <-- [active]
>     |                         ^
>     |                         |
>     ---------------------------
> 
> bdrv_unref(A)
>     bdrv_unref(B)
>         bdrv_unref(top)
> 
> 
> Which leaves:
> 
>     |||--- ([base]) <-- [active]
> 
> 
> So this part above still needs addressing, I think.
> 
> > 
> > Also, side effect:
> > Caller needs to beware now that base and top are now swapped [1].
> > 
> > > +        } else {
> > > +            bdrv_set_backing_hd(overlay, NULL);
> > > +            drop_start = top;
> > 
> > Again, everything between top and the original base is orphaned, but
> > should be cleaned up.
> >
> > Caller does not have to worry about base and top being swapped [1].
> >
> 
> This should be fine, I think.
> 
> 
> I think everything else I mentioned below this point is still
> relevant, however.
> 
> > 
> > >          }
> > > -        intermediate = intermediate->backing_hd;
> > > -    }
> > > -    if (base_bs == NULL) {
> > > -        /* something went wrong, we did not end at the base. safely
> > > -         * unravel everything, and exit with error */
> > > -        goto exit;
> > > -    }
> > > -
> > > -    /* success - we can delete the intermediate states, and link top->base */
> > > -    ret = bdrv_change_backing_file(new_top_bs, base_bs->filename,
> > > -                                   base_bs->drv ? base_bs->drv->format_name : "");
> > > -    if (ret) {
> > > -        goto exit;
> > > -    }
> > > -    new_top_bs->backing_hd = base_bs;
> > > -
> > > -    bdrv_refresh_limits(new_top_bs);
> > >  
> > > -    QSIMPLEQ_FOREACH_SAFE(intermediate_state, &states_to_delete, entry, next) {
> > > -        /* so that bdrv_close() does not recursively close the chain */
> > > -        intermediate_state->bs->backing_hd = NULL;
> > > -        bdrv_unref(intermediate_state->bs);
> > > +        bdrv_unref(drop_start);
> > 
> > We will get an assertion here.  In the non-active case, the backing_hd
> > is explicitly set to NULL via bdrv_set_backing_hd().  That function
> > will call bdrv_unref() on the same BDS that drop_start was assigned,
> > so we have a double call to bdrv_unref().
> > 
> > >      }
> > > -    ret = 0;
> > > -
> > >  exit:
> > > -    QSIMPLEQ_FOREACH_SAFE(intermediate_state, &states_to_delete, entry, next) {
> > > -        g_free(intermediate_state);
> > > -    }
> > >      return ret;
> > >  }
> > >  
> > > -
> > >  static int bdrv_check_byte_request(BlockDriverState *bs, int64_t offset,
> > >                                     size_t size)
> > >  {
> > > diff --git a/block/commit.c b/block/commit.c
> > > index acec4ac..b10eb79 100644
> > > --- a/block/commit.c
> > > +++ b/block/commit.c
> > > @@ -142,6 +142,7 @@ wait:
> > >      if (!block_job_is_cancelled(&s->common) && sector_num == end) {
> > >          /* success */
> > >          ret = bdrv_drop_intermediate(active, top, base);
> > > +        base = top;
> > 
> > This is where it is highlighted to me how odd it is to use the side
> > effects of bdrv_swap() in bdrv_drop_intermediate() for the non-active
> > layer case.
> > 
> > The function bdrv_drop_intermediate() is now actually pretty complex
> > and tricky to use, with side effects that the caller needs to beware
> > of, that change depending on the nature of the arguments passed.
> > 
> > [1] Side affects, depending on active, top, and base:
> > 
> >       active = top |  base = NULL  |   side effect
> >     -----------------------------------------------
> > (A) false          |     false     |  top and base are swapped
> > (B) false          |     true      |  none
> > (C) true           |     false     |  top and base are swapped
> > (D) true           |     true      |  assert()
> > 
> > 
> > Case (C) is reasonable, because active and base need to be swapped,
> > and top == active.  It is expected almost by definition.
> > 
> > Case (A) is a bit odd, especially in light of case (B).

Makes sense, I will remove this side effect.

> > 
> > 
> > >      }
> > >  
> > >  exit_free_buf:
> > 
> > 
> > Further down, out of the context of this patch, we have:
> > 
> > 
> >  exit_restore_reopen:
> >      /* restore base open flags here if appropriate (e.g., change the base back
> >       * to r/o). These reopens do not need to be atomic, since we won't abort
> >       * even on failure here */
> >      if (s->base_flags != bdrv_get_flags(base)) {
> >          bdrv_reopen(base, s->base_flags, NULL);
> >      }
> > 
> > OK, 'base' is the one we want to operate on now, that was set to
> > 'top', which has the contents of the old 'base'.
> > 

If I remove the swap, we don't need to set base to top here.

> > 
> >      overlay_bs = bdrv_find_overlay(active, top);
> > 
> > Will we find the right overlay here?  I think now overlay_bs will
> > always be NULL, so we won't restore the r/o flags (if set) for the
> > overlay of the original 'top'.
> > 
> >      if (overlay_bs && s->orig_overlay_flags != bdrv_get_flags(overlay_bs)) {
> >          bdrv_reopen(overlay_bs, s->orig_overlay_flags, NULL);
> >      }

Yes, need a fix here.

Thanks,
Fam

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH v14 06/14] block: Add backing_blocker in BlockDriverState
  2014-02-19 21:17   ` Jeff Cody
@ 2014-02-20  5:01     ` Fam Zheng
  2014-02-20  5:08       ` Jeff Cody
  0 siblings, 1 reply; 33+ messages in thread
From: Fam Zheng @ 2014-02-20  5:01 UTC (permalink / raw)
  To: Jeff Cody
  Cc: kwolf, benoit.canet, rjones, qemu-devel, armbru, imain, stefanha,
	pbonzini

On Wed, 02/19 16:17, Jeff Cody wrote:
> On Wed, Feb 19, 2014 at 09:42:23PM +0800, Fam Zheng wrote:
> > This makes use of op_blocker and blocks all the operations except for
> > commit target, on each BlockDriverState->backing_hd.
> > 
> > The asserts for op_blocker in bdrv_swap are removed because with this
> > change, the target of block commit has at least the backing blocker of
> > its child, so the assertion is not true. Callers should do their check.
> > 
> > Signed-off-by: Fam Zheng <famz@redhat.com>
> > ---
> >  block.c                   | 19 +++++++++++++++----
> >  include/block/block_int.h |  3 +++
> >  2 files changed, 18 insertions(+), 4 deletions(-)
> > 
> > diff --git a/block.c b/block.c
> > index dec44d4..95d8c1f 100644
> > --- a/block.c
> > +++ b/block.c
> > @@ -1044,19 +1044,33 @@ fail:
> >  void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd)
> >  {
> >      if (bs->backing_hd) {
> > +        assert(error_is_set(&bs->backing_blocker));
> 
> When I run block-commit, on either the active or non-active layer, I
> get an assertion here.  The qemu-iotests do not catch it, and I
> presume it is because happens a couple of seconds or so after the
> success message is returned over QMP.
> 

I can't reproduce this, could you give some specific steps? Thanks.

Fam

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH v14 06/14] block: Add backing_blocker in BlockDriverState
  2014-02-20  5:01     ` Fam Zheng
@ 2014-02-20  5:08       ` Jeff Cody
  2014-02-20  8:28         ` Fam Zheng
  0 siblings, 1 reply; 33+ messages in thread
From: Jeff Cody @ 2014-02-20  5:08 UTC (permalink / raw)
  To: Fam Zheng
  Cc: kwolf, benoit.canet, rjones, qemu-devel, armbru, imain, stefanha,
	pbonzini

On Thu, Feb 20, 2014 at 01:01:38PM +0800, Fam Zheng wrote:
> On Wed, 02/19 16:17, Jeff Cody wrote:
> > On Wed, Feb 19, 2014 at 09:42:23PM +0800, Fam Zheng wrote:
> > > This makes use of op_blocker and blocks all the operations except for
> > > commit target, on each BlockDriverState->backing_hd.
> > > 
> > > The asserts for op_blocker in bdrv_swap are removed because with this
> > > change, the target of block commit has at least the backing blocker of
> > > its child, so the assertion is not true. Callers should do their check.
> > > 
> > > Signed-off-by: Fam Zheng <famz@redhat.com>
> > > ---
> > >  block.c                   | 19 +++++++++++++++----
> > >  include/block/block_int.h |  3 +++
> > >  2 files changed, 18 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/block.c b/block.c
> > > index dec44d4..95d8c1f 100644
> > > --- a/block.c
> > > +++ b/block.c
> > > @@ -1044,19 +1044,33 @@ fail:
> > >  void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd)
> > >  {
> > >      if (bs->backing_hd) {
> > > +        assert(error_is_set(&bs->backing_blocker));
> > 
> > When I run block-commit, on either the active or non-active layer, I
> > get an assertion here.  The qemu-iotests do not catch it, and I
> > presume it is because happens a couple of seconds or so after the
> > success message is returned over QMP.
> > 
> 
> I can't reproduce this, could you give some specific steps? Thanks.
>

Sure - I am guessing the key is performing some live block snapshots
first.  Here is what I did (this is from memory, but I think the steps
are right):

Nothing special really about the cmdline:
qemu-system-x86_64 -drive file=/home/jtc/test.qcow2,if=virtio -qmp stdio ...

The QMP commands:

For the non-active layer case:

{ "execute": "qmp_capabilities" }
{ "execute": "blockdev-snapshot-sync", "arguments": { "device": "virtio0","snapshot-file":"/tmp/snap1.qcow2","format": "qcow2" } }
{ "execute": "blockdev-snapshot-sync", "arguments": { "device": "virtio0","snapshot-file":"/tmp/snap2.qcow2","format": "qcow2" } }
{ "execute": "block-commit", "arguments": { "device": "virtio0", "top": "/tmp/snap1.qcow2" } }


For the active layer case (I think I still had 2 snapshots here, not
entirely positive):

{ "execute": "qmp_capabilities" }
{ "execute": "blockdev-snapshot-sync", "arguments": { "device": "virtio0","snapshot-file":"/tmp/snap1.qcow2","format": "qcow2" } }
{ "execute": "blockdev-snapshot-sync", "arguments": { "device": "virtio0","snapshot-file":"/tmp/snap2.qcow2","format": "qcow2" } }
{ "execute": "block-commit", "arguments": { "device": "virtio0", "top": "/tmp/snap2.qcow2" } }
{ "execute": "block-job-complete", "arguments": { "device": "virtio0" }}

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH v14 08/14] block: Support dropping active in bdrv_drop_intermediate
  2014-02-20  4:37       ` Fam Zheng
@ 2014-02-20  5:57         ` Jeff Cody
  2014-02-20  8:34           ` Fam Zheng
  0 siblings, 1 reply; 33+ messages in thread
From: Jeff Cody @ 2014-02-20  5:57 UTC (permalink / raw)
  To: Fam Zheng
  Cc: kwolf, benoit.canet, rjones, qemu-devel, armbru, imain, stefanha,
	pbonzini

On Thu, Feb 20, 2014 at 12:37:17PM +0800, Fam Zheng wrote:
> On Wed, 02/19 18:24, Jeff Cody wrote:
> > On Wed, Feb 19, 2014 at 04:22:30PM -0500, Jeff Cody wrote:
> > > On Wed, Feb 19, 2014 at 09:42:25PM +0800, Fam Zheng wrote:
> > > > Dropping intermediate could be useful both for commit and stream, and
> > > > BDS refcnt plus bdrv_swap could do most of the job nicely. It also needs
> > > > to work with op blockers.
> > > > 
> > > > Signed-off-by: Fam Zheng <famz@redhat.com>
> > > > ---
> > > >  block.c        | 146 +++++++++++++++++++++++++--------------------------------
> > > >  block/commit.c |   1 +
> > > >  2 files changed, 66 insertions(+), 81 deletions(-)
> > > > 
> > > > diff --git a/block.c b/block.c
> > > > index a2bf24c..cf41f3d 100644
> > > > --- a/block.c
> > > > +++ b/block.c
> > > > @@ -2485,115 +2485,99 @@ BlockDriverState *bdrv_find_overlay(BlockDriverState *active,
> > > >      return overlay;
> > > >  }
> > > >  
> > > > -typedef struct BlkIntermediateStates {
> > > > -    BlockDriverState *bs;
> > > > -    QSIMPLEQ_ENTRY(BlkIntermediateStates) entry;
> > > > -} BlkIntermediateStates;
> > > > -
> > > > -
> > > >  /*
> > > > - * Drops images above 'base' up to and including 'top', and sets the image
> > > > - * above 'top' to have base as its backing file.
> > > > + * Drops images above 'base' up to and including 'top', and sets new 'base'
> > > > + * as backing_hd of top_overlay (the image orignally has 'top' as backing
> > > 
> > > What is 'top_overlay'?  Do you mean "top's overlay" by this?
> 
> Yes, as noted in the parenthesis.
>

I would just say "top's overlay".  What I found confusing by that, is
when you reference something like 'top_overlay', it looks like an
actual variable name.  So I was searching for that variable name, and
wondered if it was just vestigial from an earlier revision.  Maybe
that is just me, though :)

> > > 
> > > > + * file). top_overlay may be NULL if 'top' is active, no such update needed.
> > > > + * Requires that the top_overlay to 'top' is opened r/w.
> > > >   *
> > > > - * Requires that the overlay to 'top' is opened r/w, so that the backing file
> > > > - * information in 'bs' can be properly updated.
> > > > + * 1) This will convert the following chain:
> > > >   *
> > > > - * E.g., this will convert the following chain:
> > > > - * bottom <- base <- intermediate <- top <- active
> > > > + *     ... <- base <- ... <- top <- overlay <-... <- active
> > > >   *
> > > >   * to
> > > >   *
> > > > - * bottom <- base <- active
> > > > + *     ... <- base <- overlay <- active
> > > >   *
> > > > - * It is allowed for bottom==base, in which case it converts:
> > > > + * 2) It is allowed for bottom==base, in which case it converts:
> > > >   *
> > > > - * base <- intermediate <- top <- active
> > > > + *     base <- ... <- top <- overlay <- ... <- active
> > > >   *
> > > >   * to
> > > >   *
> > > > - * base <- active
> > > > + *     base <- overlay <- active
> > > > + *
> > > > + * 2) It also allows active==top, in which case it converts:
> > > > + *
> > > > + *     ... <- base <- ... <- top (active)
> > > > + *
> > > > + * to
> > > > + *
> > > > + *     ... <- base == active == top
> > > > + *
> > > > + * i.e. only base and lower remains: *top == *base when return.
> > > > + *
> > > > + * 3) If base==NULL, it will drop all the BDS below overlay and set its
> > > > + * backing_hd to NULL. I.e.:
> > > > + *
> > > > + *     base(NULL) <- ... <- overlay <- ... <- active
> > > > + *
> > > > + * to
> > > >   *
> > > > - * Error conditions:
> > > > - *  if active == top, that is considered an error
> > > > + *     overlay <- ... <- active
> > > >   *
> > > >   */
> > > >  int bdrv_drop_intermediate(BlockDriverState *active, BlockDriverState *top,
> > > >                             BlockDriverState *base)
> > > 
> > > With the active case, we aren't necessarily really just dropping
> > > intermediate images anymore. Maybe we should rename this function now to
> > > 'bdrv_rebase_chain()'?
> > > 
> > > >  {
> > > > -    BlockDriverState *intermediate;
> > > > -    BlockDriverState *base_bs = NULL;
> > > > -    BlockDriverState *new_top_bs = NULL;
> > > > -    BlkIntermediateStates *intermediate_state, *next;
> > > > -    int ret = -EIO;
> > > > -
> > > > -    QSIMPLEQ_HEAD(states_to_delete, BlkIntermediateStates) states_to_delete;
> > > > -    QSIMPLEQ_INIT(&states_to_delete);
> > > > -
> > > > -    if (!top->drv || !base->drv) {
> > > > -        goto exit;
> > > > -    }
> > > > -
> > > > -    new_top_bs = bdrv_find_overlay(active, top);
> > > > +    BlockDriverState *drop_start, *overlay;
> > > > +    int ret = -EINVAL;
> > > >  
> > > > -    if (new_top_bs == NULL) {
> > > > -        /* we could not find the image above 'top', this is an error */
> > > > +    if (!top->drv || (base && !base->drv)) {
> > > >          goto exit;
> > > >      }
> > > > -
> > > > -    /* special case of new_top_bs->backing_hd already pointing to base - nothing
> > > > -     * to do, no intermediate images */
> > > > -    if (new_top_bs->backing_hd == base) {
> > > > +    if (top == base) {
> > > >          ret = 0;
> > > > -        goto exit;
> > > > -    }
> > > > -
> > > > -    intermediate = top;
> > > > -
> > > > -    /* now we will go down through the list, and add each BDS we find
> > > > -     * into our deletion queue, until we hit the 'base'
> > > > -     */
> > > > -    while (intermediate) {
> > > > -        intermediate_state = g_malloc0(sizeof(BlkIntermediateStates));
> > > > -        intermediate_state->bs = intermediate;
> > > > -        QSIMPLEQ_INSERT_TAIL(&states_to_delete, intermediate_state, entry);
> > > > -
> > > > -        if (intermediate->backing_hd == base) {
> > > > -            base_bs = intermediate->backing_hd;
> > > > -            break;
> > > > +    } else if (top == active) {
> > > > +        assert(base);
> > > > +        drop_start = active->backing_hd;
> > > > +        bdrv_swap(active, base);
> > > > +        base->backing_hd = NULL;
> > > > +        bdrv_unref(drop_start);
> > > > +        ret = 0;
> > > > +    } else {
> > > > +        /* If there's an overlay, its backing_hd points to top's BDS now,
> > > > +         * the top image is dropped but this BDS structure is kept and swapped
> > > > +         * with base, this way we keep the pointers valid after dropping top */
> > > > +        overlay = bdrv_find_overlay(active, top);
> > > > +        if (!overlay) {
> > > > +            goto exit;
> > > > +        }
> > > > +        if (base) {
> > > > +            ret = bdrv_change_backing_file(overlay, base->filename,
> > > > +                                           base->drv->format_name);
> > > > +        } else {
> > > > +            ret = bdrv_change_backing_file(overlay, NULL, NULL);
> > > > +        }
> > > > +        if (ret) {
> > > > +            goto exit;
> > > > +        }
> > > > +        if (base) {
> > > > +            drop_start = top->backing_hd;
> > > > +            bdrv_swap(top, base);
> > > > +            /* Break the loop formed by bdrv_swap */
> > > > +            bdrv_set_backing_hd(base, NULL);
> > > 
> > > And in the non-active case here, everything between top->backing_hd
> > > and the original base is orphaned as well.  These should all be
> > > explicitly unreferenced.
> > 
> > Same here, bdrv_unref() will eventually go through the chain, starting
> > from top->backing_hd.  But this is a problem; won't we end up in a
> > loop then?
> 
> Although the content is swapped, the pointer is not:
> 
> (I presume your "[base]" and "[top]" are denoting content, not pointer)
>

Correct.  But part of the content that is swapped, are the backing_hd
pointers.

> > 
> > Take this chain:
> > 
> > drop_start = [A]
> > 
> >     |||-- ([base]) <-- [B] <--- [A] <--- ([top]) <--- [active]
>                ^                              ^
>                |                              |
>               base                           top
> > 
> > 
> > bdrv_swap(top, base):
> > 
> >     -- [B] <-- [A] <-- ([top])    |||--- ([base]) <-- [active]
>                             ^                 ^
>                             |                 |
>                            base               top
> >     |                    ^
> >     |                    |
> >     ---------------------
> > 

Correct, those are the pointers.

> > Then we call bdrv_unref(drop_start (or bdrv_set_backing_hd() does),
> > and we end up with:
> > 

dropping an anchor here: [1]

> > bdrv_unref(A)
> >     bdrv_unref(B)
> >         bdrv_unref(top)
> >             bdrv_unref(A) <--- assert
> >                 .....
> >             
> > 
> > So I think we want this line:
> > 
> > > > +            bdrv_set_backing_hd(base, NULL);
> 
> so, this breaks the chain,

Yes, you are right, we want base->backing_hd to be NULL.  But the
chain has not been broken yet.

The loop [1] still exists, because once we enter bdrv_set_backing_hd()
we begin to call bdrv_unref(A). And base_ptr->backing_hd still points
to A, and B will point to base_ptr.

Here is the first part of bdrv_set_backing_hd():
    if (bs->backing_hd) {
        bdrv_op_unblock_all(bs->backing_hd, bs->backing_blocker);
        bdrv_unref(bs->backing_hd);


> 
> > 
> > To be:
> > 
> > > > +            bdrv_set_backing_hd(top, NULL);
> 
> This will lose track of original base's backing_hd.

Right, we don't want that, sorry...  I shouldn't have written that, my
brain failed me.  I mentally conflated top and [top].

> 
> So I think we are OK here.
>

I don't think we are, we still need to address the backing_hd loop,
and I think it needs to be done here, where we have the information.

> But I find that a fix is needed in bdrv_set_backing_hd to handle the rebase
> correctly.
>

What we really want, prior to starting to unref anything, is to set
the drop_start = base_ptr->backing_hd, and then set
base_ptr->backing_hd = NULL.  Then the bdrv_unref(drop_start) will
perform as expected (see [2], below).

And, at least in the usage here, we probably don't want
bdrv_set_backing_hd() to unref anything for us, but I'm sure there is
some way to make that work if it is cleaner that way.

That will get you what I was originally trying to get at in my
previous email, when I unfortunately conflated top contents with top
pointer:

> > 
> > 
> > Right?  Or, just set top->backing_hd = NULL, so we get:
                         ^^^ 
please read this as 'base' (as in base_ptr)
    

anchor [2]:

> > 
> >     -- [B] <-- [A]   |||-- ([top])    |||--- ([base]) <-- [active]
> >     |                         ^
> >     |                         |
> >     ---------------------------

base_ptr->backing_hd is set to NULL first ^^

then the bdrv_unref(drop_start):

> > 
> > bdrv_unref(A)
> >     bdrv_unref(B)
> >         bdrv_unref(top)
> > 
> > 
> > Which leaves:
> > 
> >     |||--- ([base]) <-- [active]
> > 
> > 
> > So this part above still needs addressing, I think.
> > 
> > > 
> > > Also, side effect:
> > > Caller needs to beware now that base and top are now swapped [1].
> > > 
> > > > +        } else {
> > > > +            bdrv_set_backing_hd(overlay, NULL);
> > > > +            drop_start = top;
> > > 
> > > Again, everything between top and the original base is orphaned, but
> > > should be cleaned up.
> > >
> > > Caller does not have to worry about base and top being swapped [1].
> > >
> > 
> > This should be fine, I think.
> > 
> > 
> > I think everything else I mentioned below this point is still
> > relevant, however.
> > 
> > > 
> > > >          }
> > > > -        intermediate = intermediate->backing_hd;
> > > > -    }
> > > > -    if (base_bs == NULL) {
> > > > -        /* something went wrong, we did not end at the base. safely
> > > > -         * unravel everything, and exit with error */
> > > > -        goto exit;
> > > > -    }
> > > > -
> > > > -    /* success - we can delete the intermediate states, and link top->base */
> > > > -    ret = bdrv_change_backing_file(new_top_bs, base_bs->filename,
> > > > -                                   base_bs->drv ? base_bs->drv->format_name : "");
> > > > -    if (ret) {
> > > > -        goto exit;
> > > > -    }
> > > > -    new_top_bs->backing_hd = base_bs;
> > > > -
> > > > -    bdrv_refresh_limits(new_top_bs);
> > > >  
> > > > -    QSIMPLEQ_FOREACH_SAFE(intermediate_state, &states_to_delete, entry, next) {
> > > > -        /* so that bdrv_close() does not recursively close the chain */
> > > > -        intermediate_state->bs->backing_hd = NULL;
> > > > -        bdrv_unref(intermediate_state->bs);
> > > > +        bdrv_unref(drop_start);
> > > 
> > > We will get an assertion here.  In the non-active case, the backing_hd
> > > is explicitly set to NULL via bdrv_set_backing_hd().  That function
> > > will call bdrv_unref() on the same BDS that drop_start was assigned,
> > > so we have a double call to bdrv_unref().
> > > 
> > > >      }
> > > > -    ret = 0;
> > > > -
> > > >  exit:
> > > > -    QSIMPLEQ_FOREACH_SAFE(intermediate_state, &states_to_delete, entry, next) {
> > > > -        g_free(intermediate_state);
> > > > -    }
> > > >      return ret;
> > > >  }
> > > >  
> > > > -
> > > >  static int bdrv_check_byte_request(BlockDriverState *bs, int64_t offset,
> > > >                                     size_t size)
> > > >  {
> > > > diff --git a/block/commit.c b/block/commit.c
> > > > index acec4ac..b10eb79 100644
> > > > --- a/block/commit.c
> > > > +++ b/block/commit.c
> > > > @@ -142,6 +142,7 @@ wait:
> > > >      if (!block_job_is_cancelled(&s->common) && sector_num == end) {
> > > >          /* success */
> > > >          ret = bdrv_drop_intermediate(active, top, base);
> > > > +        base = top;
> > > 
> > > This is where it is highlighted to me how odd it is to use the side
> > > effects of bdrv_swap() in bdrv_drop_intermediate() for the non-active
> > > layer case.
> > > 
> > > The function bdrv_drop_intermediate() is now actually pretty complex
> > > and tricky to use, with side effects that the caller needs to beware
> > > of, that change depending on the nature of the arguments passed.
> > > 
> > > [1] Side affects, depending on active, top, and base:
> > > 
> > >       active = top |  base = NULL  |   side effect
> > >     -----------------------------------------------
> > > (A) false          |     false     |  top and base are swapped
> > > (B) false          |     true      |  none
> > > (C) true           |     false     |  top and base are swapped
> > > (D) true           |     true      |  assert()
> > > 
> > > 
> > > Case (C) is reasonable, because active and base need to be swapped,
> > > and top == active.  It is expected almost by definition.
> > > 
> > > Case (A) is a bit odd, especially in light of case (B).
> 
> Makes sense, I will remove this side effect.
>
> > > 
> > > 
> > > >      }
> > > >  
> > > >  exit_free_buf:
> > > 
> > > 
> > > Further down, out of the context of this patch, we have:
> > > 
> > > 
> > >  exit_restore_reopen:
> > >      /* restore base open flags here if appropriate (e.g., change the base back
> > >       * to r/o). These reopens do not need to be atomic, since we won't abort
> > >       * even on failure here */
> > >      if (s->base_flags != bdrv_get_flags(base)) {
> > >          bdrv_reopen(base, s->base_flags, NULL);
> > >      }
> > > 
> > > OK, 'base' is the one we want to operate on now, that was set to
> > > 'top', which has the contents of the old 'base'.
> > > 
> 
> If I remove the swap, we don't need to set base to top here.
>

Yes, that will keep the usage more consistent, thanks.

> > > 
> > >      overlay_bs = bdrv_find_overlay(active, top);
> > > 
> > > Will we find the right overlay here?  I think now overlay_bs will
> > > always be NULL, so we won't restore the r/o flags (if set) for the
> > > overlay of the original 'top'.
> > > 
> > >      if (overlay_bs && s->orig_overlay_flags != bdrv_get_flags(overlay_bs)) {
> > >          bdrv_reopen(overlay_bs, s->orig_overlay_flags, NULL);
> > >      }
> 
> Yes, need a fix here.
> 
> Thanks,
> Fam

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH v14 06/14] block: Add backing_blocker in BlockDriverState
  2014-02-20  5:08       ` Jeff Cody
@ 2014-02-20  8:28         ` Fam Zheng
  2014-02-20 11:59           ` Jeff Cody
  0 siblings, 1 reply; 33+ messages in thread
From: Fam Zheng @ 2014-02-20  8:28 UTC (permalink / raw)
  To: Jeff Cody
  Cc: kwolf, benoit.canet, rjones, qemu-devel, armbru, imain, stefanha,
	pbonzini

On Thu, 02/20 00:08, Jeff Cody wrote:
> On Thu, Feb 20, 2014 at 01:01:38PM +0800, Fam Zheng wrote:
> > On Wed, 02/19 16:17, Jeff Cody wrote:
> > > On Wed, Feb 19, 2014 at 09:42:23PM +0800, Fam Zheng wrote:
> > > > This makes use of op_blocker and blocks all the operations except for
> > > > commit target, on each BlockDriverState->backing_hd.
> > > > 
> > > > The asserts for op_blocker in bdrv_swap are removed because with this
> > > > change, the target of block commit has at least the backing blocker of
> > > > its child, so the assertion is not true. Callers should do their check.
> > > > 
> > > > Signed-off-by: Fam Zheng <famz@redhat.com>
> > > > ---
> > > >  block.c                   | 19 +++++++++++++++----
> > > >  include/block/block_int.h |  3 +++
> > > >  2 files changed, 18 insertions(+), 4 deletions(-)
> > > > 
> > > > diff --git a/block.c b/block.c
> > > > index dec44d4..95d8c1f 100644
> > > > --- a/block.c
> > > > +++ b/block.c
> > > > @@ -1044,19 +1044,33 @@ fail:
> > > >  void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd)
> > > >  {
> > > >      if (bs->backing_hd) {
> > > > +        assert(error_is_set(&bs->backing_blocker));
> > > 
> > > When I run block-commit, on either the active or non-active layer, I
> > > get an assertion here.  The qemu-iotests do not catch it, and I
> > > presume it is because happens a couple of seconds or so after the
> > > success message is returned over QMP.
> > > 
> > 
> > I can't reproduce this, could you give some specific steps? Thanks.
> >
> 
> Sure - I am guessing the key is performing some live block snapshots
> first.  Here is what I did (this is from memory, but I think the steps
> are right):
> 
> Nothing special really about the cmdline:
> qemu-system-x86_64 -drive file=/home/jtc/test.qcow2,if=virtio -qmp stdio ...
> 
> The QMP commands:
> 
> For the non-active layer case:
> 
> { "execute": "qmp_capabilities" }
> { "execute": "blockdev-snapshot-sync", "arguments": { "device": "virtio0","snapshot-file":"/tmp/snap1.qcow2","format": "qcow2" } }
> { "execute": "blockdev-snapshot-sync", "arguments": { "device": "virtio0","snapshot-file":"/tmp/snap2.qcow2","format": "qcow2" } }
> { "execute": "block-commit", "arguments": { "device": "virtio0", "top": "/tmp/snap1.qcow2" } }
> 
> 
> For the active layer case (I think I still had 2 snapshots here, not
> entirely positive):
> 
> { "execute": "qmp_capabilities" }
> { "execute": "blockdev-snapshot-sync", "arguments": { "device": "virtio0","snapshot-file":"/tmp/snap1.qcow2","format": "qcow2" } }
> { "execute": "blockdev-snapshot-sync", "arguments": { "device": "virtio0","snapshot-file":"/tmp/snap2.qcow2","format": "qcow2" } }
> { "execute": "block-commit", "arguments": { "device": "virtio0", "top": "/tmp/snap2.qcow2" } }
> { "execute": "block-job-complete", "arguments": { "device": "virtio0" }}
> 

Yes. I forgot to use bdrv_set_backing_hd in bdrv_append.

Could you try if the below patch fixes it? Thanks.

Fam

---

diff --git a/block.c b/block.c
index 1af43b9..66a8e35 100644
--- a/block.c
+++ b/block.c
@@ -1978,7 +1978,7 @@ void bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top)

     /* The contents of 'tmp' will become bs_top, as we are
      * swapping bs_new and bs_top contents. */
-    bs_top->backing_hd = bs_new;
+    bdrv_set_backing_hd(bs_top, bs_new);
     bs_top->open_flags &= ~BDRV_O_NO_BACKING;
     pstrcpy(bs_top->backing_file, sizeof(bs_top->backing_file),
             bs_new->filename);

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH v14 08/14] block: Support dropping active in bdrv_drop_intermediate
  2014-02-20  5:57         ` Jeff Cody
@ 2014-02-20  8:34           ` Fam Zheng
  0 siblings, 0 replies; 33+ messages in thread
From: Fam Zheng @ 2014-02-20  8:34 UTC (permalink / raw)
  To: Jeff Cody
  Cc: kwolf, benoit.canet, rjones, qemu-devel, armbru, imain, stefanha,
	pbonzini

On Thu, 02/20 00:57, Jeff Cody wrote:
> On Thu, Feb 20, 2014 at 12:37:17PM +0800, Fam Zheng wrote:
> > On Wed, 02/19 18:24, Jeff Cody wrote:
> > > On Wed, Feb 19, 2014 at 04:22:30PM -0500, Jeff Cody wrote:
> > > > On Wed, Feb 19, 2014 at 09:42:25PM +0800, Fam Zheng wrote:
> > > > >  /*
> > > > > - * Drops images above 'base' up to and including 'top', and sets the image
> > > > > - * above 'top' to have base as its backing file.
> > > > > + * Drops images above 'base' up to and including 'top', and sets new 'base'
> > > > > + * as backing_hd of top_overlay (the image orignally has 'top' as backing
> > > > 
> > > > What is 'top_overlay'?  Do you mean "top's overlay" by this?
> > 
> > Yes, as noted in the parenthesis.
> >
> 
> I would just say "top's overlay".  What I found confusing by that, is
> when you reference something like 'top_overlay', it looks like an
> actual variable name.  So I was searching for that variable name, and
> wondered if it was just vestigial from an earlier revision.  Maybe
> that is just me, though :)
> 

I will update the wording for less confusion. Sorry about that.

> > > > And in the non-active case here, everything between top->backing_hd
> > > > and the original base is orphaned as well.  These should all be
> > > > explicitly unreferenced.
> > > 
> > > Same here, bdrv_unref() will eventually go through the chain, starting
> > > from top->backing_hd.  But this is a problem; won't we end up in a
> > > loop then?
> > 
> > Although the content is swapped, the pointer is not:
> > 
> > (I presume your "[base]" and "[top]" are denoting content, not pointer)
> >
> 
> Correct.  But part of the content that is swapped, are the backing_hd
> pointers.
> 
> > > 
> > > Take this chain:
> > > 
> > > drop_start = [A]
> > > 
> > >     |||-- ([base]) <-- [B] <--- [A] <--- ([top]) <--- [active]
> >                ^                              ^
> >                |                              |
> >               base                           top
> > > 
> > > 
> > > bdrv_swap(top, base):
> > > 
> > >     -- [B] <-- [A] <-- ([top])    |||--- ([base]) <-- [active]
> >                             ^                 ^
> >                             |                 |
> >                            base               top
> > >     |                    ^
> > >     |                    |
> > >     ---------------------
> > > 
> 
> Correct, those are the pointers.
> 
> > > Then we call bdrv_unref(drop_start (or bdrv_set_backing_hd() does),
> > > and we end up with:
> > > 
> 
> dropping an anchor here: [1]
> 
> > > bdrv_unref(A)
> > >     bdrv_unref(B)
> > >         bdrv_unref(top)
> > >             bdrv_unref(A) <--- assert
> > >                 .....
> > >             
> > > 
> > > So I think we want this line:
> > > 
> > > > > +            bdrv_set_backing_hd(base, NULL);
> > 
> > so, this breaks the chain,
> 
> Yes, you are right, we want base->backing_hd to be NULL.  But the
> chain has not been broken yet.
> 
> The loop [1] still exists, because once we enter bdrv_set_backing_hd()
> we begin to call bdrv_unref(A). And base_ptr->backing_hd still points
> to A, and B will point to base_ptr.

Yes, that need to be fixed.

> 
> Here is the first part of bdrv_set_backing_hd():
>     if (bs->backing_hd) {
>         bdrv_op_unblock_all(bs->backing_hd, bs->backing_blocker);
>         bdrv_unref(bs->backing_hd);
> 
> 
> > 
> > > 
> > > To be:
> > > 
> > > > > +            bdrv_set_backing_hd(top, NULL);
> > 
> > This will lose track of original base's backing_hd.
> 
> Right, we don't want that, sorry...  I shouldn't have written that, my
> brain failed me.  I mentally conflated top and [top].
> 
> > 
> > So I think we are OK here.
> >
> 
> I don't think we are, we still need to address the backing_hd loop,
> and I think it needs to be done here, where we have the information.

Again, you are right :)

Thanks,
Fam

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH v14 06/14] block: Add backing_blocker in BlockDriverState
  2014-02-20  8:28         ` Fam Zheng
@ 2014-02-20 11:59           ` Jeff Cody
  0 siblings, 0 replies; 33+ messages in thread
From: Jeff Cody @ 2014-02-20 11:59 UTC (permalink / raw)
  To: Fam Zheng
  Cc: kwolf, benoit.canet, rjones, qemu-devel, armbru, imain, stefanha,
	pbonzini

On Thu, Feb 20, 2014 at 04:28:56PM +0800, Fam Zheng wrote:
> On Thu, 02/20 00:08, Jeff Cody wrote:
> > On Thu, Feb 20, 2014 at 01:01:38PM +0800, Fam Zheng wrote:
> > > On Wed, 02/19 16:17, Jeff Cody wrote:
> > > > On Wed, Feb 19, 2014 at 09:42:23PM +0800, Fam Zheng wrote:
> > > > > This makes use of op_blocker and blocks all the operations except for
> > > > > commit target, on each BlockDriverState->backing_hd.
> > > > > 
> > > > > The asserts for op_blocker in bdrv_swap are removed because with this
> > > > > change, the target of block commit has at least the backing blocker of
> > > > > its child, so the assertion is not true. Callers should do their check.
> > > > > 
> > > > > Signed-off-by: Fam Zheng <famz@redhat.com>
> > > > > ---
> > > > >  block.c                   | 19 +++++++++++++++----
> > > > >  include/block/block_int.h |  3 +++
> > > > >  2 files changed, 18 insertions(+), 4 deletions(-)
> > > > > 
> > > > > diff --git a/block.c b/block.c
> > > > > index dec44d4..95d8c1f 100644
> > > > > --- a/block.c
> > > > > +++ b/block.c
> > > > > @@ -1044,19 +1044,33 @@ fail:
> > > > >  void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd)
> > > > >  {
> > > > >      if (bs->backing_hd) {
> > > > > +        assert(error_is_set(&bs->backing_blocker));
> > > > 
> > > > When I run block-commit, on either the active or non-active layer, I
> > > > get an assertion here.  The qemu-iotests do not catch it, and I
> > > > presume it is because happens a couple of seconds or so after the
> > > > success message is returned over QMP.
> > > > 
> > > 
> > > I can't reproduce this, could you give some specific steps? Thanks.
> > >
> > 
> > Sure - I am guessing the key is performing some live block snapshots
> > first.  Here is what I did (this is from memory, but I think the steps
> > are right):
> > 
> > Nothing special really about the cmdline:
> > qemu-system-x86_64 -drive file=/home/jtc/test.qcow2,if=virtio -qmp stdio ...
> > 
> > The QMP commands:
> > 
> > For the non-active layer case:
> > 
> > { "execute": "qmp_capabilities" }
> > { "execute": "blockdev-snapshot-sync", "arguments": { "device": "virtio0","snapshot-file":"/tmp/snap1.qcow2","format": "qcow2" } }
> > { "execute": "blockdev-snapshot-sync", "arguments": { "device": "virtio0","snapshot-file":"/tmp/snap2.qcow2","format": "qcow2" } }
> > { "execute": "block-commit", "arguments": { "device": "virtio0", "top": "/tmp/snap1.qcow2" } }
> > 
> > 
> > For the active layer case (I think I still had 2 snapshots here, not
> > entirely positive):
> > 
> > { "execute": "qmp_capabilities" }
> > { "execute": "blockdev-snapshot-sync", "arguments": { "device": "virtio0","snapshot-file":"/tmp/snap1.qcow2","format": "qcow2" } }
> > { "execute": "blockdev-snapshot-sync", "arguments": { "device": "virtio0","snapshot-file":"/tmp/snap2.qcow2","format": "qcow2" } }
> > { "execute": "block-commit", "arguments": { "device": "virtio0", "top": "/tmp/snap2.qcow2" } }
> > { "execute": "block-job-complete", "arguments": { "device": "virtio0" }}
> > 
> 
> Yes. I forgot to use bdrv_set_backing_hd in bdrv_append.
> 
> Could you try if the below patch fixes it? Thanks.
> 

Yep, this seems to fix it.

> Fam
> 
> ---
> 
> diff --git a/block.c b/block.c
> index 1af43b9..66a8e35 100644
> --- a/block.c
> +++ b/block.c
> @@ -1978,7 +1978,7 @@ void bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top)
> 
>      /* The contents of 'tmp' will become bs_top, as we are
>       * swapping bs_new and bs_top contents. */
> -    bs_top->backing_hd = bs_new;
> +    bdrv_set_backing_hd(bs_top, bs_new);
>      bs_top->open_flags &= ~BDRV_O_NO_BACKING;
>      pstrcpy(bs_top->backing_file, sizeof(bs_top->backing_file),
>              bs_new->filename);

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2014-02-20 12:00 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-02-19 13:42 [Qemu-devel] [PATCH v14 00/14] Drop in_use from BlockDriverState and enable point-in-time snapshot exporting over NBD Fam Zheng
2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 01/14] block: Add BlockOpType enum Fam Zheng
2014-02-19 15:25   ` Benoît Canet
2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 02/14] block: Introduce op_blockers to BlockDriverState Fam Zheng
2014-02-19 15:26   ` Benoît Canet
2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 03/14] block: Replace in_use with operation blocker Fam Zheng
2014-02-19 15:26   ` Benoît Canet
2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 04/14] block: Move op_blocker check from block_job_create to its caller Fam Zheng
2014-02-19 15:28   ` Benoît Canet
2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 05/14] block: Add bdrv_set_backing_hd() Fam Zheng
2014-02-19 15:27   ` Benoît Canet
2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 06/14] block: Add backing_blocker in BlockDriverState Fam Zheng
2014-02-19 15:32   ` Benoît Canet
2014-02-19 21:17   ` Jeff Cody
2014-02-20  5:01     ` Fam Zheng
2014-02-20  5:08       ` Jeff Cody
2014-02-20  8:28         ` Fam Zheng
2014-02-20 11:59           ` Jeff Cody
2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 07/14] block: Parse "backing" option to reference existing BDS Fam Zheng
2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 08/14] block: Support dropping active in bdrv_drop_intermediate Fam Zheng
2014-02-19 15:34   ` Benoît Canet
2014-02-19 21:22   ` Jeff Cody
2014-02-19 23:24     ` Jeff Cody
2014-02-20  4:37       ` Fam Zheng
2014-02-20  5:57         ` Jeff Cody
2014-02-20  8:34           ` Fam Zheng
2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 09/14] stream: Use bdrv_drop_intermediate and drop close_unused_images Fam Zheng
2014-02-19 21:23   ` Jeff Cody
2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 10/14] qmp: Add command 'blockdev-backup' Fam Zheng
2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 11/14] block: Allow backup on referenced named BlockDriverState Fam Zheng
2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 12/14] block: Add blockdev-backup to transaction Fam Zheng
2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 13/14] qemu-iotests: Test blockdev-backup in 055 Fam Zheng
2014-02-19 13:42 ` [Qemu-devel] [PATCH v14 14/14] qemu-iotests: Image fleecing test case 081 Fam Zheng

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.