All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 0/7] Point-in-time snapshot exporting over NBD
@ 2013-07-02  5:59 Fam Zheng
  2013-07-02  5:59 ` [Qemu-devel] [PATCH 1/7] block: Convert BlockDriverState.in_use to refcount Fam Zheng
                   ` (6 more replies)
  0 siblings, 7 replies; 21+ messages in thread
From: Fam Zheng @ 2013-07-02  5:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, famz, obarenbo, roliveri, hbrock, rjones, armbru, pmyers,
	imain, stefanha, pbonzini

This series adds for point-in-time snapshot NBD exporting based on
drive-backup. The ideas is described below and patches followed (the missing
part is item 3, which work is in progress by Ian Main who will have another
patch on it). As the work does not overlap, these series should be quite
reviewable by itself.

Background
==========

The goal of image fleecing is to provide a interface to inspect a point-in-time
snapshot of guest image data, not being interfered with guest overwrites after
it's created.  With drive-backup we already have the point-in-time snapshot
image (the target image), we only need three modifications to realize this:

 1. Give backup target an id, so we can add it to NBD server.

 2. Assign source device as backing of target, so reading the unallocated will
    be passed to source.
    As there's copy-on-write mechanism with drive-backup job, all the modified
    data after snapshot created is copied to target, the unallocated data is
    guaranteed to be unchanged, so reading from the source is correct. Note
    that this requires target format supports backing file.

 3. Adding sync mode 'none' to drive-backup, so the block job only copy changed
    data from source, which has minimal IO overhead.

Usage
=====

With above three, we can simply export a point-in-time snapshot with two QMP commands:

    drive-backup device=virtio0 format=qcow2 target=point-in-time.qcow2 target-id=pit0 sync=none
    (sync=none is not implemented for now but we can simulate with speed=1)

    nbd-server-add device=pit0 writable=no

Lifecycle
=========

Reference count for BlockDriverState is implemented to manage their lifecycles.
Device attachment, block job, backing hd, bs->file NBD and others are current
referred points of a BDS, they each call bdrv_get_ref when starting to use a
BDS and bdrv_put_ref when releasing, e.g. on hot plug/unplug, nbd_server_add,
drive-backup, etc.

I.e., when a drive-backup target bs is being exported through NBD and the backup
job finishes (or cancelled), the device is not deleted, the client can continue
to access the NBD target until nbd_server_stop. It is automatically deleted
when the last reference is released.

The target image file is not removed automatically, since we can't assume that
it's temporary. It's up to the user to remove it manually, or we need to add a
command or option to mark the target bs temporary.

Fam Zheng (7):
  block: Convert BlockDriverState.in_use to refcount
  block: use refcount to manage BlockDriverState lifecycle
  nbd: use BDS refcount
  block: simplify bdrv_drop_intermediate
  block: rename bdrv_in_use to bdrv_is_shared
  block: add target-id option to drive-backup QMP command
  block: assign backing relationship in drive-backup

 block-migration.c               |   5 +-
 block.c                         | 123 +++++++++++++++-------------------------
 block/backup.c                  |  16 +++++-
 block/blkdebug.c                |   1 +
 block/blkverify.c               |   2 +
 block/mirror.c                  |   4 +-
 block/snapshot.c                |   3 +-
 block/stream.c                  |   2 +-
 block/vvfat.c                   |   4 +-
 blockdev-nbd.c                  |   9 +--
 blockdev.c                      |  19 ++++---
 blockjob.c                      |   8 +--
 hw/block/dataplane/virtio-blk.c |   4 +-
 hw/block/xen_disk.c             |   7 +--
 include/block/block.h           |   5 +-
 include/block/block_int.h       |  18 +++++-
 nbd.c                           |   5 ++
 qapi-schema.json                |   7 ++-
 qmp-commands.hx                 |   3 +-
 19 files changed, 126 insertions(+), 119 deletions(-)

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH 1/7] block: Convert BlockDriverState.in_use to refcount
  2013-07-02  5:59 [Qemu-devel] [PATCH 0/7] Point-in-time snapshot exporting over NBD Fam Zheng
@ 2013-07-02  5:59 ` Fam Zheng
  2013-07-02 10:21   ` Paolo Bonzini
  2013-07-02 19:41   ` Eric Blake
  2013-07-02  5:59 ` [Qemu-devel] [PATCH 2/7] block: use refcount to manage BlockDriverState lifecycle Fam Zheng
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 21+ messages in thread
From: Fam Zheng @ 2013-07-02  5:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, famz, obarenbo, roliveri, hbrock, rjones, armbru, pmyers,
	imain, stefanha, pbonzini

Use numeric value to replace in_use flag in BDS, this will make
lifecycle management with ref count possible. This patch only replaces
existing uses of bdrv_set_in_use, so no logic change here.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block-migration.c               |  4 ++--
 block.c                         | 23 +++++++++++++++--------
 blockjob.c                      |  6 +++---
 hw/block/dataplane/virtio-blk.c |  4 ++--
 include/block/block.h           |  3 ++-
 include/block/block_int.h       |  2 +-
 6 files changed, 25 insertions(+), 17 deletions(-)

diff --git a/block-migration.c b/block-migration.c
index 2fd7699..2efb6c0 100644
--- a/block-migration.c
+++ b/block-migration.c
@@ -321,7 +321,7 @@ static void init_blk_migration_it(void *opaque, BlockDriverState *bs)
         bmds->shared_base = block_mig_state.shared_base;
         alloc_aio_bitmap(bmds);
         drive_get_ref(drive_get_by_blockdev(bs));
-        bdrv_set_in_use(bs, 1);
+        bdrv_get_ref(bs);
 
         block_mig_state.total_sector_sum += sectors;
 
@@ -557,7 +557,7 @@ static void blk_mig_cleanup(void)
     blk_mig_lock();
     while ((bmds = QSIMPLEQ_FIRST(&block_mig_state.bmds_list)) != NULL) {
         QSIMPLEQ_REMOVE_HEAD(&block_mig_state.bmds_list, entry);
-        bdrv_set_in_use(bmds->bs, 0);
+        bdrv_get_ref(bmds->bs);
         drive_put_ref(drive_get_by_blockdev(bmds->bs));
         g_free(bmds->aio_bitmap);
         g_free(bmds);
diff --git a/block.c b/block.c
index 6c493ad..84c3181 100644
--- a/block.c
+++ b/block.c
@@ -1503,8 +1503,10 @@ static void bdrv_move_feature_fields(BlockDriverState *bs_dest,
     /* dirty bitmap */
     bs_dest->dirty_bitmap       = bs_src->dirty_bitmap;
 
+    /* ref count */
+    bs_dest->refcount           = bs_src->refcount;
+
     /* job */
-    bs_dest->in_use             = bs_src->in_use;
     bs_dest->job                = bs_src->job;
 
     /* keep the same entry in bdrv_states */
@@ -1534,7 +1536,7 @@ void bdrv_swap(BlockDriverState *bs_new, BlockDriverState *bs_old)
     assert(bs_new->dirty_bitmap == NULL);
     assert(bs_new->job == NULL);
     assert(bs_new->dev == NULL);
-    assert(bs_new->in_use == 0);
+    assert(bs_new->refcount == 0);
     assert(bs_new->io_limits_enabled == false);
     assert(bs_new->block_timer == NULL);
 
@@ -1553,7 +1555,7 @@ void bdrv_swap(BlockDriverState *bs_new, BlockDriverState *bs_old)
     /* Check a few fields that should remain attached to the device */
     assert(bs_new->dev == NULL);
     assert(bs_new->job == NULL);
-    assert(bs_new->in_use == 0);
+    assert(bs_new->refcount == 0);
     assert(bs_new->io_limits_enabled == false);
     assert(bs_new->block_timer == NULL);
 
@@ -1590,7 +1592,7 @@ void bdrv_delete(BlockDriverState *bs)
 {
     assert(!bs->dev);
     assert(!bs->job);
-    assert(!bs->in_use);
+    assert(!bs->refcount);
 
     /* remove from list, if necessary */
     bdrv_make_anon(bs);
@@ -4360,15 +4362,20 @@ int64_t bdrv_get_dirty_count(BlockDriverState *bs)
     }
 }
 
-void bdrv_set_in_use(BlockDriverState *bs, int in_use)
+void bdrv_put_ref(BlockDriverState *bs)
+{
+    assert(bs->refcount > 0);
+    bs->refcount--;
+}
+
+void bdrv_get_ref(BlockDriverState *bs)
 {
-    assert(bs->in_use != in_use);
-    bs->in_use = in_use;
+    bs->refcount++;
 }
 
 int bdrv_in_use(BlockDriverState *bs)
 {
-    return bs->in_use;
+    return bs->refcount > 0;
 }
 
 void bdrv_iostatus_enable(BlockDriverState *bs)
diff --git a/blockjob.c b/blockjob.c
index ca80df1..a841a66 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -45,7 +45,7 @@ void *block_job_create(const BlockJobType *job_type, BlockDriverState *bs,
         error_set(errp, QERR_DEVICE_IN_USE, bdrv_get_device_name(bs));
         return NULL;
     }
-    bdrv_set_in_use(bs, 1);
+    bdrv_get_ref(bs);
 
     job = g_malloc0(job_type->instance_size);
     job->job_type      = job_type;
@@ -63,7 +63,7 @@ void *block_job_create(const BlockJobType *job_type, BlockDriverState *bs,
         if (error_is_set(&local_err)) {
             bs->job = NULL;
             g_free(job);
-            bdrv_set_in_use(bs, 0);
+            bdrv_put_ref(bs);
             error_propagate(errp, local_err);
             return NULL;
         }
@@ -79,7 +79,7 @@ void block_job_completed(BlockJob *job, int ret)
     job->cb(job->opaque, ret);
     bs->job = NULL;
     g_free(job);
-    bdrv_set_in_use(bs, 0);
+    bdrv_put_ref(bs);
 }
 
 void block_job_set_speed(BlockJob *job, int64_t speed, Error **errp)
diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 0356665..9893a11 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -431,7 +431,7 @@ bool virtio_blk_data_plane_create(VirtIODevice *vdev, VirtIOBlkConf *blk,
     s->blk = blk;
 
     /* Prevent block operations that conflict with data plane thread */
-    bdrv_set_in_use(blk->conf.bs, 1);
+    bdrv_get_ref(blk->conf.bs);
 
     error_setg(&s->migration_blocker,
             "x-data-plane does not support migration");
@@ -450,7 +450,7 @@ void virtio_blk_data_plane_destroy(VirtIOBlockDataPlane *s)
     virtio_blk_data_plane_stop(s);
     migrate_del_blocker(s->migration_blocker);
     error_free(s->migration_blocker);
-    bdrv_set_in_use(s->blk->conf.bs, 0);
+    bdrv_put_ref(s->blk->conf.bs);
     g_free(s);
 }
 
diff --git a/include/block/block.h b/include/block/block.h
index dd8eca1..77f0f0d 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -353,7 +353,8 @@ int64_t bdrv_get_dirty_count(BlockDriverState *bs);
 void bdrv_enable_copy_on_read(BlockDriverState *bs);
 void bdrv_disable_copy_on_read(BlockDriverState *bs);
 
-void bdrv_set_in_use(BlockDriverState *bs, int in_use);
+void bdrv_put_ref(BlockDriverState *bs);
+void bdrv_get_ref(BlockDriverState *bs);
 int bdrv_in_use(BlockDriverState *bs);
 
 #ifdef CONFIG_LINUX_AIO
diff --git a/include/block/block_int.h b/include/block/block_int.h
index c6ac871..1ea80e7 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -294,7 +294,7 @@ struct BlockDriverState {
     BlockDeviceIoStatus iostatus;
     char device_name[32];
     HBitmap *dirty_bitmap;
-    int in_use; /* users other than guest access, eg. block migration */
+    int refcount;
     QTAILQ_ENTRY(BlockDriverState) list;
 
     QLIST_HEAD(, BdrvTrackedRequest) tracked_requests;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH 2/7] block: use refcount to manage BlockDriverState lifecycle
  2013-07-02  5:59 [Qemu-devel] [PATCH 0/7] Point-in-time snapshot exporting over NBD Fam Zheng
  2013-07-02  5:59 ` [Qemu-devel] [PATCH 1/7] block: Convert BlockDriverState.in_use to refcount Fam Zheng
@ 2013-07-02  5:59 ` Fam Zheng
  2013-07-02  5:59 ` [Qemu-devel] [PATCH 3/7] nbd: use BDS refcount Fam Zheng
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 21+ messages in thread
From: Fam Zheng @ 2013-07-02  5:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, famz, obarenbo, roliveri, hbrock, rjones, armbru, pmyers,
	imain, stefanha, pbonzini

Device attach, NBD server and block job can share a BlockDriverState,
this patch makes them to use refcount to grab/release a bs so they don't
interfere each other with BDS lifecycle.

Local BDS allocation/releasing don't need to use refcount to manage it,
just use bdrv_new() and bdrv_delete().

If BDS is possibly shared with other code (e.g. NBD and guest device),
use bdrv_new to allocate, then bdrv_get_ref to grab it, when job is
done, call bdrv_put_ref() to release it. Don't call bdrv_delete(), since
it's deleted automatically by the last bdrv_put_ref().

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block-migration.c         |  1 -
 block.c                   | 31 ++++++++++++++++++++-----------
 block/backup.c            |  3 ++-
 block/blkdebug.c          |  1 +
 block/blkverify.c         |  2 ++
 block/mirror.c            |  4 ++--
 block/snapshot.c          |  3 ++-
 block/stream.c            |  2 +-
 block/vvfat.c             |  4 +++-
 blockdev.c                |  5 +++--
 hw/block/xen_disk.c       |  7 +------
 include/block/block_int.h | 16 ++++++++++++++++
 12 files changed, 53 insertions(+), 26 deletions(-)

diff --git a/block-migration.c b/block-migration.c
index 2efb6c0..2154b3a 100644
--- a/block-migration.c
+++ b/block-migration.c
@@ -558,7 +558,6 @@ static void blk_mig_cleanup(void)
     while ((bmds = QSIMPLEQ_FIRST(&block_mig_state.bmds_list)) != NULL) {
         QSIMPLEQ_REMOVE_HEAD(&block_mig_state.bmds_list, entry);
         bdrv_get_ref(bmds->bs);
-        drive_put_ref(drive_get_by_blockdev(bmds->bs));
         g_free(bmds->aio_bitmap);
         g_free(bmds);
     }
diff --git a/block.c b/block.c
index 84c3181..d1ce570 100644
--- a/block.c
+++ b/block.c
@@ -740,7 +740,6 @@ static int bdrv_open_common(BlockDriverState *bs, BlockDriverState *file,
             ret = -EINVAL;
             goto free_and_fail;
         }
-        assert(file != NULL);
         bs->file = file;
         ret = drv->bdrv_open(bs, options, open_flags);
     }
@@ -748,7 +747,6 @@ static int bdrv_open_common(BlockDriverState *bs, BlockDriverState *file,
     if (ret < 0) {
         goto free_and_fail;
     }
-
     ret = refresh_total_sectors(bs, bs->total_sectors);
     if (ret < 0) {
         goto free_and_fail;
@@ -925,6 +923,7 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict *options)
         bs->open_flags |= BDRV_O_NO_BACKING;
         return ret;
     }
+    bdrv_get_ref(bs->backing_hd);
     return 0;
 }
 
@@ -1068,9 +1067,11 @@ int bdrv_open(BlockDriverState *bs, const char *filename, QDict *options,
 
     if (bs->file != file) {
         bdrv_delete(file);
-        file = NULL;
     }
 
+    file = NULL;
+    bdrv_get_ref(bs->file);
+
     /* If there is a backing file, use it */
     if ((flags & BDRV_O_NO_BACKING) == 0) {
         QDict *backing_options;
@@ -1367,7 +1368,7 @@ void bdrv_close(BlockDriverState *bs)
 
     if (bs->drv) {
         if (bs->backing_hd) {
-            bdrv_delete(bs->backing_hd);
+            bdrv_put_ref(bs->backing_hd);
             bs->backing_hd = NULL;
         }
         bs->drv->bdrv_close(bs);
@@ -1391,7 +1392,7 @@ void bdrv_close(BlockDriverState *bs)
         bs->options = NULL;
 
         if (bs->file != NULL) {
-            bdrv_delete(bs->file);
+            bdrv_put_ref(bs->file);
             bs->file = NULL;
         }
     }
@@ -1536,7 +1537,7 @@ void bdrv_swap(BlockDriverState *bs_new, BlockDriverState *bs_old)
     assert(bs_new->dirty_bitmap == NULL);
     assert(bs_new->job == NULL);
     assert(bs_new->dev == NULL);
-    assert(bs_new->refcount == 0);
+    assert(bs_new->refcount <= 1);
     assert(bs_new->io_limits_enabled == false);
     assert(bs_new->block_timer == NULL);
 
@@ -1555,7 +1556,7 @@ void bdrv_swap(BlockDriverState *bs_new, BlockDriverState *bs_old)
     /* Check a few fields that should remain attached to the device */
     assert(bs_new->dev == NULL);
     assert(bs_new->job == NULL);
-    assert(bs_new->refcount == 0);
+    assert(bs_new->refcount <= 1);
     assert(bs_new->io_limits_enabled == false);
     assert(bs_new->block_timer == NULL);
 
@@ -1609,6 +1610,7 @@ int bdrv_attach_dev(BlockDriverState *bs, void *dev)
         return -EBUSY;
     }
     bs->dev = dev;
+    bdrv_get_ref(bs);
     bdrv_iostatus_reset(bs);
     return 0;
 }
@@ -1626,6 +1628,7 @@ void bdrv_detach_dev(BlockDriverState *bs, void *dev)
 {
     assert(bs->dev == dev);
     bs->dev = NULL;
+    bdrv_put_ref(bs);
     bs->dev_ops = NULL;
     bs->dev_opaque = NULL;
     bs->buffer_alignment = 512;
@@ -2102,13 +2105,16 @@ int bdrv_drop_intermediate(BlockDriverState *active, BlockDriverState *top,
     if (ret) {
         goto exit;
     }
+    if (new_top_bs->backing_hd) {
+        bdrv_put_ref(new_top_bs->backing_hd);
+    }
     new_top_bs->backing_hd = base_bs;
-
+    bdrv_get_ref(base_bs);
 
     QSIMPLEQ_FOREACH_SAFE(intermediate_state, &states_to_delete, entry, next) {
         /* so that bdrv_close() does not recursively close the chain */
         intermediate_state->bs->backing_hd = NULL;
-        bdrv_delete(intermediate_state->bs);
+        bdrv_put_ref(intermediate_state->bs);
     }
     ret = 0;
 
@@ -4365,7 +4371,10 @@ int64_t bdrv_get_dirty_count(BlockDriverState *bs)
 void bdrv_put_ref(BlockDriverState *bs)
 {
     assert(bs->refcount > 0);
-    bs->refcount--;
+    if (--bs->refcount == 0) {
+        bdrv_close(bs);
+        bdrv_delete(bs);
+    }
 }
 
 void bdrv_get_ref(BlockDriverState *bs)
@@ -4375,7 +4384,7 @@ void bdrv_get_ref(BlockDriverState *bs)
 
 int bdrv_in_use(BlockDriverState *bs)
 {
-    return bs->refcount > 0;
+    return bs->refcount > 1;
 }
 
 void bdrv_iostatus_enable(BlockDriverState *bs)
diff --git a/block/backup.c b/block/backup.c
index 16105d4..4e9f927 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -294,7 +294,7 @@ static void coroutine_fn backup_run(void *opaque)
     hbitmap_free(job->bitmap);
 
     bdrv_iostatus_disable(target);
-    bdrv_delete(target);
+    bdrv_put_ref(target);
 
     block_job_completed(&job->common, ret);
 }
@@ -332,6 +332,7 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
         return;
     }
 
+    bdrv_get_ref(target);
     job->on_source_error = on_source_error;
     job->on_target_error = on_target_error;
     job->target = target;
diff --git a/block/blkdebug.c b/block/blkdebug.c
index ccb627a..83c4018 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -389,6 +389,7 @@ static int blkdebug_open(BlockDriverState *bs, QDict *options, int flags)
     if (ret < 0) {
         goto fail;
     }
+    bdrv_get_ref(bs->file);
 
     ret = 0;
 fail:
diff --git a/block/blkverify.c b/block/blkverify.c
index 1d58cc3..fd63f2b 100644
--- a/block/blkverify.c
+++ b/block/blkverify.c
@@ -145,6 +145,8 @@ static int blkverify_open(BlockDriverState *bs, QDict *options, int flags)
         goto fail;
     }
 
+    bdrv_get_ref(bs->file);
+
     /* Open the test file */
     filename = qemu_opt_get(opts, "x-image");
     if (filename == NULL) {
diff --git a/block/mirror.c b/block/mirror.c
index bed4a7e..ed98313 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -300,6 +300,7 @@ static void coroutine_fn mirror_run(void *opaque)
     int ret = 0;
     int n;
 
+    bdrv_get_ref(s->target);
     if (block_job_is_cancelled(&s->common)) {
         goto immediate_exit;
     }
@@ -479,8 +480,7 @@ immediate_exit:
         }
         bdrv_swap(s->target, s->common.bs);
     }
-    bdrv_close(s->target);
-    bdrv_delete(s->target);
+    bdrv_put_ref(s->target);
     block_job_completed(&s->common, ret);
 }
 
diff --git a/block/snapshot.c b/block/snapshot.c
index 6c6d9de..2d864d4 100644
--- a/block/snapshot.c
+++ b/block/snapshot.c
@@ -99,7 +99,8 @@ int bdrv_snapshot_goto(BlockDriverState *bs,
         ret = bdrv_snapshot_goto(bs->file, snapshot_id);
         open_ret = drv->bdrv_open(bs, NULL, bs->open_flags);
         if (open_ret < 0) {
-            bdrv_delete(bs->file);
+            bdrv_put_ref(bs->file);
+            bs->file = NULL;
             bs->drv = NULL;
             return open_ret;
         }
diff --git a/block/stream.c b/block/stream.c
index 7fe9e48..e611f31 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -68,7 +68,7 @@ static void close_unused_images(BlockDriverState *top, BlockDriverState *base,
         unused = intermediate;
         intermediate = intermediate->backing_hd;
         unused->backing_hd = NULL;
-        bdrv_delete(unused);
+        bdrv_put_ref(unused);
     }
     top->backing_hd = base;
 }
diff --git a/block/vvfat.c b/block/vvfat.c
index 87b0279..a73d765 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -2938,8 +2938,10 @@ static int enable_write_target(BDRVVVFATState *s)
     ret = bdrv_open(s->qcow, s->qcow_filename, NULL,
             BDRV_O_RDWR | BDRV_O_CACHE_WB | BDRV_O_NO_FLUSH, bdrv_qcow);
     if (ret < 0) {
-	return ret;
+        bdrv_delete(s->qcow);
+        return ret;
     }
+    bdrv_get_ref(s->qcow);
 
 #ifndef _WIN32
     unlink(s->qcow_filename);
diff --git a/blockdev.c b/blockdev.c
index b3a57e0..2c2ea59 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -212,7 +212,6 @@ static void bdrv_format_print(void *opaque, const char *name)
 static void drive_uninit(DriveInfo *dinfo)
 {
     qemu_opts_del(dinfo->opts);
-    bdrv_delete(dinfo->bdrv);
     g_free(dinfo->id);
     QTAILQ_REMOVE(&drives, dinfo, next);
     g_free(dinfo->serial);
@@ -894,6 +893,7 @@ static void external_snapshot_prepare(BlkTransactionState *common,
     if (ret != 0) {
         error_setg_file_open(errp, -ret, new_image_file);
     }
+    bdrv_get_ref(state->new_bs);
 }
 
 static void external_snapshot_commit(BlkTransactionState *common)
@@ -908,6 +908,7 @@ static void external_snapshot_commit(BlkTransactionState *common)
      * don't want to abort all of them if one of them fails the reopen */
     bdrv_reopen(state->new_bs, state->new_bs->open_flags & ~BDRV_O_RDWR,
                 NULL);
+    bdrv_put_ref(state->new_bs);
 }
 
 static void external_snapshot_abort(BlkTransactionState *common)
@@ -915,7 +916,7 @@ static void external_snapshot_abort(BlkTransactionState *common)
     ExternalSnapshotState *state =
                              DO_UPCAST(ExternalSnapshotState, common, common);
     if (state->new_bs) {
-        bdrv_delete(state->new_bs);
+        bdrv_put_ref(state->new_bs);
     }
 }
 
diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
index 247f32f..ae17acc 100644
--- a/hw/block/xen_disk.c
+++ b/hw/block/xen_disk.c
@@ -910,12 +910,7 @@ static void blk_disconnect(struct XenDevice *xendev)
     struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, xendev);
 
     if (blkdev->bs) {
-        if (!blkdev->dinfo) {
-            /* close/delete only if we created it ourself */
-            bdrv_close(blkdev->bs);
-            bdrv_detach_dev(blkdev->bs, blkdev);
-            bdrv_delete(blkdev->bs);
-        }
+        bdrv_detach_dev(blkdev->bs, blkdev);
         blkdev->bs = NULL;
     }
     xen_be_unbind_evtchn(&blkdev->xendev);
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 1ea80e7..92acddf 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -294,6 +294,22 @@ struct BlockDriverState {
     BlockDeviceIoStatus iostatus;
     char device_name[32];
     HBitmap *dirty_bitmap;
+
+    /* Number of global references of this BDS, increased each time when:
+     * - Attached to device
+     * - Used by block job (both source and target)
+     * - NBD exported
+     * - Referenced as backing_hd of other BDS
+     * - Shared by multiple parts of QEMU in any other way
+     * And decreased each time when one of these roles finished.
+     *
+     * Note: Local/temp BDS isn't required to get a refcount. Two ways to
+     * manage the BDS lifecycle:
+     * - If BDS is only accessable locally, use bdrv_new() and bdrv_delete(),
+     *   Just forget refcount.
+     * - bdrv_new() then bdrv_get_ref()/bdrv_put_ref. DON'T call bdrv_delete()
+     *   yourself.
+     */
     int refcount;
     QTAILQ_ENTRY(BlockDriverState) list;
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH 3/7] nbd: use BDS refcount
  2013-07-02  5:59 [Qemu-devel] [PATCH 0/7] Point-in-time snapshot exporting over NBD Fam Zheng
  2013-07-02  5:59 ` [Qemu-devel] [PATCH 1/7] block: Convert BlockDriverState.in_use to refcount Fam Zheng
  2013-07-02  5:59 ` [Qemu-devel] [PATCH 2/7] block: use refcount to manage BlockDriverState lifecycle Fam Zheng
@ 2013-07-02  5:59 ` Fam Zheng
  2013-07-02 10:16   ` Paolo Bonzini
  2013-07-02  5:59 ` [Qemu-devel] [PATCH 4/7] block: simplify bdrv_drop_intermediate Fam Zheng
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 21+ messages in thread
From: Fam Zheng @ 2013-07-02  5:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, famz, obarenbo, roliveri, hbrock, rjones, armbru, pmyers,
	imain, stefanha, pbonzini

Previously, nbd call drive_get_ref on the drive of bs. A BDS doesn't
always have associated dinfo, it's more proper to use bdrv_get_ref().

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 blockdev-nbd.c | 9 +--------
 nbd.c          | 5 +++++
 2 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/blockdev-nbd.c b/blockdev-nbd.c
index 95f10c8..d8bcd6f 100644
--- a/blockdev-nbd.c
+++ b/blockdev-nbd.c
@@ -69,12 +69,6 @@ static void nbd_close_notifier(Notifier *n, void *data)
     g_free(cn);
 }
 
-static void nbd_server_put_ref(NBDExport *exp)
-{
-    BlockDriverState *bs = nbd_export_get_blockdev(exp);
-    drive_put_ref(drive_get_by_blockdev(bs));
-}
-
 void qmp_nbd_server_add(const char *device, bool has_writable, bool writable,
                         Error **errp)
 {
@@ -106,10 +100,9 @@ void qmp_nbd_server_add(const char *device, bool has_writable, bool writable,
     }
 
     exp = nbd_export_new(bs, 0, -1, writable ? 0 : NBD_FLAG_READ_ONLY,
-                         nbd_server_put_ref);
+                         NULL);
 
     nbd_export_set_name(exp, device);
-    drive_get_ref(drive_get_by_blockdev(bs));
 
     n = g_malloc0(sizeof(NBDCloseNotifier));
     n->n.notify = nbd_close_notifier;
diff --git a/nbd.c b/nbd.c
index 2606403..f28b9fb 100644
--- a/nbd.c
+++ b/nbd.c
@@ -881,6 +881,7 @@ NBDExport *nbd_export_new(BlockDriverState *bs, off_t dev_offset,
     exp->nbdflags = nbdflags;
     exp->size = size == -1 ? bdrv_getlength(bs) : size;
     exp->close = close;
+    bdrv_get_ref(bs);
     return exp;
 }
 
@@ -927,6 +928,10 @@ void nbd_export_close(NBDExport *exp)
     }
     nbd_export_set_name(exp, NULL);
     nbd_export_put(exp);
+    if (exp->bs) {
+        bdrv_put_ref(exp->bs);
+        exp->bs = NULL;
+    }
 }
 
 void nbd_export_get(NBDExport *exp)
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH 4/7] block: simplify bdrv_drop_intermediate
  2013-07-02  5:59 [Qemu-devel] [PATCH 0/7] Point-in-time snapshot exporting over NBD Fam Zheng
                   ` (2 preceding siblings ...)
  2013-07-02  5:59 ` [Qemu-devel] [PATCH 3/7] nbd: use BDS refcount Fam Zheng
@ 2013-07-02  5:59 ` Fam Zheng
  2013-07-04 14:02   ` Stefan Hajnoczi
  2013-07-02  5:59 ` [Qemu-devel] [PATCH 5/7] block: rename bdrv_in_use to bdrv_is_shared Fam Zheng
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 21+ messages in thread
From: Fam Zheng @ 2013-07-02  5:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, famz, obarenbo, roliveri, hbrock, rjones, armbru, pmyers,
	imain, stefanha, pbonzini

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block.c | 71 ++++++++++-------------------------------------------------------
 1 file changed, 11 insertions(+), 60 deletions(-)

diff --git a/block.c b/block.c
index d1ce570..ae5de17 100644
--- a/block.c
+++ b/block.c
@@ -2015,12 +2015,6 @@ BlockDriverState *bdrv_find_overlay(BlockDriverState *active,
     return overlay;
 }
 
-typedef struct BlkIntermediateStates {
-    BlockDriverState *bs;
-    QSIMPLEQ_ENTRY(BlkIntermediateStates) entry;
-} BlkIntermediateStates;
-
-
 /*
  * Drops images above 'base' up to and including 'top', and sets the image
  * above 'top' to have base as its backing file.
@@ -2050,15 +2044,9 @@ typedef struct BlkIntermediateStates {
 int bdrv_drop_intermediate(BlockDriverState *active, BlockDriverState *top,
                            BlockDriverState *base)
 {
-    BlockDriverState *intermediate;
-    BlockDriverState *base_bs = NULL;
     BlockDriverState *new_top_bs = NULL;
-    BlkIntermediateStates *intermediate_state, *next;
     int ret = -EIO;
 
-    QSIMPLEQ_HEAD(states_to_delete, BlkIntermediateStates) states_to_delete;
-    QSIMPLEQ_INIT(&states_to_delete);
-
     if (!top->drv || !base->drv) {
         goto exit;
     }
@@ -2070,58 +2058,21 @@ int bdrv_drop_intermediate(BlockDriverState *active, BlockDriverState *top,
         goto exit;
     }
 
-    /* special case of new_top_bs->backing_hd already pointing to base - nothing
-     * to do, no intermediate images */
-    if (new_top_bs->backing_hd == base) {
-        ret = 0;
-        goto exit;
-    }
-
-    intermediate = top;
-
-    /* now we will go down through the list, and add each BDS we find
-     * into our deletion queue, until we hit the 'base'
-     */
-    while (intermediate) {
-        intermediate_state = g_malloc0(sizeof(BlkIntermediateStates));
-        intermediate_state->bs = intermediate;
-        QSIMPLEQ_INSERT_TAIL(&states_to_delete, intermediate_state, entry);
-
-        if (intermediate->backing_hd == base) {
-            base_bs = intermediate->backing_hd;
-            break;
+    while (new_top_bs->backing_hd && new_top_bs->backing_hd != base) {
+        BlockDriverState *backing = new_top_bs->backing_hd;
+        if (backing == NULL) {
+            goto exit;
         }
-        intermediate = intermediate->backing_hd;
-    }
-    if (base_bs == NULL) {
-        /* something went wrong, we did not end at the base. safely
-         * unravel everything, and exit with error */
-        goto exit;
+        new_top_bs->backing_hd = backing->backing_hd;
+        /* break backing_hd chain before releasing bs, so we don't free all the
+         * way up the backing chain */
+        backing->backing_hd = NULL;
+        bdrv_put_ref(backing);
     }
 
-    /* success - we can delete the intermediate states, and link top->base */
-    ret = bdrv_change_backing_file(new_top_bs, base_bs->filename,
-                                   base_bs->drv ? base_bs->drv->format_name : "");
-    if (ret) {
-        goto exit;
-    }
-    if (new_top_bs->backing_hd) {
-        bdrv_put_ref(new_top_bs->backing_hd);
-    }
-    new_top_bs->backing_hd = base_bs;
-    bdrv_get_ref(base_bs);
-
-    QSIMPLEQ_FOREACH_SAFE(intermediate_state, &states_to_delete, entry, next) {
-        /* so that bdrv_close() does not recursively close the chain */
-        intermediate_state->bs->backing_hd = NULL;
-        bdrv_put_ref(intermediate_state->bs);
-    }
-    ret = 0;
-
+    ret = bdrv_change_backing_file(new_top_bs, base->filename,
+                                   base->drv ? base->drv->format_name : "");
 exit:
-    QSIMPLEQ_FOREACH_SAFE(intermediate_state, &states_to_delete, entry, next) {
-        g_free(intermediate_state);
-    }
     return ret;
 }
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH 5/7] block: rename bdrv_in_use to bdrv_is_shared
  2013-07-02  5:59 [Qemu-devel] [PATCH 0/7] Point-in-time snapshot exporting over NBD Fam Zheng
                   ` (3 preceding siblings ...)
  2013-07-02  5:59 ` [Qemu-devel] [PATCH 4/7] block: simplify bdrv_drop_intermediate Fam Zheng
@ 2013-07-02  5:59 ` Fam Zheng
  2013-07-02  5:59 ` [Qemu-devel] [PATCH 6/7] block: add target-id option to drive-backup QMP command Fam Zheng
  2013-07-02  5:59 ` [Qemu-devel] [PATCH 7/7] block: assign backing relationship in drive-backup Fam Zheng
  6 siblings, 0 replies; 21+ messages in thread
From: Fam Zheng @ 2013-07-02  5:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, famz, obarenbo, roliveri, hbrock, rjones, armbru, pmyers,
	imain, stefanha, pbonzini

The patch only does a rename: bdrv_in_use is obsecure literally (any BDS
is certain to be used somewhere). Rename it to bdrv_is_shared since we
have reference count now and the user number of the BDS is reflected
there.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block.c               | 16 ++++++++++------
 blockdev.c            | 10 +++++-----
 blockjob.c            |  2 +-
 include/block/block.h |  2 +-
 4 files changed, 17 insertions(+), 13 deletions(-)

diff --git a/block.c b/block.c
index ae5de17..85dc76f 100644
--- a/block.c
+++ b/block.c
@@ -1775,7 +1775,7 @@ int bdrv_commit(BlockDriverState *bs)
         return -ENOTSUP;
     }
 
-    if (bdrv_in_use(bs) || bdrv_in_use(bs->backing_hd)) {
+    if (bdrv_is_shared(bs) || bdrv_is_shared(bs->backing_hd)) {
         return -EBUSY;
     }
 
@@ -2621,14 +2621,18 @@ int bdrv_truncate(BlockDriverState *bs, int64_t offset)
 {
     BlockDriver *drv = bs->drv;
     int ret;
-    if (!drv)
+    if (!drv) {
         return -ENOMEDIUM;
-    if (!drv->bdrv_truncate)
+    }
+    if (!drv->bdrv_truncate) {
         return -ENOTSUP;
-    if (bs->read_only)
+    }
+    if (bs->read_only) {
         return -EACCES;
-    if (bdrv_in_use(bs))
+    }
+    if (bdrv_is_shared(bs)) {
         return -EBUSY;
+    }
     ret = drv->bdrv_truncate(bs, offset);
     if (ret == 0) {
         ret = refresh_total_sectors(bs, offset >> BDRV_SECTOR_BITS);
@@ -4333,7 +4337,7 @@ void bdrv_get_ref(BlockDriverState *bs)
     bs->refcount++;
 }
 
-int bdrv_in_use(BlockDriverState *bs)
+int bdrv_is_shared(BlockDriverState *bs)
 {
     return bs->refcount > 1;
 }
diff --git a/blockdev.c b/blockdev.c
index 2c2ea59..d02d99a 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -858,7 +858,7 @@ static void external_snapshot_prepare(BlkTransactionState *common,
         return;
     }
 
-    if (bdrv_in_use(state->old_bs)) {
+    if (bdrv_is_shared(state->old_bs)) {
         error_set(errp, QERR_DEVICE_IN_USE, device);
         return;
     }
@@ -1064,7 +1064,7 @@ exit:
 
 static void eject_device(BlockDriverState *bs, int force, Error **errp)
 {
-    if (bdrv_in_use(bs)) {
+    if (bdrv_is_shared(bs)) {
         error_set(errp, QERR_DEVICE_IN_USE, bdrv_get_device_name(bs));
         return;
     }
@@ -1226,7 +1226,7 @@ int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data)
         qerror_report(QERR_DEVICE_NOT_FOUND, id);
         return -1;
     }
-    if (bdrv_in_use(bs)) {
+    if (bdrv_is_shared(bs)) {
         qerror_report(QERR_DEVICE_IN_USE, id);
         return -1;
     }
@@ -1471,7 +1471,7 @@ void qmp_drive_backup(const char *device, const char *target,
         }
     }
 
-    if (bdrv_in_use(bs)) {
+    if (bdrv_is_shared(bs)) {
         error_set(errp, QERR_DEVICE_IN_USE, device);
         return;
     }
@@ -1588,7 +1588,7 @@ void qmp_drive_mirror(const char *device, const char *target,
         }
     }
 
-    if (bdrv_in_use(bs)) {
+    if (bdrv_is_shared(bs)) {
         error_set(errp, QERR_DEVICE_IN_USE, device);
         return;
     }
diff --git a/blockjob.c b/blockjob.c
index a841a66..3e9b9a8 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -41,7 +41,7 @@ void *block_job_create(const BlockJobType *job_type, BlockDriverState *bs,
 {
     BlockJob *job;
 
-    if (bs->job || bdrv_in_use(bs)) {
+    if (bs->job || bdrv_is_shared(bs)) {
         error_set(errp, QERR_DEVICE_IN_USE, bdrv_get_device_name(bs));
         return NULL;
     }
diff --git a/include/block/block.h b/include/block/block.h
index 77f0f0d..6b33f5a 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -355,7 +355,7 @@ void bdrv_disable_copy_on_read(BlockDriverState *bs);
 
 void bdrv_put_ref(BlockDriverState *bs);
 void bdrv_get_ref(BlockDriverState *bs);
-int bdrv_in_use(BlockDriverState *bs);
+int bdrv_is_shared(BlockDriverState *bs);
 
 #ifdef CONFIG_LINUX_AIO
 int raw_get_aio_fd(BlockDriverState *bs);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH 6/7] block: add target-id option to drive-backup QMP command
  2013-07-02  5:59 [Qemu-devel] [PATCH 0/7] Point-in-time snapshot exporting over NBD Fam Zheng
                   ` (4 preceding siblings ...)
  2013-07-02  5:59 ` [Qemu-devel] [PATCH 5/7] block: rename bdrv_in_use to bdrv_is_shared Fam Zheng
@ 2013-07-02  5:59 ` Fam Zheng
  2013-07-02 19:59   ` Eric Blake
  2013-07-02  5:59 ` [Qemu-devel] [PATCH 7/7] block: assign backing relationship in drive-backup Fam Zheng
  6 siblings, 1 reply; 21+ messages in thread
From: Fam Zheng @ 2013-07-02  5:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, famz, obarenbo, roliveri, hbrock, rjones, armbru, pmyers,
	imain, stefanha, pbonzini

Add target-id (optional) to drive-backup command, to make the target bs
a named drive so that we can operate on it (e.g. export with NBD).

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 blockdev.c       | 4 +++-
 qapi-schema.json | 7 +++++--
 qmp-commands.hx  | 3 ++-
 3 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index d02d99a..a297eaf 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -936,6 +936,7 @@ static void drive_backup_prepare(BlkTransactionState *common, Error **errp)
     backup = common->action->drive_backup;
 
     qmp_drive_backup(backup->device, backup->target,
+                     backup->has_target_id, backup->target_id,
                      backup->has_format, backup->format,
                      backup->has_mode, backup->mode,
                      backup->has_speed, backup->speed,
@@ -1421,6 +1422,7 @@ void qmp_block_commit(const char *device,
 }
 
 void qmp_drive_backup(const char *device, const char *target,
+                      bool has_target_id, const char *target_id,
                       bool has_format, const char *format,
                       bool has_mode, enum NewImageMode mode,
                       bool has_speed, int64_t speed,
@@ -1495,7 +1497,7 @@ void qmp_drive_backup(const char *device, const char *target,
         return;
     }
 
-    target_bs = bdrv_new("");
+    target_bs = bdrv_new(has_target_id ? target_id : "");
     ret = bdrv_open(target_bs, target, NULL, flags, drv);
     if (ret < 0) {
         bdrv_delete(target_bs);
diff --git a/qapi-schema.json b/qapi-schema.json
index 5c32528..2f2a87f 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1654,7 +1654,8 @@
 # Since: 1.6
 ##
 { 'type': 'DriveBackup',
-  'data': { 'device': 'str', 'target': 'str', '*format': 'str',
+  'data': { 'device': 'str', 'target': 'str',
+            '*target-id': 'str', '*format': 'str',
             '*mode': 'NewImageMode', '*speed': 'int',
             '*on-source-error': 'BlockdevOnError',
             '*on-target-error': 'BlockdevOnError' } }
@@ -1807,6 +1808,7 @@
 #          is a device, the existing file/device will be used as the new
 #          destination.  If it does not exist, a new file will be created.
 #
+# @target-id: #optional the drive id of the target.
 # @format: #optional the format of the new destination, default is to
 #          probe if @mode is 'existing', else the format of the source
 #
@@ -1833,7 +1835,8 @@
 # Since 1.6
 ##
 { 'command': 'drive-backup',
-  'data': { 'device': 'str', 'target': 'str', '*format': 'str',
+  'data': { 'device': 'str', 'target': 'str',
+            '*target-id': 'str', '*format': 'str',
             '*mode': 'NewImageMode', '*speed': 'int',
             '*on-source-error': 'BlockdevOnError',
             '*on-target-error': 'BlockdevOnError' } }
diff --git a/qmp-commands.hx b/qmp-commands.hx
index 362f0e1..3ed03de 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -913,7 +913,7 @@ EQMP
 
     {
         .name       = "drive-backup",
-        .args_type  = "device:B,target:s,speed:i?,mode:s?,format:s?,"
+        .args_type  = "device:B,target:s,target-id:s?,speed:i?,mode:s?,format:s?,"
                       "on-source-error:s?,on-target-error:s?",
         .mhandler.cmd_new = qmp_marshal_input_drive_backup,
     },
@@ -936,6 +936,7 @@ Arguments:
             device, the existing file/device will be used as the new
             destination.  If it does not exist, a new file will be created.
             (json-string)
+- "target-id": the drive id of the target image.
 - "format": the format of the new destination, default is to probe if 'mode' is
             'existing', else the format of the source
             (json-string, optional)
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH 7/7] block: assign backing relationship in drive-backup
  2013-07-02  5:59 [Qemu-devel] [PATCH 0/7] Point-in-time snapshot exporting over NBD Fam Zheng
                   ` (5 preceding siblings ...)
  2013-07-02  5:59 ` [Qemu-devel] [PATCH 6/7] block: add target-id option to drive-backup QMP command Fam Zheng
@ 2013-07-02  5:59 ` Fam Zheng
  2013-07-04 14:32   ` Stefan Hajnoczi
  6 siblings, 1 reply; 21+ messages in thread
From: Fam Zheng @ 2013-07-02  5:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, famz, obarenbo, roliveri, hbrock, rjones, armbru, pmyers,
	imain, stefanha, pbonzini

Assign source image as the backing hd of target bs, so reading target bs
gets the point-in-time copy of data from source image.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block/backup.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/block/backup.c b/block/backup.c
index 4e9f927..2dd0540 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -294,6 +294,11 @@ static void coroutine_fn backup_run(void *opaque)
     hbitmap_free(job->bitmap);
 
     bdrv_iostatus_disable(target);
+
+    bdrv_put_ref(target->backing_hd);
+    target->backing_hd = NULL;
+    target->backing_file[0] = '\0';
+    target->backing_format[0] = '\0';
     bdrv_put_ref(target);
 
     block_job_completed(&job->common, ret);
@@ -332,7 +337,15 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
         return;
     }
 
+    target->backing_hd = bs;
+    pstrcpy(target->backing_file, sizeof(target->backing_file),
+            bs->filename);
+    pstrcpy(target->backing_format, sizeof(target->backing_format),
+            bs->drv->format_name);
     bdrv_get_ref(target);
+    /* Get another ref to source for backing_hd relationship */
+    bdrv_get_ref(bs);
+
     job->on_source_error = on_source_error;
     job->on_target_error = on_target_error;
     job->target = target;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 3/7] nbd: use BDS refcount
  2013-07-02  5:59 ` [Qemu-devel] [PATCH 3/7] nbd: use BDS refcount Fam Zheng
@ 2013-07-02 10:16   ` Paolo Bonzini
  2013-07-03  1:10     ` Fam Zheng
  0 siblings, 1 reply; 21+ messages in thread
From: Paolo Bonzini @ 2013-07-02 10:16 UTC (permalink / raw)
  To: Fam Zheng
  Cc: kwolf, obarenbo, armbru, roliveri, hbrock, qemu-devel, rjones,
	pmyers, imain, stefanha

Il 02/07/2013 07:59, Fam Zheng ha scritto:
> Previously, nbd call drive_get_ref on the drive of bs. A BDS doesn't
> always have associated dinfo, it's more proper to use bdrv_get_ref().

This has the important side effect of marking the exported disk as
"in_use" (to use the terms before the series).  Right now you can serve
a disk and, at the same time, stream it or mirror it or create a live
snapshot of it.

Do we really want to block anything for a device being served?  Perhaps
truncation, but maybe not even that.  The NBD server is meant to be as
unobtrusive as possible (in some sense NBD accesses are the same as
guest accesses).

Paolo

> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  blockdev-nbd.c | 9 +--------
>  nbd.c          | 5 +++++
>  2 files changed, 6 insertions(+), 8 deletions(-)
> 
> diff --git a/blockdev-nbd.c b/blockdev-nbd.c
> index 95f10c8..d8bcd6f 100644
> --- a/blockdev-nbd.c
> +++ b/blockdev-nbd.c
> @@ -69,12 +69,6 @@ static void nbd_close_notifier(Notifier *n, void *data)
>      g_free(cn);
>  }
>  
> -static void nbd_server_put_ref(NBDExport *exp)
> -{
> -    BlockDriverState *bs = nbd_export_get_blockdev(exp);
> -    drive_put_ref(drive_get_by_blockdev(bs));
> -}
> -
>  void qmp_nbd_server_add(const char *device, bool has_writable, bool writable,
>                          Error **errp)
>  {
> @@ -106,10 +100,9 @@ void qmp_nbd_server_add(const char *device, bool has_writable, bool writable,
>      }
>  
>      exp = nbd_export_new(bs, 0, -1, writable ? 0 : NBD_FLAG_READ_ONLY,
> -                         nbd_server_put_ref);
> +                         NULL);
>  
>      nbd_export_set_name(exp, device);
> -    drive_get_ref(drive_get_by_blockdev(bs));
>  
>      n = g_malloc0(sizeof(NBDCloseNotifier));
>      n->n.notify = nbd_close_notifier;
> diff --git a/nbd.c b/nbd.c
> index 2606403..f28b9fb 100644
> --- a/nbd.c
> +++ b/nbd.c
> @@ -881,6 +881,7 @@ NBDExport *nbd_export_new(BlockDriverState *bs, off_t dev_offset,
>      exp->nbdflags = nbdflags;
>      exp->size = size == -1 ? bdrv_getlength(bs) : size;
>      exp->close = close;
> +    bdrv_get_ref(bs);
>      return exp;
>  }
>  
> @@ -927,6 +928,10 @@ void nbd_export_close(NBDExport *exp)
>      }
>      nbd_export_set_name(exp, NULL);
>      nbd_export_put(exp);
> +    if (exp->bs) {
> +        bdrv_put_ref(exp->bs);
> +        exp->bs = NULL;
> +    }
>  }
>  
>  void nbd_export_get(NBDExport *exp)
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 1/7] block: Convert BlockDriverState.in_use to refcount
  2013-07-02  5:59 ` [Qemu-devel] [PATCH 1/7] block: Convert BlockDriverState.in_use to refcount Fam Zheng
@ 2013-07-02 10:21   ` Paolo Bonzini
  2013-07-08  8:37     ` Fam Zheng
  2013-07-02 19:41   ` Eric Blake
  1 sibling, 1 reply; 21+ messages in thread
From: Paolo Bonzini @ 2013-07-02 10:21 UTC (permalink / raw)
  To: Fam Zheng
  Cc: kwolf, obarenbo, armbru, roliveri, hbrock, qemu-devel, rjones,
	pmyers, imain, stefanha

Il 02/07/2013 07:59, Fam Zheng ha scritto:
> Use numeric value to replace in_use flag in BDS, this will make
> lifecycle management with ref count possible. This patch only replaces
> existing uses of bdrv_set_in_use, so no logic change here.

This still does not entirely explain the rules for who sets in_use (or
gets a reference) and who checks in_use.

Why should offline commit worry about the disk being shared, for
example?  The reason "of course" is that a background job might modify
bs->backing_hd while commit is running (or might expect bs->backing_hd
to not change).  However, this is in no way related to a reference count.

So I think your series is doing two things right (setting the in_use
flag for BDSes in the ->file and ->backing_hd chains; turning the in_use
flag into a counter) and one wrong (tying bdrv_in_use checks to the
lifecycle).  I wonder thus if we need two counters: the "in use" counter
and the reference count for the lifecycle.

One further nit is inline.

> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  block-migration.c               |  4 ++--
>  block.c                         | 23 +++++++++++++++--------
>  blockjob.c                      |  6 +++---
>  hw/block/dataplane/virtio-blk.c |  4 ++--
>  include/block/block.h           |  3 ++-
>  include/block/block_int.h       |  2 +-
>  6 files changed, 25 insertions(+), 17 deletions(-)
> 
> diff --git a/block-migration.c b/block-migration.c
> index 2fd7699..2efb6c0 100644
> --- a/block-migration.c
> +++ b/block-migration.c
> @@ -321,7 +321,7 @@ static void init_blk_migration_it(void *opaque, BlockDriverState *bs)
>          bmds->shared_base = block_mig_state.shared_base;
>          alloc_aio_bitmap(bmds);
>          drive_get_ref(drive_get_by_blockdev(bs));
> -        bdrv_set_in_use(bs, 1);
> +        bdrv_get_ref(bs);
>  
>          block_mig_state.total_sector_sum += sectors;
>  
> @@ -557,7 +557,7 @@ static void blk_mig_cleanup(void)
>      blk_mig_lock();
>      while ((bmds = QSIMPLEQ_FIRST(&block_mig_state.bmds_list)) != NULL) {
>          QSIMPLEQ_REMOVE_HEAD(&block_mig_state.bmds_list, entry);
> -        bdrv_set_in_use(bmds->bs, 0);
> +        bdrv_get_ref(bmds->bs);
>          drive_put_ref(drive_get_by_blockdev(bmds->bs));
>          g_free(bmds->aio_bitmap);
>          g_free(bmds);
> diff --git a/block.c b/block.c
> index 6c493ad..84c3181 100644
> --- a/block.c
> +++ b/block.c
> @@ -1503,8 +1503,10 @@ static void bdrv_move_feature_fields(BlockDriverState *bs_dest,
>      /* dirty bitmap */
>      bs_dest->dirty_bitmap       = bs_src->dirty_bitmap;
>  
> +    /* ref count */
> +    bs_dest->refcount           = bs_src->refcount;
> +
>      /* job */
> -    bs_dest->in_use             = bs_src->in_use;
>      bs_dest->job                = bs_src->job;
>  
>      /* keep the same entry in bdrv_states */
> @@ -1534,7 +1536,7 @@ void bdrv_swap(BlockDriverState *bs_new, BlockDriverState *bs_old)
>      assert(bs_new->dirty_bitmap == NULL);
>      assert(bs_new->job == NULL);
>      assert(bs_new->dev == NULL);
> -    assert(bs_new->in_use == 0);
> +    assert(bs_new->refcount == 0);
>      assert(bs_new->io_limits_enabled == false);
>      assert(bs_new->block_timer == NULL);
>  
> @@ -1553,7 +1555,7 @@ void bdrv_swap(BlockDriverState *bs_new, BlockDriverState *bs_old)
>      /* Check a few fields that should remain attached to the device */
>      assert(bs_new->dev == NULL);
>      assert(bs_new->job == NULL);
> -    assert(bs_new->in_use == 0);
> +    assert(bs_new->refcount == 0);
>      assert(bs_new->io_limits_enabled == false);
>      assert(bs_new->block_timer == NULL);
>  
> @@ -1590,7 +1592,7 @@ void bdrv_delete(BlockDriverState *bs)
>  {
>      assert(!bs->dev);
>      assert(!bs->job);
> -    assert(!bs->in_use);
> +    assert(!bs->refcount);
>  
>      /* remove from list, if necessary */
>      bdrv_make_anon(bs);
> @@ -4360,15 +4362,20 @@ int64_t bdrv_get_dirty_count(BlockDriverState *bs)
>      }
>  }
>  
> -void bdrv_set_in_use(BlockDriverState *bs, int in_use)
> +void bdrv_put_ref(BlockDriverState *bs)
> +{
> +    assert(bs->refcount > 0);
> +    bs->refcount--;
> +}
> +
> +void bdrv_get_ref(BlockDriverState *bs)
>  {
> -    assert(bs->in_use != in_use);
> -    bs->in_use = in_use;
> +    bs->refcount++;
>  }

The convention that we use in QEMU is bdrv_ref/bdrv_unref.

Paolo

>  int bdrv_in_use(BlockDriverState *bs)
>  {
> -    return bs->in_use;
> +    return bs->refcount > 0;
>  }
>  
>  void bdrv_iostatus_enable(BlockDriverState *bs)
> diff --git a/blockjob.c b/blockjob.c
> index ca80df1..a841a66 100644
> --- a/blockjob.c
> +++ b/blockjob.c
> @@ -45,7 +45,7 @@ void *block_job_create(const BlockJobType *job_type, BlockDriverState *bs,
>          error_set(errp, QERR_DEVICE_IN_USE, bdrv_get_device_name(bs));
>          return NULL;
>      }
> -    bdrv_set_in_use(bs, 1);
> +    bdrv_get_ref(bs);
>  
>      job = g_malloc0(job_type->instance_size);
>      job->job_type      = job_type;
> @@ -63,7 +63,7 @@ void *block_job_create(const BlockJobType *job_type, BlockDriverState *bs,
>          if (error_is_set(&local_err)) {
>              bs->job = NULL;
>              g_free(job);
> -            bdrv_set_in_use(bs, 0);
> +            bdrv_put_ref(bs);
>              error_propagate(errp, local_err);
>              return NULL;
>          }
> @@ -79,7 +79,7 @@ void block_job_completed(BlockJob *job, int ret)
>      job->cb(job->opaque, ret);
>      bs->job = NULL;
>      g_free(job);
> -    bdrv_set_in_use(bs, 0);
> +    bdrv_put_ref(bs);
>  }
>  
>  void block_job_set_speed(BlockJob *job, int64_t speed, Error **errp)
> diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
> index 0356665..9893a11 100644
> --- a/hw/block/dataplane/virtio-blk.c
> +++ b/hw/block/dataplane/virtio-blk.c
> @@ -431,7 +431,7 @@ bool virtio_blk_data_plane_create(VirtIODevice *vdev, VirtIOBlkConf *blk,
>      s->blk = blk;
>  
>      /* Prevent block operations that conflict with data plane thread */
> -    bdrv_set_in_use(blk->conf.bs, 1);
> +    bdrv_get_ref(blk->conf.bs);
>  
>      error_setg(&s->migration_blocker,
>              "x-data-plane does not support migration");
> @@ -450,7 +450,7 @@ void virtio_blk_data_plane_destroy(VirtIOBlockDataPlane *s)
>      virtio_blk_data_plane_stop(s);
>      migrate_del_blocker(s->migration_blocker);
>      error_free(s->migration_blocker);
> -    bdrv_set_in_use(s->blk->conf.bs, 0);
> +    bdrv_put_ref(s->blk->conf.bs);
>      g_free(s);
>  }
>  
> diff --git a/include/block/block.h b/include/block/block.h
> index dd8eca1..77f0f0d 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h
> @@ -353,7 +353,8 @@ int64_t bdrv_get_dirty_count(BlockDriverState *bs);
>  void bdrv_enable_copy_on_read(BlockDriverState *bs);
>  void bdrv_disable_copy_on_read(BlockDriverState *bs);
>  
> -void bdrv_set_in_use(BlockDriverState *bs, int in_use);
> +void bdrv_put_ref(BlockDriverState *bs);
> +void bdrv_get_ref(BlockDriverState *bs);
>  int bdrv_in_use(BlockDriverState *bs);
>  
>  #ifdef CONFIG_LINUX_AIO
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index c6ac871..1ea80e7 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -294,7 +294,7 @@ struct BlockDriverState {
>      BlockDeviceIoStatus iostatus;
>      char device_name[32];
>      HBitmap *dirty_bitmap;
> -    int in_use; /* users other than guest access, eg. block migration */
> +    int refcount;
>      QTAILQ_ENTRY(BlockDriverState) list;
>  
>      QLIST_HEAD(, BdrvTrackedRequest) tracked_requests;
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 1/7] block: Convert BlockDriverState.in_use to refcount
  2013-07-02  5:59 ` [Qemu-devel] [PATCH 1/7] block: Convert BlockDriverState.in_use to refcount Fam Zheng
  2013-07-02 10:21   ` Paolo Bonzini
@ 2013-07-02 19:41   ` Eric Blake
  2013-07-03  0:59     ` Fam Zheng
  1 sibling, 1 reply; 21+ messages in thread
From: Eric Blake @ 2013-07-02 19:41 UTC (permalink / raw)
  To: Fam Zheng
  Cc: kwolf, obarenbo, armbru, roliveri, hbrock, qemu-devel, rjones,
	pmyers, imain, stefanha, pbonzini

[-- Attachment #1: Type: text/plain, Size: 1759 bytes --]

On 07/01/2013 11:59 PM, Fam Zheng wrote:
> Use numeric value to replace in_use flag in BDS, this will make
> lifecycle management with ref count possible. This patch only replaces
> existing uses of bdrv_set_in_use, so no logic change here.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  block-migration.c               |  4 ++--
>  block.c                         | 23 +++++++++++++++--------
>  blockjob.c                      |  6 +++---
>  hw/block/dataplane/virtio-blk.c |  4 ++--
>  include/block/block.h           |  3 ++-
>  include/block/block_int.h       |  2 +-
>  6 files changed, 25 insertions(+), 17 deletions(-)
> 
> diff --git a/block-migration.c b/block-migration.c
> index 2fd7699..2efb6c0 100644
> --- a/block-migration.c
> +++ b/block-migration.c
> @@ -321,7 +321,7 @@ static void init_blk_migration_it(void *opaque, BlockDriverState *bs)
>          bmds->shared_base = block_mig_state.shared_base;
>          alloc_aio_bitmap(bmds);
>          drive_get_ref(drive_get_by_blockdev(bs));
> -        bdrv_set_in_use(bs, 1);
> +        bdrv_get_ref(bs);

The old code sets the flag, the new code gets a ref.

>  
>          block_mig_state.total_sector_sum += sectors;
>  
> @@ -557,7 +557,7 @@ static void blk_mig_cleanup(void)
>      blk_mig_lock();
>      while ((bmds = QSIMPLEQ_FIRST(&block_mig_state.bmds_list)) != NULL) {
>          QSIMPLEQ_REMOVE_HEAD(&block_mig_state.bmds_list, entry);
> -        bdrv_set_in_use(bmds->bs, 0);
> +        bdrv_get_ref(bmds->bs);

The old code sets the flag, the new code gets a ref.  Shouldn't this be
clearing a ref, with bdrv_put_ref?

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 6/7] block: add target-id option to drive-backup QMP command
  2013-07-02  5:59 ` [Qemu-devel] [PATCH 6/7] block: add target-id option to drive-backup QMP command Fam Zheng
@ 2013-07-02 19:59   ` Eric Blake
  0 siblings, 0 replies; 21+ messages in thread
From: Eric Blake @ 2013-07-02 19:59 UTC (permalink / raw)
  To: Fam Zheng
  Cc: kwolf, obarenbo, armbru, roliveri, hbrock, qemu-devel, rjones,
	pmyers, imain, stefanha, pbonzini

[-- Attachment #1: Type: text/plain, Size: 2786 bytes --]

On 07/01/2013 11:59 PM, Fam Zheng wrote:
> Add target-id (optional) to drive-backup command, to make the target bs
> a named drive so that we can operate on it (e.g. export with NBD).
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  blockdev.c       | 4 +++-
>  qapi-schema.json | 7 +++++--
>  qmp-commands.hx  | 3 ++-
>  3 files changed, 10 insertions(+), 4 deletions(-)
> 

> +++ b/qapi-schema.json
> @@ -1654,7 +1654,8 @@
>  # Since: 1.6
>  ##
>  { 'type': 'DriveBackup',
> -  'data': { 'device': 'str', 'target': 'str', '*format': 'str',
> +  'data': { 'device': 'str', 'target': 'str',
> +            '*target-id': 'str', '*format': 'str',

Seems undocumented...

>              '*mode': 'NewImageMode', '*speed': 'int',
>              '*on-source-error': 'BlockdevOnError',
>              '*on-target-error': 'BlockdevOnError' } }
> @@ -1807,6 +1808,7 @@
>  #          is a device, the existing file/device will be used as the new
>  #          destination.  If it does not exist, a new file will be created.
>  #
> +# @target-id: #optional the drive id of the target.

...until I read this.  Hmm, I think we should first consolidate things
for DriveBackup (so that documentation is listed only once, prior to the
DriveBackup 'type' declaration), by rebasing things on top of
in the same was as Kevin's series "[PATCH v3 0/3] qapi: Top-level type
reference for command definitions" does for BlockdevSnapshot.

The documentation is not incorrect, but it also isn't very helpful -
what is the "drive id of the target" and when would I want to set it?
What do I gain by overriding the drive id, and what is the default
behavior when I don't pass in the option?

> +++ b/qmp-commands.hx
> @@ -913,7 +913,7 @@ EQMP
>  
>      {
>          .name       = "drive-backup",
> -        .args_type  = "device:B,target:s,speed:i?,mode:s?,format:s?,"
> +        .args_type  = "device:B,target:s,target-id:s?,speed:i?,mode:s?,format:s?,"
>                        "on-source-error:s?,on-target-error:s?",
>          .mhandler.cmd_new = qmp_marshal_input_drive_backup,
>      },
> @@ -936,6 +936,7 @@ Arguments:
>              device, the existing file/device will be used as the new
>              destination.  If it does not exist, a new file will be created.
>              (json-string)
> +- "target-id": the drive id of the target image.

Should probably mention (json-string, optional), as done elsewhere in
this command.

>  - "format": the format of the new destination, default is to probe if 'mode' is
>              'existing', else the format of the source
>              (json-string, optional)
> 

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 1/7] block: Convert BlockDriverState.in_use to refcount
  2013-07-02 19:41   ` Eric Blake
@ 2013-07-03  0:59     ` Fam Zheng
  0 siblings, 0 replies; 21+ messages in thread
From: Fam Zheng @ 2013-07-03  0:59 UTC (permalink / raw)
  To: Eric Blake
  Cc: kwolf, obarenbo, armbru, roliveri, hbrock, qemu-devel, rjones,
	pmyers, imain, stefanha, pbonzini

On Tue, 07/02 13:41, Eric Blake wrote:
> On 07/01/2013 11:59 PM, Fam Zheng wrote:
> > Use numeric value to replace in_use flag in BDS, this will make
> > lifecycle management with ref count possible. This patch only replaces
> > existing uses of bdrv_set_in_use, so no logic change here.
> > 
> > Signed-off-by: Fam Zheng <famz@redhat.com>
> > ---
> >  block-migration.c               |  4 ++--
> >  block.c                         | 23 +++++++++++++++--------
> >  blockjob.c                      |  6 +++---
> >  hw/block/dataplane/virtio-blk.c |  4 ++--
> >  include/block/block.h           |  3 ++-
> >  include/block/block_int.h       |  2 +-
> >  6 files changed, 25 insertions(+), 17 deletions(-)
> > 
> > diff --git a/block-migration.c b/block-migration.c
> > index 2fd7699..2efb6c0 100644
> > --- a/block-migration.c
> > +++ b/block-migration.c
> > @@ -321,7 +321,7 @@ static void init_blk_migration_it(void *opaque, BlockDriverState *bs)
> >          bmds->shared_base = block_mig_state.shared_base;
> >          alloc_aio_bitmap(bmds);
> >          drive_get_ref(drive_get_by_blockdev(bs));
> > -        bdrv_set_in_use(bs, 1);
> > +        bdrv_get_ref(bs);
> 
> The old code sets the flag, the new code gets a ref.
> 
> >  
> >          block_mig_state.total_sector_sum += sectors;
> >  
> > @@ -557,7 +557,7 @@ static void blk_mig_cleanup(void)
> >      blk_mig_lock();
> >      while ((bmds = QSIMPLEQ_FIRST(&block_mig_state.bmds_list)) != NULL) {
> >          QSIMPLEQ_REMOVE_HEAD(&block_mig_state.bmds_list, entry);
> > -        bdrv_set_in_use(bmds->bs, 0);
> > +        bdrv_get_ref(bmds->bs);
> 
> The old code sets the flag, the new code gets a ref.  Shouldn't this be
> clearing a ref, with bdrv_put_ref?
Yes, thanks for pointing out.



-- 
Fam

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 3/7] nbd: use BDS refcount
  2013-07-02 10:16   ` Paolo Bonzini
@ 2013-07-03  1:10     ` Fam Zheng
  2013-07-03  5:58       ` Paolo Bonzini
  0 siblings, 1 reply; 21+ messages in thread
From: Fam Zheng @ 2013-07-03  1:10 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: kwolf, obarenbo, armbru, roliveri, hbrock, qemu-devel, rjones,
	pmyers, imain, stefanha

On Tue, 07/02 12:16, Paolo Bonzini wrote:
> Il 02/07/2013 07:59, Fam Zheng ha scritto:
> > Previously, nbd call drive_get_ref on the drive of bs. A BDS doesn't
> > always have associated dinfo, it's more proper to use bdrv_get_ref().
> 
> This has the important side effect of marking the exported disk as
> "in_use" (to use the terms before the series).  Right now you can serve
> a disk and, at the same time, stream it or mirror it or create a live
> snapshot of it.
> 
> Do we really want to block anything for a device being served?  Perhaps
> truncation, but maybe not even that.  The NBD server is meant to be as
> unobtrusive as possible (in some sense NBD accesses are the same as
> guest accesses).
> 
> Paolo
> 

OK, it is better to work like that. But I don't quite understand why was
there drive_get_ref() on the device (w/o the series), as there's already
a close notifier? And it just drive_put_ref() when bs is closed?

> > Signed-off-by: Fam Zheng <famz@redhat.com>
> > ---
> >  blockdev-nbd.c | 9 +--------
> >  nbd.c          | 5 +++++
> >  2 files changed, 6 insertions(+), 8 deletions(-)
> > 
> > diff --git a/blockdev-nbd.c b/blockdev-nbd.c
> > index 95f10c8..d8bcd6f 100644
> > --- a/blockdev-nbd.c
> > +++ b/blockdev-nbd.c
> > @@ -69,12 +69,6 @@ static void nbd_close_notifier(Notifier *n, void *data)
> >      g_free(cn);
> >  }
> >  
> > -static void nbd_server_put_ref(NBDExport *exp)
> > -{
> > -    BlockDriverState *bs = nbd_export_get_blockdev(exp);
> > -    drive_put_ref(drive_get_by_blockdev(bs));
> > -}
> > -
> >  void qmp_nbd_server_add(const char *device, bool has_writable, bool writable,
> >                          Error **errp)
> >  {
> > @@ -106,10 +100,9 @@ void qmp_nbd_server_add(const char *device, bool has_writable, bool writable,
> >      }
> >  
> >      exp = nbd_export_new(bs, 0, -1, writable ? 0 : NBD_FLAG_READ_ONLY,
> > -                         nbd_server_put_ref);
> > +                         NULL);
> >  
> >      nbd_export_set_name(exp, device);
> > -    drive_get_ref(drive_get_by_blockdev(bs));
> >  
> >      n = g_malloc0(sizeof(NBDCloseNotifier));
> >      n->n.notify = nbd_close_notifier;
> > diff --git a/nbd.c b/nbd.c
> > index 2606403..f28b9fb 100644
> > --- a/nbd.c
> > +++ b/nbd.c
> > @@ -881,6 +881,7 @@ NBDExport *nbd_export_new(BlockDriverState *bs, off_t dev_offset,
> >      exp->nbdflags = nbdflags;
> >      exp->size = size == -1 ? bdrv_getlength(bs) : size;
> >      exp->close = close;
> > +    bdrv_get_ref(bs);
> >      return exp;
> >  }
> >  
> > @@ -927,6 +928,10 @@ void nbd_export_close(NBDExport *exp)
> >      }
> >      nbd_export_set_name(exp, NULL);
> >      nbd_export_put(exp);
> > +    if (exp->bs) {
> > +        bdrv_put_ref(exp->bs);
> > +        exp->bs = NULL;
> > +    }
> >  }
> >  
> >  void nbd_export_get(NBDExport *exp)
> > 
> 

-- 
Fam

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 3/7] nbd: use BDS refcount
  2013-07-03  1:10     ` Fam Zheng
@ 2013-07-03  5:58       ` Paolo Bonzini
  2013-07-03  6:30         ` Fam Zheng
  0 siblings, 1 reply; 21+ messages in thread
From: Paolo Bonzini @ 2013-07-03  5:58 UTC (permalink / raw)
  To: famz
  Cc: kwolf, rjones, obarenbo, roliveri, hbrock, armbru, qemu-devel,
	pmyers, imain, stefanha

Il 03/07/2013 03:10, Fam Zheng ha scritto:
> > This has the important side effect of marking the exported disk as
> > "in_use" (to use the terms before the series).  Right now you can serve
> > a disk and, at the same time, stream it or mirror it or create a live
> > snapshot of it.
> > 
> > Do we really want to block anything for a device being served?  Perhaps
> > truncation, but maybe not even that.  The NBD server is meant to be as
> > unobtrusive as possible (in some sense NBD accesses are the same as
> > guest accesses).
> 
> OK, it is better to work like that. But I don't quite understand why was
> there drive_get_ref() on the device (w/o the series), as there's already
> a close notifier? And it just drive_put_ref() when bs is closed?

The close notifier runs when the user invokes a drive_del or eject
command from the monitor.  The drive_get_ref/drive_put_ref delays the
bdrv_delete until after nbd.c has cleaned up all the connections.

Paolo

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 3/7] nbd: use BDS refcount
  2013-07-03  5:58       ` Paolo Bonzini
@ 2013-07-03  6:30         ` Fam Zheng
  2013-07-03  7:28           ` Paolo Bonzini
  0 siblings, 1 reply; 21+ messages in thread
From: Fam Zheng @ 2013-07-03  6:30 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: kwolf, rjones, obarenbo, roliveri, hbrock, armbru, qemu-devel,
	pmyers, imain, stefanha

On Wed, 07/03 07:58, Paolo Bonzini wrote:
> Il 03/07/2013 03:10, Fam Zheng ha scritto:
> > > This has the important side effect of marking the exported disk as
> > > "in_use" (to use the terms before the series).  Right now you can serve
> > > a disk and, at the same time, stream it or mirror it or create a live
> > > snapshot of it.
> > > 
> > > Do we really want to block anything for a device being served?  Perhaps
> > > truncation, but maybe not even that.  The NBD server is meant to be as
> > > unobtrusive as possible (in some sense NBD accesses are the same as
> > > guest accesses).
> > 
> > OK, it is better to work like that. But I don't quite understand why was
> > there drive_get_ref() on the device (w/o the series), as there's already
> > a close notifier? And it just drive_put_ref() when bs is closed?
> 
> The close notifier runs when the user invokes a drive_del or eject
> command from the monitor.  The drive_get_ref/drive_put_ref delays the
> bdrv_delete until after nbd.c has cleaned up all the connections.

But drive_put_ref is called by close notifier. I think it can be
omitted, registering a close notifier is enough, and close the export
when drive_del calls it. It doesn't make more sense w/ drive_get_ref,
does it?

-- 
Fam

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 3/7] nbd: use BDS refcount
  2013-07-03  6:30         ` Fam Zheng
@ 2013-07-03  7:28           ` Paolo Bonzini
  0 siblings, 0 replies; 21+ messages in thread
From: Paolo Bonzini @ 2013-07-03  7:28 UTC (permalink / raw)
  To: famz
  Cc: kwolf, rjones, obarenbo, roliveri, hbrock, armbru, qemu-devel,
	pmyers, imain, stefanha

Il 03/07/2013 08:30, Fam Zheng ha scritto:
>> > The close notifier runs when the user invokes a drive_del or eject
>> > command from the monitor.  The drive_get_ref/drive_put_ref delays the
>> > bdrv_delete until after nbd.c has cleaned up all the connections.
> But drive_put_ref is called by close notifier.

Not necessarily.  nbd_export_close calls nbd_client_close, which shuts
down the socket.  However, if requests are being processed, they will
complete after nbd_export_close returns.  Completing the requests leads
to the following call chain:

   nbd_request_put (from nbd_trip)
   calls nbd_client_put
   calls nbd_export_put
   calls exp->close (if refcount goes to 0)
   calls drive_put_ref

Completion will happen as soon as the main loop runs again, because
after shutdown() the reads and writes will fail.  Still, it is
asynchronous, hence the call to drive_put_ref is also asynchronous.

> I think it can be
> omitted, registering a close notifier is enough, and close the export
> when drive_del calls it. It doesn't make more sense w/ drive_get_ref,
> does it?

I think that would cause a dangling pointer if NBD requests are being
processed at the time drive_del runs.

Paolo

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 4/7] block: simplify bdrv_drop_intermediate
  2013-07-02  5:59 ` [Qemu-devel] [PATCH 4/7] block: simplify bdrv_drop_intermediate Fam Zheng
@ 2013-07-04 14:02   ` Stefan Hajnoczi
  2013-07-05  1:00     ` Fam Zheng
  0 siblings, 1 reply; 21+ messages in thread
From: Stefan Hajnoczi @ 2013-07-04 14:02 UTC (permalink / raw)
  To: Fam Zheng
  Cc: kwolf, obarenbo, armbru, roliveri, hbrock, qemu-devel, rjones,
	pmyers, imain, stefanha, pbonzini

On Tue, Jul 02, 2013 at 01:59:46PM +0800, Fam Zheng wrote:
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  block.c | 71 ++++++++++-------------------------------------------------------
>  1 file changed, 11 insertions(+), 60 deletions(-)

Way too big for no commit description.  Okay, I admit I jumped into this
patch and didn't review all the previous ones - maybe it would have been
obvious if I had read them linearly :).

Should the description be something like "We no longer need to a keep a
list because refcounts ..."?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 7/7] block: assign backing relationship in drive-backup
  2013-07-02  5:59 ` [Qemu-devel] [PATCH 7/7] block: assign backing relationship in drive-backup Fam Zheng
@ 2013-07-04 14:32   ` Stefan Hajnoczi
  0 siblings, 0 replies; 21+ messages in thread
From: Stefan Hajnoczi @ 2013-07-04 14:32 UTC (permalink / raw)
  To: Fam Zheng
  Cc: kwolf, obarenbo, armbru, roliveri, hbrock, qemu-devel, rjones,
	pmyers, imain, stefanha, pbonzini

On Tue, Jul 02, 2013 at 01:59:49PM +0800, Fam Zheng wrote:
> Assign source image as the backing hd of target bs, so reading target bs
> gets the point-in-time copy of data from source image.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  block/backup.c | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
> 
> diff --git a/block/backup.c b/block/backup.c
> index 4e9f927..2dd0540 100644
> --- a/block/backup.c
> +++ b/block/backup.c
> @@ -294,6 +294,11 @@ static void coroutine_fn backup_run(void *opaque)
>      hbitmap_free(job->bitmap);
>  
>      bdrv_iostatus_disable(target);
> +
> +    bdrv_put_ref(target->backing_hd);
> +    target->backing_hd = NULL;
> +    target->backing_file[0] = '\0';
> +    target->backing_format[0] = '\0';
>      bdrv_put_ref(target);
>  
>      block_job_completed(&job->common, ret);
> @@ -332,7 +337,15 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
>          return;
>      }
>  
> +    target->backing_hd = bs;
> +    pstrcpy(target->backing_file, sizeof(target->backing_file),
> +            bs->filename);
> +    pstrcpy(target->backing_format, sizeof(target->backing_format),
> +            bs->drv->format_name);
>      bdrv_get_ref(target);
> +    /* Get another ref to source for backing_hd relationship */
> +    bdrv_get_ref(bs);
> +
>      job->on_source_error = on_source_error;
>      job->on_target_error = on_target_error;
>      job->target = target;

This is a strange way of overriding the backing file.  The target exists
after drive-backup completes but now has a NULL backing_hd.

Also, we set backing_hd even for a raw image.

I thought this would be achieved as follows:

1. The management tool creates the qcow2 file.
2. drive-add if=none,id=target,file=backup.qcow2,backing_hd=drive0
3. drive-backup drive=drive0,target=target,mode=existing

Something along these lines.  The difference is that we do not force
override the backing_hd.  It is only done when the management tool/user
decides explicitly to use backing files.  Also, the target is left in a
usable state after the blockjob completes.

I know the drive-add approach has been shelved in favor of directly
using drive-backup, but this patch seems a little too hacky IMO.

Stefan

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 4/7] block: simplify bdrv_drop_intermediate
  2013-07-04 14:02   ` Stefan Hajnoczi
@ 2013-07-05  1:00     ` Fam Zheng
  0 siblings, 0 replies; 21+ messages in thread
From: Fam Zheng @ 2013-07-05  1:00 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: kwolf, rjones, obarenbo, roliveri, hbrock, armbru, qemu-devel,
	pmyers, imain, stefanha, pbonzini

On Thu, 07/04 16:02, Stefan Hajnoczi wrote:
> On Tue, Jul 02, 2013 at 01:59:46PM +0800, Fam Zheng wrote:
> > Signed-off-by: Fam Zheng <famz@redhat.com>
> > ---
> >  block.c | 71 ++++++++++-------------------------------------------------------
> >  1 file changed, 11 insertions(+), 60 deletions(-)
> 
> Way too big for no commit description.  Okay, I admit I jumped into this
> patch and didn't review all the previous ones - maybe it would have been
> obvious if I had read them linearly :).
> 
> Should the description be something like "We no longer need to a keep a
> list because refcounts ..."?
> 
Yes, will add a decent message.

Thanks.

-- 
Fam

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 1/7] block: Convert BlockDriverState.in_use to refcount
  2013-07-02 10:21   ` Paolo Bonzini
@ 2013-07-08  8:37     ` Fam Zheng
  0 siblings, 0 replies; 21+ messages in thread
From: Fam Zheng @ 2013-07-08  8:37 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: kwolf, rjones, obarenbo, roliveri, hbrock, armbru, qemu-devel,
	pmyers, imain, stefanha

On Tue, 07/02 12:21, Paolo Bonzini wrote:
> Il 02/07/2013 07:59, Fam Zheng ha scritto:
> > Use numeric value to replace in_use flag in BDS, this will make
> > lifecycle management with ref count possible. This patch only replaces
> > existing uses of bdrv_set_in_use, so no logic change here.
> 
> This still does not entirely explain the rules for who sets in_use (or
> gets a reference) and who checks in_use.
> 
> Why should offline commit worry about the disk being shared, for
> example?  The reason "of course" is that a background job might modify
> bs->backing_hd while commit is running (or might expect bs->backing_hd
> to not change).  However, this is in no way related to a reference count.
> 
> So I think your series is doing two things right (setting the in_use
> flag for BDSes in the ->file and ->backing_hd chains; turning the in_use
> flag into a counter) and one wrong (tying bdrv_in_use checks to the
> lifecycle).  I wonder thus if we need two counters: the "in use" counter
> and the reference count for the lifecycle.

But the two are not orthognal, it's somehow like rlock and wlock.  What
I try is to simplify the (usual) logic that a BDS user calls
bdrv_get_ref() for lifecycle as well as bdrv_set_in_use() to secure the
state. With refcount as a sign for in_use, they just need inference if
its ref count is more than one: it's shared, don't modify it, and expect
it to change, like rwlock implies.

I think commit should worry about the disk being shared: We eventually
need the target of drive-backup to be exported through NBD, which means
it has a device id, and its backing_hd is also attached to guest device,
so we can't commit it. That said, I think a BDS being shared is an
evidence for not to make any change. Do we have any exception on this
rule?

For NBD there is close notifier, so I want to drop the refcount (PATCH
3/7).

> > -void bdrv_set_in_use(BlockDriverState *bs, int in_use)
> > +void bdrv_put_ref(BlockDriverState *bs)
> > +{
> > +    assert(bs->refcount > 0);
> > +    bs->refcount--;
> > +}
> > +
> > +void bdrv_get_ref(BlockDriverState *bs)
> >  {
> > -    assert(bs->in_use != in_use);
> > -    bs->in_use = in_use;
> > +    bs->refcount++;
> >  }
> 
> The convention that we use in QEMU is bdrv_ref/bdrv_unref.

I'm following drive_get_ref() here.

Thanks.

Fam

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2013-07-08  8:38 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-07-02  5:59 [Qemu-devel] [PATCH 0/7] Point-in-time snapshot exporting over NBD Fam Zheng
2013-07-02  5:59 ` [Qemu-devel] [PATCH 1/7] block: Convert BlockDriverState.in_use to refcount Fam Zheng
2013-07-02 10:21   ` Paolo Bonzini
2013-07-08  8:37     ` Fam Zheng
2013-07-02 19:41   ` Eric Blake
2013-07-03  0:59     ` Fam Zheng
2013-07-02  5:59 ` [Qemu-devel] [PATCH 2/7] block: use refcount to manage BlockDriverState lifecycle Fam Zheng
2013-07-02  5:59 ` [Qemu-devel] [PATCH 3/7] nbd: use BDS refcount Fam Zheng
2013-07-02 10:16   ` Paolo Bonzini
2013-07-03  1:10     ` Fam Zheng
2013-07-03  5:58       ` Paolo Bonzini
2013-07-03  6:30         ` Fam Zheng
2013-07-03  7:28           ` Paolo Bonzini
2013-07-02  5:59 ` [Qemu-devel] [PATCH 4/7] block: simplify bdrv_drop_intermediate Fam Zheng
2013-07-04 14:02   ` Stefan Hajnoczi
2013-07-05  1:00     ` Fam Zheng
2013-07-02  5:59 ` [Qemu-devel] [PATCH 5/7] block: rename bdrv_in_use to bdrv_is_shared Fam Zheng
2013-07-02  5:59 ` [Qemu-devel] [PATCH 6/7] block: add target-id option to drive-backup QMP command Fam Zheng
2013-07-02 19:59   ` Eric Blake
2013-07-02  5:59 ` [Qemu-devel] [PATCH 7/7] block: assign backing relationship in drive-backup Fam Zheng
2013-07-04 14:32   ` Stefan Hajnoczi

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.