All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v5 00/11] block: drive-backup live backup command
@ 2013-05-30 12:34 Stefan Hajnoczi
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 01/11] notify: add NotiferWithReturn so notifier list can abort Stefan Hajnoczi
                   ` (11 more replies)
  0 siblings, 12 replies; 59+ messages in thread
From: Stefan Hajnoczi @ 2013-05-30 12:34 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Fam Zheng, xiawenc, imain, Stefan Hajnoczi,
	Paolo Bonzini, dietmar

Note: These patches apply to kevin/block.  You can also grab the code from git
here:
git://github.com/stefanha/qemu.git block-backup-core

This series adds a new QMP command, drive-backup, which takes a point-in-time
snapshot of a block device.  The snapshot is copied out to a target block
device.  A simple example is:

  drive-backup device=virtio0 format=qcow2 target=backup-20130401.qcow2

The original drive-backup blockjob was written by Dietmar Maurer
<dietmar@proxmox.com>.  He is currently busy but I feel the feature is worth
pushing into QEMU since there has been interest.  This is my version of his
patch, plus the QMP command and qemu-iotests test case.

QMP 'transaction' support is included since v3.  It adds support for atomic
snapshots of multiple block devices.  I also added an 'abort' transaction to
allow testing of the .abort()/.cleanup() code path.  Thanks to Wenchao for
making qmp_transaction() extensible.

How is this different from block-stream and drive-mirror?
---------------------------------------------------------
Both block-stream and drive-mirror do not provide immediate point-in-time
snapshots.  Instead they copy data into a new file and then switch to it.  In
other words, the point at which the "snapshot" is taken cannot be controlled
directly.

drive-backup intercepts guest writes and saves data into the target block
device before it is overwritten.  The target block device can be a raw image
file, backing files are not used to implement this feature.

How can drive-backup be used?
-----------------------------
The simplest use-case is to copy a point-in-time snapshot to a local file.

More advanced users may wish to make the target an NBD URL.  The NBD server
listening on the other side can process the backup writes any way it wishes.  I
previously posted an RFC series with a backup server that streamed Dietmar's
VMA backup archive format.

What's next for drive-backup?
-----------------------------
1. Sync modes like drive-mirror (top, full, none).  This makes it possible to
   preserve the backing file chain.

v5:
 * Use bdrv_co_write_zeroes(job->target) [kwolf]

   This change means that we write zeroes over NBD again.  The optimization can
   be reintroduced by skipping zeroes when bdrv_has_zero_init() is true and the
   sectors are allocated.  Leave that for a future series, if we decide to do
   this optimization again because it may require extra block driver
   configuration to indicate that an image has zero init.

 * iostatus error handling [kwolf/pbonzini]

   Add configurable on-source-error and on-target-error actions just like
   drive-mirror.  These are used when the block job coroutine hits an error.
   If we are in guest write request context, return the errno and let the usual
   guest error handling take over.

 * Allow BdrvActionOps->commit() to be NULL [eblake]
 * Use bdrv_getlength() in qmp_drive_mirror() [kwolf]
 * Drop redundant proto_drv check [kwolf]
 * Fix outdated DPRINTF() function names
 * Drop comment about non-existent bitmap coroutine race [kwolf]
 * Rename BACKUP_SECTORS_PER_CLUSTER [kwolf]
 * Fix completion when image is not multiple of backup cluster size [fam]

v4:
 * Use notifier lists and BdrvTrackedRequest instead of custom callbacks [bonzini]
 * Add drive-backup QMP example JSON [eblake]
 * Add "Since: 1.6" to QMP schema changes [eblake]

v3:
 * Rename to drive-backup for consistency with drive-mirror [kwolf]
 * Add QMP transaction support [kwolf]
 * Introduce bdrv_add_before_write_cb() to hook writes
 * Mention 'query-block-jobs' lists job of type 'backup' [eblake]
 * Rename rwlock to flush_rwlock [kwolf]
 * Fix space in block/backup.c comment [kwolf]

v2:
 * s/block_backup/block-backup/ in commit message [eblake]
 * Avoid funny spacing in QMP docs [eblake]
 * Document query-block-jobs and block-job-cancel usage [eblake]

Dietmar Maurer (1):
  block: add basic backup support to block driver

Stefan Hajnoczi (10):
  notify: add NotiferWithReturn so notifier list can abort
  block: add bdrv_add_before_write_notifier()
  blockdev: drop redundant proto_drv check
  blockdev: use bdrv_getlength() in qmp_drive_mirror()
  block: add drive-backup QMP command
  blockdev: rename BlkTransactionStates to singular
  blockdev: allow BdrvActionOps->commit() to be NULL
  blockdev: add DriveBackup transaction
  blockdev: add Abort transaction
  qemu-iotests: add 055 drive-backup test case

 block.c                    |  23 +--
 block/Makefile.objs        |   1 +
 block/backup.c             | 355 +++++++++++++++++++++++++++++++++++++++++++++
 blockdev.c                 | 288 +++++++++++++++++++++++++++---------
 include/block/block_int.h  |  42 +++++-
 include/qemu/notify.h      |  25 ++++
 qapi-schema.json           |  97 ++++++++++++-
 qmp-commands.hx            |  46 ++++++
 tests/qemu-iotests/055     | 256 ++++++++++++++++++++++++++++++++
 tests/qemu-iotests/055.out |   5 +
 tests/qemu-iotests/group   |   1 +
 util/notify.c              |  30 ++++
 12 files changed, 1088 insertions(+), 81 deletions(-)
 create mode 100644 block/backup.c
 create mode 100755 tests/qemu-iotests/055
 create mode 100644 tests/qemu-iotests/055.out

-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Qemu-devel] [PATCH v5 01/11] notify: add NotiferWithReturn so notifier list can abort
  2013-05-30 12:34 [Qemu-devel] [PATCH v5 00/11] block: drive-backup live backup command Stefan Hajnoczi
@ 2013-05-30 12:34 ` Stefan Hajnoczi
  2013-05-30 22:27   ` Eric Blake
  2013-06-19 10:55   ` Kevin Wolf
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 02/11] block: add bdrv_add_before_write_notifier() Stefan Hajnoczi
                   ` (10 subsequent siblings)
  11 siblings, 2 replies; 59+ messages in thread
From: Stefan Hajnoczi @ 2013-05-30 12:34 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Fam Zheng, xiawenc, imain, Stefan Hajnoczi,
	Paolo Bonzini, dietmar

notifier_list_notify() has no return value.  This is fine when we just
want to invoke side-effects.

Sometimes it's useful for notifiers to produce a return value.  This
allows notifiers to "veto" an operation and will be used by the block
layer before-write notifier.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 include/qemu/notify.h | 25 +++++++++++++++++++++++++
 util/notify.c         | 30 ++++++++++++++++++++++++++++++
 2 files changed, 55 insertions(+)

diff --git a/include/qemu/notify.h b/include/qemu/notify.h
index 4e2e7f0..d3103e7 100644
--- a/include/qemu/notify.h
+++ b/include/qemu/notify.h
@@ -40,4 +40,29 @@ void notifier_remove(Notifier *notifier);
 
 void notifier_list_notify(NotifierList *list, void *data);
 
+/* Same as Notifier but allows .notify() to return errors */
+typedef struct NotifierWithReturn NotifierWithReturn;
+
+struct NotifierWithReturn {
+    int (*notify)(NotifierWithReturn *notifier, void *data);
+    QLIST_ENTRY(NotifierWithReturn) node;
+};
+
+typedef struct NotifierWithReturnList {
+    QLIST_HEAD(, NotifierWithReturn) notifiers;
+} NotifierWithReturnList;
+
+#define NOTIFIER_WITH_RETURN_LIST_INITIALIZER(head) \
+    { QLIST_HEAD_INITIALIZER((head).notifiers) }
+
+void notifier_with_return_list_init(NotifierWithReturnList *list);
+
+void notifier_with_return_list_add(NotifierWithReturnList *list,
+                                   NotifierWithReturn *notifier);
+
+void notifier_with_return_remove(NotifierWithReturn *notifier);
+
+int notifier_with_return_list_notify(NotifierWithReturnList *list,
+                                     void *data);
+
 #endif
diff --git a/util/notify.c b/util/notify.c
index 7b7692a..f215dfc 100644
--- a/util/notify.c
+++ b/util/notify.c
@@ -39,3 +39,33 @@ void notifier_list_notify(NotifierList *list, void *data)
         notifier->notify(notifier, data);
     }
 }
+
+void notifier_with_return_list_init(NotifierWithReturnList *list)
+{
+    QLIST_INIT(&list->notifiers);
+}
+
+void notifier_with_return_list_add(NotifierWithReturnList *list,
+                                   NotifierWithReturn *notifier)
+{
+    QLIST_INSERT_HEAD(&list->notifiers, notifier, node);
+}
+
+void notifier_with_return_remove(NotifierWithReturn *notifier)
+{
+    QLIST_REMOVE(notifier, node);
+}
+
+int notifier_with_return_list_notify(NotifierWithReturnList *list, void *data)
+{
+    NotifierWithReturn *notifier, *next;
+    int ret = 0;
+
+    QLIST_FOREACH_SAFE(notifier, &list->notifiers, node, next) {
+        ret = notifier->notify(notifier, data);
+        if (ret != 0) {
+            break;
+        }
+    }
+    return ret;
+}
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [Qemu-devel] [PATCH v5 02/11] block: add bdrv_add_before_write_notifier()
  2013-05-30 12:34 [Qemu-devel] [PATCH v5 00/11] block: drive-backup live backup command Stefan Hajnoczi
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 01/11] notify: add NotiferWithReturn so notifier list can abort Stefan Hajnoczi
@ 2013-05-30 12:34 ` Stefan Hajnoczi
  2013-05-30 22:47   ` Eric Blake
  2013-06-19 10:56   ` Kevin Wolf
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 03/11] block: add basic backup support to block driver Stefan Hajnoczi
                   ` (9 subsequent siblings)
  11 siblings, 2 replies; 59+ messages in thread
From: Stefan Hajnoczi @ 2013-05-30 12:34 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Fam Zheng, xiawenc, imain, Stefan Hajnoczi,
	Paolo Bonzini, dietmar

The bdrv_add_before_write_notifier() function installs a callback that
is invoked before a write request is processed.  This will be used to
implement copy-on-write point-in-time snapshots where we need to copy
out old data before overwriting it.

Note that BdrvTrackedRequest is moved to block_int.h since it is passed
to .notify() functions.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block.c                   | 23 ++++++++++++-----------
 include/block/block_int.h | 23 ++++++++++++++++++++++-
 2 files changed, 34 insertions(+), 12 deletions(-)

diff --git a/block.c b/block.c
index 65c0b60..9e43f20 100644
--- a/block.c
+++ b/block.c
@@ -308,6 +308,7 @@ BlockDriverState *bdrv_new(const char *device_name)
     }
     bdrv_iostatus_disable(bs);
     notifier_list_init(&bs->close_notifiers);
+    notifier_with_return_list_init(&bs->before_write_notifiers);
 
     return bs;
 }
@@ -1850,16 +1851,6 @@ int bdrv_commit_all(void)
     return 0;
 }
 
-struct BdrvTrackedRequest {
-    BlockDriverState *bs;
-    int64_t sector_num;
-    int nb_sectors;
-    bool is_write;
-    QLIST_ENTRY(BdrvTrackedRequest) list;
-    Coroutine *co; /* owner, used for deadlock detection */
-    CoQueue wait_queue; /* coroutines blocked on this request */
-};
-
 /**
  * Remove an active request from the tracked requests list
  *
@@ -2630,7 +2621,11 @@ static int coroutine_fn bdrv_co_do_writev(BlockDriverState *bs,
 
     tracked_request_begin(&req, bs, sector_num, nb_sectors, true);
 
-    if (flags & BDRV_REQ_ZERO_WRITE) {
+    ret = notifier_with_return_list_notify(&bs->before_write_notifiers, &req);
+
+    if (ret < 0) {
+        /* Do nothing, write notifier decided to fail this request */
+    } else if (flags & BDRV_REQ_ZERO_WRITE) {
         ret = bdrv_co_do_write_zeroes(bs, sector_num, nb_sectors);
     } else {
         ret = drv->bdrv_co_writev(bs, sector_num, nb_sectors, qiov);
@@ -4894,3 +4889,9 @@ AioContext *bdrv_get_aio_context(BlockDriverState *bs)
     /* Currently BlockDriverState always uses the main loop AioContext */
     return qemu_get_aio_context();
 }
+
+void bdrv_add_before_write_notifier(BlockDriverState *bs,
+                                    NotifierWithReturn *notifier)
+{
+    notifier_with_return_list_add(&bs->before_write_notifiers, notifier);
+}
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 6078dd3..440d510 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -58,7 +58,16 @@
 #define BLOCK_OPT_LAZY_REFCOUNTS    "lazy_refcounts"
 #define BLOCK_OPT_ADAPTER_TYPE      "adapter_type"
 
-typedef struct BdrvTrackedRequest BdrvTrackedRequest;
+typedef struct BdrvTrackedRequest {
+    BlockDriverState *bs;
+    int64_t sector_num;
+    int nb_sectors;
+    bool is_write;
+    QLIST_ENTRY(BdrvTrackedRequest) list;
+    Coroutine *co; /* owner, used for deadlock detection */
+    CoQueue wait_queue; /* coroutines blocked on this request */
+} BdrvTrackedRequest;
+
 
 typedef struct BlockIOLimit {
     int64_t bps[3];
@@ -247,6 +256,9 @@ struct BlockDriverState {
 
     NotifierList close_notifiers;
 
+    /* Callback before write request is processed */
+    NotifierWithReturnList before_write_notifiers;
+
     /* number of in-flight copy-on-read requests */
     unsigned int copy_on_read_in_flight;
 
@@ -298,6 +310,15 @@ void bdrv_set_io_limits(BlockDriverState *bs,
                         BlockIOLimit *io_limits);
 
 /**
+ * bdrv_add_before_write_notifier:
+ *
+ * Register a callback that is invoked before write requests are processed but
+ * after any throttling or waiting for overlapping requests.
+ */
+void bdrv_add_before_write_notifier(BlockDriverState *bs,
+                                    NotifierWithReturn *notifier);
+
+/**
  * bdrv_get_aio_context:
  *
  * Returns: the currently bound #AioContext
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [Qemu-devel] [PATCH v5 03/11] block: add basic backup support to block driver
  2013-05-30 12:34 [Qemu-devel] [PATCH v5 00/11] block: drive-backup live backup command Stefan Hajnoczi
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 01/11] notify: add NotiferWithReturn so notifier list can abort Stefan Hajnoczi
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 02/11] block: add bdrv_add_before_write_notifier() Stefan Hajnoczi
@ 2013-05-30 12:34 ` Stefan Hajnoczi
  2013-06-06  3:56   ` Fam Zheng
                     ` (3 more replies)
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 04/11] blockdev: drop redundant proto_drv check Stefan Hajnoczi
                   ` (8 subsequent siblings)
  11 siblings, 4 replies; 59+ messages in thread
From: Stefan Hajnoczi @ 2013-05-30 12:34 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Fam Zheng, xiawenc, imain, Stefan Hajnoczi,
	Paolo Bonzini, dietmar

From: Dietmar Maurer <dietmar@proxmox.com>

backup_start() creates a block job that copies a point-in-time snapshot
of a block device to a target block device.

We call backup_do_cow() for each write during backup. That function
reads the original data from the block device before it gets
overwritten.  The data is then written to the target device.

Currently backup cluster size is hardcoded to 65536 bytes.

[I made a number of changes to Dietmar's original patch and folded them
in to make code review easy.  Here is the full list:

 * Drop BackupDumpFunc interface in favor of a target block device
 * Detect zero clusters with buffer_is_zero() and use bdrv_co_write_zeroes()
 * Use 0 delay instead of 1us, like other block jobs
 * Unify creation/start functions into backup_start()
 * Simplify cleanup, free bitmap in backup_run() instead of cb
 * function
 * Use HBitmap to avoid duplicating bitmap code
 * Use bdrv_getlength() instead of accessing ->total_sectors
 * directly
 * Delete the backup.h header file, it is no longer necessary
 * Move ./backup.c to block/backup.c
 * Remove #ifdefed out code
 * Coding style and whitespace cleanups
 * Use bdrv_add_before_write_notifier() instead of blockjob-specific hooks
 * Keep our own in-flight CowRequest list instead of using block.c
   tracked requests.  This means a little code duplication but is much
   simpler than trying to share the tracked requests list and use the
   backup block size.
 * Add on_source_error and on_target_error error handling.

-- stefanha]

Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

backup size fixes

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/Makefile.objs       |   1 +
 block/backup.c            | 355 ++++++++++++++++++++++++++++++++++++++++++++++
 include/block/block_int.h |  19 +++
 3 files changed, 375 insertions(+)
 create mode 100644 block/backup.c

diff --git a/block/Makefile.objs b/block/Makefile.objs
index 5f0358a..88bd101 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -20,5 +20,6 @@ endif
 common-obj-y += stream.o
 common-obj-y += commit.o
 common-obj-y += mirror.o
+common-obj-y += backup.o
 
 $(obj)/curl.o: QEMU_CFLAGS+=$(CURL_CFLAGS)
diff --git a/block/backup.c b/block/backup.c
new file mode 100644
index 0000000..466744a
--- /dev/null
+++ b/block/backup.c
@@ -0,0 +1,355 @@
+/*
+ * QEMU backup
+ *
+ * Copyright (C) 2013 Proxmox Server Solutions
+ *
+ * Authors:
+ *  Dietmar Maurer (dietmar@proxmox.com)
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include <stdio.h>
+#include <errno.h>
+#include <unistd.h>
+
+#include "block/block.h"
+#include "block/block_int.h"
+#include "block/blockjob.h"
+#include "qemu/ratelimit.h"
+
+#define DEBUG_BACKUP 0
+
+#define DPRINTF(fmt, ...) \
+    do { \
+        if (DEBUG_BACKUP) { \
+            fprintf(stderr, "backup: " fmt, ## __VA_ARGS__); \
+        } \
+    } while (0)
+
+#define BACKUP_CLUSTER_BITS 16
+#define BACKUP_CLUSTER_SIZE (1 << BACKUP_CLUSTER_BITS)
+#define BACKUP_SECTORS_PER_CLUSTER (BACKUP_CLUSTER_SIZE / BDRV_SECTOR_SIZE)
+
+#define SLICE_TIME 100000000ULL /* ns */
+
+typedef struct CowRequest {
+    int64_t start;
+    int64_t end;
+    QLIST_ENTRY(CowRequest) list;
+    CoQueue wait_queue; /* coroutines blocked on this request */
+} CowRequest;
+
+typedef struct BackupBlockJob {
+    BlockJob common;
+    BlockDriverState *target;
+    RateLimit limit;
+    BlockdevOnError on_source_error;
+    BlockdevOnError on_target_error;
+    CoRwlock flush_rwlock;
+    uint64_t sectors_read;
+    HBitmap *bitmap;
+    QLIST_HEAD(, CowRequest) inflight_reqs;
+} BackupBlockJob;
+
+/* See if in-flight requests overlap and wait for them to complete */
+static void coroutine_fn wait_for_overlapping_requests(BackupBlockJob *job,
+                                                       int64_t start,
+                                                       int64_t end)
+{
+    CowRequest *req;
+    bool retry;
+
+    do {
+        retry = false;
+        QLIST_FOREACH(req, &job->inflight_reqs, list) {
+            if (end > req->start && start < req->end) {
+                qemu_co_queue_wait(&req->wait_queue);
+                retry = true;
+                break;
+            }
+        }
+    } while (retry);
+}
+
+/* Keep track of an in-flight request */
+static void cow_request_begin(CowRequest *req, BackupBlockJob *job,
+                                     int64_t start, int64_t end)
+{
+    req->start = start;
+    req->end = end;
+    qemu_co_queue_init(&req->wait_queue);
+    QLIST_INSERT_HEAD(&job->inflight_reqs, req, list);
+}
+
+/* Forget about a completed request */
+static void cow_request_end(CowRequest *req)
+{
+    QLIST_REMOVE(req, list);
+    qemu_co_queue_restart_all(&req->wait_queue);
+}
+
+static int coroutine_fn backup_do_cow(BlockDriverState *bs,
+                                      int64_t sector_num, int nb_sectors,
+                                      bool *error_is_read)
+{
+    BackupBlockJob *job = (BackupBlockJob *)bs->job;
+    CowRequest cow_request;
+    struct iovec iov;
+    QEMUIOVector bounce_qiov;
+    void *bounce_buffer = NULL;
+    int ret = 0;
+    int64_t start, end, total_sectors;
+    int n;
+
+    qemu_co_rwlock_rdlock(&job->flush_rwlock);
+
+    start = sector_num / BACKUP_SECTORS_PER_CLUSTER;
+    end = DIV_ROUND_UP(sector_num + nb_sectors, BACKUP_SECTORS_PER_CLUSTER);
+
+    DPRINTF("%s enter %s C%" PRId64 " %" PRId64 " %d\n",
+            __func__, bdrv_get_device_name(bs), start, sector_num, nb_sectors);
+
+    wait_for_overlapping_requests(job, start, end);
+    cow_request_begin(&cow_request, job, start, end);
+
+    total_sectors = bdrv_getlength(bs);
+    if (total_sectors < 0) {
+        if (error_is_read) {
+            *error_is_read = true;
+        }
+        ret = total_sectors;
+        goto out;
+    }
+    total_sectors /= BDRV_SECTOR_SIZE;
+
+    for (; start < end; start++) {
+        if (hbitmap_get(job->bitmap, start)) {
+            DPRINTF("%s skip C%" PRId64 "\n", __func__, start);
+            continue; /* already copied */
+        }
+
+        DPRINTF("%s C%" PRId64 "\n", __func__, start);
+
+        n = MIN(BACKUP_SECTORS_PER_CLUSTER,
+                total_sectors - start * BACKUP_SECTORS_PER_CLUSTER);
+
+        if (!bounce_buffer) {
+            bounce_buffer = qemu_blockalign(bs, BACKUP_CLUSTER_SIZE);
+        }
+        iov.iov_base = bounce_buffer;
+        iov.iov_len = n * BDRV_SECTOR_SIZE;
+        qemu_iovec_init_external(&bounce_qiov, &iov, 1);
+
+        ret = bdrv_co_readv(bs, start * BACKUP_SECTORS_PER_CLUSTER, n,
+                            &bounce_qiov);
+        if (ret < 0) {
+            DPRINTF("%s bdrv_co_readv C%" PRId64 " failed\n", __func__, start);
+            if (error_is_read) {
+                *error_is_read = true;
+            }
+            goto out;
+        }
+
+        if (buffer_is_zero(iov.iov_base, iov.iov_len)) {
+            ret = bdrv_co_write_zeroes(job->target,
+                                       start * BACKUP_SECTORS_PER_CLUSTER, n);
+        } else {
+            ret = bdrv_co_writev(job->target,
+                                 start * BACKUP_SECTORS_PER_CLUSTER, n,
+                                 &bounce_qiov);
+        }
+        if (ret < 0) {
+            DPRINTF("%s write C%" PRId64 " failed\n", __func__, start);
+            if (error_is_read) {
+                *error_is_read = false;
+            }
+            goto out;
+        }
+
+        hbitmap_set(job->bitmap, start, 1);
+
+        /* Publish progress */
+        job->sectors_read += n;
+        job->common.offset += n * BDRV_SECTOR_SIZE;
+
+        DPRINTF("%s done C%" PRId64 "\n", __func__, start);
+    }
+
+out:
+    if (bounce_buffer) {
+        qemu_vfree(bounce_buffer);
+    }
+
+    cow_request_end(&cow_request);
+
+    qemu_co_rwlock_unlock(&job->flush_rwlock);
+
+    return ret;
+}
+
+static int coroutine_fn backup_before_write_notify(
+        NotifierWithReturn *notifier,
+        void *opaque)
+{
+    BdrvTrackedRequest *req = opaque;
+
+    return backup_do_cow(req->bs, req->sector_num, req->nb_sectors, NULL);
+}
+
+static void backup_set_speed(BlockJob *job, int64_t speed, Error **errp)
+{
+    BackupBlockJob *s = container_of(job, BackupBlockJob, common);
+
+    if (speed < 0) {
+        error_set(errp, QERR_INVALID_PARAMETER, "speed");
+        return;
+    }
+    ratelimit_set_speed(&s->limit, speed / BDRV_SECTOR_SIZE, SLICE_TIME);
+}
+
+static void backup_iostatus_reset(BlockJob *job)
+{
+    BackupBlockJob *s = container_of(job, BackupBlockJob, common);
+
+    bdrv_iostatus_reset(s->target);
+}
+
+static BlockJobType backup_job_type = {
+    .instance_size = sizeof(BackupBlockJob),
+    .job_type = "backup",
+    .set_speed = backup_set_speed,
+    .iostatus_reset = backup_iostatus_reset,
+};
+
+static BlockErrorAction backup_error_action(BackupBlockJob *job,
+                                            bool read, int error)
+{
+    if (read) {
+        return block_job_error_action(&job->common, job->common.bs,
+                                      job->on_source_error, true, error);
+    } else {
+        return block_job_error_action(&job->common, job->target,
+                                      job->on_target_error, false, error);
+    }
+}
+
+static void coroutine_fn backup_run(void *opaque)
+{
+    BackupBlockJob *job = opaque;
+    BlockDriverState *bs = job->common.bs;
+    BlockDriverState *target = job->target;
+    BlockdevOnError on_target_error = job->on_target_error;
+    NotifierWithReturn before_write = {
+        .notify = backup_before_write_notify,
+    };
+    int64_t start, end;
+    int ret = 0;
+
+    QLIST_INIT(&job->inflight_reqs);
+    qemu_co_rwlock_init(&job->flush_rwlock);
+
+    start = 0;
+    end = DIV_ROUND_UP(bdrv_getlength(bs) / BDRV_SECTOR_SIZE,
+                       BACKUP_SECTORS_PER_CLUSTER);
+
+    job->bitmap = hbitmap_alloc(end, 0);
+
+    bdrv_set_on_error(target, on_target_error, on_target_error);
+    bdrv_iostatus_enable(target);
+
+    bdrv_add_before_write_notifier(bs, &before_write);
+
+    DPRINTF("backup_run start %s %" PRId64 " %" PRId64 "\n",
+            bdrv_get_device_name(bs), start, end);
+
+    for (; start < end; start++) {
+        bool error_is_read;
+
+        if (block_job_is_cancelled(&job->common)) {
+            break;
+        }
+
+        /* we need to yield so that qemu_aio_flush() returns.
+         * (without, VM does not reboot)
+         */
+        if (job->common.speed) {
+            uint64_t delay_ns = ratelimit_calculate_delay(
+                &job->limit, job->sectors_read);
+            job->sectors_read = 0;
+            block_job_sleep_ns(&job->common, rt_clock, delay_ns);
+        } else {
+            block_job_sleep_ns(&job->common, rt_clock, 0);
+        }
+
+        if (block_job_is_cancelled(&job->common)) {
+            break;
+        }
+
+        DPRINTF("backup_run loop C%" PRId64 "\n", start);
+
+        ret = backup_do_cow(bs, start * BACKUP_SECTORS_PER_CLUSTER, 1,
+                            &error_is_read);
+        if (ret < 0) {
+            /* Depending on error action, fail now or retry cluster */
+            BlockErrorAction action =
+                backup_error_action(job, error_is_read, -ret);
+            if (action == BDRV_ACTION_REPORT) {
+                break;
+            } else {
+                start--;
+                continue;
+            }
+        }
+    }
+
+    notifier_with_return_remove(&before_write);
+
+    /* wait until pending backup_do_cow() calls have completed */
+    qemu_co_rwlock_wrlock(&job->flush_rwlock);
+    qemu_co_rwlock_unlock(&job->flush_rwlock);
+
+    hbitmap_free(job->bitmap);
+
+    bdrv_iostatus_disable(target);
+    bdrv_delete(job->target);
+
+    DPRINTF("backup_run complete %d\n", ret);
+    block_job_completed(&job->common, ret);
+}
+
+void backup_start(BlockDriverState *bs, BlockDriverState *target,
+                  int64_t speed,
+                  BlockdevOnError on_source_error,
+                  BlockdevOnError on_target_error,
+                  BlockDriverCompletionFunc *cb, void *opaque,
+                  Error **errp)
+{
+    assert(bs);
+    assert(target);
+    assert(cb);
+
+    DPRINTF("backup_start %s\n", bdrv_get_device_name(bs));
+
+    if ((on_source_error == BLOCKDEV_ON_ERROR_STOP ||
+         on_source_error == BLOCKDEV_ON_ERROR_ENOSPC) &&
+        !bdrv_iostatus_is_enabled(bs)) {
+        error_set(errp, QERR_INVALID_PARAMETER, "on-source-error");
+        return;
+    }
+
+    BackupBlockJob *job = block_job_create(&backup_job_type, bs, speed,
+                                           cb, opaque, errp);
+    if (!job) {
+        return;
+    }
+
+    job->on_source_error = on_source_error;
+    job->on_target_error = on_target_error;
+    job->target = target;
+    job->common.len = bdrv_getlength(bs);
+    job->common.co = qemu_coroutine_create(backup_run);
+    qemu_coroutine_enter(job->common.co, job);
+}
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 440d510..dccf1d2 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -398,4 +398,23 @@ void mirror_start(BlockDriverState *bs, BlockDriverState *target,
                   BlockDriverCompletionFunc *cb,
                   void *opaque, Error **errp);
 
+/*
+ * backup_start:
+ * @bs: Block device to operate on.
+ * @target: Block device to write to.
+ * @speed: The maximum speed, in bytes per second, or 0 for unlimited.
+ * @on_source_error: The action to take upon error reading from the source.
+ * @on_target_error: The action to take upon error writing to the target.
+ * @cb: Completion function for the job.
+ * @opaque: Opaque pointer value passed to @cb.
+ *
+ * Start a backup operation on @bs.  Clusters in @bs are written to @target
+ * until the job is cancelled or manually completed.
+ */
+void backup_start(BlockDriverState *bs, BlockDriverState *target,
+                  int64_t speed, BlockdevOnError on_source_error,
+                  BlockdevOnError on_target_error,
+                  BlockDriverCompletionFunc *cb, void *opaque,
+                  Error **errp);
+
 #endif /* BLOCK_INT_H */
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [Qemu-devel] [PATCH v5 04/11] blockdev: drop redundant proto_drv check
  2013-05-30 12:34 [Qemu-devel] [PATCH v5 00/11] block: drive-backup live backup command Stefan Hajnoczi
                   ` (2 preceding siblings ...)
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 03/11] block: add basic backup support to block driver Stefan Hajnoczi
@ 2013-05-30 12:34 ` Stefan Hajnoczi
  2013-06-03 17:14   ` Eric Blake
                     ` (2 more replies)
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 05/11] blockdev: use bdrv_getlength() in qmp_drive_mirror() Stefan Hajnoczi
                   ` (7 subsequent siblings)
  11 siblings, 3 replies; 59+ messages in thread
From: Stefan Hajnoczi @ 2013-05-30 12:34 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Fam Zheng, xiawenc, imain, Stefan Hajnoczi,
	Paolo Bonzini, dietmar

It is not necessary to check that we can find a protocol block driver
since we create or open the image file.  This produces the error that we
need anyway.

Besides, the QERR_INVALID_BLOCK_FORMAT is inappropriate since the
protocol is incorrect rather than the format.

Also drop an empty line between bdrv_open() and checking its return
value.  This may be due to copy-pasting from earlier code that performed
other operations before handling errors.

Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 blockdev.c | 15 ---------------
 1 file changed, 15 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index b9b2d10..01db519 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -818,7 +818,6 @@ typedef struct ExternalSnapshotStates {
 static void external_snapshot_prepare(BlkTransactionStates *common,
                                       Error **errp)
 {
-    BlockDriver *proto_drv;
     BlockDriver *drv;
     int flags, ret;
     Error *local_err = NULL;
@@ -874,12 +873,6 @@ static void external_snapshot_prepare(BlkTransactionStates *common,
 
     flags = states->old_bs->open_flags;
 
-    proto_drv = bdrv_find_protocol(new_image_file);
-    if (!proto_drv) {
-        error_set(errp, QERR_INVALID_BLOCK_FORMAT, format);
-        return;
-    }
-
     /* create new image w/backing file */
     if (mode != NEW_IMAGE_MODE_EXISTING) {
         bdrv_img_create(new_image_file, format,
@@ -1368,7 +1361,6 @@ void qmp_drive_mirror(const char *device, const char *target,
 {
     BlockDriverState *bs;
     BlockDriverState *source, *target_bs;
-    BlockDriver *proto_drv;
     BlockDriver *drv = NULL;
     Error *local_err = NULL;
     int flags;
@@ -1436,12 +1428,6 @@ void qmp_drive_mirror(const char *device, const char *target,
         sync = MIRROR_SYNC_MODE_FULL;
     }
 
-    proto_drv = bdrv_find_protocol(target);
-    if (!proto_drv) {
-        error_set(errp, QERR_INVALID_BLOCK_FORMAT, format);
-        return;
-    }
-
     bdrv_get_geometry(bs, &size);
     size *= 512;
     if (sync == MIRROR_SYNC_MODE_FULL && mode != NEW_IMAGE_MODE_EXISTING) {
@@ -1476,7 +1462,6 @@ void qmp_drive_mirror(const char *device, const char *target,
      */
     target_bs = bdrv_new("");
     ret = bdrv_open(target_bs, target, NULL, flags | BDRV_O_NO_BACKING, drv);
-
     if (ret < 0) {
         bdrv_delete(target_bs);
         error_set(errp, QERR_OPEN_FILE_FAILED, target);
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [Qemu-devel] [PATCH v5 05/11] blockdev: use bdrv_getlength() in qmp_drive_mirror()
  2013-05-30 12:34 [Qemu-devel] [PATCH v5 00/11] block: drive-backup live backup command Stefan Hajnoczi
                   ` (3 preceding siblings ...)
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 04/11] blockdev: drop redundant proto_drv check Stefan Hajnoczi
@ 2013-05-30 12:34 ` Stefan Hajnoczi
  2013-05-30 13:24   ` Paolo Bonzini
                     ` (2 more replies)
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 06/11] block: add drive-backup QMP command Stefan Hajnoczi
                   ` (6 subsequent siblings)
  11 siblings, 3 replies; 59+ messages in thread
From: Stefan Hajnoczi @ 2013-05-30 12:34 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Fam Zheng, xiawenc, imain, Stefan Hajnoczi,
	Paolo Bonzini, dietmar

Use bdrv_getlength() for its byte units and error return instead of
bdrv_get_geometry().

Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 blockdev.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 01db519..73b175b 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1364,7 +1364,7 @@ void qmp_drive_mirror(const char *device, const char *target,
     BlockDriver *drv = NULL;
     Error *local_err = NULL;
     int flags;
-    uint64_t size;
+    int64_t size;
     int ret;
 
     if (!has_speed) {
@@ -1428,8 +1428,12 @@ void qmp_drive_mirror(const char *device, const char *target,
         sync = MIRROR_SYNC_MODE_FULL;
     }
 
-    bdrv_get_geometry(bs, &size);
-    size *= 512;
+    size = bdrv_getlength(bs);
+    if (size < 0) {
+        error_setg_errno(errp, -size, "bdrv_getlength failed");
+        return;
+    }
+
     if (sync == MIRROR_SYNC_MODE_FULL && mode != NEW_IMAGE_MODE_EXISTING) {
         /* create new image w/o backing file */
         assert(format && drv);
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [Qemu-devel] [PATCH v5 06/11] block: add drive-backup QMP command
  2013-05-30 12:34 [Qemu-devel] [PATCH v5 00/11] block: drive-backup live backup command Stefan Hajnoczi
                   ` (4 preceding siblings ...)
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 05/11] blockdev: use bdrv_getlength() in qmp_drive_mirror() Stefan Hajnoczi
@ 2013-05-30 12:34 ` Stefan Hajnoczi
  2013-06-03 17:09   ` Eric Blake
  2013-06-19 11:07   ` Kevin Wolf
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 07/11] blockdev: rename BlkTransactionStates to singular Stefan Hajnoczi
                   ` (5 subsequent siblings)
  11 siblings, 2 replies; 59+ messages in thread
From: Stefan Hajnoczi @ 2013-05-30 12:34 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Fam Zheng, xiawenc, imain, Stefan Hajnoczi,
	Paolo Bonzini, dietmar

@drive-backup

Start a point-in-time copy of a block device to a new destination.  The
status of ongoing drive-backup operations can be checked with
query-block-jobs where the BlockJobInfo.type field has the value 'backup'.
The operation can be stopped before it has completed using the
block-job-cancel command.

@device: the name of the device which should be copied.

@target: the target of the new image. If the file exists, or if it
         is a device, the existing file/device will be used as the new
         destination.  If it does not exist, a new file will be created.

@format: #optional the format of the new destination, default is to
         probe if @mode is 'existing', else the format of the source

@mode: #optional whether and how QEMU should create a new image, default is
       'absolute-paths'.

@speed: #optional the maximum speed, in bytes per second

@on-source-error: #optional the action to take on an error on the source,
                  default 'report'.  'stop' and 'enospc' can only be used
                  if the block device supports io-status (see BlockInfo).

@on-target-error: #optional the action to take on an error on the target,
                  default 'report' (no limitations, since this applies to
                  a different block device than @device).

Note that @on-source-error and @on-target-error only affect background I/O.
If an error occurs during a guest write request, the device's rerror/werror
actions will be used.

Returns: nothing on success
         If @device is not a valid block device, DeviceNotFound

Since 1.6

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 blockdev.c       | 97 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 qapi-schema.json | 46 +++++++++++++++++++++++++++
 qmp-commands.hx  | 46 +++++++++++++++++++++++++++
 3 files changed, 189 insertions(+)

diff --git a/blockdev.c b/blockdev.c
index 73b175b..5de33f5 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1346,6 +1346,103 @@ void qmp_block_commit(const char *device,
     drive_get_ref(drive_get_by_blockdev(bs));
 }
 
+void qmp_drive_backup(const char *device, const char *target,
+                      bool has_format, const char *format,
+                      bool has_mode, enum NewImageMode mode,
+                      bool has_speed, int64_t speed,
+                      bool has_on_source_error, BlockdevOnError on_source_error,
+                      bool has_on_target_error, BlockdevOnError on_target_error,
+                      Error **errp)
+{
+    BlockDriverState *bs;
+    BlockDriverState *target_bs;
+    BlockDriver *drv = NULL;
+    Error *local_err = NULL;
+    int flags;
+    int64_t size;
+    int ret;
+
+    if (!has_speed) {
+        speed = 0;
+    }
+    if (!has_on_source_error) {
+        on_source_error = BLOCKDEV_ON_ERROR_REPORT;
+    }
+    if (!has_on_target_error) {
+        on_target_error = BLOCKDEV_ON_ERROR_REPORT;
+    }
+    if (!has_mode) {
+        mode = NEW_IMAGE_MODE_ABSOLUTE_PATHS;
+    }
+
+    bs = bdrv_find(device);
+    if (!bs) {
+        error_set(errp, QERR_DEVICE_NOT_FOUND, device);
+        return;
+    }
+
+    if (!bdrv_is_inserted(bs)) {
+        error_set(errp, QERR_DEVICE_HAS_NO_MEDIUM, device);
+        return;
+    }
+
+    if (!has_format) {
+        format = mode == NEW_IMAGE_MODE_EXISTING ? NULL : bs->drv->format_name;
+    }
+    if (format) {
+        drv = bdrv_find_format(format);
+        if (!drv) {
+            error_set(errp, QERR_INVALID_BLOCK_FORMAT, format);
+            return;
+        }
+    }
+
+    if (bdrv_in_use(bs)) {
+        error_set(errp, QERR_DEVICE_IN_USE, device);
+        return;
+    }
+
+    flags = bs->open_flags | BDRV_O_RDWR;
+
+    size = bdrv_getlength(bs);
+    if (size < 0) {
+        error_setg_errno(errp, -size, "bdrv_getlength failed");
+        return;
+    }
+
+    if (mode != NEW_IMAGE_MODE_EXISTING) {
+        assert(format && drv);
+        bdrv_img_create(target, format,
+                        NULL, NULL, NULL, size, flags, &local_err, false);
+    }
+
+    if (error_is_set(&local_err)) {
+        error_propagate(errp, local_err);
+        return;
+    }
+
+    target_bs = bdrv_new("");
+    ret = bdrv_open(target_bs, target, NULL, flags, drv);
+    if (ret < 0) {
+        bdrv_delete(target_bs);
+        error_set(errp, QERR_OPEN_FILE_FAILED, target);
+        return;
+    }
+
+    backup_start(bs, target_bs, speed, on_source_error, on_target_error,
+                 block_job_cb, bs, &local_err);
+    if (local_err != NULL) {
+        bdrv_delete(target_bs);
+        error_propagate(errp, local_err);
+        return;
+    }
+
+    /* Grab a reference so hotplug does not delete the BlockDriverState from
+     * underneath us.
+     */
+    drive_get_ref(drive_get_by_blockdev(bs));
+}
+
 #define DEFAULT_MIRROR_BUF_SIZE   (10 << 20)
 
 void qmp_drive_mirror(const char *device, const char *target,
diff --git a/qapi-schema.json b/qapi-schema.json
index ef1f657..55b1a37 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1730,6 +1730,52 @@
             '*speed': 'int' } }
 
 ##
+# @drive-backup
+#
+# Start a point-in-time copy of a block device to a new destination.  The
+# status of ongoing drive-backup operations can be checked with
+# query-block-jobs where the BlockJobInfo.type field has the value 'backup'.
+# The operation can be stopped before it has completed using the
+# block-job-cancel command.
+#
+# @device: the name of the device which should be copied.
+#
+# @target: the target of the new image. If the file exists, or if it
+#          is a device, the existing file/device will be used as the new
+#          destination.  If it does not exist, a new file will be created.
+#
+# @format: #optional the format of the new destination, default is to
+#          probe if @mode is 'existing', else the format of the source
+#
+# @mode: #optional whether and how QEMU should create a new image, default is
+#        'absolute-paths'.
+#
+# @speed: #optional the maximum speed, in bytes per second
+#
+# @on-source-error: #optional the action to take on an error on the source,
+#                   default 'report'.  'stop' and 'enospc' can only be used
+#                   if the block device supports io-status (see BlockInfo).
+#
+# @on-target-error: #optional the action to take on an error on the target,
+#                   default 'report' (no limitations, since this applies to
+#                   a different block device than @device).
+#
+# Note that @on-source-error and @on-target-error only affect background I/O.
+# If an error occurs during a guest write request, the device's rerror/werror
+# actions will be used.
+#
+# Returns: nothing on success
+#          If @device is not a valid block device, DeviceNotFound
+#
+# Since 1.6
+##
+{ 'command': 'drive-backup',
+  'data': { 'device': 'str', 'target': 'str', '*format': 'str',
+            '*mode': 'NewImageMode', '*speed': 'int',
+            '*on-source-error': 'BlockdevOnError',
+            '*on-target-error': 'BlockdevOnError' } }
+
+##
 # @drive-mirror
 #
 # Start mirroring a block device's writes to a new destination.
diff --git a/qmp-commands.hx b/qmp-commands.hx
index ffd130e..d1f7766 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -912,6 +912,52 @@ EQMP
     },
 
     {
+        .name       = "drive-backup",
+        .args_type  = "device:B,target:s,speed:i?,mode:s?,format:s?,"
+                      "on-source-error:s?,on-target-error:s?",
+        .mhandler.cmd_new = qmp_marshal_input_drive_backup,
+    },
+
+SQMP
+drive-backup
+------------
+
+Start a point-in-time copy of a block device to a new destination.  The
+status of ongoing drive-backup operations can be checked with
+query-block-jobs where the BlockJobInfo.type field has the value 'backup'.
+The operation can be stopped before it has completed using the
+block-job-cancel command.
+
+Arguments:
+
+- "device": the name of the device which should be copied.
+            (json-string)
+- "target": the target of the new image. If the file exists, or if it is a
+            device, the existing file/device will be used as the new
+            destination.  If it does not exist, a new file will be created.
+            (json-string)
+- "format": the format of the new destination, default is to probe if 'mode' is
+            'existing', else the format of the source
+            (json-string, optional)
+- "mode": whether and how QEMU should create a new image
+          (NewImageMode, optional, default 'absolute-paths')
+- "speed": the maximum speed, in bytes per second (json-int, optional)
+- "on-source-error": the action to take on an error on the source, default
+                     'report'.  'stop' and 'enospc' can only be used
+                     if the block device supports io-status.
+                     (BlockdevOnError, optional)
+- "on-target-error": the action to take on an error on the target, default
+                     'report' (no limitations, since this applies to
+                     a different block device than device).
+                     (BlockdevOnError, optional)
+
+Example:
+-> { "execute": "drive-backup", "arguments": { "device": "drive0",
+                                               "target": "backup.img" } }
+<- { "return": {} }
+EQMP
+
+    {
         .name       = "block-job-set-speed",
         .args_type  = "device:B,speed:o",
         .mhandler.cmd_new = qmp_marshal_input_block_job_set_speed,
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [Qemu-devel] [PATCH v5 07/11] blockdev: rename BlkTransactionStates to singular
  2013-05-30 12:34 [Qemu-devel] [PATCH v5 00/11] block: drive-backup live backup command Stefan Hajnoczi
                   ` (5 preceding siblings ...)
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 06/11] block: add drive-backup QMP command Stefan Hajnoczi
@ 2013-05-30 12:34 ` Stefan Hajnoczi
  2013-05-30 22:55   ` Eric Blake
  2013-06-19 11:13   ` Kevin Wolf
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 08/11] blockdev: allow BdrvActionOps->commit() to be NULL Stefan Hajnoczi
                   ` (4 subsequent siblings)
  11 siblings, 2 replies; 59+ messages in thread
From: Stefan Hajnoczi @ 2013-05-30 12:34 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Fam Zheng, xiawenc, imain, Stefan Hajnoczi,
	Paolo Bonzini, dietmar

The QMP 'transaction' command keeps a list of in-flight transactions.
The transaction state structure is called BlkTransactionStates even
though it only deals with a single transaction.  The only plural thing
is the linked list of transaction states.

I find it confusing to call the single structure "States".  This patch
renames it to "State", just like BlockDriverState is singular.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 blockdev.c | 104 ++++++++++++++++++++++++++++++-------------------------------
 1 file changed, 52 insertions(+), 52 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 5de33f5..94ad829 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -780,7 +780,7 @@ void qmp_blockdev_snapshot_sync(const char *device, const char *snapshot_file,
 
 /* New and old BlockDriverState structs for group snapshots */
 
-typedef struct BlkTransactionStates BlkTransactionStates;
+typedef struct BlkTransactionState BlkTransactionState;
 
 /* Only prepare() may fail. In a single transaction, only one of commit() or
    abort() will be called, clean() will always be called if it present. */
@@ -788,13 +788,13 @@ typedef struct BdrvActionOps {
     /* Size of state struct, in bytes. */
     size_t instance_size;
     /* Prepare the work, must NOT be NULL. */
-    void (*prepare)(BlkTransactionStates *common, Error **errp);
+    void (*prepare)(BlkTransactionState *common, Error **errp);
     /* Commit the changes, must NOT be NULL. */
-    void (*commit)(BlkTransactionStates *common);
+    void (*commit)(BlkTransactionState *common);
     /* Abort the changes on fail, can be NULL. */
-    void (*abort)(BlkTransactionStates *common);
+    void (*abort)(BlkTransactionState *common);
     /* Clean up resource in the end, can be NULL. */
-    void (*clean)(BlkTransactionStates *common);
+    void (*clean)(BlkTransactionState *common);
 } BdrvActionOps;
 
 /*
@@ -802,20 +802,20 @@ typedef struct BdrvActionOps {
  * that compiler will also arrange it to the same address with parent instance.
  * Later it will be used in free().
  */
-struct BlkTransactionStates {
+struct BlkTransactionState {
     TransactionAction *action;
     const BdrvActionOps *ops;
-    QSIMPLEQ_ENTRY(BlkTransactionStates) entry;
+    QSIMPLEQ_ENTRY(BlkTransactionState) entry;
 };
 
 /* external snapshot private data */
-typedef struct ExternalSnapshotStates {
-    BlkTransactionStates common;
+typedef struct ExternalSnapshotState {
+    BlkTransactionState common;
     BlockDriverState *old_bs;
     BlockDriverState *new_bs;
-} ExternalSnapshotStates;
+} ExternalSnapshotState;
 
-static void external_snapshot_prepare(BlkTransactionStates *common,
+static void external_snapshot_prepare(BlkTransactionState *common,
                                       Error **errp)
 {
     BlockDriver *drv;
@@ -825,8 +825,8 @@ static void external_snapshot_prepare(BlkTransactionStates *common,
     const char *new_image_file;
     const char *format = "qcow2";
     enum NewImageMode mode = NEW_IMAGE_MODE_ABSOLUTE_PATHS;
-    ExternalSnapshotStates *states =
-                             DO_UPCAST(ExternalSnapshotStates, common, common);
+    ExternalSnapshotState *state =
+                             DO_UPCAST(ExternalSnapshotState, common, common);
     TransactionAction *action = common->action;
 
     /* get parameters */
@@ -848,36 +848,36 @@ static void external_snapshot_prepare(BlkTransactionStates *common,
         return;
     }
 
-    states->old_bs = bdrv_find(device);
-    if (!states->old_bs) {
+    state->old_bs = bdrv_find(device);
+    if (!state->old_bs) {
         error_set(errp, QERR_DEVICE_NOT_FOUND, device);
         return;
     }
 
-    if (!bdrv_is_inserted(states->old_bs)) {
+    if (!bdrv_is_inserted(state->old_bs)) {
         error_set(errp, QERR_DEVICE_HAS_NO_MEDIUM, device);
         return;
     }
 
-    if (bdrv_in_use(states->old_bs)) {
+    if (bdrv_in_use(state->old_bs)) {
         error_set(errp, QERR_DEVICE_IN_USE, device);
         return;
     }
 
-    if (!bdrv_is_read_only(states->old_bs)) {
-        if (bdrv_flush(states->old_bs)) {
+    if (!bdrv_is_read_only(state->old_bs)) {
+        if (bdrv_flush(state->old_bs)) {
             error_set(errp, QERR_IO_ERROR);
             return;
         }
     }
 
-    flags = states->old_bs->open_flags;
+    flags = state->old_bs->open_flags;
 
     /* create new image w/backing file */
     if (mode != NEW_IMAGE_MODE_EXISTING) {
         bdrv_img_create(new_image_file, format,
-                        states->old_bs->filename,
-                        states->old_bs->drv->format_name,
+                        state->old_bs->filename,
+                        state->old_bs->drv->format_name,
                         NULL, -1, flags, &local_err, false);
         if (error_is_set(&local_err)) {
             error_propagate(errp, local_err);
@@ -886,42 +886,42 @@ static void external_snapshot_prepare(BlkTransactionStates *common,
     }
 
     /* We will manually add the backing_hd field to the bs later */
-    states->new_bs = bdrv_new("");
+    state->new_bs = bdrv_new("");
     /* TODO Inherit bs->options or only take explicit options with an
      * extended QMP command? */
-    ret = bdrv_open(states->new_bs, new_image_file, NULL,
+    ret = bdrv_open(state->new_bs, new_image_file, NULL,
                     flags | BDRV_O_NO_BACKING, drv);
     if (ret != 0) {
         error_set(errp, QERR_OPEN_FILE_FAILED, new_image_file);
     }
 }
 
-static void external_snapshot_commit(BlkTransactionStates *common)
+static void external_snapshot_commit(BlkTransactionState *common)
 {
-    ExternalSnapshotStates *states =
-                             DO_UPCAST(ExternalSnapshotStates, common, common);
+    ExternalSnapshotState *state =
+                             DO_UPCAST(ExternalSnapshotState, common, common);
 
-    /* This removes our old bs from the bdrv_states, and adds the new bs */
-    bdrv_append(states->new_bs, states->old_bs);
+    /* This removes our old bs and adds the new bs */
+    bdrv_append(state->new_bs, state->old_bs);
     /* We don't need (or want) to use the transactional
      * bdrv_reopen_multiple() across all the entries at once, because we
      * don't want to abort all of them if one of them fails the reopen */
-    bdrv_reopen(states->new_bs, states->new_bs->open_flags & ~BDRV_O_RDWR,
+    bdrv_reopen(state->new_bs, state->new_bs->open_flags & ~BDRV_O_RDWR,
                 NULL);
 }
 
-static void external_snapshot_abort(BlkTransactionStates *common)
+static void external_snapshot_abort(BlkTransactionState *common)
 {
-    ExternalSnapshotStates *states =
-                             DO_UPCAST(ExternalSnapshotStates, common, common);
-    if (states->new_bs) {
-        bdrv_delete(states->new_bs);
+    ExternalSnapshotState *state =
+                             DO_UPCAST(ExternalSnapshotState, common, common);
+    if (state->new_bs) {
+        bdrv_delete(state->new_bs);
     }
 }
 
 static const BdrvActionOps actions[] = {
     [TRANSACTION_ACTION_KIND_BLOCKDEV_SNAPSHOT_SYNC] = {
-        .instance_size = sizeof(ExternalSnapshotStates),
+        .instance_size = sizeof(ExternalSnapshotState),
         .prepare  = external_snapshot_prepare,
         .commit   = external_snapshot_commit,
         .abort = external_snapshot_abort,
@@ -936,10 +936,10 @@ static const BdrvActionOps actions[] = {
 void qmp_transaction(TransactionActionList *dev_list, Error **errp)
 {
     TransactionActionList *dev_entry = dev_list;
-    BlkTransactionStates *states, *next;
+    BlkTransactionState *state, *next;
     Error *local_err = NULL;
 
-    QSIMPLEQ_HEAD(snap_bdrv_states, BlkTransactionStates) snap_bdrv_states;
+    QSIMPLEQ_HEAD(snap_bdrv_states, BlkTransactionState) snap_bdrv_states;
     QSIMPLEQ_INIT(&snap_bdrv_states);
 
     /* drain all i/o before any snapshots */
@@ -956,20 +956,20 @@ void qmp_transaction(TransactionActionList *dev_list, Error **errp)
         assert(dev_info->kind < ARRAY_SIZE(actions));
 
         ops = &actions[dev_info->kind];
-        states = g_malloc0(ops->instance_size);
-        states->ops = ops;
-        states->action = dev_info;
-        QSIMPLEQ_INSERT_TAIL(&snap_bdrv_states, states, entry);
+        state = g_malloc0(ops->instance_size);
+        state->ops = ops;
+        state->action = dev_info;
+        QSIMPLEQ_INSERT_TAIL(&snap_bdrv_states, state, entry);
 
-        states->ops->prepare(states, &local_err);
+        state->ops->prepare(state, &local_err);
         if (error_is_set(&local_err)) {
             error_propagate(errp, local_err);
             goto delete_and_fail;
         }
     }
 
-    QSIMPLEQ_FOREACH(states, &snap_bdrv_states, entry) {
-        states->ops->commit(states);
+    QSIMPLEQ_FOREACH(state, &snap_bdrv_states, entry) {
+        state->ops->commit(state);
     }
 
     /* success */
@@ -980,17 +980,17 @@ delete_and_fail:
     * failure, and it is all-or-none; abandon each new bs, and keep using
     * the original bs for all images
     */
-    QSIMPLEQ_FOREACH(states, &snap_bdrv_states, entry) {
-        if (states->ops->abort) {
-            states->ops->abort(states);
+    QSIMPLEQ_FOREACH(state, &snap_bdrv_states, entry) {
+        if (state->ops->abort) {
+            state->ops->abort(state);
         }
     }
 exit:
-    QSIMPLEQ_FOREACH_SAFE(states, &snap_bdrv_states, entry, next) {
-        if (states->ops->clean) {
-            states->ops->clean(states);
+    QSIMPLEQ_FOREACH_SAFE(state, &snap_bdrv_states, entry, next) {
+        if (state->ops->clean) {
+            state->ops->clean(state);
         }
-        g_free(states);
+        g_free(state);
     }
 }
 
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [Qemu-devel] [PATCH v5 08/11] blockdev: allow BdrvActionOps->commit() to be NULL
  2013-05-30 12:34 [Qemu-devel] [PATCH v5 00/11] block: drive-backup live backup command Stefan Hajnoczi
                   ` (6 preceding siblings ...)
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 07/11] blockdev: rename BlkTransactionStates to singular Stefan Hajnoczi
@ 2013-05-30 12:34 ` Stefan Hajnoczi
  2013-05-30 22:57   ` Eric Blake
  2013-06-19 11:14   ` Kevin Wolf
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 09/11] blockdev: add DriveBackup transaction Stefan Hajnoczi
                   ` (3 subsequent siblings)
  11 siblings, 2 replies; 59+ messages in thread
From: Stefan Hajnoczi @ 2013-05-30 12:34 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Fam Zheng, xiawenc, imain, Stefan Hajnoczi,
	Paolo Bonzini, dietmar

Some QMP 'transaction' types don't need to do anything on .commit().
Make .commit() optional just like .abort().

The "drive-backup" action will take advantage of this, it only needs to
cancel the block job on .abort().  Other block job actions will probably
follow the same pattern, so allow .commit() to be NULL.

Suggested-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 blockdev.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 94ad829..0e649df 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -789,7 +789,7 @@ typedef struct BdrvActionOps {
     size_t instance_size;
     /* Prepare the work, must NOT be NULL. */
     void (*prepare)(BlkTransactionState *common, Error **errp);
-    /* Commit the changes, must NOT be NULL. */
+    /* Commit the changes, can be NULL. */
     void (*commit)(BlkTransactionState *common);
     /* Abort the changes on fail, can be NULL. */
     void (*abort)(BlkTransactionState *common);
@@ -969,7 +969,9 @@ void qmp_transaction(TransactionActionList *dev_list, Error **errp)
     }
 
     QSIMPLEQ_FOREACH(state, &snap_bdrv_states, entry) {
-        state->ops->commit(state);
+        if (state->ops->commit) {
+            state->ops->commit(state);
+        }
     }
 
     /* success */
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [Qemu-devel] [PATCH v5 09/11] blockdev: add DriveBackup transaction
  2013-05-30 12:34 [Qemu-devel] [PATCH v5 00/11] block: drive-backup live backup command Stefan Hajnoczi
                   ` (7 preceding siblings ...)
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 08/11] blockdev: allow BdrvActionOps->commit() to be NULL Stefan Hajnoczi
@ 2013-05-30 12:34 ` Stefan Hajnoczi
  2013-06-03 17:20   ` Eric Blake
  2013-06-19 11:17   ` Kevin Wolf
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 10/11] blockdev: add Abort transaction Stefan Hajnoczi
                   ` (2 subsequent siblings)
  11 siblings, 2 replies; 59+ messages in thread
From: Stefan Hajnoczi @ 2013-05-30 12:34 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Fam Zheng, xiawenc, imain, Stefan Hajnoczi,
	Paolo Bonzini, dietmar

This patch adds a transactional version of the drive-backup QMP command.
It allows atomic snapshots of multiple drives along with automatic
cleanup if there is a failure to start one of the backup jobs.

Note that QMP events are emitted for block job completion/cancellation
and the block job will be listed by query-block-jobs.

@device: the name of the device whose writes should be mirrored.

@target: the target of the new image. If the file exists, or if it
         is a device, the existing file/device will be used as the new
         destination.  If it does not exist, a new file will be created.

@format: #optional the format of the new destination, default is to
         probe if @mode is 'existing', else the format of the source

@mode: #optional whether and how QEMU should create a new image, default is
       'absolute-paths'.

@speed: #optional the maximum speed, in bytes per second

@on-source-error: #optional the action to take on an error on the source,
                  default 'report'.  'stop' and 'enospc' can only be used
                  if the block device supports io-status (see BlockInfo).

@on-target-error: #optional the action to take on an error on the target,
                  default 'report' (no limitations, since this applies to
                  a different block device than @device).

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 blockdev.c       | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
 qapi-schema.json | 40 +++++++++++++++++++++++++++++++++++++++-
 2 files changed, 88 insertions(+), 1 deletion(-)

diff --git a/blockdev.c b/blockdev.c
index 0e649df..18a2012 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -919,6 +919,50 @@ static void external_snapshot_abort(BlkTransactionState *common)
     }
 }
 
+typedef struct DriveBackupState {
+    BlkTransactionState common;
+    BlockDriverState *bs;
+    BlockJob *job;
+} DriveBackupState;
+
+static void drive_backup_prepare(BlkTransactionState *common, Error **errp)
+{
+    DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
+    DriveBackup *backup;
+    Error *local_err = NULL;
+
+    assert(common->action->kind == TRANSACTION_ACTION_KIND_DRIVE_BACKUP);
+    backup = common->action->drive_backup;
+
+    qmp_drive_backup(backup->device, backup->target,
+                     backup->has_format, backup->format,
+                     backup->has_mode, backup->mode,
+                     backup->has_speed, backup->speed,
+                     backup->has_on_source_error, backup->on_source_error,
+                     backup->has_on_target_error, backup->on_target_error,
+                     &local_err);
+    if (error_is_set(&local_err)) {
+        error_propagate(errp, local_err);
+        state->bs = NULL;
+        state->job = NULL;
+        return;
+    }
+
+    state->bs = bdrv_find(backup->device);
+    state->job = state->bs->job;
+}
+
+static void drive_backup_abort(BlkTransactionState *common)
+{
+    DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
+    BlockDriverState *bs = state->bs;
+
+    /* Only cancel if it's the job we started */
+    if (bs && bs->job && bs->job == state->job) {
+        block_job_cancel_sync(bs->job);
+    }
+}
+
 static const BdrvActionOps actions[] = {
     [TRANSACTION_ACTION_KIND_BLOCKDEV_SNAPSHOT_SYNC] = {
         .instance_size = sizeof(ExternalSnapshotState),
@@ -926,6 +970,11 @@ static const BdrvActionOps actions[] = {
         .commit   = external_snapshot_commit,
         .abort = external_snapshot_abort,
     },
+    [TRANSACTION_ACTION_KIND_DRIVE_BACKUP] = {
+        .instance_size = sizeof(DriveBackupState),
+        .prepare = drive_backup_prepare,
+        .abort = drive_backup_abort,
+    },
 };
 
 /*
diff --git a/qapi-schema.json b/qapi-schema.json
index 55b1a37..6bf96aa 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1609,6 +1609,43 @@
             '*mode': 'NewImageMode' } }
 
 ##
+# @DriveBackup
+#
+# @device: the name of the device which should be copied.
+#
+# @target: the target of the new image. If the file exists, or if it
+#          is a device, the existing file/device will be used as the new
+#          destination.  If it does not exist, a new file will be created.
+#
+# @format: #optional the format of the new destination, default is to
+#          probe if @mode is 'existing', else the format of the source
+#
+# @mode: #optional whether and how QEMU should create a new image, default is
+#        'absolute-paths'.
+#
+# @speed: #optional the maximum speed, in bytes per second
+#
+# @on-source-error: #optional the action to take on an error on the source,
+#                   default 'report'.  'stop' and 'enospc' can only be used
+#                   if the block device supports io-status (see BlockInfo).
+#
+# @on-target-error: #optional the action to take on an error on the target,
+#                   default 'report' (no limitations, since this applies to
+#                   a different block device than @device).
+#
+# Note that @on-source-error and @on-target-error only affect background I/O.
+# If an error occurs during a guest write request, the device's rerror/werror
+# actions will be used.
+#
+# Since: 1.6
+##
+{ 'type': 'DriveBackup',
+  'data': { 'device': 'str', 'target': 'str', '*format': 'str',
+            '*mode': 'NewImageMode', '*speed': 'int',
+            '*on-source-error': 'BlockdevOnError',
+            '*on-target-error': 'BlockdevOnError' } }
+
+##
 # @TransactionAction
 #
 # A discriminated record of operations that can be performed with
@@ -1616,7 +1653,8 @@
 ##
 { 'union': 'TransactionAction',
   'data': {
-       'blockdev-snapshot-sync': 'BlockdevSnapshot'
+       'blockdev-snapshot-sync': 'BlockdevSnapshot',
+       'drive-backup': 'DriveBackup'
    } }
 
 ##
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [Qemu-devel] [PATCH v5 10/11] blockdev: add Abort transaction
  2013-05-30 12:34 [Qemu-devel] [PATCH v5 00/11] block: drive-backup live backup command Stefan Hajnoczi
                   ` (8 preceding siblings ...)
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 09/11] blockdev: add DriveBackup transaction Stefan Hajnoczi
@ 2013-05-30 12:34 ` Stefan Hajnoczi
  2013-05-30 13:11   ` Eric Blake
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 11/11] qemu-iotests: add 055 drive-backup test case Stefan Hajnoczi
  2013-06-06  5:06 ` [Qemu-devel] [PATCH v5 00/11] block: drive-backup live backup command Wenchao Xia
  11 siblings, 1 reply; 59+ messages in thread
From: Stefan Hajnoczi @ 2013-05-30 12:34 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Fam Zheng, xiawenc, imain, Stefan Hajnoczi,
	Paolo Bonzini, dietmar

The Abort action can be used to test QMP 'transaction' failure.  Add it
as the last action to exercise the .abort() and .cleanup() code paths
for all previous actions.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 blockdev.c       | 15 +++++++++++++++
 qapi-schema.json | 13 ++++++++++++-
 2 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/blockdev.c b/blockdev.c
index 18a2012..e3393d2 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -963,6 +963,16 @@ static void drive_backup_abort(BlkTransactionState *common)
     }
 }
 
+static void abort_prepare(BlkTransactionState *common, Error **errp)
+{
+    error_setg(errp, "Transaction aborted using Abort action");
+}
+
+static void abort_commit(BlkTransactionState *common)
+{
+    assert(false); /* this action never succeeds */
+}
+
 static const BdrvActionOps actions[] = {
     [TRANSACTION_ACTION_KIND_BLOCKDEV_SNAPSHOT_SYNC] = {
         .instance_size = sizeof(ExternalSnapshotState),
@@ -975,6 +985,11 @@ static const BdrvActionOps actions[] = {
         .prepare = drive_backup_prepare,
         .abort = drive_backup_abort,
     },
+    [TRANSACTION_ACTION_KIND_ABORT] = {
+        .instance_size = sizeof(BlkTransactionState),
+        .prepare = abort_prepare,
+        .commit = abort_commit,
+    },
 };
 
 /*
diff --git a/qapi-schema.json b/qapi-schema.json
index 6bf96aa..0da1b98 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1646,6 +1646,16 @@
             '*on-target-error': 'BlockdevOnError' } }
 
 ##
+# @Abort
+#
+# This action can be used to test transaction failure.
+#
+# Since: 1.6
+###
+{ 'type': 'Abort',
+  'data': { } }
+
+##
 # @TransactionAction
 #
 # A discriminated record of operations that can be performed with
@@ -1654,7 +1664,8 @@
 { 'union': 'TransactionAction',
   'data': {
        'blockdev-snapshot-sync': 'BlockdevSnapshot',
-       'drive-backup': 'DriveBackup'
+       'drive-backup': 'DriveBackup',
+       'abort': 'Abort'
    } }
 
 ##
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [Qemu-devel] [PATCH v5 11/11] qemu-iotests: add 055 drive-backup test case
  2013-05-30 12:34 [Qemu-devel] [PATCH v5 00/11] block: drive-backup live backup command Stefan Hajnoczi
                   ` (9 preceding siblings ...)
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 10/11] blockdev: add Abort transaction Stefan Hajnoczi
@ 2013-05-30 12:34 ` Stefan Hajnoczi
  2013-06-19 12:41   ` Kevin Wolf
  2013-06-06  5:06 ` [Qemu-devel] [PATCH v5 00/11] block: drive-backup live backup command Wenchao Xia
  11 siblings, 1 reply; 59+ messages in thread
From: Stefan Hajnoczi @ 2013-05-30 12:34 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Fam Zheng, xiawenc, imain, Stefan Hajnoczi,
	Paolo Bonzini, dietmar

Testing drive-backup is similar to image streaming and drive mirroring.
This test case is based on 041.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 tests/qemu-iotests/055     | 256 +++++++++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/055.out |   5 +
 tests/qemu-iotests/group   |   1 +
 3 files changed, 262 insertions(+)
 create mode 100755 tests/qemu-iotests/055
 create mode 100644 tests/qemu-iotests/055.out

diff --git a/tests/qemu-iotests/055 b/tests/qemu-iotests/055
new file mode 100755
index 0000000..0341301
--- /dev/null
+++ b/tests/qemu-iotests/055
@@ -0,0 +1,256 @@
+#!/usr/bin/env python
+#
+# Tests for drive-backup
+#
+# Copyright (C) 2013 Red Hat, Inc.
+#
+# Based on 041.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+import time
+import os
+import iotests
+from iotests import qemu_img, qemu_io
+
+test_img = os.path.join(iotests.test_dir, 'test.img')
+target_img = os.path.join(iotests.test_dir, 'target.img')
+
+class TestSingleDrive(iotests.QMPTestCase):
+    image_len = 120 * 1024 * 1024 # MB
+
+    def setUp(self):
+        qemu_img('create', '-f', iotests.imgfmt, test_img, str(TestSingleDrive.image_len))
+        self.vm = iotests.VM().add_drive(test_img)
+        self.vm.launch()
+
+    def tearDown(self):
+        self.vm.shutdown()
+        os.remove(test_img)
+        try:
+            os.remove(target_img)
+        except OSError:
+            pass
+
+    def test_cancel(self):
+        self.assert_no_active_block_jobs()
+
+        result = self.vm.qmp('drive-backup', device='drive0',
+                             target=target_img)
+        self.assert_qmp(result, 'return', {})
+
+        self.cancel_and_wait()
+
+    def test_pause(self):
+        self.assert_no_active_block_jobs()
+
+        result = self.vm.qmp('drive-backup', device='drive0',
+                             target=target_img)
+        self.assert_qmp(result, 'return', {})
+
+        result = self.vm.qmp('block-job-pause', device='drive0')
+        self.assert_qmp(result, 'return', {})
+
+        time.sleep(1)
+        result = self.vm.qmp('query-block-jobs')
+        offset = self.dictpath(result, 'return[0]/offset')
+
+        time.sleep(1)
+        result = self.vm.qmp('query-block-jobs')
+        self.assert_qmp(result, 'return[0]/offset', offset)
+
+        self.vm.shutdown()
+        self.assertTrue(iotests.compare_images(test_img, target_img),
+                        'target image does not match source after backup')
+
+    def test_medium_not_found(self):
+        result = self.vm.qmp('drive-backup', device='ide1-cd0',
+                             target=target_img)
+        self.assert_qmp(result, 'error/class', 'GenericError')
+
+    def test_image_not_found(self):
+        result = self.vm.qmp('drive-backup', device='drive0',
+                             mode='existing', target=target_img)
+        self.assert_qmp(result, 'error/class', 'GenericError')
+
+    def test_device_not_found(self):
+        result = self.vm.qmp('drive-backup', device='nonexistent',
+                             target=target_img)
+        self.assert_qmp(result, 'error/class', 'DeviceNotFound')
+
+class TestSetSpeed(iotests.QMPTestCase):
+    image_len = 80 * 1024 * 1024 # MB
+
+    def setUp(self):
+        qemu_img('create', '-f', iotests.imgfmt, test_img, str(TestSetSpeed.image_len))
+        self.vm = iotests.VM().add_drive(test_img)
+        self.vm.launch()
+
+    def tearDown(self):
+        self.vm.shutdown()
+        os.remove(test_img)
+        os.remove(target_img)
+
+    def test_set_speed(self):
+        self.assert_no_active_block_jobs()
+
+        result = self.vm.qmp('drive-backup', device='drive0',
+                             target=target_img)
+        self.assert_qmp(result, 'return', {})
+
+        # Default speed is 0
+        result = self.vm.qmp('query-block-jobs')
+        self.assert_qmp(result, 'return[0]/device', 'drive0')
+        self.assert_qmp(result, 'return[0]/speed', 0)
+
+        result = self.vm.qmp('block-job-set-speed', device='drive0', speed=8 * 1024 * 1024)
+        self.assert_qmp(result, 'return', {})
+
+        # Ensure the speed we set was accepted
+        result = self.vm.qmp('query-block-jobs')
+        self.assert_qmp(result, 'return[0]/device', 'drive0')
+        self.assert_qmp(result, 'return[0]/speed', 8 * 1024 * 1024)
+
+        self.cancel_and_wait()
+
+        # Check setting speed in drive-backup works
+        result = self.vm.qmp('drive-backup', device='drive0',
+                             target=target_img, speed=4*1024*1024)
+        self.assert_qmp(result, 'return', {})
+
+        result = self.vm.qmp('query-block-jobs')
+        self.assert_qmp(result, 'return[0]/device', 'drive0')
+        self.assert_qmp(result, 'return[0]/speed', 4 * 1024 * 1024)
+
+        self.cancel_and_wait()
+
+    def test_set_speed_invalid(self):
+        self.assert_no_active_block_jobs()
+
+        result = self.vm.qmp('drive-backup', device='drive0',
+                             target=target_img, speed=-1)
+        self.assert_qmp(result, 'error/class', 'GenericError')
+
+        self.assert_no_active_block_jobs()
+
+        result = self.vm.qmp('drive-backup', device='drive0',
+                             target=target_img)
+        self.assert_qmp(result, 'return', {})
+
+        result = self.vm.qmp('block-job-set-speed', device='drive0', speed=-1)
+        self.assert_qmp(result, 'error/class', 'GenericError')
+
+        self.cancel_and_wait()
+
+class TestSingleTransaction(iotests.QMPTestCase):
+    image_len = 120 * 1024 * 1024 # MB
+
+    def setUp(self):
+        qemu_img('create', '-f', iotests.imgfmt, test_img, str(TestSingleDrive.image_len))
+        self.vm = iotests.VM().add_drive(test_img)
+        self.vm.launch()
+
+    def tearDown(self):
+        self.vm.shutdown()
+        os.remove(test_img)
+        try:
+            os.remove(target_img)
+        except OSError:
+            pass
+
+    def test_cancel(self):
+        self.assert_no_active_block_jobs()
+
+        result = self.vm.qmp('transaction', actions=[{
+                'type': 'drive-backup',
+                'data': { 'device': 'drive0',
+                          'target': target_img },
+            }
+        ])
+        self.assert_qmp(result, 'return', {})
+
+        self.cancel_and_wait()
+
+    def test_pause(self):
+        self.assert_no_active_block_jobs()
+
+        result = self.vm.qmp('transaction', actions=[{
+                'type': 'drive-backup',
+                'data': { 'device': 'drive0',
+                          'target': target_img },
+            }
+        ])
+        self.assert_qmp(result, 'return', {})
+
+        result = self.vm.qmp('block-job-pause', device='drive0')
+        self.assert_qmp(result, 'return', {})
+
+        time.sleep(1)
+        result = self.vm.qmp('query-block-jobs')
+        offset = self.dictpath(result, 'return[0]/offset')
+
+        time.sleep(1)
+        result = self.vm.qmp('query-block-jobs')
+        self.assert_qmp(result, 'return[0]/offset', offset)
+
+        self.vm.shutdown()
+        self.assertTrue(iotests.compare_images(test_img, target_img),
+                        'target image does not match source after backup')
+
+    def test_medium_not_found(self):
+        result = self.vm.qmp('transaction', actions=[{
+                'type': 'drive-backup',
+                'data': { 'device': 'ide1-cd0',
+                          'target': target_img },
+            }
+        ])
+        self.assert_qmp(result, 'error/class', 'GenericError')
+
+    def test_image_not_found(self):
+        result = self.vm.qmp('transaction', actions=[{
+                'type': 'drive-backup',
+                'data': { 'device': 'drive0',
+                          'mode': 'existing',
+                          'target': target_img },
+            }
+        ])
+        self.assert_qmp(result, 'error/class', 'GenericError')
+
+    def test_device_not_found(self):
+        result = self.vm.qmp('transaction', actions=[{
+                'type': 'drive-backup',
+                'data': { 'device': 'nonexistent',
+                          'mode': 'existing',
+                          'target': target_img },
+            }
+        ])
+        self.assert_qmp(result, 'error/class', 'DeviceNotFound')
+
+    def test_abort(self):
+        result = self.vm.qmp('transaction', actions=[{
+                'type': 'drive-backup',
+                'data': { 'device': 'nonexistent',
+                          'mode': 'existing',
+                          'target': target_img },
+            }, {
+                'type': 'Abort',
+                'data': {},
+            }
+        ])
+        self.assert_qmp(result, 'error/class', 'GenericError')
+        self.assert_no_active_block_jobs()
+
+if __name__ == '__main__':
+    iotests.main(supported_fmts=['raw', 'qcow2'])
diff --git a/tests/qemu-iotests/055.out b/tests/qemu-iotests/055.out
new file mode 100644
index 0000000..fa16b5c
--- /dev/null
+++ b/tests/qemu-iotests/055.out
@@ -0,0 +1,5 @@
+.............
+----------------------------------------------------------------------
+Ran 13 tests
+
+OK
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index 387b050..e5762f9 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -61,3 +61,4 @@
 052 rw auto backing
 053 rw auto
 054 rw auto
+055 rw auto
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 10/11] blockdev: add Abort transaction
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 10/11] blockdev: add Abort transaction Stefan Hajnoczi
@ 2013-05-30 13:11   ` Eric Blake
  2013-06-03  9:21     ` Stefan Hajnoczi
  0 siblings, 1 reply; 59+ messages in thread
From: Eric Blake @ 2013-05-30 13:11 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Fam Zheng, qemu-devel, xiawenc, imain, Paolo Bonzini,
	dietmar

[-- Attachment #1: Type: text/plain, Size: 481 bytes --]

On 05/30/2013 06:34 AM, Stefan Hajnoczi wrote:
> The Abort action can be used to test QMP 'transaction' failure.  Add it
> as the last action to exercise the .abort() and .cleanup() code paths
> for all previous actions.

Another thread questioned whether we should name this action
__org.qemu-debug-abort, if only to be clear that it is not a normal
interface.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 05/11] blockdev: use bdrv_getlength() in qmp_drive_mirror()
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 05/11] blockdev: use bdrv_getlength() in qmp_drive_mirror() Stefan Hajnoczi
@ 2013-05-30 13:24   ` Paolo Bonzini
  2013-06-03 17:15   ` Eric Blake
  2013-06-19 10:58   ` Kevin Wolf
  2 siblings, 0 replies; 59+ messages in thread
From: Paolo Bonzini @ 2013-05-30 13:24 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Fam Zheng, qemu-devel, dietmar, imain, xiawenc

Il 30/05/2013 14:34, Stefan Hajnoczi ha scritto:
> Use bdrv_getlength() for its byte units and error return instead of
> bdrv_get_geometry().
> 
> Reported-by: Kevin Wolf <kwolf@redhat.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  blockdev.c | 10 +++++++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/blockdev.c b/blockdev.c
> index 01db519..73b175b 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -1364,7 +1364,7 @@ void qmp_drive_mirror(const char *device, const char *target,
>      BlockDriver *drv = NULL;
>      Error *local_err = NULL;
>      int flags;
> -    uint64_t size;
> +    int64_t size;
>      int ret;
>  
>      if (!has_speed) {
> @@ -1428,8 +1428,12 @@ void qmp_drive_mirror(const char *device, const char *target,
>          sync = MIRROR_SYNC_MODE_FULL;
>      }
>  
> -    bdrv_get_geometry(bs, &size);
> -    size *= 512;
> +    size = bdrv_getlength(bs);
> +    if (size < 0) {
> +        error_setg_errno(errp, -size, "bdrv_getlength failed");
> +        return;
> +    }
> +
>      if (sync == MIRROR_SYNC_MODE_FULL && mode != NEW_IMAGE_MODE_EXISTING) {
>          /* create new image w/o backing file */
>          assert(format && drv);
> 

Thanks!

Paolo

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 01/11] notify: add NotiferWithReturn so notifier list can abort
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 01/11] notify: add NotiferWithReturn so notifier list can abort Stefan Hajnoczi
@ 2013-05-30 22:27   ` Eric Blake
  2013-06-03  9:11     ` Stefan Hajnoczi
  2013-06-19 10:55   ` Kevin Wolf
  1 sibling, 1 reply; 59+ messages in thread
From: Eric Blake @ 2013-05-30 22:27 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Fam Zheng, qemu-devel, xiawenc, imain, Paolo Bonzini,
	dietmar

[-- Attachment #1: Type: text/plain, Size: 1742 bytes --]

On 05/30/2013 06:34 AM, Stefan Hajnoczi wrote:
> notifier_list_notify() has no return value.  This is fine when we just
> want to invoke side-effects.
> 
> Sometimes it's useful for notifiers to produce a return value.  This
> allows notifiers to "veto" an operation and will be used by the block
> layer before-write notifier.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  include/qemu/notify.h | 25 +++++++++++++++++++++++++
>  util/notify.c         | 30 ++++++++++++++++++++++++++++++
>  2 files changed, 55 insertions(+)
> 
> diff --git a/include/qemu/notify.h b/include/qemu/notify.h
> index 4e2e7f0..d3103e7 100644
> --- a/include/qemu/notify.h
> +++ b/include/qemu/notify.h
> @@ -40,4 +40,29 @@ void notifier_remove(Notifier *notifier);
>  
>  void notifier_list_notify(NotifierList *list, void *data);
>  
> +/* Same as Notifier but allows .notify() to return errors */

It's probably worth documenting that the callback must return 0 for
success, and that the first non-zero return is passed back without
further interpretation (thus allowing negative errno values if desired,
vs. a simpler -1).

Isn't this really just syntax sugar since existing notifiers could
already use a member within the opaque *data argument as a way to
short-circuit later notifiers on earlier errors?  If so, how hard would
it be to scrub the existing code base to always return 0 in existing
notifiers, rather than adding a parallel naming scheme?

At any rate, adding a new naming scheme minimizes churn, and the code
itself looks okay;

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 02/11] block: add bdrv_add_before_write_notifier()
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 02/11] block: add bdrv_add_before_write_notifier() Stefan Hajnoczi
@ 2013-05-30 22:47   ` Eric Blake
  2013-06-19 10:56   ` Kevin Wolf
  1 sibling, 0 replies; 59+ messages in thread
From: Eric Blake @ 2013-05-30 22:47 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Fam Zheng, qemu-devel, xiawenc, imain, Paolo Bonzini,
	dietmar

[-- Attachment #1: Type: text/plain, Size: 811 bytes --]

On 05/30/2013 06:34 AM, Stefan Hajnoczi wrote:
> The bdrv_add_before_write_notifier() function installs a callback that
> is invoked before a write request is processed.  This will be used to
> implement copy-on-write point-in-time snapshots where we need to copy
> out old data before overwriting it.
> 
> Note that BdrvTrackedRequest is moved to block_int.h since it is passed
> to .notify() functions.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  block.c                   | 23 ++++++++++++-----------
>  include/block/block_int.h | 23 ++++++++++++++++++++++-
>  2 files changed, 34 insertions(+), 12 deletions(-)

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 07/11] blockdev: rename BlkTransactionStates to singular
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 07/11] blockdev: rename BlkTransactionStates to singular Stefan Hajnoczi
@ 2013-05-30 22:55   ` Eric Blake
  2013-06-19 11:13   ` Kevin Wolf
  1 sibling, 0 replies; 59+ messages in thread
From: Eric Blake @ 2013-05-30 22:55 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Fam Zheng, qemu-devel, xiawenc, imain, Paolo Bonzini,
	dietmar

[-- Attachment #1: Type: text/plain, Size: 817 bytes --]

On 05/30/2013 06:34 AM, Stefan Hajnoczi wrote:
> The QMP 'transaction' command keeps a list of in-flight transactions.
> The transaction state structure is called BlkTransactionStates even
> though it only deals with a single transaction.  The only plural thing
> is the linked list of transaction states.
> 
> I find it confusing to call the single structure "States".  This patch
> renames it to "State", just like BlockDriverState is singular.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  blockdev.c | 104 ++++++++++++++++++++++++++++++-------------------------------
>  1 file changed, 52 insertions(+), 52 deletions(-)

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 08/11] blockdev: allow BdrvActionOps->commit() to be NULL
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 08/11] blockdev: allow BdrvActionOps->commit() to be NULL Stefan Hajnoczi
@ 2013-05-30 22:57   ` Eric Blake
  2013-06-03  9:16     ` Stefan Hajnoczi
  2013-06-19 11:14   ` Kevin Wolf
  1 sibling, 1 reply; 59+ messages in thread
From: Eric Blake @ 2013-05-30 22:57 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Fam Zheng, qemu-devel, xiawenc, imain, Paolo Bonzini,
	dietmar

[-- Attachment #1: Type: text/plain, Size: 911 bytes --]

On 05/30/2013 06:34 AM, Stefan Hajnoczi wrote:
> Some QMP 'transaction' types don't need to do anything on .commit().
> Make .commit() optional just like .abort().
> 
> The "drive-backup" action will take advantage of this, it only needs to
> cancel the block job on .abort().  Other block job actions will probably
> follow the same pattern, so allow .commit() to be NULL.
> 
> Suggested-by: Eric Blake <eblake@redhat.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  blockdev.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)

Reviewed-by: Eric Blake <eblake@redhat.com>

Is it worth enforcing that at least one of commit or abort is supplied
(that is, assert if the user codes up an action that has neither
callback)?  Or is that just overkill?

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 01/11] notify: add NotiferWithReturn so notifier list can abort
  2013-05-30 22:27   ` Eric Blake
@ 2013-06-03  9:11     ` Stefan Hajnoczi
  0 siblings, 0 replies; 59+ messages in thread
From: Stefan Hajnoczi @ 2013-06-03  9:11 UTC (permalink / raw)
  To: Eric Blake
  Cc: Kevin Wolf, Fam Zheng, qemu-devel, xiawenc, imain, Paolo Bonzini,
	dietmar

On Thu, May 30, 2013 at 04:27:48PM -0600, Eric Blake wrote:
> On 05/30/2013 06:34 AM, Stefan Hajnoczi wrote:
> > notifier_list_notify() has no return value.  This is fine when we just
> > want to invoke side-effects.
> > 
> > Sometimes it's useful for notifiers to produce a return value.  This
> > allows notifiers to "veto" an operation and will be used by the block
> > layer before-write notifier.
> > 
> > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > ---
> >  include/qemu/notify.h | 25 +++++++++++++++++++++++++
> >  util/notify.c         | 30 ++++++++++++++++++++++++++++++
> >  2 files changed, 55 insertions(+)
> > 
> > diff --git a/include/qemu/notify.h b/include/qemu/notify.h
> > index 4e2e7f0..d3103e7 100644
> > --- a/include/qemu/notify.h
> > +++ b/include/qemu/notify.h
> > @@ -40,4 +40,29 @@ void notifier_remove(Notifier *notifier);
> >  
> >  void notifier_list_notify(NotifierList *list, void *data);
> >  
> > +/* Same as Notifier but allows .notify() to return errors */
> 
> It's probably worth documenting that the callback must return 0 for
> success, and that the first non-zero return is passed back without
> further interpretation (thus allowing negative errno values if desired,
> vs. a simpler -1).

You are right, let's improve the doc comments.

> Isn't this really just syntax sugar since existing notifiers could
> already use a member within the opaque *data argument as a way to
> short-circuit later notifiers on earlier errors?  If so, how hard would
> it be to scrub the existing code base to always return 0 in existing
> notifiers, rather than adding a parallel naming scheme?

I considered both approaches and decided on NotifierWithReturn because
it neither touches existing users nor adds a side-effect for aborting.
BTW adding a side-effect in the BdrvTrackedRequest case is ugly, the
struct is used for more than just the pre-write notifier and the field
would be useless in the other cases.

Stefan

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 08/11] blockdev: allow BdrvActionOps->commit() to be NULL
  2013-05-30 22:57   ` Eric Blake
@ 2013-06-03  9:16     ` Stefan Hajnoczi
  0 siblings, 0 replies; 59+ messages in thread
From: Stefan Hajnoczi @ 2013-06-03  9:16 UTC (permalink / raw)
  To: Eric Blake
  Cc: Kevin Wolf, Fam Zheng, qemu-devel, xiawenc, imain, Paolo Bonzini,
	dietmar

On Thu, May 30, 2013 at 04:57:21PM -0600, Eric Blake wrote:
> On 05/30/2013 06:34 AM, Stefan Hajnoczi wrote:
> > Some QMP 'transaction' types don't need to do anything on .commit().
> > Make .commit() optional just like .abort().
> > 
> > The "drive-backup" action will take advantage of this, it only needs to
> > cancel the block job on .abort().  Other block job actions will probably
> > follow the same pattern, so allow .commit() to be NULL.
> > 
> > Suggested-by: Eric Blake <eblake@redhat.com>
> > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > ---
> >  blockdev.c | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> Reviewed-by: Eric Blake <eblake@redhat.com>
> 
> Is it worth enforcing that at least one of commit or abort is supplied
> (that is, assert if the user codes up an action that has neither
> callback)?  Or is that just overkill?

I left it out because it seems like overkill.  The action types are
small in number, statically defined in an array, and we test both the
commit and abort code paths.  So there's no convenience place to put a
compile-time check and developers would stumble across their mistake
when running their tests.

Stefan

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 10/11] blockdev: add Abort transaction
  2013-05-30 13:11   ` Eric Blake
@ 2013-06-03  9:21     ` Stefan Hajnoczi
  2013-06-19 11:21       ` Kevin Wolf
  0 siblings, 1 reply; 59+ messages in thread
From: Stefan Hajnoczi @ 2013-06-03  9:21 UTC (permalink / raw)
  To: Eric Blake
  Cc: Kevin Wolf, Fam Zheng, qemu-devel, xiawenc, imain, Paolo Bonzini,
	dietmar

On Thu, May 30, 2013 at 07:11:25AM -0600, Eric Blake wrote:
> On 05/30/2013 06:34 AM, Stefan Hajnoczi wrote:
> > The Abort action can be used to test QMP 'transaction' failure.  Add it
> > as the last action to exercise the .abort() and .cleanup() code paths
> > for all previous actions.
> 
> Another thread questioned whether we should name this action
> __org.qemu-debug-abort, if only to be clear that it is not a normal
> interface.

kwolf: Do you want Abort to be namespaced as a debug-only action?

I think it's perfectly supportable so there's no need to hide it.
Granted it's rare that anyone would want to use this action.

Stefan

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 06/11] block: add drive-backup QMP command
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 06/11] block: add drive-backup QMP command Stefan Hajnoczi
@ 2013-06-03 17:09   ` Eric Blake
  2013-06-19 11:07   ` Kevin Wolf
  1 sibling, 0 replies; 59+ messages in thread
From: Eric Blake @ 2013-06-03 17:09 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Fam Zheng, qemu-devel, xiawenc, imain, Paolo Bonzini,
	dietmar

[-- Attachment #1: Type: text/plain, Size: 839 bytes --]

On 05/30/2013 06:34 AM, Stefan Hajnoczi wrote:
> @drive-backup
> 
> Start a point-in-time copy of a block device to a new destination.  The
> status of ongoing drive-backup operations can be checked with
> query-block-jobs where the BlockJobInfo.type field has the value 'backup'.
> The operation can be stopped before it has completed using the
> block-job-cancel command.
> 

> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  blockdev.c       | 97 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  qapi-schema.json | 46 +++++++++++++++++++++++++++
>  qmp-commands.hx  | 46 +++++++++++++++++++++++++++
>  3 files changed, 189 insertions(+)

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 04/11] blockdev: drop redundant proto_drv check
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 04/11] blockdev: drop redundant proto_drv check Stefan Hajnoczi
@ 2013-06-03 17:14   ` Eric Blake
  2013-06-06  5:00   ` Wenchao Xia
  2013-06-19 10:57   ` Kevin Wolf
  2 siblings, 0 replies; 59+ messages in thread
From: Eric Blake @ 2013-06-03 17:14 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Fam Zheng, qemu-devel, xiawenc, imain, Paolo Bonzini,
	dietmar

[-- Attachment #1: Type: text/plain, Size: 873 bytes --]

On 05/30/2013 06:34 AM, Stefan Hajnoczi wrote:
> It is not necessary to check that we can find a protocol block driver
> since we create or open the image file.  This produces the error that we
> need anyway.
> 
> Besides, the QERR_INVALID_BLOCK_FORMAT is inappropriate since the
> protocol is incorrect rather than the format.
> 
> Also drop an empty line between bdrv_open() and checking its return
> value.  This may be due to copy-pasting from earlier code that performed
> other operations before handling errors.
> 
> Reported-by: Kevin Wolf <kwolf@redhat.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  blockdev.c | 15 ---------------
>  1 file changed, 15 deletions(-)

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 05/11] blockdev: use bdrv_getlength() in qmp_drive_mirror()
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 05/11] blockdev: use bdrv_getlength() in qmp_drive_mirror() Stefan Hajnoczi
  2013-05-30 13:24   ` Paolo Bonzini
@ 2013-06-03 17:15   ` Eric Blake
  2013-06-19 10:58   ` Kevin Wolf
  2 siblings, 0 replies; 59+ messages in thread
From: Eric Blake @ 2013-06-03 17:15 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Fam Zheng, qemu-devel, xiawenc, imain, Paolo Bonzini,
	dietmar

[-- Attachment #1: Type: text/plain, Size: 497 bytes --]

On 05/30/2013 06:34 AM, Stefan Hajnoczi wrote:
> Use bdrv_getlength() for its byte units and error return instead of
> bdrv_get_geometry().
> 
> Reported-by: Kevin Wolf <kwolf@redhat.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  blockdev.c | 10 +++++++---
>  1 file changed, 7 insertions(+), 3 deletions(-)

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 09/11] blockdev: add DriveBackup transaction
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 09/11] blockdev: add DriveBackup transaction Stefan Hajnoczi
@ 2013-06-03 17:20   ` Eric Blake
  2013-06-19 11:17   ` Kevin Wolf
  1 sibling, 0 replies; 59+ messages in thread
From: Eric Blake @ 2013-06-03 17:20 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Fam Zheng, qemu-devel, xiawenc, imain, Paolo Bonzini,
	dietmar

[-- Attachment #1: Type: text/plain, Size: 1302 bytes --]

On 05/30/2013 06:34 AM, Stefan Hajnoczi wrote:
> This patch adds a transactional version of the drive-backup QMP command.
> It allows atomic snapshots of multiple drives along with automatic
> cleanup if there is a failure to start one of the backup jobs.
> 
> Note that QMP events are emitted for block job completion/cancellation
> and the block job will be listed by query-block-jobs.
> 

> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  blockdev.c       | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
>  qapi-schema.json | 40 +++++++++++++++++++++++++++++++++++++++-
>  2 files changed, 88 insertions(+), 1 deletion(-)

It would be nice if we could avoid copy-and-paste between the
transaction and the self-standing command, perhaps by teaching the qapi
code generator to recognize:

{ 'command': 'foo', 'data': 'DictType' }

as shorthand for

{ 'command': 'foo', 'data': { ...inlined contents of DictType... } }

But that is a cleanup that would apply to other existing qapi commands,
and is therefore better done as an independent series, and therefore
does not stop me from giving:

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 03/11] block: add basic backup support to block driver
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 03/11] block: add basic backup support to block driver Stefan Hajnoczi
@ 2013-06-06  3:56   ` Fam Zheng
  2013-06-06  8:05     ` Stefan Hajnoczi
  2013-06-17  3:43   ` Fam Zheng
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 59+ messages in thread
From: Fam Zheng @ 2013-06-06  3:56 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, qemu-devel, dietmar, imain, Paolo Bonzini, xiawenc

On Thu, 05/30 14:34, Stefan Hajnoczi wrote:
> +
> +static int coroutine_fn backup_before_write_notify(
> +        NotifierWithReturn *notifier,
> +        void *opaque)
> +{
> +    BdrvTrackedRequest *req = opaque;
> +
> +    return backup_do_cow(req->bs, req->sector_num, req->nb_sectors, NULL);
> +}

I'm wondering if we can see the logic here with a backing hd
relationship?  req->bs is a backing file of job->target, but guest is
going to write to it, so we need to COW down the data to job->target
before overwritting (i.e.  cluster is not allocated in child).

I think if we do this in block layer, there's not much necessity for a
before-write notifier here (although it may be useful for other cases):

    in bdrv_write:
    for child in req->bs->open_children
        if not child->is_allocated(req->sectors)
            do COW to child

The advantage of this is that we won't need to start block-backup job in
sync mode "none" to do point-in-time snapshot (image fleecing), and we
get writable snapshot (possibility to open backing file writable and
write to it safely) as a by-product.

But we will need to keep track of parent<->child of block states, and we
still need to take care of overlapping writing between block job and
guest request.

-- 
Fam

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 04/11] blockdev: drop redundant proto_drv check
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 04/11] blockdev: drop redundant proto_drv check Stefan Hajnoczi
  2013-06-03 17:14   ` Eric Blake
@ 2013-06-06  5:00   ` Wenchao Xia
  2013-06-19 10:57   ` Kevin Wolf
  2 siblings, 0 replies; 59+ messages in thread
From: Wenchao Xia @ 2013-06-06  5:00 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Fam Zheng, qemu-devel, imain, Paolo Bonzini, dietmar

于 2013-5-30 20:34, Stefan Hajnoczi 写道:
> It is not necessary to check that we can find a protocol block driver
> since we create or open the image file.  This produces the error that we
> need anyway.
> 
> Besides, the QERR_INVALID_BLOCK_FORMAT is inappropriate since the
> protocol is incorrect rather than the format.
> 
> Also drop an empty line between bdrv_open() and checking its return
> value.  This may be due to copy-pasting from earlier code that performed
> other operations before handling errors.
> 
> Reported-by: Kevin Wolf <kwolf@redhat.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>   blockdev.c | 15 ---------------
>   1 file changed, 15 deletions(-)
> 
> diff --git a/blockdev.c b/blockdev.c
> index b9b2d10..01db519 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -818,7 +818,6 @@ typedef struct ExternalSnapshotStates {
>   static void external_snapshot_prepare(BlkTransactionStates *common,
>                                         Error **errp)
>   {
> -    BlockDriver *proto_drv;
>       BlockDriver *drv;
>       int flags, ret;
>       Error *local_err = NULL;
> @@ -874,12 +873,6 @@ static void external_snapshot_prepare(BlkTransactionStates *common,
> 
>       flags = states->old_bs->open_flags;
> 
> -    proto_drv = bdrv_find_protocol(new_image_file);
> -    if (!proto_drv) {
> -        error_set(errp, QERR_INVALID_BLOCK_FORMAT, format);
> -        return;
> -    }
> -
>       /* create new image w/backing file */
>       if (mode != NEW_IMAGE_MODE_EXISTING) {
>           bdrv_img_create(new_image_file, format,
> @@ -1368,7 +1361,6 @@ void qmp_drive_mirror(const char *device, const char *target,
>   {
>       BlockDriverState *bs;
>       BlockDriverState *source, *target_bs;
> -    BlockDriver *proto_drv;
>       BlockDriver *drv = NULL;
>       Error *local_err = NULL;
>       int flags;
> @@ -1436,12 +1428,6 @@ void qmp_drive_mirror(const char *device, const char *target,
>           sync = MIRROR_SYNC_MODE_FULL;
>       }
> 
> -    proto_drv = bdrv_find_protocol(target);
> -    if (!proto_drv) {
> -        error_set(errp, QERR_INVALID_BLOCK_FORMAT, format);
> -        return;
> -    }
> -
>       bdrv_get_geometry(bs, &size);
>       size *= 512;
>       if (sync == MIRROR_SYNC_MODE_FULL && mode != NEW_IMAGE_MODE_EXISTING) {
> @@ -1476,7 +1462,6 @@ void qmp_drive_mirror(const char *device, const char *target,
>        */
>       target_bs = bdrv_new("");
>       ret = bdrv_open(target_bs, target, NULL, flags | BDRV_O_NO_BACKING, drv);
> -
>       if (ret < 0) {
>           bdrv_delete(target_bs);
>           error_set(errp, QERR_OPEN_FILE_FAILED, target);
> 
  Although it is not merged yet, I'll use it as pre-patches in my
series which add internal snapshot support in qmp_transaction.

Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>

-- 
Best Regards

Wenchao Xia

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 00/11] block: drive-backup live backup command
  2013-05-30 12:34 [Qemu-devel] [PATCH v5 00/11] block: drive-backup live backup command Stefan Hajnoczi
                   ` (10 preceding siblings ...)
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 11/11] qemu-iotests: add 055 drive-backup test case Stefan Hajnoczi
@ 2013-06-06  5:06 ` Wenchao Xia
  11 siblings, 0 replies; 59+ messages in thread
From: Wenchao Xia @ 2013-06-06  5:06 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Fam Zheng, qemu-devel, imain, Paolo Bonzini, dietmar

于 2013-5-30 20:34, Stefan Hajnoczi 写道:
> Note: These patches apply to kevin/block.  You can also grab the code from git
> here:
> git://github.com/stefanha/qemu.git block-backup-core
> 
> This series adds a new QMP command, drive-backup, which takes a point-in-time
> snapshot of a block device.  The snapshot is copied out to a target block
> device.  A simple example is:
> 
>    drive-backup device=virtio0 format=qcow2 target=backup-20130401.qcow2
> 
> The original drive-backup blockjob was written by Dietmar Maurer
> <dietmar@proxmox.com>.  He is currently busy but I feel the feature is worth
> pushing into QEMU since there has been interest.  This is my version of his
> patch, plus the QMP command and qemu-iotests test case.
> 
> QMP 'transaction' support is included since v3.  It adds support for atomic
> snapshots of multiple block devices.  I also added an 'abort' transaction to
> allow testing of the .abort()/.cleanup() code path.  Thanks to Wenchao for
> making qmp_transaction() extensible.
> 
> How is this different from block-stream and drive-mirror?
> ---------------------------------------------------------
> Both block-stream and drive-mirror do not provide immediate point-in-time
> snapshots.  Instead they copy data into a new file and then switch to it.  In
> other words, the point at which the "snapshot" is taken cannot be controlled
> directly.
> 
> drive-backup intercepts guest writes and saves data into the target block
> device before it is overwritten.  The target block device can be a raw image
> file, backing files are not used to implement this feature.
> 
> How can drive-backup be used?
> -----------------------------
> The simplest use-case is to copy a point-in-time snapshot to a local file.
> 
> More advanced users may wish to make the target an NBD URL.  The NBD server
> listening on the other side can process the backup writes any way it wishes.  I
> previously posted an RFC series with a backup server that streamed Dietmar's
> VMA backup archive format.
> 
> What's next for drive-backup?
> -----------------------------
> 1. Sync modes like drive-mirror (top, full, none).  This makes it possible to
>     preserve the backing file chain.
> 
> v5:
>   * Use bdrv_co_write_zeroes(job->target) [kwolf]
> 
>     This change means that we write zeroes over NBD again.  The optimization can
>     be reintroduced by skipping zeroes when bdrv_has_zero_init() is true and the
>     sectors are allocated.  Leave that for a future series, if we decide to do
>     this optimization again because it may require extra block driver
>     configuration to indicate that an image has zero init.
> 
>   * iostatus error handling [kwolf/pbonzini]
> 
>     Add configurable on-source-error and on-target-error actions just like
>     drive-mirror.  These are used when the block job coroutine hits an error.
>     If we are in guest write request context, return the errno and let the usual
>     guest error handling take over.
> 
>   * Allow BdrvActionOps->commit() to be NULL [eblake]
>   * Use bdrv_getlength() in qmp_drive_mirror() [kwolf]
>   * Drop redundant proto_drv check [kwolf]
>   * Fix outdated DPRINTF() function names
>   * Drop comment about non-existent bitmap coroutine race [kwolf]
>   * Rename BACKUP_SECTORS_PER_CLUSTER [kwolf]
>   * Fix completion when image is not multiple of backup cluster size [fam]
> 
> v4:
>   * Use notifier lists and BdrvTrackedRequest instead of custom callbacks [bonzini]
>   * Add drive-backup QMP example JSON [eblake]
>   * Add "Since: 1.6" to QMP schema changes [eblake]
> 
> v3:
>   * Rename to drive-backup for consistency with drive-mirror [kwolf]
>   * Add QMP transaction support [kwolf]
>   * Introduce bdrv_add_before_write_cb() to hook writes
>   * Mention 'query-block-jobs' lists job of type 'backup' [eblake]
>   * Rename rwlock to flush_rwlock [kwolf]
>   * Fix space in block/backup.c comment [kwolf]
> 
> v2:
>   * s/block_backup/block-backup/ in commit message [eblake]
>   * Avoid funny spacing in QMP docs [eblake]
>   * Document query-block-jobs and block-job-cancel usage [eblake]
> 
> Dietmar Maurer (1):
>    block: add basic backup support to block driver
> 
> Stefan Hajnoczi (10):
>    notify: add NotiferWithReturn so notifier list can abort
>    block: add bdrv_add_before_write_notifier()
>    blockdev: drop redundant proto_drv check
>    blockdev: use bdrv_getlength() in qmp_drive_mirror()
>    block: add drive-backup QMP command
>    blockdev: rename BlkTransactionStates to singular
>    blockdev: allow BdrvActionOps->commit() to be NULL
>    blockdev: add DriveBackup transaction
>    blockdev: add Abort transaction
>    qemu-iotests: add 055 drive-backup test case
> 
>   block.c                    |  23 +--
>   block/Makefile.objs        |   1 +
>   block/backup.c             | 355 +++++++++++++++++++++++++++++++++++++++++++++
>   blockdev.c                 | 288 +++++++++++++++++++++++++++---------
>   include/block/block_int.h  |  42 +++++-
>   include/qemu/notify.h      |  25 ++++
>   qapi-schema.json           |  97 ++++++++++++-
>   qmp-commands.hx            |  46 ++++++
>   tests/qemu-iotests/055     | 256 ++++++++++++++++++++++++++++++++
>   tests/qemu-iotests/055.out |   5 +
>   tests/qemu-iotests/group   |   1 +
>   util/notify.c              |  30 ++++
>   12 files changed, 1088 insertions(+), 81 deletions(-)
>   create mode 100644 block/backup.c
>   create mode 100755 tests/qemu-iotests/055
>   create mode 100644 tests/qemu-iotests/055.out
> 
  Reviewed patch 4, 7, 8 and they seems good, will use them in my
following work.

-- 
Best Regards

Wenchao Xia

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 03/11] block: add basic backup support to block driver
  2013-06-06  3:56   ` Fam Zheng
@ 2013-06-06  8:05     ` Stefan Hajnoczi
  2013-06-06  8:56       ` Fam Zheng
  0 siblings, 1 reply; 59+ messages in thread
From: Stefan Hajnoczi @ 2013-06-06  8:05 UTC (permalink / raw)
  To: Stefan Hajnoczi, qemu-devel, Kevin Wolf, dietmar, imain,
	Paolo Bonzini, xiawenc, Eric Blake

On Thu, Jun 06, 2013 at 11:56:18AM +0800, Fam Zheng wrote:
> On Thu, 05/30 14:34, Stefan Hajnoczi wrote:
> > +
> > +static int coroutine_fn backup_before_write_notify(
> > +        NotifierWithReturn *notifier,
> > +        void *opaque)
> > +{
> > +    BdrvTrackedRequest *req = opaque;
> > +
> > +    return backup_do_cow(req->bs, req->sector_num, req->nb_sectors, NULL);
> > +}
> 
> I'm wondering if we can see the logic here with a backing hd
> relationship?  req->bs is a backing file of job->target, but guest is
> going to write to it, so we need to COW down the data to job->target
> before overwritting (i.e.  cluster is not allocated in child).
>
> I think if we do this in block layer, there's not much necessity for a
> before-write notifier here (although it may be useful for other cases):
> 
>     in bdrv_write:
>     for child in req->bs->open_children
>         if not child->is_allocated(req->sectors)
>             do COW to child
> 
> The advantage of this is that we won't need to start block-backup job in
> sync mode "none" to do point-in-time snapshot (image fleecing), and we
> get writable snapshot (possibility to open backing file writable and
> write to it safely) as a by-product.
> 
> But we will need to keep track of parent<->child of block states, and we
> still need to take care of overlapping writing between block job and
> guest request.

There's one catch here: bs->target may not support backing files, it can
be a raw file, for example.  We'll only use backing files for
point-in-time snapshots but other use cases might not.  raw doesn't
really implement is_allocated(), so the whole concept would have to
change a little:

bs->open_children becomes independent of backing files - any
BlockDriverState can be added to this list.  ->is_allocated() basically
becomes the bitmap that we keep in the block job.

In the end I'm not sure there is much advantage since we need
backup_do_cow() and the overlapping request code anyway for the block
job.

Stefan

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 03/11] block: add basic backup support to block driver
  2013-06-06  8:05     ` Stefan Hajnoczi
@ 2013-06-06  8:56       ` Fam Zheng
  2013-06-07  7:18         ` Stefan Hajnoczi
  0 siblings, 1 reply; 59+ messages in thread
From: Fam Zheng @ 2013-06-06  8:56 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, qemu-devel, xiawenc, imain, Stefan Hajnoczi,
	Paolo Bonzini, dietmar

On Thu, 06/06 10:05, Stefan Hajnoczi wrote:
> On Thu, Jun 06, 2013 at 11:56:18AM +0800, Fam Zheng wrote:
> > On Thu, 05/30 14:34, Stefan Hajnoczi wrote:
> > > +
> > > +static int coroutine_fn backup_before_write_notify(
> > > +        NotifierWithReturn *notifier,
> > > +        void *opaque)
> > > +{
> > > +    BdrvTrackedRequest *req = opaque;
> > > +
> > > +    return backup_do_cow(req->bs, req->sector_num, req->nb_sectors, NULL);
> > > +}
> > 
> > I'm wondering if we can see the logic here with a backing hd
> > relationship?  req->bs is a backing file of job->target, but guest is
> > going to write to it, so we need to COW down the data to job->target
> > before overwritting (i.e.  cluster is not allocated in child).
> >
> > I think if we do this in block layer, there's not much necessity for a
> > before-write notifier here (although it may be useful for other cases):
> > 
> >     in bdrv_write:
> >     for child in req->bs->open_children
> >         if not child->is_allocated(req->sectors)
> >             do COW to child
> > 
> > The advantage of this is that we won't need to start block-backup job in
> > sync mode "none" to do point-in-time snapshot (image fleecing), and we
> > get writable snapshot (possibility to open backing file writable and
> > write to it safely) as a by-product.
> > 
> > But we will need to keep track of parent<->child of block states, and we
> > still need to take care of overlapping writing between block job and
> > guest request.
> 
> There's one catch here: bs->target may not support backing files, it can
> be a raw file, for example.  We'll only use backing files for
> point-in-time snapshots but other use cases might not.  raw doesn't
> really implement is_allocated(), so the whole concept would have to
> change a little:

Another use case may be parent modification. Suppose we have

                    ,--- child1.qcow2
    parent.qcow2  <
                    `--- child2.qcow2

We can use parent.qcow2 as block device in QEMU without breaking
child1.qcow2 or child2.qcow2 by telling QEMU who its children are:

  $QEMU -drive file=parent.qcow2,children=child1.qcow2:child2.qcow2

Then we open the three images and setup parent_bs->open_children, the
children are protected from being corrupted.

> 
> bs->open_children becomes independent of backing files - any
> BlockDriverState can be added to this list.  ->is_allocated() basically
> becomes the bitmap that we keep in the block job.

Yes. But it is possible to keep a bitmap for raw (and those don't
implement is_allocated()) in block layer too, or in overlay: could
add-cow by Dongxu Wang help here?

> 
> In the end I'm not sure there is much advantage since we need
> backup_do_cow() and the overlapping request code anyway for the block
> job.

-- 
Fam

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 03/11] block: add basic backup support to block driver
  2013-06-06  8:56       ` Fam Zheng
@ 2013-06-07  7:18         ` Stefan Hajnoczi
  2013-06-13  6:03           ` Wenchao Xia
  0 siblings, 1 reply; 59+ messages in thread
From: Stefan Hajnoczi @ 2013-06-07  7:18 UTC (permalink / raw)
  To: Stefan Hajnoczi, qemu-devel, Kevin Wolf, dietmar, imain,
	Paolo Bonzini, xiawenc, Eric Blake

On Thu, Jun 06, 2013 at 04:56:49PM +0800, Fam Zheng wrote:
> On Thu, 06/06 10:05, Stefan Hajnoczi wrote:
> > On Thu, Jun 06, 2013 at 11:56:18AM +0800, Fam Zheng wrote:
> > > On Thu, 05/30 14:34, Stefan Hajnoczi wrote:
> > > > +
> > > > +static int coroutine_fn backup_before_write_notify(
> > > > +        NotifierWithReturn *notifier,
> > > > +        void *opaque)
> > > > +{
> > > > +    BdrvTrackedRequest *req = opaque;
> > > > +
> > > > +    return backup_do_cow(req->bs, req->sector_num, req->nb_sectors, NULL);
> > > > +}
> > > 
> > > I'm wondering if we can see the logic here with a backing hd
> > > relationship?  req->bs is a backing file of job->target, but guest is
> > > going to write to it, so we need to COW down the data to job->target
> > > before overwritting (i.e.  cluster is not allocated in child).
> > >
> > > I think if we do this in block layer, there's not much necessity for a
> > > before-write notifier here (although it may be useful for other cases):
> > > 
> > >     in bdrv_write:
> > >     for child in req->bs->open_children
> > >         if not child->is_allocated(req->sectors)
> > >             do COW to child
> > > 
> > > The advantage of this is that we won't need to start block-backup job in
> > > sync mode "none" to do point-in-time snapshot (image fleecing), and we
> > > get writable snapshot (possibility to open backing file writable and
> > > write to it safely) as a by-product.
> > > 
> > > But we will need to keep track of parent<->child of block states, and we
> > > still need to take care of overlapping writing between block job and
> > > guest request.
> > 
> > There's one catch here: bs->target may not support backing files, it can
> > be a raw file, for example.  We'll only use backing files for
> > point-in-time snapshots but other use cases might not.  raw doesn't
> > really implement is_allocated(), so the whole concept would have to
> > change a little:
> 
> Another use case may be parent modification. Suppose we have
> 
>                     ,--- child1.qcow2
>     parent.qcow2  <
>                     `--- child2.qcow2
> 
> We can use parent.qcow2 as block device in QEMU without breaking
> child1.qcow2 or child2.qcow2 by telling QEMU who its children are:
> 
>   $QEMU -drive file=parent.qcow2,children=child1.qcow2:child2.qcow2
> 
> Then we open the three images and setup parent_bs->open_children, the
> children are protected from being corrupted.
> 
> > 
> > bs->open_children becomes independent of backing files - any
> > BlockDriverState can be added to this list.  ->is_allocated() basically
> > becomes the bitmap that we keep in the block job.
> 
> Yes. But it is possible to keep a bitmap for raw (and those don't
> implement is_allocated()) in block layer too, or in overlay: could
> add-cow by Dongxu Wang help here?

Yes absolutely.

Stefan

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 03/11] block: add basic backup support to block driver
  2013-06-07  7:18         ` Stefan Hajnoczi
@ 2013-06-13  6:03           ` Wenchao Xia
  2013-06-13  6:07             ` Wenchao Xia
  0 siblings, 1 reply; 59+ messages in thread
From: Wenchao Xia @ 2013-06-13  6:03 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Stefan Hajnoczi, qemu-devel, imain, Paolo Bonzini, dietmar

于 2013-6-7 15:18, Stefan Hajnoczi 写道:
> On Thu, Jun 06, 2013 at 04:56:49PM +0800, Fam Zheng wrote:
>> On Thu, 06/06 10:05, Stefan Hajnoczi wrote:
>>> On Thu, Jun 06, 2013 at 11:56:18AM +0800, Fam Zheng wrote:
>>>> On Thu, 05/30 14:34, Stefan Hajnoczi wrote:
>>>>> +
>>>>> +static int coroutine_fn backup_before_write_notify(
>>>>> +        NotifierWithReturn *notifier,
>>>>> +        void *opaque)
>>>>> +{
>>>>> +    BdrvTrackedRequest *req = opaque;
>>>>> +
>>>>> +    return backup_do_cow(req->bs, req->sector_num, req->nb_sectors, NULL);
>>>>> +}
>>>>
>>>> I'm wondering if we can see the logic here with a backing hd
>>>> relationship?  req->bs is a backing file of job->target, but guest is
>>>> going to write to it, so we need to COW down the data to job->target
>>>> before overwritting (i.e.  cluster is not allocated in child).
>>>>
>>>> I think if we do this in block layer, there's not much necessity for a
>>>> before-write notifier here (although it may be useful for other cases):
>>>>
>>>>      in bdrv_write:
>>>>      for child in req->bs->open_children
>>>>          if not child->is_allocated(req->sectors)
>>>>              do COW to child
>>>>
>>>> The advantage of this is that we won't need to start block-backup job in
>>>> sync mode "none" to do point-in-time snapshot (image fleecing), and we
>>>> get writable snapshot (possibility to open backing file writable and
>>>> write to it safely) as a by-product.
>>>>
>>>> But we will need to keep track of parent<->child of block states, and we
>>>> still need to take care of overlapping writing between block job and
>>>> guest request.
>>>
>>> There's one catch here: bs->target may not support backing files, it can
>>> be a raw file, for example.  We'll only use backing files for
>>> point-in-time snapshots but other use cases might not.  raw doesn't
>>> really implement is_allocated(), so the whole concept would have to
>>> change a little:
>>
>> Another use case may be parent modification. Suppose we have
>>
>>                      ,--- child1.qcow2
>>      parent.qcow2  <
>>                      `--- child2.qcow2
>>
>> We can use parent.qcow2 as block device in QEMU without breaking
>> child1.qcow2 or child2.qcow2 by telling QEMU who its children are:
>>
>>    $QEMU -drive file=parent.qcow2,children=child1.qcow2:child2.qcow2
>>
>> Then we open the three images and setup parent_bs->open_children, the
>> children are protected from being corrupted.
>>
>>>
>>> bs->open_children becomes independent of backing files - any
>>> BlockDriverState can be added to this list.  ->is_allocated() basically
>>> becomes the bitmap that we keep in the block job.
>>
>> Yes. But it is possible to keep a bitmap for raw (and those don't
>> implement is_allocated()) in block layer too, or in overlay: could
>> add-cow by Dongxu Wang help here?
>
> Yes absolutely.
>
> Stefan
>
   One advantage of external backup, or backing up chain, is that it
holds 'Delta' data only and is small enough. If it is changed toward a
'full' data writable snapshot, it become bigger. With backup chain
qemu-img can restore/clone a writable and usable one, So I don't
think adding that in qemu emulator helps much, and it will make things
more complicit.... user won't care who is doing the job, qemu or
qemu-img.

-- 
Best Regards

Wenchao Xia

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 03/11] block: add basic backup support to block driver
  2013-06-13  6:03           ` Wenchao Xia
@ 2013-06-13  6:07             ` Wenchao Xia
  2013-06-13  6:33               ` Fam Zheng
  0 siblings, 1 reply; 59+ messages in thread
From: Wenchao Xia @ 2013-06-13  6:07 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Stefan Hajnoczi, qemu-devel, imain, Paolo Bonzini, dietmar

于 2013-6-13 14:03, Wenchao Xia 写道:
> 于 2013-6-7 15:18, Stefan Hajnoczi 写道:
>> On Thu, Jun 06, 2013 at 04:56:49PM +0800, Fam Zheng wrote:
>>> On Thu, 06/06 10:05, Stefan Hajnoczi wrote:
>>>> On Thu, Jun 06, 2013 at 11:56:18AM +0800, Fam Zheng wrote:
>>>>> On Thu, 05/30 14:34, Stefan Hajnoczi wrote:
>>>>>> +
>>>>>> +static int coroutine_fn backup_before_write_notify(
>>>>>> +        NotifierWithReturn *notifier,
>>>>>> +        void *opaque)
>>>>>> +{
>>>>>> +    BdrvTrackedRequest *req = opaque;
>>>>>> +
>>>>>> +    return backup_do_cow(req->bs, req->sector_num,
>>>>>> req->nb_sectors, NULL);
>>>>>> +}
>>>>>
>>>>> I'm wondering if we can see the logic here with a backing hd
>>>>> relationship?  req->bs is a backing file of job->target, but guest is
>>>>> going to write to it, so we need to COW down the data to job->target
>>>>> before overwritting (i.e.  cluster is not allocated in child).
>>>>>
>>>>> I think if we do this in block layer, there's not much necessity for a
>>>>> before-write notifier here (although it may be useful for other
>>>>> cases):
>>>>>
>>>>>      in bdrv_write:
>>>>>      for child in req->bs->open_children
>>>>>          if not child->is_allocated(req->sectors)
>>>>>              do COW to child
>>>>>
>>>>> The advantage of this is that we won't need to start block-backup
>>>>> job in
>>>>> sync mode "none" to do point-in-time snapshot (image fleecing), and we
>>>>> get writable snapshot (possibility to open backing file writable and
>>>>> write to it safely) as a by-product.
>>>>>
>>>>> But we will need to keep track of parent<->child of block states,
>>>>> and we
>>>>> still need to take care of overlapping writing between block job and
>>>>> guest request.
>>>>
>>>> There's one catch here: bs->target may not support backing files, it
>>>> can
>>>> be a raw file, for example.  We'll only use backing files for
>>>> point-in-time snapshots but other use cases might not.  raw doesn't
>>>> really implement is_allocated(), so the whole concept would have to
>>>> change a little:
>>>
>>> Another use case may be parent modification. Suppose we have
>>>
>>>                      ,--- child1.qcow2
>>>      parent.qcow2  <
>>>                      `--- child2.qcow2
>>>
>>> We can use parent.qcow2 as block device in QEMU without breaking
>>> child1.qcow2 or child2.qcow2 by telling QEMU who its children are:
>>>
>>>    $QEMU -drive file=parent.qcow2,children=child1.qcow2:child2.qcow2
>>>
>>> Then we open the three images and setup parent_bs->open_children, the
>>> children are protected from being corrupted.
>>>
>>>>
>>>> bs->open_children becomes independent of backing files - any
>>>> BlockDriverState can be added to this list.  ->is_allocated() basically
>>>> becomes the bitmap that we keep in the block job.
>>>
>>> Yes. But it is possible to keep a bitmap for raw (and those don't
>>> implement is_allocated()) in block layer too, or in overlay: could
>>> add-cow by Dongxu Wang help here?
>>
>> Yes absolutely.
>>
>> Stefan
>>
>    One advantage of external backup, or backing up chain, is that it
> holds 'Delta' data only and is small enough. If it is changed toward a
> 'full' data writable snapshot, it become bigger. With backup chain
> qemu-img can restore/clone a writable and usable one, So I don't
> think adding that in qemu emulator helps much, and it will make things
> more complicit.... user won't care who is doing the job, qemu or
> qemu-img.
>
   I mean that "get writable snapshot (possibility to open backing file
writable and write to it safely) as a by-product." in this series, is
not very valuable.

-- 
Best Regards

Wenchao Xia

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 03/11] block: add basic backup support to block driver
  2013-06-13  6:07             ` Wenchao Xia
@ 2013-06-13  6:33               ` Fam Zheng
  2013-06-13  8:02                 ` Wenchao Xia
  2013-06-13  8:51                 ` Stefan Hajnoczi
  0 siblings, 2 replies; 59+ messages in thread
From: Fam Zheng @ 2013-06-13  6:33 UTC (permalink / raw)
  To: Wenchao Xia
  Cc: Kevin Wolf, Stefan Hajnoczi, qemu-devel, imain, Stefan Hajnoczi,
	Paolo Bonzini, dietmar

On Thu, 06/13 14:07, Wenchao Xia wrote:
> 于 2013-6-13 14:03, Wenchao Xia 写道:
> >于 2013-6-7 15:18, Stefan Hajnoczi 写道:
> >>On Thu, Jun 06, 2013 at 04:56:49PM +0800, Fam Zheng wrote:
> >>>On Thu, 06/06 10:05, Stefan Hajnoczi wrote:
> >>>>On Thu, Jun 06, 2013 at 11:56:18AM +0800, Fam Zheng wrote:
> >>>>>On Thu, 05/30 14:34, Stefan Hajnoczi wrote:
> >>>>>>+
> >>>>>>+static int coroutine_fn backup_before_write_notify(
> >>>>>>+        NotifierWithReturn *notifier,
> >>>>>>+        void *opaque)
> >>>>>>+{
> >>>>>>+    BdrvTrackedRequest *req = opaque;
> >>>>>>+
> >>>>>>+    return backup_do_cow(req->bs, req->sector_num,
> >>>>>>req->nb_sectors, NULL);
> >>>>>>+}
> >>>>>
> >>>>>I'm wondering if we can see the logic here with a backing hd
> >>>>>relationship?  req->bs is a backing file of job->target, but guest is
> >>>>>going to write to it, so we need to COW down the data to job->target
> >>>>>before overwritting (i.e.  cluster is not allocated in child).
> >>>>>
> >>>>>I think if we do this in block layer, there's not much necessity for a
> >>>>>before-write notifier here (although it may be useful for other
> >>>>>cases):
> >>>>>
> >>>>>     in bdrv_write:
> >>>>>     for child in req->bs->open_children
> >>>>>         if not child->is_allocated(req->sectors)
> >>>>>             do COW to child
> >>>>>
> >>>>>The advantage of this is that we won't need to start block-backup
> >>>>>job in
> >>>>>sync mode "none" to do point-in-time snapshot (image fleecing), and we
> >>>>>get writable snapshot (possibility to open backing file writable and
> >>>>>write to it safely) as a by-product.
> >>>>>
> >>>>>But we will need to keep track of parent<->child of block states,
> >>>>>and we
> >>>>>still need to take care of overlapping writing between block job and
> >>>>>guest request.
> >>>>
> >>>>There's one catch here: bs->target may not support backing files, it
> >>>>can
> >>>>be a raw file, for example.  We'll only use backing files for
> >>>>point-in-time snapshots but other use cases might not.  raw doesn't
> >>>>really implement is_allocated(), so the whole concept would have to
> >>>>change a little:
> >>>
> >>>Another use case may be parent modification. Suppose we have
> >>>
> >>>                     ,--- child1.qcow2
> >>>     parent.qcow2  <
> >>>                     `--- child2.qcow2
> >>>
> >>>We can use parent.qcow2 as block device in QEMU without breaking
> >>>child1.qcow2 or child2.qcow2 by telling QEMU who its children are:
> >>>
> >>>   $QEMU -drive file=parent.qcow2,children=child1.qcow2:child2.qcow2
> >>>
> >>>Then we open the three images and setup parent_bs->open_children, the
> >>>children are protected from being corrupted.
> >>>
> >>>>
> >>>>bs->open_children becomes independent of backing files - any
> >>>>BlockDriverState can be added to this list.  ->is_allocated() basically
> >>>>becomes the bitmap that we keep in the block job.
> >>>
> >>>Yes. But it is possible to keep a bitmap for raw (and those don't
> >>>implement is_allocated()) in block layer too, or in overlay: could
> >>>add-cow by Dongxu Wang help here?
> >>
> >>Yes absolutely.
> >>
> >>Stefan
> >>
> >   One advantage of external backup, or backing up chain, is that it
> >holds 'Delta' data only and is small enough. If it is changed toward a
> >'full' data writable snapshot, it become bigger. With backup chain
> >qemu-img can restore/clone a writable and usable one, So I don't
> >think adding that in qemu emulator helps much, and it will make things
> >more complicit.... user won't care who is doing the job, qemu or
> >qemu-img.
> >
>   I mean that "get writable snapshot (possibility to open backing file
> writable and write to it safely) as a by-product." in this series, is
> not very valuable.
> 

I'm not selling writable snapshot, my point was just that semantic of
block-backup, getting a point-in-time snapshot, inherently works like a
backing chain but writting to parent (guest drive) will not break its
children (our thin PIT snapshot). If we see it this way, COW is not so
specific to a block job like block-backup, it can be generic in the
backing chain logic.

Though, the value in a writable snapshot is that we can actually
_modify_ a backing image in place, rather than forking the chain to
write to the new child. This is not supported with qemu or qemu-img now,
once you create a child with the image as backing file, you mustn't
modify it.

-- 
Fam

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 03/11] block: add basic backup support to block driver
  2013-06-13  6:33               ` Fam Zheng
@ 2013-06-13  8:02                 ` Wenchao Xia
  2013-06-13  8:51                 ` Stefan Hajnoczi
  1 sibling, 0 replies; 59+ messages in thread
From: Wenchao Xia @ 2013-06-13  8:02 UTC (permalink / raw)
  To: Stefan Hajnoczi, Kevin Wolf, Stefan Hajnoczi, qemu-devel, imain,
	Paolo Bonzini, dietmar, Fam Zheng

于 2013-6-13 14:33, Fam Zheng 写道:
> On Thu, 06/13 14:07, Wenchao Xia wrote:
>> 于 2013-6-13 14:03, Wenchao Xia 写道:
>>> 于 2013-6-7 15:18, Stefan Hajnoczi 写道:
>>>> On Thu, Jun 06, 2013 at 04:56:49PM +0800, Fam Zheng wrote:
>>>>> On Thu, 06/06 10:05, Stefan Hajnoczi wrote:
>>>>>> On Thu, Jun 06, 2013 at 11:56:18AM +0800, Fam Zheng wrote:
>>>>>>> On Thu, 05/30 14:34, Stefan Hajnoczi wrote:
>>>>>>>> +
>>>>>>>> +static int coroutine_fn backup_before_write_notify(
>>>>>>>> +        NotifierWithReturn *notifier,
>>>>>>>> +        void *opaque)
>>>>>>>> +{
>>>>>>>> +    BdrvTrackedRequest *req = opaque;
>>>>>>>> +
>>>>>>>> +    return backup_do_cow(req->bs, req->sector_num,
>>>>>>>> req->nb_sectors, NULL);
>>>>>>>> +}
>>>>>>>
>>>>>>> I'm wondering if we can see the logic here with a backing hd
>>>>>>> relationship?  req->bs is a backing file of job->target, but guest is
>>>>>>> going to write to it, so we need to COW down the data to job->target
>>>>>>> before overwritting (i.e.  cluster is not allocated in child).
>>>>>>>
>>>>>>> I think if we do this in block layer, there's not much necessity for a
>>>>>>> before-write notifier here (although it may be useful for other
>>>>>>> cases):
>>>>>>>
>>>>>>>      in bdrv_write:
>>>>>>>      for child in req->bs->open_children
>>>>>>>          if not child->is_allocated(req->sectors)
>>>>>>>              do COW to child
>>>>>>>
>>>>>>> The advantage of this is that we won't need to start block-backup
>>>>>>> job in
>>>>>>> sync mode "none" to do point-in-time snapshot (image fleecing), and we
>>>>>>> get writable snapshot (possibility to open backing file writable and
>>>>>>> write to it safely) as a by-product.
>>>>>>>
>>>>>>> But we will need to keep track of parent<->child of block states,
>>>>>>> and we
>>>>>>> still need to take care of overlapping writing between block job and
>>>>>>> guest request.
>>>>>>
>>>>>> There's one catch here: bs->target may not support backing files, it
>>>>>> can
>>>>>> be a raw file, for example.  We'll only use backing files for
>>>>>> point-in-time snapshots but other use cases might not.  raw doesn't
>>>>>> really implement is_allocated(), so the whole concept would have to
>>>>>> change a little:
>>>>>
>>>>> Another use case may be parent modification. Suppose we have
>>>>>
>>>>>                      ,--- child1.qcow2
>>>>>      parent.qcow2  <
>>>>>                      `--- child2.qcow2
>>>>>
>>>>> We can use parent.qcow2 as block device in QEMU without breaking
>>>>> child1.qcow2 or child2.qcow2 by telling QEMU who its children are:
>>>>>
>>>>>    $QEMU -drive file=parent.qcow2,children=child1.qcow2:child2.qcow2
>>>>>
>>>>> Then we open the three images and setup parent_bs->open_children, the
>>>>> children are protected from being corrupted.
>>>>>
>>>>>>
>>>>>> bs->open_children becomes independent of backing files - any
>>>>>> BlockDriverState can be added to this list.  ->is_allocated() basically
>>>>>> becomes the bitmap that we keep in the block job.
>>>>>
>>>>> Yes. But it is possible to keep a bitmap for raw (and those don't
>>>>> implement is_allocated()) in block layer too, or in overlay: could
>>>>> add-cow by Dongxu Wang help here?
>>>>
>>>> Yes absolutely.
>>>>
>>>> Stefan
>>>>
>>>    One advantage of external backup, or backing up chain, is that it
>>> holds 'Delta' data only and is small enough. If it is changed toward a
>>> 'full' data writable snapshot, it become bigger. With backup chain
>>> qemu-img can restore/clone a writable and usable one, So I don't
>>> think adding that in qemu emulator helps much, and it will make things
>>> more complicit.... user won't care who is doing the job, qemu or
>>> qemu-img.
>>>
>>    I mean that "get writable snapshot (possibility to open backing file
>> writable and write to it safely) as a by-product." in this series, is
>> not very valuable.
>>
>
> I'm not selling writable snapshot, my point was just that semantic of
> block-backup, getting a point-in-time snapshot, inherently works like a
> backing chain but writting to parent (guest drive) will not break its
> children (our thin PIT snapshot). If we see it this way, COW is not so
> specific to a block job like block-backup, it can be generic in the
> backing chain logic.
>
   OK, similar thing happens in drive-mirror, if you treat it as
backing chain. To do it, following assumption need to be removed:
1 top *bs is the active one.
2 guest write request goes only to top *bs.
3 *bs->backing_hd do not care about *bs(also a hidden change: maybe
image should remember its child as well as parent.)

   Actually it will change the chain relationship from one direction of
top-down into both direction. I think a separate series is needed to
do that.

> Though, the value in a writable snapshot is that we can actually
> _modify_ a backing image in place, rather than forking the chain to

   I can't see a good user case requiring modifying backing image in
place, to me snapshot means a read only one with time stamp in the past.
Personally I think forbid modification of of snapshot is correct,
in construction there should be pre-defined concepts to avoid
chaos in architecture.

> write to the new child. This is not supported with qemu or qemu-img now,
> once you create a child with the image as backing file, you mustn't
> modify it.
>




-- 
Best Regards

Wenchao Xia

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 03/11] block: add basic backup support to block driver
  2013-06-13  6:33               ` Fam Zheng
  2013-06-13  8:02                 ` Wenchao Xia
@ 2013-06-13  8:51                 ` Stefan Hajnoczi
  1 sibling, 0 replies; 59+ messages in thread
From: Stefan Hajnoczi @ 2013-06-13  8:51 UTC (permalink / raw)
  To: Wenchao Xia, Stefan Hajnoczi, Kevin Wolf, qemu-devel, imain,
	Paolo Bonzini, dietmar

On Thu, Jun 13, 2013 at 02:33:40PM +0800, Fam Zheng wrote:
> On Thu, 06/13 14:07, Wenchao Xia wrote:
> > 于 2013-6-13 14:03, Wenchao Xia 写道:
> > >于 2013-6-7 15:18, Stefan Hajnoczi 写道:
> > >>On Thu, Jun 06, 2013 at 04:56:49PM +0800, Fam Zheng wrote:
> > >>>On Thu, 06/06 10:05, Stefan Hajnoczi wrote:
> > >>>>On Thu, Jun 06, 2013 at 11:56:18AM +0800, Fam Zheng wrote:
> > >>>>>On Thu, 05/30 14:34, Stefan Hajnoczi wrote:
> > >>>>>>+
> > >>>>>>+static int coroutine_fn backup_before_write_notify(
> > >>>>>>+        NotifierWithReturn *notifier,
> > >>>>>>+        void *opaque)
> > >>>>>>+{
> > >>>>>>+    BdrvTrackedRequest *req = opaque;
> > >>>>>>+
> > >>>>>>+    return backup_do_cow(req->bs, req->sector_num,
> > >>>>>>req->nb_sectors, NULL);
> > >>>>>>+}
> > >>>>>
> > >>>>>I'm wondering if we can see the logic here with a backing hd
> > >>>>>relationship?  req->bs is a backing file of job->target, but guest is
> > >>>>>going to write to it, so we need to COW down the data to job->target
> > >>>>>before overwritting (i.e.  cluster is not allocated in child).
> > >>>>>
> > >>>>>I think if we do this in block layer, there's not much necessity for a
> > >>>>>before-write notifier here (although it may be useful for other
> > >>>>>cases):
> > >>>>>
> > >>>>>     in bdrv_write:
> > >>>>>     for child in req->bs->open_children
> > >>>>>         if not child->is_allocated(req->sectors)
> > >>>>>             do COW to child
> > >>>>>
> > >>>>>The advantage of this is that we won't need to start block-backup
> > >>>>>job in
> > >>>>>sync mode "none" to do point-in-time snapshot (image fleecing), and we
> > >>>>>get writable snapshot (possibility to open backing file writable and
> > >>>>>write to it safely) as a by-product.
> > >>>>>
> > >>>>>But we will need to keep track of parent<->child of block states,
> > >>>>>and we
> > >>>>>still need to take care of overlapping writing between block job and
> > >>>>>guest request.
> > >>>>
> > >>>>There's one catch here: bs->target may not support backing files, it
> > >>>>can
> > >>>>be a raw file, for example.  We'll only use backing files for
> > >>>>point-in-time snapshots but other use cases might not.  raw doesn't
> > >>>>really implement is_allocated(), so the whole concept would have to
> > >>>>change a little:
> > >>>
> > >>>Another use case may be parent modification. Suppose we have
> > >>>
> > >>>                     ,--- child1.qcow2
> > >>>     parent.qcow2  <
> > >>>                     `--- child2.qcow2
> > >>>
> > >>>We can use parent.qcow2 as block device in QEMU without breaking
> > >>>child1.qcow2 or child2.qcow2 by telling QEMU who its children are:
> > >>>
> > >>>   $QEMU -drive file=parent.qcow2,children=child1.qcow2:child2.qcow2
> > >>>
> > >>>Then we open the three images and setup parent_bs->open_children, the
> > >>>children are protected from being corrupted.
> > >>>
> > >>>>
> > >>>>bs->open_children becomes independent of backing files - any
> > >>>>BlockDriverState can be added to this list.  ->is_allocated() basically
> > >>>>becomes the bitmap that we keep in the block job.
> > >>>
> > >>>Yes. But it is possible to keep a bitmap for raw (and those don't
> > >>>implement is_allocated()) in block layer too, or in overlay: could
> > >>>add-cow by Dongxu Wang help here?
> > >>
> > >>Yes absolutely.
> > >>
> > >>Stefan
> > >>
> > >   One advantage of external backup, or backing up chain, is that it
> > >holds 'Delta' data only and is small enough. If it is changed toward a
> > >'full' data writable snapshot, it become bigger. With backup chain
> > >qemu-img can restore/clone a writable and usable one, So I don't
> > >think adding that in qemu emulator helps much, and it will make things
> > >more complicit.... user won't care who is doing the job, qemu or
> > >qemu-img.
> > >
> >   I mean that "get writable snapshot (possibility to open backing file
> > writable and write to it safely) as a by-product." in this series, is
> > not very valuable.
> > 
> 
> I'm not selling writable snapshot, my point was just that semantic of
> block-backup, getting a point-in-time snapshot, inherently works like a
> backing chain but writting to parent (guest drive) will not break its
> children (our thin PIT snapshot). If we see it this way, COW is not so
> specific to a block job like block-backup, it can be generic in the
> backing chain logic.
> 
> Though, the value in a writable snapshot is that we can actually
> _modify_ a backing image in place, rather than forking the chain to
> write to the new child. This is not supported with qemu or qemu-img now,
> once you create a child with the image as backing file, you mustn't
> modify it.

Supporting writable snapshots in this style is like traditional LVM
snapshots, it requires O(n) writes where n is the number of children.
So it does not scale (LVM recently added the thin provisioning target to
use a shared storage pool and solve this problem).

The second challenge with writable snapshots is that you can only use
them when the QEMU process knows about all children.  That means you
cannot use them in the common use-case where there is a template backing
file:

  web-server.qcow2 <- web001.qcow2
                   <- web002.qcow2
		   <- web003.qcow2

The web001 guest doesn't know about web002 and web003.  Even if it did,
it would be dangerous to modify web-server.qcow2 while the other two
QEMU processes have it open.

For these reasons I'm not eager to get into writable backing files,
better to create a new writable image.

Stefan

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 03/11] block: add basic backup support to block driver
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 03/11] block: add basic backup support to block driver Stefan Hajnoczi
  2013-06-06  3:56   ` Fam Zheng
@ 2013-06-17  3:43   ` Fam Zheng
  2013-06-17 12:36     ` Stefan Hajnoczi
  2013-06-18 14:52   ` Kevin Wolf
  2013-06-19 10:50   ` Kevin Wolf
  3 siblings, 1 reply; 59+ messages in thread
From: Fam Zheng @ 2013-06-17  3:43 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, qemu-devel, dietmar, imain, Paolo Bonzini, xiawenc

> +static void coroutine_fn backup_run(void *opaque)
> +{
> +    BackupBlockJob *job = opaque;
> +    BlockDriverState *bs = job->common.bs;
> +    BlockDriverState *target = job->target;
> +    BlockdevOnError on_target_error = job->on_target_error;
> +    NotifierWithReturn before_write = {
> +        .notify = backup_before_write_notify,
> +    };
> +    int64_t start, end;
> +    int ret = 0;
> +
> +    QLIST_INIT(&job->inflight_reqs);
> +    qemu_co_rwlock_init(&job->flush_rwlock);
> +
> +    start = 0;
> +    end = DIV_ROUND_UP(bdrv_getlength(bs) / BDRV_SECTOR_SIZE,
> +                       BACKUP_SECTORS_PER_CLUSTER);
> +
> +    job->bitmap = hbitmap_alloc(end, 0);
> +
> +    bdrv_set_on_error(target, on_target_error, on_target_error);
> +    bdrv_iostatus_enable(target);
> +
> +    bdrv_add_before_write_notifier(bs, &before_write);
> +
> +    DPRINTF("backup_run start %s %" PRId64 " %" PRId64 "\n",
> +            bdrv_get_device_name(bs), start, end);
> +
> +    for (; start < end; start++) {
> +        bool error_is_read;
> +
> +        if (block_job_is_cancelled(&job->common)) {
> +            break;
> +        }
> +
> +        /* we need to yield so that qemu_aio_flush() returns.
> +         * (without, VM does not reboot)
> +         */
> +        if (job->common.speed) {
> +            uint64_t delay_ns = ratelimit_calculate_delay(
> +                &job->limit, job->sectors_read);
> +            job->sectors_read = 0;
> +            block_job_sleep_ns(&job->common, rt_clock, delay_ns);
> +        } else {
> +            block_job_sleep_ns(&job->common, rt_clock, 0);
> +        }
> +
> +        if (block_job_is_cancelled(&job->common)) {
> +            break;
> +        }
> +
> +        DPRINTF("backup_run loop C%" PRId64 "\n", start);
> +
> +        ret = backup_do_cow(bs, start * BACKUP_SECTORS_PER_CLUSTER, 1,
> +                            &error_is_read);
> +        if (ret < 0) {
> +            /* Depending on error action, fail now or retry cluster */
> +            BlockErrorAction action =
> +                backup_error_action(job, error_is_read, -ret);
> +            if (action == BDRV_ACTION_REPORT) {
> +                break;
> +            } else {
> +                start--;
> +                continue;
> +            }
> +        }
> +    }
> +
> +    notifier_with_return_remove(&before_write);
> +
> +    /* wait until pending backup_do_cow() calls have completed */
> +    qemu_co_rwlock_wrlock(&job->flush_rwlock);
> +    qemu_co_rwlock_unlock(&job->flush_rwlock);
> +
> +    hbitmap_free(job->bitmap);
> +
> +    bdrv_iostatus_disable(target);
> +    bdrv_delete(job->target);

drive-mirror has bdrv_close before deleting target, why don't we need
one here?

> +
> +    DPRINTF("backup_run complete %d\n", ret);
> +    block_job_completed(&job->common, ret);
> +}
> +


Fam

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 03/11] block: add basic backup support to block driver
  2013-06-17  3:43   ` Fam Zheng
@ 2013-06-17 12:36     ` Stefan Hajnoczi
  0 siblings, 0 replies; 59+ messages in thread
From: Stefan Hajnoczi @ 2013-06-17 12:36 UTC (permalink / raw)
  To: Stefan Hajnoczi, qemu-devel, Kevin Wolf, dietmar, imain,
	Paolo Bonzini, xiawenc, Eric Blake

On Mon, Jun 17, 2013 at 11:43:24AM +0800, Fam Zheng wrote:
> > +    bdrv_iostatus_disable(target);
> > +    bdrv_delete(job->target);
> 
> drive-mirror has bdrv_close before deleting target, why don't we need
> one here?

Use the source, Luke! :)

void bdrv_delete(BlockDriverState *bs)
{
    assert(!bs->dev);
    assert(!bs->job);
    assert(!bs->in_use);

    bdrv_close(bs);

    /* remove from list, if necessary */
    bdrv_make_anon(bs);

    g_free(bs);
}

Stefan

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 03/11] block: add basic backup support to block driver
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 03/11] block: add basic backup support to block driver Stefan Hajnoczi
  2013-06-06  3:56   ` Fam Zheng
  2013-06-17  3:43   ` Fam Zheng
@ 2013-06-18 14:52   ` Kevin Wolf
  2013-06-19  7:38     ` Stefan Hajnoczi
  2013-06-19 10:50   ` Kevin Wolf
  3 siblings, 1 reply; 59+ messages in thread
From: Kevin Wolf @ 2013-06-18 14:52 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Fam Zheng, qemu-devel, dietmar, imain, Paolo Bonzini, xiawenc

Am 30.05.2013 um 14:34 hat Stefan Hajnoczi geschrieben:
> From: Dietmar Maurer <dietmar@proxmox.com>
> 
> backup_start() creates a block job that copies a point-in-time snapshot
> of a block device to a target block device.
> 
> We call backup_do_cow() for each write during backup. That function
> reads the original data from the block device before it gets
> overwritten.  The data is then written to the target device.
> 
> Currently backup cluster size is hardcoded to 65536 bytes.
> 
> [I made a number of changes to Dietmar's original patch and folded them
> in to make code review easy.  Here is the full list:
> 
>  * Drop BackupDumpFunc interface in favor of a target block device
>  * Detect zero clusters with buffer_is_zero() and use bdrv_co_write_zeroes()
>  * Use 0 delay instead of 1us, like other block jobs
>  * Unify creation/start functions into backup_start()
>  * Simplify cleanup, free bitmap in backup_run() instead of cb
>  * function
>  * Use HBitmap to avoid duplicating bitmap code
>  * Use bdrv_getlength() instead of accessing ->total_sectors
>  * directly
>  * Delete the backup.h header file, it is no longer necessary
>  * Move ./backup.c to block/backup.c
>  * Remove #ifdefed out code
>  * Coding style and whitespace cleanups
>  * Use bdrv_add_before_write_notifier() instead of blockjob-specific hooks
>  * Keep our own in-flight CowRequest list instead of using block.c
>    tracked requests.  This means a little code duplication but is much
>    simpler than trying to share the tracked requests list and use the
>    backup block size.
>  * Add on_source_error and on_target_error error handling.
> 
> -- stefanha]
> 
> Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> 
> backup size fixes
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

You probably didn't want to have the second part in the final commit
message?

Kevin

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 03/11] block: add basic backup support to block driver
  2013-06-18 14:52   ` Kevin Wolf
@ 2013-06-19  7:38     ` Stefan Hajnoczi
  0 siblings, 0 replies; 59+ messages in thread
From: Stefan Hajnoczi @ 2013-06-19  7:38 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Fam Zheng, qemu-devel, dietmar, imain, Paolo Bonzini, xiawenc

On Tue, Jun 18, 2013 at 04:52:11PM +0200, Kevin Wolf wrote:
> Am 30.05.2013 um 14:34 hat Stefan Hajnoczi geschrieben:
> > From: Dietmar Maurer <dietmar@proxmox.com>
> > 
> > backup_start() creates a block job that copies a point-in-time snapshot
> > of a block device to a target block device.
> > 
> > We call backup_do_cow() for each write during backup. That function
> > reads the original data from the block device before it gets
> > overwritten.  The data is then written to the target device.
> > 
> > Currently backup cluster size is hardcoded to 65536 bytes.
> > 
> > [I made a number of changes to Dietmar's original patch and folded them
> > in to make code review easy.  Here is the full list:
> > 
> >  * Drop BackupDumpFunc interface in favor of a target block device
> >  * Detect zero clusters with buffer_is_zero() and use bdrv_co_write_zeroes()
> >  * Use 0 delay instead of 1us, like other block jobs
> >  * Unify creation/start functions into backup_start()
> >  * Simplify cleanup, free bitmap in backup_run() instead of cb
> >  * function
> >  * Use HBitmap to avoid duplicating bitmap code
> >  * Use bdrv_getlength() instead of accessing ->total_sectors
> >  * directly
> >  * Delete the backup.h header file, it is no longer necessary
> >  * Move ./backup.c to block/backup.c
> >  * Remove #ifdefed out code
> >  * Coding style and whitespace cleanups
> >  * Use bdrv_add_before_write_notifier() instead of blockjob-specific hooks
> >  * Keep our own in-flight CowRequest list instead of using block.c
> >    tracked requests.  This means a little code duplication but is much
> >    simpler than trying to share the tracked requests list and use the
> >    backup block size.
> >  * Add on_source_error and on_target_error error handling.
> > 
> > -- stefanha]
> > 
> > Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
> > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > 
> > backup size fixes
> > 
> > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> 
> You probably didn't want to have the second part in the final commit
> message?

Thanks for pointing this out.  I overlooked it when squashing.  Will
fix.

Stefan

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 03/11] block: add basic backup support to block driver
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 03/11] block: add basic backup support to block driver Stefan Hajnoczi
                     ` (2 preceding siblings ...)
  2013-06-18 14:52   ` Kevin Wolf
@ 2013-06-19 10:50   ` Kevin Wolf
  2013-06-19 11:14     ` Paolo Bonzini
                       ` (2 more replies)
  3 siblings, 3 replies; 59+ messages in thread
From: Kevin Wolf @ 2013-06-19 10:50 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Fam Zheng, qemu-devel, dietmar, imain, Paolo Bonzini, xiawenc

Am 30.05.2013 um 14:34 hat Stefan Hajnoczi geschrieben:
> From: Dietmar Maurer <dietmar@proxmox.com>
> 
> backup_start() creates a block job that copies a point-in-time snapshot
> of a block device to a target block device.
> 
> We call backup_do_cow() for each write during backup. That function
> reads the original data from the block device before it gets
> overwritten.  The data is then written to the target device.
> 
> Currently backup cluster size is hardcoded to 65536 bytes.
> 
> [I made a number of changes to Dietmar's original patch and folded them
> in to make code review easy.  Here is the full list:
> 
>  * Drop BackupDumpFunc interface in favor of a target block device
>  * Detect zero clusters with buffer_is_zero() and use bdrv_co_write_zeroes()
>  * Use 0 delay instead of 1us, like other block jobs
>  * Unify creation/start functions into backup_start()
>  * Simplify cleanup, free bitmap in backup_run() instead of cb
>  * function
>  * Use HBitmap to avoid duplicating bitmap code
>  * Use bdrv_getlength() instead of accessing ->total_sectors
>  * directly
>  * Delete the backup.h header file, it is no longer necessary
>  * Move ./backup.c to block/backup.c
>  * Remove #ifdefed out code
>  * Coding style and whitespace cleanups
>  * Use bdrv_add_before_write_notifier() instead of blockjob-specific hooks
>  * Keep our own in-flight CowRequest list instead of using block.c
>    tracked requests.  This means a little code duplication but is much
>    simpler than trying to share the tracked requests list and use the
>    backup block size.
>  * Add on_source_error and on_target_error error handling.
> 
> -- stefanha]
> 
> Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> 
> backup size fixes
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  block/Makefile.objs       |   1 +
>  block/backup.c            | 355 ++++++++++++++++++++++++++++++++++++++++++++++
>  include/block/block_int.h |  19 +++
>  3 files changed, 375 insertions(+)
>  create mode 100644 block/backup.c
> 
> diff --git a/block/Makefile.objs b/block/Makefile.objs
> index 5f0358a..88bd101 100644
> --- a/block/Makefile.objs
> +++ b/block/Makefile.objs
> @@ -20,5 +20,6 @@ endif
>  common-obj-y += stream.o
>  common-obj-y += commit.o
>  common-obj-y += mirror.o
> +common-obj-y += backup.o
>  
>  $(obj)/curl.o: QEMU_CFLAGS+=$(CURL_CFLAGS)
> diff --git a/block/backup.c b/block/backup.c
> new file mode 100644
> index 0000000..466744a
> --- /dev/null
> +++ b/block/backup.c
> @@ -0,0 +1,355 @@
> +/*
> + * QEMU backup
> + *
> + * Copyright (C) 2013 Proxmox Server Solutions
> + *
> + * Authors:
> + *  Dietmar Maurer (dietmar@proxmox.com)
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#include <stdio.h>
> +#include <errno.h>
> +#include <unistd.h>
> +
> +#include "block/block.h"
> +#include "block/block_int.h"
> +#include "block/blockjob.h"
> +#include "qemu/ratelimit.h"
> +
> +#define DEBUG_BACKUP 0
> +
> +#define DPRINTF(fmt, ...) \
> +    do { \
> +        if (DEBUG_BACKUP) { \
> +            fprintf(stderr, "backup: " fmt, ## __VA_ARGS__); \
> +        } \
> +    } while (0)
> +
> +#define BACKUP_CLUSTER_BITS 16
> +#define BACKUP_CLUSTER_SIZE (1 << BACKUP_CLUSTER_BITS)
> +#define BACKUP_SECTORS_PER_CLUSTER (BACKUP_CLUSTER_SIZE / BDRV_SECTOR_SIZE)
> +
> +#define SLICE_TIME 100000000ULL /* ns */
> +
> +typedef struct CowRequest {
> +    int64_t start;
> +    int64_t end;
> +    QLIST_ENTRY(CowRequest) list;
> +    CoQueue wait_queue; /* coroutines blocked on this request */
> +} CowRequest;
> +
> +typedef struct BackupBlockJob {
> +    BlockJob common;
> +    BlockDriverState *target;
> +    RateLimit limit;
> +    BlockdevOnError on_source_error;
> +    BlockdevOnError on_target_error;
> +    CoRwlock flush_rwlock;
> +    uint64_t sectors_read;
> +    HBitmap *bitmap;
> +    QLIST_HEAD(, CowRequest) inflight_reqs;
> +} BackupBlockJob;
> +
> +/* See if in-flight requests overlap and wait for them to complete */
> +static void coroutine_fn wait_for_overlapping_requests(BackupBlockJob *job,
> +                                                       int64_t start,
> +                                                       int64_t end)
> +{
> +    CowRequest *req;
> +    bool retry;
> +
> +    do {
> +        retry = false;
> +        QLIST_FOREACH(req, &job->inflight_reqs, list) {
> +            if (end > req->start && start < req->end) {
> +                qemu_co_queue_wait(&req->wait_queue);
> +                retry = true;
> +                break;
> +            }
> +        }
> +    } while (retry);
> +}
> +
> +/* Keep track of an in-flight request */
> +static void cow_request_begin(CowRequest *req, BackupBlockJob *job,
> +                                     int64_t start, int64_t end)
> +{
> +    req->start = start;
> +    req->end = end;
> +    qemu_co_queue_init(&req->wait_queue);
> +    QLIST_INSERT_HEAD(&job->inflight_reqs, req, list);
> +}
> +
> +/* Forget about a completed request */
> +static void cow_request_end(CowRequest *req)
> +{
> +    QLIST_REMOVE(req, list);
> +    qemu_co_queue_restart_all(&req->wait_queue);
> +}
> +
> +static int coroutine_fn backup_do_cow(BlockDriverState *bs,
> +                                      int64_t sector_num, int nb_sectors,
> +                                      bool *error_is_read)
> +{
> +    BackupBlockJob *job = (BackupBlockJob *)bs->job;
> +    CowRequest cow_request;
> +    struct iovec iov;
> +    QEMUIOVector bounce_qiov;
> +    void *bounce_buffer = NULL;
> +    int ret = 0;
> +    int64_t start, end, total_sectors;
> +    int n;
> +
> +    qemu_co_rwlock_rdlock(&job->flush_rwlock);
> +
> +    start = sector_num / BACKUP_SECTORS_PER_CLUSTER;
> +    end = DIV_ROUND_UP(sector_num + nb_sectors, BACKUP_SECTORS_PER_CLUSTER);
> +
> +    DPRINTF("%s enter %s C%" PRId64 " %" PRId64 " %d\n",
> +            __func__, bdrv_get_device_name(bs), start, sector_num, nb_sectors);

Maybe put the first "%s" and __func__ directly into the DPRINTF macro?

> +    wait_for_overlapping_requests(job, start, end);
> +    cow_request_begin(&cow_request, job, start, end);
> +
> +    total_sectors = bdrv_getlength(bs);
> +    if (total_sectors < 0) {
> +        if (error_is_read) {
> +            *error_is_read = true;
> +        }
> +        ret = total_sectors;
> +        goto out;
> +    }
> +    total_sectors /= BDRV_SECTOR_SIZE;

Why is this needed? There are two callers of the function: One is the
write notifier, which has already a sector number that is checked
against the image size. The other one is background job that gets the
size once when it starts. I don't think there's a reason to call the
block driver each time and potentially do an expensive request.

At first I thought it has something to do with resizing images, but it's
forbidden while a block job is running (otherwise the job's main loop
exit condition would be wrong, too), so that doesn't make it necessary.

> +
> +    for (; start < end; start++) {
> +        if (hbitmap_get(job->bitmap, start)) {
> +            DPRINTF("%s skip C%" PRId64 "\n", __func__, start);
> +            continue; /* already copied */
> +        }
> +
> +        DPRINTF("%s C%" PRId64 "\n", __func__, start);
> +
> +        n = MIN(BACKUP_SECTORS_PER_CLUSTER,
> +                total_sectors - start * BACKUP_SECTORS_PER_CLUSTER);
> +
> +        if (!bounce_buffer) {
> +            bounce_buffer = qemu_blockalign(bs, BACKUP_CLUSTER_SIZE);
> +        }
> +        iov.iov_base = bounce_buffer;
> +        iov.iov_len = n * BDRV_SECTOR_SIZE;
> +        qemu_iovec_init_external(&bounce_qiov, &iov, 1);
> +
> +        ret = bdrv_co_readv(bs, start * BACKUP_SECTORS_PER_CLUSTER, n,
> +                            &bounce_qiov);
> +        if (ret < 0) {
> +            DPRINTF("%s bdrv_co_readv C%" PRId64 " failed\n", __func__, start);
> +            if (error_is_read) {
> +                *error_is_read = true;
> +            }
> +            goto out;
> +        }
> +
> +        if (buffer_is_zero(iov.iov_base, iov.iov_len)) {
> +            ret = bdrv_co_write_zeroes(job->target,
> +                                       start * BACKUP_SECTORS_PER_CLUSTER, n);
> +        } else {
> +            ret = bdrv_co_writev(job->target,
> +                                 start * BACKUP_SECTORS_PER_CLUSTER, n,
> +                                 &bounce_qiov);
> +        }
> +        if (ret < 0) {
> +            DPRINTF("%s write C%" PRId64 " failed\n", __func__, start);
> +            if (error_is_read) {
> +                *error_is_read = false;
> +            }
> +            goto out;
> +        }
> +
> +        hbitmap_set(job->bitmap, start, 1);
> +
> +        /* Publish progress */
> +        job->sectors_read += n;
> +        job->common.offset += n * BDRV_SECTOR_SIZE;

This is interesting, because the function is not only called by the
background job, but also by write notifiers. So 'offset' in a literal
sense doesn't make too much sense because we're not operating purely
sequential.

The QAPI documentation describes 'offset' like this:

# @offset: the current progress value

If we take it as just that, I think we could actually consider this code
correct, because it's a useful measure for the progress (each sector is
copied only once, either by the job or by a notifier), even though it
really has nothing to do with an offset into the image.

Maybe a comment would be appropriate.

> +
> +        DPRINTF("%s done C%" PRId64 "\n", __func__, start);
> +    }
> +
> +out:
> +    if (bounce_buffer) {
> +        qemu_vfree(bounce_buffer);
> +    }
> +
> +    cow_request_end(&cow_request);
> +
> +    qemu_co_rwlock_unlock(&job->flush_rwlock);
> +
> +    return ret;
> +}
> +
> +static int coroutine_fn backup_before_write_notify(
> +        NotifierWithReturn *notifier,
> +        void *opaque)
> +{
> +    BdrvTrackedRequest *req = opaque;
> +
> +    return backup_do_cow(req->bs, req->sector_num, req->nb_sectors, NULL);
> +}
> +
> +static void backup_set_speed(BlockJob *job, int64_t speed, Error **errp)
> +{
> +    BackupBlockJob *s = container_of(job, BackupBlockJob, common);
> +
> +    if (speed < 0) {
> +        error_set(errp, QERR_INVALID_PARAMETER, "speed");
> +        return;
> +    }
> +    ratelimit_set_speed(&s->limit, speed / BDRV_SECTOR_SIZE, SLICE_TIME);
> +}
> +
> +static void backup_iostatus_reset(BlockJob *job)
> +{
> +    BackupBlockJob *s = container_of(job, BackupBlockJob, common);
> +
> +    bdrv_iostatus_reset(s->target);
> +}
> +
> +static BlockJobType backup_job_type = {
> +    .instance_size = sizeof(BackupBlockJob),
> +    .job_type = "backup",
> +    .set_speed = backup_set_speed,
> +    .iostatus_reset = backup_iostatus_reset,
> +};

Align = on the same column? Should probably be const, too.

> +
> +static BlockErrorAction backup_error_action(BackupBlockJob *job,
> +                                            bool read, int error)
> +{
> +    if (read) {
> +        return block_job_error_action(&job->common, job->common.bs,
> +                                      job->on_source_error, true, error);
> +    } else {
> +        return block_job_error_action(&job->common, job->target,
> +                                      job->on_target_error, false, error);
> +    }
> +}
> +
> +static void coroutine_fn backup_run(void *opaque)
> +{
> +    BackupBlockJob *job = opaque;
> +    BlockDriverState *bs = job->common.bs;
> +    BlockDriverState *target = job->target;
> +    BlockdevOnError on_target_error = job->on_target_error;
> +    NotifierWithReturn before_write = {
> +        .notify = backup_before_write_notify,
> +    };
> +    int64_t start, end;
> +    int ret = 0;
> +
> +    QLIST_INIT(&job->inflight_reqs);
> +    qemu_co_rwlock_init(&job->flush_rwlock);
> +
> +    start = 0;
> +    end = DIV_ROUND_UP(bdrv_getlength(bs) / BDRV_SECTOR_SIZE,

bdrv_getlength() can fail.

> +                       BACKUP_SECTORS_PER_CLUSTER);
> +
> +    job->bitmap = hbitmap_alloc(end, 0);
> +
> +    bdrv_set_on_error(target, on_target_error, on_target_error);
> +    bdrv_iostatus_enable(target);

Mirroring also has:

    bdrv_set_enable_write_cache(s->target, true);

Wouldn't it make sense for backup as well?

> +
> +    bdrv_add_before_write_notifier(bs, &before_write);
> +
> +    DPRINTF("backup_run start %s %" PRId64 " %" PRId64 "\n",
> +            bdrv_get_device_name(bs), start, end);
> +
> +    for (; start < end; start++) {
> +        bool error_is_read;
> +
> +        if (block_job_is_cancelled(&job->common)) {
> +            break;
> +        }
> +
> +        /* we need to yield so that qemu_aio_flush() returns.
> +         * (without, VM does not reboot)
> +         */
> +        if (job->common.speed) {
> +            uint64_t delay_ns = ratelimit_calculate_delay(
> +                &job->limit, job->sectors_read);
> +            job->sectors_read = 0;
> +            block_job_sleep_ns(&job->common, rt_clock, delay_ns);

Here's the other implication of counting the progress of copies
initiated by write notifiers: The more copies they trigger, the less
additional copies are made by the background job.

I'm willing to count this as a feature rather than a bug.

> +        } else {
> +            block_job_sleep_ns(&job->common, rt_clock, 0);
> +        }
> +
> +        if (block_job_is_cancelled(&job->common)) {
> +            break;
> +        }
> +
> +        DPRINTF("backup_run loop C%" PRId64 "\n", start);
> +
> +        ret = backup_do_cow(bs, start * BACKUP_SECTORS_PER_CLUSTER, 1,
> +                            &error_is_read);

You're taking advantage of the fact that backup_do_cow() rounds this one
sector up to 64k, which is a much more reasonable size. But why not
specify BACKUP_SECTORS_PER_CLUSTER as nb_sectors when this is what you
really assume?

> +        if (ret < 0) {
> +            /* Depending on error action, fail now or retry cluster */
> +            BlockErrorAction action =
> +                backup_error_action(job, error_is_read, -ret);
> +            if (action == BDRV_ACTION_REPORT) {
> +                break;
> +            } else {
> +                start--;
> +                continue;
> +            }
> +        }
> +    }
> +
> +    notifier_with_return_remove(&before_write);
> +
> +    /* wait until pending backup_do_cow() calls have completed */
> +    qemu_co_rwlock_wrlock(&job->flush_rwlock);
> +    qemu_co_rwlock_unlock(&job->flush_rwlock);
> +
> +    hbitmap_free(job->bitmap);
> +
> +    bdrv_iostatus_disable(target);
> +    bdrv_delete(job->target);
> +
> +    DPRINTF("backup_run complete %d\n", ret);
> +    block_job_completed(&job->common, ret);
> +}

Kevin

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 01/11] notify: add NotiferWithReturn so notifier list can abort
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 01/11] notify: add NotiferWithReturn so notifier list can abort Stefan Hajnoczi
  2013-05-30 22:27   ` Eric Blake
@ 2013-06-19 10:55   ` Kevin Wolf
  1 sibling, 0 replies; 59+ messages in thread
From: Kevin Wolf @ 2013-06-19 10:55 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Fam Zheng, qemu-devel, dietmar, imain, Paolo Bonzini, xiawenc

Am 30.05.2013 um 14:34 hat Stefan Hajnoczi geschrieben:
> notifier_list_notify() has no return value.  This is fine when we just
> want to invoke side-effects.
> 
> Sometimes it's useful for notifiers to produce a return value.  This
> allows notifiers to "veto" an operation and will be used by the block
> layer before-write notifier.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

Reviewed-by: Kevin Wolf <kwolf@redhat.com>

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 02/11] block: add bdrv_add_before_write_notifier()
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 02/11] block: add bdrv_add_before_write_notifier() Stefan Hajnoczi
  2013-05-30 22:47   ` Eric Blake
@ 2013-06-19 10:56   ` Kevin Wolf
  1 sibling, 0 replies; 59+ messages in thread
From: Kevin Wolf @ 2013-06-19 10:56 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Fam Zheng, qemu-devel, dietmar, imain, Paolo Bonzini, xiawenc

Am 30.05.2013 um 14:34 hat Stefan Hajnoczi geschrieben:
> The bdrv_add_before_write_notifier() function installs a callback that
> is invoked before a write request is processed.  This will be used to
> implement copy-on-write point-in-time snapshots where we need to copy
> out old data before overwriting it.
> 
> Note that BdrvTrackedRequest is moved to block_int.h since it is passed
> to .notify() functions.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

Reviewed-by: Kevin Wolf <kwolf@redhat.com>

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 04/11] blockdev: drop redundant proto_drv check
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 04/11] blockdev: drop redundant proto_drv check Stefan Hajnoczi
  2013-06-03 17:14   ` Eric Blake
  2013-06-06  5:00   ` Wenchao Xia
@ 2013-06-19 10:57   ` Kevin Wolf
  2 siblings, 0 replies; 59+ messages in thread
From: Kevin Wolf @ 2013-06-19 10:57 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Fam Zheng, qemu-devel, dietmar, imain, Paolo Bonzini, xiawenc

Am 30.05.2013 um 14:34 hat Stefan Hajnoczi geschrieben:
> It is not necessary to check that we can find a protocol block driver
> since we create or open the image file.  This produces the error that we
> need anyway.
> 
> Besides, the QERR_INVALID_BLOCK_FORMAT is inappropriate since the
> protocol is incorrect rather than the format.
> 
> Also drop an empty line between bdrv_open() and checking its return
> value.  This may be due to copy-pasting from earlier code that performed
> other operations before handling errors.
> 
> Reported-by: Kevin Wolf <kwolf@redhat.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

Reviewed-by: Kevin Wolf <kwolf@redhat.com>

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 05/11] blockdev: use bdrv_getlength() in qmp_drive_mirror()
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 05/11] blockdev: use bdrv_getlength() in qmp_drive_mirror() Stefan Hajnoczi
  2013-05-30 13:24   ` Paolo Bonzini
  2013-06-03 17:15   ` Eric Blake
@ 2013-06-19 10:58   ` Kevin Wolf
  2 siblings, 0 replies; 59+ messages in thread
From: Kevin Wolf @ 2013-06-19 10:58 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Fam Zheng, qemu-devel, dietmar, imain, Paolo Bonzini, xiawenc

Am 30.05.2013 um 14:34 hat Stefan Hajnoczi geschrieben:
> Use bdrv_getlength() for its byte units and error return instead of
> bdrv_get_geometry().
> 
> Reported-by: Kevin Wolf <kwolf@redhat.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

Reviewed-by: Kevin Wolf <kwolf@redhat.com>

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 06/11] block: add drive-backup QMP command
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 06/11] block: add drive-backup QMP command Stefan Hajnoczi
  2013-06-03 17:09   ` Eric Blake
@ 2013-06-19 11:07   ` Kevin Wolf
  2013-06-20 12:19     ` Stefan Hajnoczi
  1 sibling, 1 reply; 59+ messages in thread
From: Kevin Wolf @ 2013-06-19 11:07 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Fam Zheng, qemu-devel, dietmar, imain, Paolo Bonzini, xiawenc

Am 30.05.2013 um 14:34 hat Stefan Hajnoczi geschrieben:
> @drive-backup
> 
> Start a point-in-time copy of a block device to a new destination.  The
> status of ongoing drive-backup operations can be checked with
> query-block-jobs where the BlockJobInfo.type field has the value 'backup'.
> The operation can be stopped before it has completed using the
> block-job-cancel command.
> 
> @device: the name of the device which should be copied.
> 
> @target: the target of the new image. If the file exists, or if it
>          is a device, the existing file/device will be used as the new
>          destination.  If it does not exist, a new file will be created.
> 
> @format: #optional the format of the new destination, default is to
>          probe if @mode is 'existing', else the format of the source
> 
> @mode: #optional whether and how QEMU should create a new image, default is
>        'absolute-paths'.
> 
> @speed: #optional the maximum speed, in bytes per second
> 
> @on-source-error: #optional the action to take on an error on the source,
>                   default 'report'.  'stop' and 'enospc' can only be used
>                   if the block device supports io-status (see BlockInfo).
> 
> @on-target-error: #optional the action to take on an error on the target,
>                   default 'report' (no limitations, since this applies to
>                   a different block device than @device).
> 
> Note that @on-source-error and @on-target-error only affect background I/O.
> If an error occurs during a guest write request, the device's rerror/werror
> actions will be used.
> 
> Returns: nothing on success
>          If @device is not a valid block device, DeviceNotFound
> 
> Since 1.6
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

Reviewed-by: Kevin Wolf <kwolf@redhat.com>

But what about HMP, the series doesn't seem to have a separate patch to
add a command there?

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 07/11] blockdev: rename BlkTransactionStates to singular
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 07/11] blockdev: rename BlkTransactionStates to singular Stefan Hajnoczi
  2013-05-30 22:55   ` Eric Blake
@ 2013-06-19 11:13   ` Kevin Wolf
  1 sibling, 0 replies; 59+ messages in thread
From: Kevin Wolf @ 2013-06-19 11:13 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Fam Zheng, qemu-devel, dietmar, imain, Paolo Bonzini, xiawenc

Am 30.05.2013 um 14:34 hat Stefan Hajnoczi geschrieben:
> The QMP 'transaction' command keeps a list of in-flight transactions.
> The transaction state structure is called BlkTransactionStates even
> though it only deals with a single transaction.  The only plural thing
> is the linked list of transaction states.
> 
> I find it confusing to call the single structure "States".  This patch
> renames it to "State", just like BlockDriverState is singular.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

> -    /* This removes our old bs from the bdrv_states, and adds the new bs */
> -    bdrv_append(states->new_bs, states->old_bs);
> +    /* This removes our old bs and adds the new bs */
> +    bdrv_append(state->new_bs, state->old_bs);

Tsk, you said this was only renaming. Don't believe I didn't see it!

Reviewed-by: Kevin Wolf <kwolf@redhat.com>

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 03/11] block: add basic backup support to block driver
  2013-06-19 10:50   ` Kevin Wolf
@ 2013-06-19 11:14     ` Paolo Bonzini
  2013-06-20 12:05       ` Stefan Hajnoczi
  2013-06-19 11:19     ` Paolo Bonzini
  2013-06-20 12:11     ` Stefan Hajnoczi
  2 siblings, 1 reply; 59+ messages in thread
From: Paolo Bonzini @ 2013-06-19 11:14 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Fam Zheng, qemu-devel, dietmar, imain, Stefan Hajnoczi, xiawenc

Il 19/06/2013 12:50, Kevin Wolf ha scritto:
>> > +        /* Publish progress */
>> > +        job->sectors_read += n;
>> > +        job->common.offset += n * BDRV_SECTOR_SIZE;
> This is interesting, because the function is not only called by the
> background job, but also by write notifiers. So 'offset' in a literal
> sense doesn't make too much sense because we're not operating purely
> sequential.
> 
> The QAPI documentation describes 'offset' like this:
> 
> # @offset: the current progress value
> 
> If we take it as just that, I think we could actually consider this code
> correct, because it's a useful measure for the progress (each sector is
> copied only once, either by the job or by a notifier), even though it
> really has nothing to do with an offset into the image.

Yes, this is similar to what we do for mirroring.  I think it is a feature.

Paolo

> Maybe a comment would be appropriate.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 08/11] blockdev: allow BdrvActionOps->commit() to be NULL
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 08/11] blockdev: allow BdrvActionOps->commit() to be NULL Stefan Hajnoczi
  2013-05-30 22:57   ` Eric Blake
@ 2013-06-19 11:14   ` Kevin Wolf
  1 sibling, 0 replies; 59+ messages in thread
From: Kevin Wolf @ 2013-06-19 11:14 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Fam Zheng, qemu-devel, dietmar, imain, Paolo Bonzini, xiawenc

Am 30.05.2013 um 14:34 hat Stefan Hajnoczi geschrieben:
> Some QMP 'transaction' types don't need to do anything on .commit().
> Make .commit() optional just like .abort().
> 
> The "drive-backup" action will take advantage of this, it only needs to
> cancel the block job on .abort().  Other block job actions will probably
> follow the same pattern, so allow .commit() to be NULL.
> 
> Suggested-by: Eric Blake <eblake@redhat.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

Reviewed-by: Kevin Wolf <kwolf@redhat.com>

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 09/11] blockdev: add DriveBackup transaction
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 09/11] blockdev: add DriveBackup transaction Stefan Hajnoczi
  2013-06-03 17:20   ` Eric Blake
@ 2013-06-19 11:17   ` Kevin Wolf
  1 sibling, 0 replies; 59+ messages in thread
From: Kevin Wolf @ 2013-06-19 11:17 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Fam Zheng, qemu-devel, dietmar, imain, Paolo Bonzini, xiawenc

Am 30.05.2013 um 14:34 hat Stefan Hajnoczi geschrieben:
> This patch adds a transactional version of the drive-backup QMP command.
> It allows atomic snapshots of multiple drives along with automatic
> cleanup if there is a failure to start one of the backup jobs.
> 
> Note that QMP events are emitted for block job completion/cancellation
> and the block job will be listed by query-block-jobs.
> 
> @device: the name of the device whose writes should be mirrored.
> 
> @target: the target of the new image. If the file exists, or if it
>          is a device, the existing file/device will be used as the new
>          destination.  If it does not exist, a new file will be created.
> 
> @format: #optional the format of the new destination, default is to
>          probe if @mode is 'existing', else the format of the source
> 
> @mode: #optional whether and how QEMU should create a new image, default is
>        'absolute-paths'.
> 
> @speed: #optional the maximum speed, in bytes per second
> 
> @on-source-error: #optional the action to take on an error on the source,
>                   default 'report'.  'stop' and 'enospc' can only be used
>                   if the block device supports io-status (see BlockInfo).
> 
> @on-target-error: #optional the action to take on an error on the target,
>                   default 'report' (no limitations, since this applies to
>                   a different block device than @device).
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

Reviewed-by: Kevin Wolf <kwolf@redhat.com>

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 03/11] block: add basic backup support to block driver
  2013-06-19 10:50   ` Kevin Wolf
  2013-06-19 11:14     ` Paolo Bonzini
@ 2013-06-19 11:19     ` Paolo Bonzini
  2013-06-20 12:06       ` Stefan Hajnoczi
  2013-06-20 12:11     ` Stefan Hajnoczi
  2 siblings, 1 reply; 59+ messages in thread
From: Paolo Bonzini @ 2013-06-19 11:19 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Fam Zheng, qemu-devel, dietmar, imain, Stefan Hajnoczi, xiawenc

Il 19/06/2013 12:50, Kevin Wolf ha scritto:
>> > +
>> > +    DPRINTF("%s enter %s C%" PRId64 " %" PRId64 " %d\n",
>> > +            __func__, bdrv_get_device_name(bs), start, sector_num, nb_sectors);
> Maybe put the first "%s" and __func__ directly into the DPRINTF macro?
> 

Or just use tracepoints.  backup_do_cow could definitely be one, and it
would subsume another DPRINTF ("backup_run loop").

hbitmap_get and block_job_completed are two other useful tracepoint that
is not present.

All that's left then are the DPRINTF for failed readv and writev, which
could also be useful in generic code (bdrv_co_*_done).

Can be done as a follow-up, of course.

Paolo

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 10/11] blockdev: add Abort transaction
  2013-06-03  9:21     ` Stefan Hajnoczi
@ 2013-06-19 11:21       ` Kevin Wolf
  2013-06-21  9:31         ` Eric Blake
  0 siblings, 1 reply; 59+ messages in thread
From: Kevin Wolf @ 2013-06-19 11:21 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Fam Zheng, qemu-devel, xiawenc, imain, Paolo Bonzini, dietmar

Am 03.06.2013 um 11:21 hat Stefan Hajnoczi geschrieben:
> On Thu, May 30, 2013 at 07:11:25AM -0600, Eric Blake wrote:
> > On 05/30/2013 06:34 AM, Stefan Hajnoczi wrote:
> > > The Abort action can be used to test QMP 'transaction' failure.  Add it
> > > as the last action to exercise the .abort() and .cleanup() code paths
> > > for all previous actions.
> > 
> > Another thread questioned whether we should name this action
> > __org.qemu-debug-abort, if only to be clear that it is not a normal
> > interface.
> 
> kwolf: Do you want Abort to be namespaced as a debug-only action?
> 
> I think it's perfectly supportable so there's no need to hide it.
> Granted it's rare that anyone would want to use this action.

I don't have a strong opinion on this. On the one hand it's probably
useless outside testing and debugging, on the other hand it won't be a
pain to support and the interface looks like it could be stable for
the next few years...

It's your patch, so you get to decide.

Reviewed-by: Kevin Wolf <kwolf@redhat.com>

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 11/11] qemu-iotests: add 055 drive-backup test case
  2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 11/11] qemu-iotests: add 055 drive-backup test case Stefan Hajnoczi
@ 2013-06-19 12:41   ` Kevin Wolf
  0 siblings, 0 replies; 59+ messages in thread
From: Kevin Wolf @ 2013-06-19 12:41 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Fam Zheng, qemu-devel, dietmar, imain, Paolo Bonzini, xiawenc

Am 30.05.2013 um 14:34 hat Stefan Hajnoczi geschrieben:
> Testing drive-backup is similar to image streaming and drive mirroring.
> This test case is based on 041.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  tests/qemu-iotests/055     | 256 +++++++++++++++++++++++++++++++++++++++++++++
>  tests/qemu-iotests/055.out |   5 +
>  tests/qemu-iotests/group   |   1 +
>  3 files changed, 262 insertions(+)
>  create mode 100755 tests/qemu-iotests/055
>  create mode 100644 tests/qemu-iotests/055.out
> 
> diff --git a/tests/qemu-iotests/055 b/tests/qemu-iotests/055
> new file mode 100755
> index 0000000..0341301
> --- /dev/null
> +++ b/tests/qemu-iotests/055
> @@ -0,0 +1,256 @@
> +#!/usr/bin/env python
> +#
> +# Tests for drive-backup
> +#
> +# Copyright (C) 2013 Red Hat, Inc.
> +#
> +# Based on 041.
> +#
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 2 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +#
> +
> +import time
> +import os
> +import iotests
> +from iotests import qemu_img, qemu_io
> +
> +test_img = os.path.join(iotests.test_dir, 'test.img')
> +target_img = os.path.join(iotests.test_dir, 'target.img')
> +
> +class TestSingleDrive(iotests.QMPTestCase):
> +    image_len = 120 * 1024 * 1024 # MB
> +
> +    def setUp(self):
> +        qemu_img('create', '-f', iotests.imgfmt, test_img, str(TestSingleDrive.image_len))
> +        self.vm = iotests.VM().add_drive(test_img)
> +        self.vm.launch()
> +
> +    def tearDown(self):
> +        self.vm.shutdown()
> +        os.remove(test_img)
> +        try:
> +            os.remove(target_img)
> +        except OSError:
> +            pass
> +
> +    def test_cancel(self):
> +        self.assert_no_active_block_jobs()
> +
> +        result = self.vm.qmp('drive-backup', device='drive0',
> +                             target=target_img)
> +        self.assert_qmp(result, 'return', {})
> +
> +        self.cancel_and_wait()

Check the return value?

More instances follow.

> +
> +    def test_pause(self):
> +        self.assert_no_active_block_jobs()
> +
> +        result = self.vm.qmp('drive-backup', device='drive0',
> +                             target=target_img)
> +        self.assert_qmp(result, 'return', {})
> +
> +        result = self.vm.qmp('block-job-pause', device='drive0')
> +        self.assert_qmp(result, 'return', {})
> +
> +        time.sleep(1)
> +        result = self.vm.qmp('query-block-jobs')
> +        offset = self.dictpath(result, 'return[0]/offset')
> +
> +        time.sleep(1)
> +        result = self.vm.qmp('query-block-jobs')
> +        self.assert_qmp(result, 'return[0]/offset', offset)
> +
> +        self.vm.shutdown()
> +        self.assertTrue(iotests.compare_images(test_img, target_img),
> +                        'target image does not match source after backup')

As discussed on IRC, test_img should be initialised with non-zero data
so that this check is actually meaningful.

Then it's also necessary to resume the job and wait for completion
before the check will succeed.

The same applies to the corresponding transaction test case.

Kevin

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 03/11] block: add basic backup support to block driver
  2013-06-19 11:14     ` Paolo Bonzini
@ 2013-06-20 12:05       ` Stefan Hajnoczi
  0 siblings, 0 replies; 59+ messages in thread
From: Stefan Hajnoczi @ 2013-06-20 12:05 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Kevin Wolf, Fam Zheng, qemu-devel, dietmar, imain, xiawenc

On Wed, Jun 19, 2013 at 01:14:17PM +0200, Paolo Bonzini wrote:
> Il 19/06/2013 12:50, Kevin Wolf ha scritto:
> >> > +        /* Publish progress */
> >> > +        job->sectors_read += n;
> >> > +        job->common.offset += n * BDRV_SECTOR_SIZE;
> > This is interesting, because the function is not only called by the
> > background job, but also by write notifiers. So 'offset' in a literal
> > sense doesn't make too much sense because we're not operating purely
> > sequential.
> > 
> > The QAPI documentation describes 'offset' like this:
> > 
> > # @offset: the current progress value
> > 
> > If we take it as just that, I think we could actually consider this code
> > correct, because it's a useful measure for the progress (each sector is
> > copied only once, either by the job or by a notifier), even though it
> > really has nothing to do with an offset into the image.
> 
> Yes, this is similar to what we do for mirroring.  I think it is a feature.

Yes, we explicitly defined "offset" to be an opaque progress value for
this reason.

I will still add a comment since the name "offset" does make the
drive-backup code suggest the wrong thing.

Stefan

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 03/11] block: add basic backup support to block driver
  2013-06-19 11:19     ` Paolo Bonzini
@ 2013-06-20 12:06       ` Stefan Hajnoczi
  0 siblings, 0 replies; 59+ messages in thread
From: Stefan Hajnoczi @ 2013-06-20 12:06 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Kevin Wolf, Fam Zheng, qemu-devel, dietmar, imain, xiawenc

On Wed, Jun 19, 2013 at 01:19:34PM +0200, Paolo Bonzini wrote:
> Il 19/06/2013 12:50, Kevin Wolf ha scritto:
> >> > +
> >> > +    DPRINTF("%s enter %s C%" PRId64 " %" PRId64 " %d\n",
> >> > +            __func__, bdrv_get_device_name(bs), start, sector_num, nb_sectors);
> > Maybe put the first "%s" and __func__ directly into the DPRINTF macro?
> > 
> 
> Or just use tracepoints.  backup_do_cow could definitely be one, and it
> would subsume another DPRINTF ("backup_run loop").
> 
> hbitmap_get and block_job_completed are two other useful tracepoint that
> is not present.
> 
> All that's left then are the DPRINTF for failed readv and writev, which
> could also be useful in generic code (bdrv_co_*_done).
> 
> Can be done as a follow-up, of course.

I need to respin anyway.  The only reason for DPRINTF() is that I
originally wanted to change as little as possible from the orginal
patch.  But we've gone so far we might as well do this too :).

Stefan

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 03/11] block: add basic backup support to block driver
  2013-06-19 10:50   ` Kevin Wolf
  2013-06-19 11:14     ` Paolo Bonzini
  2013-06-19 11:19     ` Paolo Bonzini
@ 2013-06-20 12:11     ` Stefan Hajnoczi
  2 siblings, 0 replies; 59+ messages in thread
From: Stefan Hajnoczi @ 2013-06-20 12:11 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Fam Zheng, qemu-devel, dietmar, imain, Paolo Bonzini, xiawenc

On Wed, Jun 19, 2013 at 12:50:16PM +0200, Kevin Wolf wrote:
> Am 30.05.2013 um 14:34 hat Stefan Hajnoczi geschrieben:
> > +    DPRINTF("%s enter %s C%" PRId64 " %" PRId64 " %d\n",
> > +            __func__, bdrv_get_device_name(bs), start, sector_num, nb_sectors);
> 
> Maybe put the first "%s" and __func__ directly into the DPRINTF macro?

Will use trace events instead.

> > +    wait_for_overlapping_requests(job, start, end);
> > +    cow_request_begin(&cow_request, job, start, end);
> > +
> > +    total_sectors = bdrv_getlength(bs);
> > +    if (total_sectors < 0) {
> > +        if (error_is_read) {
> > +            *error_is_read = true;
> > +        }
> > +        ret = total_sectors;
> > +        goto out;
> > +    }
> > +    total_sectors /= BDRV_SECTOR_SIZE;
> 
> Why is this needed? There are two callers of the function: One is the
> write notifier, which has already a sector number that is checked
> against the image size. The other one is background job that gets the
> size once when it starts. I don't think there's a reason to call the
> block driver each time and potentially do an expensive request.
> 
> At first I thought it has something to do with resizing images, but it's
> forbidden while a block job is running (otherwise the job's main loop
> exit condition would be wrong, too), so that doesn't make it necessary.

Thanks, I will eliminate the call.

> > +        /* Publish progress */
> > +        job->sectors_read += n;
> > +        job->common.offset += n * BDRV_SECTOR_SIZE;
> 
> This is interesting, because the function is not only called by the
> background job, but also by write notifiers. So 'offset' in a literal
> sense doesn't make too much sense because we're not operating purely
> sequential.
> 
> The QAPI documentation describes 'offset' like this:
> 
> # @offset: the current progress value
> 
> If we take it as just that, I think we could actually consider this code
> correct, because it's a useful measure for the progress (each sector is
> copied only once, either by the job or by a notifier), even though it
> really has nothing to do with an offset into the image.
> 
> Maybe a comment would be appropriate.

Will add the comment.

> > +static BlockJobType backup_job_type = {
> > +    .instance_size = sizeof(BackupBlockJob),
> > +    .job_type = "backup",
> > +    .set_speed = backup_set_speed,
> > +    .iostatus_reset = backup_iostatus_reset,
> > +};
> 
> Align = on the same column? Should probably be const, too.

Thanks for pointing the const out.  I'll throw in alignment as a bonus,
there is a team of ASCII artists here who do amazing whitespace work ;).

> > +static void coroutine_fn backup_run(void *opaque)
> > +{
> > +    BackupBlockJob *job = opaque;
> > +    BlockDriverState *bs = job->common.bs;
> > +    BlockDriverState *target = job->target;
> > +    BlockdevOnError on_target_error = job->on_target_error;
> > +    NotifierWithReturn before_write = {
> > +        .notify = backup_before_write_notify,
> > +    };
> > +    int64_t start, end;
> > +    int ret = 0;
> > +
> > +    QLIST_INIT(&job->inflight_reqs);
> > +    qemu_co_rwlock_init(&job->flush_rwlock);
> > +
> > +    start = 0;
> > +    end = DIV_ROUND_UP(bdrv_getlength(bs) / BDRV_SECTOR_SIZE,
> 
> bdrv_getlength() can fail.

Will fix.

> > +                       BACKUP_SECTORS_PER_CLUSTER);
> > +
> > +    job->bitmap = hbitmap_alloc(end, 0);
> > +
> > +    bdrv_set_on_error(target, on_target_error, on_target_error);
> > +    bdrv_iostatus_enable(target);
> 
> Mirroring also has:
> 
>     bdrv_set_enable_write_cache(s->target, true);
> 
> Wouldn't it make sense for backup as well?

I guess so.

> > +        /* we need to yield so that qemu_aio_flush() returns.
> > +         * (without, VM does not reboot)
> > +         */
> > +        if (job->common.speed) {
> > +            uint64_t delay_ns = ratelimit_calculate_delay(
> > +                &job->limit, job->sectors_read);
> > +            job->sectors_read = 0;
> > +            block_job_sleep_ns(&job->common, rt_clock, delay_ns);
> 
> Here's the other implication of counting the progress of copies
> initiated by write notifiers: The more copies they trigger, the less
> additional copies are made by the background job.
> 
> I'm willing to count this as a feature rather than a bug.

Yes, the guest does not get throttled by the block job.

> > +        } else {
> > +            block_job_sleep_ns(&job->common, rt_clock, 0);
> > +        }
> > +
> > +        if (block_job_is_cancelled(&job->common)) {
> > +            break;
> > +        }
> > +
> > +        DPRINTF("backup_run loop C%" PRId64 "\n", start);
> > +
> > +        ret = backup_do_cow(bs, start * BACKUP_SECTORS_PER_CLUSTER, 1,
> > +                            &error_is_read);
> 
> You're taking advantage of the fact that backup_do_cow() rounds this one
> sector up to 64k, which is a much more reasonable size. But why not
> specify BACKUP_SECTORS_PER_CLUSTER as nb_sectors when this is what you
> really assume?

Sounds good though I need to double-check if we run into issues when
hitting the end of the disk (if not aligned to
BACKUP_SECTORS_PER_CLUSTER).

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 06/11] block: add drive-backup QMP command
  2013-06-19 11:07   ` Kevin Wolf
@ 2013-06-20 12:19     ` Stefan Hajnoczi
  2013-06-20 13:15       ` Kevin Wolf
  0 siblings, 1 reply; 59+ messages in thread
From: Stefan Hajnoczi @ 2013-06-20 12:19 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Fam Zheng, qemu-devel, dietmar, imain, Paolo Bonzini, xiawenc

On Wed, Jun 19, 2013 at 01:07:42PM +0200, Kevin Wolf wrote:
> Am 30.05.2013 um 14:34 hat Stefan Hajnoczi geschrieben:
> > @drive-backup
> > 
> > Start a point-in-time copy of a block device to a new destination.  The
> > status of ongoing drive-backup operations can be checked with
> > query-block-jobs where the BlockJobInfo.type field has the value 'backup'.
> > The operation can be stopped before it has completed using the
> > block-job-cancel command.
> > 
> > @device: the name of the device which should be copied.
> > 
> > @target: the target of the new image. If the file exists, or if it
> >          is a device, the existing file/device will be used as the new
> >          destination.  If it does not exist, a new file will be created.
> > 
> > @format: #optional the format of the new destination, default is to
> >          probe if @mode is 'existing', else the format of the source
> > 
> > @mode: #optional whether and how QEMU should create a new image, default is
> >        'absolute-paths'.
> > 
> > @speed: #optional the maximum speed, in bytes per second
> > 
> > @on-source-error: #optional the action to take on an error on the source,
> >                   default 'report'.  'stop' and 'enospc' can only be used
> >                   if the block device supports io-status (see BlockInfo).
> > 
> > @on-target-error: #optional the action to take on an error on the target,
> >                   default 'report' (no limitations, since this applies to
> >                   a different block device than @device).
> > 
> > Note that @on-source-error and @on-target-error only affect background I/O.
> > If an error occurs during a guest write request, the device's rerror/werror
> > actions will be used.
> > 
> > Returns: nothing on success
> >          If @device is not a valid block device, DeviceNotFound
> > 
> > Since 1.6
> > 
> > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> 
> Reviewed-by: Kevin Wolf <kwolf@redhat.com>
> 
> But what about HMP, the series doesn't seem to have a separate patch to
> add a command there?

I can send it as a follow-up if you want.  The test cases use QMP and so
do management tools - I didn't feel the need for HMP.

Stefan

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 06/11] block: add drive-backup QMP command
  2013-06-20 12:19     ` Stefan Hajnoczi
@ 2013-06-20 13:15       ` Kevin Wolf
  0 siblings, 0 replies; 59+ messages in thread
From: Kevin Wolf @ 2013-06-20 13:15 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Fam Zheng, qemu-devel, dietmar, imain, Paolo Bonzini, xiawenc

Am 20.06.2013 um 14:19 hat Stefan Hajnoczi geschrieben:
> On Wed, Jun 19, 2013 at 01:07:42PM +0200, Kevin Wolf wrote:
> > Am 30.05.2013 um 14:34 hat Stefan Hajnoczi geschrieben:
> > > @drive-backup
> > > 
> > > Start a point-in-time copy of a block device to a new destination.  The
> > > status of ongoing drive-backup operations can be checked with
> > > query-block-jobs where the BlockJobInfo.type field has the value 'backup'.
> > > The operation can be stopped before it has completed using the
> > > block-job-cancel command.
> > > 
> > > @device: the name of the device which should be copied.
> > > 
> > > @target: the target of the new image. If the file exists, or if it
> > >          is a device, the existing file/device will be used as the new
> > >          destination.  If it does not exist, a new file will be created.
> > > 
> > > @format: #optional the format of the new destination, default is to
> > >          probe if @mode is 'existing', else the format of the source
> > > 
> > > @mode: #optional whether and how QEMU should create a new image, default is
> > >        'absolute-paths'.
> > > 
> > > @speed: #optional the maximum speed, in bytes per second
> > > 
> > > @on-source-error: #optional the action to take on an error on the source,
> > >                   default 'report'.  'stop' and 'enospc' can only be used
> > >                   if the block device supports io-status (see BlockInfo).
> > > 
> > > @on-target-error: #optional the action to take on an error on the target,
> > >                   default 'report' (no limitations, since this applies to
> > >                   a different block device than @device).
> > > 
> > > Note that @on-source-error and @on-target-error only affect background I/O.
> > > If an error occurs during a guest write request, the device's rerror/werror
> > > actions will be used.
> > > 
> > > Returns: nothing on success
> > >          If @device is not a valid block device, DeviceNotFound
> > > 
> > > Since 1.6
> > > 
> > > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > 
> > Reviewed-by: Kevin Wolf <kwolf@redhat.com>
> > 
> > But what about HMP, the series doesn't seem to have a separate patch to
> > add a command there?
> 
> I can send it as a follow-up if you want.  The test cases use QMP and so
> do management tools - I didn't feel the need for HMP.

Yes, please do. HMP is what plain qemu users use. I wouldn't consider it
legacy - quite the opposite: Having QMP as a stable machine-readable
protocol gives us more freedom to make HMP a good interface for human
users.

Kevin

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Qemu-devel] [PATCH v5 10/11] blockdev: add Abort transaction
  2013-06-19 11:21       ` Kevin Wolf
@ 2013-06-21  9:31         ` Eric Blake
  0 siblings, 0 replies; 59+ messages in thread
From: Eric Blake @ 2013-06-21  9:31 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Fam Zheng, qemu-devel, xiawenc, imain, Stefan Hajnoczi,
	Paolo Bonzini, dietmar

[-- Attachment #1: Type: text/plain, Size: 1554 bytes --]

On 06/19/2013 12:21 PM, Kevin Wolf wrote:
> Am 03.06.2013 um 11:21 hat Stefan Hajnoczi geschrieben:
>> On Thu, May 30, 2013 at 07:11:25AM -0600, Eric Blake wrote:
>>> On 05/30/2013 06:34 AM, Stefan Hajnoczi wrote:
>>>> The Abort action can be used to test QMP 'transaction' failure.  Add it
>>>> as the last action to exercise the .abort() and .cleanup() code paths
>>>> for all previous actions.
>>>
>>> Another thread questioned whether we should name this action
>>> __org.qemu-debug-abort, if only to be clear that it is not a normal
>>> interface.
>>
>> kwolf: Do you want Abort to be namespaced as a debug-only action?
>>
>> I think it's perfectly supportable so there's no need to hide it.
>> Granted it's rare that anyone would want to use this action.
> 
> I don't have a strong opinion on this. On the one hand it's probably
> useless outside testing and debugging, on the other hand it won't be a
> pain to support and the interface looks like it could be stable for
> the next few years...
> 
> It's your patch, so you get to decide.

I don't have a strong opinion, either - it doesn't matter what it is
named if no one outside of qtest uses that name :)  When developing
libvirt interfaces around new actions, I might use an abort to
intentionally test behavior of libvirt error-handling code, but that
would still be something I do by hand (by any name) and not something
coded into libvirt itself.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 621 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

end of thread, other threads:[~2013-06-21  9:31 UTC | newest]

Thread overview: 59+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-05-30 12:34 [Qemu-devel] [PATCH v5 00/11] block: drive-backup live backup command Stefan Hajnoczi
2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 01/11] notify: add NotiferWithReturn so notifier list can abort Stefan Hajnoczi
2013-05-30 22:27   ` Eric Blake
2013-06-03  9:11     ` Stefan Hajnoczi
2013-06-19 10:55   ` Kevin Wolf
2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 02/11] block: add bdrv_add_before_write_notifier() Stefan Hajnoczi
2013-05-30 22:47   ` Eric Blake
2013-06-19 10:56   ` Kevin Wolf
2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 03/11] block: add basic backup support to block driver Stefan Hajnoczi
2013-06-06  3:56   ` Fam Zheng
2013-06-06  8:05     ` Stefan Hajnoczi
2013-06-06  8:56       ` Fam Zheng
2013-06-07  7:18         ` Stefan Hajnoczi
2013-06-13  6:03           ` Wenchao Xia
2013-06-13  6:07             ` Wenchao Xia
2013-06-13  6:33               ` Fam Zheng
2013-06-13  8:02                 ` Wenchao Xia
2013-06-13  8:51                 ` Stefan Hajnoczi
2013-06-17  3:43   ` Fam Zheng
2013-06-17 12:36     ` Stefan Hajnoczi
2013-06-18 14:52   ` Kevin Wolf
2013-06-19  7:38     ` Stefan Hajnoczi
2013-06-19 10:50   ` Kevin Wolf
2013-06-19 11:14     ` Paolo Bonzini
2013-06-20 12:05       ` Stefan Hajnoczi
2013-06-19 11:19     ` Paolo Bonzini
2013-06-20 12:06       ` Stefan Hajnoczi
2013-06-20 12:11     ` Stefan Hajnoczi
2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 04/11] blockdev: drop redundant proto_drv check Stefan Hajnoczi
2013-06-03 17:14   ` Eric Blake
2013-06-06  5:00   ` Wenchao Xia
2013-06-19 10:57   ` Kevin Wolf
2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 05/11] blockdev: use bdrv_getlength() in qmp_drive_mirror() Stefan Hajnoczi
2013-05-30 13:24   ` Paolo Bonzini
2013-06-03 17:15   ` Eric Blake
2013-06-19 10:58   ` Kevin Wolf
2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 06/11] block: add drive-backup QMP command Stefan Hajnoczi
2013-06-03 17:09   ` Eric Blake
2013-06-19 11:07   ` Kevin Wolf
2013-06-20 12:19     ` Stefan Hajnoczi
2013-06-20 13:15       ` Kevin Wolf
2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 07/11] blockdev: rename BlkTransactionStates to singular Stefan Hajnoczi
2013-05-30 22:55   ` Eric Blake
2013-06-19 11:13   ` Kevin Wolf
2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 08/11] blockdev: allow BdrvActionOps->commit() to be NULL Stefan Hajnoczi
2013-05-30 22:57   ` Eric Blake
2013-06-03  9:16     ` Stefan Hajnoczi
2013-06-19 11:14   ` Kevin Wolf
2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 09/11] blockdev: add DriveBackup transaction Stefan Hajnoczi
2013-06-03 17:20   ` Eric Blake
2013-06-19 11:17   ` Kevin Wolf
2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 10/11] blockdev: add Abort transaction Stefan Hajnoczi
2013-05-30 13:11   ` Eric Blake
2013-06-03  9:21     ` Stefan Hajnoczi
2013-06-19 11:21       ` Kevin Wolf
2013-06-21  9:31         ` Eric Blake
2013-05-30 12:34 ` [Qemu-devel] [PATCH v5 11/11] qemu-iotests: add 055 drive-backup test case Stefan Hajnoczi
2013-06-19 12:41   ` Kevin Wolf
2013-06-06  5:06 ` [Qemu-devel] [PATCH v5 00/11] block: drive-backup live backup command Wenchao Xia

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.