All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 00/10] block: incremental backup transactions using BlockJobTxn
@ 2015-06-25 12:12 Stefan Hajnoczi
  2015-06-25 12:12 ` [Qemu-devel] [PATCH 01/10] qapi: Add transaction support to block-dirty-bitmap operations Stefan Hajnoczi
                   ` (10 more replies)
  0 siblings, 11 replies; 25+ messages in thread
From: Stefan Hajnoczi @ 2015-06-25 12:12 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, famz, John Snow, Jeff Cody, mreitz, vsementsov,
	Stefan Hajnoczi

This series is based on my block branch
(https://github.com/stefanha/qemu/commits/block).

It uses patches from John Snow's "[PATCH v6 00/10] block: incremental backup
transactions" series but implements the feature with a new transaction
mechanism for blockjobs called BlockJobTxn.

Recap: motivation for block job transactions
--------------------------------------------
If an incremental backup block job fails then we reclaim the bitmap so
the job can be retried.  The problem comes when multiple jobs are started as
part of a qmp 'transaction' command.  We need to group these jobs in a
transaction so that either all jobs complete successfully or all bitmaps are
reclaimed.

Without transactions, there is a case where some jobs complete successfully and
throw away their bitmaps, making it impossible to retry the backup by rerunning
the command if one of the jobs fails.

How does this implementation work?
----------------------------------
These patches add a BlockJobTxn object with the following API:

  txn = block_job_txn_new();
  block_job_txn_add_job(txn, job1);
  block_job_txn_add_job(txn, job2);
  block_job_txn_begin();

The jobs either both complete successfully or they both fail/cancel.  If the
user cancels job1 then job2 will also be cancelled and vice versa.

Jobs stay alive waiting for other jobs to complete.  They can be cancelled by
the user during this time.  Job blockers are still in effect and no other block
job can run on this device in the meantime (since QEMU currently only allows 1
job per device).  This is the main drawback to this approach but reasonable
since you probably don't want to run other jobs/operations until you're sure
the backup was successful (you won't be able to retry a failed backup if
there's a new job running).

Adding transaction support to the backup job is very easy.  It just needs to
make a call before throwing away the bitmap and returning from its coroutine:

  block_job_txn_prepare_to_complete(job->txn, job, ret);

  if (job->sync_bitmap) {
      BdrvDirtyBitmap *bm;
      if (ret < 0 || block_job_is_cancelled(&job->common)) {
          ...

John Snow (4):
  qapi: Add transaction support to block-dirty-bitmap operations
  iotests: add transactional incremental backup test
  block: rename BlkTransactionState and BdrvActionOps
  iotests: 124 - transactional failure test

Kashyap Chamarthy (1):
  qmp-commands.hx: Update the supported 'transaction' operations

Stefan Hajnoczi (5):
  block: keep bitmap if incremental backup job is cancelled
  block: add block job transactions
  blockdev: make BlockJobTxn available to qmp 'transaction'
  block/backup: support block job transactions
  tests: add BlockJobTxn unit test

 block.c                    |  19 ++-
 block/backup.c             |   9 +-
 blockdev.c                 | 298 +++++++++++++++++++++++++++++++++++----------
 blockjob.c                 | 160 ++++++++++++++++++++++++
 docs/bitmaps.md            |   6 +-
 include/block/block.h      |   2 +-
 include/block/block_int.h  |   6 +-
 include/block/blockjob.h   |  49 ++++++++
 qapi-schema.json           |   6 +-
 qmp-commands.hx            |  21 +++-
 tests/Makefile             |   3 +
 tests/qemu-iotests/124     | 180 ++++++++++++++++++++++++++-
 tests/qemu-iotests/124.out |   4 +-
 tests/test-blockjob-txn.c  | 191 +++++++++++++++++++++++++++++
 trace-events               |   4 +
 15 files changed, 873 insertions(+), 85 deletions(-)
 create mode 100644 tests/test-blockjob-txn.c

-- 
2.4.3

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH 01/10] qapi: Add transaction support to block-dirty-bitmap operations
  2015-06-25 12:12 [Qemu-devel] [PATCH 00/10] block: incremental backup transactions using BlockJobTxn Stefan Hajnoczi
@ 2015-06-25 12:12 ` Stefan Hajnoczi
  2015-06-25 12:12 ` [Qemu-devel] [PATCH 02/10] iotests: add transactional incremental backup test Stefan Hajnoczi
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Stefan Hajnoczi @ 2015-06-25 12:12 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, famz, John Snow, Jeff Cody, mreitz, vsementsov,
	Stefan Hajnoczi

From: John Snow <jsnow@redhat.com>

This adds two qmp commands to transactions.

block-dirty-bitmap-add allows you to create a bitmap simultaneously
alongside a new full backup to accomplish a clean synchronization
point.

block-dirty-bitmap-clear allows you to reset a bitmap back to as-if
it were new, which can also be used alongside a full backup to
accomplish a clean synchronization point.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block.c                   |  19 +++++++-
 blockdev.c                | 114 +++++++++++++++++++++++++++++++++++++++++++++-
 docs/bitmaps.md           |   6 +--
 include/block/block.h     |   1 -
 include/block/block_int.h |   3 ++
 qapi-schema.json          |   6 ++-
 6 files changed, 139 insertions(+), 10 deletions(-)

diff --git a/block.c b/block.c
index 81233be..1c8e06e 100644
--- a/block.c
+++ b/block.c
@@ -3510,10 +3510,25 @@ void bdrv_reset_dirty_bitmap(BdrvDirtyBitmap *bitmap,
     hbitmap_reset(bitmap->bitmap, cur_sector, nr_sectors);
 }
 
-void bdrv_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap)
+void bdrv_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap **out)
 {
     assert(bdrv_dirty_bitmap_enabled(bitmap));
-    hbitmap_reset_all(bitmap->bitmap);
+    if (!out) {
+        hbitmap_reset_all(bitmap->bitmap);
+    } else {
+        HBitmap *backup = bitmap->bitmap;
+        bitmap->bitmap = hbitmap_alloc(bitmap->size,
+                                       hbitmap_granularity(backup));
+        *out = backup;
+    }
+}
+
+void bdrv_undo_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap *in)
+{
+    HBitmap *tmp = bitmap->bitmap;
+    assert(bdrv_dirty_bitmap_enabled(bitmap));
+    bitmap->bitmap = in;
+    hbitmap_free(tmp);
 }
 
 void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector,
diff --git a/blockdev.c b/blockdev.c
index b354676..c06b4b6 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1708,6 +1708,106 @@ static void blockdev_backup_clean(BlkTransactionState *common)
     }
 }
 
+typedef struct BlockDirtyBitmapState {
+    BlkTransactionState common;
+    BdrvDirtyBitmap *bitmap;
+    BlockDriverState *bs;
+    AioContext *aio_context;
+    HBitmap *backup;
+    bool prepared;
+} BlockDirtyBitmapState;
+
+static void block_dirty_bitmap_add_prepare(BlkTransactionState *common,
+                                           Error **errp)
+{
+    Error *local_err = NULL;
+    BlockDirtyBitmapAdd *action;
+    BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
+                                             common, common);
+
+    action = common->action->block_dirty_bitmap_add;
+    /* AIO context taken and released within qmp_block_dirty_bitmap_add */
+    qmp_block_dirty_bitmap_add(action->node, action->name,
+                               action->has_granularity, action->granularity,
+                               &local_err);
+
+    if (!local_err) {
+        state->prepared = true;
+    } else {
+        error_propagate(errp, local_err);
+    }
+}
+
+static void block_dirty_bitmap_add_abort(BlkTransactionState *common)
+{
+    BlockDirtyBitmapAdd *action;
+    BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
+                                             common, common);
+
+    action = common->action->block_dirty_bitmap_add;
+    /* Should not be able to fail: IF the bitmap was added via .prepare(),
+     * then the node reference and bitmap name must have been valid.
+     */
+    if (state->prepared) {
+        qmp_block_dirty_bitmap_remove(action->node, action->name, &error_abort);
+    }
+}
+
+static void block_dirty_bitmap_clear_prepare(BlkTransactionState *common,
+                                             Error **errp)
+{
+    BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
+                                             common, common);
+    BlockDirtyBitmap *action;
+
+    action = common->action->block_dirty_bitmap_clear;
+    state->bitmap = block_dirty_bitmap_lookup(action->node,
+                                              action->name,
+                                              &state->bs,
+                                              &state->aio_context,
+                                              errp);
+    if (!state->bitmap) {
+        return;
+    }
+
+    if (bdrv_dirty_bitmap_frozen(state->bitmap)) {
+        error_setg(errp, "Cannot modify a frozen bitmap");
+        return;
+    } else if (!bdrv_dirty_bitmap_enabled(state->bitmap)) {
+        error_setg(errp, "Cannot clear a disabled bitmap");
+        return;
+    }
+
+    bdrv_clear_dirty_bitmap(state->bitmap, &state->backup);
+    /* AioContext is released in .clean() */
+}
+
+static void block_dirty_bitmap_clear_abort(BlkTransactionState *common)
+{
+    BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
+                                             common, common);
+
+    bdrv_undo_clear_dirty_bitmap(state->bitmap, state->backup);
+}
+
+static void block_dirty_bitmap_clear_commit(BlkTransactionState *common)
+{
+    BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
+                                             common, common);
+
+    hbitmap_free(state->backup);
+}
+
+static void block_dirty_bitmap_clear_clean(BlkTransactionState *common)
+{
+    BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
+                                             common, common);
+
+    if (state->aio_context) {
+        aio_context_release(state->aio_context);
+    }
+}
+
 static void abort_prepare(BlkTransactionState *common, Error **errp)
 {
     error_setg(errp, "Transaction aborted using Abort action");
@@ -1748,6 +1848,18 @@ static const BdrvActionOps actions[] = {
         .abort = internal_snapshot_abort,
         .clean = internal_snapshot_clean,
     },
+    [TRANSACTION_ACTION_KIND_BLOCK_DIRTY_BITMAP_ADD] = {
+        .instance_size = sizeof(BlockDirtyBitmapState),
+        .prepare = block_dirty_bitmap_add_prepare,
+        .abort = block_dirty_bitmap_add_abort,
+    },
+    [TRANSACTION_ACTION_KIND_BLOCK_DIRTY_BITMAP_CLEAR] = {
+        .instance_size = sizeof(BlockDirtyBitmapState),
+        .prepare = block_dirty_bitmap_clear_prepare,
+        .commit = block_dirty_bitmap_clear_commit,
+        .abort = block_dirty_bitmap_clear_abort,
+        .clean = block_dirty_bitmap_clear_clean,
+    }
 };
 
 /*
@@ -2131,7 +2243,7 @@ void qmp_block_dirty_bitmap_clear(const char *node, const char *name,
         goto out;
     }
 
-    bdrv_clear_dirty_bitmap(bitmap);
+    bdrv_clear_dirty_bitmap(bitmap, NULL);
 
  out:
     aio_context_release(aio_context);
diff --git a/docs/bitmaps.md b/docs/bitmaps.md
index f066b48..a60fee1 100644
--- a/docs/bitmaps.md
+++ b/docs/bitmaps.md
@@ -97,11 +97,7 @@ which is included at the end of this document.
 }
 ```
 
-## Transactions (Not yet implemented)
-
-* Transactional commands are forthcoming in a future version,
-  and are not yet available for use. This section serves as
-  documentation of intent for their design and usage.
+## Transactions
 
 ### Justification
 
diff --git a/include/block/block.h b/include/block/block.h
index 07bb724..a4c505d 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -493,7 +493,6 @@ void bdrv_set_dirty_bitmap(BdrvDirtyBitmap *bitmap,
                            int64_t cur_sector, int nr_sectors);
 void bdrv_reset_dirty_bitmap(BdrvDirtyBitmap *bitmap,
                              int64_t cur_sector, int nr_sectors);
-void bdrv_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap);
 void bdrv_dirty_iter_init(BdrvDirtyBitmap *bitmap, struct HBitmapIter *hbi);
 void bdrv_set_dirty_iter(struct HBitmapIter *hbi, int64_t offset);
 int64_t bdrv_get_dirty_count(BdrvDirtyBitmap *bitmap);
diff --git a/include/block/block_int.h b/include/block/block_int.h
index b0476fc..ea3e7f0 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -663,4 +663,7 @@ void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector, int nr_sectors);
 void bdrv_reset_dirty(BlockDriverState *bs, int64_t cur_sector,
                       int nr_sectors);
 
+void bdrv_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap **out);
+void bdrv_undo_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap *in);
+
 #endif /* BLOCK_INT_H */
diff --git a/qapi-schema.json b/qapi-schema.json
index 106008c..5fba570 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1490,6 +1490,8 @@
 # abort since 1.6
 # blockdev-snapshot-internal-sync since 1.7
 # blockdev-backup since 2.3
+# block-dirty-bitmap-add since 2.4
+# block-dirty-bitmap-clear since 2.4
 ##
 { 'union': 'TransactionAction',
   'data': {
@@ -1497,7 +1499,9 @@
        'drive-backup': 'DriveBackup',
        'blockdev-backup': 'BlockdevBackup',
        'abort': 'Abort',
-       'blockdev-snapshot-internal-sync': 'BlockdevSnapshotInternal'
+       'blockdev-snapshot-internal-sync': 'BlockdevSnapshotInternal',
+       'block-dirty-bitmap-add': 'BlockDirtyBitmapAdd',
+       'block-dirty-bitmap-clear': 'BlockDirtyBitmap'
    } }
 
 ##
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH 02/10] iotests: add transactional incremental backup test
  2015-06-25 12:12 [Qemu-devel] [PATCH 00/10] block: incremental backup transactions using BlockJobTxn Stefan Hajnoczi
  2015-06-25 12:12 ` [Qemu-devel] [PATCH 01/10] qapi: Add transaction support to block-dirty-bitmap operations Stefan Hajnoczi
@ 2015-06-25 12:12 ` Stefan Hajnoczi
  2015-06-25 12:12 ` [Qemu-devel] [PATCH 03/10] block: rename BlkTransactionState and BdrvActionOps Stefan Hajnoczi
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Stefan Hajnoczi @ 2015-06-25 12:12 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, famz, John Snow, Jeff Cody, mreitz, vsementsov,
	Stefan Hajnoczi

From: John Snow <jsnow@redhat.com>

Test simple usage cases for using transactions to create
and synchronize incremental backups.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 tests/qemu-iotests/124     | 54 ++++++++++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/124.out |  4 ++--
 2 files changed, 56 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/124 b/tests/qemu-iotests/124
index 8abce2f..530a392 100644
--- a/tests/qemu-iotests/124
+++ b/tests/qemu-iotests/124
@@ -36,6 +36,23 @@ def try_remove(img):
         pass
 
 
+def transaction_action(action, **kwargs):
+    return {
+        'type': action,
+        'data': kwargs
+    }
+
+
+def transaction_bitmap_clear(node, name, **kwargs):
+    return transaction_action('block-dirty-bitmap-clear',
+                              node=node, name=name, **kwargs)
+
+
+def transaction_drive_backup(device, target, **kwargs):
+    return transaction_action('drive-backup', device=device, target=target,
+                              **kwargs)
+
+
 class Bitmap:
     def __init__(self, name, drive):
         self.name = name
@@ -264,6 +281,43 @@ class TestIncrementalBackup(iotests.QMPTestCase):
         return self.do_incremental_simple(granularity=131072)
 
 
+    def test_incremental_transaction(self):
+        '''Test: Verify backups made from transactionally created bitmaps.
+
+        Create a bitmap "before" VM execution begins, then create a second
+        bitmap AFTER writes have already occurred. Use transactions to create
+        a full backup and synchronize both bitmaps to this backup.
+        Create an incremental backup through both bitmaps and verify that
+        both backups match the current drive0 image.
+        '''
+
+        drive0 = self.drives[0]
+        bitmap0 = self.add_bitmap('bitmap0', drive0)
+        self.hmp_io_writes(drive0['id'], (('0xab', 0, 512),
+                                          ('0xfe', '16M', '256k'),
+                                          ('0x64', '32736k', '64k')))
+        bitmap1 = self.add_bitmap('bitmap1', drive0)
+
+        result = self.vm.qmp('transaction', actions=[
+            transaction_bitmap_clear(bitmap0.drive['id'], bitmap0.name),
+            transaction_bitmap_clear(bitmap1.drive['id'], bitmap1.name),
+            transaction_drive_backup(drive0['id'], drive0['backup'],
+                                     sync='full', format=drive0['fmt'])
+        ])
+        self.assert_qmp(result, 'return', {})
+        self.wait_until_completed(drive0['id'])
+        self.files.append(drive0['backup'])
+
+        self.hmp_io_writes(drive0['id'], (('0x9a', 0, 512),
+                                          ('0x55', '8M', '352k'),
+                                          ('0x78', '15872k', '1M')))
+        # Both bitmaps should be correctly in sync.
+        self.create_incremental(bitmap0)
+        self.create_incremental(bitmap1)
+        self.vm.shutdown()
+        self.check_backups()
+
+
     def test_incremental_failure(self):
         '''Test: Verify backups made after a failure are correct.
 
diff --git a/tests/qemu-iotests/124.out b/tests/qemu-iotests/124.out
index 2f7d390..594c16f 100644
--- a/tests/qemu-iotests/124.out
+++ b/tests/qemu-iotests/124.out
@@ -1,5 +1,5 @@
-.......
+........
 ----------------------------------------------------------------------
-Ran 7 tests
+Ran 8 tests
 
 OK
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH 03/10] block: rename BlkTransactionState and BdrvActionOps
  2015-06-25 12:12 [Qemu-devel] [PATCH 00/10] block: incremental backup transactions using BlockJobTxn Stefan Hajnoczi
  2015-06-25 12:12 ` [Qemu-devel] [PATCH 01/10] qapi: Add transaction support to block-dirty-bitmap operations Stefan Hajnoczi
  2015-06-25 12:12 ` [Qemu-devel] [PATCH 02/10] iotests: add transactional incremental backup test Stefan Hajnoczi
@ 2015-06-25 12:12 ` Stefan Hajnoczi
  2015-06-25 12:12 ` [Qemu-devel] [PATCH 04/10] block: keep bitmap if incremental backup job is cancelled Stefan Hajnoczi
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Stefan Hajnoczi @ 2015-06-25 12:12 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, famz, John Snow, Jeff Cody, mreitz, vsementsov,
	Stefan Hajnoczi

From: John Snow <jsnow@redhat.com>

These structures are misnomers, somewhat.

(1) BlockTransactionState is not state for a transaction,
    but is rather state for a single transaction action.
    Rename it "BlkActionState" to be more accurate.

(2) The BdrvActionOps describes operations for the BlkActionState,
    above. This name might imply a 'BdrvAction' or a 'BdrvActionState',
    which there isn't.
    Rename this to 'BlkActionOps' to match 'BlkActionState'.

Lastly, update the surrounding in-line documentation and comments
to reflect the current nature of how Transactions operate.

This patch changes only comments and names, and should not affect
behavior in any way.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 blockdev.c | 116 ++++++++++++++++++++++++++++++++++---------------------------
 1 file changed, 65 insertions(+), 51 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index c06b4b6..c572950 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1240,43 +1240,57 @@ static BdrvDirtyBitmap *block_dirty_bitmap_lookup(const char *node,
 
 /* New and old BlockDriverState structs for atomic group operations */
 
-typedef struct BlkTransactionState BlkTransactionState;
+typedef struct BlkActionState BlkActionState;
 
-/* Only prepare() may fail. In a single transaction, only one of commit() or
-   abort() will be called, clean() will always be called if it present. */
-typedef struct BdrvActionOps {
-    /* Size of state struct, in bytes. */
+/**
+ * BlkActionOps:
+ * Table of operations that define an Action.
+ *
+ * @instance_size: Size of state struct, in bytes.
+ * @prepare: Prepare the work, must NOT be NULL.
+ * @commit: Commit the changes, can be NULL.
+ * @abort: Abort the changes on fail, can be NULL.
+ * @clean: Clean up resources after all transaction actions have called
+ *         commit() or abort(). Can be NULL.
+ *
+ * Only prepare() may fail. In a single transaction, only one of commit() or
+ * abort() will be called. clean() will always be called if it is present.
+ */
+typedef struct BlkActionOps {
     size_t instance_size;
-    /* Prepare the work, must NOT be NULL. */
-    void (*prepare)(BlkTransactionState *common, Error **errp);
-    /* Commit the changes, can be NULL. */
-    void (*commit)(BlkTransactionState *common);
-    /* Abort the changes on fail, can be NULL. */
-    void (*abort)(BlkTransactionState *common);
-    /* Clean up resource in the end, can be NULL. */
-    void (*clean)(BlkTransactionState *common);
-} BdrvActionOps;
+    void (*prepare)(BlkActionState *common, Error **errp);
+    void (*commit)(BlkActionState *common);
+    void (*abort)(BlkActionState *common);
+    void (*clean)(BlkActionState *common);
+} BlkActionOps;
 
-/*
- * This structure must be arranged as first member in child type, assuming
- * that compiler will also arrange it to the same address with parent instance.
- * Later it will be used in free().
+/**
+ * BlkActionState:
+ * Describes one Action's state within a Transaction.
+ *
+ * @action: QAPI-defined enum identifying which Action to perform.
+ * @ops: Table of ActionOps this Action can perform.
+ * @entry: List membership for all Actions in this Transaction.
+ *
+ * This structure must be arranged as first member in a subclassed type,
+ * assuming that the compiler will also arrange it to the same offsets as the
+ * base class.
  */
-struct BlkTransactionState {
+struct BlkActionState {
     TransactionAction *action;
-    const BdrvActionOps *ops;
-    QSIMPLEQ_ENTRY(BlkTransactionState) entry;
+    const BlkActionOps *ops;
+    QSIMPLEQ_ENTRY(BlkActionState) entry;
 };
 
 /* internal snapshot private data */
 typedef struct InternalSnapshotState {
-    BlkTransactionState common;
+    BlkActionState common;
     BlockDriverState *bs;
     AioContext *aio_context;
     QEMUSnapshotInfo sn;
 } InternalSnapshotState;
 
-static void internal_snapshot_prepare(BlkTransactionState *common,
+static void internal_snapshot_prepare(BlkActionState *common,
                                       Error **errp)
 {
     Error *local_err = NULL;
@@ -1372,7 +1386,7 @@ static void internal_snapshot_prepare(BlkTransactionState *common,
     state->bs = bs;
 }
 
-static void internal_snapshot_abort(BlkTransactionState *common)
+static void internal_snapshot_abort(BlkActionState *common)
 {
     InternalSnapshotState *state =
                              DO_UPCAST(InternalSnapshotState, common, common);
@@ -1395,7 +1409,7 @@ static void internal_snapshot_abort(BlkTransactionState *common)
     }
 }
 
-static void internal_snapshot_clean(BlkTransactionState *common)
+static void internal_snapshot_clean(BlkActionState *common)
 {
     InternalSnapshotState *state = DO_UPCAST(InternalSnapshotState,
                                              common, common);
@@ -1407,13 +1421,13 @@ static void internal_snapshot_clean(BlkTransactionState *common)
 
 /* external snapshot private data */
 typedef struct ExternalSnapshotState {
-    BlkTransactionState common;
+    BlkActionState common;
     BlockDriverState *old_bs;
     BlockDriverState *new_bs;
     AioContext *aio_context;
 } ExternalSnapshotState;
 
-static void external_snapshot_prepare(BlkTransactionState *common,
+static void external_snapshot_prepare(BlkActionState *common,
                                       Error **errp)
 {
     BlockDriver *drv;
@@ -1534,7 +1548,7 @@ static void external_snapshot_prepare(BlkTransactionState *common,
     }
 }
 
-static void external_snapshot_commit(BlkTransactionState *common)
+static void external_snapshot_commit(BlkActionState *common)
 {
     ExternalSnapshotState *state =
                              DO_UPCAST(ExternalSnapshotState, common, common);
@@ -1552,7 +1566,7 @@ static void external_snapshot_commit(BlkTransactionState *common)
     aio_context_release(state->aio_context);
 }
 
-static void external_snapshot_abort(BlkTransactionState *common)
+static void external_snapshot_abort(BlkActionState *common)
 {
     ExternalSnapshotState *state =
                              DO_UPCAST(ExternalSnapshotState, common, common);
@@ -1565,13 +1579,13 @@ static void external_snapshot_abort(BlkTransactionState *common)
 }
 
 typedef struct DriveBackupState {
-    BlkTransactionState common;
+    BlkActionState common;
     BlockDriverState *bs;
     AioContext *aio_context;
     BlockJob *job;
 } DriveBackupState;
 
-static void drive_backup_prepare(BlkTransactionState *common, Error **errp)
+static void drive_backup_prepare(BlkActionState *common, Error **errp)
 {
     DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
     BlockDriverState *bs;
@@ -1612,7 +1626,7 @@ static void drive_backup_prepare(BlkTransactionState *common, Error **errp)
     state->job = state->bs->job;
 }
 
-static void drive_backup_abort(BlkTransactionState *common)
+static void drive_backup_abort(BlkActionState *common)
 {
     DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
     BlockDriverState *bs = state->bs;
@@ -1623,7 +1637,7 @@ static void drive_backup_abort(BlkTransactionState *common)
     }
 }
 
-static void drive_backup_clean(BlkTransactionState *common)
+static void drive_backup_clean(BlkActionState *common)
 {
     DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
 
@@ -1633,13 +1647,13 @@ static void drive_backup_clean(BlkTransactionState *common)
 }
 
 typedef struct BlockdevBackupState {
-    BlkTransactionState common;
+    BlkActionState common;
     BlockDriverState *bs;
     BlockJob *job;
     AioContext *aio_context;
 } BlockdevBackupState;
 
-static void blockdev_backup_prepare(BlkTransactionState *common, Error **errp)
+static void blockdev_backup_prepare(BlkActionState *common, Error **errp)
 {
     BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, common);
     BlockdevBackup *backup;
@@ -1688,7 +1702,7 @@ static void blockdev_backup_prepare(BlkTransactionState *common, Error **errp)
     state->job = state->bs->job;
 }
 
-static void blockdev_backup_abort(BlkTransactionState *common)
+static void blockdev_backup_abort(BlkActionState *common)
 {
     BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, common);
     BlockDriverState *bs = state->bs;
@@ -1699,7 +1713,7 @@ static void blockdev_backup_abort(BlkTransactionState *common)
     }
 }
 
-static void blockdev_backup_clean(BlkTransactionState *common)
+static void blockdev_backup_clean(BlkActionState *common)
 {
     BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, common);
 
@@ -1709,7 +1723,7 @@ static void blockdev_backup_clean(BlkTransactionState *common)
 }
 
 typedef struct BlockDirtyBitmapState {
-    BlkTransactionState common;
+    BlkActionState common;
     BdrvDirtyBitmap *bitmap;
     BlockDriverState *bs;
     AioContext *aio_context;
@@ -1717,7 +1731,7 @@ typedef struct BlockDirtyBitmapState {
     bool prepared;
 } BlockDirtyBitmapState;
 
-static void block_dirty_bitmap_add_prepare(BlkTransactionState *common,
+static void block_dirty_bitmap_add_prepare(BlkActionState *common,
                                            Error **errp)
 {
     Error *local_err = NULL;
@@ -1738,7 +1752,7 @@ static void block_dirty_bitmap_add_prepare(BlkTransactionState *common,
     }
 }
 
-static void block_dirty_bitmap_add_abort(BlkTransactionState *common)
+static void block_dirty_bitmap_add_abort(BlkActionState *common)
 {
     BlockDirtyBitmapAdd *action;
     BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
@@ -1753,7 +1767,7 @@ static void block_dirty_bitmap_add_abort(BlkTransactionState *common)
     }
 }
 
-static void block_dirty_bitmap_clear_prepare(BlkTransactionState *common,
+static void block_dirty_bitmap_clear_prepare(BlkActionState *common,
                                              Error **errp)
 {
     BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
@@ -1782,7 +1796,7 @@ static void block_dirty_bitmap_clear_prepare(BlkTransactionState *common,
     /* AioContext is released in .clean() */
 }
 
-static void block_dirty_bitmap_clear_abort(BlkTransactionState *common)
+static void block_dirty_bitmap_clear_abort(BlkActionState *common)
 {
     BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
                                              common, common);
@@ -1790,7 +1804,7 @@ static void block_dirty_bitmap_clear_abort(BlkTransactionState *common)
     bdrv_undo_clear_dirty_bitmap(state->bitmap, state->backup);
 }
 
-static void block_dirty_bitmap_clear_commit(BlkTransactionState *common)
+static void block_dirty_bitmap_clear_commit(BlkActionState *common)
 {
     BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
                                              common, common);
@@ -1798,7 +1812,7 @@ static void block_dirty_bitmap_clear_commit(BlkTransactionState *common)
     hbitmap_free(state->backup);
 }
 
-static void block_dirty_bitmap_clear_clean(BlkTransactionState *common)
+static void block_dirty_bitmap_clear_clean(BlkActionState *common)
 {
     BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
                                              common, common);
@@ -1808,17 +1822,17 @@ static void block_dirty_bitmap_clear_clean(BlkTransactionState *common)
     }
 }
 
-static void abort_prepare(BlkTransactionState *common, Error **errp)
+static void abort_prepare(BlkActionState *common, Error **errp)
 {
     error_setg(errp, "Transaction aborted using Abort action");
 }
 
-static void abort_commit(BlkTransactionState *common)
+static void abort_commit(BlkActionState *common)
 {
     g_assert_not_reached(); /* this action never succeeds */
 }
 
-static const BdrvActionOps actions[] = {
+static const BlkActionOps actions[] = {
     [TRANSACTION_ACTION_KIND_BLOCKDEV_SNAPSHOT_SYNC] = {
         .instance_size = sizeof(ExternalSnapshotState),
         .prepare  = external_snapshot_prepare,
@@ -1838,7 +1852,7 @@ static const BdrvActionOps actions[] = {
         .clean = blockdev_backup_clean,
     },
     [TRANSACTION_ACTION_KIND_ABORT] = {
-        .instance_size = sizeof(BlkTransactionState),
+        .instance_size = sizeof(BlkActionState),
         .prepare = abort_prepare,
         .commit = abort_commit,
     },
@@ -1869,10 +1883,10 @@ static const BdrvActionOps actions[] = {
 void qmp_transaction(TransactionActionList *dev_list, Error **errp)
 {
     TransactionActionList *dev_entry = dev_list;
-    BlkTransactionState *state, *next;
+    BlkActionState *state, *next;
     Error *local_err = NULL;
 
-    QSIMPLEQ_HEAD(snap_bdrv_states, BlkTransactionState) snap_bdrv_states;
+    QSIMPLEQ_HEAD(snap_bdrv_states, BlkActionState) snap_bdrv_states;
     QSIMPLEQ_INIT(&snap_bdrv_states);
 
     /* drain all i/o before any operations */
@@ -1881,7 +1895,7 @@ void qmp_transaction(TransactionActionList *dev_list, Error **errp)
     /* We don't do anything in this loop that commits us to the operations */
     while (NULL != dev_entry) {
         TransactionAction *dev_info = NULL;
-        const BdrvActionOps *ops;
+        const BlkActionOps *ops;
 
         dev_info = dev_entry->value;
         dev_entry = dev_entry->next;
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH 04/10] block: keep bitmap if incremental backup job is cancelled
  2015-06-25 12:12 [Qemu-devel] [PATCH 00/10] block: incremental backup transactions using BlockJobTxn Stefan Hajnoczi
                   ` (2 preceding siblings ...)
  2015-06-25 12:12 ` [Qemu-devel] [PATCH 03/10] block: rename BlkTransactionState and BdrvActionOps Stefan Hajnoczi
@ 2015-06-25 12:12 ` Stefan Hajnoczi
  2015-06-26  6:00   ` Fam Zheng
  2015-06-29 22:36   ` John Snow
  2015-06-25 12:12 ` [Qemu-devel] [PATCH 05/10] block: add block job transactions Stefan Hajnoczi
                   ` (6 subsequent siblings)
  10 siblings, 2 replies; 25+ messages in thread
From: Stefan Hajnoczi @ 2015-06-25 12:12 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, famz, John Snow, Jeff Cody, mreitz, vsementsov,
	Stefan Hajnoczi

Reclaim the dirty bitmap if an incremental backup block job is
cancelled.  The ret variable may be 0 when the job is cancelled so it's
not enough to check ret < 0.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/backup.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/backup.c b/block/backup.c
index 4a1af68..ddf8424 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -431,7 +431,7 @@ static void coroutine_fn backup_run(void *opaque)
 
     if (job->sync_bitmap) {
         BdrvDirtyBitmap *bm;
-        if (ret < 0) {
+        if (ret < 0 || block_job_is_cancelled(&job->common)) {
             /* Merge the successor back into the parent, delete nothing. */
             bm = bdrv_reclaim_dirty_bitmap(bs, job->sync_bitmap, NULL);
             assert(bm);
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH 05/10] block: add block job transactions
  2015-06-25 12:12 [Qemu-devel] [PATCH 00/10] block: incremental backup transactions using BlockJobTxn Stefan Hajnoczi
                   ` (3 preceding siblings ...)
  2015-06-25 12:12 ` [Qemu-devel] [PATCH 04/10] block: keep bitmap if incremental backup job is cancelled Stefan Hajnoczi
@ 2015-06-25 12:12 ` Stefan Hajnoczi
  2015-06-26  6:41   ` Fam Zheng
  2015-06-29 22:38   ` John Snow
  2015-06-25 12:12 ` [Qemu-devel] [PATCH 06/10] blockdev: make BlockJobTxn available to qmp 'transaction' Stefan Hajnoczi
                   ` (5 subsequent siblings)
  10 siblings, 2 replies; 25+ messages in thread
From: Stefan Hajnoczi @ 2015-06-25 12:12 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, famz, John Snow, Jeff Cody, mreitz, vsementsov,
	Stefan Hajnoczi

Sometimes block jobs must execute as a transaction group.  Finishing
jobs wait until all other jobs are ready to complete successfully.
Failure or cancellation of one job cancels the other jobs in the group.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 blockjob.c                | 160 ++++++++++++++++++++++++++++++++++++++++++++++
 include/block/block.h     |   1 +
 include/block/block_int.h |   3 +-
 include/block/blockjob.h  |  49 ++++++++++++++
 trace-events              |   4 ++
 5 files changed, 216 insertions(+), 1 deletion(-)

diff --git a/blockjob.c b/blockjob.c
index ec46fad..3c6f1d4 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -400,3 +400,163 @@ void block_job_defer_to_main_loop(BlockJob *job,
 
     qemu_bh_schedule(data->bh);
 }
+
+/* Transactional group of block jobs */
+struct BlockJobTxn {
+    /* Jobs may be in different AioContexts so protect all fields */
+    QemuMutex lock;
+
+    /* Reference count for txn object */
+    unsigned int ref;
+
+    /* Is this txn cancelling its jobs? */
+    bool aborting;
+
+    /* Number of jobs still running */
+    unsigned int jobs_pending;
+
+    /* List of jobs */
+    QLIST_HEAD(, BlockJob) jobs;
+};
+
+BlockJobTxn *block_job_txn_new(void)
+{
+    BlockJobTxn *txn = g_new(BlockJobTxn, 1);
+    qemu_mutex_init(&txn->lock);
+    txn->ref = 1; /* dropped by block_job_txn_begin() */
+    txn->aborting = false;
+    txn->jobs_pending = 0;
+    QLIST_INIT(&txn->jobs);
+    return txn;
+}
+
+static void block_job_txn_unref(BlockJobTxn *txn)
+{
+    qemu_mutex_lock(&txn->lock);
+
+    if (--txn->ref > 0) {
+        qemu_mutex_unlock(&txn->lock);
+        return;
+    }
+
+    qemu_mutex_unlock(&txn->lock);
+    qemu_mutex_destroy(&txn->lock);
+    g_free(txn);
+}
+
+/* The purpose of this is to keep txn alive until all jobs have been added */
+void block_job_txn_begin(BlockJobTxn *txn)
+{
+    block_job_txn_unref(txn);
+}
+
+void block_job_txn_add_job(BlockJobTxn *txn, BlockJob *job)
+{
+    if (!txn) {
+        return;
+    }
+
+    assert(!job->txn);
+    job->txn = txn;
+
+    qemu_mutex_lock(&txn->lock);
+    txn->ref++;
+    txn->jobs_pending++;
+    QLIST_INSERT_HEAD(&txn->jobs, job, txn_list);
+    qemu_mutex_unlock(&txn->lock);
+}
+
+/* Cancel all other jobs in case of abort, wake all waiting jobs in case of
+ * successful completion.  Runs from main loop.
+ */
+static void block_job_txn_complete(BlockJob *job, void *opaque)
+{
+    BlockJobTxn *txn = opaque;
+    BlockJob *other_job;
+    bool aborting = txn->aborting;
+
+    qemu_mutex_lock(&txn->lock);
+    txn->ref++; /* keep txn alive until the end of this loop */
+
+    QLIST_FOREACH(other_job, &txn->jobs, txn_list) {
+        AioContext *ctx;
+
+        qemu_mutex_unlock(&txn->lock);
+        ctx = bdrv_get_aio_context(other_job->bs);
+        aio_context_acquire(ctx);
+
+        /* Cancel all other jobs if aborting.  Don't cancel our own failed job
+         * since cancellation throws away the error value.
+         */
+        if (aborting && other_job != job) {
+            block_job_cancel(other_job);
+        } else {
+            block_job_enter(other_job);
+        }
+
+        aio_context_release(ctx);
+        qemu_mutex_lock(&txn->lock);
+    }
+
+    qemu_mutex_unlock(&txn->lock);
+    block_job_txn_unref(txn);
+}
+
+void coroutine_fn block_job_txn_prepare_to_complete(BlockJobTxn *txn,
+                                                    BlockJob *job,
+                                                    int ret)
+{
+    if (!txn) {
+        return;
+    }
+
+    qemu_mutex_lock(&txn->lock);
+
+    /* This function is entered in 3 cases:
+     *
+     * 1. Successful job completion - wait for other jobs
+     * 2. First failed/cancelled job in txn - cancel other jobs and wait
+     * 3. Subsequent cancelled jobs - finish immediately, don't wait
+     */
+    trace_block_job_txn_prepare_to_complete_entry(txn, job, ret,
+                                                  block_job_is_cancelled(job),
+                                                  txn->aborting,
+                                                  txn->jobs_pending);
+
+    if (txn->aborting) { /* Case 3 */
+        assert(block_job_is_cancelled(job));
+        goto out; /* already cancelled, don't yield */
+    }
+
+    if (ret != 0 || block_job_is_cancelled(job)) { /* Case 2 */
+abort:
+        txn->aborting = true;
+        block_job_defer_to_main_loop(job, block_job_txn_complete, txn);
+    } else { /* Case 1 */
+        if (--txn->jobs_pending == 0) {
+            block_job_defer_to_main_loop(job, block_job_txn_complete, txn);
+        }
+    }
+
+    /* Wait for block_job_txn_complete() */
+    do {
+        qemu_mutex_unlock(&txn->lock);
+        job->busy = false;
+        qemu_coroutine_yield();
+        job->busy = true;
+        qemu_mutex_lock(&txn->lock);
+
+        if (block_job_is_cancelled(job) && !txn->aborting) {
+            goto abort; /* this job just got cancelled by the user */
+        }
+    } while (!txn->aborting && txn->jobs_pending > 0);
+
+out:
+    trace_block_job_txn_prepare_to_complete_return(txn, job, ret,
+                                                   block_job_is_cancelled(job),
+                                                   txn->aborting,
+                                                   txn->jobs_pending);
+
+    qemu_mutex_unlock(&txn->lock);
+    block_job_txn_unref(txn);
+}
diff --git a/include/block/block.h b/include/block/block.h
index a4c505d..cb19c73 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -13,6 +13,7 @@
 typedef struct BlockDriver BlockDriver;
 typedef struct BlockJob BlockJob;
 typedef struct BdrvChildRole BdrvChildRole;
+typedef struct BlockJobTxn BlockJobTxn;
 
 typedef struct BlockDriverInfo {
     /* in bytes, 0 if irrelevant */
diff --git a/include/block/block_int.h b/include/block/block_int.h
index ea3e7f0..812a18a 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -639,6 +639,7 @@ void mirror_start(BlockDriverState *bs, BlockDriverState *target,
  * @on_source_error: The action to take upon error reading from the source.
  * @on_target_error: The action to take upon error writing to the target.
  * @cb: Completion function for the job.
+ * @txn: Transaction that this job is part of (may be NULL).
  * @opaque: Opaque pointer value passed to @cb.
  *
  * Start a backup operation on @bs.  Clusters in @bs are written to @target
@@ -650,7 +651,7 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
                   BlockdevOnError on_source_error,
                   BlockdevOnError on_target_error,
                   BlockCompletionFunc *cb, void *opaque,
-                  Error **errp);
+                  BlockJobTxn *txn, Error **errp);
 
 void blk_dev_change_media_cb(BlockBackend *blk, bool load);
 bool blk_dev_has_removable_media(BlockBackend *blk);
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index 57d8ef1..ce57e98 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -122,6 +122,10 @@ struct BlockJob {
 
     /** The opaque value that is passed to the completion function.  */
     void *opaque;
+
+    /** Non-NULL if this job is part of a transaction */
+    BlockJobTxn *txn;
+    QLIST_ENTRY(BlockJob) txn_list;
 };
 
 /**
@@ -348,4 +352,49 @@ void block_job_defer_to_main_loop(BlockJob *job,
                                   BlockJobDeferToMainLoopFn *fn,
                                   void *opaque);
 
+/**
+ * block_job_txn_new:
+ *
+ * Allocate and return a new block job transaction.  Jobs can be added to the
+ * transaction using block_job_txn_add_job().  block_job_txn_begin() must be
+ * called when all jobs (if any) have been added.
+ *
+ * All jobs in the transaction either complete successfully or fail/cancel as a
+ * group.  Jobs wait for each other before completing.  Cancelling one job
+ * cancels all jobs in the transaction.
+ */
+BlockJobTxn *block_job_txn_new(void);
+
+/**
+ * block_job_txn_add_job:
+ * @txn: The transaction
+ * @job: Job to add to the transaction
+ *
+ * Add @job to the transaction.  The @job must not already be in a transaction.
+ * The block job driver must call block_job_txn_prepare_to_complete() before
+ * final cleanup and completion.
+ */
+void block_job_txn_add_job(BlockJobTxn *txn, BlockJob *job);
+
+/**
+ * block_job_txn_begin:
+ * @txn: The transaction
+ *
+ * Call this to mark the end of adding jobs to the transaction.  This must be
+ * called even if no jobs were added.
+ */
+void block_job_txn_begin(BlockJobTxn *txn);
+
+/**
+ * block_job_txn_prepare_to_complete:
+ * @txn: The transaction
+ * @job: The block job
+ * @ret: Block job return value (0 for success, otherwise job failure)
+ *
+ * Wait for other jobs in the transaction to complete.  If @ret is non-zero or
+ * @job is cancelled, all other jobs in the transaction will be cancelled.
+ */
+void coroutine_fn block_job_txn_prepare_to_complete(BlockJobTxn *txn,
+                                                    BlockJob *job, int ret);
+
 #endif
diff --git a/trace-events b/trace-events
index 52b7efa..b6a43a0 100644
--- a/trace-events
+++ b/trace-events
@@ -123,6 +123,10 @@ virtio_blk_data_plane_start(void *s) "dataplane %p"
 virtio_blk_data_plane_stop(void *s) "dataplane %p"
 virtio_blk_data_plane_process_request(void *s, unsigned int out_num, unsigned int in_num, unsigned int head) "dataplane %p out_num %u in_num %u head %u"
 
+# blockjob.c
+block_job_txn_prepare_to_complete_entry(void *txn, void *job, int ret, bool cancelled, bool aborting, unsigned int jobs_pending) "txn %p job %p ret %d cancelled %d aborting %d jobs_pending %u"
+block_job_txn_prepare_to_complete_return(void *txn, void *job, int ret, bool cancelled, bool aborting, unsigned int jobs_pending) "txn %p job %p ret %d cancelled %d aborting %d jobs_pending %u"
+
 # hw/virtio/dataplane/vring.c
 vring_setup(uint64_t physical, void *desc, void *avail, void *used) "vring physical %#"PRIx64" desc %p avail %p used %p"
 
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH 06/10] blockdev: make BlockJobTxn available to qmp 'transaction'
  2015-06-25 12:12 [Qemu-devel] [PATCH 00/10] block: incremental backup transactions using BlockJobTxn Stefan Hajnoczi
                   ` (4 preceding siblings ...)
  2015-06-25 12:12 ` [Qemu-devel] [PATCH 05/10] block: add block job transactions Stefan Hajnoczi
@ 2015-06-25 12:12 ` Stefan Hajnoczi
  2015-06-26  6:42   ` Fam Zheng
  2015-06-29 22:38   ` John Snow
  2015-06-25 12:12 ` [Qemu-devel] [PATCH 07/10] block/backup: support block job transactions Stefan Hajnoczi
                   ` (4 subsequent siblings)
  10 siblings, 2 replies; 25+ messages in thread
From: Stefan Hajnoczi @ 2015-06-25 12:12 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, famz, John Snow, Jeff Cody, mreitz, vsementsov,
	Stefan Hajnoczi

Provide a BlockJobTxn to actions executed in a qmp 'transaction'
command.  This allows actions to make their block jobs either complete
as a group or fail/cancel together.

The next patch adds the first user.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 blockdev.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/blockdev.c b/blockdev.c
index c572950..453f3ec 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1279,6 +1279,7 @@ typedef struct BlkActionOps {
 struct BlkActionState {
     TransactionAction *action;
     const BlkActionOps *ops;
+    BlockJobTxn *block_job_txn;
     QSIMPLEQ_ENTRY(BlkActionState) entry;
 };
 
@@ -1883,12 +1884,15 @@ static const BlkActionOps actions[] = {
 void qmp_transaction(TransactionActionList *dev_list, Error **errp)
 {
     TransactionActionList *dev_entry = dev_list;
+    BlockJobTxn *block_job_txn;
     BlkActionState *state, *next;
     Error *local_err = NULL;
 
     QSIMPLEQ_HEAD(snap_bdrv_states, BlkActionState) snap_bdrv_states;
     QSIMPLEQ_INIT(&snap_bdrv_states);
 
+    block_job_txn = block_job_txn_new();
+
     /* drain all i/o before any operations */
     bdrv_drain_all();
 
@@ -1908,6 +1912,7 @@ void qmp_transaction(TransactionActionList *dev_list, Error **errp)
         state = g_malloc0(ops->instance_size);
         state->ops = ops;
         state->action = dev_info;
+        state->block_job_txn = block_job_txn;
         QSIMPLEQ_INSERT_TAIL(&snap_bdrv_states, state, entry);
 
         state->ops->prepare(state, &local_err);
@@ -1917,6 +1922,8 @@ void qmp_transaction(TransactionActionList *dev_list, Error **errp)
         }
     }
 
+    block_job_txn_begin(block_job_txn);
+
     QSIMPLEQ_FOREACH(state, &snap_bdrv_states, entry) {
         if (state->ops->commit) {
             state->ops->commit(state);
@@ -1927,6 +1934,8 @@ void qmp_transaction(TransactionActionList *dev_list, Error **errp)
     goto exit;
 
 delete_and_fail:
+    block_job_txn_begin(block_job_txn);
+
     /* failure, and it is all-or-none; roll back all operations */
     QSIMPLEQ_FOREACH(state, &snap_bdrv_states, entry) {
         if (state->ops->abort) {
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH 07/10] block/backup: support block job transactions
  2015-06-25 12:12 [Qemu-devel] [PATCH 00/10] block: incremental backup transactions using BlockJobTxn Stefan Hajnoczi
                   ` (5 preceding siblings ...)
  2015-06-25 12:12 ` [Qemu-devel] [PATCH 06/10] blockdev: make BlockJobTxn available to qmp 'transaction' Stefan Hajnoczi
@ 2015-06-25 12:12 ` Stefan Hajnoczi
  2015-06-26  6:44   ` Fam Zheng
  2015-06-29 22:39   ` John Snow
  2015-06-25 12:12 ` [Qemu-devel] [PATCH 08/10] iotests: 124 - transactional failure test Stefan Hajnoczi
                   ` (3 subsequent siblings)
  10 siblings, 2 replies; 25+ messages in thread
From: Stefan Hajnoczi @ 2015-06-25 12:12 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, famz, John Snow, Jeff Cody, mreitz, vsementsov,
	Stefan Hajnoczi

Join the transaction when the backup block job is in incremental backup
mode.

This ensures that the sync bitmap is not thrown away if another block
job in the transaction is cancelled or fails.  This is critical so
incremental backup with multiple disks can be retried in case of
cancellation/failure.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/backup.c |  7 +++++-
 blockdev.c     | 73 ++++++++++++++++++++++++++++++++++++++++++----------------
 2 files changed, 59 insertions(+), 21 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index ddf8424..fa86b0e 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -429,6 +429,8 @@ static void coroutine_fn backup_run(void *opaque)
     qemu_co_rwlock_wrlock(&job->flush_rwlock);
     qemu_co_rwlock_unlock(&job->flush_rwlock);
 
+    block_job_txn_prepare_to_complete(job->common.txn, &job->common, ret);
+
     if (job->sync_bitmap) {
         BdrvDirtyBitmap *bm;
         if (ret < 0 || block_job_is_cancelled(&job->common)) {
@@ -457,7 +459,7 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
                   BlockdevOnError on_source_error,
                   BlockdevOnError on_target_error,
                   BlockCompletionFunc *cb, void *opaque,
-                  Error **errp)
+                  BlockJobTxn *txn, Error **errp)
 {
     int64_t len;
 
@@ -537,6 +539,9 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
     job->sync_mode = sync_mode;
     job->sync_bitmap = sync_mode == MIRROR_SYNC_MODE_DIRTY_BITMAP ?
                        sync_bitmap : NULL;
+    if (job->sync_bitmap) {
+        block_job_txn_add_job(txn, &job->common);
+    }
     job->common.len = len;
     job->common.co = qemu_coroutine_create(backup_run);
     qemu_coroutine_enter(job->common.co, job);
diff --git a/blockdev.c b/blockdev.c
index 453f3ec..4f27c35 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1586,6 +1586,18 @@ typedef struct DriveBackupState {
     BlockJob *job;
 } DriveBackupState;
 
+static void do_drive_backup(const char *device, const char *target,
+                            bool has_format, const char *format,
+                            enum MirrorSyncMode sync,
+                            bool has_mode, enum NewImageMode mode,
+                            bool has_speed, int64_t speed,
+                            bool has_bitmap, const char *bitmap,
+                            bool has_on_source_error,
+                            BlockdevOnError on_source_error,
+                            bool has_on_target_error,
+                            BlockdevOnError on_target_error,
+                            BlockJobTxn *txn, Error **errp);
+
 static void drive_backup_prepare(BlkActionState *common, Error **errp)
 {
     DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
@@ -1609,15 +1621,16 @@ static void drive_backup_prepare(BlkActionState *common, Error **errp)
     state->aio_context = bdrv_get_aio_context(bs);
     aio_context_acquire(state->aio_context);
 
-    qmp_drive_backup(backup->device, backup->target,
-                     backup->has_format, backup->format,
-                     backup->sync,
-                     backup->has_mode, backup->mode,
-                     backup->has_speed, backup->speed,
-                     backup->has_bitmap, backup->bitmap,
-                     backup->has_on_source_error, backup->on_source_error,
-                     backup->has_on_target_error, backup->on_target_error,
-                     &local_err);
+    do_drive_backup(backup->device, backup->target,
+                    backup->has_format, backup->format,
+                    backup->sync,
+                    backup->has_mode, backup->mode,
+                    backup->has_speed, backup->speed,
+                    backup->has_bitmap, backup->bitmap,
+                    backup->has_on_source_error, backup->on_source_error,
+                    backup->has_on_target_error, backup->on_target_error,
+                    common->block_job_txn,
+                    &local_err);
     if (local_err) {
         error_propagate(errp, local_err);
         return;
@@ -2585,15 +2598,17 @@ out:
     aio_context_release(aio_context);
 }
 
-void qmp_drive_backup(const char *device, const char *target,
-                      bool has_format, const char *format,
-                      enum MirrorSyncMode sync,
-                      bool has_mode, enum NewImageMode mode,
-                      bool has_speed, int64_t speed,
-                      bool has_bitmap, const char *bitmap,
-                      bool has_on_source_error, BlockdevOnError on_source_error,
-                      bool has_on_target_error, BlockdevOnError on_target_error,
-                      Error **errp)
+static void do_drive_backup(const char *device, const char *target,
+                            bool has_format, const char *format,
+                            enum MirrorSyncMode sync,
+                            bool has_mode, enum NewImageMode mode,
+                            bool has_speed, int64_t speed,
+                            bool has_bitmap, const char *bitmap,
+                            bool has_on_source_error,
+                            BlockdevOnError on_source_error,
+                            bool has_on_target_error,
+                            BlockdevOnError on_target_error,
+                            BlockJobTxn *txn, Error **errp)
 {
     BlockBackend *blk;
     BlockDriverState *bs;
@@ -2710,7 +2725,7 @@ void qmp_drive_backup(const char *device, const char *target,
 
     backup_start(bs, target_bs, speed, sync, bmap,
                  on_source_error, on_target_error,
-                 block_job_cb, bs, &local_err);
+                 block_job_cb, bs, txn, &local_err);
     if (local_err != NULL) {
         bdrv_unref(target_bs);
         error_propagate(errp, local_err);
@@ -2721,6 +2736,24 @@ out:
     aio_context_release(aio_context);
 }
 
+void qmp_drive_backup(const char *device, const char *target,
+                      bool has_format, const char *format,
+                      enum MirrorSyncMode sync,
+                      bool has_mode, enum NewImageMode mode,
+                      bool has_speed, int64_t speed,
+                      bool has_bitmap, const char *bitmap,
+                      bool has_on_source_error, BlockdevOnError on_source_error,
+                      bool has_on_target_error, BlockdevOnError on_target_error,
+                      Error **errp)
+{
+    return do_drive_backup(device, target, has_format, format, sync,
+                           has_mode, mode, has_speed, speed,
+                           has_bitmap, bitmap,
+                           has_on_source_error, on_source_error,
+                           has_on_target_error, on_target_error,
+                           NULL, errp);
+}
+
 BlockDeviceInfoList *qmp_query_named_block_nodes(Error **errp)
 {
     return bdrv_named_nodes_list(errp);
@@ -2771,7 +2804,7 @@ void qmp_blockdev_backup(const char *device, const char *target,
     bdrv_ref(target_bs);
     bdrv_set_aio_context(target_bs, aio_context);
     backup_start(bs, target_bs, speed, sync, NULL, on_source_error,
-                 on_target_error, block_job_cb, bs, &local_err);
+                 on_target_error, block_job_cb, bs, NULL, &local_err);
     if (local_err != NULL) {
         bdrv_unref(target_bs);
         error_propagate(errp, local_err);
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH 08/10] iotests: 124 - transactional failure test
  2015-06-25 12:12 [Qemu-devel] [PATCH 00/10] block: incremental backup transactions using BlockJobTxn Stefan Hajnoczi
                   ` (6 preceding siblings ...)
  2015-06-25 12:12 ` [Qemu-devel] [PATCH 07/10] block/backup: support block job transactions Stefan Hajnoczi
@ 2015-06-25 12:12 ` Stefan Hajnoczi
  2015-06-25 12:12 ` [Qemu-devel] [PATCH 09/10] qmp-commands.hx: Update the supported 'transaction' operations Stefan Hajnoczi
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Stefan Hajnoczi @ 2015-06-25 12:12 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, famz, John Snow, Jeff Cody, mreitz, vsementsov,
	Stefan Hajnoczi

From: John Snow <jsnow@redhat.com>

Use a transaction to request an incremental backup across two drives.
Coerce one of the jobs to fail, and then re-run the transaction.

Verify that no bitmap data was lost due to the partial transaction
failure.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 tests/qemu-iotests/124     | 126 ++++++++++++++++++++++++++++++++++++++++++++-
 tests/qemu-iotests/124.out |   4 +-
 2 files changed, 127 insertions(+), 3 deletions(-)

diff --git a/tests/qemu-iotests/124 b/tests/qemu-iotests/124
index 530a392..a7c33f1 100644
--- a/tests/qemu-iotests/124
+++ b/tests/qemu-iotests/124
@@ -139,9 +139,12 @@ class TestIncrementalBackup(iotests.QMPTestCase):
     def do_qmp_backup(self, error='Input/output error', **kwargs):
         res = self.vm.qmp('drive-backup', **kwargs)
         self.assert_qmp(res, 'return', {})
+        return self.wait_qmp_backup(kwargs['device'], error)
 
+
+    def wait_qmp_backup(self, device, error='Input/output error'):
         event = self.vm.event_wait(name="BLOCK_JOB_COMPLETED",
-                                   match={'data': {'device': kwargs['device']}})
+                                   match={'data': {'device': device}})
         self.assertNotEqual(event, None)
 
         try:
@@ -156,6 +159,12 @@ class TestIncrementalBackup(iotests.QMPTestCase):
             return False
 
 
+    def wait_qmp_backup_cancelled(self, device):
+        event = self.vm.event_wait(name='BLOCK_JOB_CANCELLED',
+                                   match={'data': {'device': device}})
+        self.assertNotEqual(event, None)
+
+
     def create_anchor_backup(self, drive=None):
         if drive is None:
             drive = self.drives[-1]
@@ -375,6 +384,121 @@ class TestIncrementalBackup(iotests.QMPTestCase):
         self.check_backups()
 
 
+    def test_transaction_failure(self):
+        '''Test: Verify backups made from a transaction that partially fails.
+
+        Add a second drive with its own unique pattern, and add a bitmap to each
+        drive. Use blkdebug to interfere with the backup on just one drive and
+        attempt to create a coherent incremental backup across both drives.
+
+        verify a failure in one but not both, then delete the failed stubs and
+        re-run the same transaction.
+
+        verify that both incrementals are created successfully.
+        '''
+
+        # Create a second drive, with pattern:
+        drive1 = self.add_node('drive1')
+        self.img_create(drive1['file'], drive1['fmt'])
+        io_write_patterns(drive1['file'], (('0x14', 0, 512),
+                                           ('0x5d', '1M', '32k'),
+                                           ('0xcd', '32M', '124k')))
+
+        # Create a blkdebug interface to this img as 'drive1'
+        result = self.vm.qmp('blockdev-add', options={
+            'id': drive1['id'],
+            'driver': drive1['fmt'],
+            'file': {
+                'driver': 'blkdebug',
+                'image': {
+                    'driver': 'file',
+                    'filename': drive1['file']
+                },
+                'set-state': [{
+                    'event': 'flush_to_disk',
+                    'state': 1,
+                    'new_state': 2
+                }],
+                'inject-error': [{
+                    'event': 'read_aio',
+                    'errno': 5,
+                    'state': 2,
+                    'immediately': False,
+                    'once': True
+                }],
+            }
+        })
+        self.assert_qmp(result, 'return', {})
+
+        # Create bitmaps and full backups for both drives
+        drive0 = self.drives[0]
+        dr0bm0 = self.add_bitmap('bitmap0', drive0)
+        dr1bm0 = self.add_bitmap('bitmap0', drive1)
+        self.create_anchor_backup(drive0)
+        self.create_anchor_backup(drive1)
+        self.assert_no_active_block_jobs()
+        self.assertFalse(self.vm.get_qmp_events(wait=False))
+
+        # Emulate some writes
+        self.hmp_io_writes(drive0['id'], (('0xab', 0, 512),
+                                          ('0xfe', '16M', '256k'),
+                                          ('0x64', '32736k', '64k')))
+        self.hmp_io_writes(drive1['id'], (('0xba', 0, 512),
+                                          ('0xef', '16M', '256k'),
+                                          ('0x46', '32736k', '64k')))
+
+        # Create incremental backup targets
+        target0 = self.prepare_backup(dr0bm0)
+        target1 = self.prepare_backup(dr1bm0)
+
+        # Ask for a new incremental backup per-each drive,
+        # expecting drive1's backup to fail:
+        transaction = [
+            transaction_drive_backup(drive0['id'], target0, sync='dirty-bitmap',
+                                     format=drive0['fmt'], mode='existing',
+                                     bitmap=dr0bm0.name),
+            transaction_drive_backup(drive1['id'], target1, sync='dirty-bitmap',
+                                     format=drive1['fmt'], mode='existing',
+                                     bitmap=dr1bm0.name),
+        ]
+        result = self.vm.qmp('transaction', actions=transaction)
+        self.assert_qmp(result, 'return', {})
+
+        # Observe that drive0's backup is cancelled and drive1 completes with
+        # an error.
+        self.wait_qmp_backup_cancelled(drive0['id'])
+        self.assertFalse(self.wait_qmp_backup(drive1['id']))
+        error = self.vm.event_wait('BLOCK_JOB_ERROR')
+        self.assert_qmp(error, 'data', {'device': drive1['id'],
+                                        'action': 'report',
+                                        'operation': 'read'})
+        self.assertFalse(self.vm.get_qmp_events(wait=False))
+        self.assert_no_active_block_jobs()
+
+        # Delete drive0's successful target and eliminate our record of the
+        # unsuccessful drive1 target. Then re-run the same transaction.
+        dr0bm0.del_target()
+        dr1bm0.del_target()
+        target0 = self.prepare_backup(dr0bm0)
+        target1 = self.prepare_backup(dr1bm0)
+
+        # Re-run the exact same transaction.
+        result = self.vm.qmp('transaction', actions=transaction)
+        self.assert_qmp(result, 'return', {})
+
+        # Both should complete successfully this time.
+        self.assertTrue(self.wait_qmp_backup(drive0['id']))
+        self.assertTrue(self.wait_qmp_backup(drive1['id']))
+        self.make_reference_backup(dr0bm0)
+        self.make_reference_backup(dr1bm0)
+        self.assertFalse(self.vm.get_qmp_events(wait=False))
+        self.assert_no_active_block_jobs()
+
+        # And the images should of course validate.
+        self.vm.shutdown()
+        self.check_backups()
+
+
     def test_sync_dirty_bitmap_missing(self):
         self.assert_no_active_block_jobs()
         self.files.append(self.err_img)
diff --git a/tests/qemu-iotests/124.out b/tests/qemu-iotests/124.out
index 594c16f..dae404e 100644
--- a/tests/qemu-iotests/124.out
+++ b/tests/qemu-iotests/124.out
@@ -1,5 +1,5 @@
-........
+.........
 ----------------------------------------------------------------------
-Ran 8 tests
+Ran 9 tests
 
 OK
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH 09/10] qmp-commands.hx: Update the supported 'transaction' operations
  2015-06-25 12:12 [Qemu-devel] [PATCH 00/10] block: incremental backup transactions using BlockJobTxn Stefan Hajnoczi
                   ` (7 preceding siblings ...)
  2015-06-25 12:12 ` [Qemu-devel] [PATCH 08/10] iotests: 124 - transactional failure test Stefan Hajnoczi
@ 2015-06-25 12:12 ` Stefan Hajnoczi
  2015-06-25 12:12 ` [Qemu-devel] [PATCH 10/10] tests: add BlockJobTxn unit test Stefan Hajnoczi
  2015-06-29 15:08 ` [Qemu-devel] [PATCH 00/10] block: incremental backup transactions using BlockJobTxn Stefan Hajnoczi
  10 siblings, 0 replies; 25+ messages in thread
From: Stefan Hajnoczi @ 2015-06-25 12:12 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, famz, John Snow, Jeff Cody, mreitz, vsementsov,
	Stefan Hajnoczi

From: Kashyap Chamarthy <kchamart@redhat.com>

Although the canonical source of reference for QMP commands is
qapi-schema.json, for consistency's sake, update qmp-commands.hx to
state the list of supported transactionable operations, namely:

    drive-backup
    blockdev-backup
    blockdev-snapshot-internal-sync
    abort
    block-dirty-bitmap-add
    block-dirty-bitmap-clear

Signed-off-by: Kashyap Chamarthy <kchamart@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 qmp-commands.hx | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/qmp-commands.hx b/qmp-commands.hx
index 3ffa612..4fd217d 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -1238,11 +1238,22 @@ SQMP
 transaction
 -----------
 
-Atomically operate on one or more block devices.  The only supported operations
-for now are drive-backup, internal and external snapshotting.  A list of
-dictionaries is accepted, that contains the actions to be performed.
-If there is any failure performing any of the operations, all operations
-for the group are abandoned.
+Atomically operate on one or more block devices.  Operations that are
+currently supported:
+
+    - drive-backup
+    - blockdev-backup
+    - blockdev-snapshot-sync
+    - blockdev-snapshot-internal-sync
+    - abort
+    - block-dirty-bitmap-add
+    - block-dirty-bitmap-clear
+
+Refer to the qemu/qapi-schema.json file for minimum required QEMU
+versions for these operations.  A list of dictionaries is accepted,
+that contains the actions to be performed.  If there is any failure
+performing any of the operations, all operations for the group are
+abandoned.
 
 For external snapshots, the dictionary contains the device, the file to use for
 the new snapshot, and the format.  The default format, if not specified, is
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH 10/10] tests: add BlockJobTxn unit test
  2015-06-25 12:12 [Qemu-devel] [PATCH 00/10] block: incremental backup transactions using BlockJobTxn Stefan Hajnoczi
                   ` (8 preceding siblings ...)
  2015-06-25 12:12 ` [Qemu-devel] [PATCH 09/10] qmp-commands.hx: Update the supported 'transaction' operations Stefan Hajnoczi
@ 2015-06-25 12:12 ` Stefan Hajnoczi
  2015-06-26  6:58   ` Fam Zheng
  2015-06-29 15:08 ` [Qemu-devel] [PATCH 00/10] block: incremental backup transactions using BlockJobTxn Stefan Hajnoczi
  10 siblings, 1 reply; 25+ messages in thread
From: Stefan Hajnoczi @ 2015-06-25 12:12 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, famz, John Snow, Jeff Cody, mreitz, vsementsov,
	Stefan Hajnoczi

The BlockJobTxn unit test verifies that both single jobs and pairs of
jobs behave as a transaction group.  Either all jobs complete
successfully or the group is cancelled.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 tests/Makefile            |   3 +
 tests/test-blockjob-txn.c | 191 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 194 insertions(+)
 create mode 100644 tests/test-blockjob-txn.c

diff --git a/tests/Makefile b/tests/Makefile
index eff5e11..837f30b 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -45,6 +45,8 @@ check-unit-y += tests/test-thread-pool$(EXESUF)
 gcov-files-test-thread-pool-y = thread-pool.c
 gcov-files-test-hbitmap-y = util/hbitmap.c
 check-unit-y += tests/test-hbitmap$(EXESUF)
+gcov-files-test-hbitmap-y = blockjob.c
+check-unit-y += tests/test-blockjob-txn$(EXESUF)
 check-unit-y += tests/test-x86-cpuid$(EXESUF)
 # all code tested by test-x86-cpuid is inside topology.h
 gcov-files-test-x86-cpuid-y =
@@ -283,6 +285,7 @@ tests/test-coroutine$(EXESUF): tests/test-coroutine.o $(block-obj-y) libqemuutil
 tests/test-aio$(EXESUF): tests/test-aio.o $(block-obj-y) libqemuutil.a libqemustub.a
 tests/test-rfifolock$(EXESUF): tests/test-rfifolock.o libqemuutil.a libqemustub.a
 tests/test-throttle$(EXESUF): tests/test-throttle.o $(block-obj-y) libqemuutil.a libqemustub.a
+tests/test-blockjob-txn$(EXESUF): tests/test-blockjob-txn.o $(block-obj-y) libqemuutil.a libqemustub.a
 tests/test-thread-pool$(EXESUF): tests/test-thread-pool.o $(block-obj-y) libqemuutil.a libqemustub.a
 tests/test-iov$(EXESUF): tests/test-iov.o libqemuutil.a
 tests/test-hbitmap$(EXESUF): tests/test-hbitmap.o libqemuutil.a libqemustub.a
diff --git a/tests/test-blockjob-txn.c b/tests/test-blockjob-txn.c
new file mode 100644
index 0000000..26697da
--- /dev/null
+++ b/tests/test-blockjob-txn.c
@@ -0,0 +1,191 @@
+/*
+ * Blockjob transactions tests
+ *
+ * Copyright Red Hat, Inc. 2015
+ *
+ * Authors:
+ *  Stefan Hajnoczi    <stefanha@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ */
+
+#include <glib.h>
+#include "qapi/error.h"
+#include "qemu/main-loop.h"
+#include "block/blockjob.h"
+
+typedef struct {
+    BlockJob common;
+    unsigned int iterations;
+    int rc;
+} TestBlockJob;
+
+static const BlockJobDriver test_block_job_driver = {
+    .instance_size = sizeof(TestBlockJob),
+};
+
+static void test_block_job_complete(BlockJob *job, void *opaque)
+{
+    BlockDriverState *bs = job->bs;
+    int rc = (intptr_t)opaque;
+
+    if (block_job_is_cancelled(job)) {
+        rc = -ECANCELED;
+    }
+
+    block_job_completed(job, rc);
+    bdrv_unref(bs);
+}
+
+static void coroutine_fn test_block_job_run(void *opaque)
+{
+    TestBlockJob *s = opaque;
+    BlockJob *job = &s->common;
+
+    while (s->iterations--) {
+        block_job_sleep_ns(job, QEMU_CLOCK_REALTIME, 0);
+
+        if (block_job_is_cancelled(job)) {
+            break;
+        }
+    }
+
+    block_job_txn_prepare_to_complete(job->txn, job, s->rc);
+
+    block_job_defer_to_main_loop(job, test_block_job_complete,
+                                 (void *)(intptr_t)s->rc);
+}
+
+static void test_block_job_cb(void *opaque, int ret)
+{
+    *(int *)opaque = ret;
+}
+
+/* Create a block job that completes with a given return code after a given
+ * number of event loop iterations.  The return code is stored in the given
+ * result pointer.
+ */
+static BlockJob *test_block_job_start(unsigned int iterations, int rc,
+                                      int *result)
+{
+    BlockDriverState *bs;
+    TestBlockJob *s;
+
+    bs = bdrv_new();
+    s = block_job_create(&test_block_job_driver, bs, 0, test_block_job_cb,
+                         result, &error_abort);
+    s->iterations = iterations;
+    s->rc = rc;
+    s->common.co = qemu_coroutine_create(test_block_job_run);
+    qemu_coroutine_enter(s->common.co, s);
+    return &s->common;
+}
+
+static void test_single_job(int expected)
+{
+    BlockJob *job;
+    BlockJobTxn *txn;
+    int result = -EINPROGRESS;
+
+    txn = block_job_txn_new();
+    job = test_block_job_start(1, expected, &result);
+    block_job_txn_add_job(txn, job);
+    block_job_txn_begin(txn);
+
+    if (expected == -ECANCELED) {
+        block_job_cancel(job);
+    }
+
+    while (result == -EINPROGRESS) {
+        aio_poll(qemu_get_aio_context(), true);
+    }
+    g_assert_cmpint(result, ==, expected);
+}
+
+static void test_single_job_success(void)
+{
+    test_single_job(0);
+}
+
+static void test_single_job_failure(void)
+{
+    test_single_job(-EIO);
+}
+
+static void test_single_job_cancel(void)
+{
+    test_single_job(-ECANCELED);
+}
+
+static void test_pair_jobs(int expected1, int expected2)
+{
+    BlockJob *job1;
+    BlockJob *job2;
+    BlockJobTxn *txn;
+    int result1 = -EINPROGRESS;
+    int result2 = -EINPROGRESS;
+
+    txn = block_job_txn_new();
+    job1 = test_block_job_start(1, expected1, &result1);
+    block_job_txn_add_job(txn, job1);
+    job2 = test_block_job_start(2, expected2, &result2);
+    block_job_txn_add_job(txn, job2);
+    block_job_txn_begin(txn);
+
+    if (expected1 == -ECANCELED) {
+        block_job_cancel(job1);
+    }
+    if (expected2 == -ECANCELED) {
+        block_job_cancel(job2);
+    }
+
+    while (result1 == -EINPROGRESS || result2 == -EINPROGRESS) {
+        aio_poll(qemu_get_aio_context(), true);
+    }
+
+    /* Failure or cancellation of one job cancels the other job */
+    if (expected1 != 0) {
+        expected2 = -ECANCELED;
+    } else if (expected2 != 0) {
+        expected1 = -ECANCELED;
+    }
+
+    g_assert_cmpint(result1, ==, expected1);
+    g_assert_cmpint(result2, ==, expected2);
+}
+
+static void test_pair_jobs_success(void)
+{
+    test_pair_jobs(0, 0);
+}
+
+static void test_pair_jobs_failure(void)
+{
+    /* Test both orderings.  The two jobs run for a different number of
+     * iterations so the code path is different depending on which job fails
+     * first.
+     */
+    test_pair_jobs(-EIO, 0);
+    test_pair_jobs(0, -EIO);
+}
+
+static void test_pair_jobs_cancel(void)
+{
+    test_pair_jobs(-ECANCELED, 0);
+    test_pair_jobs(0, -ECANCELED);
+}
+
+int main(int argc, char **argv)
+{
+    qemu_init_main_loop(&error_abort);
+
+    g_test_init(&argc, &argv, NULL);
+    g_test_add_func("/single/success", test_single_job_success);
+    g_test_add_func("/single/failure", test_single_job_failure);
+    g_test_add_func("/single/cancel", test_single_job_cancel);
+    g_test_add_func("/pair/success", test_pair_jobs_success);
+    g_test_add_func("/pair/failure", test_pair_jobs_failure);
+    g_test_add_func("/pair/cancel", test_pair_jobs_cancel);
+    return g_test_run();
+}
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH 04/10] block: keep bitmap if incremental backup job is cancelled
  2015-06-25 12:12 ` [Qemu-devel] [PATCH 04/10] block: keep bitmap if incremental backup job is cancelled Stefan Hajnoczi
@ 2015-06-26  6:00   ` Fam Zheng
  2015-06-29 22:36   ` John Snow
  1 sibling, 0 replies; 25+ messages in thread
From: Fam Zheng @ 2015-06-26  6:00 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Jeff Cody, qemu-devel, mreitz, vsementsov, John Snow

On Thu, 06/25 13:12, Stefan Hajnoczi wrote:
> Reclaim the dirty bitmap if an incremental backup block job is
> cancelled.  The ret variable may be 0 when the job is cancelled so it's
> not enough to check ret < 0.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  block/backup.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/block/backup.c b/block/backup.c
> index 4a1af68..ddf8424 100644
> --- a/block/backup.c
> +++ b/block/backup.c
> @@ -431,7 +431,7 @@ static void coroutine_fn backup_run(void *opaque)
>  
>      if (job->sync_bitmap) {
>          BdrvDirtyBitmap *bm;
> -        if (ret < 0) {
> +        if (ret < 0 || block_job_is_cancelled(&job->common)) {
>              /* Merge the successor back into the parent, delete nothing. */
>              bm = bdrv_reclaim_dirty_bitmap(bs, job->sync_bitmap, NULL);
>              assert(bm);
> -- 
> 2.4.3
> 
> 

Reviewed-by: Fam Zheng <famz@redhat.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH 05/10] block: add block job transactions
  2015-06-25 12:12 ` [Qemu-devel] [PATCH 05/10] block: add block job transactions Stefan Hajnoczi
@ 2015-06-26  6:41   ` Fam Zheng
  2015-06-29 22:38   ` John Snow
  1 sibling, 0 replies; 25+ messages in thread
From: Fam Zheng @ 2015-06-26  6:41 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Jeff Cody, qemu-devel, mreitz, vsementsov, John Snow

On Thu, 06/25 13:12, Stefan Hajnoczi wrote:
> Sometimes block jobs must execute as a transaction group.  Finishing
> jobs wait until all other jobs are ready to complete successfully.
> Failure or cancellation of one job cancels the other jobs in the group.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

Reviewed-by: Fam Zheng <famz@redhat.com>

> ---
>  blockjob.c                | 160 ++++++++++++++++++++++++++++++++++++++++++++++
>  include/block/block.h     |   1 +
>  include/block/block_int.h |   3 +-
>  include/block/blockjob.h  |  49 ++++++++++++++
>  trace-events              |   4 ++
>  5 files changed, 216 insertions(+), 1 deletion(-)
> 
> diff --git a/blockjob.c b/blockjob.c
> index ec46fad..3c6f1d4 100644
> --- a/blockjob.c
> +++ b/blockjob.c
> @@ -400,3 +400,163 @@ void block_job_defer_to_main_loop(BlockJob *job,
>  
>      qemu_bh_schedule(data->bh);
>  }
> +
> +/* Transactional group of block jobs */
> +struct BlockJobTxn {
> +    /* Jobs may be in different AioContexts so protect all fields */
> +    QemuMutex lock;
> +
> +    /* Reference count for txn object */
> +    unsigned int ref;
> +
> +    /* Is this txn cancelling its jobs? */
> +    bool aborting;
> +
> +    /* Number of jobs still running */
> +    unsigned int jobs_pending;
> +
> +    /* List of jobs */
> +    QLIST_HEAD(, BlockJob) jobs;
> +};
> +
> +BlockJobTxn *block_job_txn_new(void)
> +{
> +    BlockJobTxn *txn = g_new(BlockJobTxn, 1);
> +    qemu_mutex_init(&txn->lock);
> +    txn->ref = 1; /* dropped by block_job_txn_begin() */
> +    txn->aborting = false;
> +    txn->jobs_pending = 0;
> +    QLIST_INIT(&txn->jobs);
> +    return txn;
> +}
> +
> +static void block_job_txn_unref(BlockJobTxn *txn)
> +{
> +    qemu_mutex_lock(&txn->lock);
> +
> +    if (--txn->ref > 0) {
> +        qemu_mutex_unlock(&txn->lock);
> +        return;
> +    }
> +
> +    qemu_mutex_unlock(&txn->lock);
> +    qemu_mutex_destroy(&txn->lock);
> +    g_free(txn);
> +}
> +
> +/* The purpose of this is to keep txn alive until all jobs have been added */
> +void block_job_txn_begin(BlockJobTxn *txn)
> +{
> +    block_job_txn_unref(txn);
> +}
> +
> +void block_job_txn_add_job(BlockJobTxn *txn, BlockJob *job)
> +{
> +    if (!txn) {
> +        return;
> +    }
> +
> +    assert(!job->txn);
> +    job->txn = txn;
> +
> +    qemu_mutex_lock(&txn->lock);
> +    txn->ref++;
> +    txn->jobs_pending++;
> +    QLIST_INSERT_HEAD(&txn->jobs, job, txn_list);
> +    qemu_mutex_unlock(&txn->lock);
> +}
> +
> +/* Cancel all other jobs in case of abort, wake all waiting jobs in case of
> + * successful completion.  Runs from main loop.
> + */
> +static void block_job_txn_complete(BlockJob *job, void *opaque)
> +{
> +    BlockJobTxn *txn = opaque;
> +    BlockJob *other_job;
> +    bool aborting = txn->aborting;
> +
> +    qemu_mutex_lock(&txn->lock);
> +    txn->ref++; /* keep txn alive until the end of this loop */
> +
> +    QLIST_FOREACH(other_job, &txn->jobs, txn_list) {
> +        AioContext *ctx;
> +
> +        qemu_mutex_unlock(&txn->lock);
> +        ctx = bdrv_get_aio_context(other_job->bs);
> +        aio_context_acquire(ctx);
> +
> +        /* Cancel all other jobs if aborting.  Don't cancel our own failed job
> +         * since cancellation throws away the error value.
> +         */
> +        if (aborting && other_job != job) {
> +            block_job_cancel(other_job);
> +        } else {
> +            block_job_enter(other_job);
> +        }
> +
> +        aio_context_release(ctx);
> +        qemu_mutex_lock(&txn->lock);
> +    }
> +
> +    qemu_mutex_unlock(&txn->lock);
> +    block_job_txn_unref(txn);
> +}
> +
> +void coroutine_fn block_job_txn_prepare_to_complete(BlockJobTxn *txn,
> +                                                    BlockJob *job,
> +                                                    int ret)
> +{
> +    if (!txn) {
> +        return;
> +    }
> +
> +    qemu_mutex_lock(&txn->lock);
> +
> +    /* This function is entered in 3 cases:
> +     *
> +     * 1. Successful job completion - wait for other jobs
> +     * 2. First failed/cancelled job in txn - cancel other jobs and wait
> +     * 3. Subsequent cancelled jobs - finish immediately, don't wait
> +     */
> +    trace_block_job_txn_prepare_to_complete_entry(txn, job, ret,
> +                                                  block_job_is_cancelled(job),
> +                                                  txn->aborting,
> +                                                  txn->jobs_pending);
> +
> +    if (txn->aborting) { /* Case 3 */
> +        assert(block_job_is_cancelled(job));
> +        goto out; /* already cancelled, don't yield */
> +    }
> +
> +    if (ret != 0 || block_job_is_cancelled(job)) { /* Case 2 */
> +abort:
> +        txn->aborting = true;
> +        block_job_defer_to_main_loop(job, block_job_txn_complete, txn);
> +    } else { /* Case 1 */
> +        if (--txn->jobs_pending == 0) {
> +            block_job_defer_to_main_loop(job, block_job_txn_complete, txn);
> +        }
> +    }
> +
> +    /* Wait for block_job_txn_complete() */
> +    do {
> +        qemu_mutex_unlock(&txn->lock);
> +        job->busy = false;
> +        qemu_coroutine_yield();
> +        job->busy = true;
> +        qemu_mutex_lock(&txn->lock);
> +
> +        if (block_job_is_cancelled(job) && !txn->aborting) {
> +            goto abort; /* this job just got cancelled by the user */
> +        }
> +    } while (!txn->aborting && txn->jobs_pending > 0);
> +
> +out:
> +    trace_block_job_txn_prepare_to_complete_return(txn, job, ret,
> +                                                   block_job_is_cancelled(job),
> +                                                   txn->aborting,
> +                                                   txn->jobs_pending);
> +
> +    qemu_mutex_unlock(&txn->lock);
> +    block_job_txn_unref(txn);
> +}
> diff --git a/include/block/block.h b/include/block/block.h
> index a4c505d..cb19c73 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h
> @@ -13,6 +13,7 @@
>  typedef struct BlockDriver BlockDriver;
>  typedef struct BlockJob BlockJob;
>  typedef struct BdrvChildRole BdrvChildRole;
> +typedef struct BlockJobTxn BlockJobTxn;
>  
>  typedef struct BlockDriverInfo {
>      /* in bytes, 0 if irrelevant */
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index ea3e7f0..812a18a 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -639,6 +639,7 @@ void mirror_start(BlockDriverState *bs, BlockDriverState *target,
>   * @on_source_error: The action to take upon error reading from the source.
>   * @on_target_error: The action to take upon error writing to the target.
>   * @cb: Completion function for the job.
> + * @txn: Transaction that this job is part of (may be NULL).
>   * @opaque: Opaque pointer value passed to @cb.
>   *
>   * Start a backup operation on @bs.  Clusters in @bs are written to @target
> @@ -650,7 +651,7 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
>                    BlockdevOnError on_source_error,
>                    BlockdevOnError on_target_error,
>                    BlockCompletionFunc *cb, void *opaque,
> -                  Error **errp);
> +                  BlockJobTxn *txn, Error **errp);
>  
>  void blk_dev_change_media_cb(BlockBackend *blk, bool load);
>  bool blk_dev_has_removable_media(BlockBackend *blk);
> diff --git a/include/block/blockjob.h b/include/block/blockjob.h
> index 57d8ef1..ce57e98 100644
> --- a/include/block/blockjob.h
> +++ b/include/block/blockjob.h
> @@ -122,6 +122,10 @@ struct BlockJob {
>  
>      /** The opaque value that is passed to the completion function.  */
>      void *opaque;
> +
> +    /** Non-NULL if this job is part of a transaction */
> +    BlockJobTxn *txn;
> +    QLIST_ENTRY(BlockJob) txn_list;
>  };
>  
>  /**
> @@ -348,4 +352,49 @@ void block_job_defer_to_main_loop(BlockJob *job,
>                                    BlockJobDeferToMainLoopFn *fn,
>                                    void *opaque);
>  
> +/**
> + * block_job_txn_new:
> + *
> + * Allocate and return a new block job transaction.  Jobs can be added to the
> + * transaction using block_job_txn_add_job().  block_job_txn_begin() must be
> + * called when all jobs (if any) have been added.
> + *
> + * All jobs in the transaction either complete successfully or fail/cancel as a
> + * group.  Jobs wait for each other before completing.  Cancelling one job
> + * cancels all jobs in the transaction.
> + */
> +BlockJobTxn *block_job_txn_new(void);
> +
> +/**
> + * block_job_txn_add_job:
> + * @txn: The transaction
> + * @job: Job to add to the transaction
> + *
> + * Add @job to the transaction.  The @job must not already be in a transaction.
> + * The block job driver must call block_job_txn_prepare_to_complete() before
> + * final cleanup and completion.
> + */
> +void block_job_txn_add_job(BlockJobTxn *txn, BlockJob *job);
> +
> +/**
> + * block_job_txn_begin:
> + * @txn: The transaction
> + *
> + * Call this to mark the end of adding jobs to the transaction.  This must be
> + * called even if no jobs were added.
> + */
> +void block_job_txn_begin(BlockJobTxn *txn);
> +
> +/**
> + * block_job_txn_prepare_to_complete:
> + * @txn: The transaction
> + * @job: The block job
> + * @ret: Block job return value (0 for success, otherwise job failure)
> + *
> + * Wait for other jobs in the transaction to complete.  If @ret is non-zero or
> + * @job is cancelled, all other jobs in the transaction will be cancelled.
> + */
> +void coroutine_fn block_job_txn_prepare_to_complete(BlockJobTxn *txn,
> +                                                    BlockJob *job, int ret);
> +
>  #endif
> diff --git a/trace-events b/trace-events
> index 52b7efa..b6a43a0 100644
> --- a/trace-events
> +++ b/trace-events
> @@ -123,6 +123,10 @@ virtio_blk_data_plane_start(void *s) "dataplane %p"
>  virtio_blk_data_plane_stop(void *s) "dataplane %p"
>  virtio_blk_data_plane_process_request(void *s, unsigned int out_num, unsigned int in_num, unsigned int head) "dataplane %p out_num %u in_num %u head %u"
>  
> +# blockjob.c
> +block_job_txn_prepare_to_complete_entry(void *txn, void *job, int ret, bool cancelled, bool aborting, unsigned int jobs_pending) "txn %p job %p ret %d cancelled %d aborting %d jobs_pending %u"
> +block_job_txn_prepare_to_complete_return(void *txn, void *job, int ret, bool cancelled, bool aborting, unsigned int jobs_pending) "txn %p job %p ret %d cancelled %d aborting %d jobs_pending %u"
> +
>  # hw/virtio/dataplane/vring.c
>  vring_setup(uint64_t physical, void *desc, void *avail, void *used) "vring physical %#"PRIx64" desc %p avail %p used %p"
>  
> -- 
> 2.4.3
> 
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH 06/10] blockdev: make BlockJobTxn available to qmp 'transaction'
  2015-06-25 12:12 ` [Qemu-devel] [PATCH 06/10] blockdev: make BlockJobTxn available to qmp 'transaction' Stefan Hajnoczi
@ 2015-06-26  6:42   ` Fam Zheng
  2015-06-29 22:38   ` John Snow
  1 sibling, 0 replies; 25+ messages in thread
From: Fam Zheng @ 2015-06-26  6:42 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Jeff Cody, qemu-devel, mreitz, vsementsov, John Snow

On Thu, 06/25 13:12, Stefan Hajnoczi wrote:
> Provide a BlockJobTxn to actions executed in a qmp 'transaction'
> command.  This allows actions to make their block jobs either complete
> as a group or fail/cancel together.
> 
> The next patch adds the first user.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

Reviewed-by: Fam Zheng <famz@redhat.com>

> ---
>  blockdev.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/blockdev.c b/blockdev.c
> index c572950..453f3ec 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -1279,6 +1279,7 @@ typedef struct BlkActionOps {
>  struct BlkActionState {
>      TransactionAction *action;
>      const BlkActionOps *ops;
> +    BlockJobTxn *block_job_txn;
>      QSIMPLEQ_ENTRY(BlkActionState) entry;
>  };
>  
> @@ -1883,12 +1884,15 @@ static const BlkActionOps actions[] = {
>  void qmp_transaction(TransactionActionList *dev_list, Error **errp)
>  {
>      TransactionActionList *dev_entry = dev_list;
> +    BlockJobTxn *block_job_txn;
>      BlkActionState *state, *next;
>      Error *local_err = NULL;
>  
>      QSIMPLEQ_HEAD(snap_bdrv_states, BlkActionState) snap_bdrv_states;
>      QSIMPLEQ_INIT(&snap_bdrv_states);
>  
> +    block_job_txn = block_job_txn_new();
> +
>      /* drain all i/o before any operations */
>      bdrv_drain_all();
>  
> @@ -1908,6 +1912,7 @@ void qmp_transaction(TransactionActionList *dev_list, Error **errp)
>          state = g_malloc0(ops->instance_size);
>          state->ops = ops;
>          state->action = dev_info;
> +        state->block_job_txn = block_job_txn;
>          QSIMPLEQ_INSERT_TAIL(&snap_bdrv_states, state, entry);
>  
>          state->ops->prepare(state, &local_err);
> @@ -1917,6 +1922,8 @@ void qmp_transaction(TransactionActionList *dev_list, Error **errp)
>          }
>      }
>  
> +    block_job_txn_begin(block_job_txn);
> +
>      QSIMPLEQ_FOREACH(state, &snap_bdrv_states, entry) {
>          if (state->ops->commit) {
>              state->ops->commit(state);
> @@ -1927,6 +1934,8 @@ void qmp_transaction(TransactionActionList *dev_list, Error **errp)
>      goto exit;
>  
>  delete_and_fail:
> +    block_job_txn_begin(block_job_txn);
> +
>      /* failure, and it is all-or-none; roll back all operations */
>      QSIMPLEQ_FOREACH(state, &snap_bdrv_states, entry) {
>          if (state->ops->abort) {
> -- 
> 2.4.3
> 
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH 07/10] block/backup: support block job transactions
  2015-06-25 12:12 ` [Qemu-devel] [PATCH 07/10] block/backup: support block job transactions Stefan Hajnoczi
@ 2015-06-26  6:44   ` Fam Zheng
  2015-06-29 22:39   ` John Snow
  1 sibling, 0 replies; 25+ messages in thread
From: Fam Zheng @ 2015-06-26  6:44 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Jeff Cody, qemu-devel, mreitz, vsementsov, John Snow

On Thu, 06/25 13:12, Stefan Hajnoczi wrote:
> Join the transaction when the backup block job is in incremental backup
> mode.
> 
> This ensures that the sync bitmap is not thrown away if another block
> job in the transaction is cancelled or fails.  This is critical so
> incremental backup with multiple disks can be retried in case of
> cancellation/failure.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

Reviewed-by: Fam Zheng <famz@redhat.com>

> ---
>  block/backup.c |  7 +++++-
>  blockdev.c     | 73 ++++++++++++++++++++++++++++++++++++++++++----------------
>  2 files changed, 59 insertions(+), 21 deletions(-)
> 
> diff --git a/block/backup.c b/block/backup.c
> index ddf8424..fa86b0e 100644
> --- a/block/backup.c
> +++ b/block/backup.c
> @@ -429,6 +429,8 @@ static void coroutine_fn backup_run(void *opaque)
>      qemu_co_rwlock_wrlock(&job->flush_rwlock);
>      qemu_co_rwlock_unlock(&job->flush_rwlock);
>  
> +    block_job_txn_prepare_to_complete(job->common.txn, &job->common, ret);
> +
>      if (job->sync_bitmap) {
>          BdrvDirtyBitmap *bm;
>          if (ret < 0 || block_job_is_cancelled(&job->common)) {
> @@ -457,7 +459,7 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
>                    BlockdevOnError on_source_error,
>                    BlockdevOnError on_target_error,
>                    BlockCompletionFunc *cb, void *opaque,
> -                  Error **errp)
> +                  BlockJobTxn *txn, Error **errp)
>  {
>      int64_t len;
>  
> @@ -537,6 +539,9 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
>      job->sync_mode = sync_mode;
>      job->sync_bitmap = sync_mode == MIRROR_SYNC_MODE_DIRTY_BITMAP ?
>                         sync_bitmap : NULL;
> +    if (job->sync_bitmap) {
> +        block_job_txn_add_job(txn, &job->common);
> +    }
>      job->common.len = len;
>      job->common.co = qemu_coroutine_create(backup_run);
>      qemu_coroutine_enter(job->common.co, job);
> diff --git a/blockdev.c b/blockdev.c
> index 453f3ec..4f27c35 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -1586,6 +1586,18 @@ typedef struct DriveBackupState {
>      BlockJob *job;
>  } DriveBackupState;
>  
> +static void do_drive_backup(const char *device, const char *target,
> +                            bool has_format, const char *format,
> +                            enum MirrorSyncMode sync,
> +                            bool has_mode, enum NewImageMode mode,
> +                            bool has_speed, int64_t speed,
> +                            bool has_bitmap, const char *bitmap,
> +                            bool has_on_source_error,
> +                            BlockdevOnError on_source_error,
> +                            bool has_on_target_error,
> +                            BlockdevOnError on_target_error,
> +                            BlockJobTxn *txn, Error **errp);
> +
>  static void drive_backup_prepare(BlkActionState *common, Error **errp)
>  {
>      DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
> @@ -1609,15 +1621,16 @@ static void drive_backup_prepare(BlkActionState *common, Error **errp)
>      state->aio_context = bdrv_get_aio_context(bs);
>      aio_context_acquire(state->aio_context);
>  
> -    qmp_drive_backup(backup->device, backup->target,
> -                     backup->has_format, backup->format,
> -                     backup->sync,
> -                     backup->has_mode, backup->mode,
> -                     backup->has_speed, backup->speed,
> -                     backup->has_bitmap, backup->bitmap,
> -                     backup->has_on_source_error, backup->on_source_error,
> -                     backup->has_on_target_error, backup->on_target_error,
> -                     &local_err);
> +    do_drive_backup(backup->device, backup->target,
> +                    backup->has_format, backup->format,
> +                    backup->sync,
> +                    backup->has_mode, backup->mode,
> +                    backup->has_speed, backup->speed,
> +                    backup->has_bitmap, backup->bitmap,
> +                    backup->has_on_source_error, backup->on_source_error,
> +                    backup->has_on_target_error, backup->on_target_error,
> +                    common->block_job_txn,
> +                    &local_err);
>      if (local_err) {
>          error_propagate(errp, local_err);
>          return;
> @@ -2585,15 +2598,17 @@ out:
>      aio_context_release(aio_context);
>  }
>  
> -void qmp_drive_backup(const char *device, const char *target,
> -                      bool has_format, const char *format,
> -                      enum MirrorSyncMode sync,
> -                      bool has_mode, enum NewImageMode mode,
> -                      bool has_speed, int64_t speed,
> -                      bool has_bitmap, const char *bitmap,
> -                      bool has_on_source_error, BlockdevOnError on_source_error,
> -                      bool has_on_target_error, BlockdevOnError on_target_error,
> -                      Error **errp)
> +static void do_drive_backup(const char *device, const char *target,
> +                            bool has_format, const char *format,
> +                            enum MirrorSyncMode sync,
> +                            bool has_mode, enum NewImageMode mode,
> +                            bool has_speed, int64_t speed,
> +                            bool has_bitmap, const char *bitmap,
> +                            bool has_on_source_error,
> +                            BlockdevOnError on_source_error,
> +                            bool has_on_target_error,
> +                            BlockdevOnError on_target_error,
> +                            BlockJobTxn *txn, Error **errp)
>  {
>      BlockBackend *blk;
>      BlockDriverState *bs;
> @@ -2710,7 +2725,7 @@ void qmp_drive_backup(const char *device, const char *target,
>  
>      backup_start(bs, target_bs, speed, sync, bmap,
>                   on_source_error, on_target_error,
> -                 block_job_cb, bs, &local_err);
> +                 block_job_cb, bs, txn, &local_err);
>      if (local_err != NULL) {
>          bdrv_unref(target_bs);
>          error_propagate(errp, local_err);
> @@ -2721,6 +2736,24 @@ out:
>      aio_context_release(aio_context);
>  }
>  
> +void qmp_drive_backup(const char *device, const char *target,
> +                      bool has_format, const char *format,
> +                      enum MirrorSyncMode sync,
> +                      bool has_mode, enum NewImageMode mode,
> +                      bool has_speed, int64_t speed,
> +                      bool has_bitmap, const char *bitmap,
> +                      bool has_on_source_error, BlockdevOnError on_source_error,
> +                      bool has_on_target_error, BlockdevOnError on_target_error,
> +                      Error **errp)
> +{
> +    return do_drive_backup(device, target, has_format, format, sync,
> +                           has_mode, mode, has_speed, speed,
> +                           has_bitmap, bitmap,
> +                           has_on_source_error, on_source_error,
> +                           has_on_target_error, on_target_error,
> +                           NULL, errp);
> +}
> +
>  BlockDeviceInfoList *qmp_query_named_block_nodes(Error **errp)
>  {
>      return bdrv_named_nodes_list(errp);
> @@ -2771,7 +2804,7 @@ void qmp_blockdev_backup(const char *device, const char *target,
>      bdrv_ref(target_bs);
>      bdrv_set_aio_context(target_bs, aio_context);
>      backup_start(bs, target_bs, speed, sync, NULL, on_source_error,
> -                 on_target_error, block_job_cb, bs, &local_err);
> +                 on_target_error, block_job_cb, bs, NULL, &local_err);
>      if (local_err != NULL) {
>          bdrv_unref(target_bs);
>          error_propagate(errp, local_err);
> -- 
> 2.4.3
> 
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH 10/10] tests: add BlockJobTxn unit test
  2015-06-25 12:12 ` [Qemu-devel] [PATCH 10/10] tests: add BlockJobTxn unit test Stefan Hajnoczi
@ 2015-06-26  6:58   ` Fam Zheng
  0 siblings, 0 replies; 25+ messages in thread
From: Fam Zheng @ 2015-06-26  6:58 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Jeff Cody, qemu-devel, mreitz, vsementsov, John Snow

On Thu, 06/25 13:12, Stefan Hajnoczi wrote:
> The BlockJobTxn unit test verifies that both single jobs and pairs of
> jobs behave as a transaction group.  Either all jobs complete
> successfully or the group is cancelled.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

Reviewed-by: Fam Zheng <famz@redhat.com>

> ---
>  tests/Makefile            |   3 +
>  tests/test-blockjob-txn.c | 191 ++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 194 insertions(+)
>  create mode 100644 tests/test-blockjob-txn.c
> 
> diff --git a/tests/Makefile b/tests/Makefile
> index eff5e11..837f30b 100644
> --- a/tests/Makefile
> +++ b/tests/Makefile
> @@ -45,6 +45,8 @@ check-unit-y += tests/test-thread-pool$(EXESUF)
>  gcov-files-test-thread-pool-y = thread-pool.c
>  gcov-files-test-hbitmap-y = util/hbitmap.c
>  check-unit-y += tests/test-hbitmap$(EXESUF)
> +gcov-files-test-hbitmap-y = blockjob.c
> +check-unit-y += tests/test-blockjob-txn$(EXESUF)
>  check-unit-y += tests/test-x86-cpuid$(EXESUF)
>  # all code tested by test-x86-cpuid is inside topology.h
>  gcov-files-test-x86-cpuid-y =
> @@ -283,6 +285,7 @@ tests/test-coroutine$(EXESUF): tests/test-coroutine.o $(block-obj-y) libqemuutil
>  tests/test-aio$(EXESUF): tests/test-aio.o $(block-obj-y) libqemuutil.a libqemustub.a
>  tests/test-rfifolock$(EXESUF): tests/test-rfifolock.o libqemuutil.a libqemustub.a
>  tests/test-throttle$(EXESUF): tests/test-throttle.o $(block-obj-y) libqemuutil.a libqemustub.a
> +tests/test-blockjob-txn$(EXESUF): tests/test-blockjob-txn.o $(block-obj-y) libqemuutil.a libqemustub.a
>  tests/test-thread-pool$(EXESUF): tests/test-thread-pool.o $(block-obj-y) libqemuutil.a libqemustub.a
>  tests/test-iov$(EXESUF): tests/test-iov.o libqemuutil.a
>  tests/test-hbitmap$(EXESUF): tests/test-hbitmap.o libqemuutil.a libqemustub.a
> diff --git a/tests/test-blockjob-txn.c b/tests/test-blockjob-txn.c
> new file mode 100644
> index 0000000..26697da
> --- /dev/null
> +++ b/tests/test-blockjob-txn.c
> @@ -0,0 +1,191 @@
> +/*
> + * Blockjob transactions tests
> + *
> + * Copyright Red Hat, Inc. 2015
> + *
> + * Authors:
> + *  Stefan Hajnoczi    <stefanha@redhat.com>
> + *
> + * This work is licensed under the terms of the GNU LGPL, version 2 or later.
> + * See the COPYING.LIB file in the top-level directory.
> + */
> +
> +#include <glib.h>
> +#include "qapi/error.h"
> +#include "qemu/main-loop.h"
> +#include "block/blockjob.h"
> +
> +typedef struct {
> +    BlockJob common;
> +    unsigned int iterations;
> +    int rc;
> +} TestBlockJob;
> +
> +static const BlockJobDriver test_block_job_driver = {
> +    .instance_size = sizeof(TestBlockJob),
> +};
> +
> +static void test_block_job_complete(BlockJob *job, void *opaque)
> +{
> +    BlockDriverState *bs = job->bs;
> +    int rc = (intptr_t)opaque;
> +
> +    if (block_job_is_cancelled(job)) {
> +        rc = -ECANCELED;
> +    }
> +
> +    block_job_completed(job, rc);
> +    bdrv_unref(bs);
> +}
> +
> +static void coroutine_fn test_block_job_run(void *opaque)
> +{
> +    TestBlockJob *s = opaque;
> +    BlockJob *job = &s->common;
> +
> +    while (s->iterations--) {
> +        block_job_sleep_ns(job, QEMU_CLOCK_REALTIME, 0);
> +
> +        if (block_job_is_cancelled(job)) {
> +            break;
> +        }
> +    }
> +
> +    block_job_txn_prepare_to_complete(job->txn, job, s->rc);
> +
> +    block_job_defer_to_main_loop(job, test_block_job_complete,
> +                                 (void *)(intptr_t)s->rc);
> +}
> +
> +static void test_block_job_cb(void *opaque, int ret)
> +{
> +    *(int *)opaque = ret;
> +}
> +
> +/* Create a block job that completes with a given return code after a given
> + * number of event loop iterations.  The return code is stored in the given
> + * result pointer.
> + */
> +static BlockJob *test_block_job_start(unsigned int iterations, int rc,
> +                                      int *result)
> +{
> +    BlockDriverState *bs;
> +    TestBlockJob *s;
> +
> +    bs = bdrv_new();
> +    s = block_job_create(&test_block_job_driver, bs, 0, test_block_job_cb,
> +                         result, &error_abort);
> +    s->iterations = iterations;
> +    s->rc = rc;
> +    s->common.co = qemu_coroutine_create(test_block_job_run);
> +    qemu_coroutine_enter(s->common.co, s);
> +    return &s->common;
> +}
> +
> +static void test_single_job(int expected)
> +{
> +    BlockJob *job;
> +    BlockJobTxn *txn;
> +    int result = -EINPROGRESS;
> +
> +    txn = block_job_txn_new();
> +    job = test_block_job_start(1, expected, &result);
> +    block_job_txn_add_job(txn, job);
> +    block_job_txn_begin(txn);
> +
> +    if (expected == -ECANCELED) {
> +        block_job_cancel(job);
> +    }
> +
> +    while (result == -EINPROGRESS) {
> +        aio_poll(qemu_get_aio_context(), true);
> +    }
> +    g_assert_cmpint(result, ==, expected);
> +}
> +
> +static void test_single_job_success(void)
> +{
> +    test_single_job(0);
> +}
> +
> +static void test_single_job_failure(void)
> +{
> +    test_single_job(-EIO);
> +}
> +
> +static void test_single_job_cancel(void)
> +{
> +    test_single_job(-ECANCELED);
> +}
> +
> +static void test_pair_jobs(int expected1, int expected2)
> +{
> +    BlockJob *job1;
> +    BlockJob *job2;
> +    BlockJobTxn *txn;
> +    int result1 = -EINPROGRESS;
> +    int result2 = -EINPROGRESS;
> +
> +    txn = block_job_txn_new();
> +    job1 = test_block_job_start(1, expected1, &result1);
> +    block_job_txn_add_job(txn, job1);
> +    job2 = test_block_job_start(2, expected2, &result2);
> +    block_job_txn_add_job(txn, job2);
> +    block_job_txn_begin(txn);
> +
> +    if (expected1 == -ECANCELED) {
> +        block_job_cancel(job1);
> +    }
> +    if (expected2 == -ECANCELED) {
> +        block_job_cancel(job2);
> +    }
> +
> +    while (result1 == -EINPROGRESS || result2 == -EINPROGRESS) {
> +        aio_poll(qemu_get_aio_context(), true);
> +    }
> +
> +    /* Failure or cancellation of one job cancels the other job */
> +    if (expected1 != 0) {
> +        expected2 = -ECANCELED;
> +    } else if (expected2 != 0) {
> +        expected1 = -ECANCELED;
> +    }
> +
> +    g_assert_cmpint(result1, ==, expected1);
> +    g_assert_cmpint(result2, ==, expected2);
> +}
> +
> +static void test_pair_jobs_success(void)
> +{
> +    test_pair_jobs(0, 0);
> +}
> +
> +static void test_pair_jobs_failure(void)
> +{
> +    /* Test both orderings.  The two jobs run for a different number of
> +     * iterations so the code path is different depending on which job fails
> +     * first.
> +     */
> +    test_pair_jobs(-EIO, 0);
> +    test_pair_jobs(0, -EIO);
> +}
> +
> +static void test_pair_jobs_cancel(void)
> +{
> +    test_pair_jobs(-ECANCELED, 0);
> +    test_pair_jobs(0, -ECANCELED);
> +}
> +
> +int main(int argc, char **argv)
> +{
> +    qemu_init_main_loop(&error_abort);
> +
> +    g_test_init(&argc, &argv, NULL);
> +    g_test_add_func("/single/success", test_single_job_success);
> +    g_test_add_func("/single/failure", test_single_job_failure);
> +    g_test_add_func("/single/cancel", test_single_job_cancel);
> +    g_test_add_func("/pair/success", test_pair_jobs_success);
> +    g_test_add_func("/pair/failure", test_pair_jobs_failure);
> +    g_test_add_func("/pair/cancel", test_pair_jobs_cancel);
> +    return g_test_run();
> +}
> -- 
> 2.4.3
> 
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH 00/10] block: incremental backup transactions using BlockJobTxn
  2015-06-25 12:12 [Qemu-devel] [PATCH 00/10] block: incremental backup transactions using BlockJobTxn Stefan Hajnoczi
                   ` (9 preceding siblings ...)
  2015-06-25 12:12 ` [Qemu-devel] [PATCH 10/10] tests: add BlockJobTxn unit test Stefan Hajnoczi
@ 2015-06-29 15:08 ` Stefan Hajnoczi
  10 siblings, 0 replies; 25+ messages in thread
From: Stefan Hajnoczi @ 2015-06-29 15:08 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, famz, Jeff Cody, qemu-devel, mreitz, vsementsov, John Snow

[-- Attachment #1: Type: text/plain, Size: 3838 bytes --]

On Thu, Jun 25, 2015 at 01:12:02PM +0100, Stefan Hajnoczi wrote:
> This series is based on my block branch
> (https://github.com/stefanha/qemu/commits/block).
> 
> It uses patches from John Snow's "[PATCH v6 00/10] block: incremental backup
> transactions" series but implements the feature with a new transaction
> mechanism for blockjobs called BlockJobTxn.
> 
> Recap: motivation for block job transactions
> --------------------------------------------
> If an incremental backup block job fails then we reclaim the bitmap so
> the job can be retried.  The problem comes when multiple jobs are started as
> part of a qmp 'transaction' command.  We need to group these jobs in a
> transaction so that either all jobs complete successfully or all bitmaps are
> reclaimed.
> 
> Without transactions, there is a case where some jobs complete successfully and
> throw away their bitmaps, making it impossible to retry the backup by rerunning
> the command if one of the jobs fails.
> 
> How does this implementation work?
> ----------------------------------
> These patches add a BlockJobTxn object with the following API:
> 
>   txn = block_job_txn_new();
>   block_job_txn_add_job(txn, job1);
>   block_job_txn_add_job(txn, job2);
>   block_job_txn_begin();
> 
> The jobs either both complete successfully or they both fail/cancel.  If the
> user cancels job1 then job2 will also be cancelled and vice versa.
> 
> Jobs stay alive waiting for other jobs to complete.  They can be cancelled by
> the user during this time.  Job blockers are still in effect and no other block
> job can run on this device in the meantime (since QEMU currently only allows 1
> job per device).  This is the main drawback to this approach but reasonable
> since you probably don't want to run other jobs/operations until you're sure
> the backup was successful (you won't be able to retry a failed backup if
> there's a new job running).
> 
> Adding transaction support to the backup job is very easy.  It just needs to
> make a call before throwing away the bitmap and returning from its coroutine:
> 
>   block_job_txn_prepare_to_complete(job->txn, job, ret);
> 
>   if (job->sync_bitmap) {
>       BdrvDirtyBitmap *bm;
>       if (ret < 0 || block_job_is_cancelled(&job->common)) {
>           ...
> 
> John Snow (4):
>   qapi: Add transaction support to block-dirty-bitmap operations
>   iotests: add transactional incremental backup test
>   block: rename BlkTransactionState and BdrvActionOps
>   iotests: 124 - transactional failure test
> 
> Kashyap Chamarthy (1):
>   qmp-commands.hx: Update the supported 'transaction' operations
> 
> Stefan Hajnoczi (5):
>   block: keep bitmap if incremental backup job is cancelled
>   block: add block job transactions
>   blockdev: make BlockJobTxn available to qmp 'transaction'
>   block/backup: support block job transactions
>   tests: add BlockJobTxn unit test
> 
>  block.c                    |  19 ++-
>  block/backup.c             |   9 +-
>  blockdev.c                 | 298 +++++++++++++++++++++++++++++++++++----------
>  blockjob.c                 | 160 ++++++++++++++++++++++++
>  docs/bitmaps.md            |   6 +-
>  include/block/block.h      |   2 +-
>  include/block/block_int.h  |   6 +-
>  include/block/blockjob.h   |  49 ++++++++
>  qapi-schema.json           |   6 +-
>  qmp-commands.hx            |  21 +++-
>  tests/Makefile             |   3 +
>  tests/qemu-iotests/124     | 180 ++++++++++++++++++++++++++-
>  tests/qemu-iotests/124.out |   4 +-
>  tests/test-blockjob-txn.c  | 191 +++++++++++++++++++++++++++++
>  trace-events               |   4 +
>  15 files changed, 873 insertions(+), 85 deletions(-)
>  create mode 100644 tests/test-blockjob-txn.c

John: Do you want to review this series?

[-- Attachment #2: Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH 04/10] block: keep bitmap if incremental backup job is cancelled
  2015-06-25 12:12 ` [Qemu-devel] [PATCH 04/10] block: keep bitmap if incremental backup job is cancelled Stefan Hajnoczi
  2015-06-26  6:00   ` Fam Zheng
@ 2015-06-29 22:36   ` John Snow
  1 sibling, 0 replies; 25+ messages in thread
From: John Snow @ 2015-06-29 22:36 UTC (permalink / raw)
  To: Stefan Hajnoczi, qemu-devel
  Cc: Kevin Wolf, Jeff Cody, vsementsov, famz, mreitz



On 06/25/2015 08:12 AM, Stefan Hajnoczi wrote:
> Reclaim the dirty bitmap if an incremental backup block job is
> cancelled.  The ret variable may be 0 when the job is cancelled so it's
> not enough to check ret < 0.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  block/backup.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/block/backup.c b/block/backup.c
> index 4a1af68..ddf8424 100644
> --- a/block/backup.c
> +++ b/block/backup.c
> @@ -431,7 +431,7 @@ static void coroutine_fn backup_run(void *opaque)
>  
>      if (job->sync_bitmap) {
>          BdrvDirtyBitmap *bm;
> -        if (ret < 0) {
> +        if (ret < 0 || block_job_is_cancelled(&job->common)) {
>              /* Merge the successor back into the parent, delete nothing. */
>              bm = bdrv_reclaim_dirty_bitmap(bs, job->sync_bitmap, NULL);
>              assert(bm);
> 

Duplicated here from the single patch sent out to Jeff Cody's tree, but
you're probably already aware...

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH 05/10] block: add block job transactions
  2015-06-25 12:12 ` [Qemu-devel] [PATCH 05/10] block: add block job transactions Stefan Hajnoczi
  2015-06-26  6:41   ` Fam Zheng
@ 2015-06-29 22:38   ` John Snow
  2015-06-30 16:20     ` Stefan Hajnoczi
  1 sibling, 1 reply; 25+ messages in thread
From: John Snow @ 2015-06-29 22:38 UTC (permalink / raw)
  To: Stefan Hajnoczi, qemu-devel
  Cc: Kevin Wolf, Jeff Cody, vsementsov, famz, mreitz



On 06/25/2015 08:12 AM, Stefan Hajnoczi wrote:
> Sometimes block jobs must execute as a transaction group.  Finishing
> jobs wait until all other jobs are ready to complete successfully.
> Failure or cancellation of one job cancels the other jobs in the group.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  blockjob.c                | 160 ++++++++++++++++++++++++++++++++++++++++++++++
>  include/block/block.h     |   1 +
>  include/block/block_int.h |   3 +-
>  include/block/blockjob.h  |  49 ++++++++++++++
>  trace-events              |   4 ++
>  5 files changed, 216 insertions(+), 1 deletion(-)
> 
> diff --git a/blockjob.c b/blockjob.c
> index ec46fad..3c6f1d4 100644
> --- a/blockjob.c
> +++ b/blockjob.c
> @@ -400,3 +400,163 @@ void block_job_defer_to_main_loop(BlockJob *job,
>  
>      qemu_bh_schedule(data->bh);
>  }
> +
> +/* Transactional group of block jobs */
> +struct BlockJobTxn {
> +    /* Jobs may be in different AioContexts so protect all fields */
> +    QemuMutex lock;
> +
> +    /* Reference count for txn object */
> +    unsigned int ref;
> +
> +    /* Is this txn cancelling its jobs? */
> +    bool aborting;
> +
> +    /* Number of jobs still running */
> +    unsigned int jobs_pending;
> +
> +    /* List of jobs */
> +    QLIST_HEAD(, BlockJob) jobs;
> +};
> +
> +BlockJobTxn *block_job_txn_new(void)
> +{
> +    BlockJobTxn *txn = g_new(BlockJobTxn, 1);
> +    qemu_mutex_init(&txn->lock);
> +    txn->ref = 1; /* dropped by block_job_txn_begin() */
> +    txn->aborting = false;
> +    txn->jobs_pending = 0;
> +    QLIST_INIT(&txn->jobs);
> +    return txn;
> +}
> +
> +static void block_job_txn_unref(BlockJobTxn *txn)
> +{
> +    qemu_mutex_lock(&txn->lock);
> +
> +    if (--txn->ref > 0) {
> +        qemu_mutex_unlock(&txn->lock);
> +        return;
> +    }
> +
> +    qemu_mutex_unlock(&txn->lock);
> +    qemu_mutex_destroy(&txn->lock);
> +    g_free(txn);
> +}
> +
> +/* The purpose of this is to keep txn alive until all jobs have been added */
> +void block_job_txn_begin(BlockJobTxn *txn)
> +{
> +    block_job_txn_unref(txn);
> +}
> +

Maybe it's not entirely clear to the caller that "begin" may in fact
actually delete the BlockJobTxn. Shall we update the caller's pointer to
NULL in this case as a hint? Passing a **txn will imply that we are
giving up our ownership of the object.

> +void block_job_txn_add_job(BlockJobTxn *txn, BlockJob *job)
> +{
> +    if (!txn) {
> +        return;
> +    }
> +
> +    assert(!job->txn);
> +    job->txn = txn;
> +
> +    qemu_mutex_lock(&txn->lock);
> +    txn->ref++;
> +    txn->jobs_pending++;
> +    QLIST_INSERT_HEAD(&txn->jobs, job, txn_list);
> +    qemu_mutex_unlock(&txn->lock);
> +}
> +
> +/* Cancel all other jobs in case of abort, wake all waiting jobs in case of
> + * successful completion.  Runs from main loop.
> + */
> +static void block_job_txn_complete(BlockJob *job, void *opaque)
> +{
> +    BlockJobTxn *txn = opaque;
> +    BlockJob *other_job;
> +    bool aborting = txn->aborting;
> +
> +    qemu_mutex_lock(&txn->lock);
> +    txn->ref++; /* keep txn alive until the end of this loop */
> +
> +    QLIST_FOREACH(other_job, &txn->jobs, txn_list) {
> +        AioContext *ctx;
> +
> +        qemu_mutex_unlock(&txn->lock);
> +        ctx = bdrv_get_aio_context(other_job->bs);
> +        aio_context_acquire(ctx);
> +
> +        /* Cancel all other jobs if aborting.  Don't cancel our own failed job
> +         * since cancellation throws away the error value.
> +         */
> +        if (aborting && other_job != job) {
> +            block_job_cancel(other_job);
> +        } else {
> +            block_job_enter(other_job);
> +        }
> +
> +        aio_context_release(ctx);
> +        qemu_mutex_lock(&txn->lock);
> +    }
> +
> +    qemu_mutex_unlock(&txn->lock);
> +    block_job_txn_unref(txn);
> +}
> +

Maybe we can name this one along the lines of
block_job_txn_complete_impl or something clearly internal, so that we
can name the public interface simply "block_job_txn_complete."

Maybe I'm just bike shedding, but calling only a "prepare to X" function
without a matching "X" call in the exposed API seems odd.

I suppose it doesn't matter, because I can't think of anything nicer :)

> +void coroutine_fn block_job_txn_prepare_to_complete(BlockJobTxn *txn,
> +                                                    BlockJob *job,
> +                                                    int ret)
> +{
> +    if (!txn) {
> +        return;
> +    }
> +
> +    qemu_mutex_lock(&txn->lock);
> +
> +    /* This function is entered in 3 cases:
> +     *
> +     * 1. Successful job completion - wait for other jobs
> +     * 2. First failed/cancelled job in txn - cancel other jobs and wait
> +     * 3. Subsequent cancelled jobs - finish immediately, don't wait
> +     */
> +    trace_block_job_txn_prepare_to_complete_entry(txn, job, ret,
> +                                                  block_job_is_cancelled(job),
> +                                                  txn->aborting,
> +                                                  txn->jobs_pending);
> +
> +    if (txn->aborting) { /* Case 3 */
> +        assert(block_job_is_cancelled(job));
> +        goto out; /* already cancelled, don't yield */
> +    }
> +

So the first failure forces all jobs not-yet-complete to cancel. Is
there any chance for a race condition of two jobs completing almost
simultaneously, where the first fails and the second completes, and the
2nd job makes it here before it gets canceled?

BOD: I really assume the answer is "no," but it's not immediately
evident to me.

> +    if (ret != 0 || block_job_is_cancelled(job)) { /* Case 2 */
> +abort:
> +        txn->aborting = true;
> +        block_job_defer_to_main_loop(job, block_job_txn_complete, txn);
> +    } else { /* Case 1 */
> +        if (--txn->jobs_pending == 0) {
> +            block_job_defer_to_main_loop(job, block_job_txn_complete, txn);
> +        }
> +    }
> +
> +    /* Wait for block_job_txn_complete() */
> +    do {
> +        qemu_mutex_unlock(&txn->lock);
> +        job->busy = false;
> +        qemu_coroutine_yield();
> +        job->busy = true;
> +        qemu_mutex_lock(&txn->lock);
> +
> +        if (block_job_is_cancelled(job) && !txn->aborting) {
> +            goto abort; /* this job just got cancelled by the user */
> +        }
> +    } while (!txn->aborting && txn->jobs_pending > 0);
> +
> +out:
> +    trace_block_job_txn_prepare_to_complete_return(txn, job, ret,
> +                                                   block_job_is_cancelled(job),
> +                                                   txn->aborting,
> +                                                   txn->jobs_pending);
> +
> +    qemu_mutex_unlock(&txn->lock);
> +    block_job_txn_unref(txn);
> +}
> diff --git a/include/block/block.h b/include/block/block.h
> index a4c505d..cb19c73 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h
> @@ -13,6 +13,7 @@
>  typedef struct BlockDriver BlockDriver;
>  typedef struct BlockJob BlockJob;
>  typedef struct BdrvChildRole BdrvChildRole;
> +typedef struct BlockJobTxn BlockJobTxn;
>  
>  typedef struct BlockDriverInfo {
>      /* in bytes, 0 if irrelevant */
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index ea3e7f0..812a18a 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -639,6 +639,7 @@ void mirror_start(BlockDriverState *bs, BlockDriverState *target,
>   * @on_source_error: The action to take upon error reading from the source.
>   * @on_target_error: The action to take upon error writing to the target.
>   * @cb: Completion function for the job.
> + * @txn: Transaction that this job is part of (may be NULL).
>   * @opaque: Opaque pointer value passed to @cb.
>   *
>   * Start a backup operation on @bs.  Clusters in @bs are written to @target
> @@ -650,7 +651,7 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
>                    BlockdevOnError on_source_error,
>                    BlockdevOnError on_target_error,
>                    BlockCompletionFunc *cb, void *opaque,
> -                  Error **errp);
> +                  BlockJobTxn *txn, Error **errp);
>  
>  void blk_dev_change_media_cb(BlockBackend *blk, bool load);
>  bool blk_dev_has_removable_media(BlockBackend *blk);
> diff --git a/include/block/blockjob.h b/include/block/blockjob.h
> index 57d8ef1..ce57e98 100644
> --- a/include/block/blockjob.h
> +++ b/include/block/blockjob.h
> @@ -122,6 +122,10 @@ struct BlockJob {
>  
>      /** The opaque value that is passed to the completion function.  */
>      void *opaque;
> +
> +    /** Non-NULL if this job is part of a transaction */
> +    BlockJobTxn *txn;
> +    QLIST_ENTRY(BlockJob) txn_list;
>  };
>  
>  /**
> @@ -348,4 +352,49 @@ void block_job_defer_to_main_loop(BlockJob *job,
>                                    BlockJobDeferToMainLoopFn *fn,
>                                    void *opaque);
>  
> +/**
> + * block_job_txn_new:
> + *
> + * Allocate and return a new block job transaction.  Jobs can be added to the
> + * transaction using block_job_txn_add_job().  block_job_txn_begin() must be
> + * called when all jobs (if any) have been added.
> + *
> + * All jobs in the transaction either complete successfully or fail/cancel as a
> + * group.  Jobs wait for each other before completing.  Cancelling one job
> + * cancels all jobs in the transaction.
> + */
> +BlockJobTxn *block_job_txn_new(void);
> +
> +/**
> + * block_job_txn_add_job:
> + * @txn: The transaction
> + * @job: Job to add to the transaction
> + *
> + * Add @job to the transaction.  The @job must not already be in a transaction.
> + * The block job driver must call block_job_txn_prepare_to_complete() before
> + * final cleanup and completion.
> + */
> +void block_job_txn_add_job(BlockJobTxn *txn, BlockJob *job);
> +
> +/**
> + * block_job_txn_begin:
> + * @txn: The transaction
> + *
> + * Call this to mark the end of adding jobs to the transaction.  This must be
> + * called even if no jobs were added.
> + */
> +void block_job_txn_begin(BlockJobTxn *txn);
> +
> +/**
> + * block_job_txn_prepare_to_complete:
> + * @txn: The transaction
> + * @job: The block job
> + * @ret: Block job return value (0 for success, otherwise job failure)
> + *
> + * Wait for other jobs in the transaction to complete.  If @ret is non-zero or
> + * @job is cancelled, all other jobs in the transaction will be cancelled.
> + */
> +void coroutine_fn block_job_txn_prepare_to_complete(BlockJobTxn *txn,
> +                                                    BlockJob *job, int ret);
> +
>  #endif
> diff --git a/trace-events b/trace-events
> index 52b7efa..b6a43a0 100644
> --- a/trace-events
> +++ b/trace-events
> @@ -123,6 +123,10 @@ virtio_blk_data_plane_start(void *s) "dataplane %p"
>  virtio_blk_data_plane_stop(void *s) "dataplane %p"
>  virtio_blk_data_plane_process_request(void *s, unsigned int out_num, unsigned int in_num, unsigned int head) "dataplane %p out_num %u in_num %u head %u"
>  
> +# blockjob.c
> +block_job_txn_prepare_to_complete_entry(void *txn, void *job, int ret, bool cancelled, bool aborting, unsigned int jobs_pending) "txn %p job %p ret %d cancelled %d aborting %d jobs_pending %u"
> +block_job_txn_prepare_to_complete_return(void *txn, void *job, int ret, bool cancelled, bool aborting, unsigned int jobs_pending) "txn %p job %p ret %d cancelled %d aborting %d jobs_pending %u"
> +
>  # hw/virtio/dataplane/vring.c
>  vring_setup(uint64_t physical, void *desc, void *avail, void *used) "vring physical %#"PRIx64" desc %p avail %p used %p"
>  
> 

Bike-shedding comments and a benefit-of-the-doubt aside;

Reviewed-by: John Snow <jsnow@redhat.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH 06/10] blockdev: make BlockJobTxn available to qmp 'transaction'
  2015-06-25 12:12 ` [Qemu-devel] [PATCH 06/10] blockdev: make BlockJobTxn available to qmp 'transaction' Stefan Hajnoczi
  2015-06-26  6:42   ` Fam Zheng
@ 2015-06-29 22:38   ` John Snow
  1 sibling, 0 replies; 25+ messages in thread
From: John Snow @ 2015-06-29 22:38 UTC (permalink / raw)
  To: Stefan Hajnoczi, qemu-devel
  Cc: Kevin Wolf, Jeff Cody, vsementsov, famz, mreitz



On 06/25/2015 08:12 AM, Stefan Hajnoczi wrote:
> Provide a BlockJobTxn to actions executed in a qmp 'transaction'
> command.  This allows actions to make their block jobs either complete
> as a group or fail/cancel together.
> 
> The next patch adds the first user.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  blockdev.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/blockdev.c b/blockdev.c
> index c572950..453f3ec 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -1279,6 +1279,7 @@ typedef struct BlkActionOps {
>  struct BlkActionState {
>      TransactionAction *action;
>      const BlkActionOps *ops;
> +    BlockJobTxn *block_job_txn;
>      QSIMPLEQ_ENTRY(BlkActionState) entry;
>  };
>  
> @@ -1883,12 +1884,15 @@ static const BlkActionOps actions[] = {
>  void qmp_transaction(TransactionActionList *dev_list, Error **errp)
>  {
>      TransactionActionList *dev_entry = dev_list;
> +    BlockJobTxn *block_job_txn;
>      BlkActionState *state, *next;
>      Error *local_err = NULL;
>  
>      QSIMPLEQ_HEAD(snap_bdrv_states, BlkActionState) snap_bdrv_states;
>      QSIMPLEQ_INIT(&snap_bdrv_states);
>  
> +    block_job_txn = block_job_txn_new();
> +
>      /* drain all i/o before any operations */
>      bdrv_drain_all();
>  
> @@ -1908,6 +1912,7 @@ void qmp_transaction(TransactionActionList *dev_list, Error **errp)
>          state = g_malloc0(ops->instance_size);
>          state->ops = ops;
>          state->action = dev_info;
> +        state->block_job_txn = block_job_txn;
>          QSIMPLEQ_INSERT_TAIL(&snap_bdrv_states, state, entry);
>  
>          state->ops->prepare(state, &local_err);
> @@ -1917,6 +1922,8 @@ void qmp_transaction(TransactionActionList *dev_list, Error **errp)
>          }
>      }
>  
> +    block_job_txn_begin(block_job_txn);
> +
>      QSIMPLEQ_FOREACH(state, &snap_bdrv_states, entry) {
>          if (state->ops->commit) {
>              state->ops->commit(state);
> @@ -1927,6 +1934,8 @@ void qmp_transaction(TransactionActionList *dev_list, Error **errp)
>      goto exit;
>  
>  delete_and_fail:
> +    block_job_txn_begin(block_job_txn);
> +
>      /* failure, and it is all-or-none; roll back all operations */
>      QSIMPLEQ_FOREACH(state, &snap_bdrv_states, entry) {
>          if (state->ops->abort) {
> 

Reviewed-by: John Snow <jsnow@redhat.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH 07/10] block/backup: support block job transactions
  2015-06-25 12:12 ` [Qemu-devel] [PATCH 07/10] block/backup: support block job transactions Stefan Hajnoczi
  2015-06-26  6:44   ` Fam Zheng
@ 2015-06-29 22:39   ` John Snow
  2015-06-30 15:27     ` Stefan Hajnoczi
  1 sibling, 1 reply; 25+ messages in thread
From: John Snow @ 2015-06-29 22:39 UTC (permalink / raw)
  To: Stefan Hajnoczi, qemu-devel
  Cc: Kevin Wolf, Jeff Cody, vsementsov, famz, mreitz



On 06/25/2015 08:12 AM, Stefan Hajnoczi wrote:
> Join the transaction when the backup block job is in incremental backup
> mode.
> 
> This ensures that the sync bitmap is not thrown away if another block
> job in the transaction is cancelled or fails.  This is critical so
> incremental backup with multiple disks can be retried in case of
> cancellation/failure.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  block/backup.c |  7 +++++-
>  blockdev.c     | 73 ++++++++++++++++++++++++++++++++++++++++++----------------
>  2 files changed, 59 insertions(+), 21 deletions(-)
> 
> diff --git a/block/backup.c b/block/backup.c
> index ddf8424..fa86b0e 100644
> --- a/block/backup.c
> +++ b/block/backup.c
> @@ -429,6 +429,8 @@ static void coroutine_fn backup_run(void *opaque)
>      qemu_co_rwlock_wrlock(&job->flush_rwlock);
>      qemu_co_rwlock_unlock(&job->flush_rwlock);
>  
> +    block_job_txn_prepare_to_complete(job->common.txn, &job->common, ret);
> +
>      if (job->sync_bitmap) {
>          BdrvDirtyBitmap *bm;
>          if (ret < 0 || block_job_is_cancelled(&job->common)) {
> @@ -457,7 +459,7 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
>                    BlockdevOnError on_source_error,
>                    BlockdevOnError on_target_error,
>                    BlockCompletionFunc *cb, void *opaque,
> -                  Error **errp)
> +                  BlockJobTxn *txn, Error **errp)
>  {
>      int64_t len;
>  
> @@ -537,6 +539,9 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
>      job->sync_mode = sync_mode;
>      job->sync_bitmap = sync_mode == MIRROR_SYNC_MODE_DIRTY_BITMAP ?
>                         sync_bitmap : NULL;
> +    if (job->sync_bitmap) {
> +        block_job_txn_add_job(txn, &job->common);
> +    }

Hmm, is this what we want? This will add backup jobs to a transaction
only if they have a bitmap attached to the job.

However, if we're doing a mixture of full and incremental backups, we
may still want to roll back the incremental backups if the full backups
failed as part of the transaction.

The (admittedly more complicated) design I submitted will always add a
job to the transactional group, whether it has a bitmap or not. The
membership test was only if it was launched by the backup transaction
action. The bitmap is only checked for purposes of refcounting and
cleanup mechanics.

Maybe that wasn't what we wanted either, but this is a difference in how
our series will behave.

>      job->common.len = len;
>      job->common.co = qemu_coroutine_create(backup_run);
>      qemu_coroutine_enter(job->common.co, job);
> diff --git a/blockdev.c b/blockdev.c
> index 453f3ec..4f27c35 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -1586,6 +1586,18 @@ typedef struct DriveBackupState {
>      BlockJob *job;
>  } DriveBackupState;
>  
> +static void do_drive_backup(const char *device, const char *target,
> +                            bool has_format, const char *format,
> +                            enum MirrorSyncMode sync,
> +                            bool has_mode, enum NewImageMode mode,
> +                            bool has_speed, int64_t speed,
> +                            bool has_bitmap, const char *bitmap,
> +                            bool has_on_source_error,
> +                            BlockdevOnError on_source_error,
> +                            bool has_on_target_error,
> +                            BlockdevOnError on_target_error,
> +                            BlockJobTxn *txn, Error **errp);
> +
>  static void drive_backup_prepare(BlkActionState *common, Error **errp)
>  {
>      DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
> @@ -1609,15 +1621,16 @@ static void drive_backup_prepare(BlkActionState *common, Error **errp)
>      state->aio_context = bdrv_get_aio_context(bs);
>      aio_context_acquire(state->aio_context);
>  
> -    qmp_drive_backup(backup->device, backup->target,
> -                     backup->has_format, backup->format,
> -                     backup->sync,
> -                     backup->has_mode, backup->mode,
> -                     backup->has_speed, backup->speed,
> -                     backup->has_bitmap, backup->bitmap,
> -                     backup->has_on_source_error, backup->on_source_error,
> -                     backup->has_on_target_error, backup->on_target_error,
> -                     &local_err);
> +    do_drive_backup(backup->device, backup->target,
> +                    backup->has_format, backup->format,
> +                    backup->sync,
> +                    backup->has_mode, backup->mode,
> +                    backup->has_speed, backup->speed,
> +                    backup->has_bitmap, backup->bitmap,
> +                    backup->has_on_source_error, backup->on_source_error,
> +                    backup->has_on_target_error, backup->on_target_error,
> +                    common->block_job_txn,
> +                    &local_err);
>      if (local_err) {
>          error_propagate(errp, local_err);
>          return;
> @@ -2585,15 +2598,17 @@ out:
>      aio_context_release(aio_context);
>  }
>  
> -void qmp_drive_backup(const char *device, const char *target,
> -                      bool has_format, const char *format,
> -                      enum MirrorSyncMode sync,
> -                      bool has_mode, enum NewImageMode mode,
> -                      bool has_speed, int64_t speed,
> -                      bool has_bitmap, const char *bitmap,
> -                      bool has_on_source_error, BlockdevOnError on_source_error,
> -                      bool has_on_target_error, BlockdevOnError on_target_error,
> -                      Error **errp)
> +static void do_drive_backup(const char *device, const char *target,
> +                            bool has_format, const char *format,
> +                            enum MirrorSyncMode sync,
> +                            bool has_mode, enum NewImageMode mode,
> +                            bool has_speed, int64_t speed,
> +                            bool has_bitmap, const char *bitmap,
> +                            bool has_on_source_error,
> +                            BlockdevOnError on_source_error,
> +                            bool has_on_target_error,
> +                            BlockdevOnError on_target_error,
> +                            BlockJobTxn *txn, Error **errp)
>  {
>      BlockBackend *blk;
>      BlockDriverState *bs;
> @@ -2710,7 +2725,7 @@ void qmp_drive_backup(const char *device, const char *target,
>  
>      backup_start(bs, target_bs, speed, sync, bmap,
>                   on_source_error, on_target_error,
> -                 block_job_cb, bs, &local_err);
> +                 block_job_cb, bs, txn, &local_err);
>      if (local_err != NULL) {
>          bdrv_unref(target_bs);
>          error_propagate(errp, local_err);
> @@ -2721,6 +2736,24 @@ out:
>      aio_context_release(aio_context);
>  }
>  
> +void qmp_drive_backup(const char *device, const char *target,
> +                      bool has_format, const char *format,
> +                      enum MirrorSyncMode sync,
> +                      bool has_mode, enum NewImageMode mode,
> +                      bool has_speed, int64_t speed,
> +                      bool has_bitmap, const char *bitmap,
> +                      bool has_on_source_error, BlockdevOnError on_source_error,
> +                      bool has_on_target_error, BlockdevOnError on_target_error,
> +                      Error **errp)
> +{
> +    return do_drive_backup(device, target, has_format, format, sync,
> +                           has_mode, mode, has_speed, speed,
> +                           has_bitmap, bitmap,
> +                           has_on_source_error, on_source_error,
> +                           has_on_target_error, on_target_error,
> +                           NULL, errp);
> +}
> +
>  BlockDeviceInfoList *qmp_query_named_block_nodes(Error **errp)
>  {
>      return bdrv_named_nodes_list(errp);
> @@ -2771,7 +2804,7 @@ void qmp_blockdev_backup(const char *device, const char *target,
>      bdrv_ref(target_bs);
>      bdrv_set_aio_context(target_bs, aio_context);
>      backup_start(bs, target_bs, speed, sync, NULL, on_source_error,
> -                 on_target_error, block_job_cb, bs, &local_err);
> +                 on_target_error, block_job_cb, bs, NULL, &local_err);
>      if (local_err != NULL) {
>          bdrv_unref(target_bs);
>          error_propagate(errp, local_err);
> 

Looks solid otherwise.

--js

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH 07/10] block/backup: support block job transactions
  2015-06-29 22:39   ` John Snow
@ 2015-06-30 15:27     ` Stefan Hajnoczi
  2015-06-30 15:52       ` John Snow
  0 siblings, 1 reply; 25+ messages in thread
From: Stefan Hajnoczi @ 2015-06-30 15:27 UTC (permalink / raw)
  To: John Snow; +Cc: Kevin Wolf, famz, Jeff Cody, qemu-devel, mreitz, vsementsov

[-- Attachment #1: Type: text/plain, Size: 3088 bytes --]

On Mon, Jun 29, 2015 at 06:39:08PM -0400, John Snow wrote:
> 
> 
> On 06/25/2015 08:12 AM, Stefan Hajnoczi wrote:
> > Join the transaction when the backup block job is in incremental backup
> > mode.
> > 
> > This ensures that the sync bitmap is not thrown away if another block
> > job in the transaction is cancelled or fails.  This is critical so
> > incremental backup with multiple disks can be retried in case of
> > cancellation/failure.
> > 
> > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > ---
> >  block/backup.c |  7 +++++-
> >  blockdev.c     | 73 ++++++++++++++++++++++++++++++++++++++++++----------------
> >  2 files changed, 59 insertions(+), 21 deletions(-)
> > 
> > diff --git a/block/backup.c b/block/backup.c
> > index ddf8424..fa86b0e 100644
> > --- a/block/backup.c
> > +++ b/block/backup.c
> > @@ -429,6 +429,8 @@ static void coroutine_fn backup_run(void *opaque)
> >      qemu_co_rwlock_wrlock(&job->flush_rwlock);
> >      qemu_co_rwlock_unlock(&job->flush_rwlock);
> >  
> > +    block_job_txn_prepare_to_complete(job->common.txn, &job->common, ret);
> > +
> >      if (job->sync_bitmap) {
> >          BdrvDirtyBitmap *bm;
> >          if (ret < 0 || block_job_is_cancelled(&job->common)) {
> > @@ -457,7 +459,7 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
> >                    BlockdevOnError on_source_error,
> >                    BlockdevOnError on_target_error,
> >                    BlockCompletionFunc *cb, void *opaque,
> > -                  Error **errp)
> > +                  BlockJobTxn *txn, Error **errp)
> >  {
> >      int64_t len;
> >  
> > @@ -537,6 +539,9 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
> >      job->sync_mode = sync_mode;
> >      job->sync_bitmap = sync_mode == MIRROR_SYNC_MODE_DIRTY_BITMAP ?
> >                         sync_bitmap : NULL;
> > +    if (job->sync_bitmap) {
> > +        block_job_txn_add_job(txn, &job->common);
> > +    }
> 
> Hmm, is this what we want? This will add backup jobs to a transaction
> only if they have a bitmap attached to the job.
> 
> However, if we're doing a mixture of full and incremental backups, we
> may still want to roll back the incremental backups if the full backups
> failed as part of the transaction.
> 
> The (admittedly more complicated) design I submitted will always add a
> job to the transactional group, whether it has a bitmap or not. The
> membership test was only if it was launched by the backup transaction
> action. The bitmap is only checked for purposes of refcounting and
> cleanup mechanics.
> 
> Maybe that wasn't what we wanted either, but this is a difference in how
> our series will behave.

The 'backup' operation was added to the QMP 'transaction' command in
QEMU 1.6.  If we add non-incremental backup commands to the transaction
then behavior changes.

Perhaps DriveBackup and BlockdevBackup QAPI structures should take an
optional 'transaction' bool argument.  That way the caller decides which
behavior to use.

[-- Attachment #2: Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH 07/10] block/backup: support block job transactions
  2015-06-30 15:27     ` Stefan Hajnoczi
@ 2015-06-30 15:52       ` John Snow
  2015-07-01  8:45         ` Stefan Hajnoczi
  0 siblings, 1 reply; 25+ messages in thread
From: John Snow @ 2015-06-30 15:52 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, famz, Jeff Cody, qemu-devel, mreitz, vsementsov



On 06/30/2015 11:27 AM, Stefan Hajnoczi wrote:
> On Mon, Jun 29, 2015 at 06:39:08PM -0400, John Snow wrote:
>>
>>
>> On 06/25/2015 08:12 AM, Stefan Hajnoczi wrote:
>>> Join the transaction when the backup block job is in incremental backup
>>> mode.
>>>
>>> This ensures that the sync bitmap is not thrown away if another block
>>> job in the transaction is cancelled or fails.  This is critical so
>>> incremental backup with multiple disks can be retried in case of
>>> cancellation/failure.
>>>
>>> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
>>> ---
>>>  block/backup.c |  7 +++++-
>>>  blockdev.c     | 73 ++++++++++++++++++++++++++++++++++++++++++----------------
>>>  2 files changed, 59 insertions(+), 21 deletions(-)
>>>
>>> diff --git a/block/backup.c b/block/backup.c
>>> index ddf8424..fa86b0e 100644
>>> --- a/block/backup.c
>>> +++ b/block/backup.c
>>> @@ -429,6 +429,8 @@ static void coroutine_fn backup_run(void *opaque)
>>>      qemu_co_rwlock_wrlock(&job->flush_rwlock);
>>>      qemu_co_rwlock_unlock(&job->flush_rwlock);
>>>  
>>> +    block_job_txn_prepare_to_complete(job->common.txn, &job->common, ret);
>>> +
>>>      if (job->sync_bitmap) {
>>>          BdrvDirtyBitmap *bm;
>>>          if (ret < 0 || block_job_is_cancelled(&job->common)) {
>>> @@ -457,7 +459,7 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
>>>                    BlockdevOnError on_source_error,
>>>                    BlockdevOnError on_target_error,
>>>                    BlockCompletionFunc *cb, void *opaque,
>>> -                  Error **errp)
>>> +                  BlockJobTxn *txn, Error **errp)
>>>  {
>>>      int64_t len;
>>>  
>>> @@ -537,6 +539,9 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
>>>      job->sync_mode = sync_mode;
>>>      job->sync_bitmap = sync_mode == MIRROR_SYNC_MODE_DIRTY_BITMAP ?
>>>                         sync_bitmap : NULL;
>>> +    if (job->sync_bitmap) {
>>> +        block_job_txn_add_job(txn, &job->common);
>>> +    }
>>
>> Hmm, is this what we want? This will add backup jobs to a transaction
>> only if they have a bitmap attached to the job.
>>
>> However, if we're doing a mixture of full and incremental backups, we
>> may still want to roll back the incremental backups if the full backups
>> failed as part of the transaction.
>>
>> The (admittedly more complicated) design I submitted will always add a
>> job to the transactional group, whether it has a bitmap or not. The
>> membership test was only if it was launched by the backup transaction
>> action. The bitmap is only checked for purposes of refcounting and
>> cleanup mechanics.
>>
>> Maybe that wasn't what we wanted either, but this is a difference in how
>> our series will behave.
> 
> The 'backup' operation was added to the QMP 'transaction' command in
> QEMU 1.6.  If we add non-incremental backup commands to the transaction
> then behavior changes.
> 

Ugh, good point...

> Perhaps DriveBackup and BlockdevBackup QAPI structures should take an
> optional 'transaction' bool argument.  That way the caller decides which
> behavior to use.
> 

The way my version operated only changed the cleanup behavior -- it
didn't attempt to cancel other jobs if they failed or not. It naively
let them finish, then performed cleanup based on the overall completion
status.

That let the old behavior continue working like it did, but changed how
incrementals worked upon completion.

(1) Perhaps we can change the forced cancellation aspect and just allow
jobs to finish naturally even in the event of failure. It's wasteful,
but it does allow us to maintain the existing behavior while getting the
behavior we want for incremental transactions.

(2) Or, yes, add some sort of "all or nothing" flag to transactions(?*)
that users can toggle on/off. I had wondered in the past if it wouldn't
be advantageous for libvirt to be able to choose this behavior, if
managing state of partial completions was desirable in some cases to
avoid re-running operations unnecessarily.

*As a thought, perhaps cancel-all-on-error as a flag can be a property
of the QMP transaction command itself. When set, actions that launch
jobs can add the job to the TXN. An error can be raised if the flag is
used in conjunction with an action that doesn't currently/won't ever
support the do-or-die flag.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH 05/10] block: add block job transactions
  2015-06-29 22:38   ` John Snow
@ 2015-06-30 16:20     ` Stefan Hajnoczi
  0 siblings, 0 replies; 25+ messages in thread
From: Stefan Hajnoczi @ 2015-06-30 16:20 UTC (permalink / raw)
  To: John Snow; +Cc: Kevin Wolf, famz, Jeff Cody, qemu-devel, mreitz, vsementsov

[-- Attachment #1: Type: text/plain, Size: 4721 bytes --]

On Mon, Jun 29, 2015 at 06:38:13PM -0400, John Snow wrote:
> On 06/25/2015 08:12 AM, Stefan Hajnoczi wrote:
> > +/* The purpose of this is to keep txn alive until all jobs have been added */
> > +void block_job_txn_begin(BlockJobTxn *txn)
> > +{
> > +    block_job_txn_unref(txn);
> > +}
> > +
> 
> Maybe it's not entirely clear to the caller that "begin" may in fact
> actually delete the BlockJobTxn. Shall we update the caller's pointer to
> NULL in this case as a hint? Passing a **txn will imply that we are
> giving up our ownership of the object.

Good idea.

> > +/* Cancel all other jobs in case of abort, wake all waiting jobs in case of
> > + * successful completion.  Runs from main loop.
> > + */
> > +static void block_job_txn_complete(BlockJob *job, void *opaque)
> > +{
> > +    BlockJobTxn *txn = opaque;
> > +    BlockJob *other_job;
> > +    bool aborting = txn->aborting;
> > +
> > +    qemu_mutex_lock(&txn->lock);
> > +    txn->ref++; /* keep txn alive until the end of this loop */
> > +
> > +    QLIST_FOREACH(other_job, &txn->jobs, txn_list) {
> > +        AioContext *ctx;
> > +
> > +        qemu_mutex_unlock(&txn->lock);
> > +        ctx = bdrv_get_aio_context(other_job->bs);
> > +        aio_context_acquire(ctx);
> > +
> > +        /* Cancel all other jobs if aborting.  Don't cancel our own failed job
> > +         * since cancellation throws away the error value.
> > +         */
> > +        if (aborting && other_job != job) {
> > +            block_job_cancel(other_job);
> > +        } else {
> > +            block_job_enter(other_job);
> > +        }
> > +
> > +        aio_context_release(ctx);
> > +        qemu_mutex_lock(&txn->lock);
> > +    }
> > +
> > +    qemu_mutex_unlock(&txn->lock);
> > +    block_job_txn_unref(txn);
> > +}
> > +
> 
> Maybe we can name this one along the lines of
> block_job_txn_complete_impl or something clearly internal, so that we
> can name the public interface simply "block_job_txn_complete."
> 
> Maybe I'm just bike shedding, but calling only a "prepare to X" function
> without a matching "X" call in the exposed API seems odd.
> 
> I suppose it doesn't matter, because I can't think of anything nicer :)
> 
> > +void coroutine_fn block_job_txn_prepare_to_complete(BlockJobTxn *txn,
> > +                                                    BlockJob *job,
> > +                                                    int ret)

I'll rename the function to:
void coroutine_fn block_job_txn_job_done(BlockJobTxn *txn, BlockJob *job, int ret)

"complete" is already overloaded.  block_job_completed() is called by
jobs to terminate.  block_job_complete() is called by the monitor to
nudge a waiting job to finish.  They are two different things.  Maybe
using the term in BlockJobTxn makes things more confusing.

(It's possible to drop the txn argument since it can be fetched from
job->txn, but that makes boundary between BlockJob and BlockJobTxn
harder to understand.)

> > +{
> > +    if (!txn) {
> > +        return;
> > +    }
> > +
> > +    qemu_mutex_lock(&txn->lock);
> > +
> > +    /* This function is entered in 3 cases:
> > +     *
> > +     * 1. Successful job completion - wait for other jobs
> > +     * 2. First failed/cancelled job in txn - cancel other jobs and wait
> > +     * 3. Subsequent cancelled jobs - finish immediately, don't wait
> > +     */
> > +    trace_block_job_txn_prepare_to_complete_entry(txn, job, ret,
> > +                                                  block_job_is_cancelled(job),
> > +                                                  txn->aborting,
> > +                                                  txn->jobs_pending);
> > +
> > +    if (txn->aborting) { /* Case 3 */
> > +        assert(block_job_is_cancelled(job));
> > +        goto out; /* already cancelled, don't yield */
> > +    }
> > +
> 
> So the first failure forces all jobs not-yet-complete to cancel. Is
> there any chance for a race condition of two jobs completing almost
> simultaneously, where the first fails and the second completes, and the
> 2nd job makes it here before it gets canceled?
> 
> BOD: I really assume the answer is "no," but it's not immediately
> evident to me.

Good catch!  It is possible if the 2nd job is entered after the 1st job
yielded but before block_job_txn_complete() is called.  There would have
to be a callback pending for the 2nd job.

It can be fixed by adding a new case for jobs that call
block_job_txn_prepare_to_complete() when block_job_txn_complete has been
scheduled.  The job should simply wait for block_job_txn_complete().

I'll add a test case for this scenario.

[-- Attachment #2: Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH 07/10] block/backup: support block job transactions
  2015-06-30 15:52       ` John Snow
@ 2015-07-01  8:45         ` Stefan Hajnoczi
  0 siblings, 0 replies; 25+ messages in thread
From: Stefan Hajnoczi @ 2015-07-01  8:45 UTC (permalink / raw)
  To: John Snow; +Cc: Kevin Wolf, famz, Jeff Cody, qemu-devel, mreitz, vsementsov

[-- Attachment #1: Type: text/plain, Size: 3110 bytes --]

On Tue, Jun 30, 2015 at 11:52:54AM -0400, John Snow wrote:
> On 06/30/2015 11:27 AM, Stefan Hajnoczi wrote:
> > On Mon, Jun 29, 2015 at 06:39:08PM -0400, John Snow wrote:
> >> On 06/25/2015 08:12 AM, Stefan Hajnoczi wrote:
> >>> @@ -537,6 +539,9 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
> >>>      job->sync_mode = sync_mode;
> >>>      job->sync_bitmap = sync_mode == MIRROR_SYNC_MODE_DIRTY_BITMAP ?
> >>>                         sync_bitmap : NULL;
> >>> +    if (job->sync_bitmap) {
> >>> +        block_job_txn_add_job(txn, &job->common);
> >>> +    }
> >>
> >> Hmm, is this what we want? This will add backup jobs to a transaction
> >> only if they have a bitmap attached to the job.
> >>
> >> However, if we're doing a mixture of full and incremental backups, we
> >> may still want to roll back the incremental backups if the full backups
> >> failed as part of the transaction.
> >>
> >> The (admittedly more complicated) design I submitted will always add a
> >> job to the transactional group, whether it has a bitmap or not. The
> >> membership test was only if it was launched by the backup transaction
> >> action. The bitmap is only checked for purposes of refcounting and
> >> cleanup mechanics.
> >>
> >> Maybe that wasn't what we wanted either, but this is a difference in how
> >> our series will behave.
> > 
> > The 'backup' operation was added to the QMP 'transaction' command in
> > QEMU 1.6.  If we add non-incremental backup commands to the transaction
> > then behavior changes.
> > 
> 
> Ugh, good point...
> 
> > Perhaps DriveBackup and BlockdevBackup QAPI structures should take an
> > optional 'transaction' bool argument.  That way the caller decides which
> > behavior to use.
> > 
> 
> The way my version operated only changed the cleanup behavior -- it
> didn't attempt to cancel other jobs if they failed or not. It naively
> let them finish, then performed cleanup based on the overall completion
> status.
> 
> That let the old behavior continue working like it did, but changed how
> incrementals worked upon completion.
> 
> (1) Perhaps we can change the forced cancellation aspect and just allow
> jobs to finish naturally even in the event of failure. It's wasteful,
> but it does allow us to maintain the existing behavior while getting the
> behavior we want for incremental transactions.
> 
> (2) Or, yes, add some sort of "all or nothing" flag to transactions(?*)
> that users can toggle on/off. I had wondered in the past if it wouldn't
> be advantageous for libvirt to be able to choose this behavior, if
> managing state of partial completions was desirable in some cases to
> avoid re-running operations unnecessarily.
> 
> *As a thought, perhaps cancel-all-on-error as a flag can be a property
> of the QMP transaction command itself. When set, actions that launch
> jobs can add the job to the TXN. An error can be raised if the flag is
> used in conjunction with an action that doesn't currently/won't ever
> support the do-or-die flag.

Good, this is easy to add.

[-- Attachment #2: Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2015-07-01  8:45 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-06-25 12:12 [Qemu-devel] [PATCH 00/10] block: incremental backup transactions using BlockJobTxn Stefan Hajnoczi
2015-06-25 12:12 ` [Qemu-devel] [PATCH 01/10] qapi: Add transaction support to block-dirty-bitmap operations Stefan Hajnoczi
2015-06-25 12:12 ` [Qemu-devel] [PATCH 02/10] iotests: add transactional incremental backup test Stefan Hajnoczi
2015-06-25 12:12 ` [Qemu-devel] [PATCH 03/10] block: rename BlkTransactionState and BdrvActionOps Stefan Hajnoczi
2015-06-25 12:12 ` [Qemu-devel] [PATCH 04/10] block: keep bitmap if incremental backup job is cancelled Stefan Hajnoczi
2015-06-26  6:00   ` Fam Zheng
2015-06-29 22:36   ` John Snow
2015-06-25 12:12 ` [Qemu-devel] [PATCH 05/10] block: add block job transactions Stefan Hajnoczi
2015-06-26  6:41   ` Fam Zheng
2015-06-29 22:38   ` John Snow
2015-06-30 16:20     ` Stefan Hajnoczi
2015-06-25 12:12 ` [Qemu-devel] [PATCH 06/10] blockdev: make BlockJobTxn available to qmp 'transaction' Stefan Hajnoczi
2015-06-26  6:42   ` Fam Zheng
2015-06-29 22:38   ` John Snow
2015-06-25 12:12 ` [Qemu-devel] [PATCH 07/10] block/backup: support block job transactions Stefan Hajnoczi
2015-06-26  6:44   ` Fam Zheng
2015-06-29 22:39   ` John Snow
2015-06-30 15:27     ` Stefan Hajnoczi
2015-06-30 15:52       ` John Snow
2015-07-01  8:45         ` Stefan Hajnoczi
2015-06-25 12:12 ` [Qemu-devel] [PATCH 08/10] iotests: 124 - transactional failure test Stefan Hajnoczi
2015-06-25 12:12 ` [Qemu-devel] [PATCH 09/10] qmp-commands.hx: Update the supported 'transaction' operations Stefan Hajnoczi
2015-06-25 12:12 ` [Qemu-devel] [PATCH 10/10] tests: add BlockJobTxn unit test Stefan Hajnoczi
2015-06-26  6:58   ` Fam Zheng
2015-06-29 15:08 ` [Qemu-devel] [PATCH 00/10] block: incremental backup transactions using BlockJobTxn Stefan Hajnoczi

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.