* [Qemu-devel] [PATCH 0/5] blockjobs: Fix transactional race condition
@ 2016-08-08 19:09 John Snow
  2016-08-08 19:09 ` [Qemu-devel] [PATCH 1/5] blockjob: fix dead pointer in txn list John Snow
                   ` (7 more replies)
  0 siblings, 8 replies; 15+ messages in thread
From: John Snow @ 2016-08-08 19:09 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, jcody, vsementsov, famz, qemu-devel, John Snow

There are a few problems with transactional job completion right now.

First, if jobs complete so quickly that they finish before the remaining
jobs have had a chance to join the transaction, the completion mode can
leave its well-known state, the QLIST can become corrupted, and the
transactional jobs can complete in batches or phases instead of all together.

Second, if two or more jobs defer to the main loop at roughly the same
time, it's possible for one job's cleanup to directly invoke the other
job's cleanup from within the same thread, leading to a situation that
will deadlock the entire transaction.
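
The first failure mode can be sketched as a toy model (not QEMU code): a
transaction that decides "everyone is done" by comparing completed jobs
against the jobs that have joined *so far*, rather than against all jobs
that will eventually join, converges prematurely. All names below are
invented for illustration.

```c
#include <assert.h>

/* Toy model of the early-completion race. */
typedef struct Txn {
    int joined;     /* jobs added to the transaction list so far */
    int completed;  /* jobs that have finished */
} Txn;

static void txn_join(Txn *t)
{
    t->joined++;
}

/* Returns 1 if the transaction (wrongly) believes it has converged. */
static int txn_job_completed(Txn *t)
{
    t->completed++;
    return t->completed == t->joined;
}
```

If the first job finishes before the second joins, the transaction
"converges" twice, once per batch, which is exactly the batched/phased
completion described above.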

Thanks to Vladimir for pointing out these modes of failure.

________________________________________________________________________________

For convenience, this branch is available at:
https://github.com/jnsnow/qemu.git branch job-manual-start
https://github.com/jnsnow/qemu/tree/job-manual-start

This version is tagged job-manual-start-v1:
https://github.com/jnsnow/qemu/releases/tag/job-manual-start-v1

John Snow (4):
  blockjob: add block_job_start
  blockjob: refactor backup_start as backup_job_create
  blockjob: add .clean property
  iotests: add transactional failure race test

Vladimir Sementsov-Ogievskiy (1):
  blockjob: fix dead pointer in txn list

 block/backup.c             |  50 +++++++-----
 block/commit.c             |   2 +-
 block/mirror.c             |   2 +-
 block/stream.c             |   2 +-
 blockdev.c                 | 194 ++++++++++++++++++++++++++-------------------
 blockjob.c                 |  24 +++++-
 include/block/block_int.h  |  19 ++---
 include/block/blockjob.h   |  16 ++++
 tests/qemu-iotests/124     |  91 +++++++++++++++++++++
 tests/qemu-iotests/124.out |   4 +-
 tests/test-blockjob-txn.c  |   2 +-
 11 files changed, 284 insertions(+), 122 deletions(-)

-- 
2.7.4


* [Qemu-devel] [PATCH 1/5] blockjob: fix dead pointer in txn list
  2016-08-08 19:09 [Qemu-devel] [PATCH 0/5] blockjobs: Fix transactional race condition John Snow
@ 2016-08-08 19:09 ` John Snow
  2016-09-29 18:16   ` Eric Blake
  2016-08-08 19:09 ` [Qemu-devel] [PATCH 2/5] blockjob: add block_job_start John Snow
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 15+ messages in thread
From: John Snow @ 2016-08-08 19:09 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, jcody, vsementsov, famz, qemu-devel, John Snow

From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

Though this state is not intended to be reached under normal circumstances,
if we do not gracefully deconstruct the transaction QLIST, we may wind
up with stale pointers in the list.

The rest of this series attempts to address the underlying issues,
but this should fix list inconsistencies.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Tested-by: John Snow <jsnow@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
[Rewrote commit message. --js]
Signed-off-by: John Snow <jsnow@redhat.com>
---
 blockjob.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/blockjob.c b/blockjob.c
index a5ba3be..e045091 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -216,6 +216,7 @@ static void block_job_completed_single(BlockJob *job)
     }
     job->cb(job->opaque, job->ret);
     if (job->txn) {
+        QLIST_REMOVE(job, txn_list);
         block_job_txn_unref(job->txn);
     }
     block_job_unref(job);
-- 
2.7.4


* [Qemu-devel] [PATCH 2/5] blockjob: add block_job_start
  2016-08-08 19:09 [Qemu-devel] [PATCH 0/5] blockjobs: Fix transactional race condition John Snow
  2016-08-08 19:09 ` [Qemu-devel] [PATCH 1/5] blockjob: fix dead pointer in txn list John Snow
@ 2016-08-08 19:09 ` John Snow
  2016-08-08 19:09 ` [Qemu-devel] [PATCH 3/5] blockjob: refactor backup_start as backup_job_create John Snow
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 15+ messages in thread
From: John Snow @ 2016-08-08 19:09 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, jcody, vsementsov, famz, qemu-devel, John Snow

Instead of automatically starting jobs at creation time via backup_start
et al, we'd like to return a job object pointer that can be started
manually at a later point in time.

For now, add the block_job_start mechanism and start the jobs
automatically as we have been doing, with conversions job-by-job coming
in later patches.
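
The create/start split can be modeled as a small state machine: jobs are
now created paused and not busy, and block_job_start flips both flags
before entering the coroutine. This is a stand-in sketch, not the
BlockJob struct itself; only the flag names mirror the patch.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct BlockJobModel {
    bool busy;
    bool paused;
    bool entered;   /* stands in for qemu_coroutine_enter(job->co) */
} BlockJobModel;

static BlockJobModel job_create(void)
{
    /* After this patch: created paused, not busy, not yet running. */
    return (BlockJobModel){ .busy = false, .paused = true, .entered = false };
}

static void job_start(BlockJobModel *job)
{
    /* Same invariant block_job_start asserts in the patch below. */
    assert(job->paused && !job->busy);
    job->paused = false;
    job->busy = true;
    job->entered = true;
}
```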

Signed-off-by: John Snow <jsnow@redhat.com>
---
 block/backup.c            |  2 +-
 block/commit.c            |  2 +-
 block/mirror.c            |  2 +-
 block/stream.c            |  2 +-
 blockjob.c                | 11 ++++++++++-
 include/block/blockjob.h  |  8 ++++++++
 tests/test-blockjob-txn.c |  2 +-
 7 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index 2c05323..2229e26 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -578,7 +578,7 @@ void backup_start(const char *job_id, BlockDriverState *bs,
     job->common.len = len;
     job->common.co = qemu_coroutine_create(backup_run, job);
     block_job_txn_add_job(txn, &job->common);
-    qemu_coroutine_enter(job->common.co);
+    block_job_start(&job->common);
     return;
 
  error:
diff --git a/block/commit.c b/block/commit.c
index 553e18d..f93864a 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -278,7 +278,7 @@ void commit_start(const char *job_id, BlockDriverState *bs,
     s->common.co = qemu_coroutine_create(commit_run, s);
 
     trace_commit_start(bs, base, top, s, s->common.co, opaque);
-    qemu_coroutine_enter(s->common.co);
+    block_job_start(&s->common);
 }
 
 
diff --git a/block/mirror.c b/block/mirror.c
index e0b3f41..79d9b84 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -965,7 +965,7 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs,
 
     s->common.co = qemu_coroutine_create(mirror_run, s);
     trace_mirror_start(bs, s, s->common.co, opaque);
-    qemu_coroutine_enter(s->common.co);
+    block_job_start(&s->common);
 }
 
 void mirror_start(const char *job_id, BlockDriverState *bs,
diff --git a/block/stream.c b/block/stream.c
index 3187481..c2a8a3e 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -233,5 +233,5 @@ void stream_start(const char *job_id, BlockDriverState *bs,
     s->on_error = on_error;
     s->common.co = qemu_coroutine_create(stream_run, s);
     trace_stream_start(bs, base, s, s->common.co, opaque);
-    qemu_coroutine_enter(s->common.co);
+    block_job_start(&s->common);
 }
diff --git a/blockjob.c b/blockjob.c
index e045091..0d07abc 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -158,7 +158,8 @@ void *block_job_create(const char *job_id, const BlockJobDriver *driver,
     job->blk           = blk;
     job->cb            = cb;
     job->opaque        = opaque;
-    job->busy          = true;
+    job->busy          = false;
+    job->paused        = true;
     job->refcnt        = 1;
     bs->job = job;
 
@@ -181,6 +182,14 @@ void *block_job_create(const char *job_id, const BlockJobDriver *driver,
     return job;
 }
 
+void block_job_start(BlockJob *job)
+{
+    assert(job && job->co && job->paused && !job->busy);
+    job->paused = false;
+    job->busy = true;
+    qemu_coroutine_enter(job->co);
+}
+
 void block_job_ref(BlockJob *job)
 {
     ++job->refcnt;
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index 4ddb4ae..e06258f 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -246,6 +246,14 @@ void *block_job_create(const char *job_id, const BlockJobDriver *driver,
                        BlockCompletionFunc *cb, void *opaque, Error **errp);
 
 /**
+ * block_job_start:
+ * @job: The job object as returned by @block_job_create.
+ *
+ * Begins execution of a block job.
+ */
+void block_job_start(BlockJob *job);
+
+/**
  * block_job_sleep_ns:
  * @job: The job that calls the function.
  * @clock: The clock to sleep on.
diff --git a/tests/test-blockjob-txn.c b/tests/test-blockjob-txn.c
index d049cba..8399f62 100644
--- a/tests/test-blockjob-txn.c
+++ b/tests/test-blockjob-txn.c
@@ -106,7 +106,7 @@ static BlockJob *test_block_job_start(unsigned int iterations,
     s->common.co = qemu_coroutine_create(test_block_job_run, s);
     data->job = s;
     data->result = result;
-    qemu_coroutine_enter(s->common.co);
+    block_job_start(&s->common);
     return &s->common;
 }
 
-- 
2.7.4


* [Qemu-devel] [PATCH 3/5] blockjob: refactor backup_start as backup_job_create
  2016-08-08 19:09 [Qemu-devel] [PATCH 0/5] blockjobs: Fix transactional race condition John Snow
  2016-08-08 19:09 ` [Qemu-devel] [PATCH 1/5] blockjob: fix dead pointer in txn list John Snow
  2016-08-08 19:09 ` [Qemu-devel] [PATCH 2/5] blockjob: add block_job_start John Snow
@ 2016-08-08 19:09 ` John Snow
  2016-08-08 21:23   ` John Snow
  2016-08-08 19:09 ` [Qemu-devel] [PATCH 4/5] blockjob: add .clean property John Snow
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 15+ messages in thread
From: John Snow @ 2016-08-08 19:09 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, jcody, vsementsov, famz, qemu-devel, John Snow

Refactor backup_start as backup_job_create, which only creates the job,
but does not automatically start it. The old interface, 'backup_start',
is not kept, in favor of limiting the number of nearly-identical interfaces
that would have to be edited to keep up with QAPI changes in the future.

Callers that wish to synchronously start the backup_block_job can
instead just call block_job_start immediately after calling
backup_job_create.

Transactions are updated to use the new interface, calling block_job_start
only during the .commit phase, which helps prevent race conditions where
jobs may finish before we even finish building the transaction. This may
happen, for instance, with empty backup jobs.
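
The two-phase pattern can be sketched as follows: prepare creates each
job (paused), and commit starts them only after every action has
prepared, so no job can finish while the transaction is still being
built. The structure is a toy model, not the blockdev.c implementation.

```c
#include <assert.h>

#define MAX_JOBS 8

typedef struct {
    int started;
} Job;

typedef struct {
    Job jobs[MAX_JOBS];
    int njobs;
} Txn;

/* Prepare phase: create the job but do not run it. */
static Job *txn_prepare_one(Txn *t)
{
    Job *j = &t->jobs[t->njobs++];
    j->started = 0;
    return j;
}

/* Commit phase: only now does each job begin (block_job_start). */
static void txn_commit(Txn *t)
{
    for (int i = 0; i < t->njobs; i++) {
        t->jobs[i].started = 1;
    }
}
```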

Reported-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: John Snow <jsnow@redhat.com>
---
 block/backup.c            |  39 ++++++----
 blockdev.c                | 194 ++++++++++++++++++++++++++--------------------
 blockjob.c                |   9 ++-
 include/block/block_int.h |  19 ++---
 4 files changed, 149 insertions(+), 112 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index 2229e26..5878ffe 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -474,13 +474,16 @@ static void coroutine_fn backup_run(void *opaque)
     block_job_defer_to_main_loop(&job->common, backup_complete, data);
 }
 
-void backup_start(const char *job_id, BlockDriverState *bs,
-                  BlockDriverState *target, int64_t speed,
-                  MirrorSyncMode sync_mode, BdrvDirtyBitmap *sync_bitmap,
-                  BlockdevOnError on_source_error,
-                  BlockdevOnError on_target_error,
-                  BlockCompletionFunc *cb, void *opaque,
-                  BlockJobTxn *txn, Error **errp)
+BlockJob *backup_job_create(const char *job_id,
+                            BlockDriverState *bs,
+                            BlockDriverState *target,
+                            int64_t speed,
+                            MirrorSyncMode sync_mode,
+                            BdrvDirtyBitmap *sync_bitmap,
+                            BlockdevOnError on_source_error,
+                            BlockdevOnError on_target_error,
+                            BlockCompletionFunc *cb, void *opaque,
+                            BlockJobTxn *txn, Error **errp)
 {
     int64_t len;
     BlockDriverInfo bdi;
@@ -492,46 +495,46 @@ void backup_start(const char *job_id, BlockDriverState *bs,
 
     if (bs == target) {
         error_setg(errp, "Source and target cannot be the same");
-        return;
+        return NULL;
     }
 
     if (!bdrv_is_inserted(bs)) {
         error_setg(errp, "Device is not inserted: %s",
                    bdrv_get_device_name(bs));
-        return;
+        return NULL;
     }
 
     if (!bdrv_is_inserted(target)) {
         error_setg(errp, "Device is not inserted: %s",
                    bdrv_get_device_name(target));
-        return;
+        return NULL;
     }
 
     if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_BACKUP_SOURCE, errp)) {
-        return;
+        return NULL;
     }
 
     if (bdrv_op_is_blocked(target, BLOCK_OP_TYPE_BACKUP_TARGET, errp)) {
-        return;
+        return NULL;
     }
 
     if (sync_mode == MIRROR_SYNC_MODE_INCREMENTAL) {
         if (!sync_bitmap) {
             error_setg(errp, "must provide a valid bitmap name for "
                              "\"incremental\" sync mode");
-            return;
+            return NULL;
         }
 
         /* Create a new bitmap, and freeze/disable this one. */
         if (bdrv_dirty_bitmap_create_successor(bs, sync_bitmap, errp) < 0) {
-            return;
+            return NULL;
         }
     } else if (sync_bitmap) {
         error_setg(errp,
                    "a sync_bitmap was provided to backup_run, "
                    "but received an incompatible sync_mode (%s)",
                    MirrorSyncMode_lookup[sync_mode]);
-        return;
+        return NULL;
     }
 
     len = bdrv_getlength(bs);
@@ -578,8 +581,8 @@ void backup_start(const char *job_id, BlockDriverState *bs,
     job->common.len = len;
     job->common.co = qemu_coroutine_create(backup_run, job);
     block_job_txn_add_job(txn, &job->common);
-    block_job_start(&job->common);
-    return;
+
+    return &job->common;
 
  error:
     if (sync_bitmap) {
@@ -589,4 +592,6 @@ void backup_start(const char *job_id, BlockDriverState *bs,
         blk_unref(job->target);
         block_job_unref(&job->common);
     }
+
+    return NULL;
 }
diff --git a/blockdev.c b/blockdev.c
index 2161400..4b041d9 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1838,17 +1838,17 @@ typedef struct DriveBackupState {
     BlockJob *job;
 } DriveBackupState;
 
-static void do_drive_backup(const char *job_id, const char *device,
-                            const char *target, bool has_format,
-                            const char *format, enum MirrorSyncMode sync,
-                            bool has_mode, enum NewImageMode mode,
-                            bool has_speed, int64_t speed,
-                            bool has_bitmap, const char *bitmap,
-                            bool has_on_source_error,
-                            BlockdevOnError on_source_error,
-                            bool has_on_target_error,
-                            BlockdevOnError on_target_error,
-                            BlockJobTxn *txn, Error **errp);
+static BlockJob *do_drive_backup(const char *job_id, const char *device,
+                                 const char *target, bool has_format,
+                                 const char *format, enum MirrorSyncMode sync,
+                                 bool has_mode, enum NewImageMode mode,
+                                 bool has_speed, int64_t speed,
+                                 bool has_bitmap, const char *bitmap,
+                                 bool has_on_source_error,
+                                 BlockdevOnError on_source_error,
+                                 bool has_on_target_error,
+                                 BlockdevOnError on_target_error,
+                                 BlockJobTxn *txn, Error **errp);
 
 static void drive_backup_prepare(BlkActionState *common, Error **errp)
 {
@@ -1878,32 +1878,38 @@ static void drive_backup_prepare(BlkActionState *common, Error **errp)
     bdrv_drained_begin(blk_bs(blk));
     state->bs = blk_bs(blk);
 
-    do_drive_backup(backup->has_job_id ? backup->job_id : NULL,
-                    backup->device, backup->target,
-                    backup->has_format, backup->format,
-                    backup->sync,
-                    backup->has_mode, backup->mode,
-                    backup->has_speed, backup->speed,
-                    backup->has_bitmap, backup->bitmap,
-                    backup->has_on_source_error, backup->on_source_error,
-                    backup->has_on_target_error, backup->on_target_error,
-                    common->block_job_txn, &local_err);
+    state->job = do_drive_backup(backup->has_job_id ? backup->job_id : NULL,
+                                 backup->device, backup->target,
+                                 backup->has_format, backup->format,
+                                 backup->sync,
+                                 backup->has_mode, backup->mode,
+                                 backup->has_speed, backup->speed,
+                                 backup->has_bitmap, backup->bitmap,
+                                 backup->has_on_source_error,
+                                 backup->on_source_error,
+                                 backup->has_on_target_error,
+                                 backup->on_target_error,
+                                 common->block_job_txn, &local_err);
     if (local_err) {
         error_propagate(errp, local_err);
         return;
     }
+}
 
-    state->job = state->bs->job;
+static void drive_backup_commit(BlkActionState *common)
+{
+    DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
+    if (state->job) {
+        block_job_start(state->job);
+    }
 }
 
 static void drive_backup_abort(BlkActionState *common)
 {
     DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
-    BlockDriverState *bs = state->bs;
 
-    /* Only cancel if it's the job we started */
-    if (bs && bs->job && bs->job == state->job) {
-        block_job_cancel_sync(bs->job);
+    if (state->job) {
+        block_job_cancel_sync(state->job);
     }
 }
 
@@ -1924,14 +1930,15 @@ typedef struct BlockdevBackupState {
     AioContext *aio_context;
 } BlockdevBackupState;
 
-static void do_blockdev_backup(const char *job_id, const char *device,
-                               const char *target, enum MirrorSyncMode sync,
-                               bool has_speed, int64_t speed,
-                               bool has_on_source_error,
-                               BlockdevOnError on_source_error,
-                               bool has_on_target_error,
-                               BlockdevOnError on_target_error,
-                               BlockJobTxn *txn, Error **errp);
+static BlockJob *do_blockdev_backup(const char *job_id, const char *device,
+                                    const char *target,
+                                    enum MirrorSyncMode sync,
+                                    bool has_speed, int64_t speed,
+                                    bool has_on_source_error,
+                                    BlockdevOnError on_source_error,
+                                    bool has_on_target_error,
+                                    BlockdevOnError on_target_error,
+                                    BlockJobTxn *txn, Error **errp);
 
 static void blockdev_backup_prepare(BlkActionState *common, Error **errp)
 {
@@ -1971,28 +1978,35 @@ static void blockdev_backup_prepare(BlkActionState *common, Error **errp)
     state->bs = blk_bs(blk);
     bdrv_drained_begin(state->bs);
 
-    do_blockdev_backup(backup->has_job_id ? backup->job_id : NULL,
-                       backup->device, backup->target, backup->sync,
-                       backup->has_speed, backup->speed,
-                       backup->has_on_source_error, backup->on_source_error,
-                       backup->has_on_target_error, backup->on_target_error,
-                       common->block_job_txn, &local_err);
+    state->job = do_blockdev_backup(backup->has_job_id ? backup->job_id : NULL,
+                                    backup->device, backup->target,
+                                    backup->sync,
+                                    backup->has_speed, backup->speed,
+                                    backup->has_on_source_error,
+                                    backup->on_source_error,
+                                    backup->has_on_target_error,
+                                    backup->on_target_error,
+                                    common->block_job_txn, &local_err);
     if (local_err) {
         error_propagate(errp, local_err);
         return;
     }
+}
 
-    state->job = state->bs->job;
+static void blockdev_backup_commit(BlkActionState *common)
+{
+    BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, common);
+    if (state->job) {
+        block_job_start(state->job);
+    }
 }
 
 static void blockdev_backup_abort(BlkActionState *common)
 {
     BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, common);
-    BlockDriverState *bs = state->bs;
 
-    /* Only cancel if it's the job we started */
-    if (bs && bs->job && bs->job == state->job) {
-        block_job_cancel_sync(bs->job);
+    if (state->job) {
+        block_job_cancel_sync(state->job);
     }
 }
 
@@ -2142,12 +2156,14 @@ static const BlkActionOps actions[] = {
     [TRANSACTION_ACTION_KIND_DRIVE_BACKUP] = {
         .instance_size = sizeof(DriveBackupState),
         .prepare = drive_backup_prepare,
+        .commit = drive_backup_commit,
         .abort = drive_backup_abort,
         .clean = drive_backup_clean,
     },
     [TRANSACTION_ACTION_KIND_BLOCKDEV_BACKUP] = {
         .instance_size = sizeof(BlockdevBackupState),
         .prepare = blockdev_backup_prepare,
+        .commit = blockdev_backup_commit,
         .abort = blockdev_backup_abort,
         .clean = blockdev_backup_clean,
     },
@@ -3155,22 +3171,23 @@ out:
     aio_context_release(aio_context);
 }
 
-static void do_drive_backup(const char *job_id, const char *device,
-                            const char *target, bool has_format,
-                            const char *format, enum MirrorSyncMode sync,
-                            bool has_mode, enum NewImageMode mode,
-                            bool has_speed, int64_t speed,
-                            bool has_bitmap, const char *bitmap,
-                            bool has_on_source_error,
-                            BlockdevOnError on_source_error,
-                            bool has_on_target_error,
-                            BlockdevOnError on_target_error,
-                            BlockJobTxn *txn, Error **errp)
+static BlockJob *do_drive_backup(const char *job_id, const char *device,
+                                 const char *target, bool has_format,
+                                 const char *format, enum MirrorSyncMode sync,
+                                 bool has_mode, enum NewImageMode mode,
+                                 bool has_speed, int64_t speed,
+                                 bool has_bitmap, const char *bitmap,
+                                 bool has_on_source_error,
+                                 BlockdevOnError on_source_error,
+                                 bool has_on_target_error,
+                                 BlockdevOnError on_target_error,
+                                 BlockJobTxn *txn, Error **errp)
 {
     BlockBackend *blk;
     BlockDriverState *bs;
     BlockDriverState *target_bs;
     BlockDriverState *source = NULL;
+    BlockJob *job = NULL;
     BdrvDirtyBitmap *bmap = NULL;
     AioContext *aio_context;
     QDict *options = NULL;
@@ -3195,7 +3212,7 @@ static void do_drive_backup(const char *job_id, const char *device,
     if (!blk) {
         error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
                   "Device '%s' not found", device);
-        return;
+        return NULL;
     }
 
     aio_context = blk_get_aio_context(blk);
@@ -3276,9 +3293,9 @@ static void do_drive_backup(const char *job_id, const char *device,
         }
     }
 
-    backup_start(job_id, bs, target_bs, speed, sync, bmap,
-                 on_source_error, on_target_error,
-                 block_job_cb, bs, txn, &local_err);
+    job = backup_job_create(job_id, bs, target_bs, speed, sync, bmap,
+                            on_source_error, on_target_error,
+                            block_job_cb, bs, txn, &local_err);
     bdrv_unref(target_bs);
     if (local_err != NULL) {
         error_propagate(errp, local_err);
@@ -3287,6 +3304,7 @@ static void do_drive_backup(const char *job_id, const char *device,
 
 out:
     aio_context_release(aio_context);
+    return job;
 }
 
 void qmp_drive_backup(bool has_job_id, const char *job_id,
@@ -3300,13 +3318,17 @@ void qmp_drive_backup(bool has_job_id, const char *job_id,
                       bool has_on_target_error, BlockdevOnError on_target_error,
                       Error **errp)
 {
-    return do_drive_backup(has_job_id ? job_id : NULL, device, target,
-                           has_format, format, sync,
-                           has_mode, mode, has_speed, speed,
-                           has_bitmap, bitmap,
-                           has_on_source_error, on_source_error,
-                           has_on_target_error, on_target_error,
-                           NULL, errp);
+    BlockJob *job;
+    job = do_drive_backup(has_job_id ? job_id : NULL, device, target,
+                          has_format, format, sync,
+                          has_mode, mode, has_speed, speed,
+                          has_bitmap, bitmap,
+                          has_on_source_error, on_source_error,
+                          has_on_target_error, on_target_error,
+                          NULL, errp);
+    if (job) {
+        block_job_start(job);
+    }
 }
 
 BlockDeviceInfoList *qmp_query_named_block_nodes(Error **errp)
@@ -3314,20 +3336,21 @@ BlockDeviceInfoList *qmp_query_named_block_nodes(Error **errp)
     return bdrv_named_nodes_list(errp);
 }
 
-void do_blockdev_backup(const char *job_id, const char *device,
-                        const char *target, enum MirrorSyncMode sync,
-                         bool has_speed, int64_t speed,
-                         bool has_on_source_error,
-                         BlockdevOnError on_source_error,
-                         bool has_on_target_error,
-                         BlockdevOnError on_target_error,
-                         BlockJobTxn *txn, Error **errp)
+BlockJob *do_blockdev_backup(const char *job_id, const char *device,
+                             const char *target, enum MirrorSyncMode sync,
+                             bool has_speed, int64_t speed,
+                             bool has_on_source_error,
+                             BlockdevOnError on_source_error,
+                             bool has_on_target_error,
+                             BlockdevOnError on_target_error,
+                             BlockJobTxn *txn, Error **errp)
 {
     BlockBackend *blk;
     BlockDriverState *bs;
     BlockDriverState *target_bs;
     Error *local_err = NULL;
     AioContext *aio_context;
+    BlockJob *job = NULL;
 
     if (!has_speed) {
         speed = 0;
@@ -3342,7 +3365,7 @@ void do_blockdev_backup(const char *job_id, const char *device,
     blk = blk_by_name(device);
     if (!blk) {
         error_setg(errp, "Device '%s' not found", device);
-        return;
+        return NULL;
     }
 
     aio_context = blk_get_aio_context(blk);
@@ -3370,13 +3393,14 @@ void do_blockdev_backup(const char *job_id, const char *device,
             goto out;
         }
     }
-    backup_start(job_id, bs, target_bs, speed, sync, NULL, on_source_error,
-                 on_target_error, block_job_cb, bs, txn, &local_err);
+    job = backup_job_create(job_id, bs, target_bs, speed, sync, NULL, on_source_error,
+                            on_target_error, block_job_cb, bs, txn, &local_err);
     if (local_err != NULL) {
         error_propagate(errp, local_err);
     }
 out:
     aio_context_release(aio_context);
+    return job;
 }
 
 void qmp_blockdev_backup(bool has_job_id, const char *job_id,
@@ -3389,11 +3413,15 @@ void qmp_blockdev_backup(bool has_job_id, const char *job_id,
                          BlockdevOnError on_target_error,
                          Error **errp)
 {
-    do_blockdev_backup(has_job_id ? job_id : NULL, device, target,
-                       sync, has_speed, speed,
-                       has_on_source_error, on_source_error,
-                       has_on_target_error, on_target_error,
-                       NULL, errp);
+    BlockJob *job;
+    job = do_blockdev_backup(has_job_id ? job_id : NULL, device, target,
+                             sync, has_speed, speed,
+                             has_on_source_error, on_source_error,
+                             has_on_target_error, on_target_error,
+                             NULL, errp);
+    if (job) {
+        block_job_start(job);
+    }
 }
 
 /* Parameter check and block job starting for drive mirroring.
diff --git a/blockjob.c b/blockjob.c
index 0d07abc..74b19b9 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -117,9 +117,12 @@ static void block_job_detach_aio_context(void *opaque)
     block_job_unref(job);
 }
 
-void *block_job_create(const char *job_id, const BlockJobDriver *driver,
-                       BlockDriverState *bs, int64_t speed,
-                       BlockCompletionFunc *cb, void *opaque, Error **errp)
+void *block_job_create(const char *job_id,
+                       const BlockJobDriver *driver,
+                       BlockDriverState *bs,
+                       int64_t speed,
+                       BlockCompletionFunc *cb,
+                       void *opaque, Error **errp)
 {
     BlockBackend *blk;
     BlockJob *job;
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 47665be..ba63a8e 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -745,7 +745,7 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
                   void *opaque, Error **errp);
 
 /*
- * backup_start:
+ * backup_job_create:
  * @job_id: The id of the newly-created job, or %NULL to use the
  * device name of @bs.
  * @bs: Block device to operate on.
@@ -759,16 +759,17 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
  * @opaque: Opaque pointer value passed to @cb.
  * @txn: Transaction that this job is part of (may be NULL).
  *
- * Start a backup operation on @bs.  Clusters in @bs are written to @target
+ * Create a backup operation on @bs.  Clusters in @bs are written to @target
  * until the job is cancelled or manually completed.
  */
-void backup_start(const char *job_id, BlockDriverState *bs,
-                  BlockDriverState *target, int64_t speed,
-                  MirrorSyncMode sync_mode, BdrvDirtyBitmap *sync_bitmap,
-                  BlockdevOnError on_source_error,
-                  BlockdevOnError on_target_error,
-                  BlockCompletionFunc *cb, void *opaque,
-                  BlockJobTxn *txn, Error **errp);
+BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
+                            BlockDriverState *target, int64_t speed,
+                            MirrorSyncMode sync_mode,
+                            BdrvDirtyBitmap *sync_bitmap,
+                            BlockdevOnError on_source_error,
+                            BlockdevOnError on_target_error,
+                            BlockCompletionFunc *cb, void *opaque,
+                            BlockJobTxn *txn, Error **errp);
 
 void hmp_drive_add_node(Monitor *mon, const char *optstr);
 
-- 
2.7.4


* [Qemu-devel] [PATCH 4/5] blockjob: add .clean property
  2016-08-08 19:09 [Qemu-devel] [PATCH 0/5] blockjobs: Fix transactional race condition John Snow
                   ` (2 preceding siblings ...)
  2016-08-08 19:09 ` [Qemu-devel] [PATCH 3/5] blockjob: refactor backup_start as backup_job_create John Snow
@ 2016-08-08 19:09 ` John Snow
  2016-08-08 19:09 ` [Qemu-devel] [PATCH 5/5] iotests: add transactional failure race test John Snow
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 15+ messages in thread
From: John Snow @ 2016-08-08 19:09 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, jcody, vsementsov, famz, qemu-devel, John Snow

Cleaning up after we have deferred to the main thread but before the
transaction has converged can be dangerous and result in deadlocks
if the job cleanup invokes any BH polling loops.

A job may begin cleaning up and, in doing so, induce another job to
enter its cleanup routine. The second job, part of the same
transaction, will block waiting for the first job to finish, so
neither job can make progress.

To rectify this, allow jobs to register a cleanup operation that will
always run, regardless of whether the job was part of a transaction,
and regardless of whether the transaction job group completed
successfully.

Move sensitive cleanup to this callback instead; it is guaranteed to
run only after the transaction has converged, which removes the
delicate timing constraints from said cleanup.

Reported-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: John Snow <jsnow@redhat.com>
---
 block/backup.c           | 11 ++++++++---
 blockjob.c               |  3 +++
 include/block/blockjob.h |  8 ++++++++
 3 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index 5878ffe..bcc43a2 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -246,6 +246,13 @@ static void backup_abort(BlockJob *job)
     }
 }
 
+static void backup_clean(BlockJob *job)
+{
+    BackupBlockJob *s = container_of(job, BackupBlockJob, common);
+    assert(s->target);
+    blk_unref(s->target);
+}
+
 static void backup_attached_aio_context(BlockJob *job, AioContext *aio_context)
 {
     BackupBlockJob *s = container_of(job, BackupBlockJob, common);
@@ -259,6 +266,7 @@ static const BlockJobDriver backup_job_driver = {
     .set_speed              = backup_set_speed,
     .commit                 = backup_commit,
     .abort                  = backup_abort,
+    .clean                  = backup_clean,
     .attached_aio_context   = backup_attached_aio_context,
 };
 
@@ -280,11 +288,8 @@ typedef struct {
 
 static void backup_complete(BlockJob *job, void *opaque)
 {
-    BackupBlockJob *s = container_of(job, BackupBlockJob, common);
     BackupCompleteData *data = opaque;
 
-    blk_unref(s->target);
-
     block_job_completed(job, data->ret);
     g_free(data);
 }
diff --git a/blockjob.c b/blockjob.c
index 74b19b9..7b375cf 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -226,6 +226,9 @@ static void block_job_completed_single(BlockJob *job)
             job->driver->abort(job);
         }
     }
+    if (job->driver->clean) {
+        job->driver->clean(job);
+    }
     job->cb(job->opaque, job->ret);
     if (job->txn) {
         QLIST_REMOVE(job, txn_list);
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index e06258f..81d1712 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -73,6 +73,14 @@ typedef struct BlockJobDriver {
     void (*abort)(BlockJob *job);
 
     /**
+     * If the callback is not NULL, it will be invoked after a call to either
+     * .commit() or .abort(). Regardless of which callback is invoked after
+     * completion, .clean() will always be called, even if the job does not
+     * belong to a transaction group.
+     */
+    void (*clean)(BlockJob *job);
+
+    /**
      * If the callback is not NULL, it will be invoked when the job transitions
      * into the paused state.  Paused jobs must not perform any asynchronous
      * I/O or event loop activity.  This callback is used to quiesce jobs.
-- 
2.7.4


* [Qemu-devel] [PATCH 5/5] iotests: add transactional failure race test
  2016-08-08 19:09 [Qemu-devel] [PATCH 0/5] blockjobs: Fix transactional race condition John Snow
                   ` (3 preceding siblings ...)
  2016-08-08 19:09 ` [Qemu-devel] [PATCH 4/5] blockjob: add .clean property John Snow
@ 2016-08-08 19:09 ` John Snow
  2016-08-10 15:19   ` Vladimir Sementsov-Ogievskiy
  2016-08-08 19:18 ` [Qemu-devel] [PATCH 0/5] blockjobs: Fix transactional race condition no-reply
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 15+ messages in thread
From: John Snow @ 2016-08-08 19:09 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, jcody, vsementsov, famz, qemu-devel, John Snow

Add a regression test for the case found by Vladimir.

Reported-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: John Snow <jsnow@redhat.com>
---
 tests/qemu-iotests/124     | 91 ++++++++++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/124.out |  4 +-
 2 files changed, 93 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/124 b/tests/qemu-iotests/124
index de7cdbe..a4a8c83 100644
--- a/tests/qemu-iotests/124
+++ b/tests/qemu-iotests/124
@@ -509,6 +509,97 @@ class TestIncrementalBackup(TestIncrementalBackupBase):
         self.check_backups()
 
 
+    def test_transaction_failure_race(self):
+        '''Test: Verify that transactions with jobs that have no data to
+        transfer do not cause race conditions in the cancellation of the entire
+        transaction job group.
+        '''
+
+        # Create a second drive, with pattern:
+        drive1 = self.add_node('drive1')
+        self.img_create(drive1['file'], drive1['fmt'])
+        io_write_patterns(drive1['file'], (('0x14', 0, 512),
+                                           ('0x5d', '1M', '32k'),
+                                           ('0xcd', '32M', '124k')))
+
+        # Create a blkdebug interface to this img as 'drive1'
+        result = self.vm.qmp('blockdev-add', options={
+            'id': drive1['id'],
+            'driver': drive1['fmt'],
+            'file': {
+                'driver': 'blkdebug',
+                'image': {
+                    'driver': 'file',
+                    'filename': drive1['file']
+                },
+                'set-state': [{
+                    'event': 'flush_to_disk',
+                    'state': 1,
+                    'new_state': 2
+                }],
+                'inject-error': [{
+                    'event': 'read_aio',
+                    'errno': 5,
+                    'state': 2,
+                    'immediately': False,
+                    'once': True
+                }],
+            }
+        })
+        self.assert_qmp(result, 'return', {})
+
+        # Create bitmaps and full backups for both drives
+        drive0 = self.drives[0]
+        dr0bm0 = self.add_bitmap('bitmap0', drive0)
+        dr1bm0 = self.add_bitmap('bitmap0', drive1)
+        self.create_anchor_backup(drive0)
+        self.create_anchor_backup(drive1)
+        self.assert_no_active_block_jobs()
+        self.assertFalse(self.vm.get_qmp_events(wait=False))
+
+        # Emulate some writes
+        self.hmp_io_writes(drive1['id'], (('0xba', 0, 512),
+                                          ('0xef', '16M', '256k'),
+                                          ('0x46', '32736k', '64k')))
+
+        # Create incremental backup targets
+        target0 = self.prepare_backup(dr0bm0)
+        target1 = self.prepare_backup(dr1bm0)
+
+        # Ask for a new incremental backup per-each drive, expecting drive1's
+        # backup to fail and attempt to cancel the empty drive0 job.
+        transaction = [
+            transaction_drive_backup(drive0['id'], target0, sync='incremental',
+                                     format=drive0['fmt'], mode='existing',
+                                     bitmap=dr0bm0.name),
+            transaction_drive_backup(drive1['id'], target1, sync='incremental',
+                                     format=drive1['fmt'], mode='existing',
+                                     bitmap=dr1bm0.name)
+        ]
+        result = self.vm.qmp('transaction', actions=transaction,
+                             properties={'completion-mode': 'grouped'} )
+        self.assert_qmp(result, 'return', {})
+
+        # Observe that drive0's backup is cancelled and drive1 completes with
+        # an error.
+        self.wait_qmp_backup_cancelled(drive0['id'])
+        self.assertFalse(self.wait_qmp_backup(drive1['id']))
+        error = self.vm.event_wait('BLOCK_JOB_ERROR')
+        self.assert_qmp(error, 'data', {'device': drive1['id'],
+                                        'action': 'report',
+                                        'operation': 'read'})
+        self.assertFalse(self.vm.get_qmp_events(wait=False))
+        self.assert_no_active_block_jobs()
+
+        # Delete drive0's successful target and eliminate our record of the
+        # unsuccessful drive1 target.
+        dr0bm0.del_target()
+        dr1bm0.del_target()
+
+        self.vm.shutdown()
+
+
+
     def test_sync_dirty_bitmap_missing(self):
         self.assert_no_active_block_jobs()
         self.files.append(self.err_img)
diff --git a/tests/qemu-iotests/124.out b/tests/qemu-iotests/124.out
index 36376be..e56cae0 100644
--- a/tests/qemu-iotests/124.out
+++ b/tests/qemu-iotests/124.out
@@ -1,5 +1,5 @@
-..........
+...........
 ----------------------------------------------------------------------
-Ran 10 tests
+Ran 11 tests
 
 OK
-- 
2.7.4


* Re: [Qemu-devel] [PATCH 0/5] blockjobs: Fix transactional race condition
  2016-08-08 19:09 [Qemu-devel] [PATCH 0/5] blockjobs: Fix transactional race condition John Snow
                   ` (4 preceding siblings ...)
  2016-08-08 19:09 ` [Qemu-devel] [PATCH 5/5] iotests: add transactional failure race test John Snow
@ 2016-08-08 19:18 ` no-reply
  2016-08-08 19:19 ` John Snow
       [not found] ` <57E94491.8090501@virtuozzo.com>
  7 siblings, 0 replies; 15+ messages in thread
From: no-reply @ 2016-08-08 19:18 UTC (permalink / raw)
  To: jsnow; +Cc: famz, qemu-block, kwolf, vsementsov

Hi,

Your series seems to have some coding style problems. See output below for
more information:

Message-id: 1470683381-16680-1-git-send-email-jsnow@redhat.com
Type: series
Subject: [Qemu-devel] [PATCH 0/5] blockjobs: Fix transactional race condition

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
    echo "Checking PATCH $n/$total: $(git show --no-patch --format=%s $c)..."
    if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
        failed=1
        echo
    fi
    n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag]         patchew/1470683381-16680-1-git-send-email-jsnow@redhat.com -> patchew/1470683381-16680-1-git-send-email-jsnow@redhat.com
Switched to a new branch 'test'
23f79b3 iotests: add transactional failure race test
8852e8a blockjob: add .clean property
0a62157 blockjob: refactor backup_start as backup_job_create
22acc76 blockjob: add block_job_start
e1cbf1b blockjob: fix dead pointer in txn list

=== OUTPUT BEGIN ===
Checking PATCH 1/5: blockjob: fix dead pointer in txn list...
Checking PATCH 2/5: blockjob: add block_job_start...
Checking PATCH 3/5: blockjob: refactor backup_start as backup_job_create...
WARNING: line over 80 characters
#433: FILE: blockdev.c:3396:
+    job = backup_job_create(job_id, bs, target_bs, speed, sync, NULL, on_source_error,

total: 0 errors, 1 warnings, 463 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 4/5: blockjob: add .clean property...
Checking PATCH 5/5: iotests: add transactional failure race test...
=== OUTPUT END ===

Test command exited with code: 1


* Re: [Qemu-devel] [PATCH 0/5] blockjobs: Fix transactional race condition
  2016-08-08 19:09 [Qemu-devel] [PATCH 0/5] blockjobs: Fix transactional race condition John Snow
                   ` (5 preceding siblings ...)
  2016-08-08 19:18 ` [Qemu-devel] [PATCH 0/5] blockjobs: Fix transactional race condition no-reply
@ 2016-08-08 19:19 ` John Snow
       [not found] ` <57E94491.8090501@virtuozzo.com>
  7 siblings, 0 replies; 15+ messages in thread
From: John Snow @ 2016-08-08 19:19 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, jcody, vsementsov, famz, qemu-devel

I should also clarify that this is ambiguously for either 2.7 or 2.8.

2.7: This fixes a real, observable problem with transactional completion 
that has the capacity to hang QEMU or segfault due to QLIST corruption.

2.8: Incremental backup is not earnestly a supported feature yet as 
persistence and migration are not yet integrated, so perhaps it's not a 
game-breaker that this feature breaks in some circumstances.


On 08/08/2016 03:09 PM, John Snow wrote:
> There are a few problems with transactional job completion right now.
>
> First, if jobs complete so quickly they complete before remaining jobs
> get a chance to join the transaction, the completion mode can leave well
> known state and the QLIST can get corrupted and the transactional jobs
> can complete in batches or phases instead of all together.
>
> Second, if two or more jobs defer to the main loop at roughly the same
> time, it's possible for one job's cleanup to directly invoke the other
> job's cleanup from within the same thread, leading to a situation that
> will deadlock the entire transaction.
>
> Thanks to Vladimir for pointing out these modes of failure.
>
> ________________________________________________________________________________
>
> For convenience, this branch is available at:
> https://github.com/jnsnow/qemu.git branch job-manual-start
> https://github.com/jnsnow/qemu/tree/job-manual-start
>
> This version is tagged job-manual-start-v1:
> https://github.com/jnsnow/qemu/releases/tag/job-manual-start-v1
>
> John Snow (4):
>   blockjob: add block_job_start
>   blockjob: refactor backup_start as backup_job_create
>   blockjob: add .clean property
>   iotests: add transactional failure race test
>
> Vladimir Sementsov-Ogievskiy (1):
>   blockjob: fix dead pointer in txn list
>
>  block/backup.c             |  50 +++++++-----
>  block/commit.c             |   2 +-
>  block/mirror.c             |   2 +-
>  block/stream.c             |   2 +-
>  blockdev.c                 | 194 ++++++++++++++++++++++++++-------------------
>  blockjob.c                 |  24 +++++-
>  include/block/block_int.h  |  19 ++---
>  include/block/blockjob.h   |  16 ++++
>  tests/qemu-iotests/124     |  91 +++++++++++++++++++++
>  tests/qemu-iotests/124.out |   4 +-
>  tests/test-blockjob-txn.c  |   2 +-
>  11 files changed, 284 insertions(+), 122 deletions(-)
>

-- 
—js


* Re: [Qemu-devel] [PATCH 3/5] blockjob: refactor backup_start as backup_job_create
  2016-08-08 19:09 ` [Qemu-devel] [PATCH 3/5] blockjob: refactor backup_start as backup_job_create John Snow
@ 2016-08-08 21:23   ` John Snow
  0 siblings, 0 replies; 15+ messages in thread
From: John Snow @ 2016-08-08 21:23 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, vsementsov, famz, jcody, qemu-devel



On 08/08/2016 03:09 PM, John Snow wrote:
> Refactor backup_start as backup_job_create, which only creates the job,
> but does not automatically start it. The old interface, 'backup_start',
> is not kept in favor of limiting the number of nearly-identical interfaces
> that would have to be edited to keep up with QAPI changes in the future.
>
> Callers that wish to synchronously start the backup_block_job can
> instead just call block_job_start immediately after calling
> backup_job_create.
>
> Transactions are updated to use the new interface, calling block_job_start
> only during the .commit phase, which helps prevent race conditions where
> jobs may finish before we even finish building the transaction. This may
> happen, for instance, during empty backup jobs.
>
> Reported-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> Signed-off-by: John Snow <jsnow@redhat.com>
> ---
>  block/backup.c            |  39 ++++++----
>  blockdev.c                | 194 ++++++++++++++++++++++++++--------------------
>  blockjob.c                |   9 ++-
>  include/block/block_int.h |  19 ++---
>  4 files changed, 149 insertions(+), 112 deletions(-)
>
> diff --git a/block/backup.c b/block/backup.c
> index 2229e26..5878ffe 100644
> --- a/block/backup.c
> +++ b/block/backup.c
> @@ -474,13 +474,16 @@ static void coroutine_fn backup_run(void *opaque)
>      block_job_defer_to_main_loop(&job->common, backup_complete, data);
>  }
>
> -void backup_start(const char *job_id, BlockDriverState *bs,
> -                  BlockDriverState *target, int64_t speed,
> -                  MirrorSyncMode sync_mode, BdrvDirtyBitmap *sync_bitmap,
> -                  BlockdevOnError on_source_error,
> -                  BlockdevOnError on_target_error,
> -                  BlockCompletionFunc *cb, void *opaque,
> -                  BlockJobTxn *txn, Error **errp)
> +BlockJob *backup_job_create(const char *job_id,
> +                            BlockDriverState *bs,
> +                            BlockDriverState *target,
> +                            int64_t speed,
> +                            MirrorSyncMode sync_mode,
> +                            BdrvDirtyBitmap *sync_bitmap,
> +                            BlockdevOnError on_source_error,
> +                            BlockdevOnError on_target_error,
> +                            BlockCompletionFunc *cb, void *opaque,
> +                            BlockJobTxn *txn, Error **errp)
>  {
>      int64_t len;
>      BlockDriverInfo bdi;
> @@ -492,46 +495,46 @@ void backup_start(const char *job_id, BlockDriverState *bs,
>
>      if (bs == target) {
>          error_setg(errp, "Source and target cannot be the same");
> -        return;
> +        return NULL;
>      }
>
>      if (!bdrv_is_inserted(bs)) {
>          error_setg(errp, "Device is not inserted: %s",
>                     bdrv_get_device_name(bs));
> -        return;
> +        return NULL;
>      }
>
>      if (!bdrv_is_inserted(target)) {
>          error_setg(errp, "Device is not inserted: %s",
>                     bdrv_get_device_name(target));
> -        return;
> +        return NULL;
>      }
>
>      if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_BACKUP_SOURCE, errp)) {
> -        return;
> +        return NULL;
>      }
>
>      if (bdrv_op_is_blocked(target, BLOCK_OP_TYPE_BACKUP_TARGET, errp)) {
> -        return;
> +        return NULL;
>      }
>
>      if (sync_mode == MIRROR_SYNC_MODE_INCREMENTAL) {
>          if (!sync_bitmap) {
>              error_setg(errp, "must provide a valid bitmap name for "
>                               "\"incremental\" sync mode");
> -            return;
> +            return NULL;
>          }
>
>          /* Create a new bitmap, and freeze/disable this one. */
>          if (bdrv_dirty_bitmap_create_successor(bs, sync_bitmap, errp) < 0) {
> -            return;
> +            return NULL;
>          }
>      } else if (sync_bitmap) {
>          error_setg(errp,
>                     "a sync_bitmap was provided to backup_run, "
>                     "but received an incompatible sync_mode (%s)",
>                     MirrorSyncMode_lookup[sync_mode]);
> -        return;
> +        return NULL;
>      }
>
>      len = bdrv_getlength(bs);
> @@ -578,8 +581,8 @@ void backup_start(const char *job_id, BlockDriverState *bs,
>      job->common.len = len;
>      job->common.co = qemu_coroutine_create(backup_run, job);
>      block_job_txn_add_job(txn, &job->common);
> -    block_job_start(&job->common);
> -    return;
> +
> +    return &job->common;
>
>   error:
>      if (sync_bitmap) {
> @@ -589,4 +592,6 @@ void backup_start(const char *job_id, BlockDriverState *bs,
>          blk_unref(job->target);
>          block_job_unref(&job->common);
>      }
> +
> +    return NULL;
>  }
> diff --git a/blockdev.c b/blockdev.c
> index 2161400..4b041d9 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -1838,17 +1838,17 @@ typedef struct DriveBackupState {
>      BlockJob *job;
>  } DriveBackupState;
>
> -static void do_drive_backup(const char *job_id, const char *device,
> -                            const char *target, bool has_format,
> -                            const char *format, enum MirrorSyncMode sync,
> -                            bool has_mode, enum NewImageMode mode,
> -                            bool has_speed, int64_t speed,
> -                            bool has_bitmap, const char *bitmap,
> -                            bool has_on_source_error,
> -                            BlockdevOnError on_source_error,
> -                            bool has_on_target_error,
> -                            BlockdevOnError on_target_error,
> -                            BlockJobTxn *txn, Error **errp);
> +static BlockJob *do_drive_backup(const char *job_id, const char *device,
> +                                 const char *target, bool has_format,
> +                                 const char *format, enum MirrorSyncMode sync,
> +                                 bool has_mode, enum NewImageMode mode,
> +                                 bool has_speed, int64_t speed,
> +                                 bool has_bitmap, const char *bitmap,
> +                                 bool has_on_source_error,
> +                                 BlockdevOnError on_source_error,
> +                                 bool has_on_target_error,
> +                                 BlockdevOnError on_target_error,
> +                                 BlockJobTxn *txn, Error **errp);
>
>  static void drive_backup_prepare(BlkActionState *common, Error **errp)
>  {
> @@ -1878,32 +1878,38 @@ static void drive_backup_prepare(BlkActionState *common, Error **errp)
>      bdrv_drained_begin(blk_bs(blk));
>      state->bs = blk_bs(blk);
>
> -    do_drive_backup(backup->has_job_id ? backup->job_id : NULL,
> -                    backup->device, backup->target,
> -                    backup->has_format, backup->format,
> -                    backup->sync,
> -                    backup->has_mode, backup->mode,
> -                    backup->has_speed, backup->speed,
> -                    backup->has_bitmap, backup->bitmap,
> -                    backup->has_on_source_error, backup->on_source_error,
> -                    backup->has_on_target_error, backup->on_target_error,
> -                    common->block_job_txn, &local_err);
> +    state->job = do_drive_backup(backup->has_job_id ? backup->job_id : NULL,
> +                                 backup->device, backup->target,
> +                                 backup->has_format, backup->format,
> +                                 backup->sync,
> +                                 backup->has_mode, backup->mode,
> +                                 backup->has_speed, backup->speed,
> +                                 backup->has_bitmap, backup->bitmap,
> +                                 backup->has_on_source_error,
> +                                 backup->on_source_error,
> +                                 backup->has_on_target_error,
> +                                 backup->on_target_error,
> +                                 common->block_job_txn, &local_err);
>      if (local_err) {
>          error_propagate(errp, local_err);
>          return;
>      }
> +}
>
> -    state->job = state->bs->job;
> +static void drive_backup_commit(BlkActionState *common)
> +{
> +    DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
> +    if (state->job) {
> +        block_job_start(state->job);
> +    }
>  }
>
>  static void drive_backup_abort(BlkActionState *common)
>  {
>      DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
> -    BlockDriverState *bs = state->bs;
>
> -    /* Only cancel if it's the job we started */
> -    if (bs && bs->job && bs->job == state->job) {
> -        block_job_cancel_sync(bs->job);
> +    if (state->job) {
> +        block_job_cancel_sync(state->job);
>      }

NACK, this will create problems for jobs that we never actually started. 
I'll have to think about it. Good indication this is 2.8 material.

>  }
>
> @@ -1924,14 +1930,15 @@ typedef struct BlockdevBackupState {
>      AioContext *aio_context;
>  } BlockdevBackupState;
>
> -static void do_blockdev_backup(const char *job_id, const char *device,
> -                               const char *target, enum MirrorSyncMode sync,
> -                               bool has_speed, int64_t speed,
> -                               bool has_on_source_error,
> -                               BlockdevOnError on_source_error,
> -                               bool has_on_target_error,
> -                               BlockdevOnError on_target_error,
> -                               BlockJobTxn *txn, Error **errp);
> +static BlockJob *do_blockdev_backup(const char *job_id, const char *device,
> +                                    const char *target,
> +                                    enum MirrorSyncMode sync,
> +                                    bool has_speed, int64_t speed,
> +                                    bool has_on_source_error,
> +                                    BlockdevOnError on_source_error,
> +                                    bool has_on_target_error,
> +                                    BlockdevOnError on_target_error,
> +                                    BlockJobTxn *txn, Error **errp);
>
>  static void blockdev_backup_prepare(BlkActionState *common, Error **errp)
>  {
> @@ -1971,28 +1978,35 @@ static void blockdev_backup_prepare(BlkActionState *common, Error **errp)
>      state->bs = blk_bs(blk);
>      bdrv_drained_begin(state->bs);
>
> -    do_blockdev_backup(backup->has_job_id ? backup->job_id : NULL,
> -                       backup->device, backup->target, backup->sync,
> -                       backup->has_speed, backup->speed,
> -                       backup->has_on_source_error, backup->on_source_error,
> -                       backup->has_on_target_error, backup->on_target_error,
> -                       common->block_job_txn, &local_err);
> +    state->job = do_blockdev_backup(backup->has_job_id ? backup->job_id : NULL,
> +                                    backup->device, backup->target,
> +                                    backup->sync,
> +                                    backup->has_speed, backup->speed,
> +                                    backup->has_on_source_error,
> +                                    backup->on_source_error,
> +                                    backup->has_on_target_error,
> +                                    backup->on_target_error,
> +                                    common->block_job_txn, &local_err);
>      if (local_err) {
>          error_propagate(errp, local_err);
>          return;
>      }
> +}
>
> -    state->job = state->bs->job;
> +static void blockdev_backup_commit(BlkActionState *common)
> +{
> +    BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, common);
> +    if (state->job) {
> +        block_job_start(state->job);
> +    }
>  }
>
>  static void blockdev_backup_abort(BlkActionState *common)
>  {
>      BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, common);
> -    BlockDriverState *bs = state->bs;
>
> -    /* Only cancel if it's the job we started */
> -    if (bs && bs->job && bs->job == state->job) {
> -        block_job_cancel_sync(bs->job);
> +    if (state->job) {
> +        block_job_cancel_sync(state->job);
>      }
>  }
>
> @@ -2142,12 +2156,14 @@ static const BlkActionOps actions[] = {
>      [TRANSACTION_ACTION_KIND_DRIVE_BACKUP] = {
>          .instance_size = sizeof(DriveBackupState),
>          .prepare = drive_backup_prepare,
> +        .commit = drive_backup_commit,
>          .abort = drive_backup_abort,
>          .clean = drive_backup_clean,
>      },
>      [TRANSACTION_ACTION_KIND_BLOCKDEV_BACKUP] = {
>          .instance_size = sizeof(BlockdevBackupState),
>          .prepare = blockdev_backup_prepare,
> +        .commit = blockdev_backup_commit,
>          .abort = blockdev_backup_abort,
>          .clean = blockdev_backup_clean,
>      },
> @@ -3155,22 +3171,23 @@ out:
>      aio_context_release(aio_context);
>  }
>
> -static void do_drive_backup(const char *job_id, const char *device,
> -                            const char *target, bool has_format,
> -                            const char *format, enum MirrorSyncMode sync,
> -                            bool has_mode, enum NewImageMode mode,
> -                            bool has_speed, int64_t speed,
> -                            bool has_bitmap, const char *bitmap,
> -                            bool has_on_source_error,
> -                            BlockdevOnError on_source_error,
> -                            bool has_on_target_error,
> -                            BlockdevOnError on_target_error,
> -                            BlockJobTxn *txn, Error **errp)
> +static BlockJob *do_drive_backup(const char *job_id, const char *device,
> +                                 const char *target, bool has_format,
> +                                 const char *format, enum MirrorSyncMode sync,
> +                                 bool has_mode, enum NewImageMode mode,
> +                                 bool has_speed, int64_t speed,
> +                                 bool has_bitmap, const char *bitmap,
> +                                 bool has_on_source_error,
> +                                 BlockdevOnError on_source_error,
> +                                 bool has_on_target_error,
> +                                 BlockdevOnError on_target_error,
> +                                 BlockJobTxn *txn, Error **errp)
>  {
>      BlockBackend *blk;
>      BlockDriverState *bs;
>      BlockDriverState *target_bs;
>      BlockDriverState *source = NULL;
> +    BlockJob *job = NULL;
>      BdrvDirtyBitmap *bmap = NULL;
>      AioContext *aio_context;
>      QDict *options = NULL;
> @@ -3195,7 +3212,7 @@ static void do_drive_backup(const char *job_id, const char *device,
>      if (!blk) {
>          error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
>                    "Device '%s' not found", device);
> -        return;
> +        return NULL;
>      }
>
>      aio_context = blk_get_aio_context(blk);
> @@ -3276,9 +3293,9 @@ static void do_drive_backup(const char *job_id, const char *device,
>          }
>      }
>
> -    backup_start(job_id, bs, target_bs, speed, sync, bmap,
> -                 on_source_error, on_target_error,
> -                 block_job_cb, bs, txn, &local_err);
> +    job = backup_job_create(job_id, bs, target_bs, speed, sync, bmap,
> +                            on_source_error, on_target_error,
> +                            block_job_cb, bs, txn, &local_err);
>      bdrv_unref(target_bs);
>      if (local_err != NULL) {
>          error_propagate(errp, local_err);
> @@ -3287,6 +3304,7 @@ static void do_drive_backup(const char *job_id, const char *device,
>
>  out:
>      aio_context_release(aio_context);
> +    return job;
>  }
>
>  void qmp_drive_backup(bool has_job_id, const char *job_id,
> @@ -3300,13 +3318,17 @@ void qmp_drive_backup(bool has_job_id, const char *job_id,
>                        bool has_on_target_error, BlockdevOnError on_target_error,
>                        Error **errp)
>  {
> -    return do_drive_backup(has_job_id ? job_id : NULL, device, target,
> -                           has_format, format, sync,
> -                           has_mode, mode, has_speed, speed,
> -                           has_bitmap, bitmap,
> -                           has_on_source_error, on_source_error,
> -                           has_on_target_error, on_target_error,
> -                           NULL, errp);
> +    BlockJob *job;
> +    job = do_drive_backup(has_job_id ? job_id : NULL, device, target,
> +                          has_format, format, sync,
> +                          has_mode, mode, has_speed, speed,
> +                          has_bitmap, bitmap,
> +                          has_on_source_error, on_source_error,
> +                          has_on_target_error, on_target_error,
> +                          NULL, errp);
> +    if (job) {
> +        block_job_start(job);
> +    }
>  }
>
>  BlockDeviceInfoList *qmp_query_named_block_nodes(Error **errp)
> @@ -3314,20 +3336,21 @@ BlockDeviceInfoList *qmp_query_named_block_nodes(Error **errp)
>      return bdrv_named_nodes_list(errp);
>  }
>
> -void do_blockdev_backup(const char *job_id, const char *device,
> -                        const char *target, enum MirrorSyncMode sync,
> -                         bool has_speed, int64_t speed,
> -                         bool has_on_source_error,
> -                         BlockdevOnError on_source_error,
> -                         bool has_on_target_error,
> -                         BlockdevOnError on_target_error,
> -                         BlockJobTxn *txn, Error **errp)
> +BlockJob *do_blockdev_backup(const char *job_id, const char *device,
> +                             const char *target, enum MirrorSyncMode sync,
> +                             bool has_speed, int64_t speed,
> +                             bool has_on_source_error,
> +                             BlockdevOnError on_source_error,
> +                             bool has_on_target_error,
> +                             BlockdevOnError on_target_error,
> +                             BlockJobTxn *txn, Error **errp)
>  {
>      BlockBackend *blk;
>      BlockDriverState *bs;
>      BlockDriverState *target_bs;
>      Error *local_err = NULL;
>      AioContext *aio_context;
> +    BlockJob *job = NULL;
>
>      if (!has_speed) {
>          speed = 0;
> @@ -3342,7 +3365,7 @@ void do_blockdev_backup(const char *job_id, const char *device,
>      blk = blk_by_name(device);
>      if (!blk) {
>          error_setg(errp, "Device '%s' not found", device);
> -        return;
> +        return NULL;
>      }
>
>      aio_context = blk_get_aio_context(blk);
> @@ -3370,13 +3393,14 @@ void do_blockdev_backup(const char *job_id, const char *device,
>              goto out;
>          }
>      }
> -    backup_start(job_id, bs, target_bs, speed, sync, NULL, on_source_error,
> -                 on_target_error, block_job_cb, bs, txn, &local_err);
> +    job = backup_job_create(job_id, bs, target_bs, speed, sync, NULL, on_source_error,
> +                            on_target_error, block_job_cb, bs, txn, &local_err);
>      if (local_err != NULL) {
>          error_propagate(errp, local_err);
>      }
>  out:
>      aio_context_release(aio_context);
> +    return job;
>  }
>
>  void qmp_blockdev_backup(bool has_job_id, const char *job_id,
> @@ -3389,11 +3413,15 @@ void qmp_blockdev_backup(bool has_job_id, const char *job_id,
>                           BlockdevOnError on_target_error,
>                           Error **errp)
>  {
> -    do_blockdev_backup(has_job_id ? job_id : NULL, device, target,
> -                       sync, has_speed, speed,
> -                       has_on_source_error, on_source_error,
> -                       has_on_target_error, on_target_error,
> -                       NULL, errp);
> +    BlockJob *job;
> +    job = do_blockdev_backup(has_job_id ? job_id : NULL, device, target,
> +                             sync, has_speed, speed,
> +                             has_on_source_error, on_source_error,
> +                             has_on_target_error, on_target_error,
> +                             NULL, errp);
> +    if (job) {
> +        block_job_start(job);
> +    }
>  }
>
>  /* Parameter check and block job starting for drive mirroring.
> diff --git a/blockjob.c b/blockjob.c
> index 0d07abc..74b19b9 100644
> --- a/blockjob.c
> +++ b/blockjob.c
> @@ -117,9 +117,12 @@ static void block_job_detach_aio_context(void *opaque)
>      block_job_unref(job);
>  }
>
> -void *block_job_create(const char *job_id, const BlockJobDriver *driver,
> -                       BlockDriverState *bs, int64_t speed,
> -                       BlockCompletionFunc *cb, void *opaque, Error **errp)
> +void *block_job_create(const char *job_id,
> +                       const BlockJobDriver *driver,
> +                       BlockDriverState *bs,
> +                       int64_t speed,
> +                       BlockCompletionFunc *cb,
> +                       void *opaque, Error **errp)
>  {
>      BlockBackend *blk;
>      BlockJob *job;
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index 47665be..ba63a8e 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -745,7 +745,7 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
>                    void *opaque, Error **errp);
>
>  /*
> - * backup_start:
> + * backup_job_create:
>   * @job_id: The id of the newly-created job, or %NULL to use the
>   * device name of @bs.
>   * @bs: Block device to operate on.
> @@ -759,16 +759,17 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
>   * @opaque: Opaque pointer value passed to @cb.
>   * @txn: Transaction that this job is part of (may be NULL).
>   *
> - * Start a backup operation on @bs.  Clusters in @bs are written to @target
> + * Create a backup operation on @bs.  Clusters in @bs are written to @target
>   * until the job is cancelled or manually completed.
>   */
> -void backup_start(const char *job_id, BlockDriverState *bs,
> -                  BlockDriverState *target, int64_t speed,
> -                  MirrorSyncMode sync_mode, BdrvDirtyBitmap *sync_bitmap,
> -                  BlockdevOnError on_source_error,
> -                  BlockdevOnError on_target_error,
> -                  BlockCompletionFunc *cb, void *opaque,
> -                  BlockJobTxn *txn, Error **errp);
> +BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
> +                            BlockDriverState *target, int64_t speed,
> +                            MirrorSyncMode sync_mode,
> +                            BdrvDirtyBitmap *sync_bitmap,
> +                            BlockdevOnError on_source_error,
> +                            BlockdevOnError on_target_error,
> +                            BlockCompletionFunc *cb, void *opaque,
> +                            BlockJobTxn *txn, Error **errp);
>
>  void hmp_drive_add_node(Monitor *mon, const char *optstr);
>
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH 5/5] iotests: add transactional failure race test
  2016-08-08 19:09 ` [Qemu-devel] [PATCH 5/5] iotests: add transactional failure race test John Snow
@ 2016-08-10 15:19   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 15+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2016-08-10 15:19 UTC (permalink / raw)
  To: John Snow, qemu-block; +Cc: kwolf, jcody, famz, qemu-devel

On 08.08.2016 22:09, John Snow wrote:
> Add a regression test for the case found by Vladimir.
>
> Reported-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> Signed-off-by: John Snow <jsnow@redhat.com>
> ---
>   tests/qemu-iotests/124     | 91 ++++++++++++++++++++++++++++++++++++++++++++++
>   tests/qemu-iotests/124.out |  4 +-
>   2 files changed, 93 insertions(+), 2 deletions(-)
>
> diff --git a/tests/qemu-iotests/124 b/tests/qemu-iotests/124
> index de7cdbe..a4a8c83 100644
> --- a/tests/qemu-iotests/124
> +++ b/tests/qemu-iotests/124
> @@ -509,6 +509,97 @@ class TestIncrementalBackup(TestIncrementalBackupBase):
>           self.check_backups()
>   
>   
> +    def test_transaction_failure_race(self):
> +        '''Test: Verify that transactions with jobs that have no data to
> +        transfer do not cause race conditions in the cancellation of the entire
> +        transaction job group.
> +        '''
> +
> +        # Create a second drive, with pattern:
> +        drive1 = self.add_node('drive1')
> +        self.img_create(drive1['file'], drive1['fmt'])
> +        io_write_patterns(drive1['file'], (('0x14', 0, 512),
> +                                           ('0x5d', '1M', '32k'),
> +                                           ('0xcd', '32M', '124k')))
> +
> +        # Create a blkdebug interface to this img as 'drive1'
> +        result = self.vm.qmp('blockdev-add', options={
> +            'id': drive1['id'],
> +            'driver': drive1['fmt'],
> +            'file': {
> +                'driver': 'blkdebug',
> +                'image': {
> +                    'driver': 'file',
> +                    'filename': drive1['file']
> +                },
> +                'set-state': [{
> +                    'event': 'flush_to_disk',
> +                    'state': 1,
> +                    'new_state': 2
> +                }],
> +                'inject-error': [{
> +                    'event': 'read_aio',
> +                    'errno': 5,
> +                    'state': 2,
> +                    'immediately': False,
> +                    'once': True
> +                }],
> +            }
> +        })
> +        self.assert_qmp(result, 'return', {})
> +
> +        # Create bitmaps and full backups for both drives
> +        drive0 = self.drives[0]
> +        dr0bm0 = self.add_bitmap('bitmap0', drive0)
> +        dr1bm0 = self.add_bitmap('bitmap0', drive1)
> +        self.create_anchor_backup(drive0)
> +        self.create_anchor_backup(drive1)
> +        self.assert_no_active_block_jobs()
> +        self.assertFalse(self.vm.get_qmp_events(wait=False))
> +
> +        # Emulate some writes
> +        self.hmp_io_writes(drive1['id'], (('0xba', 0, 512),
> +                                          ('0xef', '16M', '256k'),
> +                                          ('0x46', '32736k', '64k')))
> +
> +        # Create incremental backup targets
> +        target0 = self.prepare_backup(dr0bm0)
> +        target1 = self.prepare_backup(dr1bm0)
> +
> +        # Ask for a new incremental backup per-each drive, expecting drive1's
> +        # backup to fail and attempt to cancel the empty drive0 job.
> +        transaction = [
> +            transaction_drive_backup(drive0['id'], target0, sync='incremental',
> +                                     format=drive0['fmt'], mode='existing',
> +                                     bitmap=dr0bm0.name),
> +            transaction_drive_backup(drive1['id'], target1, sync='incremental',
> +                                     format=drive1['fmt'], mode='existing',
> +                                     bitmap=dr1bm0.name)
> +        ]
> +        result = self.vm.qmp('transaction', actions=transaction,
> +                             properties={'completion-mode': 'grouped'} )
> +        self.assert_qmp(result, 'return', {})
> +
> +        # Observe that drive0's backup is cancelled and drive1 completes with
> +        # an error.
> +        self.wait_qmp_backup_cancelled(drive0['id'])
> +        self.assertFalse(self.wait_qmp_backup(drive1['id']))
> +        error = self.vm.event_wait('BLOCK_JOB_ERROR')
> +        self.assert_qmp(error, 'data', {'device': drive1['id'],
> +                                        'action': 'report',
> +                                        'operation': 'read'})
> +        self.assertFalse(self.vm.get_qmp_events(wait=False))
> +        self.assert_no_active_block_jobs()
> +
> +        # Delete drive0's successful target and eliminate our record of the
> +        # unsuccessful drive1 target.
> +        dr0bm0.del_target()
> +        dr1bm0.del_target()
> +
> +        self.vm.shutdown()
> +
> +
> +

This is almost a duplicate of test_transaction_failure. I think it
would be better to factor out a separate do_test_transaction_failure
helper that takes a parameter, with two thin test wrappers.


>       def test_sync_dirty_bitmap_missing(self):
>           self.assert_no_active_block_jobs()
>           self.files.append(self.err_img)
> diff --git a/tests/qemu-iotests/124.out b/tests/qemu-iotests/124.out
> index 36376be..e56cae0 100644
> --- a/tests/qemu-iotests/124.out
> +++ b/tests/qemu-iotests/124.out
> @@ -1,5 +1,5 @@
> -..........
> +...........
>   ----------------------------------------------------------------------
> -Ran 10 tests
> +Ran 11 tests
>   
>   OK


-- 
Best regards,
Vladimir


* Re: [Qemu-devel] [PATCH 0/5] blockjobs: Fix transactional race condition
       [not found]   ` <bf412a34-acb7-d3df-1710-ee7917ee2060@redhat.com>
@ 2016-09-28 12:16     ` Vladimir Sementsov-Ogievskiy
  2016-09-29 11:33       ` [Qemu-devel] [Qemu-block] " Kevin Wolf
  0 siblings, 1 reply; 15+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2016-09-28 12:16 UTC (permalink / raw)
  To: John Snow, qemu-devel, Stefan Hajnoczi, Jeff Cody,
	Denis V. Lunev, qemu block

Ok, let's discuss it on list.

On 27.09.2016 19:19, John Snow wrote:
> Replying off-list to follow your lead, but this might be a good 
> discussion to have upstream where Stefan and Jeff can see it.
>
> On 09/26/2016 11:53 AM, Vladimir Sementsov-Ogievskiy wrote:
>> Hi John!
>>
>> What about this series? If you are working on this, I can tell about
>> another related problem.
>>
>
> I got mired in trying to think about how to delete a BlockJob that 
> hasn't technically been started yet -- this patch series is unsafe in 
> that it will be unable to "cancel" a job that hasn't started yet.
>
> (And I don't want to start a job just to cancel it, that's gross.)
>
> I need to expand the job modifications here a little bit to make this 
> completely safe. I may need to create a job->started bool that allows 
> you to call a job_delete() function if and only if started is false.
>
> I'll try to send this along this week.
>
>> I'm working on an async scheme for backup. I started with something
>> simple: wrapping backup_do_cow in a coroutine and using that coroutine
>> in the full-backup loop instead of calling backup_do_cow directly.
>> This approach seems better than the one used in mirror - mirror
>> creates a new coroutine for each async read and write, whereas I
>> create one coroutine per copy (= read + write). But then I decided to
>> rewrite this as a worker-coroutine scheme. That is, instead of
>> creating a new coroutine per request and limiting their count with
>> "while (job->inflight_requests >= MAX_IN_FLIGHT) { yield }", we create
>> MAX_IN_FLIGHT worker coroutines at backup start, and they copy
>> cluster by cluster, using a common copy bitmap to synchronize. All
>> cluster copying is done by the backup workers; write notifiers only
>> change the order in which workers copy clusters. I liked all of this
>> until I got to the error handling.
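The scheme above can be modeled in miniature with Python generators standing in for coroutines (a toy sketch only; not QEMU code, and every name in it is made up):

```python
# Toy model of the worker-coroutine backup scheme described above, with
# Python generators as stand-in coroutines. Purely illustrative, not
# QEMU code. MAX_IN_FLIGHT workers share one "copy bitmap" and copy
# cluster by cluster until the bitmap is drained.
MAX_IN_FLIGHT = 3

def worker(wid, copy_bitmap, copied):
    while True:
        # Find the next dirty cluster; the shared bitmap synchronizes
        # workers so no cluster is copied twice.
        dirty = [c for c, d in copy_bitmap.items() if d]
        if not dirty:
            return                     # nothing left: worker finishes
        cluster = dirty[0]
        copy_bitmap[cluster] = False   # claim the cluster
        copied.append((wid, cluster))  # "read + write" of one cluster
        yield                          # hand control back

def run_backup(n_clusters):
    copy_bitmap = {c: True for c in range(n_clusters)}
    copied = []
    workers = [worker(i, copy_bitmap, copied) for i in range(MAX_IN_FLIGHT)]
    # Naive round-robin scheduler standing in for the main job coroutine.
    while workers:
        for w in list(workers):
            try:
                next(w)
            except StopIteration:
                workers.remove(w)
    return copied
```

In this toy model a write notifier would simply flip which dirty cluster a worker picks first; the shared bitmap still guarantees each cluster is copied exactly once.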
>>
>
> Makes sense to me so far.
>
>> I can't handle errors in the workers, because I need to call
>> block_job_error_action, which may stop the blockjob, and that seems to
>> be a problem if several workers call it simultaneously. So, workers have to
>
> What kind of problems are you seeing? Maybe we can fix them? Is it for 
> the STOP case? (Or also for IGNORE/REPORT?)
>
> Otherwise, just stash an error object in a location that the master 
> coroutine can find, pause/terminate all worker threads, and signal the 
> error.
>
>> defer error handling to the blockjob's main coroutine (in my solution
>> this coroutine does no copying and only handles blockjob duties like
>> pauses, stops, and throttling). So we end up with a queue of workers
>> waiting for the main coroutine to handle their errors. This looks too
>> complicated...
>
> Really? I don't think it's too bad. The worker detects an error and 
> creates a record of the problem, then the worker yields.
>
> The controller detects the worker has yielded and sees the record of 
> the error. The controller opts not to re-schedule the worker.
>
> The controller then instructs all remaining workers to yield. Once all 
> workers are yielded, the controller invokes block_job_error_action.
>
> From there, we can handle the error accordingly.
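The error-deferral protocol described above can be modeled the same way, with generators standing in for coroutines (illustrative only; `run_with_errors` and the `'block_job_error_action'` marker are hypothetical stand-ins):

```python
# Sketch of the error-deferral protocol described above. A worker that
# hits an error stashes a record and stops; the controller sees the
# record, drains the remaining workers instead of scheduling more work,
# and only then invokes a single stand-in for block_job_error_action.
# Illustrative only; not QEMU code.
def worker(wid, work, errors, fail_on=None):
    while work:
        item = work.pop(0)
        if item == fail_on:
            errors.append((wid, item))  # record the problem...
            return                      # ...and stop for good
        yield

def run_with_errors(items, fail_on):
    errors = []
    work = list(items)
    workers = [worker(i, work, errors, fail_on) for i in range(2)]
    actions = []
    while workers:
        for w in list(workers):
            try:
                next(w)
            except StopIteration:
                workers.remove(w)
        if errors and workers:
            # Controller saw an error record: drain remaining workers
            # rather than letting them pick up more clusters.
            for w in workers:
                w.close()
            workers = []
    if errors:
        # Exactly one call, after all workers have quiesced.
        actions.append('block_job_error_action')
    return errors, actions
```

The key property is that `block_job_error_action` is reached exactly once, and only after every worker has yielded for the last time; any work still queued when the error surfaced is abandoned.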
>
> Either way it sounds like we're going to have to manage the complexity 
> of what happens when a job with worker coroutines is paused by e.g. 
> the QMP monitor: the controller needs to send and coordinate messages 
> to the workers; so this doesn't sound like too much added complexity 
> unless I'm dreaming.
>
>> And finally I come to the idea that all this problems and complications
>> are because blockjob is not created for asynchronous work, because
>> blockjob has only one working coroutine. So the true way is to maintain
>> several working coroutines on generic blockjob layer. This will lead to
>> simpler and transparent async schemes for backup and mirror (and may be
>> other blockjobs).
>>
>> So, the question is: what about refactoring blockjobs, to move from one
>> working coroutine to several ones? Is it hard and is it possible? I do
>> not understand all block-jobs related staff..
>>
>
> I think jobs will need to remain "one coroutine, one job" for now, but 
> there's no reason why drive-backup or blockdev-backup can't just 
> create multiple jobs each if that's what they need to do. (The backup 
> job object could, in theory, just have another job pointer to a helper 
> job if it really became necessary.)
>
> Let's try to solve your design problem first and see if we can't make 
> error reporting behave nicely in a contested environment.

Hmm, ok, I'll go this way for now. But anyway, I think that in the end 
the async scheme in backup and mirror should be the same. And it should 
share the same code, so all the worker-related code should be separated 
out of backup into its own file, or folded into the generic block-job 
layer.

>
>>
>> On 08.08.2016 22:09, John Snow wrote:
>>> There are a few problems with transactional job completion right now.
>>>
>>> First, if jobs complete so quickly they complete before remaining jobs
>>> get a chance to join the transaction, the completion mode can leave 
>>> well
>>> known state and the QLIST can get corrupted and the transactional jobs
>>> can complete in batches or phases instead of all together.
>>>
>>> Second, if two or more jobs defer to the main loop at roughly the same
>>> time, it's possible for one job's cleanup to directly invoke the other
>>> job's cleanup from within the same thread, leading to a situation that
>>> will deadlock the entire transaction.
>>>
>>> Thanks to Vladimir for pointing out these modes of failure.
>>>
>>> ________________________________________________________________________________ 
>>>
>>>
>>>
>>> For convenience, this branch is available at:
>>> https://github.com/jnsnow/qemu.git  branch job-manual-start
>>> https://github.com/jnsnow/qemu/tree/job-manual-start
>>>
>>> This version is tagged job-manual-start-v1:
>>> https://github.com/jnsnow/qemu/releases/tag/job-manual-start-v1
>>>
>>> John Snow (4):
>>>    blockjob: add block_job_start
>>>    blockjob: refactor backup_start as backup_job_create
>>>    blockjob: add .clean property
>>>    iotests: add transactional failure race test
>>>
>>> Vladimir Sementsov-Ogievskiy (1):
>>>    blockjob: fix dead pointer in txn list
>>>
>>>   block/backup.c             |  50 +++++++-----
>>>   block/commit.c             |   2 +-
>>>   block/mirror.c             |   2 +-
>>>   block/stream.c             |   2 +-
>>>   blockdev.c                 | 194
>>> ++++++++++++++++++++++++++-------------------
>>>   blockjob.c                 |  24 +++++-
>>>   include/block/block_int.h  |  19 ++---
>>>   include/block/blockjob.h   |  16 ++++
>>>   tests/qemu-iotests/124     |  91 +++++++++++++++++++++
>>>   tests/qemu-iotests/124.out |   4 +-
>>>   tests/test-blockjob-txn.c  |   2 +-
>>>   11 files changed, 284 insertions(+), 122 deletions(-)
>>>
>>
>>


-- 
Best regards,
Vladimir


* Re: [Qemu-devel] [Qemu-block] [PATCH 0/5] blockjobs: Fix transactional race condition
  2016-09-28 12:16     ` Vladimir Sementsov-Ogievskiy
@ 2016-09-29 11:33       ` Kevin Wolf
  2016-09-29 12:30         ` Stefan Hajnoczi
  2016-09-29 20:58         ` John Snow
  0 siblings, 2 replies; 15+ messages in thread
From: Kevin Wolf @ 2016-09-29 11:33 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy
  Cc: John Snow, qemu-devel, Stefan Hajnoczi, Jeff Cody,
	Denis V. Lunev, qemu block

Am 28.09.2016 um 14:16 hat Vladimir Sementsov-Ogievskiy geschrieben:
> >I think jobs will need to remain "one coroutine, one job" for now,
> >but there's no reason why drive-backup or blockdev-backup can't
> >just create multiple jobs each if that's what they need to do.
> >(The backup job object could, in theory, just have another job
> >pointer to a helper job if it really became necessary.)

What's the problem with a job spawning additional coroutines internally?
Jobs already do this all the time (AIO requests are just coroutines in
disguise.)

A job is basically just the user interface for managing a background
process, so helper jobs that are managed internally rather than by the
user don't seem to make that much sense.

Kevin


* Re: [Qemu-devel] [Qemu-block] [PATCH 0/5] blockjobs: Fix transactional race condition
  2016-09-29 11:33       ` [Qemu-devel] [Qemu-block] " Kevin Wolf
@ 2016-09-29 12:30         ` Stefan Hajnoczi
  2016-09-29 20:58         ` John Snow
  1 sibling, 0 replies; 15+ messages in thread
From: Stefan Hajnoczi @ 2016-09-29 12:30 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Vladimir Sementsov-Ogievskiy, qemu block, qemu-devel,
	Stefan Hajnoczi, Denis V. Lunev

On Thu, Sep 29, 2016 at 01:33:07PM +0200, Kevin Wolf wrote:
> Am 28.09.2016 um 14:16 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > >I think jobs will need to remain "one coroutine, one job" for now,
> > >but there's no reason why drive-backup or blockdev-backup can't
> > >just create multiple jobs each if that's what they need to do.
> > >(The backup job object could, in theory, just have another job
> > >pointer to a helper job if it really became necessary.)
> 
> What's the problem with a job spawning additional coroutines internally?
> Jobs already do this all the time (AIO requests are just coroutines in
> disguise.)
> 
> A job is basically just the user interface for managing a background
> process, so helper jobs that are managed internally rather than by the
> user don't seem to make that much sense.

Internal blockjobs do have use cases.  The COLO replication code uses
them and that's good because it avoids code duplication.

While internal blockjob users don't need to be hooked up to the QMP
monitor, they might want other aspects like rate-limiting, pause/resume,
ready/complete lifecycle, etc which are implemented in blockjob.c and
not part of plain old coroutines.

A layering model that supports this is:
1. User-initiated blockjobs, exposed via QMP
2. Blockjob core API
3. mirror, commit, etc

It should be possible for internal users like COLO replication to talk
to blockjob core.  That way the functionality mentioned above is
available.

If there is a valid reason for a blockjob to be composed of other block
jobs, then it could be done properly with this layering model.  I don't
know if there are valid use cases for this though ;).
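The three-layer model above can be sketched in miniature (hypothetical names throughout; purely illustrative, not the actual blockjob.c API):

```python
# Rough sketch of the layering described above: both QMP-driven and
# internal users reach generic job features (pause/resume, lifecycle)
# through the same blockjob core layer. Names are hypothetical.
class BlockJobCore:
    """Layer 2: generic blockjob core (pause/resume, lifecycle)."""
    def __init__(self, drive_fn):
        self.drive_fn = drive_fn
        self.paused = False

    def pause(self):
        self.paused = True

    def resume(self):
        self.paused = False

    def run(self):
        # A paused job makes no progress.
        return None if self.paused else self.drive_fn()

def mirror_iteration():
    """Layer 3: a concrete job driver (mirror, commit, ...)."""
    return 'copied-one-chunk'

def qmp_start_mirror():
    """Layer 1: user-initiated job, exposed via QMP."""
    return BlockJobCore(mirror_iteration)

def colo_start_mirror():
    """Internal user (e.g. COLO) talking straight to the core layer."""
    return BlockJobCore(mirror_iteration)
```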

Stefan


* Re: [Qemu-devel] [PATCH 1/5] blockjob: fix dead pointer in txn list
  2016-08-08 19:09 ` [Qemu-devel] [PATCH 1/5] blockjob: fix dead pointer in txn list John Snow
@ 2016-09-29 18:16   ` Eric Blake
  0 siblings, 0 replies; 15+ messages in thread
From: Eric Blake @ 2016-09-29 18:16 UTC (permalink / raw)
  To: John Snow, qemu-block; +Cc: kwolf, vsementsov, famz, jcody, qemu-devel

On 08/08/2016 02:09 PM, John Snow wrote:
> From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> 
> Though it is not intended to be reached through normal circumstances,
> if we do not gracefully deconstruct the transaction QLIST, we may wind
> up with stale pointers in the list.
> 
> The rest of this series attempts to address the underlying issues,
> but this should fix list inconsistencies.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> Tested-by: John Snow <jsnow@redhat.com>
> Reviewed-by: John Snow <jsnow@redhat.com>
> [Rewrote commit message. --js]
> Signed-off-by: John Snow <jsnow@redhat.com>
> 
> Signed-off-by: John Snow <jsnow@redhat.com>

Double SoB looks interesting.  Otherwise,
Reviewed-by: Eric Blake <eblake@redhat.com>


-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org




* Re: [Qemu-devel] [Qemu-block] [PATCH 0/5] blockjobs: Fix transactional race condition
  2016-09-29 11:33       ` [Qemu-devel] [Qemu-block] " Kevin Wolf
  2016-09-29 12:30         ` Stefan Hajnoczi
@ 2016-09-29 20:58         ` John Snow
  1 sibling, 0 replies; 15+ messages in thread
From: John Snow @ 2016-09-29 20:58 UTC (permalink / raw)
  To: Kevin Wolf, Vladimir Sementsov-Ogievskiy
  Cc: qemu-devel, Stefan Hajnoczi, Jeff Cody, Denis V. Lunev, qemu block



On 09/29/2016 07:33 AM, Kevin Wolf wrote:
> Am 28.09.2016 um 14:16 hat Vladimir Sementsov-Ogievskiy geschrieben:
>>> I think jobs will need to remain "one coroutine, one job" for now,
>>> but there's no reason why drive-backup or blockdev-backup can't
>>> just create multiple jobs each if that's what they need to do.
>>> (The backup job object could, in theory, just have another job
>>> pointer to a helper job if it really became necessary.)
>
> What's the problem with a job spawning additional coroutines internally?
> Jobs already do this all the time (AIO requests are just coroutines in
> disguise.)
>

Mostly I was just pushing back against baking in multiple coroutines 
per job as the default. If you want to add them in your own extensions, 
e.g.

MultiJob : BlockJob {
    Coroutine *extra;
}

then I'm okay with that, and I realize we already do exactly that. 
Beyond that, I think it complicates the job layer a little more than 
necessary... but maybe I am being preemptively fearful.

> A job is basically just the user interface for managing a background
> process, so helper jobs that are managed internally rather than by the
> user don't seem to make that much sense.
>
> Kevin
>


end of thread, other threads:[~2016-09-29 20:59 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-08-08 19:09 [Qemu-devel] [PATCH 0/5] blockjobs: Fix transactional race condition John Snow
2016-08-08 19:09 ` [Qemu-devel] [PATCH 1/5] blockjob: fix dead pointer in txn list John Snow
2016-09-29 18:16   ` Eric Blake
2016-08-08 19:09 ` [Qemu-devel] [PATCH 2/5] blockjob: add block_job_start John Snow
2016-08-08 19:09 ` [Qemu-devel] [PATCH 3/5] blockjob: refactor backup_start as backup_job_create John Snow
2016-08-08 21:23   ` John Snow
2016-08-08 19:09 ` [Qemu-devel] [PATCH 4/5] blockjob: add .clean property John Snow
2016-08-08 19:09 ` [Qemu-devel] [PATCH 5/5] iotests: add transactional failure race test John Snow
2016-08-10 15:19   ` Vladimir Sementsov-Ogievskiy
2016-08-08 19:18 ` [Qemu-devel] [PATCH 0/5] blockjobs: Fix transactional race condition no-reply
2016-08-08 19:19 ` John Snow
     [not found] ` <57E94491.8090501@virtuozzo.com>
     [not found]   ` <bf412a34-acb7-d3df-1710-ee7917ee2060@redhat.com>
2016-09-28 12:16     ` Vladimir Sementsov-Ogievskiy
2016-09-29 11:33       ` [Qemu-devel] [Qemu-block] " Kevin Wolf
2016-09-29 12:30         ` Stefan Hajnoczi
2016-09-29 20:58         ` John Snow
