* [PATCH v3 0/8] blockdev: avoid acquiring AioContext lock twice at do_drive_backup and do_blockdev_backup
From: Sergio Lopez @ 2019-11-12 11:30 UTC
  To: qemu-devel
  Cc: Kevin Wolf, Sergio Lopez, Markus Armbruster, qemu-block, Max Reitz

do_drive_backup() acquires the AioContext lock of the corresponding
BlockDriverState. This is not a problem when it's called from
qmp_drive_backup(), but drive_backup_prepare() also acquires the lock
before calling it. The same thing happens with do_blockdev_backup()
and blockdev_backup_prepare().
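
As a minimal sketch of that call pattern (function bodies abridged;
the complete code appears in the diffs below):

    /* The transaction path takes the lock... */
    static void drive_backup_prepare(BlkActionState *common, Error **errp)
    {
        AioContext *aio_context = bdrv_get_aio_context(bs);

        aio_context_acquire(aio_context);      /* first acquisition */
        state->job = do_drive_backup(backup, common->block_job_txn,
                                     &local_err);
        aio_context_release(aio_context);
    }

    /* ...and the helper it calls takes it again. */
    static BlockJob *do_drive_backup(DriveBackup *backup, JobTxn *txn,
                                     Error **errp)
    {
        AioContext *aio_context = bdrv_get_aio_context(bs);

        aio_context_acquire(aio_context);      /* second acquisition */
        /* ... create and return the backup job ... */
        aio_context_release(aio_context);
    }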

This patch series merges do_drive_backup() with drive_backup_prepare()
and do_blockdev_backup() with blockdev_backup_prepare(), and ensures
they are only called from a transaction context. This way,
there's a single code path for both transaction requests and qmp
commands, as suggested by Kevin Wolf.

We also take this opportunity to ensure we're honoring the context
acquisition semantics required by bdrv_try_set_aio_context, as
suggested by Max Reitz.
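
In sketch form, the acquisition dance patch 8 applies around each
bdrv_try_set_aio_context() call (the old context must be held, the new
one must not; abridged from the diffs below):

    /* Honor bdrv_try_set_aio_context() context acquisition requirements. */
    old_context = bdrv_get_aio_context(target_bs);
    aio_context_release(aio_context);   /* new context must not be held */
    aio_context_acquire(old_context);   /* old context must be held     */

    ret = bdrv_try_set_aio_context(target_bs, aio_context, errp);

    aio_context_release(old_context);
    aio_context_acquire(aio_context);   /* re-take the new context for
                                           the rest of the function */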

---
Changelog:

v3:
 - Rework the whole patch series to fix the issue by consolidating all
   operations in the transaction model. (thanks Kevin Wolf)

v2:
 - Honor bdrv_try_set_aio_context() context acquisition requirements
   (thanks Max Reitz).
 - Release the context at drive_backup_prepare() instead of avoiding
   re-acquiring it at do_drive_backup(). (thanks Max Reitz)
 - Convert a single patch into a two-patch series.
---

Sergio Lopez (8):
  blockdev: merge drive_backup_prepare with do_drive_backup
  blockdev: fix coding style issues in drive_backup_prepare
  blockdev: place drive_backup_prepare with the other related
    transaction functions
  blockdev: change qmp_drive_backup to make use of transactions
  blockdev: merge blockdev_backup_prepare with do_blockdev_backup
  blockdev: place blockdev_backup_prepare with the other related
    transaction helpers
  blockdev: change qmp_blockdev_backup to make use of transactions
  blockdev: honor bdrv_try_set_aio_context() context requirements

 blockdev.c | 349 ++++++++++++++++++++++++++---------------------------
 1 file changed, 171 insertions(+), 178 deletions(-)

-- 
2.23.0




* [PATCH v3 1/8] blockdev: merge drive_backup_prepare with do_drive_backup
From: Sergio Lopez @ 2019-11-12 11:30 UTC
  To: qemu-devel
  Cc: Kevin Wolf, Sergio Lopez, Markus Armbruster, qemu-block, Max Reitz

Consolidate drive_backup_prepare() with do_drive_backup() as a first
step towards streamlining all functionality through transactions.

Signed-off-by: Sergio Lopez <slp@redhat.com>
---
 blockdev.c | 58 +++++++++++++++---------------------------------------
 1 file changed, 16 insertions(+), 42 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 8e029e9c01..5d30aff1e5 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1764,40 +1764,6 @@ typedef struct DriveBackupState {
 static BlockJob *do_drive_backup(DriveBackup *backup, JobTxn *txn,
                             Error **errp);
 
-static void drive_backup_prepare(BlkActionState *common, Error **errp)
-{
-    DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
-    BlockDriverState *bs;
-    DriveBackup *backup;
-    AioContext *aio_context;
-    Error *local_err = NULL;
-
-    assert(common->action->type == TRANSACTION_ACTION_KIND_DRIVE_BACKUP);
-    backup = common->action->u.drive_backup.data;
-
-    bs = bdrv_lookup_bs(backup->device, backup->device, errp);
-    if (!bs) {
-        return;
-    }
-
-    aio_context = bdrv_get_aio_context(bs);
-    aio_context_acquire(aio_context);
-
-    /* Paired with .clean() */
-    bdrv_drained_begin(bs);
-
-    state->bs = bs;
-
-    state->job = do_drive_backup(backup, common->block_job_txn, &local_err);
-    if (local_err) {
-        error_propagate(errp, local_err);
-        goto out;
-    }
-
-out:
-    aio_context_release(aio_context);
-}
-
 static void drive_backup_commit(BlkActionState *common)
 {
     DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
@@ -3587,13 +3553,13 @@ static BlockJob *do_backup_common(BackupCommon *backup,
     return job;
 }
 
-static BlockJob *do_drive_backup(DriveBackup *backup, JobTxn *txn,
-                                 Error **errp)
+static void drive_backup_prepare(BlkActionState *common, Error **errp)
 {
+    DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
+    DriveBackup *backup;
     BlockDriverState *bs;
     BlockDriverState *target_bs;
     BlockDriverState *source = NULL;
-    BlockJob *job = NULL;
     AioContext *aio_context;
     QDict *options;
     Error *local_err = NULL;
@@ -3601,23 +3567,29 @@ static BlockJob *do_drive_backup(DriveBackup *backup, JobTxn *txn,
     int64_t size;
     bool set_backing_hd = false;
 
+    assert(common->action->type == TRANSACTION_ACTION_KIND_DRIVE_BACKUP);
+    backup = common->action->u.drive_backup.data;
+
     if (!backup->has_mode) {
         backup->mode = NEW_IMAGE_MODE_ABSOLUTE_PATHS;
     }
 
     bs = bdrv_lookup_bs(backup->device, backup->device, errp);
     if (!bs) {
-        return NULL;
+        return;
     }
 
     if (!bs->drv) {
         error_setg(errp, "Device has no medium");
-        return NULL;
+        return;
     }
 
     aio_context = bdrv_get_aio_context(bs);
     aio_context_acquire(aio_context);
 
+    /* Paired with .clean() */
+    bdrv_drained_begin(bs);
+
     if (!backup->has_format) {
         backup->format = backup->mode == NEW_IMAGE_MODE_EXISTING ?
                          NULL : (char*) bs->drv->format_name;
@@ -3687,14 +3659,16 @@ static BlockJob *do_drive_backup(DriveBackup *backup, JobTxn *txn,
         }
     }
 
-    job = do_backup_common(qapi_DriveBackup_base(backup),
-                           bs, target_bs, aio_context, txn, errp);
+    state->bs = bs;
+
+    state->job = do_backup_common(qapi_DriveBackup_base(backup),
+                                  bs, target_bs, aio_context,
+                                  common->block_job_txn, errp);
 
 unref:
     bdrv_unref(target_bs);
 out:
     aio_context_release(aio_context);
-    return job;
 }
 
 void qmp_drive_backup(DriveBackup *arg, Error **errp)
-- 
2.23.0




* [PATCH v3 2/8] blockdev: fix coding style issues in drive_backup_prepare
From: Sergio Lopez @ 2019-11-12 11:30 UTC
  To: qemu-devel
  Cc: Kevin Wolf, Sergio Lopez, Markus Armbruster, qemu-block, Max Reitz

Fix a couple of minor coding style issues in drive_backup_prepare.

Signed-off-by: Sergio Lopez <slp@redhat.com>
---
 blockdev.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 5d30aff1e5..e8b673c5f3 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3592,7 +3592,7 @@ static void drive_backup_prepare(BlkActionState *common, Error **errp)
 
     if (!backup->has_format) {
         backup->format = backup->mode == NEW_IMAGE_MODE_EXISTING ?
-                         NULL : (char*) bs->drv->format_name;
+                         NULL : (char *) bs->drv->format_name;
     }
 
     /* Early check to avoid creating target */
@@ -3602,8 +3602,10 @@ static void drive_backup_prepare(BlkActionState *common, Error **errp)
 
     flags = bs->open_flags | BDRV_O_RDWR;
 
-    /* See if we have a backing HD we can use to create our new image
-     * on top of. */
+    /*
+     * See if we have a backing HD we can use to create our new image
+     * on top of.
+     */
     if (backup->sync == MIRROR_SYNC_MODE_TOP) {
         source = backing_bs(bs);
         if (!source) {
-- 
2.23.0




* [PATCH v3 3/8] blockdev: place drive_backup_prepare with the other related transaction functions
From: Sergio Lopez @ 2019-11-12 11:30 UTC
  To: qemu-devel
  Cc: Kevin Wolf, Sergio Lopez, Markus Armbruster, qemu-block, Max Reitz

Move drive_backup_prepare() to be side by side with the other
related transaction helper functions.

Signed-off-by: Sergio Lopez <slp@redhat.com>
---
 blockdev.c | 247 +++++++++++++++++++++++++++--------------------------
 1 file changed, 125 insertions(+), 122 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index e8b673c5f3..b32855f702 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1761,8 +1761,131 @@ typedef struct DriveBackupState {
     BlockJob *job;
 } DriveBackupState;
 
-static BlockJob *do_drive_backup(DriveBackup *backup, JobTxn *txn,
-                            Error **errp);
+static BlockJob *do_backup_common(BackupCommon *backup,
+                                  BlockDriverState *bs,
+                                  BlockDriverState *target_bs,
+                                  AioContext *aio_context,
+                                  JobTxn *txn, Error **errp);
+
+static void drive_backup_prepare(BlkActionState *common, Error **errp)
+{
+    DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
+    DriveBackup *backup;
+    BlockDriverState *bs;
+    BlockDriverState *target_bs;
+    BlockDriverState *source = NULL;
+    AioContext *aio_context;
+    QDict *options;
+    Error *local_err = NULL;
+    int flags;
+    int64_t size;
+    bool set_backing_hd = false;
+
+    assert(common->action->type == TRANSACTION_ACTION_KIND_DRIVE_BACKUP);
+    backup = common->action->u.drive_backup.data;
+
+    if (!backup->has_mode) {
+        backup->mode = NEW_IMAGE_MODE_ABSOLUTE_PATHS;
+    }
+
+    bs = bdrv_lookup_bs(backup->device, backup->device, errp);
+    if (!bs) {
+        return;
+    }
+
+    if (!bs->drv) {
+        error_setg(errp, "Device has no medium");
+        return;
+    }
+
+    aio_context = bdrv_get_aio_context(bs);
+    aio_context_acquire(aio_context);
+
+    /* Paired with .clean() */
+    bdrv_drained_begin(bs);
+
+    if (!backup->has_format) {
+        backup->format = backup->mode == NEW_IMAGE_MODE_EXISTING ?
+                         NULL : (char *) bs->drv->format_name;
+    }
+
+    /* Early check to avoid creating target */
+    if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_BACKUP_SOURCE, errp)) {
+        goto out;
+    }
+
+    flags = bs->open_flags | BDRV_O_RDWR;
+
+    /*
+     * See if we have a backing HD we can use to create our new image
+     * on top of.
+     */
+    if (backup->sync == MIRROR_SYNC_MODE_TOP) {
+        source = backing_bs(bs);
+        if (!source) {
+            backup->sync = MIRROR_SYNC_MODE_FULL;
+        }
+    }
+    if (backup->sync == MIRROR_SYNC_MODE_NONE) {
+        source = bs;
+        flags |= BDRV_O_NO_BACKING;
+        set_backing_hd = true;
+    }
+
+    size = bdrv_getlength(bs);
+    if (size < 0) {
+        error_setg_errno(errp, -size, "bdrv_getlength failed");
+        goto out;
+    }
+
+    if (backup->mode != NEW_IMAGE_MODE_EXISTING) {
+        assert(backup->format);
+        if (source) {
+            bdrv_refresh_filename(source);
+            bdrv_img_create(backup->target, backup->format, source->filename,
+                            source->drv->format_name, NULL,
+                            size, flags, false, &local_err);
+        } else {
+            bdrv_img_create(backup->target, backup->format, NULL, NULL, NULL,
+                            size, flags, false, &local_err);
+        }
+    }
+
+    if (local_err) {
+        error_propagate(errp, local_err);
+        goto out;
+    }
+
+    options = qdict_new();
+    qdict_put_str(options, "discard", "unmap");
+    qdict_put_str(options, "detect-zeroes", "unmap");
+    if (backup->format) {
+        qdict_put_str(options, "driver", backup->format);
+    }
+
+    target_bs = bdrv_open(backup->target, NULL, options, flags, errp);
+    if (!target_bs) {
+        goto out;
+    }
+
+    if (set_backing_hd) {
+        bdrv_set_backing_hd(target_bs, source, &local_err);
+        if (local_err) {
+            goto unref;
+        }
+    }
+
+    state->bs = bs;
+
+    state->job = do_backup_common(qapi_DriveBackup_base(backup),
+                                  bs, target_bs, aio_context,
+                                  common->block_job_txn, errp);
+
+unref:
+    bdrv_unref(target_bs);
+out:
+    aio_context_release(aio_context);
+}
 
 static void drive_backup_commit(BlkActionState *common)
 {
@@ -3553,126 +3676,6 @@ static BlockJob *do_backup_common(BackupCommon *backup,
     return job;
 }
 
-static void drive_backup_prepare(BlkActionState *common, Error **errp)
-{
-    DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
-    DriveBackup *backup;
-    BlockDriverState *bs;
-    BlockDriverState *target_bs;
-    BlockDriverState *source = NULL;
-    AioContext *aio_context;
-    QDict *options;
-    Error *local_err = NULL;
-    int flags;
-    int64_t size;
-    bool set_backing_hd = false;
-
-    assert(common->action->type == TRANSACTION_ACTION_KIND_DRIVE_BACKUP);
-    backup = common->action->u.drive_backup.data;
-
-    if (!backup->has_mode) {
-        backup->mode = NEW_IMAGE_MODE_ABSOLUTE_PATHS;
-    }
-
-    bs = bdrv_lookup_bs(backup->device, backup->device, errp);
-    if (!bs) {
-        return;
-    }
-
-    if (!bs->drv) {
-        error_setg(errp, "Device has no medium");
-        return;
-    }
-
-    aio_context = bdrv_get_aio_context(bs);
-    aio_context_acquire(aio_context);
-
-    /* Paired with .clean() */
-    bdrv_drained_begin(bs);
-
-    if (!backup->has_format) {
-        backup->format = backup->mode == NEW_IMAGE_MODE_EXISTING ?
-                         NULL : (char *) bs->drv->format_name;
-    }
-
-    /* Early check to avoid creating target */
-    if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_BACKUP_SOURCE, errp)) {
-        goto out;
-    }
-
-    flags = bs->open_flags | BDRV_O_RDWR;
-
-    /*
-     * See if we have a backing HD we can use to create our new image
-     * on top of.
-     */
-    if (backup->sync == MIRROR_SYNC_MODE_TOP) {
-        source = backing_bs(bs);
-        if (!source) {
-            backup->sync = MIRROR_SYNC_MODE_FULL;
-        }
-    }
-    if (backup->sync == MIRROR_SYNC_MODE_NONE) {
-        source = bs;
-        flags |= BDRV_O_NO_BACKING;
-        set_backing_hd = true;
-    }
-
-    size = bdrv_getlength(bs);
-    if (size < 0) {
-        error_setg_errno(errp, -size, "bdrv_getlength failed");
-        goto out;
-    }
-
-    if (backup->mode != NEW_IMAGE_MODE_EXISTING) {
-        assert(backup->format);
-        if (source) {
-            bdrv_refresh_filename(source);
-            bdrv_img_create(backup->target, backup->format, source->filename,
-                            source->drv->format_name, NULL,
-                            size, flags, false, &local_err);
-        } else {
-            bdrv_img_create(backup->target, backup->format, NULL, NULL, NULL,
-                            size, flags, false, &local_err);
-        }
-    }
-
-    if (local_err) {
-        error_propagate(errp, local_err);
-        goto out;
-    }
-
-    options = qdict_new();
-    qdict_put_str(options, "discard", "unmap");
-    qdict_put_str(options, "detect-zeroes", "unmap");
-    if (backup->format) {
-        qdict_put_str(options, "driver", backup->format);
-    }
-
-    target_bs = bdrv_open(backup->target, NULL, options, flags, errp);
-    if (!target_bs) {
-        goto out;
-    }
-
-    if (set_backing_hd) {
-        bdrv_set_backing_hd(target_bs, source, &local_err);
-        if (local_err) {
-            goto unref;
-        }
-    }
-
-    state->bs = bs;
-
-    state->job = do_backup_common(qapi_DriveBackup_base(backup),
-                                  bs, target_bs, aio_context,
-                                  common->block_job_txn, errp);
-
-unref:
-    bdrv_unref(target_bs);
-out:
-    aio_context_release(aio_context);
-}
-
 void qmp_drive_backup(DriveBackup *arg, Error **errp)
 {
 
-- 
2.23.0




* [PATCH v3 4/8] blockdev: change qmp_drive_backup to make use of transactions
From: Sergio Lopez @ 2019-11-12 11:30 UTC
  To: qemu-devel
  Cc: Kevin Wolf, Sergio Lopez, Markus Armbruster, qemu-block, Max Reitz

Change qmp_drive_backup() to create and start a transaction instead of
calling do_drive_backup directly.

Signed-off-by: Sergio Lopez <slp@redhat.com>
---
 blockdev.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index b32855f702..5e85fc042e 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3676,14 +3676,13 @@ static BlockJob *do_backup_common(BackupCommon *backup,
     return job;
 }
 
-void qmp_drive_backup(DriveBackup *arg, Error **errp)
+void qmp_drive_backup(DriveBackup *backup, Error **errp)
 {
-
-    BlockJob *job;
-    job = do_drive_backup(arg, NULL, errp);
-    if (job) {
-        job_start(&job->job);
-    }
+    TransactionAction action = {
+        .type = TRANSACTION_ACTION_KIND_DRIVE_BACKUP,
+        .u.drive_backup.data = backup,
+    };
+    blockdev_do_action(&action, errp);
 }
 
 BlockDeviceInfoList *qmp_query_named_block_nodes(Error **errp)
-- 
2.23.0




* [PATCH v3 5/8] blockdev: merge blockdev_backup_prepare with do_blockdev_backup
From: Sergio Lopez @ 2019-11-12 11:30 UTC
  To: qemu-devel
  Cc: Kevin Wolf, Sergio Lopez, Markus Armbruster, qemu-block, Max Reitz

Consolidate blockdev_backup_prepare() with do_blockdev_backup() as a
first step towards streamlining all functionality through
transactions.

Signed-off-by: Sergio Lopez <slp@redhat.com>
---
 blockdev.c | 64 +++++++++++++-----------------------------------------
 1 file changed, 15 insertions(+), 49 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 5e85fc042e..128707cdc6 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1940,47 +1940,6 @@ typedef struct BlockdevBackupState {
     BlockJob *job;
 } BlockdevBackupState;
 
-static BlockJob *do_blockdev_backup(BlockdevBackup *backup, JobTxn *txn,
-                                    Error **errp);
-
-static void blockdev_backup_prepare(BlkActionState *common, Error **errp)
-{
-    BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, common);
-    BlockdevBackup *backup;
-    BlockDriverState *bs, *target;
-    AioContext *aio_context;
-    Error *local_err = NULL;
-
-    assert(common->action->type == TRANSACTION_ACTION_KIND_BLOCKDEV_BACKUP);
-    backup = common->action->u.blockdev_backup.data;
-
-    bs = bdrv_lookup_bs(backup->device, backup->device, errp);
-    if (!bs) {
-        return;
-    }
-
-    target = bdrv_lookup_bs(backup->target, backup->target, errp);
-    if (!target) {
-        return;
-    }
-
-    aio_context = bdrv_get_aio_context(bs);
-    aio_context_acquire(aio_context);
-    state->bs = bs;
-
-    /* Paired with .clean() */
-    bdrv_drained_begin(state->bs);
-
-    state->job = do_blockdev_backup(backup, common->block_job_txn, &local_err);
-    if (local_err) {
-        error_propagate(errp, local_err);
-        goto out;
-    }
-
-out:
-    aio_context_release(aio_context);
-}
-
 static void blockdev_backup_commit(BlkActionState *common)
 {
     BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, common);
@@ -3695,32 +3654,39 @@ XDbgBlockGraph *qmp_x_debug_query_block_graph(Error **errp)
     return bdrv_get_xdbg_block_graph(errp);
 }
 
-BlockJob *do_blockdev_backup(BlockdevBackup *backup, JobTxn *txn,
-                             Error **errp)
+static void blockdev_backup_prepare(BlkActionState *common, Error **errp)
 {
+    BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, common);
+    BlockdevBackup *backup;
     BlockDriverState *bs;
     BlockDriverState *target_bs;
     AioContext *aio_context;
-    BlockJob *job;
+
+    assert(common->action->type == TRANSACTION_ACTION_KIND_BLOCKDEV_BACKUP);
+    backup = common->action->u.blockdev_backup.data;
 
     bs = bdrv_lookup_bs(backup->device, backup->device, errp);
     if (!bs) {
-        return NULL;
+        return;
     }
 
     target_bs = bdrv_lookup_bs(backup->target, backup->target, errp);
     if (!target_bs) {
-        return NULL;
+        return;
     }
 
     aio_context = bdrv_get_aio_context(bs);
     aio_context_acquire(aio_context);
+    state->bs = bs;
 
-    job = do_backup_common(qapi_BlockdevBackup_base(backup),
-                           bs, target_bs, aio_context, txn, errp);
+    /* Paired with .clean() */
+    bdrv_drained_begin(state->bs);
+
+    state->job = do_backup_common(qapi_BlockdevBackup_base(backup),
+                                  bs, target_bs, aio_context,
+                                  common->block_job_txn, errp);
 
     aio_context_release(aio_context);
-    return job;
 }
 
 void qmp_blockdev_backup(BlockdevBackup *arg, Error **errp)
-- 
2.23.0




* [PATCH v3 6/8] blockdev: place blockdev_backup_prepare with the other related transaction helpers
From: Sergio Lopez @ 2019-11-12 11:30 UTC
  To: qemu-devel
  Cc: Kevin Wolf, Sergio Lopez, Markus Armbruster, qemu-block, Max Reitz

Move blockdev_backup_prepare() to be side by side with the other
related transaction helper functions.

Signed-off-by: Sergio Lopez <slp@redhat.com>
---
 blockdev.c | 70 +++++++++++++++++++++++++++---------------------------
 1 file changed, 35 insertions(+), 35 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 128707cdc6..f94aaa98f0 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1940,6 +1940,41 @@ typedef struct BlockdevBackupState {
     BlockJob *job;
 } BlockdevBackupState;
 
+static void blockdev_backup_prepare(BlkActionState *common, Error **errp)
+{
+    BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, common);
+    BlockdevBackup *backup;
+    BlockDriverState *bs;
+    BlockDriverState *target_bs;
+    AioContext *aio_context;
+
+    assert(common->action->type == TRANSACTION_ACTION_KIND_BLOCKDEV_BACKUP);
+    backup = common->action->u.blockdev_backup.data;
+
+    bs = bdrv_lookup_bs(backup->device, backup->device, errp);
+    if (!bs) {
+        return;
+    }
+
+    target_bs = bdrv_lookup_bs(backup->target, backup->target, errp);
+    if (!target_bs) {
+        return;
+    }
+
+    aio_context = bdrv_get_aio_context(bs);
+    aio_context_acquire(aio_context);
+    state->bs = bs;
+
+    /* Paired with .clean() */
+    bdrv_drained_begin(state->bs);
+
+    state->job = do_backup_common(qapi_BlockdevBackup_base(backup),
+                                  bs, target_bs, aio_context,
+                                  common->block_job_txn, errp);
+
+    aio_context_release(aio_context);
+}
+
 static void blockdev_backup_commit(BlkActionState *common)
 {
     BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, common);
@@ -3654,41 +3689,6 @@ XDbgBlockGraph *qmp_x_debug_query_block_graph(Error **errp)
     return bdrv_get_xdbg_block_graph(errp);
 }
 
-static void blockdev_backup_prepare(BlkActionState *common, Error **errp)
-{
-    BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, common);
-    BlockdevBackup *backup;
-    BlockDriverState *bs;
-    BlockDriverState *target_bs;
-    AioContext *aio_context;
-
-    assert(common->action->type == TRANSACTION_ACTION_KIND_BLOCKDEV_BACKUP);
-    backup = common->action->u.blockdev_backup.data;
-
-    bs = bdrv_lookup_bs(backup->device, backup->device, errp);
-    if (!bs) {
-        return;
-    }
-
-    target_bs = bdrv_lookup_bs(backup->target, backup->target, errp);
-    if (!target_bs) {
-        return;
-    }
-
-    aio_context = bdrv_get_aio_context(bs);
-    aio_context_acquire(aio_context);
-    state->bs = bs;
-
-    /* Paired with .clean() */
-    bdrv_drained_begin(state->bs);
-
-    state->job = do_backup_common(qapi_BlockdevBackup_base(backup),
-                                  bs, target_bs, aio_context,
-                                  common->block_job_txn, errp);
-
-    aio_context_release(aio_context);
-}
-
 void qmp_blockdev_backup(BlockdevBackup *arg, Error **errp)
 {
     BlockJob *job;
-- 
2.23.0




* [PATCH v3 7/8] blockdev: change qmp_blockdev_backup to make use of transactions
From: Sergio Lopez @ 2019-11-12 11:30 UTC
  To: qemu-devel
  Cc: Kevin Wolf, Sergio Lopez, Markus Armbruster, qemu-block, Max Reitz

Change qmp_blockdev_backup() to create and start a transaction instead
of calling do_blockdev_backup() directly.

Signed-off-by: Sergio Lopez <slp@redhat.com>
---
 blockdev.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index f94aaa98f0..152a0f7454 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3689,13 +3689,13 @@ XDbgBlockGraph *qmp_x_debug_query_block_graph(Error **errp)
     return bdrv_get_xdbg_block_graph(errp);
 }
 
-void qmp_blockdev_backup(BlockdevBackup *arg, Error **errp)
+void qmp_blockdev_backup(BlockdevBackup *backup, Error **errp)
 {
-    BlockJob *job;
-    job = do_blockdev_backup(arg, NULL, errp);
-    if (job) {
-        job_start(&job->job);
-    }
+    TransactionAction action = {
+        .type = TRANSACTION_ACTION_KIND_BLOCKDEV_BACKUP,
+        .u.blockdev_backup.data = backup,
+    };
+    blockdev_do_action(&action, errp);
 }
 
 /* Parameter check and block job starting for drive mirroring.
-- 
2.23.0




* [PATCH v3 8/8] blockdev: honor bdrv_try_set_aio_context() context requirements
From: Sergio Lopez @ 2019-11-12 11:30 UTC
  To: qemu-devel
  Cc: Kevin Wolf, Sergio Lopez, Markus Armbruster, qemu-block, Max Reitz

bdrv_try_set_aio_context() requires that the old context is held, and
the new context is not held. Fix all the occurrences where it's not
done this way.

Suggested-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Sergio Lopez <slp@redhat.com>
---
 blockdev.c | 67 ++++++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 58 insertions(+), 9 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 152a0f7454..b0647d8d33 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1535,6 +1535,7 @@ static void external_snapshot_prepare(BlkActionState *common,
                              DO_UPCAST(ExternalSnapshotState, common, common);
     TransactionAction *action = common->action;
     AioContext *aio_context;
+    AioContext *old_context;
     int ret;
 
     /* 'blockdev-snapshot' and 'blockdev-snapshot-sync' have similar
@@ -1675,11 +1676,20 @@ static void external_snapshot_prepare(BlkActionState *common,
         goto out;
     }
 
+    /* Honor bdrv_try_set_aio_context() context acquisition requirements. */
+    old_context = bdrv_get_aio_context(state->new_bs);
+    aio_context_release(aio_context);
+    aio_context_acquire(old_context);
+
     ret = bdrv_try_set_aio_context(state->new_bs, aio_context, errp);
     if (ret < 0) {
-        goto out;
+        aio_context_release(old_context);
+        return;
     }
 
+    aio_context_release(old_context);
+    aio_context_acquire(aio_context);
+
     /* This removes our old bs and adds the new bs. This is an operation that
      * can fail, so we need to do it in .prepare; undoing it for abort is
      * always possible. */
@@ -1775,11 +1785,13 @@ static void drive_backup_prepare(BlkActionState *common, Error **errp)
     BlockDriverState *target_bs;
     BlockDriverState *source = NULL;
     AioContext *aio_context;
+    AioContext *old_context;
     QDict *options;
     Error *local_err = NULL;
     int flags;
     int64_t size;
     bool set_backing_hd = false;
+    int ret;
 
     assert(common->action->type == TRANSACTION_ACTION_KIND_DRIVE_BACKUP);
     backup = common->action->u.drive_backup.data;
@@ -1868,6 +1880,20 @@ static void drive_backup_prepare(BlkActionState *common, Error **errp)
         goto out;
     }
 
+    /* Honor bdrv_try_set_aio_context() context acquisition requirements. */
+    old_context = bdrv_get_aio_context(target_bs);
+    aio_context_release(aio_context);
+    aio_context_acquire(old_context);
+
+    ret = bdrv_try_set_aio_context(target_bs, aio_context, errp);
+    if (ret < 0) {
+        aio_context_release(old_context);
+        return;
+     }
+
+    aio_context_release(old_context);
+    aio_context_acquire(aio_context);
+
     if (set_backing_hd) {
         bdrv_set_backing_hd(target_bs, source, &local_err);
         if (local_err) {
@@ -1947,6 +1973,8 @@ static void blockdev_backup_prepare(BlkActionState *common, Error **errp)
     BlockDriverState *bs;
     BlockDriverState *target_bs;
     AioContext *aio_context;
+    AioContext *old_context;
+    int ret;
 
     assert(common->action->type == TRANSACTION_ACTION_KIND_BLOCKDEV_BACKUP);
     backup = common->action->u.blockdev_backup.data;
@@ -1961,7 +1989,18 @@ static void blockdev_backup_prepare(BlkActionState *common, Error **errp)
         return;
     }
 
+    /* Honor bdrv_try_set_aio_context() context acquisition requirements. */
     aio_context = bdrv_get_aio_context(bs);
+    old_context = bdrv_get_aio_context(target_bs);
+    aio_context_acquire(old_context);
+
+    ret = bdrv_try_set_aio_context(target_bs, aio_context, errp);
+    if (ret < 0) {
+        aio_context_release(old_context);
+        return;
+    }
+
+    aio_context_release(old_context);
     aio_context_acquire(aio_context);
     state->bs = bs;
 
@@ -3562,7 +3601,6 @@ static BlockJob *do_backup_common(BackupCommon *backup,
     BlockJob *job = NULL;
     BdrvDirtyBitmap *bmap = NULL;
     int job_flags = JOB_DEFAULT;
-    int ret;
 
     if (!backup->has_speed) {
         backup->speed = 0;
@@ -3586,11 +3624,6 @@ static BlockJob *do_backup_common(BackupCommon *backup,
         backup->compress = false;
     }
 
-    ret = bdrv_try_set_aio_context(target_bs, aio_context, errp);
-    if (ret < 0) {
-        return NULL;
-    }
-
     if ((backup->sync == MIRROR_SYNC_MODE_BITMAP) ||
         (backup->sync == MIRROR_SYNC_MODE_INCREMENTAL)) {
         /* done before desugaring 'incremental' to print the right message */
@@ -3825,6 +3858,7 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
     BlockDriverState *bs;
     BlockDriverState *source, *target_bs;
     AioContext *aio_context;
+    AioContext *old_context;
     BlockMirrorBackingMode backing_mode;
     Error *local_err = NULL;
     QDict *options = NULL;
@@ -3937,12 +3971,22 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
                    (arg->mode == NEW_IMAGE_MODE_EXISTING ||
                     !bdrv_has_zero_init(target_bs)));
 
+
+    /* Honor bdrv_try_set_aio_context() context acquisition requirements. */
+    old_context = bdrv_get_aio_context(target_bs);
+    aio_context_release(aio_context);
+    aio_context_acquire(old_context);
+
     ret = bdrv_try_set_aio_context(target_bs, aio_context, errp);
     if (ret < 0) {
         bdrv_unref(target_bs);
-        goto out;
+        aio_context_release(old_context);
+        return;
     }
 
+    aio_context_release(old_context);
+    aio_context_acquire(aio_context);
+
     blockdev_mirror_common(arg->has_job_id ? arg->job_id : NULL, bs, target_bs,
                            arg->has_replaces, arg->replaces, arg->sync,
                            backing_mode, zero_target,
@@ -3984,6 +4028,7 @@ void qmp_blockdev_mirror(bool has_job_id, const char *job_id,
     BlockDriverState *bs;
     BlockDriverState *target_bs;
     AioContext *aio_context;
+    AioContext *old_context;
     BlockMirrorBackingMode backing_mode = MIRROR_LEAVE_BACKING_CHAIN;
     Error *local_err = NULL;
     bool zero_target;
@@ -4001,14 +4046,18 @@ void qmp_blockdev_mirror(bool has_job_id, const char *job_id,
 
     zero_target = (sync == MIRROR_SYNC_MODE_FULL);
 
+    /* Honor bdrv_try_set_aio_context() context acquisition requirements. */
+    old_context = bdrv_get_aio_context(target_bs);
     aio_context = bdrv_get_aio_context(bs);
-    aio_context_acquire(aio_context);
+    aio_context_acquire(old_context);
 
     ret = bdrv_try_set_aio_context(target_bs, aio_context, errp);
     if (ret < 0) {
         goto out;
     }
 
+    aio_context_acquire(aio_context);
+
     blockdev_mirror_common(has_job_id ? job_id : NULL, bs, target_bs,
                            has_replaces, replaces, sync, backing_mode,
                            zero_target, has_speed, speed,
-- 
2.23.0




* Re: [PATCH v3 0/8] blockdev: avoid acquiring AioContext lock twice at do_drive_backup and do_blockdev_backup
From: no-reply @ 2019-11-12 22:49 UTC
  To: slp; +Cc: kwolf, slp, qemu-block, armbru, qemu-devel, mreitz

Patchew URL: https://patchew.org/QEMU/20191112113012.71136-1-slp@redhat.com/



Hi,

This series failed the docker-quick@centos7 build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-centos7 V=1 NETWORK=1
time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===

  TEST    iotest-qcow2: 268
Failures: 141
Failed 1 of 108 iotests
make: *** [check-tests/check-block.sh] Error 1
Traceback (most recent call last):
  File "./tests/docker/docker.py", line 662, in <module>
    sys.exit(main())
---
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=5e0a4e7f97154a93b182d709969b9417', '-u', '1003', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew2/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-6a9_8q0n/src/docker-src.2019-11-12-17.38.46.26027:/var/tmp/qemu:z,ro', 'qemu:centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit status 2.
filter=--filter=label=com.qemu.instance.uuid=5e0a4e7f97154a93b182d709969b9417
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-6a9_8q0n/src'
make: *** [docker-run-test-quick@centos7] Error 2

real    10m57.839s
user    0m8.062s


The full log is available at
http://patchew.org/logs/20191112113012.71136-1-slp@redhat.com/testing.docker-quick@centos7/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com


* Re: [PATCH v3 0/8] blockdev: avoid acquiring AioContext lock twice at do_drive_backup and do_blockdev_backup
From: Sergio Lopez @ 2019-11-13  9:14 UTC
  To: qemu-devel; +Cc: kwolf, armbru, qemu-block, mreitz


no-reply@patchew.org writes:

> Patchew URL: https://patchew.org/QEMU/20191112113012.71136-1-slp@redhat.com/
>
>
>
> Hi,
>
> This series failed the docker-quick@centos7 build test. Please find the testing commands and
> their output below. If you have Docker installed, you can probably reproduce it
> locally.
>
> === TEST SCRIPT BEGIN ===
> #!/bin/bash
> make docker-image-centos7 V=1 NETWORK=1
> time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
> === TEST SCRIPT END ===
>
>   TEST    iotest-qcow2: 268
> Failures: 141

Hm... 141 didn't fail in my test machine. I'm going to have a look.

Sergio.

> Failed 1 of 108 iotests
> make: *** [check-tests/check-block.sh] Error 1
> Traceback (most recent call last):
>   File "./tests/docker/docker.py", line 662, in <module>
>     sys.exit(main())
> ---
>     raise CalledProcessError(retcode, cmd)
> subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=5e0a4e7f97154a93b182d709969b9417', '-u', '1003', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew2/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-6a9_8q0n/src/docker-src.2019-11-12-17.38.46.26027:/var/tmp/qemu:z,ro', 'qemu:centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit status 2.
> filter=--filter=label=com.qemu.instance.uuid=5e0a4e7f97154a93b182d709969b9417
> make[1]: *** [docker-run] Error 1
> make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-6a9_8q0n/src'
> make: *** [docker-run-test-quick@centos7] Error 2
>
> real    10m57.839s
> user    0m8.062s
>
>
> The full log is available at
> http://patchew.org/logs/20191112113012.71136-1-slp@redhat.com/testing.docker-quick@centos7/?type=message.
> ---
> Email generated automatically by Patchew [https://patchew.org/].
> Please send your feedback to patchew-devel@redhat.com




* Re: [PATCH v3 0/8] blockdev: avoid acquiring AioContext lock twice at do_drive_backup and do_blockdev_backup
From: Sergio Lopez @ 2019-11-13 13:24 UTC
  To: qemu-devel; +Cc: kwolf, armbru, qemu-block, mreitz


Sergio Lopez <slp@redhat.com> writes:

> no-reply@patchew.org writes:
>
>> Patchew URL: https://patchew.org/QEMU/20191112113012.71136-1-slp@redhat.com/
>>
>>
>>
>> Hi,
>>
>> This series failed the docker-quick@centos7 build test. Please find the testing commands and
>> their output below. If you have Docker installed, you can probably reproduce it
>> locally.
>>
>> === TEST SCRIPT BEGIN ===
>> #!/bin/bash
>> make docker-image-centos7 V=1 NETWORK=1
>> time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
>> === TEST SCRIPT END ===
>>
>>   TEST    iotest-qcow2: 268
>> Failures: 141
>
> Hm... 141 didn't fail in my test machine. I'm going to have a look.

So here's the output:

--- /root/qemu/tests/qemu-iotests/141.out	2019-11-12 04:43:27.651557587 -0500
+++ /root/qemu/build/tests/qemu-iotests/141.out.bad	2019-11-13 08:12:06.575967337 -0500
@@ -10,6 +10,8 @@
 Formatting 'TEST_DIR/o.IMGFMT', fmt=IMGFMT size=1048576 backing_file=TEST_DIR/t.IMGFMT backing_fmt=IMGFMT
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "job0"}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "paused", "id": "job0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job0"}}
 {"error": {"class": "GenericError", "desc": "Node 'drv0' is busy: node is used as backing hd of 'NODE_NAME'"}}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "aborting", "id": "job0"}}

Those extra lines, the "paused" and "running", are a result of the job
being done in a transaction, within a drained section.
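
As a sketch of the mechanism (not the exact signatures), draining a
node pauses any block job attached to it and resumes it when the
drained section ends, which is what produces the two extra events:

    /* Hypothetical sketch of the block job drain callbacks: */
    static void child_job_drained_begin(BdrvChild *c)
    {
        BlockJob *job = c->opaque;
        job_pause(&job->job);    /* job transitions to "paused"  */
    }

    static void child_job_drained_end(BdrvChild *c)
    {
        BlockJob *job = c->opaque;
        job_resume(&job->job);   /* job transitions to "running" */
    }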

We can update 141.out, but now I'm wondering, was it safe creating the
job at do_drive_backup() outside of a drained section, as
qmp_drive_backup was doing?

Do you think there may be any potential drawbacks as a result of always
doing it now inside a drained section?

Thanks,
Sergio.

> Sergio.
>
>> Failed 1 of 108 iotests
>> make: *** [check-tests/check-block.sh] Error 1
>> Traceback (most recent call last):
>>   File "./tests/docker/docker.py", line 662, in <module>
>>     sys.exit(main())
>> ---
>>     raise CalledProcessError(retcode, cmd)
>> subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=5e0a4e7f97154a93b182d709969b9417', '-u', '1003', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew2/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-6a9_8q0n/src/docker-src.2019-11-12-17.38.46.26027:/var/tmp/qemu:z,ro', 'qemu:centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit status 2.
>> filter=--filter=label=com.qemu.instance.uuid=5e0a4e7f97154a93b182d709969b9417
>> make[1]: *** [docker-run] Error 1
>> make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-6a9_8q0n/src'
>> make: *** [docker-run-test-quick@centos7] Error 2
>>
>> real    10m57.839s
>> user    0m8.062s
>>
>>
>> The full log is available at
>> http://patchew.org/logs/20191112113012.71136-1-slp@redhat.com/testing.docker-quick@centos7/?type=message.
>> ---
>> Email generated automatically by Patchew [https://patchew.org/].
>> Please send your feedback to patchew-devel@redhat.com




* Re: [PATCH v3 1/8] blockdev: merge drive_backup_prepare with do_drive_backup
From: Max Reitz @ 2019-11-19  9:14 UTC
  To: Sergio Lopez, qemu-devel; +Cc: Kevin Wolf, Markus Armbruster, qemu-block


On 12.11.19 12:30, Sergio Lopez wrote:
> Consolidate drive_backup_prepare() with do_drive_backup() as a first
> step towards streamlining all functionality through transactions.
> 
> Signed-off-by: Sergio Lopez <slp@redhat.com>
> ---
>  blockdev.c | 58 +++++++++++++++---------------------------------------
>  1 file changed, 16 insertions(+), 42 deletions(-)

qemu no longer compiles with this patch applied, for two reasons:

(1) actions[TRANSACTION_ACTION_KIND_DRIVE_BACKUP].prepare is initialized
with drive_backup_prepare on line 2175, but this patch merges it into
do_drive_backup(), which is only on line 3556, so we need a forward
declaration.

(2) qmp_drive_backup() calls do_drive_backup(), but that function no
longer exists.
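
For (1), something along these lines would be needed before the
actions[] table (a sketch; exact placement is up to the final code):

    static void drive_backup_prepare(BlkActionState *common, Error **errp);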

Max



* Re: [PATCH v3 0/8] blockdev: avoid acquiring AioContext lock twice at do_drive_backup and do_blockdev_backup
From: Max Reitz @ 2019-11-19  9:36 UTC
  To: Sergio Lopez, qemu-devel; +Cc: kwolf, armbru, qemu-block



On 13.11.19 14:24, Sergio Lopez wrote:
> 
> Sergio Lopez <slp@redhat.com> writes:
> 
>> no-reply@patchew.org writes:
>>
>>> Patchew URL: https://patchew.org/QEMU/20191112113012.71136-1-slp@redhat.com/
>>>
>>>
>>>
>>> Hi,
>>>
>>> This series failed the docker-quick@centos7 build test. Please find the testing commands and
>>> their output below. If you have Docker installed, you can probably reproduce it
>>> locally.
>>>
>>> === TEST SCRIPT BEGIN ===
>>> #!/bin/bash
>>> make docker-image-centos7 V=1 NETWORK=1
>>> time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
>>> === TEST SCRIPT END ===
>>>
>>>   TEST    iotest-qcow2: 268
>>> Failures: 141
>>
>> Hm... 141 didn't fail in my test machine. I'm going to have a look.
> 
> So here's the output:
> 
> --- /root/qemu/tests/qemu-iotests/141.out	2019-11-12 04:43:27.651557587 -0500
> +++ /root/qemu/build/tests/qemu-iotests/141.out.bad	2019-11-13 08:12:06.575967337 -0500
> @@ -10,6 +10,8 @@
>  Formatting 'TEST_DIR/o.IMGFMT', fmt=IMGFMT size=1048576 backing_file=TEST_DIR/t.IMGFMT backing_fmt=IMGFMT
>  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "job0"}}
>  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job0"}}
> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "paused", "id": "job0"}}
> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job0"}}
>  {"error": {"class": "GenericError", "desc": "Node 'drv0' is busy: node is used as backing hd of 'NODE_NAME'"}}
>  {"return": {}}
>  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "aborting", "id": "job0"}}
> 
> Those extra lines, the "paused" and "running", are a result of the job
> being done in a transaction, within a drained section.
> 
> We can update 141.out, but now I'm wondering, was it safe creating the
> job at do_drive_backup() outside of a drained section, as
> qmp_drive_backup was doing?

I think it is.  Someone needs to drain the source node before attaching
the job filter (which intercepts writes), and bdrv_backup_top_append()
does precisely this.
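
Roughly, the ordering is (a sketch, abridged; error handling omitted):

    BlockDriverState *bdrv_backup_top_append(BlockDriverState *source,
                                             ...)
    {
        ...
        bdrv_drained_begin(source);  /* quiesce in-flight writes first */
        bdrv_append(top, source, &local_err);  /* then insert the
                                                  write-intercepting
                                                  filter */
        ...
        bdrv_drained_end(source);
        return top;
    }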

If the source node is in an I/O thread, you could argue that the drain
starts later than when the user has invoked the backup command, and so
some writes might slip through.  That’s correct.  But at the same time,
it’s impossible to drain it the instant the command is received.  So
some writes might always slip through (and the drain will not stop them
either, it will just let them happen).

Therefore, I think it’s fine the way it is.

> Do you think there may be any potential drawbacks as a result of always
> doing it now inside a drained section?

Well, one drawback is clearly visible.  The job goes to paused for no
reason.

Max



* Re: [PATCH v3 0/8] blockdev: avoid acquiring AioContext lock twice at do_drive_backup and do_blockdev_backup
From: Sergio Lopez @ 2019-11-19 10:54 UTC
  To: Max Reitz; +Cc: kwolf, qemu-devel, qemu-block, armbru



Max Reitz <mreitz@redhat.com> writes:

> On 13.11.19 14:24, Sergio Lopez wrote:
>> 
>> Sergio Lopez <slp@redhat.com> writes:
>> 
>>> no-reply@patchew.org writes:
>>>
>>>> Patchew URL: https://patchew.org/QEMU/20191112113012.71136-1-slp@redhat.com/
>>>>
>>>>
>>>>
>>>> Hi,
>>>>
>>>> This series failed the docker-quick@centos7 build test. Please find the testing commands and
>>>> their output below. If you have Docker installed, you can probably reproduce it
>>>> locally.
>>>>
>>>> === TEST SCRIPT BEGIN ===
>>>> #!/bin/bash
>>>> make docker-image-centos7 V=1 NETWORK=1
>>>> time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
>>>> === TEST SCRIPT END ===
>>>>
>>>>   TEST    iotest-qcow2: 268
>>>> Failures: 141
>>>
>>> Hm... 141 didn't fail in my test machine. I'm going to have a look.
>> 
>> So here's the output:
>> 
>> --- /root/qemu/tests/qemu-iotests/141.out	2019-11-12 04:43:27.651557587 -0500
>> +++ /root/qemu/build/tests/qemu-iotests/141.out.bad	2019-11-13 08:12:06.575967337 -0500
>> @@ -10,6 +10,8 @@
>>  Formatting 'TEST_DIR/o.IMGFMT', fmt=IMGFMT size=1048576 backing_file=TEST_DIR/t.IMGFMT backing_fmt=IMGFMT
>>  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "job0"}}
>>  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job0"}}
>> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "paused", "id": "job0"}}
>> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job0"}}
>>  {"error": {"class": "GenericError", "desc": "Node 'drv0' is busy: node is used as backing hd of 'NODE_NAME'"}}
>>  {"return": {}}
>>  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "aborting", "id": "job0"}}
>> 
>> Those extra lines, the "paused" and "running", are a result of the job
>> being done in a transaction, within a drained section.
>> 
>> We can update 141.out, but now I'm wondering, was it safe creating the
>> job at do_drive_backup() outside of a drained section, as
>> qmp_drive_backup was doing?
>
> I think it is.  Someone needs to drain the source node before attaching
> the job filter (which intercepts writes), and bdrv_backup_top_append()
> does precisely this.
>
> If the source node is in an I/O thread, you could argue that the drain
> starts later than when the user has invoked the backup command, and so
> some writes might slip through.  That’s correct.  But at the same time,
> it’s impossible to drain it the instant the command is received.  So
> some writes might always slip through (and the drain will not stop them
> either, it will just let them happen).
>
> Therefore, I think it’s fine the way it is.
>
>> Do you think there may be any potential drawbacks as a result of always
>> doing it now inside a drained section?
>
> Well, one drawback is clearly visible.  The job goes to paused for no
> reason.

This is something that already happens when requesting the drive-backup
through a transaction:

{"execute":"transaction","arguments":{"actions":[{"type":"drive-backup","data":{"device":"drv0","target":"o.qcow2","sync":"full","format":"qcow2"}}]}}

I don't think it makes sense to have two different behaviors for the
same action. So we either accept the additional pause+resume iteration
for qmp_drive_backup, or we remove the drained section from the
transaction-based one.

What do you think?

Cheers,
Sergio.


* Re: [PATCH v3 0/8] blockdev: avoid acquiring AioContext lock twice at do_drive_backup and do_blockdev_backup
From: Kevin Wolf @ 2019-11-19 11:18 UTC
  To: Sergio Lopez; +Cc: armbru, qemu-devel, qemu-block, Max Reitz


On 19.11.2019 at 11:54, Sergio Lopez wrote:
> 
> Max Reitz <mreitz@redhat.com> writes:
> [...]
> >> Do you think there may be any potential drawbacks as a result of always
> >> doing it now inside a drained section?
> >
> > Well, one drawback is clearly visible.  The job goes to paused for no
> > reason.
> 
> This is something that already happens when requesting the drive-backup
> through a transaction:
> 
> {"execute":"transaction","arguments":{"actions":[{"type":"drive-backup","data":{"device":"drv0","target":"o.qcow2","sync":"full","format":"qcow2"}}]}}
> 
> I don't think it makes sense to have two different behaviors for the
> same action. So we either accept the additional pause+resume iteration
> for qmp_drive_backup, or we remove the drained section from the
> transaction based one.
> 
> What do you think?

Draining all involved nodes is necessary for transactions, because you
want a consistent backup across all involved disks. That is, you want it
to be a snapshot at the same point in time for all of them - no requests
may happen between starting backup on the first and the second disk.

For a single device operation, this requirement doesn't exist, because
there is nothing else that must happen at the same point in time.
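
For instance (a sketch only; drv1 and the target names are made up), in

  {"execute":"transaction","arguments":{"actions":[
    {"type":"drive-backup","data":{"device":"drv0","target":"b0.qcow2","sync":"full","format":"qcow2"}},
    {"type":"drive-backup","data":{"device":"drv1","target":"b1.qcow2","sync":"full","format":"qcow2"}}]}}

the drained section is what guarantees that no request slips in between
starting the job on drv0 and starting it on drv1.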

Kevin

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 0/8] blockdev: avoid acquiring AioContext lock twice at do_drive_backup and do_blockdev_backup
  2019-11-19 11:18           ` Kevin Wolf
@ 2019-11-19 11:35             ` Sergio Lopez
  2019-11-19 12:13               ` Kevin Wolf
  0 siblings, 1 reply; 19+ messages in thread
From: Sergio Lopez @ 2019-11-19 11:35 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: armbru, qemu-devel, qemu-block, Max Reitz

[-- Attachment #1: Type: text/plain, Size: 4858 bytes --]


Kevin Wolf <kwolf@redhat.com> writes:

> Am 19.11.2019 um 11:54 hat Sergio Lopez geschrieben:
>> [...]
>> I don't think it makes sense to have two different behaviors for the
>> same action. So we either accept the additional pause+resume iteration
>> for qmp_drive_backup, or we remove the drained section from the
>> transaction-based one.
>> 
>> What do you think?
>
> Draining all involved nodes is necessary for transactions, because you
> want a consistent backup across all involved disks. That is, you want it
> to be a snapshot at the same point in time for all of them - no requests
> may happen between starting backup on the first and the second disk.
>
> For a single device operation, this requirement doesn't exist, because
> there is nothing else that must happen at the same point in time.

This poses a problem with the unification strategy you suggested for qmp
commands and transactions. I guess that, if we really want to preserve
the original behavior, we can extend DriveBackup to add a flag to
indicate whether the transaction should create a drained section or not.
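
On the wire that could look something like this (the "drained" flag is
purely hypothetical here, just to illustrate the shape):

{"execute":"transaction","arguments":{"actions":[{"type":"drive-backup","data":{"device":"drv0","target":"o.qcow2","sync":"full","format":"qcow2","drained":false}}]}}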

Does this sound reasonable to you?

Thanks,
Sergio.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 0/8] blockdev: avoid acquiring AioContext lock twice at do_drive_backup and do_blockdev_backup
  2019-11-19 11:35             ` Sergio Lopez
@ 2019-11-19 12:13               ` Kevin Wolf
  2019-11-19 12:31                 ` Sergio Lopez
  0 siblings, 1 reply; 19+ messages in thread
From: Kevin Wolf @ 2019-11-19 12:13 UTC (permalink / raw)
  To: Sergio Lopez; +Cc: armbru, qemu-devel, qemu-block, Max Reitz

[-- Attachment #1: Type: text/plain, Size: 5659 bytes --]

Am 19.11.2019 um 12:35 hat Sergio Lopez geschrieben:
> 
> Kevin Wolf <kwolf@redhat.com> writes:
> 
> [...]
> > Draining all involved nodes is necessary for transactions, because you
> > want a consistent backup across all involved disks. That is, you want it
> > to be a snapshot at the same point in time for all of them - no requests
> > may happen between starting backup on the first and the second disk.
> >
> > For a single device operation, this requirement doesn't exist, because
> > there is nothing else that must happen at the same point in time.
> 
> This poses a problem with the unification strategy you suggested for qmp
> commands and transactions. I guess that, if we really want to preserve
> the original behavior, we can extend DriveBackup to add a flag to
> indicate whether the transaction should create a drained section or not.
> 
> Does this sound reasonable to you?

I think we can accept an unnecessary drain for the single-device case.
It's only minimally worse than not draining early (because, as Max said,
we'll drain the node anyway later).
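
That is, after the unification even a plain

{"execute":"drive-backup","arguments":{"device":"drv0","target":"o.qcow2","sync":"full","format":"qcow2"}}

(reusing the names from the test above) will produce the extra
paused/running event pair, and I think that's acceptable.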

I'm not sure what the code looks like, but does the job go to paused
even when it's already created inside the drained section? (As opposed
to first creating the job and then draining.) I assume that this is what
you're already doing, just double-checking.

If this is how things work, I'd just adjust the test output and explain
the change in the commit message.

Kevin

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 0/8] blockdev: avoid acquiring AioContext lock twice at do_drive_backup and do_blockdev_backup
  2019-11-19 12:13               ` Kevin Wolf
@ 2019-11-19 12:31                 ` Sergio Lopez
  0 siblings, 0 replies; 19+ messages in thread
From: Sergio Lopez @ 2019-11-19 12:31 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: armbru, qemu-devel, qemu-block, Max Reitz

[-- Attachment #1: Type: text/plain, Size: 6059 bytes --]


Kevin Wolf <kwolf@redhat.com> writes:

> Am 19.11.2019 um 12:35 hat Sergio Lopez geschrieben:
>> [...]
>> This poses a problem with the unification strategy you suggested for qmp
>> commands and transactions. I guess that, if we really want to preserve
>> the original behavior, we can extend DriveBackup to add a flag to
>> indicate whether the transaction should create a drained section or not.
>> 
>> Does this sound reasonable to you?
>
> I think we can accept an unnecessary drain for the single-device case.
> It's only minimally worse than not draining early (because, as Max said,
> we'll drain the node anyway later).
>
> I'm not sure what the code looks like, but does the job go to paused
> even when it's already created inside the drained section? (As opposed
> to first creating the job and then draining.) I assume that this is what
> you're already doing, just double-checking.

Yes, that's the case. drive_backup_prepare() calls bdrv_drained_begin()
first, and then do_backup_common(), which creates the job.
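
That's also where the extra pair in the 141 diff comes from (shown here
without timestamps):

{"event": "JOB_STATUS_CHANGE", "data": {"status": "paused", "id": "job0"}}
{"event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job0"}}

The job is attached while the node is still drained, so it is paused
right away and resumed once the drained section ends.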

> If this is how things work, I'd just adjust the test output and explain
> the change in the commit message.

OK, I'll prepare a v4 with a rework of the patchset and an update to the
test output.

Thanks,
Sergio.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2019-11-19 12:34 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-12 11:30 [PATCH v3 0/8] blockdev: avoid acquiring AioContext lock twice at do_drive_backup and do_blockdev_backup Sergio Lopez
2019-11-12 11:30 ` [PATCH v3 1/8] blockdev: merge drive_backup_prepare with do_drive_backup Sergio Lopez
2019-11-19  9:14   ` Max Reitz
2019-11-12 11:30 ` [PATCH v3 2/8] blockdev: fix coding style issues in drive_backup_prepare Sergio Lopez
2019-11-12 11:30 ` [PATCH v3 3/8] blockdev: place drive_backup_prepare with the other related transaction functions Sergio Lopez
2019-11-12 11:30 ` [PATCH v3 4/8] blockdev: change qmp_drive_backup to make use of transactions Sergio Lopez
2019-11-12 11:30 ` [PATCH v3 5/8] blockdev: merge blockdev_backup_prepare with do_blockdev_backup Sergio Lopez
2019-11-12 11:30 ` [PATCH v3 6/8] blockdev: place blockdev_backup_prepare with the other related transaction helpers Sergio Lopez
2019-11-12 11:30 ` [PATCH v3 7/8] blockdev: change qmp_blockdev_backup to make use of transactions Sergio Lopez
2019-11-12 11:30 ` [PATCH v3 8/8] blockdev: honor bdrv_try_set_aio_context() context requirements Sergio Lopez
2019-11-12 22:49 ` [PATCH v3 0/8] blockdev: avoid acquiring AioContext lock twice at do_drive_backup and do_blockdev_backup no-reply
2019-11-13  9:14   ` Sergio Lopez
2019-11-13 13:24     ` Sergio Lopez
2019-11-19  9:36       ` Max Reitz
2019-11-19 10:54         ` Sergio Lopez
2019-11-19 11:18           ` Kevin Wolf
2019-11-19 11:35             ` Sergio Lopez
2019-11-19 12:13               ` Kevin Wolf
2019-11-19 12:31                 ` Sergio Lopez

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).