qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v6 0/8] blockdev: Fix AioContext handling for various blockdev actions
@ 2020-01-08 14:31 Sergio Lopez
  2020-01-08 14:31 ` [PATCH v6 1/8] blockdev: fix coding style issues in drive_backup_prepare Sergio Lopez
                   ` (9 more replies)
  0 siblings, 10 replies; 13+ messages in thread
From: Sergio Lopez @ 2020-01-08 14:31 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, qemu-block, Sergio Lopez, Markus Armbruster,
	Max Reitz, John Snow

This patch series includes fixes for various issues related to
AioContext mismanagement for various blockdev-initiated actions.

It also adds tests for those actions when executed on drives running
on an IOThread AioContext.

---
Changelog:

v6:
 - Rename the patch series.
 - Add "block/backup-top: Don't acquire context while dropping top"
 - Add "blockdev: Acquire AioContext on dirty bitmap functions"
 - Add "blockdev: Return bs to the proper context on snapshot abort"
 - Add "iotests: Test handling of AioContexts with some blockdev
   actions" (thanks Kevin Wolf)
 - Fix context release on error. (thanks Kevin Wolf)

v5:
 - Include fix for iotest 141 in patch 2. (thanks Max Reitz)
 - Also fix iotest 185 and 219 in patch 2. (thanks Max Reitz)
 - Move error block after context acquisition/release, to we can use
   goto to bail out and release resources. (thanks Max Reitz)
 - Properly release old_context in qmp_blockdev_mirror. (thanks Max
   Reitz)

v4:
 - Unify patches 1-4 and 5-7 to avoid producing broken interim
   states. (thanks Max Reitz)
 - Include a fix for iotest 141. (thanks Kevin Wolf)

v3:
 - Rework the whole patch series to fix the issue by consolidating all
   operations in the transaction model. (thanks Kevin Wolf)

v2:
 - Honor bdrv_try_set_aio_context() context acquisition requirements
   (thanks Max Reitz).
 - Release the context at drive_backup_prepare() instead of avoiding
   re-acquiring it at do_drive_baclup(). (thanks Max Reitz)
 - Convert a single patch into a two-patch series.
---

Sergio Lopez (8):
  blockdev: fix coding style issues in drive_backup_prepare
  blockdev: unify qmp_drive_backup and drive-backup transaction paths
  blockdev: unify qmp_blockdev_backup and blockdev-backup transaction
    paths
  blockdev: honor bdrv_try_set_aio_context() context requirements
  block/backup-top: Don't acquire context while dropping top
  blockdev: Acquire AioContext on dirty bitmap functions
  blockdev: Return bs to the proper context on snapshot abort
  iotests: Test handling of AioContexts with some blockdev actions

 block/backup-top.c         |   5 -
 block/backup.c             |   3 +
 blockdev.c                 | 391 ++++++++++++++++++++-----------------
 tests/qemu-iotests/141.out |   2 +
 tests/qemu-iotests/185.out |   2 +
 tests/qemu-iotests/219     |   7 +-
 tests/qemu-iotests/219.out |   8 +
 tests/qemu-iotests/281     | 247 +++++++++++++++++++++++
 tests/qemu-iotests/281.out |   5 +
 tests/qemu-iotests/group   |   1 +
 10 files changed, 484 insertions(+), 187 deletions(-)
 create mode 100755 tests/qemu-iotests/281
 create mode 100644 tests/qemu-iotests/281.out

-- 
2.23.0



^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v6 1/8] blockdev: fix coding style issues in drive_backup_prepare
  2020-01-08 14:31 [PATCH v6 0/8] blockdev: Fix AioContext handling for various blockdev actions Sergio Lopez
@ 2020-01-08 14:31 ` Sergio Lopez
  2020-01-08 14:31 ` [PATCH v6 2/8] blockdev: unify qmp_drive_backup and drive-backup transaction paths Sergio Lopez
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Sergio Lopez @ 2020-01-08 14:31 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, qemu-block, Sergio Lopez, Markus Armbruster,
	Max Reitz, John Snow

Fix a couple of minor coding style issues in drive_backup_prepare.

Signed-off-by: Sergio Lopez <slp@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
---
 blockdev.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 8e029e9c01..553e315972 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3620,7 +3620,7 @@ static BlockJob *do_drive_backup(DriveBackup *backup, JobTxn *txn,
 
     if (!backup->has_format) {
         backup->format = backup->mode == NEW_IMAGE_MODE_EXISTING ?
-                         NULL : (char*) bs->drv->format_name;
+                         NULL : (char *) bs->drv->format_name;
     }
 
     /* Early check to avoid creating target */
@@ -3630,8 +3630,10 @@ static BlockJob *do_drive_backup(DriveBackup *backup, JobTxn *txn,
 
     flags = bs->open_flags | BDRV_O_RDWR;
 
-    /* See if we have a backing HD we can use to create our new image
-     * on top of. */
+    /*
+     * See if we have a backing HD we can use to create our new image
+     * on top of.
+     */
     if (backup->sync == MIRROR_SYNC_MODE_TOP) {
         source = backing_bs(bs);
         if (!source) {
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v6 2/8] blockdev: unify qmp_drive_backup and drive-backup transaction paths
  2020-01-08 14:31 [PATCH v6 0/8] blockdev: Fix AioContext handling for various blockdev actions Sergio Lopez
  2020-01-08 14:31 ` [PATCH v6 1/8] blockdev: fix coding style issues in drive_backup_prepare Sergio Lopez
@ 2020-01-08 14:31 ` Sergio Lopez
  2020-01-08 14:31 ` [PATCH v6 3/8] blockdev: unify qmp_blockdev_backup and blockdev-backup " Sergio Lopez
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Sergio Lopez @ 2020-01-08 14:31 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, qemu-block, Sergio Lopez, Markus Armbruster,
	Max Reitz, John Snow

Issuing a drive-backup from qmp_drive_backup takes a slightly
different path than when it's issued from a transaction. In the code,
this is manifested as some redundancy between do_drive_backup() and
drive_backup_prepare().

This change unifies both paths, merging do_drive_backup() and
drive_backup_prepare(), and changing qmp_drive_backup() to create a
transaction instead of calling do_backup_common() direcly.

As a side-effect, now qmp_drive_backup() is executed inside a drained
section, as it happens when creating a drive-backup transaction. This
change is visible from the user's perspective, as the job gets paused
and immediately resumed before starting the actual work.

Also fix tests 141, 185 and 219 to cope with the extra
JOB_STATUS_CHANGE lines.

Signed-off-by: Sergio Lopez <slp@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
---
 blockdev.c                 | 224 +++++++++++++++++--------------------
 tests/qemu-iotests/141.out |   2 +
 tests/qemu-iotests/185.out |   2 +
 tests/qemu-iotests/219     |   7 +-
 tests/qemu-iotests/219.out |   8 ++
 5 files changed, 117 insertions(+), 126 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 553e315972..5e85fc042e 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1761,39 +1761,128 @@ typedef struct DriveBackupState {
     BlockJob *job;
 } DriveBackupState;
 
-static BlockJob *do_drive_backup(DriveBackup *backup, JobTxn *txn,
-                            Error **errp);
+static BlockJob *do_backup_common(BackupCommon *backup,
+                                  BlockDriverState *bs,
+                                  BlockDriverState *target_bs,
+                                  AioContext *aio_context,
+                                  JobTxn *txn, Error **errp);
 
 static void drive_backup_prepare(BlkActionState *common, Error **errp)
 {
     DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
-    BlockDriverState *bs;
     DriveBackup *backup;
+    BlockDriverState *bs;
+    BlockDriverState *target_bs;
+    BlockDriverState *source = NULL;
     AioContext *aio_context;
+    QDict *options;
     Error *local_err = NULL;
+    int flags;
+    int64_t size;
+    bool set_backing_hd = false;
 
     assert(common->action->type == TRANSACTION_ACTION_KIND_DRIVE_BACKUP);
     backup = common->action->u.drive_backup.data;
 
+    if (!backup->has_mode) {
+        backup->mode = NEW_IMAGE_MODE_ABSOLUTE_PATHS;
+    }
+
     bs = bdrv_lookup_bs(backup->device, backup->device, errp);
     if (!bs) {
         return;
     }
 
+    if (!bs->drv) {
+        error_setg(errp, "Device has no medium");
+        return;
+    }
+
     aio_context = bdrv_get_aio_context(bs);
     aio_context_acquire(aio_context);
 
     /* Paired with .clean() */
     bdrv_drained_begin(bs);
 
-    state->bs = bs;
+    if (!backup->has_format) {
+        backup->format = backup->mode == NEW_IMAGE_MODE_EXISTING ?
+                         NULL : (char *) bs->drv->format_name;
+    }
+
+    /* Early check to avoid creating target */
+    if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_BACKUP_SOURCE, errp)) {
+        goto out;
+    }
+
+    flags = bs->open_flags | BDRV_O_RDWR;
+
+    /*
+     * See if we have a backing HD we can use to create our new image
+     * on top of.
+     */
+    if (backup->sync == MIRROR_SYNC_MODE_TOP) {
+        source = backing_bs(bs);
+        if (!source) {
+            backup->sync = MIRROR_SYNC_MODE_FULL;
+        }
+    }
+    if (backup->sync == MIRROR_SYNC_MODE_NONE) {
+        source = bs;
+        flags |= BDRV_O_NO_BACKING;
+        set_backing_hd = true;
+    }
+
+    size = bdrv_getlength(bs);
+    if (size < 0) {
+        error_setg_errno(errp, -size, "bdrv_getlength failed");
+        goto out;
+    }
+
+    if (backup->mode != NEW_IMAGE_MODE_EXISTING) {
+        assert(backup->format);
+        if (source) {
+            bdrv_refresh_filename(source);
+            bdrv_img_create(backup->target, backup->format, source->filename,
+                            source->drv->format_name, NULL,
+                            size, flags, false, &local_err);
+        } else {
+            bdrv_img_create(backup->target, backup->format, NULL, NULL, NULL,
+                            size, flags, false, &local_err);
+        }
+    }
 
-    state->job = do_drive_backup(backup, common->block_job_txn, &local_err);
     if (local_err) {
         error_propagate(errp, local_err);
         goto out;
     }
 
+    options = qdict_new();
+    qdict_put_str(options, "discard", "unmap");
+    qdict_put_str(options, "detect-zeroes", "unmap");
+    if (backup->format) {
+        qdict_put_str(options, "driver", backup->format);
+    }
+
+    target_bs = bdrv_open(backup->target, NULL, options, flags, errp);
+    if (!target_bs) {
+        goto out;
+    }
+
+    if (set_backing_hd) {
+        bdrv_set_backing_hd(target_bs, source, &local_err);
+        if (local_err) {
+            goto unref;
+        }
+    }
+
+    state->bs = bs;
+
+    state->job = do_backup_common(qapi_DriveBackup_base(backup),
+                                  bs, target_bs, aio_context,
+                                  common->block_job_txn, errp);
+
+unref:
+    bdrv_unref(target_bs);
 out:
     aio_context_release(aio_context);
 }
@@ -3587,126 +3676,13 @@ static BlockJob *do_backup_common(BackupCommon *backup,
     return job;
 }
 
-static BlockJob *do_drive_backup(DriveBackup *backup, JobTxn *txn,
-                                 Error **errp)
-{
-    BlockDriverState *bs;
-    BlockDriverState *target_bs;
-    BlockDriverState *source = NULL;
-    BlockJob *job = NULL;
-    AioContext *aio_context;
-    QDict *options;
-    Error *local_err = NULL;
-    int flags;
-    int64_t size;
-    bool set_backing_hd = false;
-
-    if (!backup->has_mode) {
-        backup->mode = NEW_IMAGE_MODE_ABSOLUTE_PATHS;
-    }
-
-    bs = bdrv_lookup_bs(backup->device, backup->device, errp);
-    if (!bs) {
-        return NULL;
-    }
-
-    if (!bs->drv) {
-        error_setg(errp, "Device has no medium");
-        return NULL;
-    }
-
-    aio_context = bdrv_get_aio_context(bs);
-    aio_context_acquire(aio_context);
-
-    if (!backup->has_format) {
-        backup->format = backup->mode == NEW_IMAGE_MODE_EXISTING ?
-                         NULL : (char *) bs->drv->format_name;
-    }
-
-    /* Early check to avoid creating target */
-    if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_BACKUP_SOURCE, errp)) {
-        goto out;
-    }
-
-    flags = bs->open_flags | BDRV_O_RDWR;
-
-    /*
-     * See if we have a backing HD we can use to create our new image
-     * on top of.
-     */
-    if (backup->sync == MIRROR_SYNC_MODE_TOP) {
-        source = backing_bs(bs);
-        if (!source) {
-            backup->sync = MIRROR_SYNC_MODE_FULL;
-        }
-    }
-    if (backup->sync == MIRROR_SYNC_MODE_NONE) {
-        source = bs;
-        flags |= BDRV_O_NO_BACKING;
-        set_backing_hd = true;
-    }
-
-    size = bdrv_getlength(bs);
-    if (size < 0) {
-        error_setg_errno(errp, -size, "bdrv_getlength failed");
-        goto out;
-    }
-
-    if (backup->mode != NEW_IMAGE_MODE_EXISTING) {
-        assert(backup->format);
-        if (source) {
-            bdrv_refresh_filename(source);
-            bdrv_img_create(backup->target, backup->format, source->filename,
-                            source->drv->format_name, NULL,
-                            size, flags, false, &local_err);
-        } else {
-            bdrv_img_create(backup->target, backup->format, NULL, NULL, NULL,
-                            size, flags, false, &local_err);
-        }
-    }
-
-    if (local_err) {
-        error_propagate(errp, local_err);
-        goto out;
-    }
-
-    options = qdict_new();
-    qdict_put_str(options, "discard", "unmap");
-    qdict_put_str(options, "detect-zeroes", "unmap");
-    if (backup->format) {
-        qdict_put_str(options, "driver", backup->format);
-    }
-
-    target_bs = bdrv_open(backup->target, NULL, options, flags, errp);
-    if (!target_bs) {
-        goto out;
-    }
-
-    if (set_backing_hd) {
-        bdrv_set_backing_hd(target_bs, source, &local_err);
-        if (local_err) {
-            goto unref;
-        }
-    }
-
-    job = do_backup_common(qapi_DriveBackup_base(backup),
-                           bs, target_bs, aio_context, txn, errp);
-
-unref:
-    bdrv_unref(target_bs);
-out:
-    aio_context_release(aio_context);
-    return job;
-}
-
-void qmp_drive_backup(DriveBackup *arg, Error **errp)
+void qmp_drive_backup(DriveBackup *backup, Error **errp)
 {
-
-    BlockJob *job;
-    job = do_drive_backup(arg, NULL, errp);
-    if (job) {
-        job_start(&job->job);
-    }
+    TransactionAction action = {
+        .type = TRANSACTION_ACTION_KIND_DRIVE_BACKUP,
+        .u.drive_backup.data = backup,
+    };
+    blockdev_do_action(&action, errp);
 }
 
 BlockDeviceInfoList *qmp_query_named_block_nodes(Error **errp)
diff --git a/tests/qemu-iotests/141.out b/tests/qemu-iotests/141.out
index 3645675ce8..263b680bdf 100644
--- a/tests/qemu-iotests/141.out
+++ b/tests/qemu-iotests/141.out
@@ -13,6 +13,8 @@ Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1048576 backing_file=TEST_DIR/m.
 Formatting 'TEST_DIR/o.IMGFMT', fmt=IMGFMT size=1048576 backing_file=TEST_DIR/t.IMGFMT backing_fmt=IMGFMT
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "job0"}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "paused", "id": "job0"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job0"}}
 {'execute': 'blockdev-del', 'arguments': {'node-name': 'drv0'}}
 {"error": {"class": "GenericError", "desc": "Node 'drv0' is busy: node is used as backing hd of 'NODE_NAME'"}}
 {'execute': 'block-job-cancel', 'arguments': {'device': 'job0'}}
diff --git a/tests/qemu-iotests/185.out b/tests/qemu-iotests/185.out
index 8379ac5854..9a3b65782b 100644
--- a/tests/qemu-iotests/185.out
+++ b/tests/qemu-iotests/185.out
@@ -65,6 +65,8 @@ Formatting 'TEST_DIR/t.qcow2.copy', fmt=qcow2 size=67108864 cluster_size=65536 l
 Formatting 'TEST_DIR/t.qcow2.copy', fmt=qcow2 size=67108864 cluster_size=65536 lazy_refcounts=off refcount_bits=16
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "disk"}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "disk"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "paused", "id": "disk"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "disk"}}
 {"return": {}}
 { 'execute': 'quit' }
 {"return": {}}
diff --git a/tests/qemu-iotests/219 b/tests/qemu-iotests/219
index e0c51662c0..655f54d881 100755
--- a/tests/qemu-iotests/219
+++ b/tests/qemu-iotests/219
@@ -63,7 +63,7 @@ def test_pause_resume(vm):
                 # logged immediately
                 iotests.log(vm.qmp('query-jobs'))
 
-def test_job_lifecycle(vm, job, job_args, has_ready=False):
+def test_job_lifecycle(vm, job, job_args, has_ready=False, is_mirror=False):
     global img_size
 
     iotests.log('')
@@ -135,6 +135,9 @@ def test_job_lifecycle(vm, job, job_args, has_ready=False):
     iotests.log('Waiting for PENDING state...')
     iotests.log(iotests.filter_qmp_event(vm.event_wait('JOB_STATUS_CHANGE')))
     iotests.log(iotests.filter_qmp_event(vm.event_wait('JOB_STATUS_CHANGE')))
+    if is_mirror:
+        iotests.log(iotests.filter_qmp_event(vm.event_wait('JOB_STATUS_CHANGE')))
+        iotests.log(iotests.filter_qmp_event(vm.event_wait('JOB_STATUS_CHANGE')))
 
     if not job_args.get('auto-finalize', True):
         # PENDING state:
@@ -218,7 +221,7 @@ with iotests.FilePath('disk.img') as disk_path, \
 
     for auto_finalize in [True, False]:
         for auto_dismiss in [True, False]:
-            test_job_lifecycle(vm, 'drive-backup', job_args={
+            test_job_lifecycle(vm, 'drive-backup', is_mirror=True, job_args={
                 'device': 'drive0-node',
                 'target': copy_path,
                 'sync': 'full',
diff --git a/tests/qemu-iotests/219.out b/tests/qemu-iotests/219.out
index 8ebd3fee60..0ea5d0b9d5 100644
--- a/tests/qemu-iotests/219.out
+++ b/tests/qemu-iotests/219.out
@@ -135,6 +135,8 @@ Pause/resume in RUNNING
 {"return": {}}
 
 Waiting for PENDING state...
+{"data": {"id": "job0", "status": "paused"}, "event": "JOB_STATUS_CHANGE", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
+{"data": {"id": "job0", "status": "running"}, "event": "JOB_STATUS_CHANGE", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
 {"data": {"id": "job0", "status": "waiting"}, "event": "JOB_STATUS_CHANGE", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
 {"data": {"id": "job0", "status": "pending"}, "event": "JOB_STATUS_CHANGE", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
 {"data": {"id": "job0", "status": "concluded"}, "event": "JOB_STATUS_CHANGE", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
@@ -186,6 +188,8 @@ Pause/resume in RUNNING
 {"return": {}}
 
 Waiting for PENDING state...
+{"data": {"id": "job0", "status": "paused"}, "event": "JOB_STATUS_CHANGE", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
+{"data": {"id": "job0", "status": "running"}, "event": "JOB_STATUS_CHANGE", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
 {"data": {"id": "job0", "status": "waiting"}, "event": "JOB_STATUS_CHANGE", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
 {"data": {"id": "job0", "status": "pending"}, "event": "JOB_STATUS_CHANGE", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
 {"data": {"id": "job0", "status": "concluded"}, "event": "JOB_STATUS_CHANGE", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
@@ -245,6 +249,8 @@ Pause/resume in RUNNING
 {"return": {}}
 
 Waiting for PENDING state...
+{"data": {"id": "job0", "status": "paused"}, "event": "JOB_STATUS_CHANGE", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
+{"data": {"id": "job0", "status": "running"}, "event": "JOB_STATUS_CHANGE", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
 {"data": {"id": "job0", "status": "waiting"}, "event": "JOB_STATUS_CHANGE", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
 {"data": {"id": "job0", "status": "pending"}, "event": "JOB_STATUS_CHANGE", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
 {"return": [{"current-progress": 4194304, "id": "job0", "status": "pending", "total-progress": 4194304, "type": "backup"}]}
@@ -304,6 +310,8 @@ Pause/resume in RUNNING
 {"return": {}}
 
 Waiting for PENDING state...
+{"data": {"id": "job0", "status": "paused"}, "event": "JOB_STATUS_CHANGE", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
+{"data": {"id": "job0", "status": "running"}, "event": "JOB_STATUS_CHANGE", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
 {"data": {"id": "job0", "status": "waiting"}, "event": "JOB_STATUS_CHANGE", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
 {"data": {"id": "job0", "status": "pending"}, "event": "JOB_STATUS_CHANGE", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
 {"return": [{"current-progress": 4194304, "id": "job0", "status": "pending", "total-progress": 4194304, "type": "backup"}]}
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v6 3/8] blockdev: unify qmp_blockdev_backup and blockdev-backup transaction paths
  2020-01-08 14:31 [PATCH v6 0/8] blockdev: Fix AioContext handling for various blockdev actions Sergio Lopez
  2020-01-08 14:31 ` [PATCH v6 1/8] blockdev: fix coding style issues in drive_backup_prepare Sergio Lopez
  2020-01-08 14:31 ` [PATCH v6 2/8] blockdev: unify qmp_drive_backup and drive-backup transaction paths Sergio Lopez
@ 2020-01-08 14:31 ` Sergio Lopez
  2020-01-08 14:31 ` [PATCH v6 4/8] blockdev: honor bdrv_try_set_aio_context() context requirements Sergio Lopez
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Sergio Lopez @ 2020-01-08 14:31 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, qemu-block, Sergio Lopez, Markus Armbruster,
	Max Reitz, John Snow

Issuing a blockdev-backup from qmp_blockdev_backup takes a slightly
different path than when it's issued from a transaction. In the code,
this is manifested as some redundancy between do_blockdev_backup() and
blockdev_backup_prepare().

This change unifies both paths, merging do_blockdev_backup() and
blockdev_backup_prepare(), and changing qmp_blockdev_backup() to
create a transaction instead of calling do_backup_common() direcly.

As a side-effect, now qmp_blockdev_backup() is executed inside a
drained section, as it happens when creating a blockdev-backup
transaction. This change is visible from the user's perspective, as
the job gets paused and immediately resumed before starting the actual
work.

Signed-off-by: Sergio Lopez <slp@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
---
 blockdev.c | 60 ++++++++++++------------------------------------------
 1 file changed, 13 insertions(+), 47 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 5e85fc042e..152a0f7454 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1940,16 +1940,13 @@ typedef struct BlockdevBackupState {
     BlockJob *job;
 } BlockdevBackupState;
 
-static BlockJob *do_blockdev_backup(BlockdevBackup *backup, JobTxn *txn,
-                                    Error **errp);
-
 static void blockdev_backup_prepare(BlkActionState *common, Error **errp)
 {
     BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, common);
     BlockdevBackup *backup;
-    BlockDriverState *bs, *target;
+    BlockDriverState *bs;
+    BlockDriverState *target_bs;
     AioContext *aio_context;
-    Error *local_err = NULL;
 
     assert(common->action->type == TRANSACTION_ACTION_KIND_BLOCKDEV_BACKUP);
     backup = common->action->u.blockdev_backup.data;
@@ -1959,8 +1956,8 @@ static void blockdev_backup_prepare(BlkActionState *common, Error **errp)
         return;
     }
 
-    target = bdrv_lookup_bs(backup->target, backup->target, errp);
-    if (!target) {
+    target_bs = bdrv_lookup_bs(backup->target, backup->target, errp);
+    if (!target_bs) {
         return;
     }
 
@@ -1971,13 +1968,10 @@ static void blockdev_backup_prepare(BlkActionState *common, Error **errp)
     /* Paired with .clean() */
     bdrv_drained_begin(state->bs);
 
-    state->job = do_blockdev_backup(backup, common->block_job_txn, &local_err);
-    if (local_err) {
-        error_propagate(errp, local_err);
-        goto out;
-    }
+    state->job = do_backup_common(qapi_BlockdevBackup_base(backup),
+                                  bs, target_bs, aio_context,
+                                  common->block_job_txn, errp);
 
-out:
     aio_context_release(aio_context);
 }
 
@@ -3695,41 +3689,13 @@ XDbgBlockGraph *qmp_x_debug_query_block_graph(Error **errp)
     return bdrv_get_xdbg_block_graph(errp);
 }
 
-BlockJob *do_blockdev_backup(BlockdevBackup *backup, JobTxn *txn,
-                             Error **errp)
+void qmp_blockdev_backup(BlockdevBackup *backup, Error **errp)
 {
-    BlockDriverState *bs;
-    BlockDriverState *target_bs;
-    AioContext *aio_context;
-    BlockJob *job;
-
-    bs = bdrv_lookup_bs(backup->device, backup->device, errp);
-    if (!bs) {
-        return NULL;
-    }
-
-    target_bs = bdrv_lookup_bs(backup->target, backup->target, errp);
-    if (!target_bs) {
-        return NULL;
-    }
-
-    aio_context = bdrv_get_aio_context(bs);
-    aio_context_acquire(aio_context);
-
-    job = do_backup_common(qapi_BlockdevBackup_base(backup),
-                           bs, target_bs, aio_context, txn, errp);
-
-    aio_context_release(aio_context);
-    return job;
-}
-
-void qmp_blockdev_backup(BlockdevBackup *arg, Error **errp)
-{
-    BlockJob *job;
-    job = do_blockdev_backup(arg, NULL, errp);
-    if (job) {
-        job_start(&job->job);
-    }
+    TransactionAction action = {
+        .type = TRANSACTION_ACTION_KIND_BLOCKDEV_BACKUP,
+        .u.blockdev_backup.data = backup,
+    };
+    blockdev_do_action(&action, errp);
 }
 
 /* Parameter check and block job starting for drive mirroring.
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v6 4/8] blockdev: honor bdrv_try_set_aio_context() context requirements
  2020-01-08 14:31 [PATCH v6 0/8] blockdev: Fix AioContext handling for various blockdev actions Sergio Lopez
                   ` (2 preceding siblings ...)
  2020-01-08 14:31 ` [PATCH v6 3/8] blockdev: unify qmp_blockdev_backup and blockdev-backup " Sergio Lopez
@ 2020-01-08 14:31 ` Sergio Lopez
  2020-01-08 14:31 ` [PATCH v6 5/8] block/backup-top: Don't acquire context while dropping top Sergio Lopez
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Sergio Lopez @ 2020-01-08 14:31 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, qemu-block, Sergio Lopez, Markus Armbruster,
	Max Reitz, John Snow

bdrv_try_set_aio_context() requires that the old context is held, and
the new context is not held. Fix all the occurrences where it's not
done this way.

Suggested-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Sergio Lopez <slp@redhat.com>
---
 blockdev.c | 68 +++++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 60 insertions(+), 8 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 152a0f7454..1dacbc20ec 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1535,6 +1535,7 @@ static void external_snapshot_prepare(BlkActionState *common,
                              DO_UPCAST(ExternalSnapshotState, common, common);
     TransactionAction *action = common->action;
     AioContext *aio_context;
+    AioContext *old_context;
     int ret;
 
     /* 'blockdev-snapshot' and 'blockdev-snapshot-sync' have similar
@@ -1675,7 +1676,16 @@ static void external_snapshot_prepare(BlkActionState *common,
         goto out;
     }
 
+    /* Honor bdrv_try_set_aio_context() context acquisition requirements. */
+    old_context = bdrv_get_aio_context(state->new_bs);
+    aio_context_release(aio_context);
+    aio_context_acquire(old_context);
+
     ret = bdrv_try_set_aio_context(state->new_bs, aio_context, errp);
+
+    aio_context_release(old_context);
+    aio_context_acquire(aio_context);
+
     if (ret < 0) {
         goto out;
     }
@@ -1775,11 +1785,13 @@ static void drive_backup_prepare(BlkActionState *common, Error **errp)
     BlockDriverState *target_bs;
     BlockDriverState *source = NULL;
     AioContext *aio_context;
+    AioContext *old_context;
     QDict *options;
     Error *local_err = NULL;
     int flags;
     int64_t size;
     bool set_backing_hd = false;
+    int ret;
 
     assert(common->action->type == TRANSACTION_ACTION_KIND_DRIVE_BACKUP);
     backup = common->action->u.drive_backup.data;
@@ -1868,6 +1880,21 @@ static void drive_backup_prepare(BlkActionState *common, Error **errp)
         goto out;
     }
 
+    /* Honor bdrv_try_set_aio_context() context acquisition requirements. */
+    old_context = bdrv_get_aio_context(target_bs);
+    aio_context_release(aio_context);
+    aio_context_acquire(old_context);
+
+    ret = bdrv_try_set_aio_context(target_bs, aio_context, errp);
+    if (ret < 0) {
+        bdrv_unref(target_bs);
+        aio_context_release(old_context);
+        return;
+    }
+
+    aio_context_release(old_context);
+    aio_context_acquire(aio_context);
+
     if (set_backing_hd) {
         bdrv_set_backing_hd(target_bs, source, &local_err);
         if (local_err) {
@@ -1947,6 +1974,8 @@ static void blockdev_backup_prepare(BlkActionState *common, Error **errp)
     BlockDriverState *bs;
     BlockDriverState *target_bs;
     AioContext *aio_context;
+    AioContext *old_context;
+    int ret;
 
     assert(common->action->type == TRANSACTION_ACTION_KIND_BLOCKDEV_BACKUP);
     backup = common->action->u.blockdev_backup.data;
@@ -1961,7 +1990,18 @@ static void blockdev_backup_prepare(BlkActionState *common, Error **errp)
         return;
     }
 
+    /* Honor bdrv_try_set_aio_context() context acquisition requirements. */
     aio_context = bdrv_get_aio_context(bs);
+    old_context = bdrv_get_aio_context(target_bs);
+    aio_context_acquire(old_context);
+
+    ret = bdrv_try_set_aio_context(target_bs, aio_context, errp);
+    if (ret < 0) {
+        aio_context_release(old_context);
+        return;
+    }
+
+    aio_context_release(old_context);
     aio_context_acquire(aio_context);
     state->bs = bs;
 
@@ -3562,7 +3602,6 @@ static BlockJob *do_backup_common(BackupCommon *backup,
     BlockJob *job = NULL;
     BdrvDirtyBitmap *bmap = NULL;
     int job_flags = JOB_DEFAULT;
-    int ret;
 
     if (!backup->has_speed) {
         backup->speed = 0;
@@ -3586,11 +3625,6 @@ static BlockJob *do_backup_common(BackupCommon *backup,
         backup->compress = false;
     }
 
-    ret = bdrv_try_set_aio_context(target_bs, aio_context, errp);
-    if (ret < 0) {
-        return NULL;
-    }
-
     if ((backup->sync == MIRROR_SYNC_MODE_BITMAP) ||
         (backup->sync == MIRROR_SYNC_MODE_INCREMENTAL)) {
         /* done before desugaring 'incremental' to print the right message */
@@ -3825,6 +3859,7 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
     BlockDriverState *bs;
     BlockDriverState *source, *target_bs;
     AioContext *aio_context;
+    AioContext *old_context;
     BlockMirrorBackingMode backing_mode;
     Error *local_err = NULL;
     QDict *options = NULL;
@@ -3937,12 +3972,22 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
                    (arg->mode == NEW_IMAGE_MODE_EXISTING ||
                     !bdrv_has_zero_init(target_bs)));
 
+
+    /* Honor bdrv_try_set_aio_context() context acquisition requirements. */
+    old_context = bdrv_get_aio_context(target_bs);
+    aio_context_release(aio_context);
+    aio_context_acquire(old_context);
+
     ret = bdrv_try_set_aio_context(target_bs, aio_context, errp);
     if (ret < 0) {
         bdrv_unref(target_bs);
-        goto out;
+        aio_context_release(old_context);
+        return;
     }
 
+    aio_context_release(old_context);
+    aio_context_acquire(aio_context);
+
     blockdev_mirror_common(arg->has_job_id ? arg->job_id : NULL, bs, target_bs,
                            arg->has_replaces, arg->replaces, arg->sync,
                            backing_mode, zero_target,
@@ -3984,6 +4029,7 @@ void qmp_blockdev_mirror(bool has_job_id, const char *job_id,
     BlockDriverState *bs;
     BlockDriverState *target_bs;
     AioContext *aio_context;
+    AioContext *old_context;
     BlockMirrorBackingMode backing_mode = MIRROR_LEAVE_BACKING_CHAIN;
     Error *local_err = NULL;
     bool zero_target;
@@ -4001,10 +4047,16 @@ void qmp_blockdev_mirror(bool has_job_id, const char *job_id,
 
     zero_target = (sync == MIRROR_SYNC_MODE_FULL);
 
+    /* Honor bdrv_try_set_aio_context() context acquisition requirements. */
+    old_context = bdrv_get_aio_context(target_bs);
     aio_context = bdrv_get_aio_context(bs);
-    aio_context_acquire(aio_context);
+    aio_context_acquire(old_context);
 
     ret = bdrv_try_set_aio_context(target_bs, aio_context, errp);
+
+    aio_context_release(old_context);
+    aio_context_acquire(aio_context);
+
     if (ret < 0) {
         goto out;
     }
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v6 5/8] block/backup-top: Don't acquire context while dropping top
  2020-01-08 14:31 [PATCH v6 0/8] blockdev: Fix AioContext handling for various blockdev actions Sergio Lopez
                   ` (3 preceding siblings ...)
  2020-01-08 14:31 ` [PATCH v6 4/8] blockdev: honor bdrv_try_set_aio_context() context requirements Sergio Lopez
@ 2020-01-08 14:31 ` Sergio Lopez
  2020-01-08 14:31 ` [PATCH v6 6/8] blockdev: Acquire AioContext on dirty bitmap functions Sergio Lopez
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Sergio Lopez @ 2020-01-08 14:31 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, qemu-block, Sergio Lopez, Markus Armbruster,
	Max Reitz, John Snow

All paths that lead to bdrv_backup_top_drop(), except for the call
from backup_clean(), imply that the BDS AioContext has already been
acquired, so doing it there too can potentially lead to QEMU hanging
on AIO_WAIT_WHILE().

An easy way to trigger this situation is by issuing a two actions
transaction, with a proper and a bogus blockdev-backup, so the second
one will trigger a rollback. This will trigger a hang with an stack
trace like this one:

 #0  0x00007fb680c75016 in __GI_ppoll (fds=0x55e74580f7c0, nfds=1, timeout=<optimized out>,
     timeout@entry=0x0, sigmask=sigmask@entry=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:39
 #1  0x000055e743386e09 in ppoll (__ss=0x0, __timeout=0x0, __nfds=<optimized out>, __fds=<optimized out>)
     at /usr/include/bits/poll2.h:77
 #2  0x000055e743386e09 in qemu_poll_ns
     (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>) at util/qemu-timer.c:336
 #3  0x000055e743388dc4 in aio_poll (ctx=0x55e7458925d0, blocking=blocking@entry=true)
     at util/aio-posix.c:669
 #4  0x000055e743305dea in bdrv_flush (bs=bs@entry=0x55e74593c0d0) at block/io.c:2878
 #5  0x000055e7432be58e in bdrv_close (bs=0x55e74593c0d0) at block.c:4017
 #6  0x000055e7432be58e in bdrv_delete (bs=<optimized out>) at block.c:4262
 #7  0x000055e7432be58e in bdrv_unref (bs=bs@entry=0x55e74593c0d0) at block.c:5644
 #8  0x000055e743316b9b in bdrv_backup_top_drop (bs=bs@entry=0x55e74593c0d0) at block/backup-top.c:273
 #9  0x000055e74331461f in backup_job_create
     (job_id=0x0, bs=bs@entry=0x55e7458d5820, target=target@entry=0x55e74589f640, speed=0, sync_mode=MIRROR_SYNC_MODE_FULL, sync_bitmap=sync_bitmap@entry=0x0, bitmap_mode=BITMAP_SYNC_MODE_ON_SUCCESS, compress=false, filter_node_name=0x0, on_source_error=BLOCKDEV_ON_ERROR_REPORT, on_target_error=BLOCKDEV_ON_ERROR_REPORT, creation_flags=0, cb=0x0, opaque=0x0, txn=0x0, errp=0x7ffddfd1efb0) at block/backup.c:478
 #10 0x000055e74315bc52 in do_backup_common
     (backup=backup@entry=0x55e746c066d0, bs=bs@entry=0x55e7458d5820, target_bs=target_bs@entry=0x55e74589f640, aio_context=aio_context@entry=0x55e7458a91e0, txn=txn@entry=0x0, errp=errp@entry=0x7ffddfd1efb0)
     at blockdev.c:3580
 #11 0x000055e74315c37c in do_blockdev_backup
     (backup=backup@entry=0x55e746c066d0, txn=0x0, errp=errp@entry=0x7ffddfd1efb0)
     at /usr/src/debug/qemu-kvm-4.2.0-2.module+el8.2.0+5135+ed3b2489.x86_64/./qapi/qapi-types-block-core.h:1492
 #12 0x000055e74315c449 in blockdev_backup_prepare (common=0x55e746a8de90, errp=0x7ffddfd1f018)
     at blockdev.c:1885
 #13 0x000055e743160152 in qmp_transaction
     (dev_list=<optimized out>, has_props=<optimized out>, props=0x55e7467fe2c0, errp=errp@entry=0x7ffddfd1f088) at blockdev.c:2340
 #14 0x000055e743287ff5 in qmp_marshal_transaction
     (args=<optimized out>, ret=<optimized out>, errp=0x7ffddfd1f0f8)
     at qapi/qapi-commands-transaction.c:44
 #15 0x000055e74333de6c in do_qmp_dispatch
     (errp=0x7ffddfd1f0f0, allow_oob=<optimized out>, request=<optimized out>, cmds=0x55e743c28d60 <qmp_commands>) at qapi/qmp-dispatch.c:132
 #16 0x000055e74333de6c in qmp_dispatch
     (cmds=0x55e743c28d60 <qmp_commands>, request=<optimized out>, allow_oob=<optimized out>)
     at qapi/qmp-dispatch.c:175
 #17 0x000055e74325c061 in monitor_qmp_dispatch (mon=0x55e745908030, req=<optimized out>)
     at monitor/qmp.c:145
 #18 0x000055e74325c6fa in monitor_qmp_bh_dispatcher (data=<optimized out>) at monitor/qmp.c:234
 #19 0x000055e743385866 in aio_bh_call (bh=0x55e745807ae0) at util/async.c:117
 #20 0x000055e743385866 in aio_bh_poll (ctx=ctx@entry=0x55e7458067a0) at util/async.c:117
 #21 0x000055e743388c54 in aio_dispatch (ctx=0x55e7458067a0) at util/aio-posix.c:459
 #22 0x000055e743385742 in aio_ctx_dispatch
     (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:260
 #23 0x00007fb68543e67d in g_main_dispatch (context=0x55e745893a40) at gmain.c:3176
 #24 0x00007fb68543e67d in g_main_context_dispatch (context=context@entry=0x55e745893a40) at gmain.c:3829
 #25 0x000055e743387d08 in glib_pollfds_poll () at util/main-loop.c:219
 #26 0x000055e743387d08 in os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:242
 #27 0x000055e743387d08 in main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:518
 #28 0x000055e74316a3c1 in main_loop () at vl.c:1828
 #29 0x000055e743016a72 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>)
     at vl.c:4504

Fix this by not acquiring the AioContext there, and ensuring all paths
leading to it have it already acquired (backup_clean()).

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1782111
Signed-off-by: Sergio Lopez <slp@redhat.com>
---
 block/backup-top.c | 5 -----
 block/backup.c     | 3 +++
 2 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/block/backup-top.c b/block/backup-top.c
index 818d3f26b4..b8d863ff08 100644
--- a/block/backup-top.c
+++ b/block/backup-top.c
@@ -255,9 +255,6 @@ append_failed:
 void bdrv_backup_top_drop(BlockDriverState *bs)
 {
     BDRVBackupTopState *s = bs->opaque;
-    AioContext *aio_context = bdrv_get_aio_context(bs);
-
-    aio_context_acquire(aio_context);
 
     bdrv_drained_begin(bs);
 
@@ -271,6 +268,4 @@ void bdrv_backup_top_drop(BlockDriverState *bs)
     bdrv_drained_end(bs);
 
     bdrv_unref(bs);
-
-    aio_context_release(aio_context);
 }
diff --git a/block/backup.c b/block/backup.c
index cf62b1a38c..1383e219f5 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -135,8 +135,11 @@ static void backup_abort(Job *job)
 static void backup_clean(Job *job)
 {
     BackupBlockJob *s = container_of(job, BackupBlockJob, common.job);
+    AioContext *aio_context = bdrv_get_aio_context(s->backup_top);
 
+    aio_context_acquire(aio_context);
     bdrv_backup_top_drop(s->backup_top);
+    aio_context_release(aio_context);
 }
 
 void backup_do_checkpoint(BlockJob *job, Error **errp)
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v6 6/8] blockdev: Acquire AioContext on dirty bitmap functions
  2020-01-08 14:31 [PATCH v6 0/8] blockdev: Fix AioContext handling for various blockdev actions Sergio Lopez
                   ` (4 preceding siblings ...)
  2020-01-08 14:31 ` [PATCH v6 5/8] block/backup-top: Don't acquire context while dropping top Sergio Lopez
@ 2020-01-08 14:31 ` Sergio Lopez
  2020-01-15 15:19   ` Kevin Wolf
  2020-01-08 14:31 ` [PATCH v6 7/8] blockdev: Return bs to the proper context on snapshot abort Sergio Lopez
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 13+ messages in thread
From: Sergio Lopez @ 2020-01-08 14:31 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, qemu-block, Sergio Lopez, Markus Armbruster,
	Max Reitz, John Snow

Dirty map addition and removal functions are not acquiring to BDS
AioContext, while they may call to code that expects it to be
acquired.

This may trigger a crash with a stack trace like this one:

 #0  0x00007f0ef146370f in __GI_raise (sig=sig@entry=6)
     at ../sysdeps/unix/sysv/linux/raise.c:50
 #1  0x00007f0ef144db25 in __GI_abort () at abort.c:79
 #2  0x0000565022294dce in error_exit
     (err=<optimized out>, msg=msg@entry=0x56502243a730 <__func__.16350> "qemu_mutex_unlock_impl") at util/qemu-thread-posix.c:36
 #3  0x00005650222950ba in qemu_mutex_unlock_impl
     (mutex=mutex@entry=0x5650244b0240, file=file@entry=0x565022439adf "util/async.c", line=line@entry=526) at util/qemu-thread-posix.c:108
 #4  0x0000565022290029 in aio_context_release
     (ctx=ctx@entry=0x5650244b01e0) at util/async.c:526
 #5  0x000056502221cd08 in bdrv_can_store_new_dirty_bitmap
     (bs=bs@entry=0x5650244dc820, name=name@entry=0x56502481d360 "bitmap1", granularity=granularity@entry=65536, errp=errp@entry=0x7fff22831718)
     at block/dirty-bitmap.c:542
 #6  0x000056502206ae53 in qmp_block_dirty_bitmap_add
     (errp=0x7fff22831718, disabled=false, has_disabled=<optimized out>, persistent=<optimized out>, has_persistent=true, granularity=65536, has_granularity=<optimized out>, name=0x56502481d360 "bitmap1", node=<optimized out>) at blockdev.c:2894
 #7  0x000056502206ae53 in qmp_block_dirty_bitmap_add
     (node=<optimized out>, name=0x56502481d360 "bitmap1", has_granularity=<optimized out>, granularity=<optimized out>, has_persistent=true, persistent=<optimized out>, has_disabled=false, disabled=false, errp=0x7fff22831718) at blockdev.c:2856
 #8  0x00005650221847a3 in qmp_marshal_block_dirty_bitmap_add
     (args=<optimized out>, ret=<optimized out>, errp=0x7fff22831798)
     at qapi/qapi-commands-block-core.c:651
 #9  0x0000565022247e6c in do_qmp_dispatch
     (errp=0x7fff22831790, allow_oob=<optimized out>, request=<optimized out>, cmds=0x565022b32d60 <qmp_commands>) at qapi/qmp-dispatch.c:132
 #10 0x0000565022247e6c in qmp_dispatch
     (cmds=0x565022b32d60 <qmp_commands>, request=<optimized out>, allow_oob=<optimized out>) at qapi/qmp-dispatch.c:175
 #11 0x0000565022166061 in monitor_qmp_dispatch
     (mon=0x56502450faa0, req=<optimized out>) at monitor/qmp.c:145
 #12 0x00005650221666fa in monitor_qmp_bh_dispatcher
     (data=<optimized out>) at monitor/qmp.c:234
 #13 0x000056502228f866 in aio_bh_call (bh=0x56502440eae0)
     at util/async.c:117
 #14 0x000056502228f866 in aio_bh_poll (ctx=ctx@entry=0x56502440d7a0)
     at util/async.c:117
 #15 0x0000565022292c54 in aio_dispatch (ctx=0x56502440d7a0)
     at util/aio-posix.c:459
 #16 0x000056502228f742 in aio_ctx_dispatch
     (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:260
 #17 0x00007f0ef5ce667d in g_main_dispatch (context=0x56502449aa40)
     at gmain.c:3176
 #18 0x00007f0ef5ce667d in g_main_context_dispatch
     (context=context@entry=0x56502449aa40) at gmain.c:3829
 #19 0x0000565022291d08 in glib_pollfds_poll () at util/main-loop.c:219
 #20 0x0000565022291d08 in os_host_main_loop_wait
     (timeout=<optimized out>) at util/main-loop.c:242
 #21 0x0000565022291d08 in main_loop_wait (nonblocking=<optimized out>)
     at util/main-loop.c:518
 #22 0x00005650220743c1 in main_loop () at vl.c:1828
 #23 0x0000565021f20a72 in main
     (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>)
     at vl.c:4504

Fix this by acquiring the AioContext at qmp_block_dirty_bitmap_add()
and qmp_block_dirty_bitmap_add().

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1782175
Signed-off-by: Sergio Lopez <slp@redhat.com>
---
 blockdev.c | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 1dacbc20ec..292961a88d 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2984,6 +2984,7 @@ void qmp_block_dirty_bitmap_add(const char *node, const char *name,
 {
     BlockDriverState *bs;
     BdrvDirtyBitmap *bitmap;
+    AioContext *aio_context;
 
     if (!name || name[0] == '\0') {
         error_setg(errp, "Bitmap name cannot be empty");
@@ -2995,11 +2996,14 @@ void qmp_block_dirty_bitmap_add(const char *node, const char *name,
         return;
     }
 
+    aio_context = bdrv_get_aio_context(bs);
+    aio_context_acquire(aio_context);
+
     if (has_granularity) {
         if (granularity < 512 || !is_power_of_2(granularity)) {
             error_setg(errp, "Granularity must be power of 2 "
                              "and at least 512");
-            return;
+            goto out;
         }
     } else {
         /* Default to cluster size, if available: */
@@ -3017,12 +3021,12 @@ void qmp_block_dirty_bitmap_add(const char *node, const char *name,
     if (persistent &&
         !bdrv_can_store_new_dirty_bitmap(bs, name, granularity, errp))
     {
-        return;
+        goto out;
     }
 
     bitmap = bdrv_create_dirty_bitmap(bs, granularity, name, errp);
     if (bitmap == NULL) {
-        return;
+        goto out;
     }
 
     if (disabled) {
@@ -3030,6 +3034,9 @@ void qmp_block_dirty_bitmap_add(const char *node, const char *name,
     }
 
     bdrv_dirty_bitmap_set_persistence(bitmap, persistent);
+
+out:
+    aio_context_release(aio_context);
 }
 
 static BdrvDirtyBitmap *do_block_dirty_bitmap_remove(
@@ -3038,20 +3045,26 @@ static BdrvDirtyBitmap *do_block_dirty_bitmap_remove(
 {
     BlockDriverState *bs;
     BdrvDirtyBitmap *bitmap;
+    AioContext *aio_context;
 
     bitmap = block_dirty_bitmap_lookup(node, name, &bs, errp);
     if (!bitmap || !bs) {
         return NULL;
     }
 
+    aio_context = bdrv_get_aio_context(bs);
+    aio_context_acquire(aio_context);
+
     if (bdrv_dirty_bitmap_check(bitmap, BDRV_BITMAP_BUSY | BDRV_BITMAP_RO,
                                 errp)) {
+        aio_context_release(aio_context);
         return NULL;
     }
 
     if (bdrv_dirty_bitmap_get_persistence(bitmap) &&
         bdrv_remove_persistent_dirty_bitmap(bs, name, errp) < 0)
     {
+            aio_context_release(aio_context);
             return NULL;
     }
 
@@ -3063,6 +3076,7 @@ static BdrvDirtyBitmap *do_block_dirty_bitmap_remove(
         *bitmap_bs = bs;
     }
 
+    aio_context_release(aio_context);
     return release ? NULL : bitmap;
 }
 
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v6 7/8] blockdev: Return bs to the proper context on snapshot abort
  2020-01-08 14:31 [PATCH v6 0/8] blockdev: Fix AioContext handling for various blockdev actions Sergio Lopez
                   ` (5 preceding siblings ...)
  2020-01-08 14:31 ` [PATCH v6 6/8] blockdev: Acquire AioContext on dirty bitmap functions Sergio Lopez
@ 2020-01-08 14:31 ` Sergio Lopez
  2020-01-15 15:22   ` Kevin Wolf
  2020-01-08 14:31 ` [PATCH v6 8/8] iotests: Test handling of AioContexts with some blockdev actions Sergio Lopez
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 13+ messages in thread
From: Sergio Lopez @ 2020-01-08 14:31 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, qemu-block, Sergio Lopez, Markus Armbruster,
	Max Reitz, John Snow

external_snapshot_abort() calls to bdrv_set_backing_hd(), which
returns state->old_bs to the main AioContext, as it's intended to be
used then the BDS is going to be released. As that's not the case when
aborting an external snapshot, return it to the AioContext it was
before the call.

This issue can be triggered by issuing a transaction with two actions,
a proper blockdev-snapshot-sync and a bogus one, so the second will
trigger a transaction abort. This results in a crash with an stack
trace like this one:

 #0  0x00007fa1048b28df in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
 #1  0x00007fa10489ccf5 in __GI_abort () at abort.c:79
 #2  0x00007fa10489cbc9 in __assert_fail_base
     (fmt=0x7fa104a03300 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x5572240b44d8 "bdrv_get_aio_context(old_bs) == bdrv_get_aio_context(new_bs)", file=0x557224014d30 "block.c", line=2240, function=<optimized out>) at assert.c:92
 #3  0x00007fa1048aae96 in __GI___assert_fail
     (assertion=assertion@entry=0x5572240b44d8 "bdrv_get_aio_context(old_bs) == bdrv_get_aio_context(new_bs)", file=file@entry=0x557224014d30 "block.c", line=line@entry=2240, function=function@entry=0x5572240b5d60 <__PRETTY_FUNCTION__.31620> "bdrv_replace_child_noperm") at assert.c:101
 #4  0x0000557223e631f8 in bdrv_replace_child_noperm (child=0x557225b9c980, new_bs=new_bs@entry=0x557225c42e40) at block.c:2240
 #5  0x0000557223e68be7 in bdrv_replace_node (from=0x557226951a60, to=0x557225c42e40, errp=0x5572247d6138 <error_abort>) at block.c:4196
 #6  0x0000557223d069c4 in external_snapshot_abort (common=0x557225d7e170) at blockdev.c:1731
 #7  0x0000557223d069c4 in external_snapshot_abort (common=0x557225d7e170) at blockdev.c:1717
 #8  0x0000557223d09013 in qmp_transaction (dev_list=<optimized out>, has_props=<optimized out>, props=0x557225cc7d70, errp=errp@entry=0x7ffe704c0c98) at blockdev.c:2360
 #9  0x0000557223e32085 in qmp_marshal_transaction (args=<optimized out>, ret=<optimized out>, errp=0x7ffe704c0d08) at qapi/qapi-commands-transaction.c:44
 #10 0x0000557223ee798c in do_qmp_dispatch (errp=0x7ffe704c0d00, allow_oob=<optimized out>, request=<optimized out>, cmds=0x5572247d3cc0 <qmp_commands>) at qapi/qmp-dispatch.c:132
 #11 0x0000557223ee798c in qmp_dispatch (cmds=0x5572247d3cc0 <qmp_commands>, request=<optimized out>, allow_oob=<optimized out>) at qapi/qmp-dispatch.c:175
 #12 0x0000557223e06141 in monitor_qmp_dispatch (mon=0x557225c69ff0, req=<optimized out>) at monitor/qmp.c:120
 #13 0x0000557223e0678a in monitor_qmp_bh_dispatcher (data=<optimized out>) at monitor/qmp.c:209
 #14 0x0000557223f2f366 in aio_bh_call (bh=0x557225b9dc60) at util/async.c:117
 #15 0x0000557223f2f366 in aio_bh_poll (ctx=ctx@entry=0x557225b9c840) at util/async.c:117
 #16 0x0000557223f32754 in aio_dispatch (ctx=0x557225b9c840) at util/aio-posix.c:459
 #17 0x0000557223f2f242 in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:260
 #18 0x00007fa10913467d in g_main_dispatch (context=0x557225c28e80) at gmain.c:3176
 #19 0x00007fa10913467d in g_main_context_dispatch (context=context@entry=0x557225c28e80) at gmain.c:3829
 #20 0x0000557223f31808 in glib_pollfds_poll () at util/main-loop.c:219
 #21 0x0000557223f31808 in os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:242
 #22 0x0000557223f31808 in main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:518
 #23 0x0000557223d13201 in main_loop () at vl.c:1828
 #24 0x0000557223bbfb82 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4504

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1779036
Signed-off-by: Sergio Lopez <slp@redhat.com>
---
 blockdev.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/blockdev.c b/blockdev.c
index 292961a88d..cfed87f434 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1731,6 +1731,8 @@ static void external_snapshot_abort(BlkActionState *common)
     if (state->new_bs) {
         if (state->overlay_appended) {
             AioContext *aio_context;
+            AioContext *tmp_context;
+            int ret;
 
             aio_context = bdrv_get_aio_context(state->old_bs);
             aio_context_acquire(aio_context);
@@ -1738,6 +1740,25 @@ static void external_snapshot_abort(BlkActionState *common)
             bdrv_ref(state->old_bs);   /* we can't let bdrv_set_backind_hd()
                                           close state->old_bs; we need it */
             bdrv_set_backing_hd(state->new_bs, NULL, &error_abort);
+
+            /*
+             * The call to bdrv_set_backing_hd() above returns state->old_bs to
+             * the main AioContext. As we're still going to be using it, return
+             * it to the AioContext it was before.
+             */
+            tmp_context = bdrv_get_aio_context(state->old_bs);
+            if (aio_context != tmp_context) {
+                aio_context_release(aio_context);
+                aio_context_acquire(tmp_context);
+
+                ret = bdrv_try_set_aio_context(state->old_bs,
+                                               aio_context, NULL);
+                assert(ret == 0);
+
+                aio_context_release(tmp_context);
+                aio_context_acquire(aio_context);
+            }
+
             bdrv_replace_node(state->new_bs, state->old_bs, &error_abort);
             bdrv_unref(state->old_bs); /* bdrv_replace_node() ref'ed old_bs */
 
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v6 8/8] iotests: Test handling of AioContexts with some blockdev actions
  2020-01-08 14:31 [PATCH v6 0/8] blockdev: Fix AioContext handling for various blockdev actions Sergio Lopez
                   ` (6 preceding siblings ...)
  2020-01-08 14:31 ` [PATCH v6 7/8] blockdev: Return bs to the proper context on snapshot abort Sergio Lopez
@ 2020-01-08 14:31 ` Sergio Lopez
  2020-01-15 15:27 ` [PATCH v6 0/8] blockdev: Fix AioContext handling for various " Kevin Wolf
  2020-01-16 17:17 ` Paolo Bonzini
  9 siblings, 0 replies; 13+ messages in thread
From: Sergio Lopez @ 2020-01-08 14:31 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, qemu-block, Sergio Lopez, Markus Armbruster,
	Max Reitz, John Snow

Includes the following tests:

 - Adding a dirty bitmap.
   * RHBZ: 1782175

 - Starting a drive-mirror to an NBD-backed target.
   * RHBZ: 1746217, 1773517

 - Aborting an external snapshot transaction.
   * RHBZ: 1779036

 - Aborting a blockdev backup transaction.
   * RHBZ: 1782111

For each one of them, a VM with a number of disks running in an
IOThread AioContext is used.

Signed-off-by: Sergio Lopez <slp@redhat.com>
---
 tests/qemu-iotests/281     | 247 +++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/281.out |   5 +
 tests/qemu-iotests/group   |   1 +
 3 files changed, 253 insertions(+)
 create mode 100755 tests/qemu-iotests/281
 create mode 100644 tests/qemu-iotests/281.out

diff --git a/tests/qemu-iotests/281 b/tests/qemu-iotests/281
new file mode 100755
index 0000000000..269d583b2c
--- /dev/null
+++ b/tests/qemu-iotests/281
@@ -0,0 +1,247 @@
+#!/usr/bin/env python
+#
+# Test cases for blockdev + IOThread interactions
+#
+# Copyright (C) 2019 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+import os
+import iotests
+from iotests import qemu_img
+
+image_len = 64 * 1024 * 1024
+
+# Test for RHBZ#1782175
+class TestDirtyBitmapIOThread(iotests.QMPTestCase):
+    drive0_img = os.path.join(iotests.test_dir, 'drive0.img')
+    images = { 'drive0': drive0_img }
+
+    def setUp(self):
+        for name in self.images:
+            qemu_img('create', '-f', iotests.imgfmt,
+                     self.images[name], str(image_len))
+
+        self.vm = iotests.VM()
+        self.vm.add_object('iothread,id=iothread0')
+
+        for name in self.images:
+            self.vm.add_blockdev('driver=file,filename=%s,node-name=file_%s'
+                                 % (self.images[name], name))
+            self.vm.add_blockdev('driver=qcow2,file=file_%s,node-name=%s'
+                                 % (name, name))
+
+        self.vm.launch()
+        self.vm.qmp('x-blockdev-set-iothread',
+                    node_name='drive0', iothread='iothread0',
+                    force=True)
+
+    def tearDown(self):
+        self.vm.shutdown()
+        for name in self.images:
+            os.remove(self.images[name])
+
+    def test_add_dirty_bitmap(self):
+        result = self.vm.qmp(
+            'block-dirty-bitmap-add',
+            node='drive0',
+            name='bitmap1',
+            persistent=True,
+        )
+
+        self.assert_qmp(result, 'return', {})
+
+
+# Test for RHBZ#1746217 & RHBZ#1773517
+class TestNBDMirrorIOThread(iotests.QMPTestCase):
+    nbd_sock = os.path.join(iotests.sock_dir, 'nbd.sock')
+    drive0_img = os.path.join(iotests.test_dir, 'drive0.img')
+    mirror_img = os.path.join(iotests.test_dir, 'mirror.img')
+    images = { 'drive0': drive0_img, 'mirror': mirror_img }
+
+    def setUp(self):
+        for name in self.images:
+            qemu_img('create', '-f', iotests.imgfmt,
+                     self.images[name], str(image_len))
+
+        self.vm_src = iotests.VM(path_suffix='src')
+        self.vm_src.add_object('iothread,id=iothread0')
+        self.vm_src.add_blockdev('driver=file,filename=%s,node-name=file0'
+                          % (self.drive0_img))
+        self.vm_src.add_blockdev('driver=qcow2,file=file0,node-name=drive0')
+        self.vm_src.launch()
+        self.vm_src.qmp('x-blockdev-set-iothread',
+                        node_name='drive0', iothread='iothread0',
+                        force=True)
+
+        self.vm_tgt = iotests.VM(path_suffix='tgt')
+        self.vm_tgt.add_object('iothread,id=iothread0')
+        self.vm_tgt.add_blockdev('driver=file,filename=%s,node-name=file0'
+                          % (self.mirror_img))
+        self.vm_tgt.add_blockdev('driver=qcow2,file=file0,node-name=drive0')
+        self.vm_tgt.launch()
+        self.vm_tgt.qmp('x-blockdev-set-iothread',
+                        node_name='drive0', iothread='iothread0',
+                        force=True)
+
+    def tearDown(self):
+        self.vm_src.shutdown()
+        self.vm_tgt.shutdown()
+        for name in self.images:
+            os.remove(self.images[name])
+
+    def test_nbd_mirror(self):
+        result = self.vm_tgt.qmp(
+            'nbd-server-start',
+            addr={
+                'type': 'unix',
+                'data': { 'path': self.nbd_sock }
+            }
+        )
+        self.assert_qmp(result, 'return', {})
+
+        result = self.vm_tgt.qmp(
+            'nbd-server-add',
+            device='drive0',
+            writable=True
+        )
+        self.assert_qmp(result, 'return', {})
+
+        result = self.vm_src.qmp(
+            'drive-mirror',
+            device='drive0',
+            target='nbd+unix:///drive0?socket=' + self.nbd_sock,
+            sync='full',
+            mode='existing',
+            speed=64*1024*1024,
+            job_id='j1'
+        )
+        self.assert_qmp(result, 'return', {})
+
+        self.vm_src.event_wait(name="BLOCK_JOB_READY")
+
+
+# Test for RHBZ#1779036
+class TestExternalSnapshotAbort(iotests.QMPTestCase):
+    drive0_img = os.path.join(iotests.test_dir, 'drive0.img')
+    snapshot_img = os.path.join(iotests.test_dir, 'snapshot.img')
+    images = { 'drive0': drive0_img, 'snapshot': snapshot_img }
+
+    def setUp(self):
+        for name in self.images:
+            qemu_img('create', '-f', iotests.imgfmt,
+                     self.images[name], str(image_len))
+
+        self.vm = iotests.VM()
+        self.vm.add_object('iothread,id=iothread0')
+        self.vm.add_blockdev('driver=file,filename=%s,node-name=file0'
+                          % (self.drive0_img))
+        self.vm.add_blockdev('driver=qcow2,file=file0,node-name=drive0')
+        self.vm.launch()
+        self.vm.qmp('x-blockdev-set-iothread',
+                    node_name='drive0', iothread='iothread0',
+                    force=True)
+
+    def tearDown(self):
+        self.vm.shutdown()
+        for name in self.images:
+            os.remove(self.images[name])
+
+    def test_external_snapshot_abort(self):
+        # Use a two actions transaction with a bogus values on the second
+        # one to trigger an abort of the transaction.
+        result = self.vm.qmp('transaction', actions=[
+            {
+                'type': 'blockdev-snapshot-sync',
+                'data': { 'node-name': 'drive0',
+                          'snapshot-file': self.snapshot_img,
+                          'snapshot-node-name': 'snap1',
+                          'mode': 'absolute-paths',
+                          'format': 'qcow2' }
+            },
+            {
+                'type': 'blockdev-snapshot-sync',
+                'data': { 'node-name': 'drive0',
+                          'snapshot-file': '/fakesnapshot',
+                          'snapshot-node-name': 'snap2',
+                          'mode': 'absolute-paths',
+                          'format': 'qcow2' }
+            },
+        ])
+
+        # Crashes on failure, we expect this error.
+        self.assert_qmp(result, 'error/class', 'GenericError')
+
+
+# Test for RHBZ#1782111
+class TestBlockdevBackupAbort(iotests.QMPTestCase):
+    drive0_img = os.path.join(iotests.test_dir, 'drive0.img')
+    drive1_img = os.path.join(iotests.test_dir, 'drive1.img')
+    snap0_img = os.path.join(iotests.test_dir, 'snap0.img')
+    snap1_img = os.path.join(iotests.test_dir, 'snap1.img')
+    images = { 'drive0': drive0_img,
+               'drive1': drive1_img,
+               'snap0': snap0_img,
+               'snap1': snap1_img }
+
+    def setUp(self):
+        for name in self.images:
+            qemu_img('create', '-f', iotests.imgfmt,
+                     self.images[name], str(image_len))
+
+        self.vm = iotests.VM()
+        self.vm.add_object('iothread,id=iothread0')
+        self.vm.add_device('virtio-scsi,iothread=iothread0')
+
+        for name in self.images:
+            self.vm.add_blockdev('driver=file,filename=%s,node-name=file_%s'
+                                 % (self.images[name], name))
+            self.vm.add_blockdev('driver=qcow2,file=file_%s,node-name=%s'
+                                 % (name, name))
+
+        self.vm.add_device('scsi-hd,drive=drive0')
+        self.vm.add_device('scsi-hd,drive=drive1')
+        self.vm.launch()
+
+    def tearDown(self):
+        self.vm.shutdown()
+        for name in self.images:
+            os.remove(self.images[name])
+
+    def test_blockdev_backup_abort(self):
+        # Use a two actions transaction with a bogus values on the second
+        # one to trigger an abort of the transaction.
+        result = self.vm.qmp('transaction', actions=[
+            {
+                'type': 'blockdev-backup',
+                'data': { 'device': 'drive0',
+                          'target': 'snap0',
+                          'sync': 'full',
+                          'job-id': 'j1' }
+            },
+            {
+                'type': 'blockdev-backup',
+                'data': { 'device': 'drive1',
+                          'target': 'snap1',
+                          'sync': 'full' }
+            },
+        ])
+
+        # Hangs on failure, we expect this error.
+        self.assert_qmp(result, 'error/class', 'GenericError')
+
+if __name__ == '__main__':
+    iotests.main(supported_fmts=['qcow2'],
+                 supported_protocols=['file'])
diff --git a/tests/qemu-iotests/281.out b/tests/qemu-iotests/281.out
new file mode 100644
index 0000000000..89968f35d7
--- /dev/null
+++ b/tests/qemu-iotests/281.out
@@ -0,0 +1,5 @@
+....
+----------------------------------------------------------------------
+Ran 4 tests
+
+OK
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index cb2b789e44..e041cc1ee3 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -288,3 +288,4 @@
 277 rw quick
 279 rw backing quick
 280 rw migration quick
+281 rw quick
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v6 6/8] blockdev: Acquire AioContext on dirty bitmap functions
  2020-01-08 14:31 ` [PATCH v6 6/8] blockdev: Acquire AioContext on dirty bitmap functions Sergio Lopez
@ 2020-01-15 15:19   ` Kevin Wolf
  0 siblings, 0 replies; 13+ messages in thread
From: Kevin Wolf @ 2020-01-15 15:19 UTC (permalink / raw)
  To: Sergio Lopez
  Cc: Markus Armbruster, John Snow, qemu-devel, qemu-block, Max Reitz

Am 08.01.2020 um 15:31 hat Sergio Lopez geschrieben:
> Dirty map addition and removal functions are not acquiring to BDS
> AioContext, while they may call to code that expects it to be
> acquired.
> 
> This may trigger a crash with a stack trace like this one:
> 
>  #0  0x00007f0ef146370f in __GI_raise (sig=sig@entry=6)
>      at ../sysdeps/unix/sysv/linux/raise.c:50
>  #1  0x00007f0ef144db25 in __GI_abort () at abort.c:79
>  #2  0x0000565022294dce in error_exit
>      (err=<optimized out>, msg=msg@entry=0x56502243a730 <__func__.16350> "qemu_mutex_unlock_impl") at util/qemu-thread-posix.c:36
>  #3  0x00005650222950ba in qemu_mutex_unlock_impl
>      (mutex=mutex@entry=0x5650244b0240, file=file@entry=0x565022439adf "util/async.c", line=line@entry=526) at util/qemu-thread-posix.c:108
>  #4  0x0000565022290029 in aio_context_release
>      (ctx=ctx@entry=0x5650244b01e0) at util/async.c:526
>  #5  0x000056502221cd08 in bdrv_can_store_new_dirty_bitmap
>      (bs=bs@entry=0x5650244dc820, name=name@entry=0x56502481d360 "bitmap1", granularity=granularity@entry=65536, errp=errp@entry=0x7fff22831718)
>      at block/dirty-bitmap.c:542
>  #6  0x000056502206ae53 in qmp_block_dirty_bitmap_add
>      (errp=0x7fff22831718, disabled=false, has_disabled=<optimized out>, persistent=<optimized out>, has_persistent=true, granularity=65536, has_granularity=<optimized out>, name=0x56502481d360 "bitmap1", node=<optimized out>) at blockdev.c:2894
>  #7  0x000056502206ae53 in qmp_block_dirty_bitmap_add
>      (node=<optimized out>, name=0x56502481d360 "bitmap1", has_granularity=<optimized out>, granularity=<optimized out>, has_persistent=true, persistent=<optimized out>, has_disabled=false, disabled=false, errp=0x7fff22831718) at blockdev.c:2856
>  #8  0x00005650221847a3 in qmp_marshal_block_dirty_bitmap_add
>      (args=<optimized out>, ret=<optimized out>, errp=0x7fff22831798)
>      at qapi/qapi-commands-block-core.c:651
>  #9  0x0000565022247e6c in do_qmp_dispatch
>      (errp=0x7fff22831790, allow_oob=<optimized out>, request=<optimized out>, cmds=0x565022b32d60 <qmp_commands>) at qapi/qmp-dispatch.c:132
>  #10 0x0000565022247e6c in qmp_dispatch
>      (cmds=0x565022b32d60 <qmp_commands>, request=<optimized out>, allow_oob=<optimized out>) at qapi/qmp-dispatch.c:175
>  #11 0x0000565022166061 in monitor_qmp_dispatch
>      (mon=0x56502450faa0, req=<optimized out>) at monitor/qmp.c:145
>  #12 0x00005650221666fa in monitor_qmp_bh_dispatcher
>      (data=<optimized out>) at monitor/qmp.c:234
>  #13 0x000056502228f866 in aio_bh_call (bh=0x56502440eae0)
>      at util/async.c:117
>  #14 0x000056502228f866 in aio_bh_poll (ctx=ctx@entry=0x56502440d7a0)
>      at util/async.c:117
>  #15 0x0000565022292c54 in aio_dispatch (ctx=0x56502440d7a0)
>      at util/aio-posix.c:459
>  #16 0x000056502228f742 in aio_ctx_dispatch
>      (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:260
>  #17 0x00007f0ef5ce667d in g_main_dispatch (context=0x56502449aa40)
>      at gmain.c:3176
>  #18 0x00007f0ef5ce667d in g_main_context_dispatch
>      (context=context@entry=0x56502449aa40) at gmain.c:3829
>  #19 0x0000565022291d08 in glib_pollfds_poll () at util/main-loop.c:219
>  #20 0x0000565022291d08 in os_host_main_loop_wait
>      (timeout=<optimized out>) at util/main-loop.c:242
>  #21 0x0000565022291d08 in main_loop_wait (nonblocking=<optimized out>)
>      at util/main-loop.c:518
>  #22 0x00005650220743c1 in main_loop () at vl.c:1828
>  #23 0x0000565021f20a72 in main
>      (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>)
>      at vl.c:4504
> 
> Fix this by acquiring the AioContext at qmp_block_dirty_bitmap_add()
> and qmp_block_dirty_bitmap_add().
> 
> RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1782175
> Signed-off-by: Sergio Lopez <slp@redhat.com>

> @@ -3038,20 +3045,26 @@ static BdrvDirtyBitmap *do_block_dirty_bitmap_remove(
>  {
>      BlockDriverState *bs;
>      BdrvDirtyBitmap *bitmap;
> +    AioContext *aio_context;
>  
>      bitmap = block_dirty_bitmap_lookup(node, name, &bs, errp);
>      if (!bitmap || !bs) {
>          return NULL;
>      }
>  
> +    aio_context = bdrv_get_aio_context(bs);
> +    aio_context_acquire(aio_context);
> +
>      if (bdrv_dirty_bitmap_check(bitmap, BDRV_BITMAP_BUSY | BDRV_BITMAP_RO,
>                                  errp)) {
> +        aio_context_release(aio_context);
>          return NULL;
>      }
>  
>      if (bdrv_dirty_bitmap_get_persistence(bitmap) &&
>          bdrv_remove_persistent_dirty_bitmap(bs, name, errp) < 0)
>      {
> +            aio_context_release(aio_context);
>              return NULL;
>      }

Indentation is off here. It was already off before this patch, but
rather than adding another incorrectly indented line, I think we should
just fix it. I can do that while applying.

Kevin



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v6 7/8] blockdev: Return bs to the proper context on snapshot abort
  2020-01-08 14:31 ` [PATCH v6 7/8] blockdev: Return bs to the proper context on snapshot abort Sergio Lopez
@ 2020-01-15 15:22   ` Kevin Wolf
  0 siblings, 0 replies; 13+ messages in thread
From: Kevin Wolf @ 2020-01-15 15:22 UTC (permalink / raw)
  To: Sergio Lopez
  Cc: Markus Armbruster, John Snow, qemu-devel, qemu-block, Max Reitz

Am 08.01.2020 um 15:31 hat Sergio Lopez geschrieben:
> external_snapshot_abort() calls to bdrv_set_backing_hd(), which
> returns state->old_bs to the main AioContext, as it's intended to be
> used then the BDS is going to be released. As that's not the case when
> aborting an external snapshot, return it to the AioContext it was
> before the call.
> 
> This issue can be triggered by issuing a transaction with two actions,
> a proper blockdev-snapshot-sync and a bogus one, so the second will
> trigger a transaction abort. This results in a crash with an stack
> trace like this one:
> 
>  #0  0x00007fa1048b28df in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
>  #1  0x00007fa10489ccf5 in __GI_abort () at abort.c:79
>  #2  0x00007fa10489cbc9 in __assert_fail_base
>      (fmt=0x7fa104a03300 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x5572240b44d8 "bdrv_get_aio_context(old_bs) == bdrv_get_aio_context(new_bs)", file=0x557224014d30 "block.c", line=2240, function=<optimized out>) at assert.c:92
>  #3  0x00007fa1048aae96 in __GI___assert_fail
>      (assertion=assertion@entry=0x5572240b44d8 "bdrv_get_aio_context(old_bs) == bdrv_get_aio_context(new_bs)", file=file@entry=0x557224014d30 "block.c", line=line@entry=2240, function=function@entry=0x5572240b5d60 <__PRETTY_FUNCTION__.31620> "bdrv_replace_child_noperm") at assert.c:101
>  #4  0x0000557223e631f8 in bdrv_replace_child_noperm (child=0x557225b9c980, new_bs=new_bs@entry=0x557225c42e40) at block.c:2240
>  #5  0x0000557223e68be7 in bdrv_replace_node (from=0x557226951a60, to=0x557225c42e40, errp=0x5572247d6138 <error_abort>) at block.c:4196
>  #6  0x0000557223d069c4 in external_snapshot_abort (common=0x557225d7e170) at blockdev.c:1731
>  #7  0x0000557223d069c4 in external_snapshot_abort (common=0x557225d7e170) at blockdev.c:1717
>  #8  0x0000557223d09013 in qmp_transaction (dev_list=<optimized out>, has_props=<optimized out>, props=0x557225cc7d70, errp=errp@entry=0x7ffe704c0c98) at blockdev.c:2360
>  #9  0x0000557223e32085 in qmp_marshal_transaction (args=<optimized out>, ret=<optimized out>, errp=0x7ffe704c0d08) at qapi/qapi-commands-transaction.c:44
>  #10 0x0000557223ee798c in do_qmp_dispatch (errp=0x7ffe704c0d00, allow_oob=<optimized out>, request=<optimized out>, cmds=0x5572247d3cc0 <qmp_commands>) at qapi/qmp-dispatch.c:132
>  #11 0x0000557223ee798c in qmp_dispatch (cmds=0x5572247d3cc0 <qmp_commands>, request=<optimized out>, allow_oob=<optimized out>) at qapi/qmp-dispatch.c:175
>  #12 0x0000557223e06141 in monitor_qmp_dispatch (mon=0x557225c69ff0, req=<optimized out>) at monitor/qmp.c:120
>  #13 0x0000557223e0678a in monitor_qmp_bh_dispatcher (data=<optimized out>) at monitor/qmp.c:209
>  #14 0x0000557223f2f366 in aio_bh_call (bh=0x557225b9dc60) at util/async.c:117
>  #15 0x0000557223f2f366 in aio_bh_poll (ctx=ctx@entry=0x557225b9c840) at util/async.c:117
>  #16 0x0000557223f32754 in aio_dispatch (ctx=0x557225b9c840) at util/aio-posix.c:459
>  #17 0x0000557223f2f242 in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:260
>  #18 0x00007fa10913467d in g_main_dispatch (context=0x557225c28e80) at gmain.c:3176
>  #19 0x00007fa10913467d in g_main_context_dispatch (context=context@entry=0x557225c28e80) at gmain.c:3829
>  #20 0x0000557223f31808 in glib_pollfds_poll () at util/main-loop.c:219
>  #21 0x0000557223f31808 in os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:242
>  #22 0x0000557223f31808 in main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:518
>  #23 0x0000557223d13201 in main_loop () at vl.c:1828
>  #24 0x0000557223bbfb82 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4504
> 
> RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1779036
> Signed-off-by: Sergio Lopez <slp@redhat.com>
> ---
>  blockdev.c | 21 +++++++++++++++++++++
>  1 file changed, 21 insertions(+)
> 
> diff --git a/blockdev.c b/blockdev.c
> index 292961a88d..cfed87f434 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -1731,6 +1731,8 @@ static void external_snapshot_abort(BlkActionState *common)
>      if (state->new_bs) {
>          if (state->overlay_appended) {
>              AioContext *aio_context;
> +            AioContext *tmp_context;
> +            int ret;
>  
>              aio_context = bdrv_get_aio_context(state->old_bs);
>              aio_context_acquire(aio_context);
> @@ -1738,6 +1740,25 @@ static void external_snapshot_abort(BlkActionState *common)
>              bdrv_ref(state->old_bs);   /* we can't let bdrv_set_backind_hd()
>                                            close state->old_bs; we need it */
>              bdrv_set_backing_hd(state->new_bs, NULL, &error_abort);
> +
> +            /*
> +             * The call to bdrv_set_backing_hd() above returns state->old_bs to
> +             * the main AioContext. As we're still going to be using it, return
> +             * it to the AioContext it was before.
> +             */
> +            tmp_context = bdrv_get_aio_context(state->old_bs);
> +            if (aio_context != tmp_context) {
> +                aio_context_release(aio_context);
> +                aio_context_acquire(tmp_context);
> +
> +                ret = bdrv_try_set_aio_context(state->old_bs,
> +                                               aio_context, NULL);
> +                assert(ret == 0);
> +
> +                aio_context_release(tmp_context);
> +                aio_context_acquire(aio_context);
> +            }
> +
>              bdrv_replace_node(state->new_bs, state->old_bs, &error_abort);
>              bdrv_unref(state->old_bs); /* bdrv_replace_node() ref'ed old_bs */

Arguably, bdrv_replace_node() should share the AioContext-switching
logic with bdrv_root_attach_child() so that the general case is solved.

I guess this patch is simple enough and solves a relevant special case,
so no objection. But it might not be the final fix.

Kevin



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v6 0/8] blockdev: Fix AioContext handling for various blockdev actions
  2020-01-08 14:31 [PATCH v6 0/8] blockdev: Fix AioContext handling for various blockdev actions Sergio Lopez
                   ` (7 preceding siblings ...)
  2020-01-08 14:31 ` [PATCH v6 8/8] iotests: Test handling of AioContexts with some blockdev actions Sergio Lopez
@ 2020-01-15 15:27 ` Kevin Wolf
  2020-01-16 17:17 ` Paolo Bonzini
  9 siblings, 0 replies; 13+ messages in thread
From: Kevin Wolf @ 2020-01-15 15:27 UTC (permalink / raw)
  To: Sergio Lopez
  Cc: Markus Armbruster, John Snow, qemu-devel, qemu-block, Max Reitz

Am 08.01.2020 um 15:31 hat Sergio Lopez geschrieben:
> This patch series includes fixes for various issues related to
> AioContext mismanagement for various blockdev-initiated actions.
> 
> It also adds tests for those actions when executed on drives running
> on an IOThread AioContext.

Thanks, fixed up indentation in patch 6 and applied to the block branch.

Kevin



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v6 0/8] blockdev: Fix AioContext handling for various blockdev actions
  2020-01-08 14:31 [PATCH v6 0/8] blockdev: Fix AioContext handling for various blockdev actions Sergio Lopez
                   ` (8 preceding siblings ...)
  2020-01-15 15:27 ` [PATCH v6 0/8] blockdev: Fix AioContext handling for various " Kevin Wolf
@ 2020-01-16 17:17 ` Paolo Bonzini
  9 siblings, 0 replies; 13+ messages in thread
From: Paolo Bonzini @ 2020-01-16 17:17 UTC (permalink / raw)
  To: Sergio Lopez, qemu-devel
  Cc: Kevin Wolf, John Snow, Markus Armbruster, qemu-block, Max Reitz

On 08/01/20 15:31, Sergio Lopez wrote:
> This patch series includes fixes for various issues related to
> AioContext mismanagement for various blockdev-initiated actions.
> 
> It also adds tests for those actions when executed on drives running
> on an IOThread AioContext.

Testing the latest addition to Patchew:

Supersedes: <20191128104129.250206-1-slp@redhat.com>

:)

Paolo

> ---
> Changelog:
> 
> v6:
>  - Rename the patch series.
>  - Add "block/backup-top: Don't acquire context while dropping top"
>  - Add "blockdev: Acquire AioContext on dirty bitmap functions"
>  - Add "blockdev: Return bs to the proper context on snapshot abort"
>  - Add "iotests: Test handling of AioContexts with some blockdev
>    actions" (thanks Kevin Wolf)
>  - Fix context release on error. (thanks Kevin Wolf)
> 
> v5:
>  - Include fix for iotest 141 in patch 2. (thanks Max Reitz)
>  - Also fix iotest 185 and 219 in patch 2. (thanks Max Reitz)
>  - Move error block after context acquisition/release, to we can use
>    goto to bail out and release resources. (thanks Max Reitz)
>  - Properly release old_context in qmp_blockdev_mirror. (thanks Max
>    Reitz)
> 
> v4:
>  - Unify patches 1-4 and 5-7 to avoid producing broken interim
>    states. (thanks Max Reitz)
>  - Include a fix for iotest 141. (thanks Kevin Wolf)
> 
> v3:
>  - Rework the whole patch series to fix the issue by consolidating all
>    operations in the transaction model. (thanks Kevin Wolf)
> 
> v2:
>  - Honor bdrv_try_set_aio_context() context acquisition requirements
>    (thanks Max Reitz).
>  - Release the context at drive_backup_prepare() instead of avoiding
>    re-acquiring it at do_drive_baclup(). (thanks Max Reitz)
>  - Convert a single patch into a two-patch series.
> ---
> 
> Sergio Lopez (8):
>   blockdev: fix coding style issues in drive_backup_prepare
>   blockdev: unify qmp_drive_backup and drive-backup transaction paths
>   blockdev: unify qmp_blockdev_backup and blockdev-backup transaction
>     paths
>   blockdev: honor bdrv_try_set_aio_context() context requirements
>   block/backup-top: Don't acquire context while dropping top
>   blockdev: Acquire AioContext on dirty bitmap functions
>   blockdev: Return bs to the proper context on snapshot abort
>   iotests: Test handling of AioContexts with some blockdev actions
> 
>  block/backup-top.c         |   5 -
>  block/backup.c             |   3 +
>  blockdev.c                 | 391 ++++++++++++++++++++-----------------
>  tests/qemu-iotests/141.out |   2 +
>  tests/qemu-iotests/185.out |   2 +
>  tests/qemu-iotests/219     |   7 +-
>  tests/qemu-iotests/219.out |   8 +
>  tests/qemu-iotests/281     | 247 +++++++++++++++++++++++
>  tests/qemu-iotests/281.out |   5 +
>  tests/qemu-iotests/group   |   1 +
>  10 files changed, 484 insertions(+), 187 deletions(-)
>  create mode 100755 tests/qemu-iotests/281
>  create mode 100644 tests/qemu-iotests/281.out
> 



^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2020-01-16 17:18 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-08 14:31 [PATCH v6 0/8] blockdev: Fix AioContext handling for various blockdev actions Sergio Lopez
2020-01-08 14:31 ` [PATCH v6 1/8] blockdev: fix coding style issues in drive_backup_prepare Sergio Lopez
2020-01-08 14:31 ` [PATCH v6 2/8] blockdev: unify qmp_drive_backup and drive-backup transaction paths Sergio Lopez
2020-01-08 14:31 ` [PATCH v6 3/8] blockdev: unify qmp_blockdev_backup and blockdev-backup " Sergio Lopez
2020-01-08 14:31 ` [PATCH v6 4/8] blockdev: honor bdrv_try_set_aio_context() context requirements Sergio Lopez
2020-01-08 14:31 ` [PATCH v6 5/8] block/backup-top: Don't acquire context while dropping top Sergio Lopez
2020-01-08 14:31 ` [PATCH v6 6/8] blockdev: Acquire AioContext on dirty bitmap functions Sergio Lopez
2020-01-15 15:19   ` Kevin Wolf
2020-01-08 14:31 ` [PATCH v6 7/8] blockdev: Return bs to the proper context on snapshot abort Sergio Lopez
2020-01-15 15:22   ` Kevin Wolf
2020-01-08 14:31 ` [PATCH v6 8/8] iotests: Test handling of AioContexts with some blockdev actions Sergio Lopez
2020-01-15 15:27 ` [PATCH v6 0/8] blockdev: Fix AioContext handling for various " Kevin Wolf
2020-01-16 17:17 ` Paolo Bonzini

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).