* [PATCH for-5.0 v5 0/3] Fix some AIO context locking in jobs
@ 2020-04-07 11:56 Stefan Reiter
  2020-04-07 11:56 ` [PATCH for-5.0 v5 1/3] job: take each job's lock individually in job_txn_apply Stefan Reiter
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Stefan Reiter @ 2020-04-07 11:56 UTC (permalink / raw)
  To: qemu-devel, qemu-block
  Cc: kwolf, vsementsov, t.lamprecht, slp, mreitz, stefanha, jsnow, dietmar

Contains three separate but related patches cleaning up and fixing some
issues regarding aio_context_acquire/aio_context_release for jobs. AFAICT
this mostly affects block jobs running for devices that have IO threads
enabled.


Changes from v4:
* Do job_ref/job_unref in job_txn_apply and job_exit since we need the job to
  survive the callback to access the potentially changed lock afterwards
* Reduce patch 2/3 to an assert, the context should already be acquired since
  it's a bdrv handler
* Collect R-by for 3/3

I've marked it 'for-5.0' this time; I think it would make sense for it to be
picked up together with Kevin's "block: Fix blk->in_flight during
blk_wait_while_drained()" series. With that series and these three patches
applied I can no longer reproduce any of the reported related crashes/hangs.


Changes from v3:
* commit_job appears to be unset in certain cases when replication_close is
  called; only access it when necessary to avoid a SIGSEGV

Missed this when shuffling around patches; sorry for the noise with the still-broken v3.

Changes from v2:
* reordered patch 1 to the end to not introduce temporary breakages
* added more fixes to job txn patch (should now pass the tests)

Changes from v1:
* fixed commit message for patch 1
* added patches 2 and 3


qemu: Stefan Reiter (3):
  job: take each job's lock individually in job_txn_apply
  replication: assert we own context before job_cancel_sync
  backup: don't acquire aio_context in backup_clean

 block/backup.c        |  4 ----
 block/replication.c   |  5 ++++-
 blockdev.c            |  9 ++++++++
 job-qmp.c             |  9 ++++++++
 job.c                 | 50 ++++++++++++++++++++++++++++++++++---------
 tests/test-blockjob.c |  2 ++
 6 files changed, 64 insertions(+), 15 deletions(-)

-- 
2.26.0




* [PATCH for-5.0 v5 1/3] job: take each job's lock individually in job_txn_apply
  2020-04-07 11:56 [PATCH for-5.0 v5 0/3] Fix some AIO context locking in jobs Stefan Reiter
@ 2020-04-07 11:56 ` Stefan Reiter
  2020-04-07 13:26   ` Kevin Wolf
  2020-04-07 11:56 ` [PATCH for-5.0 v5 2/3] replication: assert we own context before job_cancel_sync Stefan Reiter
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 6+ messages in thread
From: Stefan Reiter @ 2020-04-07 11:56 UTC (permalink / raw)
  To: qemu-devel, qemu-block
  Cc: kwolf, vsementsov, t.lamprecht, slp, mreitz, stefanha, jsnow, dietmar

All callers of job_txn_apply hold a single job's lock, but different
jobs within a transaction can have different contexts, thus we need to
lock each one individually before applying the callback function.

Similar to job_completed_txn_abort, this also requires releasing the
caller's context before the loop and reacquiring it afterwards, to avoid
recursive locks which might break AIO_WAIT_WHILE in the callback. This is
safe, since existing code would already have to take this into account;
otherwise job_completed_txn_abort would already be broken.
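
To make the recursive-lock problem concrete, this is roughly what would
happen if the caller's context were kept held while the loop takes the same
context again for a job (illustrative fragment, not part of the patch):

    aio_context_acquire(ctx);   /* caller of job_txn_apply holds the job's ctx */
    aio_context_acquire(ctx);   /* the loop acquires the same ctx again */
    fn(other_job);              /* AIO_WAIT_WHILE(ctx, ...) inside fn releases
                                 * ctx only once, so it stays locked, the
                                 * iothread cannot acquire it and the wait
                                 * never finishes */
    aio_context_release(ctx);
    aio_context_release(ctx);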

This also brings to light a different issue: When a callback function in
job_txn_apply moves its job to a different AIO context, callers will
try to release the wrong lock (now that we re-acquire the lock
correctly; previously it would just continue with the old lock, leaving
the job unlocked for the rest of the return path). Fix this by not caching
the job's context.
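
In caller terms this is the pattern applied below in qmp_job_finalize,
condensed here as a sketch (reusing that function's locals):

    job_ref(job);
    job_finalize(job, errp);
    /*
     * job_finalize may have moved the job to a different context, so
     * re-read it instead of releasing a value cached before the call.
     */
    aio_context = job->aio_context;
    job_unref(job);
    aio_context_release(aio_context);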

This is only necessary in qmp_block_job_finalize, qmp_job_finalize and
job_exit itself, since all other callers go through job_exit.

One test needed adapting: since it calls job_finalize directly, it needs
to acquire the correct context manually.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
---
 blockdev.c            |  9 ++++++++
 job-qmp.c             |  9 ++++++++
 job.c                 | 50 ++++++++++++++++++++++++++++++++++---------
 tests/test-blockjob.c |  2 ++
 4 files changed, 60 insertions(+), 10 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index fa8630cb41..5faddaa705 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3612,7 +3612,16 @@ void qmp_block_job_finalize(const char *id, Error **errp)
     }
 
     trace_qmp_block_job_finalize(job);
+    job_ref(&job->job);
     job_finalize(&job->job, errp);
+
+    /*
+     * Job's context might have changed via job_finalize (and job_txn_apply
+     * automatically acquires the new one), so make sure we release the correct
+     * one.
+     */
+    aio_context = blk_get_aio_context(job->blk);
+    job_unref(&job->job);
     aio_context_release(aio_context);
 }
 
diff --git a/job-qmp.c b/job-qmp.c
index fecc939ebd..f9a58832e1 100644
--- a/job-qmp.c
+++ b/job-qmp.c
@@ -114,7 +114,16 @@ void qmp_job_finalize(const char *id, Error **errp)
     }
 
     trace_qmp_job_finalize(job);
+    job_ref(job);
     job_finalize(job, errp);
+
+    /*
+     * Job's context might have changed via job_finalize (and job_txn_apply
+     * automatically acquires the new one), so make sure we release the correct
+     * one.
+     */
+    aio_context = job->aio_context;
+    job_unref(job);
     aio_context_release(aio_context);
 }
 
diff --git a/job.c b/job.c
index 134a07b92e..53be57a3a0 100644
--- a/job.c
+++ b/job.c
@@ -136,17 +136,38 @@ static void job_txn_del_job(Job *job)
     }
 }
 
-static int job_txn_apply(JobTxn *txn, int fn(Job *))
+static int job_txn_apply(Job *job, int fn(Job *))
 {
-    Job *job, *next;
+    AioContext *inner_ctx;
+    Job *other_job, *next;
+    JobTxn *txn = job->txn;
     int rc = 0;
 
-    QLIST_FOREACH_SAFE(job, &txn->jobs, txn_list, next) {
-        rc = fn(job);
+    /*
+     * Similar to job_completed_txn_abort, we take each job's lock before
+     * applying fn, but since we assume that outer_ctx is held by the caller,
+     * we need to release it here to avoid holding the lock twice - which would
+     * break AIO_WAIT_WHILE from within fn.
+     */
+    job_ref(job);
+    aio_context_release(job->aio_context);
+
+    QLIST_FOREACH_SAFE(other_job, &txn->jobs, txn_list, next) {
+        inner_ctx = other_job->aio_context;
+        aio_context_acquire(inner_ctx);
+        rc = fn(other_job);
+        aio_context_release(inner_ctx);
         if (rc) {
             break;
         }
     }
+
+    /*
+     * Note that job->aio_context might have been changed by calling fn, so we
+     * can't use a local variable to cache it.
+     */
+    aio_context_acquire(job->aio_context);
+    job_unref(job);
     return rc;
 }
 
@@ -774,11 +795,11 @@ static void job_do_finalize(Job *job)
     assert(job && job->txn);
 
     /* prepare the transaction to complete */
-    rc = job_txn_apply(job->txn, job_prepare);
+    rc = job_txn_apply(job, job_prepare);
     if (rc) {
         job_completed_txn_abort(job);
     } else {
-        job_txn_apply(job->txn, job_finalize_single);
+        job_txn_apply(job, job_finalize_single);
     }
 }
 
@@ -824,10 +845,10 @@ static void job_completed_txn_success(Job *job)
         assert(other_job->ret == 0);
     }
 
-    job_txn_apply(txn, job_transition_to_pending);
+    job_txn_apply(job, job_transition_to_pending);
 
     /* If no jobs need manual finalization, automatically do so */
-    if (job_txn_apply(txn, job_needs_finalize) == 0) {
+    if (job_txn_apply(job, job_needs_finalize) == 0) {
         job_do_finalize(job);
     }
 }
@@ -849,9 +870,10 @@ static void job_completed(Job *job)
 static void job_exit(void *opaque)
 {
     Job *job = (Job *)opaque;
-    AioContext *ctx = job->aio_context;
+    AioContext *ctx;
 
-    aio_context_acquire(ctx);
+    job_ref(job);
+    aio_context_acquire(job->aio_context);
 
     /* This is a lie, we're not quiescent, but still doing the completion
      * callbacks. However, completion callbacks tend to involve operations that
@@ -862,6 +884,14 @@ static void job_exit(void *opaque)
 
     job_completed(job);
 
+    /*
+     * Note that calling job_completed can move the job to a different
+     * aio_context, so we cannot cache from above. job_txn_apply takes care of
+     * acquiring the new lock, and we ref/unref to avoid job_completed freeing
+     * the job underneath us.
+     */
+    ctx = job->aio_context;
+    job_unref(job);
     aio_context_release(ctx);
 }
 
diff --git a/tests/test-blockjob.c b/tests/test-blockjob.c
index 4eeb184caf..7519847912 100644
--- a/tests/test-blockjob.c
+++ b/tests/test-blockjob.c
@@ -367,7 +367,9 @@ static void test_cancel_concluded(void)
     aio_poll(qemu_get_aio_context(), true);
     assert(job->status == JOB_STATUS_PENDING);
 
+    aio_context_acquire(job->aio_context);
     job_finalize(job, &error_abort);
+    aio_context_release(job->aio_context);
     assert(job->status == JOB_STATUS_CONCLUDED);
 
     cancel_common(s);
-- 
2.26.0





* [PATCH for-5.0 v5 2/3] replication: assert we own context before job_cancel_sync
  2020-04-07 11:56 [PATCH for-5.0 v5 0/3] Fix some AIO context locking in jobs Stefan Reiter
  2020-04-07 11:56 ` [PATCH for-5.0 v5 1/3] job: take each job's lock individually in job_txn_apply Stefan Reiter
@ 2020-04-07 11:56 ` Stefan Reiter
  2020-04-07 11:56 ` [PATCH for-5.0 v5 3/3] backup: don't acquire aio_context in backup_clean Stefan Reiter
  2020-04-07 14:22 ` [PATCH for-5.0 v5 0/3] Fix some AIO context locking in jobs Kevin Wolf
  3 siblings, 0 replies; 6+ messages in thread
From: Stefan Reiter @ 2020-04-07 11:56 UTC (permalink / raw)
  To: qemu-devel, qemu-block
  Cc: kwolf, vsementsov, t.lamprecht, slp, mreitz, stefanha, jsnow, dietmar

job_cancel_sync requires the job's lock to be held; all other callers
already do this (replication_stop, drive_backup_abort,
blockdev_backup_abort, job_cancel_sync_all, cancel_common).

In this case we're in a BlockDriver handler, so we already hold a lock;
just assert that it is the same as the one used for the commit_job.
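
For comparison, the callers listed above take the lock explicitly around the
call, roughly like this (sketch only, not part of this patch):

    AioContext *ctx = job->aio_context;

    aio_context_acquire(ctx);
    job_cancel_sync(job);
    aio_context_release(ctx);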

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
---
 block/replication.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/block/replication.c b/block/replication.c
index 413d95407d..da013c2041 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -144,12 +144,15 @@ fail:
 static void replication_close(BlockDriverState *bs)
 {
     BDRVReplicationState *s = bs->opaque;
+    Job *commit_job;
 
     if (s->stage == BLOCK_REPLICATION_RUNNING) {
         replication_stop(s->rs, false, NULL);
     }
     if (s->stage == BLOCK_REPLICATION_FAILOVER) {
-        job_cancel_sync(&s->commit_job->job);
+        commit_job = &s->commit_job->job;
+        assert(commit_job->aio_context == qemu_get_current_aio_context());
+        job_cancel_sync(commit_job);
     }
 
     if (s->mode == REPLICATION_MODE_SECONDARY) {
-- 
2.26.0





* [PATCH for-5.0 v5 3/3] backup: don't acquire aio_context in backup_clean
  2020-04-07 11:56 [PATCH for-5.0 v5 0/3] Fix some AIO context locking in jobs Stefan Reiter
  2020-04-07 11:56 ` [PATCH for-5.0 v5 1/3] job: take each job's lock individually in job_txn_apply Stefan Reiter
  2020-04-07 11:56 ` [PATCH for-5.0 v5 2/3] replication: assert we own context before job_cancel_sync Stefan Reiter
@ 2020-04-07 11:56 ` Stefan Reiter
  2020-04-07 14:22 ` [PATCH for-5.0 v5 0/3] Fix some AIO context locking in jobs Kevin Wolf
  3 siblings, 0 replies; 6+ messages in thread
From: Stefan Reiter @ 2020-04-07 11:56 UTC (permalink / raw)
  To: qemu-devel, qemu-block
  Cc: kwolf, vsementsov, t.lamprecht, slp, mreitz, stefanha, jsnow, dietmar

All code paths leading to backup_clean (via job_clean) already have the
job's context acquired. The job's context is guaranteed to be the same as
the one used by backup_top via backup_job_create.

The previous logic effectively acquired the lock twice, which broke cleanup
of backups for disks using IO threads: the BDRV_POLL_WHILE in
bdrv_backup_top_drop -> bdrv_do_drained_begin would only release the lock
once, thus deadlocking with the IO thread.
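
Concretely, the dropped acquire/release pair amounted to this (annotated
sketch of the pre-patch code):

    AioContext *aio_context = bdrv_get_aio_context(s->backup_top);

    aio_context_acquire(aio_context);    /* caller of job_clean already holds
                                          * this context -> recursive lock */
    bdrv_backup_top_drop(s->backup_top); /* BDRV_POLL_WHILE inside can release
                                          * it only once, so the IO thread
                                          * stays blocked -> deadlock */
    aio_context_release(aio_context);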

This is a partial revert of 0abf2581717a19.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
---
 block/backup.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index 7430ca5883..a7a7dcaf4c 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -126,11 +126,7 @@ static void backup_abort(Job *job)
 static void backup_clean(Job *job)
 {
     BackupBlockJob *s = container_of(job, BackupBlockJob, common.job);
-    AioContext *aio_context = bdrv_get_aio_context(s->backup_top);
-
-    aio_context_acquire(aio_context);
     bdrv_backup_top_drop(s->backup_top);
-    aio_context_release(aio_context);
 }
 
 void backup_do_checkpoint(BlockJob *job, Error **errp)
-- 
2.26.0





* Re: [PATCH for-5.0 v5 1/3] job: take each job's lock individually in job_txn_apply
  2020-04-07 11:56 ` [PATCH for-5.0 v5 1/3] job: take each job's lock individually in job_txn_apply Stefan Reiter
@ 2020-04-07 13:26   ` Kevin Wolf
  0 siblings, 0 replies; 6+ messages in thread
From: Kevin Wolf @ 2020-04-07 13:26 UTC (permalink / raw)
  To: Stefan Reiter
  Cc: vsementsov, t.lamprecht, slp, qemu-block, qemu-devel, mreitz,
	stefanha, jsnow, dietmar

On 07.04.2020 at 13:56, Stefan Reiter wrote:
> All callers of job_txn_apply hold a single job's lock, but different
> jobs within a transaction can have different contexts, thus we need to
> lock each one individually before applying the callback function.
> 
> Similar to job_completed_txn_abort, this also requires releasing the
> caller's context before the loop and reacquiring it afterwards, to avoid
> recursive locks which might break AIO_WAIT_WHILE in the callback. This is
> safe, since existing code would already have to take this into account;
> otherwise job_completed_txn_abort would already be broken.
> 
> This also brings to light a different issue: When a callback function in
> job_txn_apply moves its job to a different AIO context, callers will
> try to release the wrong lock (now that we re-acquire the lock
> correctly; previously it would just continue with the old lock, leaving
> the job unlocked for the rest of the return path). Fix this by not caching
> the job's context.
> 
> This is only necessary in qmp_block_job_finalize, qmp_job_finalize and
> job_exit itself, since all other callers go through job_exit.

job_cancel() doesn't go through job_exit(). It calls job_completed()
only for jobs that are not started yet, and it sets job->cancelled, so
that job_completed() takes the job_completed_txn_abort() path, which is
not changed by this patch.

I _think_ this is okay, but it shows that the whole job completion
infrastructure is becoming way too complicated. We're late for 5.0, so
let's take this patch for now, but I think we should use the 5.1 release
cycle to clean up this mess a bit.

> One test needed adapting: since it calls job_finalize directly, it needs
> to acquire the correct context manually.
> 
> Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>

Kevin




* Re: [PATCH for-5.0 v5 0/3] Fix some AIO context locking in jobs
  2020-04-07 11:56 [PATCH for-5.0 v5 0/3] Fix some AIO context locking in jobs Stefan Reiter
                   ` (2 preceding siblings ...)
  2020-04-07 11:56 ` [PATCH for-5.0 v5 3/3] backup: don't acquire aio_context in backup_clean Stefan Reiter
@ 2020-04-07 14:22 ` Kevin Wolf
  3 siblings, 0 replies; 6+ messages in thread
From: Kevin Wolf @ 2020-04-07 14:22 UTC (permalink / raw)
  To: Stefan Reiter
  Cc: vsementsov, t.lamprecht, slp, qemu-block, qemu-devel, mreitz,
	stefanha, jsnow, dietmar

On 07.04.2020 at 13:56, Stefan Reiter wrote:
> Contains three separate but related patches cleaning up and fixing some
> issues regarding aio_context_acquire/aio_context_release for jobs. AFAICT
> this mostly affects block jobs running for devices that have IO threads
> enabled.
> 
> 
> Changes from v4:
> * Do job_ref/job_unref in job_txn_apply and job_exit since we need the job to
>   survive the callback to access the potentially changed lock afterwards
> * Reduce patch 2/3 to an assert, the context should already be acquired since
>   it's a bdrv handler
> * Collect R-by for 3/3
> 
> I've marked it 'for-5.0' this time; I think it would make sense for it to
> be picked up together with Kevin's "block: Fix blk->in_flight during
> blk_wait_while_drained()" series. With that series and these three patches
> applied I can no longer reproduce any of the reported related crashes/hangs.

Thanks, applied to the block branch.

Kevin




