* [PATCH v4 0/3] Fix some AIO context locking in jobs
From: Stefan Reiter @ 2020-04-01  8:15 UTC (permalink / raw)
  To: qemu-devel, qemu-block
  Cc: kwolf, vsementsov, slp, mreitz, stefanha, jsnow, dietmar

Contains three separate but related patches cleaning up and fixing some
issues regarding aio_context_acquire/aio_context_release for jobs. This mostly
affects blockjobs running on devices that have IO threads enabled, AFAICT.

This is based on the discussions here:
https://lists.gnu.org/archive/html/qemu-devel/2020-03/msg07929.html

Changes from v3:
* commit_job appears to be unset in certain cases when replication_close is
  called, so only access it when necessary to avoid a SIGSEGV

I missed this when shuffling the patches around; sorry for the noise with the still-broken v3.

Changes from v2:
* reordered patch 1 to the end to not introduce temporary breakages
* added more fixes to job txn patch (should now pass the tests)

Changes from v1:
* fixed commit message for patch 1
* added patches 2 and 3


qemu: Stefan Reiter (3):
  job: take each job's lock individually in job_txn_apply
  replication: acquire aio context before calling job_cancel_sync
  backup: don't acquire aio_context in backup_clean

 block/backup.c        |  4 ----
 block/replication.c   |  8 +++++++-
 job.c                 | 48 ++++++++++++++++++++++++++++++++++---------
 tests/test-blockjob.c |  2 ++
 4 files changed, 47 insertions(+), 15 deletions(-)

-- 
2.26.0




* [PATCH v4 1/3] job: take each job's lock individually in job_txn_apply
From: Stefan Reiter @ 2020-04-01  8:15 UTC (permalink / raw)
  To: qemu-devel, qemu-block
  Cc: kwolf, vsementsov, slp, mreitz, stefanha, jsnow, dietmar

All callers of job_txn_apply hold a single job's lock, but different
jobs within a transaction can have different contexts, thus we need to
lock each one individually before applying the callback function.

Similar to job_completed_txn_abort this also requires releasing the
caller's context before and reacquiring it after to avoid recursive
locks which might break AIO_WAIT_WHILE in the callback.

This also brings to light a different issue: When a callback function in
job_txn_apply moves its job to a different AIO context, job_exit will
try to release the wrong lock (now that we re-acquire the lock
correctly; previously it would just continue with the old lock, leaving
the job unlocked for the rest of the code path back to job_exit). Fix
this by not caching the job's context in job_exit and adding a comment
about why this is done.

One test needed adapting: since it calls job_finalize directly, it needs
to acquire the correct context manually.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
---
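
To spell out the recursive-lock hazard mentioned above (a schematic
sketch, not actual QEMU code; ctx/cond are placeholders, and
AIO_WAIT_WHILE's locking behaviour when the main loop waits on an
IOThread's context is paraphrased in the comments):

    aio_context_acquire(ctx);    /* held by the caller of job_txn_apply  */
    aio_context_acquire(ctx);    /* per-job acquire inside the loop      */
    AIO_WAIT_WHILE(ctx, cond);   /* releases ctx only once around its    */
                                 /* poll, so the outer acquire keeps the */
                                 /* IO thread locked out and cond can    */
                                 /* never change -> deadlock             */
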
 job.c                 | 48 ++++++++++++++++++++++++++++++++++---------
 tests/test-blockjob.c |  2 ++
 2 files changed, 40 insertions(+), 10 deletions(-)

diff --git a/job.c b/job.c
index 134a07b92e..5fbaaabf78 100644
--- a/job.c
+++ b/job.c
@@ -136,17 +136,36 @@ static void job_txn_del_job(Job *job)
     }
 }
 
-static int job_txn_apply(JobTxn *txn, int fn(Job *))
+static int job_txn_apply(Job *job, int fn(Job *))
 {
-    Job *job, *next;
+    AioContext *inner_ctx;
+    Job *other_job, *next;
+    JobTxn *txn = job->txn;
     int rc = 0;
 
-    QLIST_FOREACH_SAFE(job, &txn->jobs, txn_list, next) {
-        rc = fn(job);
+    /*
+     * Similar to job_completed_txn_abort, we take each job's lock before
+     * applying fn, but since we assume that outer_ctx is held by the caller,
+     * we need to release it here to avoid holding the lock twice - which would
+     * break AIO_WAIT_WHILE from within fn.
+     */
+    aio_context_release(job->aio_context);
+
+    QLIST_FOREACH_SAFE(other_job, &txn->jobs, txn_list, next) {
+        inner_ctx = other_job->aio_context;
+        aio_context_acquire(inner_ctx);
+        rc = fn(other_job);
+        aio_context_release(inner_ctx);
         if (rc) {
             break;
         }
     }
+
+    /*
+     * Note that job->aio_context might have been changed by calling fn, so we
+     * can't use a local variable to cache it.
+     */
+    aio_context_acquire(job->aio_context);
     return rc;
 }
 
@@ -774,11 +793,11 @@ static void job_do_finalize(Job *job)
     assert(job && job->txn);
 
     /* prepare the transaction to complete */
-    rc = job_txn_apply(job->txn, job_prepare);
+    rc = job_txn_apply(job, job_prepare);
     if (rc) {
         job_completed_txn_abort(job);
     } else {
-        job_txn_apply(job->txn, job_finalize_single);
+        job_txn_apply(job, job_finalize_single);
     }
 }
 
@@ -824,10 +843,10 @@ static void job_completed_txn_success(Job *job)
         assert(other_job->ret == 0);
     }
 
-    job_txn_apply(txn, job_transition_to_pending);
+    job_txn_apply(job, job_transition_to_pending);
 
     /* If no jobs need manual finalization, automatically do so */
-    if (job_txn_apply(txn, job_needs_finalize) == 0) {
+    if (job_txn_apply(job, job_needs_finalize) == 0) {
         job_do_finalize(job);
     }
 }
@@ -849,9 +868,10 @@ static void job_completed(Job *job)
 static void job_exit(void *opaque)
 {
     Job *job = (Job *)opaque;
-    AioContext *ctx = job->aio_context;
+    AioContext *ctx;
 
-    aio_context_acquire(ctx);
+    job_ref(job);
+    aio_context_acquire(job->aio_context);
 
     /* This is a lie, we're not quiescent, but still doing the completion
      * callbacks. However, completion callbacks tend to involve operations that
@@ -862,6 +882,14 @@ static void job_exit(void *opaque)
 
     job_completed(job);
 
+    /*
+     * Note that calling job_completed can move the job to a different
+     * aio_context, so we cannot cache from above. job_txn_apply takes care of
+     * acquiring the new lock, and we ref/unref to avoid job_completed freeing
+     * the job underneath us.
+     */
+    ctx = job->aio_context;
+    job_unref(job);
     aio_context_release(ctx);
 }
 
diff --git a/tests/test-blockjob.c b/tests/test-blockjob.c
index 4eeb184caf..7519847912 100644
--- a/tests/test-blockjob.c
+++ b/tests/test-blockjob.c
@@ -367,7 +367,9 @@ static void test_cancel_concluded(void)
     aio_poll(qemu_get_aio_context(), true);
     assert(job->status == JOB_STATUS_PENDING);
 
+    aio_context_acquire(job->aio_context);
     job_finalize(job, &error_abort);
+    aio_context_release(job->aio_context);
     assert(job->status == JOB_STATUS_CONCLUDED);
 
     cancel_common(s);
-- 
2.26.0





* [PATCH v4 2/3] replication: acquire aio context before calling job_cancel_sync
From: Stefan Reiter @ 2020-04-01  8:15 UTC (permalink / raw)
  To: qemu-devel, qemu-block
  Cc: kwolf, vsementsov, slp, mreitz, stefanha, jsnow, dietmar

job_cancel_sync requires the job's lock to be held; all other callers
already do this (replication_stop, drive_backup_abort,
blockdev_backup_abort, job_cancel_sync_all, cancel_common).

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
---
 block/replication.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/block/replication.c b/block/replication.c
index 413d95407d..17ddc31569 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -144,12 +144,18 @@ fail:
 static void replication_close(BlockDriverState *bs)
 {
     BDRVReplicationState *s = bs->opaque;
+    Job *commit_job;
+    AioContext *commit_ctx;
 
     if (s->stage == BLOCK_REPLICATION_RUNNING) {
         replication_stop(s->rs, false, NULL);
     }
     if (s->stage == BLOCK_REPLICATION_FAILOVER) {
-        job_cancel_sync(&s->commit_job->job);
+        commit_job = &s->commit_job->job;
+        commit_ctx = commit_job->aio_context;
+        aio_context_acquire(commit_ctx);
+        job_cancel_sync(commit_job);
+        aio_context_release(commit_ctx);
     }
 
     if (s->mode == REPLICATION_MODE_SECONDARY) {
-- 
2.26.0





* [PATCH v4 3/3] backup: don't acquire aio_context in backup_clean
From: Stefan Reiter @ 2020-04-01  8:15 UTC (permalink / raw)
  To: qemu-devel, qemu-block
  Cc: kwolf, vsementsov, slp, mreitz, stefanha, jsnow, dietmar

All code-paths leading to backup_clean (via job_clean) have the job's
context already acquired. The job's context is guaranteed to be the same
as the one used by backup_top via backup_job_create.

The previous logic effectively acquired the lock twice, which broke
cleanup of backups for disks using IO threads: the BDRV_POLL_WHILE in
bdrv_backup_top_drop -> bdrv_do_drained_begin would only release the lock
once, thus deadlocking with the IO thread.

This is a partial revert of 0abf2581717a19.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
---

With the two previous patches applied, the commit message should now hold true.
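
To illustrate the deadlock this removes, here is a simplified sketch of
the old code path (the drain's locking behaviour is paraphrased in the
comments; it assumes the job runs in an IOThread context whose lock the
caller of job_clean already holds):

    AioContext *aio_context = bdrv_get_aio_context(s->backup_top);

    aio_context_acquire(aio_context);     /* second, recursive acquire */
    bdrv_backup_top_drop(s->backup_top);  /* drains backup_top; its
                                           * BDRV_POLL_WHILE releases the
                                           * context only once, the outer
                                           * acquire stays held, and the
                                           * IO thread never gets to run
                                           * -> the drain never finishes */
    aio_context_release(aio_context);     /* never reached */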

 block/backup.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index 7430ca5883..a7a7dcaf4c 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -126,11 +126,7 @@ static void backup_abort(Job *job)
 static void backup_clean(Job *job)
 {
     BackupBlockJob *s = container_of(job, BackupBlockJob, common.job);
-    AioContext *aio_context = bdrv_get_aio_context(s->backup_top);
-
-    aio_context_acquire(aio_context);
     bdrv_backup_top_drop(s->backup_top);
-    aio_context_release(aio_context);
 }
 
 void backup_do_checkpoint(BlockJob *job, Error **errp)
-- 
2.26.0





* Re: [PATCH v4 1/3] job: take each job's lock individually in job_txn_apply
From: Max Reitz @ 2020-04-02 12:33 UTC (permalink / raw)
  To: Stefan Reiter, qemu-devel, qemu-block
  Cc: kwolf, vsementsov, slp, stefanha, jsnow, dietmar



On 01.04.20 10:15, Stefan Reiter wrote:
> All callers of job_txn_apply hold a single job's lock, but different
> jobs within a transaction can have different contexts, thus we need to
> lock each one individually before applying the callback function.
> 
> Similar to job_completed_txn_abort this also requires releasing the
> caller's context before and reacquiring it after to avoid recursive
> locks which might break AIO_WAIT_WHILE in the callback.
> 
> This also brings to light a different issue: When a callback function in
> job_txn_apply moves its job to a different AIO context, job_exit will
> try to release the wrong lock (now that we re-acquire the lock
> correctly; previously it would just continue with the old lock, leaving
> the job unlocked for the rest of the code path back to job_exit). Fix
> this by not caching the job's context in job_exit and adding a comment
> about why this is done.
> 
> One test needed adapting: since it calls job_finalize directly, it needs
> to acquire the correct context manually.
> 
> Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
> ---
>  job.c                 | 48 ++++++++++++++++++++++++++++++++++---------
>  tests/test-blockjob.c |  2 ++
>  2 files changed, 40 insertions(+), 10 deletions(-)
> 
> diff --git a/job.c b/job.c
> index 134a07b92e..5fbaaabf78 100644
> --- a/job.c
> +++ b/job.c
> @@ -136,17 +136,36 @@ static void job_txn_del_job(Job *job)
>      }
>  }
>  
> -static int job_txn_apply(JobTxn *txn, int fn(Job *))
> +static int job_txn_apply(Job *job, int fn(Job *))
>  {
> -    Job *job, *next;
> +    AioContext *inner_ctx;
> +    Job *other_job, *next;
> +    JobTxn *txn = job->txn;
>      int rc = 0;
>  
> -    QLIST_FOREACH_SAFE(job, &txn->jobs, txn_list, next) {
> -        rc = fn(job);
> +    /*
> +     * Similar to job_completed_txn_abort, we take each job's lock before
> +     * applying fn, but since we assume that outer_ctx is held by the caller,
> +     * we need to release it here to avoid holding the lock twice - which would
> +     * break AIO_WAIT_WHILE from within fn.
> +     */
> +    aio_context_release(job->aio_context);

Hm, is it safe to drop the lock here and then reacquire it later?  I.e.,
is the job in a consistent state in between?  I don’t know.  Looks like
it.  Maybe someone else knows better.

(I find the job code rather confusing.)

> +
> +    QLIST_FOREACH_SAFE(other_job, &txn->jobs, txn_list, next) {
> +        inner_ctx = other_job->aio_context;
> +        aio_context_acquire(inner_ctx);
> +        rc = fn(other_job);
> +        aio_context_release(inner_ctx);
>          if (rc) {
>              break;
>          }
>      }
> +
> +    /*
> +     * Note that job->aio_context might have been changed by calling fn, so we
> +     * can't use a local variable to cache it.
> +     */
> +    aio_context_acquire(job->aio_context);

But all callers of job_txn_apply() (except job_exit(), which you fix in
this patch) cache it.  Won’t they release the wrong context then?  Or is
a context change only possible when job_txn_apply() is called from
job_exit()?

Max




* Re: [PATCH v4 2/3] replication: acquire aio context before calling job_cancel_sync
From: Max Reitz @ 2020-04-02 12:41 UTC (permalink / raw)
  To: Stefan Reiter, qemu-devel, qemu-block
  Cc: kwolf, vsementsov, slp, stefanha, jsnow, dietmar



On 01.04.20 10:15, Stefan Reiter wrote:
> job_cancel_sync requires the job's lock to be held; all other callers
> already do this (replication_stop, drive_backup_abort,
> blockdev_backup_abort, job_cancel_sync_all, cancel_common).

I think all other callers come directly from QMP, though, so they have
no locks yet.  This OTOH is called from a block driver function, so I
would assume the BDS context is locked already (or rather, this is
executed in the BDS context).

I also think that the commit job runs in the same context.  So I would
assume that this would be a nested lock, which should be unnecessary and
might cause problems.  Maybe we should just assert that the job’s
context is the current context?

(Or would that still be problematic because now job_txn_apply() wants to
release some context, and that isn’t possible without this patch?  I
would hope it’s possible, because I think we shouldn’t have to acquire
the current context, and should be free to release it...?  I have no
idea.  Maybe we are actually free to reacquire the current context here.)

> Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
> ---
>  block/replication.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/block/replication.c b/block/replication.c
> index 413d95407d..17ddc31569 100644
> --- a/block/replication.c
> +++ b/block/replication.c
> @@ -144,12 +144,18 @@ fail:
>  static void replication_close(BlockDriverState *bs)
>  {
>      BDRVReplicationState *s = bs->opaque;
> +    Job *commit_job;
> +    AioContext *commit_ctx;
>  
>      if (s->stage == BLOCK_REPLICATION_RUNNING) {
>          replication_stop(s->rs, false, NULL);
>      }
>      if (s->stage == BLOCK_REPLICATION_FAILOVER) {
> -        job_cancel_sync(&s->commit_job->job);
> +        commit_job = &s->commit_job->job;
> +        commit_ctx = commit_job->aio_context;
> +        aio_context_acquire(commit_ctx);
> +        job_cancel_sync(commit_job);
> +        aio_context_release(commit_ctx);

Anyway, there’s also the problem that I would guess the
job_cancel_sync() might move the job from its current context back into
the main context.  Then we’d release the wrong context here.

Max

>      }
>  
>      if (s->mode == REPLICATION_MODE_SECONDARY) {
> 





* Re: [PATCH v4 0/3] Fix some AIO context locking in jobs
From: Kevin Wolf @ 2020-04-02 12:48 UTC (permalink / raw)
  To: Stefan Reiter
  Cc: vsementsov, slp, qemu-block, qemu-devel, mreitz, stefanha, jsnow,
	dietmar

Am 01.04.2020 um 10:15 hat Stefan Reiter geschrieben:
> Contains three separate but related patches cleaning up and fixing some
> issues regarding aio_context_acquire/aio_context_release for jobs. This mostly
> affects blockjobs running on devices that have IO threads enabled, AFAICT.
> 
> This is based on the discussions here:
> https://lists.gnu.org/archive/html/qemu-devel/2020-03/msg07929.html

I'm getting segfaults in some qemu-iotests cases:

    Failures: 155 219 245 255 257 258

This is the backtrace of one of the coredumps I got; it looks like a
use-after-free:

(gdb) bt
#0  0x000055bad36ce4dc in qemu_mutex_lock_impl (mutex=0xebebebebebebec4b, file=0x55bad38c5cbf "util/async.c", line=596) at util/qemu-thread-posix.c:76
#1  0x000055bad35d4f4f in job_txn_apply (fn=0x55bad35d58b0 <job_finalize_single>, job=<optimized out>, job=<optimized out>) at job.c:168
#2  0x000055bad33aa807 in qmp_job_finalize (id=<optimized out>, errp=errp@entry=0x7ffff6a2ad68) at job-qmp.c:117
#3  0x000055bad357fabb in qmp_marshal_job_finalize (args=<optimized out>, ret=<optimized out>, errp=0x7ffff6a2adc8) at qapi/qapi-commands-job.c:204
#4  0x000055bad367f688 in qmp_dispatch (cmds=0x55bad3df2880 <qmp_commands>, request=<optimized out>, allow_oob=<optimized out>) at qapi/qmp-dispatch.c:155
#5  0x000055bad355bfb1 in monitor_qmp_dispatch (mon=0x55bad5b0d2f0, req=<optimized out>) at monitor/qmp.c:145
#6  0x000055bad355c79a in monitor_qmp_bh_dispatcher (data=<optimized out>) at monitor/qmp.c:234
#7  0x000055bad36c7ea5 in aio_bh_call (bh=0x55bad58fa2b0) at util/async.c:164
#8  0x000055bad36c7ea5 in aio_bh_poll (ctx=ctx@entry=0x55bad58f8ee0) at util/async.c:164
#9  0x000055bad36cb52e in aio_dispatch (ctx=0x55bad58f8ee0) at util/aio-posix.c:380
#10 0x000055bad36c7d8e in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:298
#11 0x00007fa3a3f7a06d in g_main_context_dispatch () at /lib64/libglib-2.0.so.0
#12 0x000055bad36ca798 in glib_pollfds_poll () at util/main-loop.c:219
#13 0x000055bad36ca798 in os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:242
#14 0x000055bad36ca798 in main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:518
#15 0x000055bad3340559 in qemu_main_loop () at /home/kwolf/source/qemu/softmmu/vl.c:1664
#16 0x000055bad322993e in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at /home/kwolf/source/qemu/softmmu/main.c:49
(gdb) p *job
$3 = {id = 0xebebebebebebebeb <error: Cannot access memory at address 0xebebebebebebebeb>, driver = 0xebebebebebebebeb, refcnt = -336860181, status = 3958107115, 
  aio_context = 0xebebebebebebebeb, co = 0xebebebebebebebeb, sleep_timer = {expire_time = -1446803456761533461, timer_list = 0xebebebebebebebeb, cb = 0xebebebebebebebeb, 
    opaque = 0xebebebebebebebeb, next = 0xebebebebebebebeb, attributes = -336860181, scale = -336860181}, pause_count = -336860181, busy = 235, paused = 235, user_paused = 235, 
  cancelled = 235, force_cancel = 235, deferred_to_main_loop = 235, auto_finalize = 235, auto_dismiss = 235, progress = {current = 16999940616948018155, total = 16999940616948018155}, 
  ret = -336860181, err = 0xebebebebebebebeb, cb = 0xebebebebebebebeb, opaque = 0xebebebebebebebeb, on_finalize_cancelled = {notifiers = {lh_first = 0xebebebebebebebeb}}, 
  on_finalize_completed = {notifiers = {lh_first = 0xebebebebebebebeb}}, on_pending = {notifiers = {lh_first = 0xebebebebebebebeb}}, on_ready = {notifiers = {lh_first = 
    0xebebebebebebebeb}}, on_idle = {notifiers = {lh_first = 0xebebebebebebebeb}}, job_list = {le_next = 0xebebebebebebebeb, le_prev = 0xebebebebebebebeb}, txn = 0xebebebebebebebeb, 
  txn_list = {le_next = 0xebebebebebebebeb, le_prev = 0xebebebebebebebeb}}

Kevin




* Re: [PATCH v4 3/3] backup: don't acquire aio_context in backup_clean
From: Max Reitz @ 2020-04-02 12:59 UTC (permalink / raw)
  To: Stefan Reiter, qemu-devel, qemu-block
  Cc: kwolf, vsementsov, slp, stefanha, jsnow, dietmar



On 01.04.20 10:15, Stefan Reiter wrote:
> All code-paths leading to backup_clean (via job_clean) have the job's
> context already acquired. The job's context is guaranteed to be the same
> as the one used by backup_top via backup_job_create.
> 
> The previous logic effectively acquired the lock twice, which broke
> cleanup of backups for disks using IO threads: the BDRV_POLL_WHILE in
> bdrv_backup_top_drop -> bdrv_do_drained_begin would only release the lock
> once, thus deadlocking with the IO thread.
> 
> This is a partial revert of 0abf2581717a19.
> 
> Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
> ---
> 
> With the two previous patches applied, the commit message should now hold true.
> 
>  block/backup.c | 4 ----
>  1 file changed, 4 deletions(-)

Reviewed-by: Max Reitz <mreitz@redhat.com>




* Re: [PATCH v4 1/3] job: take each job's lock individually in job_txn_apply
From: Stefan Reiter @ 2020-04-02 15:04 UTC (permalink / raw)
  To: Max Reitz, kwolf
  Cc: vsementsov, slp, qemu-block, qemu-devel, stefanha, jsnow, dietmar

On 02/04/2020 14:33, Max Reitz wrote:
> On 01.04.20 10:15, Stefan Reiter wrote:
>> All callers of job_txn_apply hold a single job's lock, but different
>> jobs within a transaction can have different contexts, thus we need to
>> lock each one individually before applying the callback function.
>>
>> Similar to job_completed_txn_abort this also requires releasing the
>> caller's context before and reacquiring it after to avoid recursive
>> locks which might break AIO_WAIT_WHILE in the callback.
>>
>> This also brings to light a different issue: When a callback function in
>> job_txn_apply moves its job to a different AIO context, job_exit will
>> try to release the wrong lock (now that we re-acquire the lock
>> correctly; previously it would just continue with the old lock, leaving
>> the job unlocked for the rest of the code path back to job_exit). Fix
>> this by not caching the job's context in job_exit and adding a comment
>> about why this is done.
>>
>> One test needed adapting: since it calls job_finalize directly, it needs
>> to acquire the correct context manually.
>>
>> Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
>> ---
>>   job.c                 | 48 ++++++++++++++++++++++++++++++++++---------
>>   tests/test-blockjob.c |  2 ++
>>   2 files changed, 40 insertions(+), 10 deletions(-)
>>
>> diff --git a/job.c b/job.c
>> index 134a07b92e..5fbaaabf78 100644
>> --- a/job.c
>> +++ b/job.c
>> @@ -136,17 +136,36 @@ static void job_txn_del_job(Job *job)
>>       }
>>   }
>>   
>> -static int job_txn_apply(JobTxn *txn, int fn(Job *))
>> +static int job_txn_apply(Job *job, int fn(Job *))
>>   {
>> -    Job *job, *next;
>> +    AioContext *inner_ctx;
>> +    Job *other_job, *next;
>> +    JobTxn *txn = job->txn;
>>       int rc = 0;
>>   
>> -    QLIST_FOREACH_SAFE(job, &txn->jobs, txn_list, next) {
>> -        rc = fn(job);
>> +    /*
>> +     * Similar to job_completed_txn_abort, we take each job's lock before
>> +     * applying fn, but since we assume that outer_ctx is held by the caller,
>> +     * we need to release it here to avoid holding the lock twice - which would
>> +     * break AIO_WAIT_WHILE from within fn.
>> +     */
>> +    aio_context_release(job->aio_context);
> 
> Hm, is it safe to drop the lock here and then reacquire it later?  I.e.,
> is the job in a consistent state in between?  I don’t know.  Looks like
> it.  Maybe someone else knows better.
>

I would have said so too, but the iotest segfaults Kevin discovered (I 
can reproduce them after a dozen or so cycles) seem to be triggered by 
some kind of race, which I can only imagine being caused by it not being 
safe to drop the lock.

It can be resolved by doing a job_ref/unref in job_txn_apply, but that
might only be fixing the symptoms and ignoring the problem.
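
For illustration, a rough sketch of that idea (untested, just to make the
suggestion concrete - the ref pins the caller's job across the window in
which its context lock is dropped):

    static int job_txn_apply(Job *job, int fn(Job *))
    {
        ...
        job_ref(job);
        aio_context_release(job->aio_context);

        QLIST_FOREACH_SAFE(other_job, &txn->jobs, txn_list, next) {
            ...
        }

        /* job->aio_context may have been changed by fn() */
        aio_context_acquire(job->aio_context);
        job_unref(job);
        return rc;
    }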

> (I find the job code rather confusing.)
> 

That seems to be common around here ;)

>> +
>> +    QLIST_FOREACH_SAFE(other_job, &txn->jobs, txn_list, next) {
>> +        inner_ctx = other_job->aio_context;
>> +        aio_context_acquire(inner_ctx);

Ignoring the fact that fn() can change a job's lock, one idea I had was 
to do a

   if (inner_ctx != job->aio_context) {
       aio_context_acquire(inner_ctx);
   }

instead of releasing the lock beforehand.
However, that gets complicated when the job's context does change in fn.

>> +        rc = fn(other_job);
>> +        aio_context_release(inner_ctx);
>>           if (rc) {
>>               break;
>>           }
>>       }
>> +
>> +    /*
>> +     * Note that job->aio_context might have been changed by calling fn, so we
>> +     * can't use a local variable to cache it.
>> +     */
>> +    aio_context_acquire(job->aio_context);
> 
> But all callers of job_txn_apply() (except job_exit(), which you fix in
> this patch) cache it.  Won’t they release the wrong context then?  Or is
> a context change only possible when job_txn_apply() is called from
> job_exit()?
> 

You're right that it can probably change for other callers as well 
(though at least it doesn't seem to currently, since no other test cases 
fail?). Looking at the analysis Vladimir did [0], there are a few places 
where that would need fixing.

The issue is that all of these places would also need the job_ref/unref 
part AFAICT, which would make this a bit unwieldy...

[0] https://lists.gnu.org/archive/html/qemu-devel/2020-03/msg07867.html

> Max
> 




* Re: [PATCH v4 2/3] replication: acquire aio context before calling job_cancel_sync
From: Stefan Reiter @ 2020-04-02 15:05 UTC (permalink / raw)
  To: Max Reitz
  Cc: kwolf, vsementsov, slp, qemu-block, qemu-devel, stefanha, jsnow, dietmar

On 02/04/2020 14:41, Max Reitz wrote:
> On 01.04.20 10:15, Stefan Reiter wrote:
>> job_cancel_sync requires the job's lock to be held; all other callers
>> already do this (replication_stop, drive_backup_abort,
>> blockdev_backup_abort, job_cancel_sync_all, cancel_common).
> 
> I think all other callers come directly from QMP, though, so they have
> no locks yet.  This OTOH is called from a block driver function, so I
> would assume the BDS context is locked already (or rather, this is
> executed in the BDS context).
> 
> I also think that the commit job runs in the same context.  So I would
> assume that this would be a nested lock, which should be unnecessary and
> might cause problems.  Maybe we should just assert that the job’s
> context is the current context?
> 
> (Or would that still be problematic because now job_txn_apply() wants to
> release some context, and that isn’t possible without this patch?  I
> would hope it’s possible, because I think we shouldn’t have to acquire
> the current context, and should be free to release it...?  I have no
> idea.  Maybe we are actually free to reacquire the current context here.)
> 

You're right, this seems to be unnecessary. Adding an

   assert(commit_job->aio_context == qemu_get_current_aio_context());

to make sure seems like a good idea though. bdrv_close appears to always 
have the lock on its driver's context held.
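
So the hunk could then end up looking roughly like this (just a sketch,
assuming the "lock is already held here" reasoning above is correct):

    if (s->stage == BLOCK_REPLICATION_FAILOVER) {
        commit_job = &s->commit_job->job;
        assert(commit_job->aio_context == qemu_get_current_aio_context());
        job_cancel_sync(commit_job);
    }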

>> Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
>> ---
>>   block/replication.c | 8 +++++++-
>>   1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/block/replication.c b/block/replication.c
>> index 413d95407d..17ddc31569 100644
>> --- a/block/replication.c
>> +++ b/block/replication.c
>> @@ -144,12 +144,18 @@ fail:
>>   static void replication_close(BlockDriverState *bs)
>>   {
>>       BDRVReplicationState *s = bs->opaque;
>> +    Job *commit_job;
>> +    AioContext *commit_ctx;
>>   
>>       if (s->stage == BLOCK_REPLICATION_RUNNING) {
>>           replication_stop(s->rs, false, NULL);
>>       }
>>       if (s->stage == BLOCK_REPLICATION_FAILOVER) {
>> -        job_cancel_sync(&s->commit_job->job);
>> +        commit_job = &s->commit_job->job;
>> +        commit_ctx = commit_job->aio_context;
>> +        aio_context_acquire(commit_ctx);
>> +        job_cancel_sync(commit_job);
>> +        aio_context_release(commit_ctx);
> 
> Anyway, there’s also the problem that I would guess the
> job_cancel_sync() might move the job from its current context back into
> the main context.  Then we’d release the wrong context here.
> 
> Max
> 
>>       }
>>   
>>       if (s->mode == REPLICATION_MODE_SECONDARY) {
>>
> 
> 




* Re: [PATCH v4 0/3] Fix some AIO context locking in jobs
From: John Snow @ 2020-04-06 20:11 UTC (permalink / raw)
  To: Kevin Wolf, Stefan Reiter
  Cc: vsementsov, slp, qemu-block, qemu-devel, mreitz, stefanha, dietmar



On 4/2/20 8:48 AM, Kevin Wolf wrote:
> Am 01.04.2020 um 10:15 hat Stefan Reiter geschrieben:
>> Contains three separate but related patches cleaning up and fixing some
>> issues regarding aio_context_acquire/aio_context_release for jobs. This mostly
>> affects blockjobs running on devices that have IO threads enabled, AFAICT.
>>
>> This is based on the discussions here:
>> https://lists.gnu.org/archive/html/qemu-devel/2020-03/msg07929.html
> 
> I'm getting segfaults in some qemu-iotests cases:
> 
>     Failures: 155 219 245 255 257 258
> 
> This is the backtrace of one of the coredumps I got, looks like use
> after free:
> 

fwiw, this appears to be the case after the very first patch, for all six tests.

--js



