From: Max Reitz <mreitz@redhat.com>
To: John Snow <jsnow@redhat.com>,
	qemu-block@nongnu.org, qemu-devel@nongnu.org
Cc: Jeff Cody <jcody@redhat.com>,
	kwolf@redhat.com, jtc@redhat.com, Fam Zheng <famz@redhat.com>
Subject: Re: [Qemu-devel] [PATCH 5/7] block/mirror: utilize job_exit shim
Date: Wed, 29 Aug 2018 10:28:57 +0200
Message-ID: <595465e4-2450-25b1-2095-1217f34fb50f@redhat.com>
In-Reply-To: <1920b33f-6773-b224-23b6-74874641f639@redhat.com>

On 2018-08-28 22:25, John Snow wrote:
> 
> 
> On 08/25/2018 11:02 AM, Max Reitz wrote:
>> If you say so...  I have to admit I don't really understand.  The
>> comment doesn't explain why it's so important to keep src around until
>> job_completed(), so I don't know.  I thought AioContexts are recursive
>> so it doesn't matter whether you take them recursively or not.
> 
> bdrv_flush has troubles under a recursive lock if it is invoked from a
> different thread. It tries to poll for flush completion but the
> coroutine which gets scheduled (instead of entered) can't actually run
> if we hold the lock twice from, say, the main thread -- which is where
> we're doing the graph manipulation from.
> 
>>        co = qemu_coroutine_create(bdrv_flush_co_entry, &flush_co);
>>        bdrv_coroutine_enter(bs, co);
>>        BDRV_POLL_WHILE(bs, flush_co.ret == NOT_DONE);
> 
> BDRV_POLL_WHILE there causes us the grief via AIO_WAIT_WHILE, which only
> puts down one reference, so we deadlock in bdrv_flush in a recursive
> context.
> 
> Kevin helped me figure it out; I CC'd him off-list on a mail that you
> were also CC'd on ("hang in mirror_exit") that's probably pretty helpful:
> 
>> Took a little more than five minutes, but I think I've got it now. The
>> important thing is that the test case is for dataplane, i.e. the job
>> runs in an I/O thread AioContext. Job completion, however, happens in
>> the main loop thread.
>>
>> Therefore, when bdrv_delete() wants to flush the node first, it needs to
>> run bdrv_co_flush() in a foreign AioContext, so the coroutine is
>> scheduled. The I/O thread backtrace shows that it's waiting for the
>> AioContext lock before it can actually enter the bdrv_co_flush()
>> coroutine, so we must have deadlocked:

OK, that makes more sense.  I still would have thought that you should
always be allowed to take an AioContext lock more than once in a single
other AioContext, but on my way through the commit log, I found
bd6458e410c1e7, d79df2a2ceb3cb07711, and maybe most importantly
17e2a4a47d4.  So maybe not.

So at some point we decided that, yeah, you can take them multiple times
in a single context, and, yeah, that was how it was designed, but don't
do that if you expect a BDRV_POLL_WHILE().
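
To spell out the failure mode for myself, here is a toy, single-threaded
model of it. This is only a sketch, not QEMU code: ToyCtxLock and every
toy_* function are made-up names, and the "I/O thread" is reduced to a
check of whether the lock is free. The only thing it borrows from the
discussion above is that the poll step puts down exactly one reference
before waiting for the other thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a recursive context lock: only a nesting depth. */
typedef struct {
    int depth; /* how many times the main thread currently holds the lock */
} ToyCtxLock;

static void toy_acquire(ToyCtxLock *l)
{
    l->depth++;
}

static void toy_release(ToyCtxLock *l)
{
    assert(l->depth > 0);
    l->depth--;
}

/* The other thread can enter the scheduled coroutine only if the lock is free. */
static bool toy_iothread_can_run(const ToyCtxLock *l)
{
    return l->depth == 0;
}

/*
 * Models one round of the poll loop: drop ONE reference, check whether the
 * other thread could make progress, then take the reference back.
 * Returns false when polling could never finish (the deadlock case).
 */
static bool toy_poll_flush(ToyCtxLock *l)
{
    bool done;

    toy_release(l);
    done = toy_iothread_can_run(l);
    toy_acquire(l);
    return done;
}

/* Take the lock 'depth' times, poll once, then unwind again. */
static bool toy_flush_with_depth(int depth)
{
    ToyCtxLock l = { 0 };
    bool done;

    for (int i = 0; i < depth; i++) {
        toy_acquire(&l);
    }
    done = toy_poll_flush(&l);
    for (int i = 0; i < depth; i++) {
        toy_release(&l);
    }
    return done;
}
```

With the lock held once, toy_flush_with_depth(1) reports that the flush
can complete; with it held twice, toy_flush_with_depth(2) reports that
it cannot, because one reference is still held while polling.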

OK.  Got it now.

Max



