All of lore.kernel.org
 help / color / mirror / Atom feed
From: Paolo Bonzini <pbonzini@redhat.com>
To: Kevin Wolf <kwolf@redhat.com>, Fam Zheng <famz@redhat.com>
Cc: qemu-block@nongnu.org, qemu-devel@nongnu.org,
	Max Reitz <mreitz@redhat.com>,
	Stefan Hajnoczi <stefanha@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v2 5/6] coroutine: Explicitly specify AioContext when entering coroutine
Date: Mon, 10 Apr 2017 08:56:00 +0800	[thread overview]
Message-ID: <f28a47ae-4c65-8b56-d7c0-d0298f03c3f7@redhat.com> (raw)
In-Reply-To: <20170407151445.GF4716@noname.redhat.com>



On 07/04/2017 23:14, Kevin Wolf wrote:
> One part of the reason is that
> BDRV_POLL_WHILE() drops the lock since commit c9d1a561, so just calling
> aio_context_acquire() doesn't protect you any more if there is any
> chance that a nested function calls BDRV_POLL_WHILE().

Hi guys, sorry for the late reply.

There wasn't actually much that changed in commit c9d1a561.  Everything
that could break was also broken before.

With commit c9d1a561 the logic is

	aio_context_acquire
	prepare I/O
	aio_context_release
	poll main AioContext
				aio_context_acquire
				aio_poll secondary AioContext
				complete I/O
			    <-- bdrv_wakeup
	aio_context_acquire
				aio_context_release
	Sync I/O is complete

while before it was

	aio_context_acquire
	prepare I/O
	aio_poll secondary AioContext
	complete I/O
	Sync I/O is complete

The code that can run in "aio_poll secondary AioContext" is the same in
both cases.  The difference is that, after c9d1a561, it always runs in
one thread which eliminates the need for RFifoLock's contention
callbacks (and a bunch of bdrv_drain_all deadlocks that arose from the
combination of contention callbacks and recursive locking).

The patch that introduced the bug is the one that introduced the "home
AioContext" co->ctx for coroutines, because now you have

	aio_context_acquire
	prepare I/O
	aio_context_release
	poll main AioContext
				aio_context_acquire
				aio_poll secondary AioContext
				aio_co_wake
	complete I/O
	bdrv_wakeup
	aio_context_acquire
				aio_context_release
	Sync I/O is complete

and if "complete I/O" causes anything to happen in the iothread, bad
things happen.

I think Fam's analysis is right.  This patch will hopefully be reverted
in 2.10, but right now it's the right thing to do.  However, I don't
like very much adding an argument to qemu_coroutine_enter because:

1) coroutines can be used without AioContexts (CoMutex has a dependency
on aio_co_wake, but it is hidden behind the API and if you have a single
AioContext you could just define aio_co_wake to be qemu_coroutine_enter).

2) the value that is assigned to co->ctx is not the AioContext in which
the coroutine is running.

It would be better to split aio_co_wake so that aio_co_wake is

    /* Read coroutine before co->ctx.  Matches smp_wmb in
     * qemu_coroutine_enter.
     */
    smp_read_barrier_depends();
    ctx = atomic_read(&co->ctx);
    aio_co_enter(ctx, co);

and aio_co_enter is the rest of the logic.

Then the coroutine code always runs in the right thread, which obviously
is thread-safe.

Paolo

  reply	other threads:[~2017-04-10  0:56 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-04-07  6:54 [Qemu-devel] [PATCH v2 0/6] block: Fixes regarding dataplane and management operations Fam Zheng
2017-04-07  6:54 ` [Qemu-devel] [PATCH v2 1/6] block: Fix unpaired aio_disable_external in external snapshot Fam Zheng
2017-04-07 13:23   ` Stefan Hajnoczi
2017-04-07  6:54 ` [Qemu-devel] [PATCH v2 2/6] block: Assert attached child node has right aio context Fam Zheng
2017-04-07 13:24   ` Stefan Hajnoczi
2017-04-07  6:54 ` [Qemu-devel] [PATCH v2 3/6] mirror: Fix aio context of mirror_top_bs Fam Zheng
2017-04-07 13:24   ` Stefan Hajnoczi
2017-04-07  6:54 ` [Qemu-devel] [PATCH v2 4/6] block: Quiesce old aio context during bdrv_set_aio_context Fam Zheng
2017-04-07 12:50   ` Stefan Hajnoczi
2017-04-08  3:43     ` Fam Zheng
2017-04-10  8:06       ` Kevin Wolf
2017-04-10  8:45         ` Fam Zheng
2017-04-10  9:11           ` Kevin Wolf
2017-04-07  6:54 ` [Qemu-devel] [PATCH v2 5/6] coroutine: Explicitly specify AioContext when entering coroutine Fam Zheng
2017-04-07  9:57   ` Fam Zheng
2017-04-07 13:26   ` Stefan Hajnoczi
2017-04-08  3:27     ` Fam Zheng
2017-04-07 14:07   ` Eric Blake
2017-04-07 15:14   ` Kevin Wolf
2017-04-10  0:56     ` Paolo Bonzini [this message]
2017-04-10  1:43       ` Fam Zheng
2017-04-07  6:54 ` [Qemu-devel] [PATCH v2 6/6] tests/block-job-txn: Don't start block job before adding to txn Fam Zheng
2017-04-07 13:28   ` Stefan Hajnoczi
2017-04-07 18:05     ` John Snow
2017-04-08  3:39       ` Fam Zheng
2017-04-07 12:45 ` [Qemu-devel] [PATCH v2 0/6] block: Fixes regarding dataplane and management operations Kevin Wolf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f28a47ae-4c65-8b56-d7c0-d0298f03c3f7@redhat.com \
    --to=pbonzini@redhat.com \
    --cc=famz@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.