All of lore.kernel.org
 help / color / mirror / Atom feed
From: Fam Zheng <famz@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	qemu-devel@nongnu.org, qemu-block@nongnu.org,
	Max Reitz <mreitz@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v2 5/6] coroutine: Explicitly specify AioContext when entering coroutine
Date: Mon, 10 Apr 2017 09:43:56 +0800	[thread overview]
Message-ID: <20170410014356.GA15038@lemon> (raw)
In-Reply-To: <f28a47ae-4c65-8b56-d7c0-d0298f03c3f7@redhat.com>

On Mon, 04/10 08:56, Paolo Bonzini wrote:
> 
> 
> On 07/04/2017 23:14, Kevin Wolf wrote:
> > One part of the reason is that
> > BDRV_POLL_WHILE() drops the lock since commit c9d1a561, so just calling
> > aio_context_acquire() doesn't protect you any more if there is any
> > chance that a nested function calls BDRV_POLL_WHILE().
> 
> Hi guys, sorry for the late reply.
> 
> There wasn't actually much that changed in commit c9d1a561.  Everything
> that could break was also broken before.
> 
> With commit c9d1a561 the logic is
> 
> 	aio_context_acquire
> 	prepare I/O
> 	aio_context_release
> 	poll main AioContext
> 				aio_context_acquire
> 				aio_poll secondary AioContext
> 				complete I/O
> 			    <-- bdrv_wakeup
> 	aio_context_acquire
> 				aio_context_release
> 	Sync I/O is complete
> 
> while before it was
> 
> 	aio_context_acquire
> 	prepare I/O
> 	aio_poll secondary AioContext
> 	complete I/O
> 	Sync I/O is complete
> 
> The code that can run in "aio_poll secondary AioContext" is the same in
> both cases.

This is not completely true. Before this fact, a blocked aio_context_acquire()
in virtio_scsi_data_plane_handle_cmd() wouldn't be woken up during the sync I/O.
Of course the aio context pushdown (9d4566544) happened after c9d1a561, so this
is a compound consequence of the two.

Also, not polling for at least once in bdrv_drain_poll if
!bdrv_requests_pending(bs), since 997235485, made the situation a bit worse -
virtio_scsi_data_plane_handle_cmd could have returned before
bdrv_drained_begin() returns if we do the BDRV_POLL_WHILE body at least once
even if cond is false.

> The difference is that, after c9d1a561, it always runs in
> one thread which eliminates the need for RFifoLock's contention
> callbacks (and a bunch of bdrv_drain_all deadlocks that arose from the
> combination of contention callbacks and recursive locking).
> 
> The patch that introduced the bug is the one that introduced the "home
> AioContext" co->ctx for coroutines, because now you have
> 
> 	aio_context_acquire
> 	prepare I/O
> 	aio_context_release
> 	poll main AioContext
> 				aio_context_acquire
> 				aio_poll secondary AioContext
> 				aio_co_wake
> 	complete I/O
> 	bdrv_wakeup
> 	aio_context_acquire
> 				aio_context_release
> 	Sync I/O is complete
> 
> and if "complete I/O" causes anything to happen in the iothread, bad
> things happen.
> 
> I think Fam's analysis is right.  This patch will hopefully be reverted
> in 2.10, but right now it's the right thing to do.  However, I don't
> like very much adding an argument to qemu_coroutine_enter because:
> 
> 1) coroutines can be used without AioContexts (CoMutex has a dependency
> on aio_co_wake, but it is hidden behind the API and if you have a single
> AioContext you could just define aio_co_wake to be qemu_coroutine_enter).
> 
> 2) the value that is assigned to co->ctx is not the AioContext in which
> the coroutine is running.
> 
> It would be better to split aio_co_wake so that aio_co_wake is
> 
>     /* Read coroutine before co->ctx.  Matches smp_wmb in
>      * qemu_coroutine_enter.
>      */
>     smp_read_barrier_depends();
>     ctx = atomic_read(&co->ctx);
>     aio_co_enter(ctx, co);
> 
> and aio_co_enter is the rest of the logic.

Is aio_co_enter what solves 1) for you too? If so I can add it in v3.

Fam

> 
> Then the coroutine code always runs in the right thread, which obviously
> is thread-safe.
> 
> Paolo
> 

  reply	other threads:[~2017-04-10  1:44 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-04-07  6:54 [Qemu-devel] [PATCH v2 0/6] block: Fixes regarding dataplane and management operations Fam Zheng
2017-04-07  6:54 ` [Qemu-devel] [PATCH v2 1/6] block: Fix unpaired aio_disable_external in external snapshot Fam Zheng
2017-04-07 13:23   ` Stefan Hajnoczi
2017-04-07  6:54 ` [Qemu-devel] [PATCH v2 2/6] block: Assert attached child node has right aio context Fam Zheng
2017-04-07 13:24   ` Stefan Hajnoczi
2017-04-07  6:54 ` [Qemu-devel] [PATCH v2 3/6] mirror: Fix aio context of mirror_top_bs Fam Zheng
2017-04-07 13:24   ` Stefan Hajnoczi
2017-04-07  6:54 ` [Qemu-devel] [PATCH v2 4/6] block: Quiesce old aio context during bdrv_set_aio_context Fam Zheng
2017-04-07 12:50   ` Stefan Hajnoczi
2017-04-08  3:43     ` Fam Zheng
2017-04-10  8:06       ` Kevin Wolf
2017-04-10  8:45         ` Fam Zheng
2017-04-10  9:11           ` Kevin Wolf
2017-04-07  6:54 ` [Qemu-devel] [PATCH v2 5/6] coroutine: Explicitly specify AioContext when entering coroutine Fam Zheng
2017-04-07  9:57   ` Fam Zheng
2017-04-07 13:26   ` Stefan Hajnoczi
2017-04-08  3:27     ` Fam Zheng
2017-04-07 14:07   ` Eric Blake
2017-04-07 15:14   ` Kevin Wolf
2017-04-10  0:56     ` Paolo Bonzini
2017-04-10  1:43       ` Fam Zheng [this message]
2017-04-07  6:54 ` [Qemu-devel] [PATCH v2 6/6] tests/block-job-txn: Don't start block job before adding to txn Fam Zheng
2017-04-07 13:28   ` Stefan Hajnoczi
2017-04-07 18:05     ` John Snow
2017-04-08  3:39       ` Fam Zheng
2017-04-07 12:45 ` [Qemu-devel] [PATCH v2 0/6] block: Fixes regarding dataplane and management operations Kevin Wolf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170410014356.GA15038@lemon \
    --to=famz@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.