On Sat, Apr 15, 2017 at 01:10:02AM +0800, Paolo Bonzini wrote:
>
>
> On 14/04/2017 16:51, Stefan Hajnoczi wrote:
> > On Fri, Apr 14, 2017 at 9:02 AM, Fam Zheng wrote:
> >> @@ -398,11 +399,15 @@ void bdrv_drain_all(void);
> >>           */                                                \
> >>          assert(!bs_->wakeup);                              \
> >>          bs_->wakeup = true;                                \
> >> -        while ((cond)) {                                   \
> >> -            aio_context_release(ctx_);                     \
> >> -            aio_poll(qemu_get_aio_context(), true);        \
> >> -            aio_context_acquire(ctx_);                     \
> >> -            waited_ = true;                                \
> >> +        while (busy_) {                                    \
> >> +            if ((cond)) {                                  \
> >> +                waited_ = busy_ = true;                    \
> >> +                aio_context_release(ctx_);                 \
> >> +                aio_poll(qemu_get_aio_context(), true);    \
> >> +                aio_context_acquire(ctx_);                 \
> >> +            } else {                                       \
> >> +                busy_ = aio_poll(ctx_, false);             \
> >> +            }                                              \
> >
> > Wait, I'm confused.  The current thread is not in the BDS AioContext.
> > We're not allowed to call aio_poll(ctx_, false).
>
> It's pretty ugly indeed.  Strictly from a thread-safety point of view,
> everything that aio_poll calls will acquire the AioContext, so that is
> safe and in fact the release/acquire pair can be even hoisted outside
> the "if".
>
> If we did that for blocking=true in both I/O and main thread, then that
> would be racy.  This is the scenario mentioned in the commit message for
> c9d1a56 ("block: only call aio_poll on the current thread's AioContext",
> 2016-10-28).
>
> If only one thread has blocking=true, it's subject to races too.  In
> this case, the I/O thread may fail to be woken by iothread_stop's
> aio_notify.  However, by the time iothread_stop is called there should
> be no BlockDriverStates (and thus no BDRV_POLL_WHILE running the above
> code) for the I/O thread's AioContext.

The scenario I have in mind is: BDRV_POLL_WHILE() is called from the
IOThread and the main loop also invokes BDRV_POLL_WHILE().  The IOThread
is blocked in aio_poll(ctx, true).  The main loop calls
aio_poll(ctx, false) and can therefore steal the event/completion
condition.

I *think* ppoll() will return in both threads and they will race to
invoke handlers, but I'm not 100% sure that two BDRV_POLL_WHILE() calls
in parallel are safe.

What do you think?

Stefan
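
To make the "steal the event" concern concrete, here is a minimal
sketch. It is not QEMU code: a plain pipe and poll(2) stand in for the
AioContext's event source, and the helper names (fd_ready,
consume_event, run_race_model, struct race_result) are hypothetical.
It only models the level-triggered behaviour, where two pollers both
see the fd as ready but only the first handler finds an event to
consume. It says nothing about whether the real handlers tolerate that.

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* What each poller observed and whether its handler got the event. */
struct race_result {
    int ready_a, ready_b;     /* 1 if poll() reported the fd readable */
    int handled_a, handled_b; /* 1 if the handler consumed an event   */
};

/* Hypothetical stand-in for a non-blocking poll on one event source. */
static int fd_ready(int fd)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    return poll(&pfd, 1, 0) == 1 && (pfd.revents & POLLIN);
}

/* Hypothetical stand-in for a handler: consume one event if present. */
static int consume_event(int fd)
{
    char c;
    return read(fd, &c, 1) == 1;
}

/* Returns 0 on success, -1 if the pipe could not be set up. */
static int run_race_model(struct race_result *r)
{
    int fds[2];
    int flags;

    assert(r != NULL);
    if (pipe(fds) < 0) {
        return -1;
    }
    /* Non-blocking read end, so the losing handler does not hang. */
    flags = fcntl(fds[0], F_GETFL);
    if (flags < 0 || fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) < 0 ||
        write(fds[1], "x", 1) != 1) {   /* exactly one pending event */
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    /* Both pollers check before either handler runs: both see "ready". */
    r->ready_a = fd_ready(fds[0]);
    r->ready_b = fd_ready(fds[0]);

    /* The handlers then race; here A wins and B finds nothing left. */
    r->handled_a = consume_event(fds[0]);
    r->handled_b = consume_event(fds[0]);

    close(fds[0]);
    close(fds[1]);
    return 0;
}
```

The two pollers are run sequentially on purpose so the outcome is
deterministic. With real threads the winner would depend on scheduling,
which is exactly the part in question.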