On 02/08/2016 09:14 AM, Paolo Bonzini wrote:
> aio_poll is not thread safe; it can report progress incorrectly when
> called from the main thread. The bug remains latent as long as
> all of it is called within aio_context_acquire/aio_context_release,
> but this will change soon.
>
> The details of the bug are pretty simple, but fixing it in an
> efficient way is thorny. There are plenty of comments and formal
> models in the patch, so I will refer to it.
>
> Signed-off-by: Paolo Bonzini
> ---
> +++ b/async.c
> @@ -300,12 +300,224 @@ void aio_notify_accept(AioContext *ctx)
>      }
>  }
>
> +/* aio_poll_internal is not thread-safe; it only reports progress
> + * correctly when called from one thread, because it has no
> + * history of what happened in different threads. When called
> + * from two threads, there is a race:
> + *
> + *   main thread                        I/O thread
> + *   -----------------------            --------------------------
> + *   blk_drain
> + *     bdrv_requests_pending -> true
> + *                                      aio_poll_internal
> + *                                        process last request
> + *     aio_poll_internal
> + *
> + * Now aio_poll_internal will never exit, because there is no pending
> + * I/O on the AioContext.
> + *
> + * Therefore, aio_poll is a wrapper around aio_poll_internal that allows
> + * usage from _two_ threads: the I/O thread of course, and the main thread.
> + * When called from the main thread, aio_poll just asks the I/O thread
> + * for a nudge as soon as the next call to aio_poll is complete.
> + * Because we use QemuEvent, and QemuEvent supports a single consumer
> + * only, this only works when the calling thread holds the big QEMU lock.
> + *
> + * Because aio_poll is used in a loop, spurious wakeups are okay.
> + * Therefore, the I/O thread calls qemu_event_set very liberally
> + * (it helps that qemu_event_set is cheap on an already-set event).
> + * generally used in a loop, it's okay to have spurious wakeups.

Incomplete sentence due to bad rebase leftovers?
> + * Similarly it is okay to return true when no progress was made
> + * (as long as this doesn't happen forever, or you get livelock).
> + *
> +

--
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org