On Wed, Nov 08, 2017 at 07:34:47AM +0100, Sergio Lopez wrote:
> Commit b7a745d added a qemu_bh_cancel call to the completion function
> as an optimization to prevent it from unnecessarily rescheduling itself.
>
> This completion function is scheduled from worker_thread, after setting
> the state of a ThreadPoolElement to THREAD_DONE.
>
> This was considered to be safe, as the completion function restarts the
> loop just after the call to qemu_bh_cancel. But, under certain access
> patterns and scheduling conditions, the loop may wrongly use a
> pre-fetched elem->state value, reading it as THREAD_QUEUED, and ending
> the completion function without having processed a pending TPE linked at
> pool->head:
>
>      worker thread                  |    I/O thread
> ------------------------------------------------------------------------
>                                     | speculatively read req->state
> req->state = THREAD_DONE;           |
> qemu_bh_schedule(p->completion_bh)  |
>   bh->scheduled = 1;                |
>                                     | qemu_bh_cancel(p->completion_bh)
>                                     |   bh->scheduled = 0;
>                                     | if (req->state == THREAD_DONE)
>                                     |   // sees THREAD_QUEUED
>
> The source of the misunderstanding was that qemu_bh_cancel is now being
> used by the _consumer_ rather than the producer, and therefore now needs
> to have acquire semantics just like e.g. aio_bh_poll.
>
> In some situations, if there are no other independent requests in the
> same aio context that could eventually trigger the scheduling of the
> completion function, the omitted TPE and all operations pending on it
> will get stuck forever.
>
> Signed-off-by: Sergio Lopez
> ---
>  util/async.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Stefan
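
[Editor's note: the sketch below is not the util/async.c patch itself, which is not
shown in this thread. It is a minimal, self-contained illustration of the producer/
consumer ordering the commit message describes, written with C11 <stdatomic.h>
rather than QEMU's own atomic helpers; the struct and function names (worker_complete,
completion_scan) are invented for the example.]

    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stdio.h>

    enum req_state { THREAD_QUEUED, THREAD_DONE };

    struct req {
        _Atomic enum req_state state;
    };

    struct bh {
        atomic_bool scheduled;
    };

    /* Producer (worker thread): mark the request done, then "schedule" the
     * bottom half.  The release store publishes the state update to any
     * consumer that observes or clears the scheduled flag with acquire
     * semantics. */
    static void worker_complete(struct req *r, struct bh *b)
    {
        atomic_store_explicit(&r->state, THREAD_DONE, memory_order_relaxed);
        atomic_store_explicit(&b->scheduled, true, memory_order_release);
    }

    /* Consumer (I/O thread completion function): cancel the bottom half,
     * then re-scan the request.  The acquire on the exchange forbids the
     * read of r->state from being hoisted (pre-fetched) above the clearing
     * of the flag, which is exactly the reordering shown in the diagram
     * above. */
    static bool completion_scan(struct req *r, struct bh *b)
    {
        atomic_exchange_explicit(&b->scheduled, false, memory_order_acquire);
        return atomic_load_explicit(&r->state, memory_order_relaxed) == THREAD_DONE;
    }

    /* Single-threaded demo of the API; with the acquire/release pairing
     * above, a request completed before the schedule is always seen as
     * THREAD_DONE by the scan. */
    int main(void)
    {
        struct req r = { THREAD_QUEUED };
        struct bh b = { false };

        worker_complete(&r, &b);
        printf("done: %d\n", completion_scan(&r, &b));
        return 0;
    }

A plain `bh->scheduled = 0` in the consumer, by contrast, imposes no ordering on the
subsequent state read, which is how the stale THREAD_QUEUED value in the commit
message's diagram can survive the cancel.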