On 08/22/2017 07:51 AM, Stefan Hajnoczi wrote:
> The following scenario leads to an assertion failure in
> qio_channel_yield():
>
> 1. Request coroutine calls qio_channel_yield() successfully when sending
>    would block on the socket. It is now yielded.
> 2. nbd_read_reply_entry() calls nbd_recv_coroutines_enter_all() because
>    nbd_receive_reply() failed.
> 3. Request coroutine is entered and returns from qio_channel_yield().
>    Note that the socket fd handler has not fired yet so
>    ioc->write_coroutine is still set.
> 4. Request coroutine attempts to send the request body with nbd_rwv()
>    but the socket would still block. qio_channel_yield() is called
>    again and assert(!ioc->write_coroutine) is hit.
>
> The problem is that nbd_read_reply_entry() does not distinguish between
> request coroutines that are waiting to receive a reply and those that
> are not.
>
> This patch adds a per-request bool receiving flag so
> nbd_read_reply_entry() can avoid spurious aio_wake() calls.
>
> Reported-by: Dr. David Alan Gilbert
> Signed-off-by: Stefan Hajnoczi
> ---
> This should fix the issue that Dave is seeing but I'm concerned that
> there are more problems in nbd-client.c. We don't have good
> abstractions for writing coroutine socket I/O code. Something like Go's
> channels would avoid manual low-level coroutine calls. There is
> currently no way to cancel qio_channel_yield() so requests doing I/O may
> remain in-flight indefinitely and nbd-client.c doesn't join them...

Is this patch needed for 2.10-rc4, or does Fam's series cover the issue?

--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org
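
For anyone following the thread without the diff at hand, the idea behind the per-request receiving flag can be sketched roughly as follows. This is not the code from the patch: RequestSlot, request_wait_for_reply() and wake_receiving_requests() are hypothetical names, and a plain counter stands in for the real coroutine wake-up. It only illustrates that a request blocked in the send path is skipped when the reply reader fails and wakes the waiters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one per-request slot in the NBD client. */
typedef struct {
    bool in_use;     /* slot holds an in-flight request */
    bool receiving;  /* true only while the request waits for its reply */
    int wakeups;     /* counts wake-ups; stands in for the coroutine wake-up */
} RequestSlot;

/* Called by a request once it has finished sending and parks for its reply. */
static void request_wait_for_reply(RequestSlot *req)
{
    assert(req->in_use);
    req->receiving = true;
}

/*
 * Called by the reply reader on failure. Only requests that are actually
 * waiting for a reply are woken; a request still yielded in the send path
 * (receiving == false) is left alone, so it cannot re-enter the yield with
 * the write coroutine still registered. Returns the number of wake-ups.
 */
static int wake_receiving_requests(RequestSlot *slots, size_t n)
{
    int woken = 0;

    for (size_t i = 0; i < n; i++) {
        if (slots[i].in_use && slots[i].receiving) {
            slots[i].receiving = false;
            slots[i].wakeups++;
            woken++;
        }
    }
    return woken;
}
```

With one slot still sending and one parked via request_wait_for_reply(), a call to wake_receiving_requests() touches only the parked slot, which is the distinction the quoted commit message says nbd_read_reply_entry() was missing.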