From: Stefan Hajnoczi
Date: Wed, 23 Aug 2017 15:26:34 +0100
Subject: Re: [Qemu-devel] [Qemu-block] [PATCH] nbd-client: avoid spurious qio_channel_yield() re-entry
To: Eric Blake
Cc: Stefan Hajnoczi, qemu-devel, Paolo Bonzini, "Dr. David Alan Gilbert", qemu block
References: <20170822125113.5025-1-stefanha@redhat.com>

On Wed, Aug 23, 2017 at 3:20 PM, Eric Blake wrote:
> On 08/22/2017 07:51 AM, Stefan Hajnoczi wrote:
>> The following scenario leads to an assertion failure in
>> qio_channel_yield():
>>
>> 1. Request coroutine calls qio_channel_yield() successfully when sending
>>    would block on the socket.  It is now yielded.
>> 2. nbd_read_reply_entry() calls nbd_recv_coroutines_enter_all() because
>>    nbd_receive_reply() failed.
>> 3. Request coroutine is entered and returns from qio_channel_yield().
>>    Note that the socket fd handler has not fired yet so
>>    ioc->write_coroutine is still set.
>> 4. Request coroutine attempts to send the request body with nbd_rwv()
>>    but the socket would still block.  qio_channel_yield() is called
>>    again and assert(!ioc->write_coroutine) is hit.
>>
>> The problem is that nbd_read_reply_entry() does not distinguish between
>> request coroutines that are waiting to receive a reply and those that
>> are not.
>>
>> This patch adds a per-request bool receiving flag so
>> nbd_read_reply_entry() can avoid spurious aio_wake() calls.
>>
>> Reported-by: Dr. David Alan Gilbert
>> Signed-off-by: Stefan Hajnoczi
>> ---
>> This should fix the issue that Dave is seeing but I'm concerned that
>> there are more problems in nbd-client.c.  We don't have good
>> abstractions for writing coroutine socket I/O code.  Something like Go's
>> channels would avoid manual low-level coroutine calls.  There is
>> currently no way to cancel qio_channel_yield() so requests doing I/O may
>> remain in-flight indefinitely and nbd-client.c doesn't join them...
>
> Is this patch needed for 2.10-rc4, or does Fam's series cover the issue?

Fam's series fixes non-shared storage migration.  This patch addresses
the failure case when the server closes the connection prematurely.

Stefan
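
For readers following the thread, the per-request receiving flag
described in the quoted commit message can be sketched roughly as below.
This is a simplified, self-contained model and not the actual
nbd-client.c code: RequestSlot, MAX_REQUESTS, wake_all() and
wake_receivers() are hypothetical names invented for the illustration,
and the wakeups counter merely stands in for re-entering a coroutine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_REQUESTS 4   /* hypothetical table size for this sketch */

/* Hypothetical stand-in for a per-request slot; not QEMU's real struct. */
typedef struct {
    bool in_use;     /* a request coroutine owns this slot */
    bool receiving;  /* true only while that coroutine waits for a reply */
    int wakeups;     /* counts wake-ups; models entering the coroutine */
} RequestSlot;

/*
 * Model of the behaviour described as problematic: every in-use slot is
 * woken, including coroutines that are still yielded in the send path.
 */
int wake_all(RequestSlot *slots, size_t n)
{
    int woken = 0;

    assert(slots != NULL);
    for (size_t i = 0; i < n; i++) {
        if (slots[i].in_use) {
            slots[i].wakeups++;
            woken++;
        }
    }
    return woken;
}

/*
 * Model of the fix: only slots whose owner is waiting for a reply are
 * woken, so a coroutine blocked on sending is left alone.
 */
int wake_receivers(RequestSlot *slots, size_t n)
{
    int woken = 0;

    assert(slots != NULL);
    for (size_t i = 0; i < n; i++) {
        if (slots[i].in_use && slots[i].receiving) {
            slots[i].wakeups++;
            woken++;
        }
    }
    return woken;
}
```

With one slot still sending and another waiting for a reply,
wake_all() touches both slots while wake_receivers() touches only the
second one.  In the real code that difference is what would keep the
sending coroutine parked in qio_channel_yield() until the socket fd
handler fires.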