From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S968040AbcHBQ6s (ORCPT ); Tue, 2 Aug 2016 12:58:48 -0400
Received: from mx1.redhat.com ([209.132.183.28]:57272 "EHLO mx1.redhat.com"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S935644AbcHBQ4k (ORCPT ); Tue, 2 Aug 2016 12:56:40 -0400
Date: Tue, 2 Aug 2016 19:49:44 +0300
From: "Michael S. Tsirkin"
To: Cornelia Huck
Cc: Vegard Nossum, Eric Van Hensbergen, "Aneesh Kumar K.V",
        v9fs-developer@lists.sourceforge.net, LKML
Subject: Re: Hang in 9p/virtio
Message-ID: <20160802194725-mutt-send-email-mst@kernel.org>
References: <579D1F3A.7020806@oracle.com>
        <20160802110321.1fc7369f.cornelia.huck@de.ibm.com>
        <57A06420.3070508@oracle.com>
        <57A0A1A6.3000504@oracle.com>
        <20160802183502.5716e415.cornelia.huck@de.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20160802183502.5716e415.cornelia.huck@de.ibm.com>
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16
        (mx1.redhat.com [10.5.110.32]); Tue, 02 Aug 2016 16:49:49 +0000 (UTC)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Aug 02, 2016 at 06:35:02PM +0200, Cornelia Huck wrote:
> On Tue, 2 Aug 2016 15:35:34 +0200
> Vegard Nossum wrote:
>
> > On 08/02/2016 11:13 AM, Vegard Nossum wrote:
> > > On 08/02/2016 11:03 AM, Cornelia Huck wrote:
> > >> On Sat, 30 Jul 2016 23:42:18 +0200
> > >> Vegard Nossum wrote:
> > >>
> > >>> Hi,
> > >>>
> > >>> With fault injection triggering an allocation failure for the
> > >>> alloc_indirect() call in virtqueue_add() I'm seeing a hang in
> > >>> p9_virtio_zc_request() -- it seems to be waiting here indefinitely
> > >>> (i.e. at least 120 seconds):
> > >>>
> > > [...]
> > >
> > >> What happens is that the code falls back to direct virtio addressing
> > >> (after indirect addressing failed) - and this should work.
> > >>
> > >> I'm more inclined to suspect a qemu instead of a kernel bug, as your
> > >> qemu version is quite old and there have been fixes in the virtio
> > >> buffer handling and virtio-9p in the meantime. (I'm suspecting
> > >> "virtio-9p: fix any_layout".)
> > >>
> > >> Could you retry with a more recent qemu (at least version 2.4)?
> > >
> > > I think maybe the version number in the stack trace is a bit misleading,
> > > this is the full/actual version:
> > >
> > > $ kvm --version
> > > QEMU emulator version 2.5.0 (Debian 1:2.5+dfsg-5ubuntu10.1), Copyright
> > > (c) 2003-2008 Fabrice Bellard
> > >
> > > I'll still try to get qemu from git and see if it makes a difference.
> > > Thanks,
> >
> > I still seem to get it:
> >
> > $ qemu-system-x86_64 --version
> > QEMU emulator version 2.6.91 (v2.7.0-rc1-2-gcc0100f-dirty), Copyright
> > (c) 2003-2008 Fabrice Bellard
>
> :(
>
> Sorry, no good immediate idea.
>
> One thing would be to check whether you get notified by qemu after the
> request was queued (i.e., whether vring_interrupt() ever gets called
> with 9p's req_done() after the alloc failure was injected). This would
> help to suggest whether to continue debugging here or in qemu.
>
> I still think the root of this error is some failure of the virtio 9p
> code to deal with non-indirect buffers, either in the driver or in qemu.

It might be interesting to just disable indirect buffers on
qemu command line by specifying indirect_desc=off.
This way you avoid using error paths.

-- 
MST
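
For reference, a rough sketch of where indirect_desc=off could go on the
command line, assuming the 9p export is set up with an -fsdev/-device pair
on virtio-9p-pci. The share directory, the "fsdev0" id and the "hostshare"
mount tag are placeholders, not values from this thread, and the script
only prints the command instead of running it:

```shell
#!/bin/sh
# Sketch only: print a QEMU command line that exports a host directory over
# virtio-9p with indirect descriptors disabled on that one device.
# SHARE_DIR, the "fsdev0" id and the "hostshare" mount tag are placeholders
# (assumptions), not values taken from this thread.

SHARE_DIR="${SHARE_DIR:-/tmp/9p-share}"

build_9p_args() {
    # $1 = host directory to export, $2 = mount tag the guest will use
    printf '%s ' \
        -fsdev "local,id=fsdev0,path=$1,security_model=none" \
        -device "virtio-9p-pci,fsdev=fsdev0,mount_tag=$2,indirect_desc=off"
}

# Printed rather than executed; append the usual kernel/disk/memory options.
echo "qemu-system-x86_64 $(build_9p_args "$SHARE_DIR" hostshare)"
```

With the device started this way the driver should never take the
alloc_indirect() path for this queue, so if the hang still reproduces
without fault injection, that would point at the direct-descriptor
handling rather than at the error path itself.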