From mboxrd@z Thu Jan 1 00:00:00 1970
References: <1458123018-18651-1-git-send-email-famz@redhat.com>
 <56E9355A.5070700@redhat.com>
 <56E93A22.1080102@de.ibm.com>
 <56E93ECE.10103@redhat.com>
 <56E9425C.8030201@de.ibm.com>
 <56E957AD.2050005@redhat.com>
 <56E961EA.4090908@de.ibm.com>
 <56E9638B.5090204@redhat.com>
 <20160317003906.GA23821@ad.usersys.redhat.com>
 <56EA8EEE.2020801@linux.vnet.ibm.com>
 <20160321105718.GA7710@ad.usersys.redhat.com>
From: tu bo
Message-ID: <56F0EFCA.4080003@linux.vnet.ibm.com>
Date: Tue, 22 Mar 2016 15:10:02 +0800
MIME-Version: 1.0
In-Reply-To: <20160321105718.GA7710@ad.usersys.redhat.com>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
To: qemu-devel@nongnu.org

Hi Fam:

On 03/21/2016 06:57 PM, Fam Zheng wrote:
> On Thu, 03/17 19:03, tu bo wrote:
>>
>> On 03/17/2016 08:39 AM, Fam Zheng wrote:
>>> On Wed, 03/16 14:45, Paolo Bonzini wrote:
>>>>
>>>>
>>>> On 16/03/2016 14:38, Christian Borntraeger wrote:
>>>>>> If you just remove the calls to virtio_queue_host_notifier_read, here
>>>>>> and in virtio_queue_aio_set_host_notifier_fd_handler, does it work
>>>>>> (keeping patches 2-4 in)?
>>>>>
>>>>> With these changes and patch 2-4 it does no longer locks up.
>>>>> I keep it running some hour to check if a crash happens.
>>>>>
>>>>> Tu Bo, your setup is currently better suited for reproducing. Can you also check?
>>>>
>>>> Great, I'll prepare a patch to virtio then sketching the solution that
>>>> Conny agreed with.
>>>>
>>>> While Fam and I agreed that patch 1 is not required, I'm not sure if the
>>>> mutex is necessary in the end.
>>>
>>> If we can fix this from the virtio_queue_host_notifier_read side, the mutex/BH
>>> are not necessary; but OTOH the mutex does catch such bugs, so maybe it's good
>>> to have it. I'm not sure about the BH.
>>>
>>> And on a hindsight I realize we don't want patches 2-3 too. Actually the
>>> begin/end pair won't work as expected because of the blk_set_aio_context.
>>>
>>> Let's hold on this series.
>>>
>>>>
>>>> So if Tu Bo can check without the virtio_queue_host_notifier_read calls,
>>>> and both with/without Fam's patches, it would be great.
>>>
>>> Tu Bo, only with/without patch 4, if you want to check. Sorry for the noise.
>>>
>> 1. without the virtio_queue_host_notifier_read calls, without patch 4
>>
>> crash happens very often,
>>
>> (gdb) bt
>> #0 bdrv_co_do_rw (opaque=0x0) at block/io.c:2172
>> #1 0x000002aa165da37e in coroutine_trampoline (i0=,
>> i1=1812051552) at util/coroutine-ucontext.c:79
>> #2 0x000003ff7dd5150a in __makecontext_ret () from /lib64/libc.so.6
>>
>>
>> 2. without the virtio_queue_host_notifier_read calls, with patch 4
>>
>> crash happens very often,
>>
>> (gdb) bt
>> #0 bdrv_co_do_rw (opaque=0x0) at block/io.c:2172
>> #1 0x000002aa39dda43e in coroutine_trampoline (i0=,
>> i1=-1677715600) at util/coroutine-ucontext.c:79
>> #2 0x000003ffab6d150a in __makecontext_ret () from /lib64/libc.so.6
>>
>>
>
> Tu Bo,
>
> Could you help test this patch (on top of master, without patch 4)?
>
> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> index 08275a9..47f8043 100644
> --- a/hw/virtio/virtio.c
> +++ b/hw/virtio/virtio.c
> @@ -1098,7 +1098,14 @@ void virtio_queue_notify_vq(VirtQueue *vq)
>
>  void virtio_queue_notify(VirtIODevice *vdev, int n)
>  {
> -    virtio_queue_notify_vq(&vdev->vq[n]);
> +    VirtQueue *vq = &vdev->vq[n];
> +    EventNotifier *n;
> +    n = virtio_queue_get_host_notifier(vq);
> +    if (n) {
> +        event_notifier_set(n);
> +    } else {
> +        virtio_queue_notify_vq(vq);
> +    }
>  }
>
>  uint16_t virtio_queue_vector(VirtIODevice *vdev, int n)
>
>

I got the build error below:

/BUILD/qemu-2.5.50/hw/virtio/virtio.c: In function 'virtio_queue_notify':
/BUILD/qemu-2.5.50/hw/virtio/virtio.c:1102:20: error: 'n' redeclared as different kind of symbol
     EventNotifier *n;
                    ^
/BUILD/qemu-2.5.50/hw/virtio/virtio.c:1099:50: note: previous definition of 'n' was here
 void virtio_queue_notify(VirtIODevice *vdev, int n)

The new local variable has the same name as the function's int parameter 'n', so I renamed it to 'en' and changed your patch as below:

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 08275a9..a10da39 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -1098,7 +1098,14 @@ void virtio_queue_notify_vq(VirtQueue *vq)

 void virtio_queue_notify(VirtIODevice *vdev, int n)
 {
-    virtio_queue_notify_vq(&vdev->vq[n]);
+    VirtQueue *vq = &vdev->vq[n];
+    EventNotifier *en;
+    en = virtio_queue_get_host_notifier(vq);
+    if (en) {
+        event_notifier_set(en);
+    } else {
+        virtio_queue_notify_vq(vq);
+    }
 }

 uint16_t virtio_queue_vector(VirtIODevice *vdev, int n)

With qemu master + the modified patch above (without patch 4, without Conny's patches), I have NOT seen the crash so far.

Thanks
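
For readers following the thread, here is a minimal stand-alone sketch of the dispatch idea in the modified patch: if the queue has a host notifier, only signal it and let the thread that owns the notifier do the work; otherwise handle the queue directly in the caller. Everything named Toy* or toy_* is hypothetical and written only for this illustration. It is not QEMU's EventNotifier, VirtQueue, or virtio_queue_get_host_notifier(), and it runs in a single thread, so it shows only the control flow and not the cross-thread behaviour behind the crash discussed above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an eventfd-style notifier. A real notifier
 * would coalesce several signals into one wakeup; this toy simply
 * counts every kick so the effect is easy to observe. */
typedef struct ToyNotifier {
    int pending;                 /* kicks not yet seen by the owning loop */
} ToyNotifier;

/* Hypothetical stand-in for a queue. */
typedef struct ToyQueue {
    ToyNotifier *host_notifier;  /* NULL: no separate loop owns this queue */
    int handled_inline;          /* kicks processed directly by the caller */
    int handled_in_loop;         /* kicks processed by the owning loop */
} ToyQueue;

/* Signal the notifier; the owning loop picks the kick up later. */
void toy_notifier_set(ToyNotifier *en)
{
    en->pending++;
}

/* Same shape as the patched function: defer to the notifier when one is
 * present, otherwise process in the calling context. The local is named
 * 'en' so that it does not clash with the int parameter 'n'. */
void toy_queue_notify(ToyQueue *queues, int n)
{
    ToyQueue *vq = &queues[n];
    ToyNotifier *en = vq->host_notifier;

    if (en) {
        toy_notifier_set(en);
    } else {
        vq->handled_inline++;
    }
}

/* What the owning loop would do on wakeup: consume the pending kicks.
 * Returns the number of kicks it processed. */
int toy_event_loop_drain(ToyQueue *vq)
{
    int drained = 0;

    if (vq->host_notifier == NULL) {
        return 0;
    }
    while (vq->host_notifier->pending > 0) {
        vq->host_notifier->pending--;
        vq->handled_in_loop++;
        drained++;
    }
    return drained;
}

/* Small self-check exercising both paths; returns 0 on success. */
int toy_demo(void)
{
    ToyNotifier en = { 0 };
    ToyQueue queues[2] = {
        { &en,  0, 0 },          /* queue 0 is owned by a notifier loop */
        { NULL, 0, 0 },          /* queue 1 is handled by the caller */
    };

    toy_queue_notify(queues, 0); /* deferred: only the notifier is set */
    assert(en.pending == 1 && queues[0].handled_inline == 0);

    toy_queue_notify(queues, 1); /* no notifier: handled on the spot */
    assert(queues[1].handled_inline == 1);

    assert(toy_event_loop_drain(&queues[0]) == 1);
    assert(queues[0].handled_in_loop == 1);
    return 0;
}
```

Calling toy_demo() from a main() runs both paths. The point of the sketch is that a kick on a queue with a notifier never runs the handler in the caller's context, which is the behaviour the modified patch is meant to test.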