Date: Tue, 22 Mar 2016 15:18:05 +0800
From: Fam Zheng
Message-ID: <20160322071804.GC24999@ad.usersys.redhat.com>
References: <56E93A22.1080102@de.ibm.com> <56E93ECE.10103@redhat.com>
 <56E9425C.8030201@de.ibm.com> <56E957AD.2050005@redhat.com>
 <56E961EA.4090908@de.ibm.com> <56E9638B.5090204@redhat.com>
 <20160317003906.GA23821@ad.usersys.redhat.com>
 <56EA8EEE.2020801@linux.vnet.ibm.com>
 <20160321105718.GA7710@ad.usersys.redhat.com>
 <56F0EFCA.4080003@linux.vnet.ibm.com>
In-Reply-To: <56F0EFCA.4080003@linux.vnet.ibm.com>
Subject: Re: [Qemu-devel] [PATCH 0/4] Tweaks around virtio-blk start/stop
To: tu bo
Cc: qemu-devel@nongnu.org

On Tue, 03/22 15:10, tu bo wrote:
> Hi Fam:
>
> On 03/21/2016 06:57 PM, Fam Zheng wrote:
> >On Thu, 03/17 19:03, tu bo wrote:
> >>
> >>On 03/17/2016 08:39 AM, Fam Zheng wrote:
> >>>On Wed, 03/16 14:45, Paolo Bonzini wrote:
> >>>>
> >>>>
> >>>>On 16/03/2016 14:38, Christian Borntraeger wrote:
> >>>>>>If you just remove the calls to virtio_queue_host_notifier_read, here
> >>>>>>and in virtio_queue_aio_set_host_notifier_fd_handler, does it work
> >>>>>>(keeping patches 2-4 in)?
> >>>>>
> >>>>>With these changes and patch 2-4 it does no longer locks up.
> >>>>>I keep it running some hour to check if a crash happens.
> >>>>>
> >>>>>Tu Bo, your setup is currently better suited for reproducing. Can you also check?
> >>>>
> >>>>Great, I'll prepare a patch to virtio then sketching the solution that
> >>>>Conny agreed with.
> >>>>
> >>>>While Fam and I agreed that patch 1 is not required, I'm not sure if the
> >>>>mutex is necessary in the end.
> >>>
> >>>If we can fix this from the virtio_queue_host_notifier_read side, the mutex/BH
> >>>are not necessary; but OTOH the mutex does catch such bugs, so maybe it's good
> >>>to have it. I'm not sure about the BH.
> >>>
> >>>And on a hindsight I realize we don't want patches 2-3 too. Actually the
> >>>begin/end pair won't work as expected because of the blk_set_aio_context.
> >>>
> >>>Let's hold on this series.
> >>>
> >>>>
> >>>>So if Tu Bo can check without the virtio_queue_host_notifier_read calls,
> >>>>and both with/without Fam's patches, it would be great.
> >>>
> >>>Tu Bo, only with/without patch 4, if you want to check. Sorry for the noise.
> >>>
> >>1. without the virtio_queue_host_notifier_read calls, without patch 4
> >>
> >>crash happens very often,
> >>
> >>(gdb) bt
> >>#0 bdrv_co_do_rw (opaque=0x0) at block/io.c:2172
> >>#1 0x000002aa165da37e in coroutine_trampoline (i0=,
> >>i1=1812051552) at util/coroutine-ucontext.c:79
> >>#2 0x000003ff7dd5150a in __makecontext_ret () from /lib64/libc.so.6
> >>
> >>
> >>2. without the virtio_queue_host_notifier_read calls, with patch 4
> >>
> >>crash happens very often,
> >>
> >>(gdb) bt
> >>#0 bdrv_co_do_rw (opaque=0x0) at block/io.c:2172
> >>#1 0x000002aa39dda43e in coroutine_trampoline (i0=,
> >>i1=-1677715600) at util/coroutine-ucontext.c:79
> >>#2 0x000003ffab6d150a in __makecontext_ret () from /lib64/libc.so.6
> >>
> >>
> >
> >Tu Bo,
> >
> >Could you help test this patch (on top of master, without patch 4)?
> >
> >diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> >index 08275a9..47f8043 100644
> >--- a/hw/virtio/virtio.c
> >+++ b/hw/virtio/virtio.c
> >@@ -1098,7 +1098,14 @@ void virtio_queue_notify_vq(VirtQueue *vq)
> >
> > void virtio_queue_notify(VirtIODevice *vdev, int n)
> > {
> >-    virtio_queue_notify_vq(&vdev->vq[n]);
> >+    VirtQueue *vq = &vdev->vq[n];
> >+    EventNotifier *n;
> >+    n = virtio_queue_get_host_notifier(vq);
> >+    if (n) {
> >+        event_notifier_set(n);
> >+    } else {
> >+        virtio_queue_notify_vq(vq);
> >+    }
> > }
> >
> > uint16_t virtio_queue_vector(VirtIODevice *vdev, int n)
> >
> >
>
> I got a build error as below,
>
> /BUILD/qemu-2.5.50/hw/virtio/virtio.c: In function 'virtio_queue_notify':
> /BUILD/qemu-2.5.50/hw/virtio/virtio.c:1102:20: error: 'n' redeclared
> as different kind of symbol
>      EventNotifier *n;
>                     ^
> /BUILD/qemu-2.5.50/hw/virtio/virtio.c:1099:50: note: previous
> definition of 'n' was here
>  void virtio_queue_notify(VirtIODevice *vdev, int n)
>
>
> Then I did some change for your patch as below,
> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> index 08275a9..a10da39 100644
> --- a/hw/virtio/virtio.c
> +++ b/hw/virtio/virtio.c
> @@ -1098,7 +1098,14 @@ void virtio_queue_notify_vq(VirtQueue *vq)
>
>  void virtio_queue_notify(VirtIODevice *vdev, int n)
>  {
> -    virtio_queue_notify_vq(&vdev->vq[n]);
> +    VirtQueue *vq = &vdev->vq[n];
> +    EventNotifier *en;
> +    en = virtio_queue_get_host_notifier(vq);
> +    if (en) {
> +        event_notifier_set(en);
> +    } else {
> +        virtio_queue_notify_vq(vq);
> +    }
>  }
>
>  uint16_t virtio_queue_vector(VirtIODevice *vdev, int n)
>
> With qemu master + modified patch above(without patch 4, without
> Conny's patches), I did NOT get crash so far. thanks

Yes, it was a mistake. Thanks for the testing!

Fam
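
For readers who want to try the dispatch pattern from the corrected patch outside QEMU, here is a minimal standalone sketch. It is an illustration under assumptions, not QEMU code: the `Mock*` types and `mock_*` functions are hypothetical stand-ins for `EventNotifier`, `VirtQueue`, `VirtIODevice`, `event_notifier_set` and `virtio_queue_notify_vq`, and the sketch models "no host notifier" as a NULL pointer, which may not match how the real `virtio_queue_get_host_notifier` behaves. The local pointer is called `en` so it cannot collide with the `int n` parameter, which is the redeclaration error reported above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for QEMU's EventNotifier (not the real definition). */
typedef struct {
    bool pending;                      /* true once the notifier was signalled */
} MockEventNotifier;

/* Hypothetical stand-in for VirtQueue. */
typedef struct {
    MockEventNotifier *host_notifier;  /* assumption: NULL means "no notifier" */
    int direct_calls;                  /* times the handler ran synchronously */
} MockVirtQueue;

#define MOCK_QUEUE_MAX 4

/* Hypothetical stand-in for VirtIODevice. */
typedef struct {
    MockVirtQueue vq[MOCK_QUEUE_MAX];
} MockVirtIODevice;

/* Mark the notifier as signalled; whoever polls it handles the queue later. */
static void mock_event_notifier_set(MockEventNotifier *en)
{
    en->pending = true;
}

/* Handle the queue right away, in the caller's context. */
static void mock_queue_notify_vq(MockVirtQueue *vq)
{
    vq->direct_calls++;
}

/*
 * Same shape as the corrected patch: signal the notifier when one is
 * present, otherwise fall back to calling the handler directly.
 */
void mock_queue_notify(MockVirtIODevice *vdev, int n)
{
    assert(n >= 0 && n < MOCK_QUEUE_MAX);

    MockVirtQueue *vq = &vdev->vq[n];
    MockEventNotifier *en = vq->host_notifier;  /* 'en', not 'n' */

    if (en) {
        mock_event_notifier_set(en);
    } else {
        mock_queue_notify_vq(vq);
    }
}
```

Calling `mock_queue_notify` on a queue with a notifier attached only sets its `pending` flag, while a queue without one has its handler run immediately and `direct_calls` incremented.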