From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:35460)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1XP63c-0002Gy-Sa for qemu-devel@nongnu.org;
	Wed, 03 Sep 2014 04:36:50 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1XP63U-0001r3-LS for qemu-devel@nongnu.org;
	Wed, 03 Sep 2014 04:36:48 -0400
Received: from mail-ob0-x232.google.com ([2607:f8b0:4003:c01::232]:64943)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1XP63U-0001qc-B2 for qemu-devel@nongnu.org;
	Wed, 03 Sep 2014 04:36:40 -0400
Received: by mail-ob0-f178.google.com with SMTP id uy5so5775741obc.37
	for ; Wed, 03 Sep 2014 01:36:38 -0700 (PDT)
MIME-Version: 1.0
In-Reply-To: <20140903081302.GC6187@redhat.com>
References: <20140902152546.GA23254@redhat.com>
	<20140902152736.GA23266@redhat.com>
	<20140902210315.GA25153@redhat.com>
	<20140902215125.GC25231@redhat.com>
	<20140903061015.GA5449@redhat.com>
	<20140903081302.GC6187@redhat.com>
From: Andrey Korolyov
Date: Wed, 3 Sep 2014 12:36:18 +0400
Message-ID:
Content-Type: text/plain; charset=UTF-8
Subject: Re: [Qemu-devel] [Qemu-stable] Patch Round-up for stable 2.1.1, freeze on 2014-09-03
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: "Michael S. Tsirkin"
Cc: ehabkost@redhat.com, "qemu-devel@nongnu.org", Stefan Hajnoczi,
	knut.omang@oracle.com, qemu-stable@nongnu.org, Michael Roth,
	Michael Tokarev, Gerd Hoffmann, "J. Kiszka",
	chen.fan.fnst@cn.fujitsu.com, Paolo Bonzini,
	sebastian.tanase@openwide.fr, zhang.zhanghailiang@huawei.com

On Wed, Sep 3, 2014 at 12:13 PM, Michael S. Tsirkin wrote:
> On Wed, Sep 03, 2014 at 11:43:54AM +0400, Andrey Korolyov wrote:
>> On Wed, Sep 3, 2014 at 10:10 AM, Michael S. Tsirkin wrote:
>> > On Wed, Sep 03, 2014 at 02:17:02AM +0400, Andrey Korolyov wrote:
>> >> On Wed, Sep 3, 2014 at 2:09 AM, Andrey Korolyov wrote:
>> >> > On Wed, Sep 3, 2014 at 1:51 AM, Michael S. Tsirkin wrote:
>> >> >> On Wed, Sep 03, 2014 at 01:29:29AM +0400, Andrey Korolyov wrote:
>> >> >>> On Wed, Sep 3, 2014 at 1:03 AM, Michael S. Tsirkin wrote:
>> >> >>> >> bad one is the
>> >> >>> >>
>> >> >>> >> Author: Jason Wang
>> >> >>> >> Date: Tue Sep 2 18:07:46 2014 +0300
>> >> >>> >>
>> >> >>> >> vhost_net: start/stop guest notifiers properly
>> >> >>> >
>> >> >>> >
>> >> >>> >
>> >> >>> > upstream has this (pull request sent today):
>> >> >>> > vhost_net: cleanup start/stop condition
>> >> >>> >
>> >> >>> > Could you apply it and see if it helps please?
>> >> >>> >
>> >> >>> > Michael, if it helps it should be before start/stop guest notifiers
>> >> >>> > ideally to avoid bisect problems.
>> >> >>>
>> >> >>> It is already applied as shown from the list in the previous message
>> >> >>> (there are some aio fixes too on top of 2.1 I picked before but they
>> >> >>> should not impact vhost-net interaction in any mean). The symptoms are
>> >> >>> a bit interesting - VM crashes only at PCI device initalization (e.g.
>> >> >>> grub stage after reset and initrd unpacking are passing well, but then
>> >> >>> things getting ugly). I am running 3.14 guest i686-pae kernel from
>> >> >>> debian backports in guest, so it may be version-specific after all. If
>> >> >>> it`ll be hard to reproduce, I can try 64bit, expecting same behavior.
>> >> >>> Please find args in attached file.
>> >> >>
>> >> >>
>> >> >>
>> >> >> ok just to make sure - which tree do I clone exactly?
>> >> >>
>> >> >
>> >> > https://github.com/mdroth/qemu.git stable-2.1-staging showing same
>> >> > behavior for me with those patches
>> >>
>> >> Forgot to mention important detail - I am playing with -mq now, so
>> >> actually virtio-net working in a bit different way than it may
>> >> expected (it also shown in args list from above, but someone may miss
>> >> it):
>> >> ...
>> >> qemu-system-x86_64: unable to start vhost net: 95: falling back on
>> >> userspace virtio
>> >> qemu-system-x86_64: unable to start vhost net: 95: falling back on
>> >> userspace virtio
>> >> ...
>> >
>> >
>> > OK I see at least one obvious bug there: does the following fix the
>> > crash for you?
>> > Separately, we need to debug why mq vhost is broken for you.
>> > Is this a regression?
>> >
>> > diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
>> > index ba5d544..1fe18c7 100644
>> > --- a/hw/net/vhost_net.c
>> > +++ b/hw/net/vhost_net.c
>> > @@ -289,7 +289,7 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
>> >      BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(dev)));
>> >      VirtioBusState *vbus = VIRTIO_BUS(qbus);
>> >      VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(vbus);
>> > -    int r, i = 0;
>> > +    int r, i;
>> >
>> >      if (!vhost_net_device_endian_ok(dev)) {
>> >          error_report("vhost-net does not support cross-endian");
>> > @@ -317,16 +317,22 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
>> >          r = vhost_net_start_one(get_vhost_net(ncs[i].peer), dev);
>> >
>> >          if (r < 0) {
>> > -            goto err;
>> > +            goto err_start;
>> >          }
>> >      }
>> >
>> >      return 0;
>> >
>> > -err:
>> > +err_start:
>> >      while (--i >= 0) {
>> >          vhost_net_stop_one(get_vhost_net(ncs[i].peer), dev);
>> >      }
>> > +err:
>> > +    r = k->set_guest_notifiers(qbus->parent, total_queues * 2, false);
>> > +    if (r < 0) {
>> > +        fprintf(stderr, "vhost guest notifier cleanup failed: %d\n", r);
>> > +        fflush(stderr);
>> > +    }
>> >      return r;
>> >  }
>> >
>>
>> another bits of information:
>> - the userspace fallback is not specific to mq (very unfortunately
>> for me because I didn`t checked this exact regression week before when
>> I saw it for mq and it is not specific for queued patches for 2.1.1),
>> - bug itself is not specific to mq, reproduces every time even with
>> more generic interface config without queues,
>> - patch from above does not fix the issue.
>>
>> Strace output for all threads is available at
>> http://xdel.ru/downloads/qemu.out.gz, attached just before reset.
>
>
>
> OK does my patch help?
>
> Jason sent patches to fix the fallback to virtio bug -
> does that work for you?
>

Whoops, I missed the patch from Jason; I meant yours above. The
acceleration is fixed, thanks! Jason's patch alone fixes both the crash
and the accel initialization, while yours (intended to fix the assert)
fixed the initialization, with the crash still in place.
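
For anyone following the quoted diff: it splits one error label into
two, so that a failure partway through startup undoes only the steps
that already succeeded. Below is a minimal standalone sketch of that
staged-unwind idiom. Everything in it (net_start, start_one, stop_one,
set_notifiers, fail_at, MAX_QUEUES) is a hypothetical stand-in and not
QEMU code, and the label layout is simplified rather than copied from
the patch. One deliberate difference: the sketch keeps the cleanup
result in a separate variable e, so the original error in r is what
gets returned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define MAX_QUEUES 8

/* Hypothetical stand-ins for device state; none of this is QEMU code. */
static bool notifiers_on;
static bool queue_running[MAX_QUEUES];
static int fail_at = -1; /* index of the queue whose start fails, -1 = none */

static int set_notifiers(bool on)
{
    notifiers_on = on;
    return 0;
}

static int start_one(int i)
{
    if (i == fail_at) {
        return -1;
    }
    queue_running[i] = true;
    return 0;
}

static void stop_one(int i)
{
    queue_running[i] = false;
}

/* Enable notifiers, then start each queue; unwind in reverse on failure. */
int net_start(int total_queues)
{
    int r, e, i;

    assert(total_queues >= 0 && total_queues <= MAX_QUEUES);

    r = set_notifiers(true);
    if (r < 0) {
        goto err;           /* nothing was set up, nothing to undo */
    }

    for (i = 0; i < total_queues; i++) {
        r = start_one(i);
        if (r < 0) {
            goto err_start; /* queues 0..i-1 are running */
        }
    }
    return 0;

err_start:
    /* Stop only the queues that were actually started. */
    while (--i >= 0) {
        stop_one(i);
    }
    /* Undo the notifier setup without clobbering the original error. */
    e = set_notifiers(false);
    if (e < 0) {
        fprintf(stderr, "notifier cleanup failed: %d\n", e);
    }
err:
    return r;
}
```

With fail_at set to 2, net_start(4) returns -1 after stopping queues 1
and 0 and switching the notifiers back off.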