From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([208.118.235.92]:44241) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Uf2Vs-0003XC-2w for qemu-devel@nongnu.org; Wed, 22 May 2013 02:27:09 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Uf2Vn-0000nz-4m for qemu-devel@nongnu.org; Wed, 22 May 2013 02:27:04 -0400
Received: from mail-ph.de-nserver.de ([85.158.179.214]:36826) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Uf2Vm-0000nQ-Qh for qemu-devel@nongnu.org; Wed, 22 May 2013 02:26:59 -0400
References: <518C8FD7.9080201@profihost.ag> <20130510074217.GB1500@stefanha-thinkpad.redhat.com> <518CB8E4.5090305@profihost.ag> <51924A40.6090209@profihost.ag> <5192543C.4010305@profihost.ag>
Mime-Version: 1.0 (1.0)
In-Reply-To: <5192543C.4010305@profihost.ag>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: quoted-printable
Message-Id: <6BCF67A6-A1D5-4B4D-B83D-DC32F7621C43@profihost.ag>
From: Stefan Priebe - Profihost AG
Date: Wed, 22 May 2013 08:26:49 +0200
Subject: Re: [Qemu-devel] segfault in aio_bh_poll async.c:80 WAS: Re: kvm process disappears
To: Stefan Hajnoczi
Cc: Josh Durgin , Paolo Bonzini , qemu-devel , "pve-devel@pve.proxmox.com" , Michael Roth

Hi Josh, hi Stefan,

> On 14.05.2013 17:05, Stefan Hajnoczi wrote:
>> On Tue, May 14, 2013 at 4:29 PM, Stefan Priebe - Profihost AG
>> wrote:
>>> On 10.05.2013 13:09, Stefan Hajnoczi wrote:
>>>> On Fri, May 10, 2013 at 11:07 AM, Stefan Priebe - Profihost AG
>>>> wrote:
>>>>> On 10.05.2013 09:42, Stefan Hajnoczi wrote:
>>>>>> On Fri, May 10, 2013 at 08:12:39AM +0200, Stefan Priebe - Profihost AG wrote:
>>>>>> 3. Either use gdb or an LD_PRELOAD library that catches exit(3) and
>>>>>> _exit(2) and dumps core using abort(3). Make sure core dumps are
>>>>>> enabled.
>>>
>>> This time I had a segfault with Qemu 1.4.1 plus
>>> http://git.qemu.org/?p=qemu.git;a=commitdiff;h=dc7588c1eb3008bda53dde1d6b890cd299758155.
>>>
>>> aio_bh_poll async.c:80
>>>
>>> Code...
>>>
>>>     for (bh = ctx->first_bh; bh; bh = next) {
>>>         next = bh->next;
>>>         if (!bh->deleted && bh->scheduled) {
>>>             bh->scheduled = 0;
>>>             if (!bh->idle)
>>>                 ret = 1;
>>>             bh->idle = 0;
>>>             bh->cb(bh->opaque);
>>>         }
>>>     }
>>>
>>>     ctx->walking_bh--;
>>>
>>>     /* remove deleted bhs */
>>>     if (!ctx->walking_bh) {
>>>         bhp = &ctx->first_bh;
>>>         while (*bhp) {
>>>             bh = *bhp;
>>> ===== THIS IS THE SEGFAULT LINE =====     if (bh->deleted) {
>>>                 *bhp = bh->next;
>>>                 g_free(bh);
>>>             } else {
>>>                 bhp = &bh->next;
>>>             }
>>>         }
>>>     }
>>>
>>>     return ret;
>>
>> Interesting crash. Do you have the output of "thread apply all bt"?
>>
>> I would try looking at the AioContext using "p *ctx", and print out
>> the ctx->first_bh linked list.
>
> Hi,
>
> No, as I can't reproduce it ;-( I just saw the kernel segfault message and
> used addr2line and a qemu dbg package to get the code line.

I've now seen this again two or three times. It always happens when we do
an fstrim inside the guest. And I first saw this after Josh's async rbd
patch.

Stefan

>
> Stefan
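
For anyone reading the quoted aio_bh_poll excerpt without the QEMU tree at hand, the pattern it uses (mark a bottom half as deleted, free it only once no walk of the list is in progress) can be modelled in a few lines of standalone C. This is only a rough sketch under assumptions: Node, List, node_new, node_delete, list_poll, list_len and count_cb are hypothetical names, not QEMU's types or functions, and the sketch says nothing about what actually causes the segfault.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for QEMU's bottom-half and context types. */
typedef struct Node {
    struct Node *next;
    void (*cb)(void *opaque);
    void *opaque;
    int scheduled;
    int deleted;
} Node;

typedef struct List {
    Node *first;
    int walking;    /* nesting depth of list_poll() calls */
} List;

/* Example callback: increments the int that opaque points to. */
void count_cb(void *opaque)
{
    ++*(int *)opaque;
}

/* Allocate a node and push it on the front of the list. */
Node *node_new(List *l, void (*cb)(void *), void *opaque)
{
    Node *n = calloc(1, sizeof(*n));
    if (!n) {
        abort();
    }
    n->cb = cb;
    n->opaque = opaque;
    n->next = l->first;
    l->first = n;
    return n;
}

/* Deletion only marks the node; list_poll() frees it later. */
void node_delete(Node *n)
{
    n->scheduled = 0;
    n->deleted = 1;
}

/* Run scheduled callbacks, then free marked nodes once no walk is active.
 * Returns the number of callbacks that ran. */
int list_poll(List *l)
{
    Node *n, *next, **np;
    int ran = 0;

    l->walking++;
    for (n = l->first; n; n = next) {
        next = n->next;    /* read before cb, which may mark n deleted */
        if (!n->deleted && n->scheduled) {
            n->scheduled = 0;
            n->cb(n->opaque);
            ran++;
        }
    }
    assert(l->walking > 0);
    l->walking--;

    /* Sweep: unlink and free deleted nodes, only at the outermost level. */
    if (!l->walking) {
        np = &l->first;
        while (*np) {
            n = *np;
            if (n->deleted) {
                *np = n->next;
                free(n);
            } else {
                np = &n->next;
            }
        }
    }
    return ran;
}

/* Count nodes still linked into the list. */
int list_len(const List *l)
{
    int len = 0;
    for (const Node *n = l->first; n; n = n->next) {
        len++;
    }
    return len;
}
```

In this model the sweep dereferences every node still linked into the list, so it relies on nodes being freed only by the sweep itself. That is a property of the sketch, not a diagnosis of the crash reported here.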