* [Qemu-devel] kvm process disappears
From: Stefan Priebe - Profihost AG @ 2013-05-10 6:12 UTC
  To: pve-devel, qemu-devel

Hello list,

I've now seen this several times: a VM is suddenly down. No segfault,
nothing; the kvm process just disappears.

Does anybody have an idea how to debug this?

Sadly I can't reproduce it. The QEMU version is 1.4.1.

Greets,
Stefan
* Re: [Qemu-devel] [pve-devel] kvm process disappears
From: Alexandre DERUMIER @ 2013-05-10 7:15 UTC
  To: Stefan Priebe - Profihost AG; +Cc: qemu-devel, pve-devel

I have never seen this, sorry...
* Re: [Qemu-devel] [pve-devel] kvm process disappears
From: Alexandre DERUMIER @ 2013-05-10 7:20 UTC
  To: Stefan Priebe - Profihost AG; +Cc: qemu-devel, pve-devel

Just an idea: maybe you are out of memory and the process is being killed?

Is there nothing in the logs?
* Re: [Qemu-devel] [pve-devel] kvm process disappears
From: Stefan Priebe - Profihost AG @ 2013-05-10 7:22 UTC
  To: Alexandre DERUMIER; +Cc: qemu-devel, pve-devel

On 10.05.2013 09:20, Alexandre DERUMIER wrote:
> Just an idea, maybe are you out of memory and process are killed ?
>
> nothing in logs ?

140 GB of memory is free, and there is nothing in dmesg either. Which
logs did you mean?

Stefan
* Re: [Qemu-devel] [pve-devel] kvm process disappears
From: Alexandre DERUMIER @ 2013-05-10 7:28 UTC
  To: Stefan Priebe - Profihost AG; +Cc: qemu-devel, pve-devel

>> 140GB free mem.... also nothing in dmesg... which logs did you mean?

I was thinking of /var/log/messages, where the OOM killer logs its
actions. But that does not seem to be your case ;)

Do you use HA?
* Re: [Qemu-devel] [pve-devel] kvm process disappears
From: Stefan Priebe - Profihost AG @ 2013-05-10 9:06 UTC
  To: Alexandre DERUMIER; +Cc: qemu-devel, pve-devel

On 10.05.2013 09:28, Alexandre DERUMIER wrote:
>>> 140GB free mem.... also nothing in dmesg... which logs did you mean?
> I thinked of /var/log/messages, logs with OOM Killer. But seem to not be your case ;)

Nothing there...

> Do you use HA ?

No.

Stefan
* Re: [Qemu-devel] kvm process disappears
From: Stefan Hajnoczi @ 2013-05-10 7:42 UTC
  To: Stefan Priebe - Profihost AG; +Cc: qemu-devel, pve-devel

On Fri, May 10, 2013 at 08:12:39AM +0200, Stefan Priebe - Profihost AG wrote:
> i've now seen this several times. A VM is suddently down no segfault
> nothing the kvm process just disappears...
>
> Anybody any idea how to debug this?
>
> Sadly i can't reproduce. Qemu version is 1.4.1.

1. Double-check dmesg(1) for out-of-memory killer or segfaults.

2. Check libvirt or other management tool logs again.

3. Either use gdb or an LD_PRELOAD library that catches exit(3) and
   _exit(2) and dumps core using abort(3).  Make sure core dumps are
   enabled.

Stefan
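Step 3 above only helps if core dumps are actually enabled. In a shell this is usually done with `ulimit -c unlimited`; the same can be done from a small launcher before it execs QEMU, because resource limits are inherited across exec. The following is a minimal sketch, assuming a Linux/POSIX host. The function names `raise_core_limit` and `core_dumps_allowed` are made up for illustration and are not part of QEMU or any library.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Raise the soft core-file size limit to the hard limit, so that abort(3)
 * or a fatal signal is allowed to leave a core file.
 * Returns 0 on success, -1 on error. */
int raise_core_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0) {
        perror("getrlimit(RLIMIT_CORE)");
        return -1;
    }

    /* An unprivileged process may raise its soft limit up to the hard limit. */
    rl.rlim_cur = rl.rlim_max;

    if (setrlimit(RLIMIT_CORE, &rl) != 0) {
        perror("setrlimit(RLIMIT_CORE)");
        return -1;
    }
    return 0;
}

/* Returns 1 if the current soft limit permits a non-empty core file,
 * 0 if core files are disabled or the limit cannot be read. */
int core_dumps_allowed(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0)
        return 0;
    return rl.rlim_cur != 0;
}
```

A launcher would call `raise_core_limit()` and then exec the kvm binary. If the hard limit itself is 0, the soft limit cannot be raised this way and `core_dumps_allowed()` keeps returning 0. Where the core file ends up is decided separately by the host's core pattern setting.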
* Re: [Qemu-devel] kvm process disappears
From: Stefan Priebe - Profihost AG @ 2013-05-10 9:07 UTC
  To: Stefan Hajnoczi; +Cc: qemu-devel, pve-devel

On 10.05.2013 09:42, Stefan Hajnoczi wrote:
> 1. Double-check dmesg(1) for out-of-memory killer or segfaults.

Done; nothing in there. Also, 120-140 GB of memory is free.

> 2. Check libvirt or other management tool logs again.

The kvm process was started from bash; no management tool is involved.

> 3. Either use gdb or an LD_PRELOAD library that catches exit(3) and
>    _exit(2) and dumps core using abort(3).  Make sure core dumps are
>    enabled.

LD_PRELOAD sounds good. Can you point me to such a lib?

Stefan
* Re: [Qemu-devel] kvm process disappears
From: Stefan Hajnoczi @ 2013-05-10 11:09 UTC
  To: Stefan Priebe - Profihost AG; +Cc: qemu-devel, pve-devel

On Fri, May 10, 2013 at 11:07 AM, Stefan Priebe - Profihost AG
<s.priebe@profihost.ag> wrote:
> On 10.05.2013 09:42, Stefan Hajnoczi wrote:
>> 3. Either use gdb or an LD_PRELOAD library that catches exit(3) and
>>    _exit(2) and dumps core using abort(3).  Make sure core dumps are
>>    enabled.
>
> LD_PRELOAD sounds good can you point me to such a lib?

$ cat /tmp/catchexit.c
#include <unistd.h>
#include <stdlib.h>

/* Replaces libc's exit(3): report the call, then dump core via abort(3). */
void exit(int status)
{
    const char msg[] = "*** CAUGHT EXIT, DUMPING CORE ***\n";
    write(2, msg, sizeof msg - 1);  /* do not write the trailing NUL */
    abort();
}

/* Route _exit(2) through the same hook. */
void _exit(int status) __attribute__((alias("exit")));

$ gcc -o catchexit.so -shared -fPIC -std=gnu99 catchexit.c

$ LD_PRELOAD=/tmp/catchexit.so x86_64-softmmu/qemu-system-x86_64 -m 1024 -enable-kvm -cpu host -vga asdf
Unknown vga type: asdf
*** CAUGHT EXIT, DUMPING CORE ***
Aborted (core dumped)

Make sure to give the absolute path to catchexit.so.  Also keep in
mind that this does not catch a normal return from main() or possibly
other ways of terminating the process.

You can hook more library functions, if necessary.

Stefan
* [Qemu-devel] segfault in aio_bh_poll async.c:80 WAS: Re: kvm process disappears
From: Stefan Priebe - Profihost AG @ 2013-05-14 14:29 UTC
  To: Stefan Hajnoczi; +Cc: Paolo Bonzini, qemu-devel, pve-devel, mdroth

On 10.05.2013 13:09, Stefan Hajnoczi wrote:
>>> 3. Either use gdb or an LD_PRELOAD library that catches exit(3) and
>>>    _exit(2) and dumps core using abort(3).  Make sure core dumps are
>>>    enabled.

This time I had a segfault. This is QEMU 1.4.1 plus
http://git.qemu.org/?p=qemu.git;a=commitdiff;h=dc7588c1eb3008bda53dde1d6b890cd299758155.

aio_bh_poll async.c:80

Code...

    for (bh = ctx->first_bh; bh; bh = next) {
        next = bh->next;
        if (!bh->deleted && bh->scheduled) {
            bh->scheduled = 0;
            if (!bh->idle)
                ret = 1;
            bh->idle = 0;
            bh->cb(bh->opaque);
        }
    }

    ctx->walking_bh--;

    /* remove deleted bhs */
    if (!ctx->walking_bh) {
        bhp = &ctx->first_bh;
        while (*bhp) {
            bh = *bhp;
===== THIS IS THE SEGFAULT LINE =====
            if (bh->deleted) {
                *bhp = bh->next;
                g_free(bh);
            } else {
                bhp = &bh->next;
            }
        }
    }

    return ret;

Greets,
Stefan
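The loop quoted in this message follows a common pattern: while the list is being walked, a node may only be marked as deleted, and the unlinking and freeing is deferred until the outermost walk has finished. The sketch below is a simplified stand-alone model of that pattern, not QEMU's code. `Node`, `List` and the `list_*` functions are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct Node {
    struct Node *next;
    int deleted;   /* set instead of freeing while a walk may be active */
    int id;
} Node;

typedef struct List {
    Node *first;
    int walking;   /* nesting depth of walks currently in progress */
} List;

/* Push a new node at the head of the list; returns NULL on allocation failure. */
Node *list_add(List *l, int id)
{
    Node *n = calloc(1, sizeof(*n));
    if (!n)
        return NULL;
    n->id = id;
    n->next = l->first;
    l->first = n;
    return n;
}

/* Request deletion; the node stays linked until the next sweep. */
void list_mark_deleted(Node *n)
{
    n->deleted = 1;
}

/* Visit every live node, then sweep deleted nodes if no walk is still active.
 * Returns the number of live nodes visited. */
int list_walk(List *l, void (*visit)(Node *))
{
    Node *n, *next, **np;
    int visited = 0;

    assert(l != NULL);
    l->walking++;
    for (n = l->first; n; n = next) {
        next = n->next;            /* read before the callback runs */
        if (!n->deleted) {
            visited++;
            if (visit)
                visit(n);
        }
    }
    l->walking--;

    if (!l->walking) {
        np = &l->first;            /* pointer to the link being examined */
        while (*np) {
            n = *np;
            if (n->deleted) {
                *np = n->next;     /* unlink, then free */
                free(n);
            } else {
                np = &n->next;
            }
        }
    }
    return visited;
}

/* Count the nodes that are still linked, deleted or not. */
int list_length(const List *l)
{
    int len = 0;
    for (const Node *n = l->first; n; n = n->next)
        len++;
    return len;
}
```

In this model the sweep dereferences every pointer stored in the list, so a fault on the `deleted` check means the pointer read from the list was not valid at that moment. The pattern itself does not say how the pointer became invalid.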
* Re: [Qemu-devel] segfault in aio_bh_poll async.c:80 WAS: Re: kvm process disappears
From: Stefan Hajnoczi @ 2013-05-14 15:05 UTC
  To: Stefan Priebe - Profihost AG; +Cc: Paolo Bonzini, qemu-devel, pve-devel, Michael Roth

On Tue, May 14, 2013 at 4:29 PM, Stefan Priebe - Profihost AG
<s.priebe@profihost.ag> wrote:
> This time i had a segfault Qemu 1.4.1 plus
> http://git.qemu.org/?p=qemu.git;a=commitdiff;h=dc7588c1eb3008bda53dde1d6b890cd299758155.
>
> aio_bh_poll async.c:80
>
> [...]
>         while (*bhp) {
>             bh = *bhp;
> ===== THIS IS THE SEGFAULT LINE =====
>             if (bh->deleted) {
> [...]

Interesting crash.  Do you have the output of "thread apply all bt"?

I would try looking at the AioContext using "p *ctx", and print out
the ctx->first_bh linked list.

Stefan
* Re: [Qemu-devel] segfault in aio_bh_poll async.c:80 WAS: Re: kvm process disappears
From: Stefan Priebe - Profihost AG @ 2013-05-14 15:11 UTC
  To: Stefan Hajnoczi; +Cc: Paolo Bonzini, qemu-devel, pve-devel, Michael Roth

On 14.05.2013 17:05, Stefan Hajnoczi wrote:
> Interesting crash.  Do you have the output of "thread apply all bt"?
>
> I would try looking at the AioContext using "p *ctx", and print out
> the ctx->first_bh linked list.

Hi,

no, as I can't reproduce it ;-( I only saw the kernel segfault message
and used addr2line with a QEMU debug package to get the code line.

Stefan
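Going from a kernel segfault message to a source line with addr2line needs the instruction pointer and, for position-independent objects, the base of the mapping it falls into. The sketch below parses a log line of the form that x86 Linux kernels of that era print, "segfault at ... ip ... sp ... error ... in name[base+size]". It is a minimal illustration under that assumption: newer kernels print a slightly different bracket layout, and `SegvInfo`, `parse_segv_line` and `segv_offset` are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    unsigned long fault_addr;  /* address whose access faulted */
    unsigned long ip;          /* instruction pointer at the fault */
    unsigned long sp;          /* stack pointer at the fault */
    unsigned long error;       /* page-fault error code */
    unsigned long map_base;    /* start of the mapping containing ip */
    unsigned long map_size;    /* size of that mapping */
    char object[128];          /* name of the mapped object */
    int has_mapping;           /* 1 if the "in name[base+size]" part was parsed */
} SegvInfo;

/* Parse one kernel log line. Returns 0 on success, -1 if the line does not
 * contain a recognisable segfault record. */
int parse_segv_line(const char *line, SegvInfo *out)
{
    const char *p;
    int n;

    assert(line != NULL && out != NULL);
    p = strstr(line, "segfault at ");
    if (!p)
        return -1;

    memset(out, 0, sizeof(*out));
    n = sscanf(p,
               "segfault at %lx ip %lx sp %lx error %lx in %127[^[][%lx+%lx]",
               &out->fault_addr, &out->ip, &out->sp, &out->error,
               out->object, &out->map_base, &out->map_size);
    if (n < 4)
        return -1;             /* the four leading fields are mandatory */
    out->has_mapping = (n == 7);
    return 0;
}

/* Offset of ip inside its mapping, or ip itself if no mapping was parsed. */
unsigned long segv_offset(const SegvInfo *s)
{
    return s->has_mapping ? s->ip - s->map_base : s->ip;
}
```

For a binary linked at a fixed address, the `ip` value is usually what addr2line expects. For a shared library or a position-independent executable, the offset from `segv_offset()` is usually the better starting point, although it can differ from the file offset when the mapping is not the first segment of the object.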
* Re: [Qemu-devel] segfault in aio_bh_poll async.c:80 WAS: Re: kvm process disappears
From: Stefan Priebe - Profihost AG @ 2013-05-22 6:26 UTC
  To: Stefan Hajnoczi; +Cc: Josh Durgin, Paolo Bonzini, qemu-devel, pve-devel, Michael Roth

Hi Josh, hi Stefan,

> as i can't reproduce no ;-( i just saw the kernel segfault message and
> used addr2line and a qemu dbg package to get the code line.

I've now seen this again two or three times. It always happens when we
run fstrim inside the guest.

And I first saw it after Josh's async rbd patch.

Stefan
* Re: [Qemu-devel] segfault in aio_bh_poll async.c:80 WAS: Re: kvm process disappears
From: Paolo Bonzini @ 2013-05-22 8:41 UTC
  To: Stefan Priebe - Profihost AG; +Cc: Josh Durgin, Stefan Hajnoczi, qemu-devel, pve-devel, Michael Roth

On 22/05/2013 08:26, Stefan Priebe - Profihost AG wrote:
> I've now seen this again for two or three times. It always happens
> when we do an fstrim inside the guest.

> And I've seen this first since josh async rbd patch.

This one?

commit dc7588c1eb3008bda53dde1d6b890cd299758155
Author: Josh Durgin <josh.durgin@inktank.com>
Date:   Fri Mar 29 13:03:23 2013 -0700

    rbd: add an asynchronous flush

    The existing bdrv_co_flush_to_disk implementation uses rbd_flush(),
    which is sychronous and causes the main qemu thread to block until it
    is complete. This results in unresponsiveness and extra latency for
    the guest.

    Fix this by using an asynchronous version of flush.  This was added to
    librbd with a special #define to indicate its presence, since it will
    be backported to stable versions. Thus, there is no need to check the
    version of librbd.

    Implement this as bdrv_aio_flush, since it matches other aio functions
    in the rbd block driver, and leave out bdrv_co_flush_to_disk when the
    asynchronous version is available.

    Reported-by: Oliver Francke <oliver@filoo.de>
    Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
    Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

Do you see it even with "-drive discard=off"?

Paolo
* Re: [Qemu-devel] segfault in aio_bh_poll async.c:80 WAS: Re: kvm process disappears
From: Stefan Priebe - Profihost AG @ 2013-05-22 12:24 UTC
  To: Paolo Bonzini; +Cc: Josh Durgin, Stefan Hajnoczi, qemu-devel, pve-devel, Michael Roth

On 22.05.2013 at 10:41, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> And I've seen this first since josh async rbd patch.
>
> This one?
>
> commit dc7588c1eb3008bda53dde1d6b890cd299758155

Yes. But I'm not sure whether this is a coincidence.

> Do you see it even with "-drive discard=off"?

I use discard/trim for thin provisioning and need it. This is a
production system, so I can't test without it.

I use virtio-scsi with discard_granularity=512.

Stefan
* Re: [Qemu-devel] segfault in aio_bh_poll async.c:80 WAS: Re: kvm process disappears
From: Paolo Bonzini @ 2013-05-23 10:09 UTC
  To: Stefan Priebe - Profihost AG; +Cc: Josh Durgin, Stefan Hajnoczi, qemu-devel, pve-devel, Michael Roth

On 22/05/2013 14:24, Stefan Priebe - Profihost AG wrote:
>> commit dc7588c1eb3008bda53dde1d6b890cd299758155
>
> Yes. But i'm not sure whether this is coincendence.

Ok.

>> Do you see it even with "-drive discard=off"?
>
> I use discard / trim for thin provisioning and need it. This is a
> production system so I can't test without it.
>
> I use scsi virtio with discard_granularity=512

Note that 1.5.0 won't need this anymore, but it will need "-drive
discard=on".  Any chance you can try to reproduce it in a different
environment?

Paolo
* Re: [Qemu-devel] segfault in aio_bh_poll async.c:80 WAS: Re: kvm process disappears
From: Stefan Priebe @ 2013-05-23 19:22 UTC
  To: Paolo Bonzini; +Cc: Josh Durgin, Stefan Hajnoczi, qemu-devel, pve-devel, Michael Roth

On 23.05.2013 12:09, Paolo Bonzini wrote:
> Note that 1.5.0 won't need this anymore, but it will need "-drive
> discard=on".  Any chance you can try to reproduce it in a different
> environment?

OK, I'll try to update to 1.5.0.

Stefan
* Re: [Qemu-devel] kvm process disappears
From: Stefan Priebe @ 2013-05-27 21:09 UTC
  To: Stefan Hajnoczi; +Cc: Paolo Bonzini, qemu-devel, pve-devel

On 10.05.2013 13:09, Stefan Hajnoczi wrote:
> [...]
> Make sure to give the absolute path to catchexit.so.  Also keep in
> mind that this does not catch a normal return from main() or possibly
> other ways of terminating the process.
>
> You can hook more library functions, if necessary.

I'm really sorry for bothering you. It turned out to be a host kernel
bug. With NUMA balancing turned off (kernel 3.8.13) I see no VM crashes
at all.

Greets,
Stefan
* Re: [Qemu-devel] kvm process disappears
From: Stefan Hajnoczi @ 2013-05-28 8:06 UTC
  To: Stefan Priebe; +Cc: Paolo Bonzini, qemu-devel, pve-devel

On Mon, May 27, 2013 at 11:09:51PM +0200, Stefan Priebe wrote:
> I'm really sorry for bothering you. It turned out to be a host
> kernel bug. Without NUMA Balancing turned on (kernel 3.8.13) i see
> no vm crashes at all...

No worries.  Glad you found the solution.

Stefan
end of thread (~2013-05-28 8:06 UTC)

Thread overview: 19+ messages
2013-05-10  6:12 [Qemu-devel] kvm process disappears Stefan Priebe - Profihost AG
2013-05-10  7:15 ` [Qemu-devel] [pve-devel] " Alexandre DERUMIER
2013-05-10  7:20 ` Alexandre DERUMIER
2013-05-10  7:22   ` Stefan Priebe - Profihost AG
2013-05-10  7:28     ` Alexandre DERUMIER
2013-05-10  9:06       ` Stefan Priebe - Profihost AG
2013-05-10  7:42 ` [Qemu-devel] " Stefan Hajnoczi
2013-05-10  9:07   ` Stefan Priebe - Profihost AG
2013-05-10 11:09     ` Stefan Hajnoczi
2013-05-14 14:29       ` [Qemu-devel] segfault in aio_bh_poll async.c:80 WAS: " Stefan Priebe - Profihost AG
2013-05-14 15:05         ` Stefan Hajnoczi
2013-05-14 15:11           ` Stefan Priebe - Profihost AG
2013-05-22  6:26             ` Stefan Priebe - Profihost AG
2013-05-22  8:41               ` Paolo Bonzini
2013-05-22 12:24                 ` Stefan Priebe - Profihost AG
2013-05-23 10:09                   ` Paolo Bonzini
2013-05-23 19:22                     ` Stefan Priebe
2013-05-27 21:09       ` [Qemu-devel] " Stefan Priebe
2013-05-28  8:06         ` Stefan Hajnoczi