* [Qemu-devel] kvm process disappears
@ 2013-05-10  6:12 Stefan Priebe - Profihost AG
  2013-05-10  7:15 ` [Qemu-devel] [pve-devel] " Alexandre DERUMIER
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Stefan Priebe - Profihost AG @ 2013-05-10  6:12 UTC (permalink / raw)
  To: pve-devel, qemu-devel

Hello list,

I've now seen this several times: a VM is suddenly down, with no segfault,
nothing at all; the kvm process just disappears...

Does anybody have an idea how to debug this?

Sadly, I can't reproduce it. The QEMU version is 1.4.1.

Greets Stefan


* Re: [Qemu-devel] [pve-devel] kvm process disappears
  2013-05-10  6:12 [Qemu-devel] kvm process disappears Stefan Priebe - Profihost AG
@ 2013-05-10  7:15 ` Alexandre DERUMIER
  2013-05-10  7:20 ` Alexandre DERUMIER
  2013-05-10  7:42 ` [Qemu-devel] " Stefan Hajnoczi
  2 siblings, 0 replies; 19+ messages in thread
From: Alexandre DERUMIER @ 2013-05-10  7:15 UTC (permalink / raw)
  To: Stefan Priebe - Profihost AG; +Cc: qemu-devel, pve-devel

I have never seen this, sorry ...


* Re: [Qemu-devel] [pve-devel] kvm process disappears
  2013-05-10  6:12 [Qemu-devel] kvm process disappears Stefan Priebe - Profihost AG
  2013-05-10  7:15 ` [Qemu-devel] [pve-devel] " Alexandre DERUMIER
@ 2013-05-10  7:20 ` Alexandre DERUMIER
  2013-05-10  7:22   ` Stefan Priebe - Profihost AG
  2013-05-10  7:42 ` [Qemu-devel] " Stefan Hajnoczi
  2 siblings, 1 reply; 19+ messages in thread
From: Alexandre DERUMIER @ 2013-05-10  7:20 UTC (permalink / raw)
  To: Stefan Priebe - Profihost AG; +Cc: qemu-devel, pve-devel

Just an idea: maybe you are out of memory and the process is being killed?

Is there nothing in the logs?


* Re: [Qemu-devel] [pve-devel] kvm process disappears
  2013-05-10  7:20 ` Alexandre DERUMIER
@ 2013-05-10  7:22   ` Stefan Priebe - Profihost AG
  2013-05-10  7:28     ` Alexandre DERUMIER
  0 siblings, 1 reply; 19+ messages in thread
From: Stefan Priebe - Profihost AG @ 2013-05-10  7:22 UTC (permalink / raw)
  To: Alexandre DERUMIER; +Cc: qemu-devel, pve-devel

On 10.05.2013 09:20, Alexandre DERUMIER wrote:
> Just an idea, maybe are you out of memory and process are killed ?
> 
> nothing in logs ?

140 GB of memory is free, and there is nothing in dmesg either. Which logs did you mean?

Stefan



* Re: [Qemu-devel] [pve-devel] kvm process disappears
  2013-05-10  7:22   ` Stefan Priebe - Profihost AG
@ 2013-05-10  7:28     ` Alexandre DERUMIER
  2013-05-10  9:06       ` Stefan Priebe - Profihost AG
  0 siblings, 1 reply; 19+ messages in thread
From: Alexandre DERUMIER @ 2013-05-10  7:28 UTC (permalink / raw)
  To: Stefan Priebe - Profihost AG; +Cc: qemu-devel, pve-devel

>>140GB free mem.... also nothing in dmesg... which logs did you mean? 
I was thinking of /var/log/messages, where the OOM killer logs. But that does not seem to be your case ;)

Do you use HA?


* Re: [Qemu-devel] kvm process disappears
  2013-05-10  6:12 [Qemu-devel] kvm process disappears Stefan Priebe - Profihost AG
  2013-05-10  7:15 ` [Qemu-devel] [pve-devel] " Alexandre DERUMIER
  2013-05-10  7:20 ` Alexandre DERUMIER
@ 2013-05-10  7:42 ` Stefan Hajnoczi
  2013-05-10  9:07   ` Stefan Priebe - Profihost AG
  2 siblings, 1 reply; 19+ messages in thread
From: Stefan Hajnoczi @ 2013-05-10  7:42 UTC (permalink / raw)
  To: Stefan Priebe - Profihost AG; +Cc: qemu-devel, pve-devel

On Fri, May 10, 2013 at 08:12:39AM +0200, Stefan Priebe - Profihost AG wrote:
> i've now seen this several times. A VM is suddently down no segfault
> nothing the kvm process just disappears...
> 
> Anybody any idea how to debug this?
> 
> Sadly i can't reproduce. Qemu version is 1.4.1.

1. Double-check dmesg(1) for out-of-memory killer or segfaults.

2. Check libvirt or other management tool logs again.

3. Either use gdb or an LD_PRELOAD library that catches exit(3) and
   _exit(2) and dumps core using abort(3).  Make sure core dumps are
   enabled.
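
For step 3, the core-file limit can also be checked and raised from C, for
example from a small helper or from the preloaded library itself. A minimal
sketch, assuming POSIX getrlimit(2) and setrlimit(2); the function names
raise_core_limit and core_dumps_allowed are made up for illustration and do
not come from QEMU:

```c
#include <sys/resource.h>

/* Raise the soft core-file size limit to the hard limit of this process.
 * Returns 0 on success, -1 on failure (errno is set by the failing call). */
int raise_core_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0) {
        return -1;
    }
    /* An unprivileged process may raise its soft limit up to the hard limit. */
    rl.rlim_cur = rl.rlim_max;
    return setrlimit(RLIMIT_CORE, &rl);
}

/* Returns 1 if the soft limit permits a core file, 0 if it is zero,
 * and -1 if the limit could not be queried. */
int core_dumps_allowed(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0) {
        return -1;
    }
    return rl.rlim_cur != 0;
}
```

When QEMU is started from a shell, "ulimit -c unlimited" in that shell has
the same effect on the soft limit. Where the core file ends up also depends
on /proc/sys/kernel/core_pattern on the host.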

Stefan


* Re: [Qemu-devel] [pve-devel] kvm process disappears
  2013-05-10  7:28     ` Alexandre DERUMIER
@ 2013-05-10  9:06       ` Stefan Priebe - Profihost AG
  0 siblings, 0 replies; 19+ messages in thread
From: Stefan Priebe - Profihost AG @ 2013-05-10  9:06 UTC (permalink / raw)
  To: Alexandre DERUMIER; +Cc: qemu-devel, pve-devel

On 10.05.2013 09:28, Alexandre DERUMIER wrote:
>>> 140GB free mem.... also nothing in dmesg... which logs did you mean? 
> I thinked of /var/log/messages, logs with OOM Killer. But seem to not be your case ;)

Nothing...

> Do you use HA ?
No

Stefan


* Re: [Qemu-devel] kvm process disappears
  2013-05-10  7:42 ` [Qemu-devel] " Stefan Hajnoczi
@ 2013-05-10  9:07   ` Stefan Priebe - Profihost AG
  2013-05-10 11:09     ` Stefan Hajnoczi
  0 siblings, 1 reply; 19+ messages in thread
From: Stefan Priebe - Profihost AG @ 2013-05-10  9:07 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: qemu-devel, pve-devel

On 10.05.2013 09:42, Stefan Hajnoczi wrote:
> On Fri, May 10, 2013 at 08:12:39AM +0200, Stefan Priebe - Profihost AG wrote:
>> i've now seen this several times. A VM is suddently down no segfault
>> nothing the kvm process just disappears...
>>
>> Anybody any idea how to debug this?
>>
>> Sadly i can't reproduce. Qemu version is 1.4.1.
> 
> 1. Double-check dmesg(1) for out-of-memory killer or segfaults.
Done: nothing in there. Also, 120-140 GB of memory is free.

> 2. Check libvirt or other management tool logs again.

The kvm process was started via bash; no management tool is involved.

> 3. Either use gdb or an LD_PRELOAD library that catches exit(3) and
>    _exit(2) and dumps core using abort(3).  Make sure core dumps are
>    enabled.

LD_PRELOAD sounds good. Can you point me to such a library?

Stefan


* Re: [Qemu-devel] kvm process disappears
  2013-05-10  9:07   ` Stefan Priebe - Profihost AG
@ 2013-05-10 11:09     ` Stefan Hajnoczi
  2013-05-14 14:29       ` [Qemu-devel] segfault in aio_bh_poll async.c:80 WAS: " Stefan Priebe - Profihost AG
  2013-05-27 21:09       ` [Qemu-devel] " Stefan Priebe
  0 siblings, 2 replies; 19+ messages in thread
From: Stefan Hajnoczi @ 2013-05-10 11:09 UTC (permalink / raw)
  To: Stefan Priebe - Profihost AG; +Cc: qemu-devel, pve-devel

On Fri, May 10, 2013 at 11:07 AM, Stefan Priebe - Profihost AG
<s.priebe@profihost.ag> wrote:
> Am 10.05.2013 09:42, schrieb Stefan Hajnoczi:
>> On Fri, May 10, 2013 at 08:12:39AM +0200, Stefan Priebe - Profihost AG wrote:
>> 3. Either use gdb or an LD_PRELOAD library that catches exit(3) and
>>    _exit(2) and dumps core using abort(3).  Make sure core dumps are
>>    enabled.
>
> LD_PRELOAD sounds good can you point me to such a lib?

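$ cat /tmp/catchexit.c
#include <unistd.h>
#include <stdlib.h>

/* Preloaded replacement for exit(3): report and dump core instead. */
void exit(int status)
{
    const char msg[] = "*** CAUGHT EXIT, DUMPING CORE ***\n";

    (void)status;
    /* sizeof msg - 1: do not write the terminating NUL byte */
    write(2, msg, sizeof msg - 1);
    abort();
}

/* _exit(2) resolves to the same function */
void _exit(int status) __attribute__((alias("exit")));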

$ gcc -o catchexit.so -shared -fPIC -std=gnu99 catchexit.c

$ LD_PRELOAD=/tmp/catchexit.so x86_64-softmmu/qemu-system-x86_64 -m 1024 -enable-kvm -cpu host -vga asdf
Unknown vga type: asdf
*** CAUGHT EXIT, DUMPING CORE ***
Aborted (core dumped)

Make sure to give the absolute path to catchexit.so.  Also keep in
mind that this does not catch a normal return from main() or possibly
other ways of terminating the process.

You can hook more library functions, if necessary.

Stefan


* [Qemu-devel] segfault in aio_bh_poll async.c:80 WAS: Re: kvm process disappears
  2013-05-10 11:09     ` Stefan Hajnoczi
@ 2013-05-14 14:29       ` Stefan Priebe - Profihost AG
  2013-05-14 15:05         ` Stefan Hajnoczi
  2013-05-27 21:09       ` [Qemu-devel] " Stefan Priebe
  1 sibling, 1 reply; 19+ messages in thread
From: Stefan Priebe - Profihost AG @ 2013-05-14 14:29 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Paolo Bonzini, qemu-devel, pve-devel, mdroth

On 10.05.2013 13:09, Stefan Hajnoczi wrote:
> On Fri, May 10, 2013 at 11:07 AM, Stefan Priebe - Profihost AG
> <s.priebe@profihost.ag> wrote:
>> Am 10.05.2013 09:42, schrieb Stefan Hajnoczi:
>>> On Fri, May 10, 2013 at 08:12:39AM +0200, Stefan Priebe - Profihost AG wrote:
>>> 3. Either use gdb or an LD_PRELOAD library that catches exit(3) and
>>>    _exit(2) and dumps core using abort(3).  Make sure core dumps are
>>>    enabled.

This time I got a segfault, with QEMU 1.4.1 plus the patch at
http://git.qemu.org/?p=qemu.git;a=commitdiff;h=dc7588c1eb3008bda53dde1d6b890cd299758155.

The location is aio_bh_poll, async.c:80.

The code around it:

    for (bh = ctx->first_bh; bh; bh = next) {
        next = bh->next;
        if (!bh->deleted && bh->scheduled) {
            bh->scheduled = 0;
            if (!bh->idle)
                ret = 1;
            bh->idle = 0;
            bh->cb(bh->opaque);
        }
    }

    ctx->walking_bh--;

    /* remove deleted bhs */
    if (!ctx->walking_bh) {
        bhp = &ctx->first_bh;
        while (*bhp) {
            bh = *bhp;
            /* ===== THIS IS THE SEGFAULT LINE ===== */
            if (bh->deleted) {
                *bhp = bh->next;
                g_free(bh);
            } else {
                bhp = &bh->next;
            }
        }
    }

    return ret;
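
The two loops above follow a mark-then-sweep pattern: a bottom half is only
marked as deleted, and the node is unlinked and freed afterwards, once no
walk of the list is in progress. A stripped-down, self-contained sketch of
that pattern follows. It is not QEMU's async.c; Ctx, BH, bh_new,
bh_schedule, bh_delete, bh_poll, bh_count and count_cb are illustrative
names only:

```c
#include <assert.h>
#include <stdlib.h>

typedef struct BH BH;
struct BH {
    void (*cb)(void *opaque);
    void *opaque;
    BH *next;
    int scheduled;
    int deleted;
};

typedef struct {
    BH *first_bh;
    int walking_bh;          /* nesting depth of bh_poll() calls */
} Ctx;

/* Allocate a bottom half and push it onto the front of the list. */
BH *bh_new(Ctx *ctx, void (*cb)(void *), void *opaque)
{
    BH *bh = calloc(1, sizeof(*bh));

    assert(bh != NULL);
    bh->cb = cb;
    bh->opaque = opaque;
    bh->next = ctx->first_bh;
    ctx->first_bh = bh;
    return bh;
}

void bh_schedule(BH *bh) { bh->scheduled = 1; }

/* Only mark the node; bh_poll() frees it during the sweep. */
void bh_delete(BH *bh) { bh->scheduled = 0; bh->deleted = 1; }

/* Run scheduled callbacks, then sweep deleted nodes. Returns callbacks run. */
int bh_poll(Ctx *ctx)
{
    BH *bh, *next, **bhp;
    int ran = 0;

    ctx->walking_bh++;
    for (bh = ctx->first_bh; bh; bh = next) {
        next = bh->next;     /* read before the callback may mark bh deleted */
        if (!bh->deleted && bh->scheduled) {
            bh->scheduled = 0;
            ran++;
            bh->cb(bh->opaque);
        }
    }
    ctx->walking_bh--;

    if (!ctx->walking_bh) {  /* sweep only when no outer walk is active */
        bhp = &ctx->first_bh;
        while (*bhp) {
            bh = *bhp;
            if (bh->deleted) {
                *bhp = bh->next;
                free(bh);
            } else {
                bhp = &bh->next;
            }
        }
    }
    return ran;
}

/* Number of nodes currently linked into the list. */
int bh_count(const Ctx *ctx)
{
    int n = 0;

    for (const BH *bh = ctx->first_bh; bh; bh = bh->next) {
        n++;
    }
    return n;
}

/* Example callback: increments the int that opaque points to. */
void count_cb(void *opaque) { ++*(int *)opaque; }
```

In this pattern the sweep dereferences every node still linked into the
list, so a fault at bh->deleted suggests that some list pointer no longer
refers to a valid node. The sketch itself says nothing about how that could
come about.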

Greets,
Stefan


* Re: [Qemu-devel] segfault in aio_bh_poll async.c:80 WAS: Re: kvm process disappears
  2013-05-14 14:29       ` [Qemu-devel] segfault in aio_bh_poll async.c:80 WAS: " Stefan Priebe - Profihost AG
@ 2013-05-14 15:05         ` Stefan Hajnoczi
  2013-05-14 15:11           ` Stefan Priebe - Profihost AG
  0 siblings, 1 reply; 19+ messages in thread
From: Stefan Hajnoczi @ 2013-05-14 15:05 UTC (permalink / raw)
  To: Stefan Priebe - Profihost AG
  Cc: Paolo Bonzini, qemu-devel, pve-devel, Michael Roth


Interesting crash.  Do you have the output of "thread apply all bt"?

I would try looking at the AioContext using "p *ctx", and print out
the ctx->first_bh linked list.

Stefan


* Re: [Qemu-devel] segfault in aio_bh_poll async.c:80 WAS: Re: kvm process disappears
  2013-05-14 15:05         ` Stefan Hajnoczi
@ 2013-05-14 15:11           ` Stefan Priebe - Profihost AG
  2013-05-22  6:26             ` Stefan Priebe - Profihost AG
  0 siblings, 1 reply; 19+ messages in thread
From: Stefan Priebe - Profihost AG @ 2013-05-14 15:11 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Paolo Bonzini, qemu-devel, pve-devel, Michael Roth

On 14.05.2013 17:05, Stefan Hajnoczi wrote:
> Interesting crash.  Do you have the output of "thread apply all bt"?
> 
> I would try looking at the AioContext using "p *ctx", and print out
> the ctx->first_bh linked list.

Hi,

as I can't reproduce it, no ;-( I only saw the kernel segfault message and
used addr2line with a QEMU debug package to get the code line.
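
For reference, the address handed to addr2line comes from the kernel's
segfault line, which on x86 usually has the form
"name[pid]: segfault at ADDR ip IP sp SP error N in OBJECT[BASE+SIZE]".
A minimal sketch of extracting IP, BASE and the offset IP - BASE from such
a line; segv_offset is a made-up helper name, and it assumes the line has
the form just described:

```c
#include <stdio.h>
#include <string.h>

/* Parse a kernel segfault log line.
 * On success stores the faulting instruction pointer, the mapping base and
 * their difference, and returns 0. Returns -1 if the line does not match. */
int segv_offset(const char *line, unsigned long long *ip,
                unsigned long long *base, unsigned long long *offset)
{
    unsigned long long size;
    const char *p, *in, *lb;

    p = strstr(line, " ip ");
    if (p == NULL || sscanf(p, " ip %llx", ip) != 1) {
        return -1;
    }
    /* The mapping is printed as " in OBJECT[BASE+SIZE]" after the ip field. */
    in = strstr(p, " in ");
    if (in == NULL) {
        return -1;
    }
    lb = strchr(in, '[');
    if (lb == NULL || sscanf(lb, "[%llx+%llx]", base, &size) != 2) {
        return -1;
    }
    *offset = *ip - *base;
    return 0;
}
```

For a binary linked at a fixed address, the ip value itself can be passed to
"addr2line -e <binary with debug info> <address>"; for a position-independent
executable or a shared library, the offset is usually the value to look up.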

Stefan


* Re: [Qemu-devel] segfault in aio_bh_poll async.c:80 WAS: Re: kvm process disappears
  2013-05-14 15:11           ` Stefan Priebe - Profihost AG
@ 2013-05-22  6:26             ` Stefan Priebe - Profihost AG
  2013-05-22  8:41               ` Paolo Bonzini
  0 siblings, 1 reply; 19+ messages in thread
From: Stefan Priebe - Profihost AG @ 2013-05-22  6:26 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Josh Durgin, Paolo Bonzini, qemu-devel, pve-devel, Michael Roth

Hi Josh, hi Stefan,

I've now seen this again two or three times. It always happens when we run fstrim inside the guest.

And I have only seen this since applying Josh's async rbd patch.

Stefan




* Re: [Qemu-devel] segfault in aio_bh_poll async.c:80 WAS: Re: kvm process disappears
  2013-05-22  6:26             ` Stefan Priebe - Profihost AG
@ 2013-05-22  8:41               ` Paolo Bonzini
  2013-05-22 12:24                 ` Stefan Priebe - Profihost AG
  0 siblings, 1 reply; 19+ messages in thread
From: Paolo Bonzini @ 2013-05-22  8:41 UTC (permalink / raw)
  To: Stefan Priebe - Profihost AG
  Cc: Josh Durgin, Stefan Hajnoczi, qemu-devel, pve-devel, Michael Roth

On 22/05/2013 08:26, Stefan Priebe - Profihost AG wrote:
>> Hi,
>>
>> as i can't reproduce no ;-( i just saw the kernel segfault message and
>> used addr2line and a qemu dbg package to get the code line.
> 
> I've now seen this again for two or three times. It always happens
> when we do an fstrim inside the guest.


> And I've seen this first since josh async rbd patch.

This one?

commit dc7588c1eb3008bda53dde1d6b890cd299758155
Author: Josh Durgin <josh.durgin@inktank.com>
Date:   Fri Mar 29 13:03:23 2013 -0700

    rbd: add an asynchronous flush
    
    The existing bdrv_co_flush_to_disk implementation uses rbd_flush(),
    which is sychronous and causes the main qemu thread to block until it
    is complete. This results in unresponsiveness and extra latency for
    the guest.
    
    Fix this by using an asynchronous version of flush.  This was added to
    librbd with a special #define to indicate its presence, since it will
    be backported to stable versions. Thus, there is no need to check the
    version of librbd.
    
    Implement this as bdrv_aio_flush, since it matches other aio functions
    in the rbd block driver, and leave out bdrv_co_flush_to_disk when the
    asynchronous version is available.
    
    Reported-by: Oliver Francke <oliver@filoo.de>
    Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
    Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>


Do you see it even with "-drive discard=off"?

Paolo


* Re: [Qemu-devel] segfault in aio_bh_poll async.c:80 WAS: Re: kvm process disappears
  2013-05-22  8:41               ` Paolo Bonzini
@ 2013-05-22 12:24                 ` Stefan Priebe - Profihost AG
  2013-05-23 10:09                   ` Paolo Bonzini
  0 siblings, 1 reply; 19+ messages in thread
From: Stefan Priebe - Profihost AG @ 2013-05-22 12:24 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Josh Durgin, Stefan Hajnoczi, qemu-devel, pve-devel, Michael Roth

On 22.05.2013 at 10:41, Paolo Bonzini <pbonzini@redhat.com> wrote:

> Il 22/05/2013 08:26, Stefan Priebe - Profihost AG ha scritto:
>>> Hi,
>>> 
>>> as i can't reproduce no ;-( i just saw the kernel segfault message and
>>> used addr2line and a qemu dbg package to get the code line.
>> 
>> I've now seen this again for two or three times. It always happens
>> when we do an fstrim inside the guest.
> 
> 
>> And I've seen this first since josh async rbd patch.
> 
> This one?
> 
> commit dc7588c1eb3008bda53dde1d6b890cd299758155

Yes, but I'm not sure whether this is a coincidence.

> 
>  Do you see it even with "-drive discard=off"?
> 
I use discard/TRIM for thin provisioning and need it. This is a production system, so I can't test without it.

I use virtio-scsi with discard_granularity=512.

Stefan


* Re: [Qemu-devel] segfault in aio_bh_poll async.c:80 WAS: Re: kvm process disappears
  2013-05-22 12:24                 ` Stefan Priebe - Profihost AG
@ 2013-05-23 10:09                   ` Paolo Bonzini
  2013-05-23 19:22                     ` Stefan Priebe
  0 siblings, 1 reply; 19+ messages in thread
From: Paolo Bonzini @ 2013-05-23 10:09 UTC (permalink / raw)
  To: Stefan Priebe - Profihost AG
  Cc: Josh Durgin, Stefan Hajnoczi, qemu-devel, pve-devel, Michael Roth

On 22/05/2013 14:24, Stefan Priebe - Profihost AG wrote:
> Am 22.05.2013 um 10:41 schrieb Paolo Bonzini <pbonzini@redhat.com>:
> 
>> Il 22/05/2013 08:26, Stefan Priebe - Profihost AG ha scritto:
>>>> Hi,
>>>>
>>>> as i can't reproduce no ;-( i just saw the kernel segfault message and
>>>> used addr2line and a qemu dbg package to get the code line.
>>>
>>> I've now seen this again for two or three times. It always happens
>>> when we do an fstrim inside the guest.
>>
>>
>>> And I've seen this first since josh async rbd patch.
>>
>> This one?
>>
>> commit dc7588c1eb3008bda53dde1d6b890cd299758155
> 
> Yes. But i'm not sure whether this is coincendence.

Ok.

>>  Do you see it even with "-drive discard=off"?
>
> I use discard / trim for thin provisioning and need it. This is a
> production system so I can't test without it.
> 
> I use scsi virtio with discard_granularity=512

Note that 1.5.0 won't need this anymore, but it will need "-drive
discard=on".  Any chance you can try to reproduce it in a different
environment?

Paolo


* Re: [Qemu-devel] segfault in aio_bh_poll async.c:80 WAS: Re: kvm process disappears
  2013-05-23 10:09                   ` Paolo Bonzini
@ 2013-05-23 19:22                     ` Stefan Priebe
  0 siblings, 0 replies; 19+ messages in thread
From: Stefan Priebe @ 2013-05-23 19:22 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Josh Durgin, Stefan Hajnoczi, qemu-devel, pve-devel, Michael Roth

On 23.05.2013 12:09, Paolo Bonzini wrote:
>> I use scsi virtio with discard_granularity=512
>
> Note that 1.5.0 won't need this anymore, but it will need "-drive
> discard=on".  Any chance you can try to reproduce it in a different
> environment?

OK, I'll try to update to 1.5.0.

Stefan


* Re: [Qemu-devel] kvm process disappears
  2013-05-10 11:09     ` Stefan Hajnoczi
  2013-05-14 14:29       ` [Qemu-devel] segfault in aio_bh_poll async.c:80 WAS: " Stefan Priebe - Profihost AG
@ 2013-05-27 21:09       ` Stefan Priebe
  2013-05-28  8:06         ` Stefan Hajnoczi
  1 sibling, 1 reply; 19+ messages in thread
From: Stefan Priebe @ 2013-05-27 21:09 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Paolo Bonzini, qemu-devel, pve-devel

I'm really sorry for bothering you. It turned out to be a host kernel
bug. With NUMA balancing turned off (kernel 3.8.13), I see no VM
crashes at all...

Greets,
Stefan


* Re: [Qemu-devel] kvm process disappears
  2013-05-27 21:09       ` [Qemu-devel] " Stefan Priebe
@ 2013-05-28  8:06         ` Stefan Hajnoczi
  0 siblings, 0 replies; 19+ messages in thread
From: Stefan Hajnoczi @ 2013-05-28  8:06 UTC (permalink / raw)
  To: Stefan Priebe; +Cc: Paolo Bonzini, qemu-devel, pve-devel

On Mon, May 27, 2013 at 11:09:51PM +0200, Stefan Priebe wrote:
> I'm really sorry for bothering you. It turned out to be a host
> kernel bug. Without NUMA Balancing turned on (kernel 3.8.13) i see
> no vm crashes at all...

No worries.  Glad you found the solution.

Stefan
