* xl create crash when using stub domains
@ 2011-09-16  0:33 Jeremy Fitzhardinge
  2011-09-21  1:34 ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 6+ messages in thread
From: Jeremy Fitzhardinge @ 2011-09-16  0:33 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, Ian Jackson, Stefano Stabellini

When I create an HVM domain with stubdom enabled, it crashes at:

(gdb) run create  -c /etc/xen/f14hv64  vcpus=4 xen_platform_pci=0 'boot="d"'
Starting program: /usr/sbin/xl create  -c /etc/xen/f14hv64  vcpus=4 xen_platform_pci=0 'boot="d"'
[Thread debugging using libthread_db enabled]
Parsing config file /etc/xen/f14hv64
xc: info: VIRTUAL MEMORY ARRANGEMENT:
  Loader:        0000000000100000->000000000017b9ec
  TOTAL:         0000000000000000->000000003f800000
  ENTRY ADDRESS: 0000000000100000
xc: info: PHYSICAL MEMORY ALLOCATION:
  4KB PAGES: 0x0000000000000200
  2MB PAGES: 0x00000000000001fb
  1GB PAGES: 0x0000000000000000
xc: error: panic: xc_dom_bzimageloader.c:588: xc_dom_probe_bzimage_kernel: kernel is not a bzImage: Invalid kernel
Detaching after fork from child process 26888.
[New Thread 0x7ffff7342700 (LWP 26889)]
[Thread 0x7ffff7342700 (LWP 26889) exited]
[New Thread 0x7ffff7342700 (LWP 26921)]

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7bbbec5 in libxl__wait_for_device_model (gc=0x7fffffffdbb0, 
    domid=22, state=0x7ffff7bc1b8c "running", starting=0x623760, 
    check_callback=0, check_callback_userdata=0x0) at libxl_device.c:555
555	    if (starting && starting->for_spawn->fd > xs_fileno(xsh))
(gdb) bt
#0  0x00007ffff7bbbec5 in libxl__wait_for_device_model (gc=0x7fffffffdbb0, 
    domid=22, state=0x7ffff7bc1b8c "running", starting=0x623760, 
    check_callback=0, check_callback_userdata=0x0) at libxl_device.c:555
#1  0x00007ffff7bb37b5 in libxl__confirm_device_model_startup (
    gc=0x7fffffffdbb0, starting=0x623760) at libxl_dm.c:922
#2  0x00007ffff7bb229b in do_domain_create (gc=0x7fffffffdbb0, 
    d_config=0x7fffffffde30, cb=0x40a053 <autoconnect_console>, 
    priv=0x7fffffffde14, domid_out=0x619ed8, restore_fd=-1)
    at libxl_create.c:576
#3  0x00007ffff7bb2481 in libxl_domain_create_new (ctx=<optimized out>, 
    d_config=<optimized out>, cb=<optimized out>, priv=<optimized out>, 
    domid=<optimized out>) at libxl_create.c:626
#4  0x0000000000409424 in create_domain (dom_info=0x7fffffffe0c0)
    at xl_cmdimpl.c:1520
#5  0x000000000040ceef in main_create (argc=6, argv=0x7fffffffe6b0)
    at xl_cmdimpl.c:3188
#6  0x000000000040501b in main (argc=6, argv=0x7fffffffe6b0) at xl.c:151

The stubdom seems fine, and when I unpause the main domain it seems to work fine.

	J


* Re: xl create crash when using stub domains
  2011-09-16  0:33 xl create crash when using stub domains Jeremy Fitzhardinge
@ 2011-09-21  1:34 ` Jeremy Fitzhardinge
  2011-09-21  8:56   ` Ian Campbell
  0 siblings, 1 reply; 6+ messages in thread
From: Jeremy Fitzhardinge @ 2011-09-21  1:34 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, Ian Jackson, Stefano Stabellini

On 09/15/2011 05:33 PM, Jeremy Fitzhardinge wrote:
> When I create an HVM domain with stubdom enabled, it crashes at:
>
> (gdb) run create  -c /etc/xen/f14hv64  vcpus=4 xen_platform_pci=0 'boot="d"'
> Starting program: /usr/sbin/xl create  -c /etc/xen/f14hv64  vcpus=4 xen_platform_pci=0 'boot="d"'
> [Thread debugging using libthread_db enabled]
> Parsing config file /etc/xen/f14hv64
> xc: info: VIRTUAL MEMORY ARRANGEMENT:
>   Loader:        0000000000100000->000000000017b9ec
>   TOTAL:         0000000000000000->000000003f800000
>   ENTRY ADDRESS: 0000000000100000
> xc: info: PHYSICAL MEMORY ALLOCATION:
>   4KB PAGES: 0x0000000000000200
>   2MB PAGES: 0x00000000000001fb
>   1GB PAGES: 0x0000000000000000
> xc: error: panic: xc_dom_bzimageloader.c:588: xc_dom_probe_bzimage_kernel: kernel is not a bzImage: Invalid kernel
> Detaching after fork from child process 26888.
> [New Thread 0x7ffff7342700 (LWP 26889)]
> [Thread 0x7ffff7342700 (LWP 26889) exited]
> [New Thread 0x7ffff7342700 (LWP 26921)]
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x00007ffff7bbbec5 in libxl__wait_for_device_model (gc=0x7fffffffdbb0, 
>     domid=22, state=0x7ffff7bc1b8c "running", starting=0x623760, 
>     check_callback=0, check_callback_userdata=0x0) at libxl_device.c:555
> 555	    if (starting && starting->for_spawn->fd > xs_fileno(xsh))
> (gdb) bt

This patch seems to fix it, but I don't know if it is a real fix or just
papering over something else.

    J

diff -r 7779e12cc99e tools/libxl/libxl_device.c
--- a/tools/libxl/libxl_device.c	Tue Aug 16 17:05:18 2011 -0700
+++ b/tools/libxl/libxl_device.c	Tue Sep 20 18:23:03 2011 -0700
@@ -552,7 +552,7 @@
     tv.tv_sec = LIBXL_DEVICE_MODEL_START_TIMEOUT;
     tv.tv_usec = 0;
     nfds = xs_fileno(xsh) + 1;
-    if (starting && starting->for_spawn->fd > xs_fileno(xsh))
+    if (starting && starting->for_spawn && starting->for_spawn->fd > xs_fileno(xsh))
         nfds = starting->for_spawn->fd + 1;
 
     while (rc > 0 || (!rc && tv.tv_sec > 0)) {
@@ -586,7 +586,7 @@
         free(p);
         FD_ZERO(&rfds);
         FD_SET(xs_fileno(xsh), &rfds);
-        if (starting)
+        if (starting && starting->for_spawn)
             FD_SET(starting->for_spawn->fd, &rfds);
         rc = select(nfds, &rfds, NULL, NULL, &tv);
         if (rc > 0) {
@@ -597,7 +597,7 @@
                 else
                     goto again;
             }
-            if (starting && FD_ISSET(starting->for_spawn->fd, &rfds)) {
+            if (starting && starting->for_spawn && FD_ISSET(starting->for_spawn->fd, &rfds)) {
                 unsigned char dummy;
                 if (read(starting->for_spawn->fd, &dummy, sizeof(dummy)) != 1)
                     LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_DEBUG,


* Re: xl create crash when using stub domains
  2011-09-21  1:34 ` Jeremy Fitzhardinge
@ 2011-09-21  8:56   ` Ian Campbell
  2011-09-21 23:06     ` Jeremy Fitzhardinge
  2011-09-27 17:04     ` Ian Jackson
  0 siblings, 2 replies; 6+ messages in thread
From: Ian Campbell @ 2011-09-21  8:56 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: xen-devel, Ian Jackson, Stefano Stabellini

On Wed, 2011-09-21 at 02:34 +0100, Jeremy Fitzhardinge wrote:
> On 09/15/2011 05:33 PM, Jeremy Fitzhardinge wrote:
> > When I create an HVM domain with stubdom enabled, it crashes at:
> >
> > (gdb) run create  -c /etc/xen/f14hv64  vcpus=4 xen_platform_pci=0 'boot="d"'
> > Starting program: /usr/sbin/xl create  -c /etc/xen/f14hv64  vcpus=4 xen_platform_pci=0 'boot="d"'
> > [Thread debugging using libthread_db enabled]
> > Parsing config file /etc/xen/f14hv64
> > xc: info: VIRTUAL MEMORY ARRANGEMENT:
> >   Loader:        0000000000100000->000000000017b9ec
> >   TOTAL:         0000000000000000->000000003f800000
> >   ENTRY ADDRESS: 0000000000100000
> > xc: info: PHYSICAL MEMORY ALLOCATION:
> >   4KB PAGES: 0x0000000000000200
> >   2MB PAGES: 0x00000000000001fb
> >   1GB PAGES: 0x0000000000000000
> > xc: error: panic: xc_dom_bzimageloader.c:588: xc_dom_probe_bzimage_kernel: kernel is not a bzImage: Invalid kernel

FWIW I don't get this message. It seems unrelated to the issue here but
makes me curious...

> > Detaching after fork from child process 26888.
> > [New Thread 0x7ffff7342700 (LWP 26889)]
> > [Thread 0x7ffff7342700 (LWP 26889) exited]
> > [New Thread 0x7ffff7342700 (LWP 26921)]
> >
> > Program received signal SIGSEGV, Segmentation fault.
> > 0x00007ffff7bbbec5 in libxl__wait_for_device_model (gc=0x7fffffffdbb0, 
> >     domid=22, state=0x7ffff7bc1b8c "running", starting=0x623760, 
> >     check_callback=0, check_callback_userdata=0x0) at libxl_device.c:555
> > 555	    if (starting && starting->for_spawn->fd > xs_fileno(xsh))
> > (gdb) bt
> 
> This patch seems to fix it, but I don't know if it is a real fix or just
> papering over something else.

I think this is correct, because starting->for_spawn is only valid if the
device model was launched with libxl__spawn_spawn, which is only the case
for a process-based device model.

libxl__create_device_model heads off into libxl__create_stubdom for the
stubdom case and explicitly sets for_spawn to NULL.
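
To make the failure mode concrete, here is a minimal sketch of the
guarded check. The struct and function names are hypothetical stand-ins
that model only the fields involved; this is not the libxl code itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the libxl-internal structures; only the
 * fields needed for the illustration are modelled. */
struct spawn_starting { int fd; };
struct dm_starting { struct spawn_starting *for_spawn; };

/* Compute the first argument for select(): one more than the highest
 * fd being watched. for_spawn is NULL when the device model runs in a
 * stub domain, so it must be checked before it is dereferenced. */
static int wait_nfds(int xs_fd, const struct dm_starting *starting)
{
    assert(xs_fd >= 0);
    int nfds = xs_fd + 1;
    if (starting && starting->for_spawn && starting->for_spawn->fd > xs_fd)
        nfds = starting->for_spawn->fd + 1;
    return nfds;
}
```

Without the middle test, the stubdom case dereferences a NULL pointer,
which matches the SIGSEGV at libxl_device.c:555 in the backtrace.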

Hmm, actually this function never uses starting except to get at
for_spawn, so perhaps we should just pass in the for_spawn directly.
Patch to that effect follows.

Ian.

ps: can you add this to your ~/.hgrc please:
[diff]
showfunc = True

8<-----------------------------------------------

# HG changeset patch
# User Ian Campbell <ian.campbell@citrix.com>
# Date 1316595312 -3600
# Node ID eb9330c89fd3843ff0b1348b0ef21cfeb22d4a76
# Parent  21db7a7dd18483aab5c651f2364c09e8e492d7b1
libxl: make libxl__wait_for_device_model use libxl__spawn_starting directly

Instead of indirecting via libxl__device_model_starting. This fixes a
segmentation fault using stubdomains where starting->for_spawn is
(validly) NULL because starting a stubdom doesn't need to spawn a
process.

Most callers of libxl__wait_for_device_model already pass NULL for
this variable (because they are not on the starting path), so only
libxl__confirm_device_model_startup needs to change.

Reported-by: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>

diff -r 21db7a7dd184 -r eb9330c89fd3 tools/libxl/libxl_device.c
--- a/tools/libxl/libxl_device.c	Tue Sep 20 16:50:44 2011 +0100
+++ b/tools/libxl/libxl_device.c	Wed Sep 21 09:55:12 2011 +0100
@@ -528,7 +528,7 @@ out:
 
 int libxl__wait_for_device_model(libxl__gc *gc,
                                  uint32_t domid, char *state,
-                                 libxl__device_model_starting *starting,
+                                 libxl__spawn_starting *spawning,
                                  int (*check_callback)(libxl__gc *gc,
                                                        uint32_t domid,
                                                        const char *state,
@@ -558,12 +558,12 @@ int libxl__wait_for_device_model(libxl__
     tv.tv_sec = LIBXL_DEVICE_MODEL_START_TIMEOUT;
     tv.tv_usec = 0;
     nfds = xs_fileno(xsh) + 1;
-    if (starting && starting->for_spawn->fd > xs_fileno(xsh))
-        nfds = starting->for_spawn->fd + 1;
+    if (spawning && spawning->fd > xs_fileno(xsh))
+        nfds = spawning->fd + 1;
 
     while (rc > 0 || (!rc && tv.tv_sec > 0)) {
-        if ( starting ) {
-            rc = libxl__spawn_check(gc, starting->for_spawn);
+        if ( spawning ) {
+            rc = libxl__spawn_check(gc, spawning);
             if ( rc ) {
                 LIBXL__LOG(ctx, LIBXL__LOG_ERROR,
                            "Device Model died during startup");
@@ -592,8 +592,8 @@ again:
         free(p);
         FD_ZERO(&rfds);
         FD_SET(xs_fileno(xsh), &rfds);
-        if (starting)
-            FD_SET(starting->for_spawn->fd, &rfds);
+        if (spawning)
+            FD_SET(spawning->fd, &rfds);
         rc = select(nfds, &rfds, NULL, NULL, &tv);
         if (rc > 0) {
             if (FD_ISSET(xs_fileno(xsh), &rfds)) {
@@ -603,9 +603,9 @@ again:
                 else
                     goto again;
             }
-            if (starting && FD_ISSET(starting->for_spawn->fd, &rfds)) {
+            if (spawning && FD_ISSET(spawning->fd, &rfds)) {
                 unsigned char dummy;
-                if (read(starting->for_spawn->fd, &dummy, sizeof(dummy)) != 1)
+                if (read(spawning->fd, &dummy, sizeof(dummy)) != 1)
                     LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_DEBUG,
                                      "failed to read spawn status pipe");
             }
diff -r 21db7a7dd184 -r eb9330c89fd3 tools/libxl/libxl_dm.c
--- a/tools/libxl/libxl_dm.c	Tue Sep 20 16:50:44 2011 +0100
+++ b/tools/libxl/libxl_dm.c	Wed Sep 21 09:55:12 2011 +0100
@@ -934,7 +934,7 @@ int libxl__confirm_device_model_startup(
 {
     int detach;
     int problem = libxl__wait_for_device_model(gc, starting->domid, "running",
-                                               starting, NULL, NULL);
+                                               starting->for_spawn, NULL, NULL);
     detach = detach_device_model(gc, starting);
     return problem ? problem : detach;
 }
diff -r 21db7a7dd184 -r eb9330c89fd3 tools/libxl/libxl_internal.h
--- a/tools/libxl/libxl_internal.h	Tue Sep 20 16:50:44 2011 +0100
+++ b/tools/libxl/libxl_internal.h	Wed Sep 21 09:55:12 2011 +0100
@@ -288,7 +288,7 @@ _hidden int libxl__confirm_device_model_
 _hidden int libxl__detach_device_model(libxl__gc *gc, libxl__device_model_starting *starting);
 _hidden int libxl__wait_for_device_model(libxl__gc *gc,
                                 uint32_t domid, char *state,
-                                libxl__device_model_starting *starting,
+                                libxl__spawn_starting *spawning,
                                 int (*check_callback)(libxl__gc *gc,
                                                       uint32_t domid,
                                                       const char *state,
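
For readers who want to try the pattern outside libxl, the following is
a rough, self-contained sketch of a select() wait in which the second
descriptor is optional and passed directly, as in the patch above. The
names wait_readable and struct spawn_starting are assumptions made for
illustration, and the single-shot wait is a simplification of the real
loop.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical stand-in for the spawn state; only the fd is modelled. */
struct spawn_starting { int fd; };

/* Wait up to timeout_sec for xs_fd or the optional spawn fd to become
 * readable. spawning may be NULL, in which case only xs_fd is watched.
 * Returns select()'s result and reports via *spawn_ready whether the
 * spawn fd was the one that fired. */
static int wait_readable(int xs_fd, const struct spawn_starting *spawning,
                         int timeout_sec, int *spawn_ready)
{
    fd_set rfds;
    struct timeval tv = { .tv_sec = timeout_sec, .tv_usec = 0 };
    int nfds = xs_fd + 1;
    int rc;

    assert(xs_fd >= 0 && xs_fd < FD_SETSIZE);

    /* nfds must exceed the highest descriptor in the set. */
    if (spawning && spawning->fd > xs_fd)
        nfds = spawning->fd + 1;

    FD_ZERO(&rfds);
    FD_SET(xs_fd, &rfds);
    if (spawning)
        FD_SET(spawning->fd, &rfds);

    rc = select(nfds, &rfds, NULL, NULL, &tv);
    *spawn_ready = rc > 0 && spawning && FD_ISSET(spawning->fd, &rfds);
    return rc;
}
```

Taking the spawn pointer itself means a single NULL test covers both
"not on the starting path" and "stubdom, nothing was spawned".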


* Re: xl create crash when using stub domains
  2011-09-21  8:56   ` Ian Campbell
@ 2011-09-21 23:06     ` Jeremy Fitzhardinge
  2011-09-22  6:23       ` Ian Campbell
  2011-09-27 17:04     ` Ian Jackson
  1 sibling, 1 reply; 6+ messages in thread
From: Jeremy Fitzhardinge @ 2011-09-21 23:06 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, Ian Jackson, Stefano Stabellini

On 09/21/2011 01:56 AM, Ian Campbell wrote:
> On Wed, 2011-09-21 at 02:34 +0100, Jeremy Fitzhardinge wrote:
>> On 09/15/2011 05:33 PM, Jeremy Fitzhardinge wrote:
>>> When I create an HVM domain with stubdom enabled, it crashes at:
>>>
>>> (gdb) run create  -c /etc/xen/f14hv64  vcpus=4 xen_platform_pci=0 'boot="d"'
>>> Starting program: /usr/sbin/xl create  -c /etc/xen/f14hv64  vcpus=4 xen_platform_pci=0 'boot="d"'
>>> [Thread debugging using libthread_db enabled]
>>> Parsing config file /etc/xen/f14hv64
>>> xc: info: VIRTUAL MEMORY ARRANGEMENT:
>>>   Loader:        0000000000100000->000000000017b9ec
>>>   TOTAL:         0000000000000000->000000003f800000
>>>   ENTRY ADDRESS: 0000000000100000
>>> xc: info: PHYSICAL MEMORY ALLOCATION:
>>>   4KB PAGES: 0x0000000000000200
>>>   2MB PAGES: 0x00000000000001fb
>>>   1GB PAGES: 0x0000000000000000
>>> xc: error: panic: xc_dom_bzimageloader.c:588: xc_dom_probe_bzimage_kernel: kernel is not a bzImage: Invalid kernel
> FWIW I don't get this message. It seems unrelated to the issue here but
> makes me curious...

It's generated by xc_dom_probe_bzimage_kernel() when starting a PV
domain with pvgrub or an HVM domain with stubdoms, AFAICT.  Do you not
see it, or just not do those things?  Seems to me that a "probe"
function shouldn't be making obnoxious noise.
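
As a sketch of what a quiet probe could look like (the names and loader
table below are made up for illustration and are not the libxc API):
each probe only reports whether it recognises the image, and the caller
logs once if nothing matched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef int (*probe_fn)(const unsigned char *img, size_t len);

/* A Linux bzImage carries the "HdrS" magic at offset 0x202. */
static int probe_bzimage(const unsigned char *img, size_t len)
{
    return len >= 0x206 && memcmp(img + 0x202, "HdrS", 4) == 0;
}

/* An ELF image starts with 0x7f 'E' 'L' 'F'. */
static int probe_elf(const unsigned char *img, size_t len)
{
    return len >= 4 && memcmp(img, "\x7f" "ELF", 4) == 0;
}

/* Hypothetical loader table, tried in order. */
struct loader { const char *name; probe_fn probe; };

static const struct loader loaders[] = {
    { "bzImage", probe_bzimage },
    { "ELF",     probe_elf },
};

/* Return the name of the first loader whose probe matches, or NULL.
 * Individual probes stay silent; only a total failure is reported. */
static const char *find_loader(const unsigned char *img, size_t len)
{
    assert(img != NULL);
    for (size_t i = 0; i < sizeof(loaders) / sizeof(loaders[0]); i++)
        if (loaders[i].probe(img, len))
            return loaders[i].name;
    fprintf(stderr, "no loader recognises this kernel image\n");
    return NULL;
}
```

With this shape, an image that fails the bzImage probe but matches a
later loader produces no error output at all.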

    J


* Re: xl create crash when using stub domains
  2011-09-21 23:06     ` Jeremy Fitzhardinge
@ 2011-09-22  6:23       ` Ian Campbell
  0 siblings, 0 replies; 6+ messages in thread
From: Ian Campbell @ 2011-09-22  6:23 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: xen-devel, Ian Jackson, Stefano Stabellini

On Thu, 2011-09-22 at 00:06 +0100, Jeremy Fitzhardinge wrote:
> On 09/21/2011 01:56 AM, Ian Campbell wrote:
> > On Wed, 2011-09-21 at 02:34 +0100, Jeremy Fitzhardinge wrote:
> >> On 09/15/2011 05:33 PM, Jeremy Fitzhardinge wrote:
> >>> When I create an HVM domain with stubdom enabled, it crashes at:
> >>>
> >>> (gdb) run create  -c /etc/xen/f14hv64  vcpus=4 xen_platform_pci=0 'boot="d"'
> >>> Starting program: /usr/sbin/xl create  -c /etc/xen/f14hv64  vcpus=4 xen_platform_pci=0 'boot="d"'
> >>> [Thread debugging using libthread_db enabled]
> >>> Parsing config file /etc/xen/f14hv64
> >>> xc: info: VIRTUAL MEMORY ARRANGEMENT:
> >>>   Loader:        0000000000100000->000000000017b9ec
> >>>   TOTAL:         0000000000000000->000000003f800000
> >>>   ENTRY ADDRESS: 0000000000100000
> >>> xc: info: PHYSICAL MEMORY ALLOCATION:
> >>>   4KB PAGES: 0x0000000000000200
> >>>   2MB PAGES: 0x00000000000001fb
> >>>   1GB PAGES: 0x0000000000000000
> >>> xc: error: panic: xc_dom_bzimageloader.c:588: xc_dom_probe_bzimage_kernel: kernel is not a bzImage: Invalid kernel
> > FWIW I don't get this message. It seems unrelated to the issue here but
> > makes me curious...
> 
> It's generated by xc_dom_probe_bzimage_kernel() when starting a PV
> domain with pvgrub or an HVM domain with stubdoms, AFAICT.  Do you not
> see it, or just not do those things?

I didn't think I'd seen it (booting w/ a stubdom) but looking at the
code it must have been in there somewhere.

>   Seems to me that a "probe"
> function shouldn't be making obnoxious noise.

Full ACK.

Ian.


* Re: xl create crash when using stub domains
  2011-09-21  8:56   ` Ian Campbell
  2011-09-21 23:06     ` Jeremy Fitzhardinge
@ 2011-09-27 17:04     ` Ian Jackson
  1 sibling, 0 replies; 6+ messages in thread
From: Ian Jackson @ 2011-09-27 17:04 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Jeremy Fitzhardinge, xen-devel, Stefano Stabellini

Ian Campbell writes ("Re: [Xen-devel] xl create crash when using stub domains"):
> Hmm, actually this function never uses starting except to get at
> for_spawn, so perhaps we should just pass in the for_spawn directly.
> Patch to that effect follows.
...
> libxl: make libxl__wait_for_device_model use libxl__spawn_starting directly

Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>

The original reason for the for_spawn was that all this create code
used to be outside libxl where it shouldn't be looking into
libxl's private data structures.  That reason no longer applies.

Ian.

