* 2.6.7-xen0 hang on boot
@ 2004-08-11 19:36 Rik van Riel
  2004-08-11 20:26 ` Keir Fraser
  2004-08-11 20:34 ` Ian Pratt
  0 siblings, 2 replies; 19+ messages in thread
From: Rik van Riel @ 2004-08-11 19:36 UTC (permalink / raw)
  To: xen-devel

I'm trying to run today's bitkeeper snapshot with kernel
2.6.7 in dom0.  On bootup time it gets past rc.sysinit
just fine, but after that init fails to switch to runlevel 3.

Compiling the xen0 kernel with magic sysrq report (should
that be on by default in defconfig?) shows me that init is
running every time I press alt-sysrq-p, and I have caught a
number of call traces, most involving upcalls but one real
gem that looks relevant:

page_fault -> ... -> do_page_fault -> expand_stack

Other common ones are:

page_fault  (all by itself)

work_notifysig -> do_notify_resume -> evtchn_do_upcall -> do_IRQ -> 
do_softirq -> do_signal -> get_signal_to_deliver -> force_evtchn_callback

page_fault -> evtchn_do_upcall -> do_IRQ -> do_softirq -> __do_softirq -> 
do_page_fault -> force_sig_info -> force_evtchn_callback

page_fault -> evtchn_do_upcall -> do_IRQ -> do_softirq -> do_signal -> 
do_page_fault

Sometimes init is running with the EIP at 0x91fef3, which I
suspect is just its own executable or libc.

This is trying to boot Fedora Core 3 / rawhide userland.

Things work when I start init=/bin/bash and the normal bootup
sequence makes it all the way to the end of rc.sysinit.

Things only start failing when init tries to change runlevels.

I am wondering if there's something special about the way Xen
has the virtual memory layout, something that could interfere
with init changing its own commandline or environment...

kind regards,

Rik
-- 
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan



-------------------------------------------------------
SF.Net email is sponsored by Shop4tech.com-Lowest price on Blank Media
100pk Sonic DVD-R 4x for only $29 -100pk Sonic DVD+R for only $33
Save 50% off Retail on Ink & Toner - Free Shipping and Free Gift.
http://www.shop4tech.com/z/Inkjet_Cartridges/9_108_r285


* Re: 2.6.7-xen0 hang on boot
  2004-08-11 19:36 2.6.7-xen0 hang on boot Rik van Riel
@ 2004-08-11 20:26 ` Keir Fraser
  2004-08-11 20:34   ` Rik van Riel
  2004-08-11 20:34 ` Ian Pratt
  1 sibling, 1 reply; 19+ messages in thread
From: Keir Fraser @ 2004-08-11 20:26 UTC (permalink / raw)
  To: Rik van Riel; +Cc: xen-devel


Very weird. We do some odd things to patch up negative-offset segment
offsets, so perhaps trying with 'nosegfixup' added to Linux cmdline
would be worthwhile. Apart from that, I'll download FC3 myself -- is
it Fedora Core 3 Test 1 that you are using?

 Cheers,
 Keir

> I'm trying to run today's bitkeeper snapshot with kernel
> 2.6.7 in dom0.  On bootup time it gets past rc.sysinit
> just fine, but after that init fails to switch to runlevel 3.
> 
> Compiling the xen0 kernel with magic sysrq report (should
> that be on by default in defconfig?) shows me that init is
> running every time I press alt-sysrq-p, and I have caught a
> number of call traces, most involving upcalls but one real
> gem that looks relevant:
> 
> page_fault -> ... -> do_page_fault -> expand_stack
> 
> Other common ones are:
> 
> page_fault  (all by itself)
> 
> work_notifysig -> do_notify_resume -> evtchn_do_upcall -> do_IRQ -> 
> do_softirq -> do_signal -> get_signal_to_deliver -> force_evtchn_callback
> 
> page_fault -> evtchn_do_upcall -> do_IRQ -> do_softirq -> __do_softirq -> 
> do_page_fault -> force_sig_info -> force_evtchn_callback
> 
> page_fault -> evtchn_do_upcall -> do_IRQ -> do_softirq -> do_signal -> 
> do_page_fault
> 
> Sometimes init is running with the EIP at 0x91fef3, which I
> suspect is just its own executable or libc.
> 
> This is trying to boot Fedora Core 3 / rawhide userland.
> 
> Things work when I start init=/bin/bash and the normal bootup
> sequence makes it all the way to the end of rc.sysinit.
> 
> Things only start failing when init tries to change runlevels.
> 
> I am wondering if there's something special about the way Xen
> has the virtual memory layout, something that could interfere
> with init changing its own commandline or environment...
> 
> kind regards,
> 
> Rik
> -- 
> "Debugging is twice as hard as writing the code in the first place.
> Therefore, if you write the code as cleverly as possible, you are,
> by definition, not smart enough to debug it." - Brian W. Kernighan
> 
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/xen-devel




* Re: 2.6.7-xen0 hang on boot
  2004-08-11 19:36 2.6.7-xen0 hang on boot Rik van Riel
  2004-08-11 20:26 ` Keir Fraser
@ 2004-08-11 20:34 ` Ian Pratt
  2004-08-12  2:31   ` Rik van Riel
  1 sibling, 1 reply; 19+ messages in thread
From: Ian Pratt @ 2004-08-11 20:34 UTC (permalink / raw)
  To: Rik van Riel; +Cc: xen-devel, Ian.Pratt

> I'm trying to run today's bitkeeper snapshot with kernel
> 2.6.7 in dom0.  On bootup time it gets past rc.sysinit
> just fine, but after that init fails to switch to runlevel 3.

Do you know what the last snapshot that worked for you with this
file system was?

We've never tried FC3 before (Is it available to download?)

FC2 proved to be a huge pain because of the use of PIC code in
the tls libraries that required a heroic effort by Keir to
develop a new binary rewriter to work around the -ve segment
offset issue (which he checked in earlier today).

Please can you try booting with 'nosegfixup' on the Linux command
line, or just move /lib/tls out of the way.

> Compiling the xen0 kernel with magic sysrq report (should
> that be on by default in defconfig?) shows me that init is
> running every time I press alt-sysrq-p, and I have caught a
> number of call traces, most involving upcalls but one real
> gem that looks relevant:

[I'll add sysrq support to the defconfig.]

> Things work when I start init=/bin/bash and the normal bootup
> sequence makes it all the way to the end of rc.sysinit.

Phew -- it doesn't sound like the new fixup code is too broken then.
 
> Things only start failing when init tries to change runlevels.
> 
> I am wondering if there's something special about the way Xen
> has the virtual memory layout, something that could interfere
> with init changing its own commandline or environment...

Hmm. The new tls fixup code grabs a page starting at 4KB in the
VA space...


Ian



* Re: 2.6.7-xen0 hang on boot
  2004-08-11 20:26 ` Keir Fraser
@ 2004-08-11 20:34   ` Rik van Riel
  2004-08-11 20:46     ` Keir Fraser
  2004-08-12 16:46     ` Keir Fraser
  0 siblings, 2 replies; 19+ messages in thread
From: Rik van Riel @ 2004-08-11 20:34 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel

On Wed, 11 Aug 2004, Keir Fraser wrote:

> Very weird. We do some odd things to patch up negative-offset segment
> offsets, so perhaps trying with 'nosegfixup' added to Linux cmdline
> would be worthwhile.

OK, 'nosegfixup' seems to fix things, the system is booting
now ...

> Apart from that, I'll download FC3 myself -- is
> it Fedora Core 3 Test 1 that you are using?

Close enough, I installed rawhide a week or so after
test 1 was released.  I suspect FC3 test 1 should show
the problem.

-- 
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan




* Re: 2.6.7-xen0 hang on boot
  2004-08-11 20:34   ` Rik van Riel
@ 2004-08-11 20:46     ` Keir Fraser
  2004-08-11 20:50       ` Rik van Riel
  2004-08-12 16:46     ` Keir Fraser
  1 sibling, 1 reply; 19+ messages in thread
From: Keir Fraser @ 2004-08-11 20:46 UTC (permalink / raw)
  To: Rik van Riel; +Cc: Keir Fraser, xen-devel

> On Wed, 11 Aug 2004, Keir Fraser wrote:
> 
> > Very weird. We do some odd things to patch up negative-offset segment
> > offsets, so perhaps trying with 'nosegfixup' added to Linux cmdline
> > would be worthwhile.
> 
> OK, 'nosegfixup' seems to fix things, the system is booting
> now ...

Well, good news in that I know which piece of code is causing the
problem. Bad news in that it obviously isn't robust enough yet. :-(
I've only just finished fixing it for FC2!

> > Apart from that, I'll download FC3 myself -- is
> > it Fedora Core 3 Test 1 that you are using?
> 
> Close enough, I installed rawhide a week or so after
> test 1 was released.  I suspect FC3 test 1 should show
> the problem.

I shall give FC3T1 a go in the first instance. I see no ISOs for
rawhide so I guess I would have to download a bootstrap ISO and do a 
network install?

  -- Keir



* Re: 2.6.7-xen0 hang on boot
  2004-08-11 20:46     ` Keir Fraser
@ 2004-08-11 20:50       ` Rik van Riel
  0 siblings, 0 replies; 19+ messages in thread
From: Rik van Riel @ 2004-08-11 20:50 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel

On Wed, 11 Aug 2004, Keir Fraser wrote:

> I shall give FC3T1 a go in the first instance. I see no ISOs for
> rawhide so I guess I would have to download a bootstrap ISO and do a 
> network install?

Either that, or do a yum upgrade from the FC2 you've
already got ;)

-- 
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan




* Re: 2.6.7-xen0 hang on boot
  2004-08-11 20:34 ` Ian Pratt
@ 2004-08-12  2:31   ` Rik van Riel
  2004-08-12  2:35     ` Rik van Riel
  0 siblings, 1 reply; 19+ messages in thread
From: Rik van Riel @ 2004-08-12  2:31 UTC (permalink / raw)
  To: Ian Pratt; +Cc: xen-devel, Ulrich Drepper, Ingo Molnar

On Wed, 11 Aug 2004, Ian Pratt wrote:

> Please can you try booting with 'nosegfixup' on the Linux command
> line, or just move /lib/tls out of the way.

OK, moving /lib/tls out of the way seems to fix the issue, too.

It appears there is some interaction between TLS and the segment
fixup code in Xen...

-- 
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan




* Re: 2.6.7-xen0 hang on boot
  2004-08-12  2:31   ` Rik van Riel
@ 2004-08-12  2:35     ` Rik van Riel
  2004-08-12  2:43       ` Rik van Riel
  0 siblings, 1 reply; 19+ messages in thread
From: Rik van Riel @ 2004-08-12  2:35 UTC (permalink / raw)
  To: Ian Pratt; +Cc: xen-devel, Ulrich Drepper, Ingo Molnar

On Wed, 11 Aug 2004, Rik van Riel wrote:
> On Wed, 11 Aug 2004, Ian Pratt wrote:
> 
> > Please can you try booting with 'nosegfixup' on the Linux command
> > line, or just move /lib/tls out of the way.
> 
> OK, moving /lib/tls out of the way seems to fix the issue, too.
> 
> It appears there is some interaction between TLS and the segment
> fixup code in Xen...

But oddly enough bash and ls don't seem to suffer, while
init and initlog do.

Putting /lib/tls back after booting xen doesn't seem to
impair my ability to use the system.  Stuff still works.

-- 
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan




* Re: 2.6.7-xen0 hang on boot
  2004-08-12  2:35     ` Rik van Riel
@ 2004-08-12  2:43       ` Rik van Riel
  2004-08-12  3:07         ` Rik van Riel
  0 siblings, 1 reply; 19+ messages in thread
From: Rik van Riel @ 2004-08-12  2:43 UTC (permalink / raw)
  To: Ian Pratt; +Cc: xen-devel, Ulrich Drepper, Ingo Molnar

On Wed, 11 Aug 2004, Rik van Riel wrote:

> Putting /lib/tls back after booting xen doesn't seem to
> impair my ability to use the system.  Stuff still works.

OK, I managed to re-exec init (telinit U) after putting
TLS back, so far so good.

After changing runlevels, I got init to go into a tight
loop, using 100% of the CPU - all in userland though.
It doesn't seem to be able to recover from this and has
stopped reaping zombies.  Not sure how to debug things
now, except maybe by adding an ugly hack to the kernel
to allow me to debug init (might not be the best idea).

At least I'm having fun again ;)

-- 
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan




* Re: 2.6.7-xen0 hang on boot
  2004-08-12  2:43       ` Rik van Riel
@ 2004-08-12  3:07         ` Rik van Riel
  2004-08-12  4:18           ` Rik van Riel
  0 siblings, 1 reply; 19+ messages in thread
From: Rik van Riel @ 2004-08-12  3:07 UTC (permalink / raw)
  To: Ian Pratt; +Cc: xen-devel, Ulrich Drepper, Ingo Molnar

On Wed, 11 Aug 2004, Rik van Riel wrote:

> stopped reaping zombies.  Not sure how to debug things
> now, except maybe by adding an ugly hack to the kernel
> to allow me to debug init (might not be the best idea).

Of course, there are other executables showing problems.
An easy one is /sbin/initlog.

When run the way the initscripts run it, it segfaults, and
gdb gives the following (infinite?) backtrace, consisting
of frames like these:

#8696 0x00920546 in vsyslog () from /lib/tls/libc.so.6
#8697 0x0091fecf in syslog () from /lib/tls/libc.so.6
#8698 0x00920546 in vsyslog () from /lib/tls/libc.so.6
#8699 0x0091fecf in syslog () from /lib/tls/libc.so.6
#8700 0x00920546 in vsyslog () from /lib/tls/libc.so.6

Here are the last lines strace gives me:

stat64("/dev/log", {st_mode=S_IFSOCK|0666, st_size=0, ...}) = 0
socket(PF_FILE, SOCK_DGRAM, 0)          = 3
connect(3, {sa_family=AF_FILE, path="/dev/log"}, 110) = 0
close(3)                                = 0
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++
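
The alternating frames in the gdb backtrace above look like syslog() and vsyslog() calling each other without end; whether that is real recursion or an unwinding artifact is not established here. As a rough illustration only, the Python sketch below parses frames in that gdb layout and reports the smallest repeating period of function names. The helper names `frame_functions` and `smallest_period` are hypothetical, and the regular expression assumes the simple `#N ADDR in func () from lib` frame format shown above.

```python
import re

# Matches gdb frames of the form: "#8696 0x00920546 in vsyslog () from /lib/..."
FRAME_RE = re.compile(r"^#(\d+)\s+(0x[0-9a-fA-F]+) in (\w+) \(\)")

BACKTRACE = """\
#8696 0x00920546 in vsyslog () from /lib/tls/libc.so.6
#8697 0x0091fecf in syslog () from /lib/tls/libc.so.6
#8698 0x00920546 in vsyslog () from /lib/tls/libc.so.6
#8699 0x0091fecf in syslog () from /lib/tls/libc.so.6
#8700 0x00920546 in vsyslog () from /lib/tls/libc.so.6
"""

def frame_functions(backtrace):
    """Return the function name of every frame line that matches FRAME_RE."""
    names = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            names.append(match.group(3))
    return names

def smallest_period(names):
    """Return the smallest p such that names[i] == names[i + p] for all i, or None."""
    n = len(names)
    for p in range(1, n):
        if all(names[i] == names[i + p] for i in range(n - p)):
            return p
    return None

names = frame_functions(BACKTRACE)
print(names)
print("repeating period:", smallest_period(names))
```

A period of 2 over thousands of frames is consistent with two functions re-entering each other until the stack is exhausted, which would fit the SIGSEGV at the end of the strace output. On a sample this short the period check is only suggestive.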


-- 
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan




* Re: 2.6.7-xen0 hang on boot
  2004-08-12  3:07         ` Rik van Riel
@ 2004-08-12  4:18           ` Rik van Riel
  0 siblings, 0 replies; 19+ messages in thread
From: Rik van Riel @ 2004-08-12  4:18 UTC (permalink / raw)
  To: Ian Pratt; +Cc: xen-devel, Ulrich Drepper, Ingo Molnar

On Wed, 11 Aug 2004, Rik van Riel wrote:

> #8697 0x0091fecf in syslog () from /lib/tls/libc.so.6

Init's userspace EIP is 0x0091fef3, which I assume is
also inside syslog(), since /lib/tls/libc-2.3.3.so is
mapped at the same address in each program (prelink).

I can't attach a debugger to it, but alt-sysrq-p works.

Somehow the majority of the time appears to be spent
in kernel space, either page_fault and friends (especially
expand_stack is interesting) or Xen hypervisor calls
(work_notifysig -> do_notify_resume -> evtchn_do_upcall ->
 do_signal -> get_signal_to_deliver -> force_evtchn_callback).

I guess this is explained by line 1800 of signal.c:

                /* Init gets no signals it doesn't want.  */
                if (current->pid == 1)
                        continue;

My guess is that init should also segfault, but the kernel
doesn't deliver the signal and dives straight back into the
pagefault handler, which doesn't know how to handle the fault,
etc...
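
The guess above can be put as a toy model. The Python sketch below is not kernel code: `run_faulting_task` and `max_retries` are made-up names, and the retry cap exists only so that the example terminates. It assumes the faulting instruction is restarted whenever the pending SIGSEGV is not delivered, which is what the quoted `pid == 1` check would imply for init.

```python
def run_faulting_task(pid, max_retries=5):
    """Toy model of a task stuck on an access the fault handler cannot fix.

    Each iteration stands for one page fault followed by a return to
    userspace. The handler queues SIGSEGV every time; the delivery step
    skips it for pid 1, mirroring the check quoted from signal.c.
    """
    faults = 0
    while faults < max_retries:
        faults += 1          # the instruction faults again
        if pid == 1:
            continue         # init: signal not delivered, so the fault is retried
        return ("killed by SIGSEGV", faults)
    return ("still looping", faults)

print(run_faulting_task(pid=2))   # an ordinary process such as initlog
print(run_faulting_task(pid=1))   # init
```

In this model an ordinary process dies on its first fault, as initlog does, while pid 1 keeps faulting until the artificial cap is reached, matching the repeated page_fault traces seen with alt-sysrq-p.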

The "all time spent in userspace" I reported before is an
artifact of Xen not reporting system time.  It really is
spending its time in kernel space.

-- 
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan




* Re: 2.6.7-xen0 hang on boot
  2004-08-11 20:34   ` Rik van Riel
  2004-08-11 20:46     ` Keir Fraser
@ 2004-08-12 16:46     ` Keir Fraser
  2004-08-12 19:01       ` Rik van Riel
  1 sibling, 1 reply; 19+ messages in thread
From: Keir Fraser @ 2004-08-12 16:46 UTC (permalink / raw)
  To: Rik van Riel; +Cc: Keir Fraser, xen-devel

> On Wed, 11 Aug 2004, Keir Fraser wrote:
> 
> > Very weird. We do some odd things to patch up negative-offset segment
> > offsets, so perhaps trying with 'nosegfixup' added to Linux cmdline
> > would be worthwhile.
> 
> OK, 'nosegfixup' seems to fix things, the system is booting
> now ...
> 
> > Apart from that, I'll download FC3 myself -- is
> > it Fedora Core 3 Test 1 that you are using?
> 
> Close enough, I installed rawhide a week or so after
> test 1 was released.  I suspect FC3 test 1 should show
> the problem.

I installed FC3T1 and saw no problems. I then installed latest glibc
and SysVinit packages:
 glibc-2.3.3-42, SysVinit-2.85-31

...and still see no problems! I guess your problems must emanate from
one of those two -- perhaps you have different versions installed?

If you can point me at RPMs containing your failing versions, or perhaps
tar up your /lib/tls directory for me, then I'll give them a spin!

 -- Keir



* Re: 2.6.7-xen0 hang on boot
  2004-08-12 16:46     ` Keir Fraser
@ 2004-08-12 19:01       ` Rik van Riel
  2004-08-12 19:56         ` Rik van Riel
  0 siblings, 1 reply; 19+ messages in thread
From: Rik van Riel @ 2004-08-12 19:01 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel, Ulrich Drepper

On Thu, 12 Aug 2004, Keir Fraser wrote:

> I installed FC3T1 and saw no problems. I then installed latest glibc
> and SysVinit packages:
>  glibc-2.3.3-42, SysVinit-2.85-31
> 
> ...and still see no problems! I guess your problems must emanate from
> one of those two -- perhaps you have different versions installed?

I've got glibc-2.3.3-31 here.   Let me try with -42...

> If you can point me at RPMs containing your failing versions, or perhaps
> tar up your /lib/tls directory for me, then I'll give them a spin!

If the problem turned out to be something that can be easily
avoided in glibc, I'd rather the Xen source code doesn't grow
an extra arm and a leg to deal with it ;)

-- 
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan




* Re: 2.6.7-xen0 hang on boot
  2004-08-12 19:01       ` Rik van Riel
@ 2004-08-12 19:56         ` Rik van Riel
  2004-08-12 20:38           ` Ian Pratt
  2004-08-12 20:45           ` Keir Fraser
  0 siblings, 2 replies; 19+ messages in thread
From: Rik van Riel @ 2004-08-12 19:56 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel, Ulrich Drepper

On Thu, 12 Aug 2004, Rik van Riel wrote:

> I've got glibc-2.3.3-31 here.   Let me try with -42...

Now it hangs way earlier, before init even gets
around to starting rc.sysinit.  This is on a
Pentium-M core...

Back to running with 'nosegfixup', so I can try
to build Fedora RPMs of Xen ;)

-- 
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan




* Re: 2.6.7-xen0 hang on boot
  2004-08-12 19:56         ` Rik van Riel
@ 2004-08-12 20:38           ` Ian Pratt
  2004-08-12 21:27             ` Rik van Riel
  2004-08-12 21:42             ` Rik van Riel
  2004-08-12 20:45           ` Keir Fraser
  1 sibling, 2 replies; 19+ messages in thread
From: Ian Pratt @ 2004-08-12 20:38 UTC (permalink / raw)
  To: Rik van Riel; +Cc: Keir Fraser, xen-devel, Ulrich Drepper, Ian.Pratt

> On Thu, 12 Aug 2004, Rik van Riel wrote:
> 
> > I've got glibc-2.3.3-31 here.   Let me try with -42...
> 
> Now it hangs way earlier, before init even gets
> around to starting rc.sysinit.  This is on a
> Pentium-M core...

We've been running mostly on 18-month-old 2.4GHz Xeons. It seems
unlikely that there would be a hwcap difference that would cause
either ldconfig or the library's behaviour to differ from on a
Pentium-M.

Here's the flags from our system:
fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm

What happens if you boot with /lib/tls moved out the way then put
it back (and run ldconfig) after the system is up? Does it
seem stable?

This might isolate whether it's just an /sbin/init issue.

Thanks,
Ian




* Re: 2.6.7-xen0 hang on boot
  2004-08-12 19:56         ` Rik van Riel
  2004-08-12 20:38           ` Ian Pratt
@ 2004-08-12 20:45           ` Keir Fraser
  1 sibling, 0 replies; 19+ messages in thread
From: Keir Fraser @ 2004-08-12 20:45 UTC (permalink / raw)
  To: Rik van Riel; +Cc: xen-devel


> > I've got glibc-2.3.3-31 here.   Let me try with -42...
> 
> Now it hangs way earlier, before init even gets
> around to starting rc.sysinit.  This is on a
> Pentium-M core...
> 
> Back to running with 'nosegfixup', so I can try
> to build Fedora RPMs of Xen ;)

May I ask what is the latest changeset in your cloned Xen repository? 
Now I think about it, I only checked in final fixes for Fedora just
around the time that you would have experienced trouble and sent the
first email.

Have you updated to today's changesets?

I hate bugs that I can't reproduce. :-)

 Cheers,
 Keir



* Re: 2.6.7-xen0 hang on boot
  2004-08-12 20:38           ` Ian Pratt
@ 2004-08-12 21:27             ` Rik van Riel
  2004-08-12 21:52               ` Ian Pratt
  2004-08-12 21:42             ` Rik van Riel
  1 sibling, 1 reply; 19+ messages in thread
From: Rik van Riel @ 2004-08-12 21:27 UTC (permalink / raw)
  To: Ian Pratt; +Cc: Keir Fraser, xen-devel, Ulrich Drepper

On Thu, 12 Aug 2004, Ian Pratt wrote:

> What happens if you boot with /lib/tls moved out the way then put
> it back (and run ldconfig) after the system is up? Does it
> seem stable?

The system doesn't seem very stable after I put /lib/tls
back.  Every application seems to segfault ;)

# mv tls-nono/ tls
# grep flags /proc/cpuinfo
Segmentation fault
# mv tls tls-nono
Segmentation fault
$ su -
Password:
Segmentation fault

Time to reboot with 'nosegfixup' ;)

-- 
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan




* Re: 2.6.7-xen0 hang on boot
  2004-08-12 20:38           ` Ian Pratt
  2004-08-12 21:27             ` Rik van Riel
@ 2004-08-12 21:42             ` Rik van Riel
  1 sibling, 0 replies; 19+ messages in thread
From: Rik van Riel @ 2004-08-12 21:42 UTC (permalink / raw)
  To: Ian Pratt; +Cc: Keir Fraser, xen-devel, Ulrich Drepper

On Thu, 12 Aug 2004, Ian Pratt wrote:

> Here's the flags from our system:

> fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36
> clflush dts acpi mmx fxsr sse sse2 ss ht tm

fpu vme de pse tsc msr mce cx8 apic sep mtrr pge mca cmov pat clflush dts 
acpi mmx fxsr sse sse2 tm pbe tm2 est

So it appears my CPU doesn't have:	 pae, pse36, ss and ht.

Your CPU doesn't have:			 pbe, tm2 and est.

Not sure if they make any big difference, though ...
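
A comparison like this is easy to get slightly wrong by eye, so here is a small Python sketch that computes it with set differences. The two flag strings are copied from the messages in this thread; `flag_diff` is a hypothetical helper name. Computed this way, the Xeon-only side also includes `ht`.

```python
# Flags as posted in this thread (Ian's Xeon, Rik's Pentium-M).
XEON_FLAGS = (
    "fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 "
    "clflush dts acpi mmx fxsr sse sse2 ss ht tm"
)
PENTIUM_M_FLAGS = (
    "fpu vme de pse tsc msr mce cx8 apic sep mtrr pge mca cmov pat clflush dts "
    "acpi mmx fxsr sse sse2 tm pbe tm2 est"
)

def flag_diff(ours, theirs):
    """Return (flags only in ours, flags only in theirs), each sorted."""
    a, b = set(ours.split()), set(theirs.split())
    return sorted(a - b), sorted(b - a)

only_xeon, only_pentium_m = flag_diff(XEON_FLAGS, PENTIUM_M_FLAGS)
print("only on the Xeon:     ", " ".join(only_xeon))
print("only on the Pentium-M:", " ".join(only_pentium_m))
```

On a live system the same strings could be taken from the `flags` line of /proc/cpuinfo on each machine.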

-- 
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan




* Re: 2.6.7-xen0 hang on boot
  2004-08-12 21:27             ` Rik van Riel
@ 2004-08-12 21:52               ` Ian Pratt
  0 siblings, 0 replies; 19+ messages in thread
From: Ian Pratt @ 2004-08-12 21:52 UTC (permalink / raw)
  To: Rik van Riel; +Cc: Ian Pratt, Keir Fraser, xen-devel, Ulrich Drepper

> On Thu, 12 Aug 2004, Ian Pratt wrote:
> 
> > What happens if you boot with /lib/tls moved out the way then put
> > it back (and run ldconfig) after the system is up? Does it
> > seem stable?
> 
> The system doesn't seem very stable after I put /lib/tls
> back.  Every application seems to segfault ;)

Oh well, I think we can declare /sbin/init innocent...

I'll try and get hold of a Pentium-M to test FC3T1 on, just in
case there's some weird hwcap thing. Beyond that, I guess we'd
need a tar ball of your whole file system.

It's definitely worth checking that you're fully up to date with
the BK repository.

> Time to reboot with 'nosegfixup' ;)

Running purely in -ve segment offset emulation-only mode is
pretty slow: you might want to just move /lib/tls out of the way
for the moment.  (When operational, the binary-rewriting fixup
code makes the overhead negligible)


Ian




Thread overview: 19+ messages
2004-08-11 19:36 2.6.7-xen0 hang on boot Rik van Riel
2004-08-11 20:26 ` Keir Fraser
2004-08-11 20:34   ` Rik van Riel
2004-08-11 20:46     ` Keir Fraser
2004-08-11 20:50       ` Rik van Riel
2004-08-12 16:46     ` Keir Fraser
2004-08-12 19:01       ` Rik van Riel
2004-08-12 19:56         ` Rik van Riel
2004-08-12 20:38           ` Ian Pratt
2004-08-12 21:27             ` Rik van Riel
2004-08-12 21:52               ` Ian Pratt
2004-08-12 21:42             ` Rik van Riel
2004-08-12 20:45           ` Keir Fraser
2004-08-11 20:34 ` Ian Pratt
2004-08-12  2:31   ` Rik van Riel
2004-08-12  2:35     ` Rik van Riel
2004-08-12  2:43       ` Rik van Riel
2004-08-12  3:07         ` Rik van Riel
2004-08-12  4:18           ` Rik van Riel
