linux-kernel.vger.kernel.org archive mirror
* random hangs during boot in 3.0-rc
@ 2011-06-07  3:46 Dave Jones
  2011-06-07  5:45 ` Mike Galbraith
  2011-06-07 11:01 ` Ingo Molnar
From: Dave Jones @ 2011-06-07  3:46 UTC
  To: Linux Kernel; +Cc: Ingo Molnar, Linus Torvalds

I have two machines that occasionally (like 1 in 10 boots or so) hang solid
during boot-up.  Happens in different places, but usually either when loading
the microcode driver, or while doing a fsck.

I did a bisect which took a *long* time, since I booted each kernel 10 times
before pronouncing it 'good'. Once it fingered the bad commit, I started over,
and arrived at the same conclusion a second time.

But the actual commit is a merge commit.  What now ?

commit 42ac9e87fdd89b77fa2ca0a5226023c1c2d83226
Merge: 057f3fa f0e615c
Author: Ingo Molnar <mingo@elte.hu>
Date:   Thu Apr 21 11:39:21 2011 +0200

    Merge commit 'v2.6.39-rc4' into sched/core
    
    Merge reason: Pick up upstream fixes.
    
    Signed-off-by: Ingo Molnar <mingo@elte.hu>


It's possible I just didn't get 'lucky' and marked something as good,
when it wouldn't have triggered until the 11th boot, which is why I did that
second bisect run.  Should I bother doing a 3rd try ?
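
As a rough check on that worry, here is a minimal sketch of the arithmetic.
It assumes each boot of an affected kernel hangs independently with
probability 0.1 (the "1 in 10 boots or so" figure above; the independence
is an assumption, and the function names are made up for illustration).

```python
import math


def false_good_probability(hang_rate: float, boots: int) -> float:
    """Chance that a kernel carrying the bug survives `boots` boots in a row."""
    return (1.0 - hang_rate) ** boots


def boots_needed(hang_rate: float, max_false_good: float) -> int:
    """Smallest boot count whose false-'good' chance is at most max_false_good."""
    return math.ceil(math.log(max_false_good) / math.log(1.0 - hang_rate))


if __name__ == "__main__":
    hang_rate = 0.1  # assumed from "1 in 10 boots or so"
    # Chance that 10 clean boots wrongly certify a bad kernel as 'good'.
    print(f"{false_good_probability(hang_rate, 10):.3f}")  # 0.349
    # Boots per kernel needed to push that chance down to 5%.
    print(boots_needed(hang_rate, 0.05))  # 29
```

Under those assumptions, ten clean boots still leave roughly a one-in-three
chance of marking a bad kernel 'good', so a single wrong 'good' somewhere in
the run could be enough to steer the bisect to an unrelated merge commit.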

The kernels have a bunch of debug options turned on, but I don't get anything
out of the machine at all, it's just wedged solid.

The machines I'm seeing this on are a quad-core AMD Phenom, and a Dual core2duo,
so quite disparate hardware. (And making me believe it's too coincidental to be a
hardware problem).

Anyone else seeing anything like this ?

	Dave


git bisect start
# bad: [d762f4383100c2a87b1a3f2d678cd3b5425655b4] Merge branch 'sh-latest' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6
git bisect bad d762f4383100c2a87b1a3f2d678cd3b5425655b4
# good: [61c4f2c81c61f73549928dfd9f3e8f26aa36a8cf] Linux 2.6.39
git bisect good 61c4f2c81c61f73549928dfd9f3e8f26aa36a8cf
# bad: [052497553e5dedc04c43800820c1d5788201cc71] Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
git bisect bad 052497553e5dedc04c43800820c1d5788201cc71
# good: [2142c131a3e290ae350f8a0b0d354c0585a96df1] net: convert to new cpumask API
git bisect good 2142c131a3e290ae350f8a0b0d354c0585a96df1
# bad: [a2d063ac216c1618bfc2b4d40b7176adffa63511] extable, core_kernel_data(): Make sure all archs define _sdata
git bisect bad a2d063ac216c1618bfc2b4d40b7176adffa63511
# good: [df48d8716eab9608fe93924e4ae06ff110e8674f] Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
git bisect good df48d8716eab9608fe93924e4ae06ff110e8674f
# bad: [13588209aa90d9c8e502750fc86160314555612f] Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
git bisect bad 13588209aa90d9c8e502750fc86160314555612f
# bad: [7e6628e4bcb3b3546c625ec63ca724f28ab14f0c] Merge branch 'timers-clockevents-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
git bisect bad 7e6628e4bcb3b3546c625ec63ca724f28ab14f0c
# good: [6ddafdaab3f809b110ada253d2f2d4910ebd3ac5] Merge branch 'sched/locking' into sched/core
git bisect good 6ddafdaab3f809b110ada253d2f2d4910ebd3ac5
# good: [61ee9a4ba05f0a4163d43a33dee7a0651e080b98] x86: Convert PIT to clockevents_config_and_register()
git bisect good 61ee9a4ba05f0a4163d43a33dee7a0651e080b98
# bad: [7142d17e8f935fa842e9f6eece2281b6d41625d6] sched: Shorten the construction of the span cpu mask of sched domain
git bisect bad 7142d17e8f935fa842e9f6eece2281b6d41625d6
# bad: [d3bf52e998056a6002b2aecfe1d25486376382ac] sched: Remove obsolete comment from scheduler_tick()
git bisect bad d3bf52e998056a6002b2aecfe1d25486376382ac
# good: [2f36825b176f67e5c5228aa33d828bc39718811f] sched: Next buddy hint on sleep and preempt path
git bisect good 2f36825b176f67e5c5228aa33d828bc39718811f
# bad: [42ac9e87fdd89b77fa2ca0a5226023c1c2d83226] Merge commit 'v2.6.39-rc4' into sched/core
git bisect bad 42ac9e87fdd89b77fa2ca0a5226023c1c2d83226
# good: [057f3fadb347e9c51b07e1b277bbdda79f976768] sched: Fix sched_domain iterations vs. RCU
git bisect good 057f3fadb347e9c51b07e1b277bbdda79f976768



* Re: random hangs during boot in 3.0-rc
  2011-06-07  3:46 random hangs during boot in 3.0-rc Dave Jones
@ 2011-06-07  5:45 ` Mike Galbraith
  2011-06-07 11:01 ` Ingo Molnar
From: Mike Galbraith @ 2011-06-07  5:45 UTC
  To: Dave Jones; +Cc: Linux Kernel, Ingo Molnar, Linus Torvalds

On Mon, 2011-06-06 at 23:46 -0400, Dave Jones wrote:
> I have two machines that occasionally (like 1 in 10 boots or so) hang solid
> during boot-up.  Happens in different places, but usually either when loading
> the microcode driver, or while doing a fsck.
> 
> I did a bisect which took a *long* time, since I booted each kernel 10 times
> before pronouncing it 'good'. Once it fingered the bad commit, I started over,
> and arrived at the same conclusion a second time.
> 
> But the actual commit is a merge commit.  What now ?
> 
> commit 42ac9e87fdd89b77fa2ca0a5226023c1c2d83226
> Merge: 057f3fa f0e615c
> Author: Ingo Molnar <mingo@elte.hu>
> Date:   Thu Apr 21 11:39:21 2011 +0200
> 
>     Merge commit 'v2.6.39-rc4' into sched/core
>     
>     Merge reason: Pick up upstream fixes.
>     
>     Signed-off-by: Ingo Molnar <mingo@elte.hu>
> 
> 
> It's possible I just didn't get 'lucky' and marked something as good,
> when it wouldn't have triggered until the 11th boot, which is why I did that
> second bisect run.  Should I bother doing a 3rd try ?
> 
> The kernels have a bunch of debug options turned on, but I don't get anything
> out of the machine at all, it's just wedged solid.

Ditto here using a bug report config.  Seems to be cured now.

> The machines I'm seeing this on are a quad-core AMD Phenom, and a Dual core2duo,
> so quite disparate hardware. (And making me believe it's too coincidental to be a
> hardware problem).
> 
> Anyone else seeing anything like this ?

Maybe this?

https://lkml.org/lkml/2011/6/6/645

	-Mike



* Re: random hangs during boot in 3.0-rc
  2011-06-07  3:46 random hangs during boot in 3.0-rc Dave Jones
  2011-06-07  5:45 ` Mike Galbraith
@ 2011-06-07 11:01 ` Ingo Molnar
  2011-06-07 16:24   ` Dave Jones
From: Ingo Molnar @ 2011-06-07 11:01 UTC
  To: Dave Jones, Linux Kernel, Linus Torvalds


* Dave Jones <davej@redhat.com> wrote:

> I have two machines that occasionally (like 1 in 10 boots or so) 
> hang solid during boot-up.  Happens in different places, but 
> usually either when loading the microcode driver, or while doing a 
> fsck.

I think this commit in tip:sched/urgent will fix it:

  f2513cde93f0: lockdep: Fix lock_is_held() on recursion

Will send it to Linus later today.

Thanks,

	Ingo


* Re: random hangs during boot in 3.0-rc
  2011-06-07 11:01 ` Ingo Molnar
@ 2011-06-07 16:24   ` Dave Jones
From: Dave Jones @ 2011-06-07 16:24 UTC
  To: Ingo Molnar; +Cc: Linux Kernel, Linus Torvalds

On Tue, Jun 07, 2011 at 01:01:02PM +0200, Ingo Molnar wrote:
 > 
 > * Dave Jones <davej@redhat.com> wrote:
 > 
 > > I have two machines that occasionally (like 1 in 10 boots or so) 
 > > hang solid during boot-up.  Happens in different places, but 
 > > usually either when loading the microcode driver, or while doing a 
 > > fsck.
 > 
 > I think this commit in tip:sched/urgent will fix it:
 > 
 >   f2513cde93f0: lockdep: Fix lock_is_held() on recursion
 > 
 > Will send it to Linus later today.

Indeed. Looks like that does fix the problem. 
I've done a number of reboots on one of the affected machines this morning,
without any hangs.

	Dave

