* weird failures from -mm1
@ 2006-04-07 18:05 Martin Bligh
[not found] ` <1144433309.24221.7.camel@localhost.localdomain>
0 siblings, 1 reply; 4+ messages in thread
From: Martin Bligh @ 2006-04-07 18:05 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel, Andy Whitcroft
I hadn't mailed this out for a while, because we weren't sure if it was
-mm or a testing glitch, but there have been no -git releases, so Andy
reran -mm to double check, and the problem still seems to be there. A subsequent
test of rc1 + cons patches didn't hit this ... I think -mm has issues ;-)
Look at the 2.6.17-rc1-mm1 column from: http://test.kernel.org/
Drilling down into the console logs:
http://test.kernel.org/abat/27597/debug/console.log
Hangs after testing NMI watchdog.
http://test.kernel.org/abat/27596/debug/console.log
Hangs after bringing up cpus.
http://test.kernel.org/abat/27598/debug/console.log
http://test.kernel.org/abat/27593/debug/console.log
Both fail with reiserfs fsck errors; at first sight they look like just dirty
root partitions, but I don't think they are.
Filesystem is clean
Failed to lock the process to fsck the mounted ro partition. Bad address.
fsck.reiserfs /dev/sda3 failed (status 0x8). Run manually!
Note that it's actually saying it's clean.
* Re: weird failures from -mm1
[not found] ` <1144433309.24221.7.camel@localhost.localdomain>
@ 2006-04-07 18:20 ` Martin Bligh
2006-04-07 19:11 ` Andrew Morton
0 siblings, 1 reply; 4+ messages in thread
From: Martin Bligh @ 2006-04-07 18:20 UTC (permalink / raw)
To: Dave Hansen; +Cc: Andrew Morton, linux-kernel, Andy Whitcroft
Dave Hansen wrote:
> On Fri, 2006-04-07 at 11:05 -0700, Martin Bligh wrote:
>
>>http://test.kernel.org/abat/27596/debug/console.log
>>Hangs after bringing up cpus.
>
>
> See attached patch. It fixes curly.
Splendid - thanks. This may well fix the first two ... I think the reiser
thing is likely still borked though.
M.
> -- Dave
>
>
> ------------------------------------------------------------------------
>
> Subject:
> [PATCH 2.6.17-rc1-mm1] sched_domain-handle-kmalloc-failure-fix
> From:
> Lee Schermerhorn <Lee.Schermerhorn@hp.com>
> Date:
> Thu, 06 Apr 2006 15:58:47 -0400
> To:
> linux-kernel <linux-kernel@vger.kernel.org>
> CC:
> Andrew Morton <akpm@osdl.org>, Eric Whitney <eric.whitney@hp.com>
>
>
> [PATCH] sched_domain-handle-kmalloc-failure-fix
>
> 2.6.17-rc1-mm1 hangs during boot on HP rx8620 and dl585 -- both 4 node
> NUMA platforms. Problem is in build_sched_domains() setting up the
> sched_group_nodes[] lists, resulting from patch:
> sched_domain-handle-kmalloc-failure.patch
>
> The referenced patch does not propagate the "next" pointer from the head
> of the list, resulting in a loop between the last 2 groups in the list.
> This causes a tight loop/hang in init_numa_sched_groups_power() because
> 'sg->next' never == 'group_head' when you have > 2 nodes.
>
> This patch seems to fix the problem.
>
> Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
>
> Index: linux-2.6.17-rc1-mm1/kernel/sched.c
> ===================================================================
> --- linux-2.6.17-rc1-mm1.orig/kernel/sched.c 2006-04-06 15:18:32.000000000 -0400
> +++ linux-2.6.17-rc1-mm1/kernel/sched.c 2006-04-06 15:20:49.000000000 -0400
> @@ -6360,7 +6360,7 @@ static int build_sched_domains(const cpu
> }
> sg->cpu_power = 0;
> sg->cpumask = tmp;
> - sg->next = prev;
> + sg->next = prev->next;
> cpus_or(covered, covered, tmp);
> prev->next = sg;
> prev = sg;
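
For readers following the patch description above, here is a minimal userspace sketch of the list-insertion bug Lee describes. It assumes a simplified circular singly linked list whose head initially points at itself; `struct group`, `insert_buggy`, `insert_fixed`, `ring_closes` and `build_and_check` are hypothetical names for illustration and are not the kernel's code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a sched_group: only the ring pointer matters. */
struct group {
    struct group *next;
};

/* Insert sg after prev the way the unfixed code did: sg points back at prev. */
static void insert_buggy(struct group *prev, struct group *sg)
{
    sg->next = prev;
    prev->next = sg;
}

/* Insert sg after prev as in the fix: sg inherits prev's successor. */
static void insert_fixed(struct group *prev, struct group *sg)
{
    sg->next = prev->next;
    prev->next = sg;
}

/* Walk from head->next; return 1 if head is reached within max_steps hops. */
static int ring_closes(const struct group *head, int max_steps)
{
    const struct group *g = head->next;
    int i;

    for (i = 0; i < max_steps; i++) {
        if (g == head)
            return 1;
        g = g->next;
    }
    return 0; /* never came back to head: a real walker would spin forever */
}

/* Build a ring of n groups (1 <= n <= 8) with the given insert function,
 * then report whether a walk from the head ever returns to the head. */
static int build_and_check(int n,
                           void (*insert)(struct group *, struct group *))
{
    struct group nodes[8];
    struct group *prev = &nodes[0];
    int i;

    assert(n >= 1 && n <= 8);
    nodes[0].next = &nodes[0]; /* assumption: head starts as a one-element ring */
    for (i = 1; i < n; i++) {
        insert(prev, &nodes[i]);
        prev = &nodes[i];
    }
    return ring_closes(&nodes[0], 2 * n);
}
```

With two groups both variants produce a closed ring, which is consistent with the hang only showing up with more than 2 nodes. With three or more groups, `insert_buggy` leaves the last two groups pointing at each other, so a walk that waits for `sg->next` to reach the head never terminates; the sketch bounds the walk with `max_steps` so the difference can be observed safely.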
* Re: weird failures from -mm1
2006-04-07 18:20 ` Martin Bligh
@ 2006-04-07 19:11 ` Andrew Morton
2006-04-08 14:28 ` Martin J. Bligh
0 siblings, 1 reply; 4+ messages in thread
From: Andrew Morton @ 2006-04-07 19:11 UTC (permalink / raw)
To: Martin Bligh; +Cc: haveblue, linux-kernel, apw
Martin Bligh <mbligh@mbligh.org> wrote:
>
> Dave Hansen wrote:
> > On Fri, 2006-04-07 at 11:05 -0700, Martin Bligh wrote:
> >
> >>http://test.kernel.org/abat/27596/debug/console.log
> >>Hangs after bringing up cpus.
> >
> >
> > See attached patch. It fixes curly.
>
> Splendid - thanks. This may well fix the first two ... I think the reiser
> thing is likely still borked though.
The reiserfsck problem looks like a failed mlockall. Reverting
mm-posix-memory-lock.patch should fix it.
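
As a rough illustration of the failure mode suggested here, the sketch below calls `mlockall()` and reports the error if it fails. It is an assumption that fsck.reiserfs does something similar; `try_lock_all` is a hypothetical helper, not code from reiserfsprogs. The "Bad address" text in the console log is the usual `strerror()` string for `EFAULT`.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Try to pin all current and future pages of this process in RAM.
 * Returns 0 on success, otherwise the errno value from mlockall(). */
static int try_lock_all(void)
{
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        int err = errno; /* save before any other call can clobber it */

        fprintf(stderr, "mlockall failed: %s\n", strerror(err));
        return err;
    }
    munlockall(); /* undo the lock so the sketch has no lasting effect */
    return 0;
}
```

On a healthy kernel the call can still fail for ordinary reasons, such as `ENOMEM` or `EPERM` when the locked-memory limit is too low, so a nonzero return from this helper is not by itself evidence of the -mm1 regression.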
* Re: weird failures from -mm1
2006-04-07 19:11 ` Andrew Morton
@ 2006-04-08 14:28 ` Martin J. Bligh
0 siblings, 0 replies; 4+ messages in thread
From: Martin J. Bligh @ 2006-04-08 14:28 UTC (permalink / raw)
To: Andrew Morton; +Cc: haveblue, linux-kernel, apw
Andrew Morton wrote:
> Martin Bligh <mbligh@mbligh.org> wrote:
>
>>Dave Hansen wrote:
>> > On Fri, 2006-04-07 at 11:05 -0700, Martin Bligh wrote:
>> >
>> >>http://test.kernel.org/abat/27596/debug/console.log
>> >>Hangs after bringing up cpus.
>> >
>> >
>> > See attached patch. It fixes curly.
>>
>> Splendid - thanks. This may well fix the first two ... I think the reiser
>> thing is likely still borked though.
>
>
> The reiserfsck problem looks like a failed mlockall. Reverting
> mm-posix-memory-lock.patch should fix it.
Didn't manage to get that test kicked off before you released -mm2,
which seems to work fine (across the boxes that still work, at least).
M.