linux-next: PowerPC boot failures in next-20120521

* linux-next: PowerPC boot failures in next-20120521
@ 2012-05-22  1:40 Stephen Rothwell
  2012-05-22  1:53 ` David Rientjes
  2012-05-22  2:12 ` linux-next: PowerPC boot failures in next-20120521 Michael Neuling
  0 siblings, 2 replies; 13+ messages in thread
From: Stephen Rothwell @ 2012-05-22  1:40 UTC
  To: LKML
  Cc: linux-next, ppc-dev, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Peter Zijlstra, Lee Schermerhorn, Linus

Hi all,

Last night's boot tests on various PowerPC systems failed like this:

calling  .numa_group_init+0x0/0x3c @ 1
initcall .numa_group_init+0x0/0x3c returned 0 after 0 usecs
calling  .numa_init+0x0/0x1dc @ 1
Unable to handle kernel paging request for data at address 0x00001688
Faulting instruction address: 0xc00000000016e154
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=32 NUMA pSeries
Modules linked in:
NIP: c00000000016e154 LR: c0000000001b9140 CTR: 0000000000000000
REGS: c0000003fc8c76d0 TRAP: 0300   Not tainted  (3.4.0-autokern1)
MSR: 8000000000009032 <SF,EE,ME,IR,DR,RI>  CR: 24044022  XER: 00000003
SOFTE: 1
CFAR: 000000000000562c
DAR: 0000000000001688, DSISR: 40000000
TASK = c0000003fc8c8000[1] 'swapper/0' THREAD: c0000003fc8c4000 CPU: 0
GPR00: 0000000000000000 c0000003fc8c7950 c000000000d05b30 00000000000012d0 
GPR04: 0000000000000000 0000000000001680 0000000000000000 c0000003fe032f60 
GPR08: 0004005400000001 0000000000000000 ffffffffffffc980 c000000000d24fe0 
GPR12: 0000000024044024 c00000000f33b000 0000000001a3fa78 00000000009bac00 
GPR16: 0000000000e1f338 0000000002d513f0 0000000000001680 0000000000000000 
GPR20: 0000000000000001 c0000003fc8c7c00 0000000000000000 0000000000000001 
GPR24: 0000000000000001 c000000000d1b490 0000000000000000 0000000000001680 
GPR28: 0000000000000000 0000000000000000 c000000000c7ce58 c0000003fe009200 
NIP [c00000000016e154] .__alloc_pages_nodemask+0xc4/0x8f0
LR [c0000000001b9140] .new_slab+0xd0/0x3c0
Call Trace:
[c0000003fc8c7950] [2e6e756d615f696e] 0x2e6e756d615f696e (unreliable)
[c0000003fc8c7ae0] [c0000000001b9140] .new_slab+0xd0/0x3c0
[c0000003fc8c7b90] [c0000000001b9844] .__slab_alloc+0x254/0x5b0
[c0000003fc8c7cd0] [c0000000001bb7a4] .kmem_cache_alloc_node_trace+0x94/0x260
[c0000003fc8c7d80] [c000000000ba36d0] .numa_init+0x98/0x1dc
[c0000003fc8c7e10] [c00000000000ace4] .do_one_initcall+0x1a4/0x1e0
[c0000003fc8c7ed0] [c000000000b7b354] .kernel_init+0x124/0x2e0
[c0000003fc8c7f90] [c0000000000211c8] .kernel_thread+0x54/0x70
Instruction dump:
5400d97e 7b170020 0b000000 eb3e8000 3b800000 80190088 2f800000 40de0014 
7860efe2 787c6fe2 78000fa4 7f9c0378 <e81b0008> 83f90000 2fa00000 7fff1838 
---[ end trace 31fd0ba7d8756001 ]---

swapper/0 (1) used greatest stack depth: 10864 bytes left
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
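
For readers following the oops above, here is a minimal sketch (in Python,
not part of the original report) of how the faulting access can be
cross-checked against the register dump. It assumes that the word in angle
brackets in the instruction dump is the faulting instruction and that it is
a PowerPC DS-form load; the function and variable names are illustrative
only.

```python
# Decode the marked word from the "Instruction dump" as a PowerPC DS-form
# instruction and recompute the effective address from the register dump.
# All names here are hypothetical helpers, not kernel or tool APIs.

def decode_ds_form(word):
    """Split a 32-bit DS-form instruction word into its fields."""
    opcode = (word >> 26) & 0x3F   # primary opcode (58 covers ld/ldu/lwa)
    rt = (word >> 21) & 0x1F       # target register
    ra = (word >> 16) & 0x1F       # base address register
    disp = word & 0xFFFC           # displacement, low two bits masked off
    if disp & 0x8000:              # sign-extend the 16-bit displacement
        disp -= 0x10000
    xo = word & 0x3                # extended opcode (0 selects ld)
    return opcode, rt, ra, disp, xo

FAULTING_WORD = 0xE81B0008  # <e81b0008> from the instruction dump
GPR27 = 0x1680              # GPR27 value from the register dump
DAR = 0x1688                # faulting data address reported in the oops

opcode, rt, ra, disp, xo = decode_ds_form(FAULTING_WORD)
effective_address = GPR27 + disp

print(f"opcode={opcode} xo={xo}: ld r{rt}, {disp}(r{ra})")
print(f"effective address = {effective_address:#x}, DAR = {DAR:#x}")
```

Under those assumptions the word decodes as `ld r0, 8(r27)`, and GPR27 + 8
equals the reported DAR of 0x1688. That would suggest the load went through
a small, non-pointer value held in r27 rather than a valid kernel address,
though this alone does not identify which commit introduced the problem.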

I may be completely wrong, but I guess the obvious target would be the
sched/numa branch that came in via the tip tree.

Config file attached.  I haven't had a chance to try to bisect this yet.

Anyone have any ideas?
-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au

[-- Attachment #1.2: dotconfig.bz2 --]
[-- Type: application/octet-stream, Size: 15419 bytes --]