* [PATCH v3.3-rc5] memblock: Fix size aligning of memblock_alloc_base_nid()
@ 2012-02-28 20:56 Tejun Heo
  2012-02-13  7:45 ` OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88 Meelis Roos
  2012-02-28 22:16 ` [PATCH v3.3-rc5] " Sam Ravnborg
  0 siblings, 2 replies; 46+ messages in thread
From: Tejun Heo @ 2012-02-28 20:56 UTC
To: Ingo Molnar, H. Peter Anvin
Cc: David S. Miller, linux-kernel, Meelis Roos, Grant Likely, Rob Herring, sparclinux, sam

The memblock allocator aligns @size to @align to reduce the amount of
fragmentation. Commit 7bd0b0f0da "memblock: Reimplement memblock
allocation using reverse free area iterator" broke it by incorrectly
relocating @size aligning to memblock_find_in_range_node(). As the
aligned size is not propagated back to memblock_alloc_base_nid(), the
actually reserved size isn't aligned.

While this increases memory use for the memblock reserved array, this
shouldn't cause any critical failure; however, it seems that the size
aligning was hiding a use-beyond-allocation bug in sparc64, and losing
the aligning causes boot failure.

The underlying problem is currently being debugged, but this is a
proper fix in itself; it's already pretty late in the -rc cycle for
boot failures, and reverting the change for debugging isn't difficult.
Restore the size aligning by moving it to memblock_alloc_base_nid().

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Meelis Roos <mroos@linux.ee>
Reported-by: Sam Ravnborg <sam@ravnborg.org>
LKML-Reference: <alpine.SOC.1.00.1202130942030.1488@math.ut.ee>
---
 mm/memblock.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/memblock.c b/mm/memblock.c
index 77b5f22..99f2855 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -99,9 +99,6 @@ phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t start,
 	phys_addr_t this_start, this_end, cand;
 	u64 i;
 
-	/* align @size to avoid excessive fragmentation on reserved array */
-	size = round_up(size, align);
-
 	/* pump up @end */
 	if (end == MEMBLOCK_ALLOC_ACCESSIBLE)
 		end = memblock.current_limit;
@@ -731,6 +728,9 @@ static phys_addr_t __init memblock_alloc_base_nid(phys_addr_t size,
 {
 	phys_addr_t found;
 
+	/* align @size to avoid excessive fragmentation on reserved array */
+	size = round_up(size, align);
+
 	found = memblock_find_in_range_node(0, max_addr, size, align, nid);
 	if (found && !memblock_reserve(found, size))
 		return found;

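For readers who want to see why the location of the rounding matters, here
is a minimal userspace sketch. It is not kernel code. The names
round_up_pow2(), find_aligning_internally(), reserved_size_before_fix() and
reserved_size_after_fix() are hypothetical stand-ins invented for this
illustration, the returned address is made up, and the sketch assumes
@align is a power of two. It only models the by-value parameter passing
described in the commit message: rounding inside the finder changes the
finder's local copy, so the caller still reserves the unaligned size.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;

/* Round x up to the next multiple of a. Assumption: a is a power of two. */
static phys_addr_t round_up_pow2(phys_addr_t x, phys_addr_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/*
 * Hypothetical stand-in for a finder that aligns @size itself.
 * @size is passed by value, so the rounding is invisible to the caller.
 */
static phys_addr_t find_aligning_internally(phys_addr_t size, phys_addr_t align)
{
	size = round_up_pow2(size, align);	/* only the local copy changes */
	(void)size;
	return 0x1000;				/* made-up "found" address */
}

/* Models the broken flow: the caller reserves its own, unaligned @size. */
static phys_addr_t reserved_size_before_fix(phys_addr_t size, phys_addr_t align)
{
	phys_addr_t found = find_aligning_internally(size, align);

	(void)found;
	return size;	/* what would be handed to the reserve step */
}

/* Models the fixed flow: the caller rounds first, then finds and reserves. */
static phys_addr_t reserved_size_after_fix(phys_addr_t size, phys_addr_t align)
{
	phys_addr_t found;

	size = round_up_pow2(size, align);
	found = find_aligning_internally(size, align);
	(void)found;
	return size;	/* aligned size reaches the reserve step */
}
```

With a request of 100 bytes and an alignment of 64, the first flow would
reserve 100 bytes and the second 128, which is the difference the patch
restores.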
* OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88
@ 2012-02-13  7:45 ` Meelis Roos
  2012-02-13  8:06   ` Grant Likely
  2012-03-01 12:24   ` [tip:core/urgent] memblock: Fix size aligning of memblock_alloc_base_nid() tip-bot for Tejun Heo
  0 siblings, 2 replies; 46+ messages in thread
From: Meelis Roos @ 2012-02-13  7:45 UTC
To: Grant Likely, Rob Herring; +Cc: sparclinux, Linux Kernel list

(Resend with proper To-s for OF people)

This is my first post-3.2 test on 2-CPU Sun Enterprise 3500 (PCI+SBus
IO). prtconf is also below.

Something OF-related seems to be happening here.

[    0.000000] PROMLIB: Sun IEEE Boot Prom 'OBP 3.2.30 2002/10/25 14:03'
[    0.000000] PROMLIB: Root node compatible:
[    0.000000] Initializing cgroup subsys cpu
[    0.000000] Linux version 3.3.0-rc3-00188-g3ec1e88 (mroos@korvits) (gcc version 4.6.2 (Debian 4.6.2-14) ) #64 SMP Sun Feb 12 22:26:40 EET 2012
[    0.000000] debug: ignoring loglevel setting.
[    0.000000] bootconsole [earlyprom0] enabled
[    0.000000] ARCH: SUN4U
[    0.000000] Ethernet address: 08:00:20:b6:ee:e2
[    0.000000] Kernel: Using 4 locked TLB entries for main kernel image.
[    0.000000] Remapping the kernel... done.
[    0.000000] Unable to handle kernel NULL pointer dereference
[    0.000000] tsk->{mm,active_mm}->context = 0000000000000000
[    0.000000] tsk->{mm,active_mm}->pgd = fffff800008c77d0
[    0.000000]               \|/ ____ \|/
[    0.000000]               "@'/ .. \`@"
[    0.000000]               /_| \__/ |_\
[    0.000000]                  \__U_/
[    0.000000] swapper(0): Oops [#1]
[    0.000000] TSTATE: 0000000080e01607 TPC: 00000000006459a0 TNPC: 0000000000645964 Y: 00000037    Not tainted
[    0.000000] TPC: <of_find_node_by_path+0x60/0x80>
[    0.000000] g0: 0000000000000000 g1: 0000000000000001 g2: 00000000000000ff g3: 00000000000000f0
[    0.000000] g4: 0000000000853fd0 g5: 0000000000000000 g6: 0000000000834000 g7: 0000000000000050
[    0.000000] o0: 0000000000000001 o1: fffff8007fced7c0 o2: 0000000001010101 o3: 0000000080808080
[    0.000000] o4: fffff8007fcc0a4d o5: 00000000000199b5 sp: 0000000000837231 ret_pc: 0000000000645970
[    0.000000] RPC: <of_find_node_by_path+0x30/0x80>
[    0.000000] l0: 00000000008ab400 l1: fffff8007fcc1f40 l2: 000000000085c5ec l3: 0000000000000025
[    0.000000] l4: 00000000005c0400 l5: 00000000008fa5e6 l6: 0000000000000006 l7: 0028280000000000
[    0.000000] i0: fffff8007fced7c0 i1: 0000000000808fd8 i2: 0000000001010101 i3: 0000000080808080
[    0.000000] i4: 0000000000876c00 i5: 0000000000000050 i6: 00000000008372e1 i7: 000000000064684c
[    0.000000] I7: <of_alias_scan+0xcc/0x1c0>
[    0.000000] Call Trace:
[    0.000000]  [000000000064684c] of_alias_scan+0xcc/0x1c0
[    0.000000]  [00000000008a0350] of_pdt_build_devicetree+0x90/0xa0
[    0.000000]  [000000000088c540] prom_build_devicetree+0x10/0x3c
[    0.000000]  [00000000008904d4] paging_init+0x59c/0x6bc
[    0.000000]  [000000000088bebc] setup_arch+0xf8/0x110
[    0.000000]  [000000000088a51c] start_kernel+0x8c/0x34c
[    0.000000]  [00000000006fbf28] tlb_fixup_done+0xa0/0xa8
[    0.000000]  [0000000000000000]           (null)
[    0.000000] Disabling lock debugging due to kernel taint
[    0.000000] Caller[000000000064684c]: of_alias_scan+0xcc/0x1c0
[    0.000000] Caller[00000000008a0350]: of_pdt_build_devicetree+0x90/0xa0
[    0.000000] Caller[000000000088c540]: prom_build_devicetree+0x10/0x3c
[    0.000000] Caller[00000000008904d4]: paging_init+0x59c/0x6bc
[    0.000000] Caller[000000000088bebc]: setup_arch+0xf8/0x110
[    0.000000] Caller[000000000088a51c]: start_kernel+0x8c/0x34c
[    0.000000] Caller[00000000006fbf28]: tlb_fixup_done+0xa0/0xa8
[    0.000000] Caller[0000000000000000]:           (null)
[    0.000000] Instruction DUMP: 01000000  fa5f6050  2aff7ff2 <c25f6018> 901720f0  40034b86  b010001d  81cfe008  01000000
[    0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
[    0.000000] Press Stop-A (L1-A) to return to the boot prom

System Configuration:  Sun Microsystems  sun4u
Memory size: 2048 Megabytes
System Peripherals (PROM Nodes):

Node 0xf0029c88
    .node:  f0029c88
    clock-frequency:  05f5e100
    previous-reset-reason:  'S-POR'
    banner-name:  '5-slot Sun Enterprise E3500'
    idprom:  01800800.20b6eee2.00000000.b6eee2a9.00000000.00000000.00000000.00000000
    reset-reason:  'S-POR'
    fatal-reset-info:  00006000
    breakpoint-trap:  0000007f
    #size-cells:  00000002
    name:  'SUNW,Ultra-Enterprise'

Node 0xf002cf50
    .node:  f002cf50
    name:  'packages'

Node 0xf00365c0
    .node:  f00365c0
    iso6429-1983-colors:
    name:  'terminal-emulator'

Node 0xf003932c
    .node:  f003932c
    disk-write-fix:
    name:  'deblocker'

Node 0xf0039a08
    .node:  f0039a08
    name:  'obp-tftp'

Node 0xf00447cc
    .node:  f00447cc
    name:  'disk-label'

Node 0xf002cfc0
    .node:  f002cfc0
    stdout:  ffdc1428
    stdin:  ffdc1658
    eeprom:  f005dd0c
    mmu:  fffe9f70
    memory:  fffea170
    bootargs:  00
    bootpath:  '/pci@f,4000/SUNW,isptwo@3/sd@2,0:a'
    stdout-#lines:  ffffffff
    name:  'chosen'

Node 0xf002d02c
    .node:  f002d02c
    add-brd-supported-types:  '014'
    version:  'OBP 3.2.30 2002/10/25 14:03'
    model:  'SUNW,3.2'
    decode-complete:
    aligned-allocator:
    relative-addressing:
    name:  'openprom'

Node 0xf002d0bc
    .node:  f002d0bc
    name:  'client-services'

Node 0xf002d164
    .node:  f002d164
    disabled-memory-list:
    disabled-board-list:
    memory-interleave:  'max'
    configuration-policy:  'component'
    scsi-initiator-id:  '7'
    keyboard-click?:  'false'
    keymap:
    ttyb-rts-dtr-off:  'false'
    ttyb-ignore-cd:  'true'
    ttya-rts-dtr-off:  'false'
    ttya-ignore-cd:  'true'
    ttyb-mode:  '9600,8,n,1,-'
    ttya-mode:  '9600,8,n,1,-'
    sbus-specific-probe:
    sbus-probe-default:  'd3120'
    mfg-mode:  'off '
    diag-level:  'min'
    powerfail-time:  '0'
    #power-cycles:  '52'
    fcode-debug?:  'false'
    output-device:  'ttya'
    input-device:  'ttya'
    load-base:  '16384'
    boot-command:  'boot'
    auto-boot?:  'false'
    watchdog-reboot?:  'false'
    diag-file:
    diag-device:  'mydisk'
    boot-file:
    boot-device:  'mydisk'
    local-mac-address?:  'false'
    ansi-terminal?:  'true'
    screen-#columns:  '80'
    screen-#rows:  '34'
    silent-mode?:  'false'
    use-nvramrc?:  'true'
    nvramrc:  64657661.6c696173.206d7964.69736b20.2f706369.40662c34.3030302f.53554e57.2c697370.74776f40.332f7364.40322c30.0a
    security-mode:  'none'
    security-password:
    security-#badlogins:  '0'
    oem-logo:
    oem-logo?:  'false'
    oem-banner:
    oem-banner?:  'false'
    hardware-revision:
    last-hardware-update:  '0'
    diag-switch?:  'false'
    name:  'options'

Node 0xf002d1d4
    .node:  f002d1d4
    mydisk:  '/pci@f,4000/SUNW,isptwo@3/sd@2,0'
    disk:  '/sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@0,0'
    disksocal:  '/sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@0,0'
    diskbrd:  '/sbus@3,0/SUNW,fas@3,8800000/sd@a,0'
    diskisp:  '/sbus@3,0/QLGC,isp@0,10000/sd@0,0'
    net:  '/sbus@3,0/SUNW,hme@3,8c00000'
    cdrom:  '/sbus@3,0/SUNW,fas@3,8800000/sd@6,0:f'
    tape:  '/sbus@3,0/SUNW,fas@3,8800000/st@4,0'
    scsi:  '/sbus@3,0/SUNW,fas@3,8800000'
    disk0:  '/sbus@3,0/SUNW,fas@3,8800000/sd@0,0'
    disk1:  '/sbus@3,0/SUNW,fas@3,8800000/sd@1,0'
    disk2:  '/sbus@3,0/SUNW,fas@3,8800000/sd@2,0'
    disk3:  '/sbus@3,0/SUNW,fas@3,8800000/sd@3,0'
    disk4:  '/sbus@3,0/SUNW,fas@3,8800000/sd@4,0'
    disk5:  '/sbus@3,0/SUNW,fas@3,8800000/sd@5,0'
    tape0:  '/sbus@3,0/SUNW,fas@3,8800000/st@4,0'
    tape1:  '/sbus@3,0/SUNW,fas@3,8800000/st@5,0'
    ttya:  '/central/fhc/zs@0,902000:a'
    ttyb:  '/central/fhc/zs@0,902000:b'
    keyboard:  '/central/fhc/zs@0,904000'
    keyboard!:  '/central/fhc/zs@0,904000:forcemode'
    name:  'aliases'

Node 0xf004efb4
    .node:  f004efb4
    reg:  00000000.00000000.00000000.40000000.00000000.40000000.00000000.40000000
    available:  00000000.7fce2000.00000000.00014000.00000000.7fc00000.00000000.000d2000.00000000.00000000.00000000.7f7de000
    name:  'memory'

Node 0xf004f594
    .node:  f004f594
    translations:
00000000.fffd0000.00000000.00020000.80000000.7ff600b6.00000000.fff70000.00000000.00060000.80000000.7fef80b6.00000000.fff6e000.00000000.00002000.80000000.7fbfe0b6.00000000.fff6c000.00000000.00002000.80000000.7fef60b6.00000000.fff66000.00000000.00002000.800001ff.f890208e.00000000.fff64000.00000000.00002000.800001ff.f890808e.00000000.fff62000.00000000.00002000.800001ff.f890808e.00000000.fff60000.00000000.00002000.800001c4.f830008e.00000000.fff5e000.00000000.00002000.800001d4.f830008e.00000000.fff5c000.00000000.00002000.800001dc.f830008e.00000000.ffdd8000.00000000.00184000.80000000.7fd720b6.00000000.ffdcc000.00000000.0000c000.800001cc.f880408e.00000000.ffdca000.00000000.00002000.80000000.7fd700b6.00000000.ffdc8000.00000000.00002000.800001ff.f820608e.00000000.ffdc4000.00000000.00004000.80000000.7fd3c0b6.00000000.ffdc2000.00000000.00002000.800001ff.f890408e.00000000.ffdc0000.00000000.00002000.80000000.7fd3a0b6.00000000.ffdb2000.00000000.00006000.800001c4.0000 008e.00000000.ffdac000.00000000.00006000.800001c4.0000008e.00000000.ffda4000.00000000.00008000.80000000.7fd680b6.00000000.ffd98000.00000000.0000c000.800001c4.f880408e.00000000.ffd96000.00000000.00002000.800001c4.f830008e.00000000.ffd94000.00000000.00002000.800001c4.0000208e.00000000.ffd92000.00000000.00002000.800001c4.0000208e.00000000.ffd90000.00000000.00002000.800001c4.0000208e.00000000.ffd8a000.00000000.00006000.800001c6.0000008e.00000000.ffd84000.00000000.00006000.800001c6.0000008e.00000000.ffd7c000.00000000.00008000.80000000.7fd600b6.00000000.ffd7a000.00000000.00002000.800001c6.0000208e.00000000.ffd78000.00000000.00002000.800001c6.0000208e.00000000.ffd76000.00000000.00002000.800001c6.0000208e.00000000.ffd70000.00000000.00006000.800001d4.0000008e.00000000.ffd6a000.00000000.00006000.800001d4.0000008e.00000000.ffd62000.00000000.00008000.80000000.7fd580b6.00000000.ffd56000.00000000.0000c000.800001d4.f880408e.00000000.ffd54000.00000000.00002000.800001d4.f830008e.00000000.ffd 
52000.00000000.00002000.800001d4.0000208e.00000000.ffd50000.00000000.00002000.800001d4.0000208e.00000000.ffd4e000.00000000.00002000.800001d4.0000208e.00000000.ffd48000.00000000.00006000.800001d6.0000008e.00000000.ffd42000.00000000.00006000.800001d6.0000008e.00000000.ffd3a000.00000000.00008000.80000000.7fd500b6.00000000.ffd38000.00000000.00002000.800001d6.0000208e.00000000.ffd36000.00000000.00002000.800001d6.0000208e.00000000.ffd34000.00000000.00002000.800001d6.0000208e.00000000.ffd32000.00000000.00002000.800001dc.0000408e.00000000.ffd30000.00000000.00002000.880001dc.0100008e.00000000.ffd22000.00000000.0000e000.800001dc.0000008e.00000000.ffd1a000.00000000.00008000.80000000.7fd480b6.00000000.ffd18000.00000000.00002000.800001dc.0000208e.00000000.ffd16000.00000000.00002000.880001dc.0180008e.00000000.ffd08000.00000000.0000e000.800001dc.0000008e.00000000.ffcfc000.00000000.0000c000.800001dc.f880408e.00000000.ffcfa000.00000000.00002000.800001dc.f830008e.00000000.ffcf8000.00000000.00 002000.800001dc.0000008e.00000000.ffcf6000.00000000.00002000.800001dc.0000008e.00000000.ffcf4000.00000000.00002000.800001dc.0000008e.00000000.ffcf2000.00000000.00002000.800001de.0000408e.00000000.ffcf0000.00000000.00002000.880001de.0100008e.00000000.ffce2000.00000000.0000e000.800001de.0000008e.00000000.ffcda000.00000000.00008000.80000000.7fd400b6.00000000.ffcd8000.00000000.00002000.800001de.0000208e.00000000.ffcd6000.00000000.00002000.880001de.0180008e.00000000.ffcc8000.00000000.0000e000.800001de.0000008e.00000000.ffcc6000.00000000.00002000.800001de.0000008e.00000000.ffcc4000.00000000.00002000.800001de.0000008e.00000000.ffcc2000.00000000.00002000.800001de.0000008e.00000000.ffac2000.00000000.00200000.80000000.7f7de0b6.00000000.f07fe000.00000000.00002000.800001ff.f004208e.00000000.f02a0000.00000000.00040000.80000000.7fcfa0b6.00000000.f0080000.00000000.00220000.80000000.7f9de0b6.00000000.f0000000.00000000.00080000.80000000.7ff800b6.00000000.4162a000.00000000.029d6000.80000000.0 
1a2a036.00000000.40000000.00000000.00c00000.80000000.00400036.00000000.00002000.00000000.00bfe000.80000000.00002036 existing: 00000000.00000000.00000800.00000000.fffff800.00000000.00000800.00000000 available: fffff800.00000000.000007fc.00000000.00000001.00000000.000007ff.00000000.00000000.ffff0000.00000000.0000e000.00000000.00000000.00000000.f0000000.00000000.ffdb8000.00000000.00008000.00000000.f0800000.00000000.0f2c2000 page-size: 00002000 name: 'virtual-memory' Node 0xf005da70 .node: f005da70 ranges: 00000000.f8000000.000001ff.f8000000.08000000 reg: 000001ff.00000000.00000000.08000000 name: 'central' Node 0xf005db8c .node: f005db8c board-model: 'SUNW,501-2511' ranges: 00000000.00000000.00000000.f8000000.08000000 reg: 00000000.f8800000.00000110.00000000.f8802000.00000010.00000000.f8804000.00000020.00000000.f8806000.00000020.00000000.f8808000.00000020.00000000.f880a000.00000020 name: 'fhc' Node 0xf005dd0c .node: f005dd0c address: fff62000 watchdog-enable: interrupts: 0000003a reg: 00000000.00908000.00002000 model: 'mk48t59' name: 'eeprom' Node 0xf005de3c .node: f005de3c port-b-ignore-cd: port-a-ignore-cd: address: fff66000 interrupts: 00000039 device_type: 'serial' reg: 00000000.00902000.00000008 name: 'zs' Node 0xf005df14 .node: f005df14 address: ffdc2000 port-b-ignore-cd: port-a-ignore-cd: keyboard: interrupts: 00000039 device_type: 'serial' reg: 00000000.00904000.00000008 name: 'zs' Node 0xf005e05c .node: f005e05c reg: 00000000.00900000.00000008.00000000.00906000.00000060.00000000.0090c000.00000001 interrupts: 00000038 name: 'clock-board' Node 0xf00df7bc .node: f00df7bc board-type: 'cpu' board-model: 'SUNW,501-2557' ranges: 00000000.00000000.000001cc.f8000000.08000000 central-space: board#: 00000003 reg: 000001cc.f8800000.00000000.00000110.000001cc.f8802000.00000000.00000010.000001cc.f8804000.00000000.00000020.000001cc.f8806000.00000000.00000020.000001cc.f8808000.00000000.00000020.000001cc.f880a000.00000000.00000020 manfid#: 0000003e version#: 00000001 model: 
'SUNW,fhc0FA0' name: 'fhc' Node 0xf00dfa08 .node: f00dfa08 reg: 00000000.01000000.00008000.00000000.02000000.01000000 bank-0-status: 'ok' bank-1-status: 'ok' manfid#: 0000003e version#: 00000005 model: 'SUNW,ac0F9E' device_type: 'memory-controller' name: 'ac' Node 0xf00dfb90 .node: f00dfb90 reg: 00000000.00600000.00000010 name: 'simm-status' Node 0xf00dfc28 .node: f00dfc28 interrupts: 0000003b reg: 00000000.00400000.00000010 name: 'environment' Node 0xf00dfce4 .node: f00dfce4 reg: 00000000.00200000.00008000.00000000.00280000.00008000 name: 'sram' Node 0xf00dfd80 .node: f00dfd80 version: 4f425020.2020332e.322e3330.20323030.322f3130.2f323520.31343a30.3300504f.53542020.332e392e.33302032.3030322f.31302f32.35203134.3a303400 model: 'SUNW,525-1431' reg: 00000000.00000000.00080000 name: 'flashprom' Node 0xf00dfec4 .node: f00dfec4 manufacturer#: 00000017 implementation#: 00000011 mask#: 000000a0 sparc-version: 00000009 ecache-associativity: 00000001 ecache-line-size: 00000040 ecache-size: 00800000 #dtlb-entries: 00000040 dcache-associativity: 00000001 dcache-line-size: 00000020 dcache-size: 00004000 #itlb-entries: 00000040 icache-associativity: 00000002 icache-line-size: 00000020 icache-size: 00004000 upa-portid: 00000006 clock-frequency: 17d78400 rated-frequency: 17d78400 reg: 000001cc.00000000.00000000.00000008 board#: 00000003 device_type: 'cpu' name: 'SUNW,UltraSPARC-II' Node 0xf00e0284 .node: f00e0284 manufacturer#: 00000017 implementation#: 00000011 mask#: 000000a0 sparc-version: 00000009 ecache-associativity: 00000001 ecache-line-size: 00000040 ecache-size: 00800000 #dtlb-entries: 00000040 dcache-associativity: 00000001 dcache-line-size: 00000020 dcache-size: 00004000 #itlb-entries: 00000040 icache-associativity: 00000002 icache-line-size: 00000020 icache-size: 00004000 upa-portid: 00000007 clock-frequency: 17d78400 rated-frequency: 17d78400 reg: 000001ce.00000000.00000000.00000008 board#: 00000003 device_type: 'cpu' name: 'SUNW,UltraSPARC-II' Node 0xf006f7bc .node: 
f006f7bc ranges: 00000001.00000000.000001c5.10000000.10000000.00000002.00000000.000001c5.20000000.10000000.0000000d.00000000.000001c5.d0000000.10000000 interrupts: 000000b4.000000b5.000000b6.000000a5.000000aa.000000b7 version#: 00000001 implementation#: 00000000 bus-parity-generated: address: ffdb2000 scsi-initiator-id: 00000007 model: 'SUNW,sysio' reg: 000001c4.00000000.00000000.00006000 slot-address-bits: 0000001c up-burst-sizes: 0078007f burst-sizes: 00f8007f device_type: 'sbus' name: 'sbus' upa-portid: 00000002 clock-frequency: 017d7840 board#: 00000001 Node 0xf0075084 .node: f0075084 wwn: 20040800.20b6eee2 intr: 00000003.00000000 interrupts: 00000022 ranges: 00000000.00000000.0000000d.00010240.00000018.00000001.00000000.0000000d.00010258.00000018.00000010.00000000.0000000d.00010300.00000008.00000011.00000000.0000000d.00010308.00000008 reg: 0000000d.00010000.00010018 device_type: 'socal' version: '@(#) FCode 1.12 99/07/30' manufacturer: 'SUNW' model: '501-3060' name: 'SUNW,socal' Node 0xf007c8ec .node: f007c8ec port-wwn: 20050800.20b6eee2 reg: 00000000.00000000.00000018.00000010.00000000.00000008 port#: 00000000 #address-cells: 00000004 device_type: 'scsi-3' name: 'sf' Node 0xf007e704 .node: f007e704 device_type: 'block' name: 'ssd' Node 0xf007efd4 .node: f007efd4 port-wwn: 20060800.20b6eee2 reg: 00000001.00000000.00000018.00000011.00000000.00000008 port#: 00000001 #address-cells: 00000004 device_type: 'scsi-3' name: 'sf' Node 0xf007f670 .node: f007f670 device_type: 'block' name: 'ssd' Node 0xf0080050 .node: f0080050 local-mac-address: 080020ee.2248 gem-rev: 00000000 burst-sizes: 0078007f shared-pins: 'serdes' board-rev: 00000005 interrupts: 00000004 compatible: 'SUNW,sbus-gem' model: 'SUNW,sbus-gem' has-fcode: ' ' version: '1.7' device_type: 'network' address-bits: 00000030 max-frame-size: 00004000 reg: 00000001.00100000.00000014.00000001.00200000.00009060 name: 'network' Node 0xf0086420 .node: f0086420 scsi-initiator-id: 00000007 isp-fcode: '1.21 95/05/18' 
device_type: 'scsi' intr: 00000003.00000000 interrupts: 00000003 wide: 00 clock-frequency: 02625a00 reg: 00000002.00010000.00000450 64-bit-clean: 00 model: 'QLGC,ISP1000' name: 'QLGC,isp' Node 0xf008bc8c .node: f008bc8c device_type: 'block' name: 'sd' Node 0xf008c4a0 .node: f008c4a0 device_type: 'byte' name: 'st' Node 0xf0071c1c .node: f0071c1c board-type: 'dual-sbus-soc+' manfid#: 0000003e version#: 00000001 ranges: 00000000.00000000.000001c4.f8000000.08000000 reg: 000001c4.f8800000.00000000.00000110.000001c4.f8802000.00000000.00000010.000001c4.f8804000.00000000.00000020.000001c4.f8806000.00000000.00000020.000001c4.f8808000.00000000.00000020.000001c4.f880a000.00000000.00000020 board-model: 'SUNW,501-2558' model: 'SUNW,fhc0FA0' board#: 00000001 name: 'fhc' Node 0xf00720cc .node: f00720cc manfid#: 0000003e version#: 00000005 device_type: 'memory-controller' reg: 00000000.01000000.00008000.00000000.02000000.01000000 model: 'SUNW,ac0F9E' name: 'ac' Node 0xf0072204 .node: f0072204 interrupts: 0000003b reg: 00000000.00400000.00000010 name: 'environment' Node 0xf00722c0 .node: f00722c0 version: 46434f44.4520312e.382e3330.20323030.322f3130.2f323520.31343a30.32006950.4f535420.332e342e.33302032.3030322f.31302f32.35203134.3a303300 model: 'SUNW,525-1757' reg: 00000000.00000000.00080000 name: 'flashprom' Node 0xf00726f8 .node: f00726f8 address: ffd96000 interrupts: 0000003a reg: 00000000.00300000.00002000 model: 'mk48t59' name: 'eeprom' Node 0xf00727f0 .node: f00727f0 reg: 00000000.00500000.00000010 name: 'sbus-speed' Node 0xf00728e4 .node: f00728e4 address: ffd95c00.ffd91860.ffd93060 interrupts: 000000b0.000000b1 reg: 000001c4.00003c00.00000000.00000020.000001c4.00003860.00000000.00000010.000001c4.00003060.00000000.00000010 board#: 00000001 name: 'counter-timer' Node 0xf0072ad4 .node: f0072ad4 ranges: 00000000.00000000.000001c7.00000000.10000000.00000003.00000000.000001c7.30000000.10000000 interrupts: 000000f4.000000f5.000000f6.000000e5.000000ea.000000f7 version#: 00000001 
implementation#: 00000000 bus-parity-generated: address: ffd8a000 scsi-initiator-id: 00000007 model: 'SUNW,sysio' reg: 000001c6.00000000.00000000.00006000 slot-address-bits: 0000001c up-burst-sizes: 0078007f burst-sizes: 00f8007f device_type: 'sbus' name: 'sbus' upa-portid: 00000003 clock-frequency: 017d7840 board#: 00000001 Node 0xf008d070 .node: f008d070 hm-rev: 00000022 device_type: 'network' intr: 00000004.00000000 interrupts: 00000004 address-bits: 00000030 max-frame-size: 00004000 reg: 00000003.08c00000.00000108.00000003.08c02000.00002000.00000003.08c04000.00002000.00000003.08c06000.00002000.00000003.08c07000.00000020 name: 'SUNW,hme' Node 0xf0093c14 .node: f0093c14 hm-rev: 00000022 device_type: 'scsi' clock-frequency: 02625a00 intr: 00000003.00000000 interrupts: 00000003 reg: 00000003.08800000.00000010.00000003.08810000.00000040 name: 'SUNW,fas' Node 0xf009864c .node: f009864c device_type: 'block' name: 'sd' Node 0xf0098f08 .node: f0098f08 device_type: 'byte' name: 'st' Node 0xf0099bf4 .node: f0099bf4 local-mac-address: 08002093.7994 hm-rev: 00000022 device_type: 'network' intr: 00000004.00000000 interrupts: 00000004 address-bits: 00000030 max-frame-size: 00004000 reg: 00000000.08c00000.00000108.00000000.08c02000.00002000.00000000.08c04000.00002000.00000000.08c06000.00002000.00000000.08c07000.00000020 model: 'SUNW,sbus-qfe' version: '1.11' name: 'SUNW,qfe' Node 0xf009fba8 .node: f009fba8 local-mac-address: 08002093.7995 hm-rev: 00000022 device_type: 'network' intr: 00000004.00000000 interrupts: 00000004 address-bits: 00000030 max-frame-size: 00004000 reg: 00000000.08c10000.00000108.00000000.08c12000.00002000.00000000.08c14000.00002000.00000000.08c16000.00002000.00000000.08c17000.00000020 model: 'SUNW,sbus-qfe' version: '1.11' name: 'SUNW,qfe' Node 0xf00a5a84 .node: f00a5a84 local-mac-address: 08002093.7996 hm-rev: 00000022 device_type: 'network' intr: 00000004.00000000 interrupts: 00000004 address-bits: 00000030 max-frame-size: 00004000 reg: 
00000000.08c20000.00000108.00000000.08c22000.00002000.00000000.08c24000.00002000.00000000.08c26000.00002000.00000000.08c27000.00000020 model: 'SUNW,sbus-qfe' version: '1.11' name: 'SUNW,qfe' Node 0xf00ab960 .node: f00ab960 local-mac-address: 08002093.7997 hm-rev: 00000022 device_type: 'network' intr: 00000004.00000000 interrupts: 00000004 address-bits: 00000030 max-frame-size: 00004000 reg: 00000000.08c30000.00000108.00000000.08c32000.00002000.00000000.08c34000.00002000.00000000.08c36000.00002000.00000000.08c37000.00000020 model: 'SUNW,sbus-qfe' version: '1.11' name: 'SUNW,qfe' Node 0xf0074e94 .node: f0074e94 address: ffd7bc00.ffd77860.ffd79060 interrupts: 000000f0.000000f1 reg: 000001c6.00003c00.00000000.00000020.000001c6.00003860.00000000.00000010.000001c6.00003060.00000000.00000010 board#: 00000001 name: 'counter-timer' Node 0xf014f7bc .node: f014f7bc ranges: 00000001.00000000.000001d5.10000000.10000000.00000002.00000000.000001d5.20000000.10000000.0000000d.00000000.000001d5.d0000000.10000000 interrupts: 000002b4.000002b5.000002b6.000002a5.000002aa.000002b7 version#: 00000001 implementation#: 00000000 bus-parity-generated: address: ffd70000 scsi-initiator-id: 00000007 model: 'SUNW,sysio' reg: 000001d4.00000000.00000000.00006000 slot-address-bits: 0000001c up-burst-sizes: 0078007f burst-sizes: 00f8007f device_type: 'sbus' name: 'sbus' upa-portid: 0000000a clock-frequency: 017d7840 board#: 00000005 Node 0xf0155084 .node: f0155084 wwn: 20140800.20b6eee2 intr: 00000003.00000000 interrupts: 00000022 ranges: 00000000.00000000.0000000d.00010240.00000018.00000001.00000000.0000000d.00010258.00000018.00000010.00000000.0000000d.00010300.00000008.00000011.00000000.0000000d.00010308.00000008 reg: 0000000d.00010000.00010018 device_type: 'socal' version: '@(#) FCode 1.12 99/07/30' manufacturer: 'SUNW' model: '501-3060' name: 'SUNW,socal' Node 0xf015c8ec .node: f015c8ec port-wwn: 20150800.20b6eee2 reg: 00000000.00000000.00000018.00000010.00000000.00000008 port#: 00000000 
#address-cells: 00000004 device_type: 'scsi-3' name: 'sf' Node 0xf015e704 .node: f015e704 device_type: 'block' name: 'ssd' Node 0xf015efd4 .node: f015efd4 port-wwn: 20160800.20b6eee2 reg: 00000001.00000000.00000018.00000011.00000000.00000008 port#: 00000001 #address-cells: 00000004 device_type: 'scsi-3' name: 'sf' Node 0xf015f670 .node: f015f670 device_type: 'block' name: 'ssd' Node 0xf0160050 .node: f0160050 scsi-initiator-id: 00000007 clock-frequency: 03938700 differential: 00 isp-fcode: '1.28 99/11/08' device_type: 'scsi' intr: 00000003.00000000 interrupts: 00000003 wide: 00 fast-20: 00 reg: 00000001.00010000.00000450 64-bit-clean: 00 model: 'QLGC,ISP1000U' name: 'QLGC,isp' Node 0xf0165dc8 .node: f0165dc8 device_type: 'block' name: 'sd' Node 0xf01665b8 .node: f01665b8 device_type: 'byte' name: 'st' Node 0xf016713c .node: f016713c cache-linesize: 00000010 cache-size: 00008000 intr: 00000002.00000000 interrupts: 00000002 reg: 00000002.00010000.00000080.00000002.00020000.00000068.00000002.00030000.0000000c model: 'SUNW,501-1763-01' name: 'SUNW,SunPC' Node 0xf0151c1c .node: f0151c1c board-type: 'dual-sbus-soc+' manfid#: 0000003e version#: 00000001 ranges: 00000000.00000000.000001d4.f8000000.08000000 reg: 000001d4.f8800000.00000000.00000110.000001d4.f8802000.00000000.00000010.000001d4.f8804000.00000000.00000020.000001d4.f8806000.00000000.00000020.000001d4.f8808000.00000000.00000020.000001d4.f880a000.00000000.00000020 board-model: 'SUNW,501-2558' model: 'SUNW,fhc0FA0' board#: 00000005 name: 'fhc' Node 0xf01520cc .node: f01520cc manfid#: 0000003e version#: 00000005 device_type: 'memory-controller' reg: 00000000.01000000.00008000.00000000.02000000.01000000 model: 'SUNW,ac0F9E' name: 'ac' Node 0xf0152204 .node: f0152204 interrupts: 0000003b reg: 00000000.00400000.00000010 name: 'environment' Node 0xf01522c0 .node: f01522c0 version: 46434f44.4520312e.382e3330.20323030.322f3130.2f323520.31343a30.32006950.4f535420.332e342e.33302032.3030322f.31302f32.35203134.3a303300 model: 
'SUNW,525-1757' reg: 00000000.00000000.00080000 name: 'flashprom' Node 0xf01526f8 .node: f01526f8 address: ffd54000 interrupts: 0000003a reg: 00000000.00300000.00002000 model: 'mk48t59' name: 'eeprom' Node 0xf01527f0 .node: f01527f0 reg: 00000000.00500000.00000010 name: 'sbus-speed' Node 0xf01528e4 .node: f01528e4 address: ffd53c00.ffd4f860.ffd51060 interrupts: 000002b0.000002b1 reg: 000001d4.00003c00.00000000.00000020.000001d4.00003860.00000000.00000010.000001d4.00003060.00000000.00000010 board#: 00000005 name: 'counter-timer' Node 0xf0152ad4 .node: f0152ad4 ranges: 00000000.00000000.000001d7.00000000.10000000.00000003.00000000.000001d7.30000000.10000000 interrupts: 000002f4.000002f5.000002f6.000002e5.000002ea.000002f7 version#: 00000001 implementation#: 00000000 bus-parity-generated: address: ffd48000 scsi-initiator-id: 00000007 model: 'SUNW,sysio' reg: 000001d6.00000000.00000000.00006000 slot-address-bits: 0000001c up-burst-sizes: 0078007f burst-sizes: 00f8007f device_type: 'sbus' name: 'sbus' upa-portid: 0000000b clock-frequency: 017d7840 board#: 00000005 Node 0xf01673a8 .node: f01673a8 hm-rev: 00000022 device_type: 'network' intr: 00000004.00000000 interrupts: 00000004 address-bits: 00000030 max-frame-size: 00004000 reg: 00000003.08c00000.00000108.00000003.08c02000.00002000.00000003.08c04000.00002000.00000003.08c06000.00002000.00000003.08c07000.00000020 name: 'SUNW,hme' Node 0xf016df4c .node: f016df4c hm-rev: 00000022 device_type: 'scsi' clock-frequency: 02625a00 intr: 00000003.00000000 interrupts: 00000003 reg: 00000003.08800000.00000010.00000003.08810000.00000040 name: 'SUNW,fas' Node 0xf0172984 .node: f0172984 device_type: 'block' name: 'sd' Node 0xf0173240 .node: f0173240 device_type: 'byte' name: 'st' Node 0xf0173f2c .node: f0173f2c scsi-initiator-id: 00000007 clock-frequency: 03938700 differential: 00 isp-fcode: '1.28 99/11/08' device_type: 'scsi' intr: 00000003.00000000 interrupts: 00000003 wide: 00 fast-20: 00 reg: 00000000.00010000.00000450 
64-bit-clean: 00 model: 'QLGC,ISP1000U' name: 'QLGC,isp' Node 0xf0179ca4 .node: f0179ca4 device_type: 'block' name: 'sd' Node 0xf017a494 .node: f017a494 device_type: 'byte' name: 'st' Node 0xf0154e94 .node: f0154e94 address: ffd39c00.ffd35860.ffd37060 interrupts: 000002f0.000002f1 reg: 000001d6.00003c00.00000000.00000020.000001d6.00003860.00000000.00000010.000001d6.00003060.00000000.00000010 board#: 00000005 name: 'counter-timer' Node 0xf01bf7bc .node: f01bf7bc available: 82000000.00000000.02808000.00000000.7d7f8000.81000000.00000000.00000400.00000000.0000fc00 bus-range: 00000000.00000000 version#: 00000004 implementation#: 00000000 clock-frequency: 01f78a40 upa-portid: 0000000e interrupts: 000003b1.000003ae.000003af.000003a5.000003a8.000003b2 ranges: 00000000.00000000.00000000.000001dc.01000000.00000000.00800000.01000000.00000000.00000000.000001dc.02010000.00000000.00010000.02000000.00000000.00000000.000001dd.80000000.00000000.80000000.03000000.00000000.00000000.000001dd.80000000.00000000.80000000 address: ffd32000.ffd30000.ffd22000 reg: 000001dc.00004000.00000000.00002000.000001dc.01000000.00000000.00000100.000001dc.00000000.00000000.0000d000 board#: 00000007 model: 'SUNW,psycho' compatible: 'pci108e,8000' bus-parity-generated: #size-cells: 00000002 #address-cells: 00000003 device_type: 'pci' name: 'pci' Node 0xf01d3d84 .node: f01d3d84 assigned-addresses: 82000810.00000000.01000000.00000000.01000000.82000814.00000000.02000000.00000000.00800000 power-consumption: 00000000.00e4e1c0 reg: 00000800.00000000.00000000.00000000.00000000.02000810.00000000.00000000.00000000.01000000.02000814.00000000.00000000.00000000.00800000 compatible: 70636931.3038652c.31303030.00706369.636c6173.732c3036.38303030.00 name: 'pci108e,1000' 66mhz-capable: 00000000 udf-supported: 00000000 fast-back-to-back: 00000001 devsel-speed: 00000001 class-code: 00068000 interrupts: 00000001 max-latency: 00000019 min-grant: 0000000a revision-id: 00000001 device-id: 00001000 vendor-id: 0000108e Node 
0xf01d4058 .node: f01d4058 assigned-addresses: 82000910.00000000.02800000.00000000.00007030 compatible: 'pci108e,1001' version: '1.17' device_type: 'network' hm-rev: 000000c1 address-bits: 00000030 max-frame-size: 00004000 reg: 00000900.00000000.00000000.00000000.00000000.02000910.00000000.00000000.00000000.00007030 model: 'SUNW,cheerio' name: 'SUNW,hme' 66mhz-capable: 00000000 udf-supported: 00000000 fast-back-to-back: 00000001 devsel-speed: 00000001 class-code: 00020000 interrupts: 000003a1 max-latency: 00000005 min-grant: 0000000a revision-id: 00000001 device-id: 00001001 vendor-id: 0000108e Node 0xf01c88e0 .node: f01c88e0 available: 82800000.00000000.00002100.00000000.7fffdf00.81800000.00000000.00000440.00000000.0000fbc0 bus-range: 00000080.00000080 version#: 00000004 implementation#: 00000000 clock-frequency: 01f78a40 slot-names: 00000004.7063692d.736c6f74.203000 upa-portid: 0000000e 66mhz-capable: interrupts: 000003b0.000003ae.000003af.000003a5.000003a8.000003b2 ranges: 00800000.00000000.00000000.000001dc.01000000.00000000.00800000.01000000.00000000.00000000.000001dc.02000000.00000000.00010000.02000000.00000000.00000000.000001dd.00000000.00000000.80000000.03000000.00000000.00000000.000001dd.00000000.00000000.80000000 address: ffd18000.ffd16000.ffd08000 reg: 000001dc.00002000.00000000.00002000.000001dc.01800000.00000000.00000100.000001dc.00000000.00000000.0000d000 board#: 00000007 model: 'SUNW,psycho' compatible: 'pci108e,8000' bus-parity-generated: #size-cells: 00000002 #address-cells: 00000003 device_type: 'pci' name: 'pci' Node 0xf01e7cb8 .node: f01e7cb8 assigned-addresses: 81801020.00000000.00000400.00000000.00000020 power-consumption: 00000000.00e4e1c0 reg: 00801000.00000000.00000000.00000000.00000000.01801020.00000000.00000000.00000000.00000020 compatible: 70636939.32352c31.32333400.70636931.3130362c.33303338.00706369.636c6173.732c3063.30333030.00757362.00 name: 'usb' 66mhz-capable: 00000000 udf-supported: 00000000 fast-back-to-back: 00000000 
devsel-speed: 00000001 class-code: 000c0300 interrupts: 00000001 subsystem-vendor-id: 00000925 subsystem-id: 00001234 max-latency: 00000000 min-grant: 00000000 revision-id: 00000050 device-id: 00003038 vendor-id: 00001106 Node 0xf01e7fd4 .node: f01e7fd4 assigned-addresses: 81801120.00000000.00000420.00000000.00000020 reg: 00801100.00000000.00000000.00000000.00000000.01801120.00000000.00000000.00000000.00000020 compatible: 70636939.32352c31.32333400.70636931.3130362c.33303338.00706369.636c6173.732c3063.30333030.00757362.00 name: 'usb' 66mhz-capable: 00000000 udf-supported: 00000000 fast-back-to-back: 00000000 devsel-speed: 00000001 class-code: 000c0300 interrupts: 00000002 subsystem-vendor-id: 00000925 subsystem-id: 00001234 max-latency: 00000000 min-grant: 00000000 revision-id: 00000050 device-id: 00003038 vendor-id: 00001106 Node 0xf01e82c0 .node: f01e82c0 assigned-addresses: 82801210.00000000.00002000.00000000.00000100 reg: 00801200.00000000.00000000.00000000.00000000.02801210.00000000.00000000.00000000.00000100 compatible: 70636939.32352c31.32333400.70636931.3130362c.33313034.00706369.636c6173.732c3063.30333230.00757362.00 name: 'usb' 66mhz-capable: 00000000 udf-supported: 00000000 fast-back-to-back: 00000000 devsel-speed: 00000001 class-code: 000c0320 interrupts: 00000003 subsystem-vendor-id: 00000925 subsystem-id: 00001234 max-latency: 00000000 min-grant: 00000000 revision-id: 00000051 device-id: 00003104 vendor-id: 00001106 Node 0xf01c923c .node: f01c923c board-type: 'dual-pci' manfid#: 0000003e version#: 00000001 ranges: 00000000.00000000.000001dc.f8000000.08000000 reg: 000001dc.f8800000.00000000.00000110.000001dc.f8802000.00000000.00000010.000001dc.f8804000.00000000.00000020.000001dc.f8806000.00000000.00000020.000001dc.f8808000.00000000.00000020.000001dc.f880a000.00000000.00000020 board-model: 'SUNW,501-3023' model: 'SUNW,fhc0FA0' board#: 00000007 name: 'fhc' Node 0xf01c9718 .node: f01c9718 manfid#: 0000003e version#: 00000005 device_type: 
'memory-controller' reg: 00000000.01000000.00008000.00000000.02000000.01000000 model: 'SUNW,ac0F9E' name: 'ac' Node 0xf01c9850 .node: f01c9850 interrupts: 0000003b reg: 00000000.00400000.00000010 name: 'environment' Node 0xf01c990c .node: f01c990c version: 46434f44.4520312e.382e3330.20323030.322f3130.2f323520.31343a30.32006950.4f535420.332e302e.33302032.3030322f.31302f32.35203134.3a303300 model: 'SUNW,525-1680' reg: 00000000.00000000.00080000 name: 'flashprom' Node 0xf01c9d44 .node: f01c9d44 address: ffcfa000 interrupts: 0000003a reg: 00000000.00300000.00002000 model: 'mk48t59' name: 'eeprom' Node 0xf01c9e3c .node: f01c9e3c reg: 00000000.00500000.00000010 name: 'sbus-speed' Node 0xf01c9f28 .node: f01c9f28 address: ffcf9c00.ffcf5860.ffcf7060 interrupts: 000003ac.000003ad reg: 000001dc.00001c00.00000000.00000020.000001dc.00001860.00000000.00000010.000001dc.00001060.00000000.00000010 board#: 00000007 name: 'counter-timer' Node 0xf01ca118 .node: f01ca118 available: 82000000.00000000.00020000.00000000.7ffe0000.81000000.00000000.00000500.00000000.0000fb00 bus-range: 00000000.00000000 version#: 00000004 implementation#: 00000000 clock-frequency: 01f78a40 upa-portid: 0000000f interrupts: 000003f1.000003ee.000003ef.000003e5.000003e8.000003f2 ranges: 00000000.00000000.00000000.000001de.01000000.00000000.00800000.01000000.00000000.00000000.000001de.02010000.00000000.00010000.02000000.00000000.00000000.000001df.80000000.00000000.80000000.03000000.00000000.00000000.000001df.80000000.00000000.80000000 address: ffcf2000.ffcf0000.ffce2000 reg: 000001de.00004000.00000000.00002000.000001de.01000000.00000000.00000100.000001de.00000000.00000000.0000d000 board#: 00000007 model: 'SUNW,psycho' compatible: 'pci108e,8000' bus-parity-generated: #size-cells: 00000002 #address-cells: 00000003 device_type: 'pci' name: 'pci' Node 0xf01dc3c0 .node: f01dc3c0 assigned-addresses: 
81001810.00000000.00000400.00000000.00000100.82001814.00000000.00002000.00000000.00001000.82001830.00000000.00010000.00000000.00010000 model: 'QLGC,ISP1040B' scsi-initiator-id: 00000007 clock-frequency: 03938700 alternate-reg: 00000000.00000000.00000000.00000000.00000000.02001814.00000000.00000000.00000000.00000100.01001810.00000000.00000000.00000000.00000100 reg: 00001800.00000000.00000000.00000000.00000000.01001810.00000000.00000000.00000000.00000100.02001814.00000000.00000000.00000000.00001000.02001830.00000000.00000000.00000000.00010000 power-consumption: 00000000.00000000.00895440.00895440 manufacturer: 'QLGC' device_type: 'scsi' name: 'SUNW,isptwo' 66mhz-capable: 00000000 udf-supported: 00000000 fast-back-to-back: 00000000 devsel-speed: 00000001 class-code: 00010000 interrupts: 000003e0 max-latency: 00000000 min-grant: 00000000 revision-id: 00000002 device-id: 00001020 vendor-id: 00001077 Node 0xf01e6534 .node: f01e6534 device_type: 'block' name: 'sd' Node 0xf01e7010 .node: f01e7010 device_type: 'byte' name: 'st' Node 0xf01d320c .node: f01d320c available: 82800000.00000000.00004000.00000000.7fffc000.81800000.00000000.00000900.00000000.0000f700 bus-range: 00000080.00000080 version#: 00000004 implementation#: 00000000 clock-frequency: 03ef1480 slot-names: 00000004.7063692d.736c6f74.203100 upa-portid: 0000000f 66mhz-capable: interrupts: 000003f0.000003ee.000003ef.000003e5.000003e8.000003f2 ranges: 00800000.00000000.00000000.000001de.01000000.00000000.00800000.01000000.00000000.00000000.000001de.02000000.00000000.00010000.02000000.00000000.00000000.000001df.00000000.00000000.80000000.03000000.00000000.00000000.000001df.00000000.00000000.80000000 address: ffcd8000.ffcd6000.ffcc8000 reg: 000001de.00002000.00000000.00002000.000001de.01800000.00000000.00000100.000001de.00000000.00000000.0000d000 board#: 00000007 model: 'SUNW,psycho' compatible: 'pci108e,8000' bus-parity-generated: #size-cells: 00000002 #address-cells: 00000003 device_type: 'pci' name: 'pci' Node 
0xf01e86d0 .node: f01e86d0 assigned-addresses: 81801010.00000000.00000400.00000000.00000100.83801014.00000000.00002000.00000000.00002000.8180101c.00000000.00000800.00000000.00000100 power-consumption: 00000000.00e4e1c0 reg: 00801000.00000000.00000000.00000000.00000000.01801010.00000000.00000000.00000000.00000100.03801014.00000000.00000000.00000000.00002000.0180101c.00000000.00000000.00000000.00000100 compatible: 70636939.3030352c.34340070.63693930.30352c38.30313700.70636963.6c617373.2c303130.30303000.73637369.00 name: 'scsi' 66mhz-capable: 00000001 udf-supported: 00000000 fast-back-to-back: 00000000 devsel-speed: 00000002 class-code: 00010000 interrupts: 00000001 subsystem-vendor-id: 00009005 subsystem-id: 00000044 max-latency: 00000019 min-grant: 00000028 revision-id: 00000010 device-id: 00008017 vendor-id: 00009005 Node 0xf01d3b68 .node: f01d3b68 address: ffcc7c00.ffcc3860.ffcc5060 interrupts: 000003ec.000003ed reg: 000001de.00001c00.00000000.00000020.000001de.00001860.00000000.00000010.000001de.00001060.00000000.00000010 board#: 00000007 name: 'counter-timer' -- Meelis Roos (mroos@linux.ee) ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88 2012-02-13 7:45 ` OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88 Meelis Roos @ 2012-02-13 8:06 ` Grant Likely 2012-02-13 9:20 ` Meelis Roos 2012-02-13 9:50 ` Meelis Roos 2012-03-01 12:24 ` [tip:core/urgent] memblock: Fix size aligning of memblock_alloc_base_nid() tip-bot for Tejun Heo 1 sibling, 2 replies; 46+ messages in thread From: Grant Likely @ 2012-02-13 8:06 UTC (permalink / raw) To: Meelis Roos; +Cc: Rob Herring, sparclinux, Linux Kernel list On Mon, Feb 13, 2012 at 09:45:40AM +0200, Meelis Roos wrote: > (Resend with proper To-s for OF people) > > This is my first post-3.2 test on 2-CPU Sun Enterprise 3500 (PCI+SBus > IO). prtconf is also below. Something OF-related seems to be happening > here. > > [ 0.000000] PROMLIB: Sun IEEE Boot Prom 'OBP 3.2.30 2002/10/25 14:03' > [ 0.000000] PROMLIB: Root node compatible: > [ 0.000000] Initializing cgroup subsys cpu > [ 0.000000] Linux version 3.3.0-rc3-00188-g3ec1e88 (mroos@korvits) (gcc version 4.6.2 (Debian 4.6.2-14) ) #64 SMP Sun Feb 12 22:26:40 EET 2012 > [ 0.000000] debug: ignoring loglevel setting. > [ 0.000000] bootconsole [earlyprom0] enabled > [ 0.000000] ARCH: SUN4U > [ 0.000000] Ethernet address: 08:00:20:b6:ee:e2 > [ 0.000000] Kernel: Using 4 locked TLB entries for main kernel image. > [ 0.000000] Remapping the kernel... done. > [ 0.000000] Unable to handle kernel NULL pointer dereference > [ 0.000000] tsk->{mm,active_mm}->context = 0000000000000000 > [ 0.000000] tsk->{mm,active_mm}->pgd = fffff800008c77d0 > [ 0.000000] \|/ ____ \|/ > [ 0.000000] "@'/ .. 
\`@" > [ 0.000000] /_| \__/ |_\ > [ 0.000000] \__U_/ > [ 0.000000] swapper(0): Oops [#1] > [ 0.000000] TSTATE: 0000000080e01607 TPC: 00000000006459a0 TNPC: 0000000000645964 Y: 00000037 Not tainted > [ 0.000000] TPC: <of_find_node_by_path+0x60/0x80> > [ 0.000000] g0: 0000000000000000 g1: 0000000000000001 g2: 00000000000000ff g3: 00000000000000f0 > [ 0.000000] g4: 0000000000853fd0 g5: 0000000000000000 g6: 0000000000834000 g7: 0000000000000050 > [ 0.000000] o0: 0000000000000001 o1: fffff8007fced7c0 o2: 0000000001010101 o3: 0000000080808080 > [ 0.000000] o4: fffff8007fcc0a4d o5: 00000000000199b5 sp: 0000000000837231 ret_pc: 0000000000645970 > [ 0.000000] RPC: <of_find_node_by_path+0x30/0x80> > [ 0.000000] l0: 00000000008ab400 l1: fffff8007fcc1f40 l2: 000000000085c5ec l3: 0000000000000025 > [ 0.000000] l4: 00000000005c0400 l5: 00000000008fa5e6 l6: 0000000000000006 l7: 0028280000000000 > [ 0.000000] i0: fffff8007fced7c0 i1: 0000000000808fd8 i2: 0000000001010101 i3: 0000000080808080 > [ 0.000000] i4: 0000000000876c00 i5: 0000000000000050 i6: 00000000008372e1 i7: 000000000064684c > [ 0.000000] I7: <of_alias_scan+0xcc/0x1c0> > [ 0.000000] Call Trace: > [ 0.000000] [000000000064684c] of_alias_scan+0xcc/0x1c0 > [ 0.000000] [00000000008a0350] of_pdt_build_devicetree+0x90/0xa0 > [ 0.000000] [000000000088c540] prom_build_devicetree+0x10/0x3c > [ 0.000000] [00000000008904d4] paging_init+0x59c/0x6bc > [ 0.000000] [000000000088bebc] setup_arch+0xf8/0x110 > [ 0.000000] [000000000088a51c] start_kernel+0x8c/0x34c Try the following patch. 
I suspect the new of_alias_scan() isn't careful enough about which properties it dereferences: --- diff --git a/drivers/of/base.c b/drivers/of/base.c index 133908a..9188caa 100644 --- a/drivers/of/base.c +++ b/drivers/of/base.c @@ -1174,6 +1174,10 @@ void of_alias_scan(void * (*dt_alloc)(u64 size, u64 align)) !strcmp(pp->name, "linux,phandle")) continue; + /* Check for null value or non-strings (no null termination) */ + if (!pp->value || strnlen(pp->value, pp->length) == pp->length) + continue; + np = of_find_node_by_path(pp->value); if (!np) continue; ^ permalink raw reply related [flat|nested] 46+ messages in thread
* Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88 2012-02-13 8:06 ` Grant Likely @ 2012-02-13 9:20 ` Meelis Roos 2012-02-13 21:46 ` Grant Likely 2012-02-13 9:50 ` Meelis Roos 1 sibling, 1 reply; 46+ messages in thread From: Meelis Roos @ 2012-02-13 9:20 UTC (permalink / raw) To: Grant Likely; +Cc: Rob Herring, sparclinux, Linux Kernel list > Try the following patch. I suspect the new of_alias_scan() isn't careful > enough about which properties it dereferences: > > --- > > diff --git a/drivers/of/base.c b/drivers/of/base.c > index 133908a..9188caa 100644 > --- a/drivers/of/base.c > +++ b/drivers/of/base.c > @@ -1174,6 +1174,10 @@ void of_alias_scan(void * (*dt_alloc)(u64 size, u64 align)) > !strcmp(pp->name, "linux,phandle")) > continue; > > + /* Check for null value or non-strings (no null termination) */ > + if (!pp->value || strnlen(pp->value, pp->length) == pp->length) > + continue; > + > np = of_find_node_by_path(pp->value); > if (!np) > continue; > Yes, it probably gets past this problem but oopses in a different place: [ 0.000000] PROMLIB: Sun IEEE Boot Prom 'OBP 3.2.30 2002/10/25 14:03' [ 0.000000] PROMLIB: Root node compatible: [ 0.000000] Initializing cgroup subsys cpu [ 0.000000] Linux version 3.3.0-rc3-00188-g3ec1e88-dirty (mroos@korvits) (gcc version 4.6.2 (Debian 42 [ 0.000000] debug: ignoring loglevel setting. [ 0.000000] bootconsole [earlyprom0] enabled [ 0.000000] ARCH: SUN4U [ 0.000000] Ethernet address: 08:00:20:b6:ee:e2 [ 0.000000] Kernel: Using 4 locked TLB entries for main kernel image. [ 0.000000] Remapping the kernel... done. [ 0.000000] Unable to handle kernel NULL pointer dereference [ 0.000000] tsk->{mm,active_mm}->context = 0000000000000000 [ 0.000000] tsk->{mm,active_mm}->pgd = fffff800008c77d0 [ 0.000000] \|/ ____ \|/ [ 0.000000] "@'/ .. 
\`@" [ 0.000000] /_| \__/ |_\ [ 0.000000] \__U_/ [ 0.000000] swapper(0): Oops [#1] [ 0.000000] TSTATE: 0000000080e01606 TPC: 0000000000645810 TNPC: 0000000000645814 Y: 00000037 Not d [ 0.000000] TPC: <of_find_node_by_phandle+0x30/0x60> [ 0.000000] g0: 0000000000837b88 g1: 00000000fffff800 g2: 0000000000000000 g3: 0000000000000002 [ 0.000000] g4: 0000000000853fd0 g5: 0000000000000000 g6: 0000000000834000 g7: 0000000000000050 [ 0.000000] o0: 0000000000876cf0 o1: fffff8007fcc0900 o2: 0000000001010101 o3: 0000000080808080 [ 0.000000] o4: 000000000000000e o5: 000000000086c000 sp: 0000000000837301 ret_pc: 00000000006457e8 [ 0.000000] RPC: <of_find_node_by_phandle+0x8/0x60> [ 0.000000] l0: 0000000000808fd8 l1: 0000000000876d28 l2: 000000000072a800 l3: 0000000000000080 [ 0.000000] l4: 0000000000000013 l5: 0000000000000013 l6: 0000000000000000 l7: 0000000000000281 [ 0.000000] i0: 00000000f005de3c i1: ffffffffffdc1428 i2: 0000000000000100 i3: 0000000000000004 [ 0.000000] i4: 0000000000000050 i5: 0000000000876c00 i6: 00000000008373b1 i7: 000000000088cd10 [ 0.000000] I7: <of_console_init+0xa4/0x144> [ 0.000000] Call Trace: [ 0.000000] [000000000088cd10] of_console_init+0xa4/0x144 [ 0.000000] [000000000088c548] prom_build_devicetree+0x18/0x3c [ 0.000000] [00000000008904d4] paging_init+0x59c/0x6bc [ 0.000000] [000000000088bebc] setup_arch+0xf8/0x110 [ 0.000000] [000000000088a51c] start_kernel+0x8c/0x34c [ 0.000000] [00000000006fbf28] tlb_fixup_done+0xa0/0xa8 [ 0.000000] [0000000000000000] (null) [ 0.000000] Disabling lock debugging due to kernel taint [ 0.000000] Caller[000000000088cd10]: of_console_init+0xa4/0x144 [ 0.000000] Caller[000000000088c548]: prom_build_devicetree+0x18/0x3c [ 0.000000] Caller[00000000008904d4]: paging_init+0x59c/0x6bc [ 0.000000] Caller[000000000088bebc]: setup_arch+0xf8/0x110 [ 0.000000] Caller[000000000088a51c]: start_kernel+0x8c/0x34c [ 0.000000] Caller[00000000006fbf28]: tlb_fixup_done+0xa0/0xa8 [ 0.000000] Caller[0000000000000000]: (null) [ 
0.000000] Instruction DUMP: 901760f0 02c70007 901760f0 <c2072010> 80a04018 324ffffc f85f2050 9 [ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task! [ 0.000000] Press Stop-A (L1-A) to return to the boot prom -- Meelis Roos (mroos@linux.ee) ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88 2012-02-13 9:20 ` Meelis Roos @ 2012-02-13 21:46 ` Grant Likely 2012-02-14 0:58 ` David Miller 2012-02-16 19:53 ` Meelis Roos 0 siblings, 2 replies; 46+ messages in thread From: Grant Likely @ 2012-02-13 21:46 UTC (permalink / raw) To: Meelis Roos; +Cc: Rob Herring, sparclinux, Linux Kernel list On Mon, Feb 13, 2012 at 11:20:36AM +0200, Meelis Roos wrote: > > Try the following patch. I suspect the new of_alias_scan() isn't careful > > enough about which properties it dereferences: > > > > --- > > > > diff --git a/drivers/of/base.c b/drivers/of/base.c > > index 133908a..9188caa 100644 > > --- a/drivers/of/base.c > > +++ b/drivers/of/base.c > > @@ -1174,6 +1174,10 @@ void of_alias_scan(void * (*dt_alloc)(u64 size, u64 align)) > > !strcmp(pp->name, "linux,phandle")) > > continue; > > > > + /* Check for null value or non-strings (no null termination) */ > > + if (!pp->value || strnlen(pp->value, pp->length) == pp->length) > > + continue; > > + > > np = of_find_node_by_path(pp->value); > > if (!np) > > continue; > > > > Yes, it probably gets past this problem but oopses in a different place: > > [ 0.000000] PROMLIB: Sun IEEE Boot Prom 'OBP 3.2.30 2002/10/25 14:03' > [ 0.000000] PROMLIB: Root node compatible: > [ 0.000000] Initializing cgroup subsys cpu > [ 0.000000] Linux version 3.3.0-rc3-00188-g3ec1e88-dirty (mroos@korvits) (gcc version 4.6.2 (Debian 42 > [ 0.000000] debug: ignoring loglevel setting. > [ 0.000000] bootconsole [earlyprom0] enabled > [ 0.000000] ARCH: SUN4U > [ 0.000000] Ethernet address: 08:00:20:b6:ee:e2 > [ 0.000000] Kernel: Using 4 locked TLB entries for main kernel image. > [ 0.000000] Remapping the kernel... done. > [ 0.000000] Unable to handle kernel NULL pointer dereference > [ 0.000000] tsk->{mm,active_mm}->context = 0000000000000000 > [ 0.000000] tsk->{mm,active_mm}->pgd = fffff800008c77d0 > [ 0.000000] \|/ ____ \|/ > [ 0.000000] "@'/ .. 
\`@" > [ 0.000000] /_| \__/ |_\ > [ 0.000000] \__U_/ > [ 0.000000] swapper(0): Oops [#1] > [ 0.000000] TSTATE: 0000000080e01606 TPC: 0000000000645810 TNPC: 0000000000645814 Y: 00000037 Not d > [ 0.000000] TPC: <of_find_node_by_phandle+0x30/0x60> Ugh; that looks bad. If it failed there, then the global device node list is corrupted. I hate to ask you this, but would you be able to git bisect to narrow down the commit that causes the problem? g. ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88 2012-02-13 21:46 ` Grant Likely @ 2012-02-14 0:58 ` David Miller 2012-02-14 2:30 ` Grant Likely 2012-02-14 5:54 ` mroos 2012-02-16 19:53 ` Meelis Roos 1 sibling, 2 replies; 46+ messages in thread From: David Miller @ 2012-02-14 0:58 UTC (permalink / raw) To: grant.likely; +Cc: mroos, rob.herring, sparclinux, linux-kernel From: Grant Likely <grant.likely@secretlab.ca> Date: Mon, 13 Feb 2012 14:46:23 -0700 > Ugh; that looks bad. If it failed there, then the global device node list > is corrupted. I hate to ask you this, but would you be able to git bisect to > narrow down the commit that causes the problem? Wild guess on all of these bugs, bad OF node reference counting and a OF node is free'd up prematurely. If you look at the sparc code that has been subsumed into the generic drivers/of/ stuff over the past few years, you'll see that we never consistently did any of the reference counting bits on the sparc side. I never did it, because I don't anticipate ever having hot-plug support for OF nodes. Anyways, if you now start to mix the drivers/of/ stuff which religiously does the reference counting with of_node_{get,put}() with the remaining scraps of sparc code that doesn't... it might not be pretty. In the crash dump after your test patch, we are in of_find_node_by_phandle() with a 'np' pointer in the allnodes list equal to 0x50. The signature in the original crash dump is identical, except that time we were in of_find_node_by_path(), but again the 'np' pointer was 0x50. Something else that might be suspicious were the memblock changes that happened this release cycle, so I wouldn't be surprised if a bisect turned up something in there. FWIW I've been running current kernels on my niagara boxes without incident for several weeks. ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88 2012-02-14 0:58 ` David Miller @ 2012-02-14 2:30 ` Grant Likely 2012-02-14 2:41 ` Grant Likely 2012-02-16 21:08 ` mroos 2012-02-14 5:54 ` mroos 1 sibling, 2 replies; 46+ messages in thread From: Grant Likely @ 2012-02-14 2:30 UTC (permalink / raw) To: David Miller; +Cc: mroos, rob.herring, sparclinux, linux-kernel On Mon, Feb 13, 2012 at 5:58 PM, David Miller <davem@davemloft.net> wrote: > From: Grant Likely <grant.likely@secretlab.ca> > Date: Mon, 13 Feb 2012 14:46:23 -0700 > >> Ugh; that looks bad. If it failed there, then the global device node list >> is corrupted. I hate to ask you this, but would you be able to git bisect to >> narrow down the commit that causes the problem? > > Wild guess on all of these bugs, bad OF node reference counting and a > OF node is free'd up prematurely. > > If you look at the sparc code that has been subsumed into the generic > drivers/of/ stuff over the past few years, you'll see that we never > consistently did any of the reference counting bits on the sparc side. Hmmm.... The of_node_put() code path shouldn't exist on sparc. You'll see that it is #ifdef'd out in include/linux/of.h. Plus, only 'OF_DETACHED' nodes are allowed to be released, and there are only 3 code paths (all calling of_detach_node()) specific to powerpc that can detach a node. > I never did it, because I don't anticipate ever having hot-plug > support for OF nodes. > > Anyways, if you now start to mix the drivers/of/ stuff which > religiously does the reference counting with of_node_{get,put}() > with the remaining scraps of sparc code that doesn't... it might > not be pretty. > > In the crash dump after your test patch, we are in > of_find_node_by_phandle() with a 'np' pointer in the allnodes list > equal to 0x50. Definitely not right! It would be interesting to add a printk() to of_find_node_by_phandle() or of_find_node_by_path() to blast out the node names as it traverses the tree. 
That could help track down corruption. > > The signature in the original crash dump is identical, except > that time we were in of_find_node_by_path(), but again the 'np' > pointer was 0x50. > > Something else that might be suspicious were the memblock changes > that happened this release cycle, so I wouldn't be surprised if > a bisect turned up something in there. > > FWIW I've been running current kernels on my niagara boxes without > incident for several weeks. > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- Grant Likely, B.Sc., P.Eng. Secret Lab Technologies Ltd. ^ permalink raw reply [flat|nested] 46+ messages in thread
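The tracing suggested in the message above can be sketched in userspace as a list walk that logs each node before comparing it. This is only an illustration: `struct node`, its field names, and `find_node_by_path_dbg()` are hypothetical and merely loosely modelled on a singly linked "all nodes" list. In the kernel the output would go through printk() rather than fprintf().

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical node type: a path string plus a link to the next node. */
struct node {
    const char *full_name;
    struct node *allnext;
};

/* Walk the list looking for 'path', logging every node visited.
 * The pointer is printed before it is dereferenced, so the last line
 * of output identifies the link that went bad if the walk faults. */
static struct node *find_node_by_path_dbg(struct node *head, const char *path)
{
    struct node *np;

    for (np = head; np; np = np->allnext) {
        fprintf(stderr, "visit np=%p\n", (void *)np);
        fprintf(stderr, "  name=%s\n",
                np->full_name ? np->full_name : "(null)");
        if (np->full_name && strcmp(np->full_name, path) == 0)
            return np;
    }
    return NULL;
}
```

With a bogus link such as the 0x50 value seen in the oops, the walk would still fault on the dereference, but the pointer line printed just before it shows which node's link was overwritten.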
* Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88 2012-02-14 2:30 ` Grant Likely @ 2012-02-14 2:41 ` Grant Likely 2012-02-16 21:08 ` mroos 1 sibling, 0 replies; 46+ messages in thread From: Grant Likely @ 2012-02-14 2:41 UTC (permalink / raw) To: David Miller; +Cc: mroos, rob.herring, sparclinux, linux-kernel On Mon, Feb 13, 2012 at 7:30 PM, Grant Likely <grant.likely@secretlab.ca> wrote: > On Mon, Feb 13, 2012 at 5:58 PM, David Miller <davem@davemloft.net> wrote: >> From: Grant Likely <grant.likely@secretlab.ca> >> Date: Mon, 13 Feb 2012 14:46:23 -0700 >> >>> Ugh; that looks bad. If it failed there, then the global device node list >>> is corrupted. I hate to ask you this, but would you be able to git bisect to >>> narrow down the commit that causes the problem? >> >> Wild guess on all of these bugs, bad OF node reference counting and a >> OF node is free'd up prematurely. >> >> If you look at the sparc code that has been subsumed into the generic >> drivers/of/ stuff over the past few years, you'll see that we never >> consistently did any of the reference counting bits on the sparc side. > > Hmmm.... The of_node_put() code path shouldn't exist on sparc. You'll > see that it is #ifdef'd out in include/linux/of.h. Plus, only > 'OF_DETACHED' nodes are allowed to be released, and there are only 3 > code paths (all calling of_detach_node()) specific to powerpc that can > detach a node. In fact, I should disable those paths always when CONFIG_OF_DYNAMIC is disabled. I'll look into doing so for v3.4. g. ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88 2012-02-14 2:30 ` Grant Likely 2012-02-14 2:41 ` Grant Likely @ 2012-02-16 21:08 ` mroos 1 sibling, 0 replies; 46+ messages in thread From: mroos @ 2012-02-16 21:08 UTC (permalink / raw) To: Grant Likely; +Cc: David Miller, rob.herring, sparclinux, linux-kernel > Definitely not right! It would be interesting to add a printk() to > of_find_node_by_phandle() or of_find_node_by_path() to blast out the > node names as it traverses the tree. That could help track down > corruption. [ 0.000000] of_find_node_by_path: /chosen [ 0.000000] of_find_node_by_path: /aliases ¥_6䥷~ê7\eý+õï*¢ꢏñ?¿sM ý{ aliases000000] ò7find_node_by_path: ðÑÔ_Bÿ [ 0.000000] Unable to handle kernel NULL pointer dereference -- Meelis Roos (mroos@linux.ee) ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88 2012-02-14 0:58 ` David Miller 2012-02-14 2:30 ` Grant Likely @ 2012-02-14 5:54 ` mroos 1 sibling, 0 replies; 46+ messages in thread From: mroos @ 2012-02-14 5:54 UTC (permalink / raw) To: David Miller; +Cc: grant.likely, rob.herring, sparclinux, Linux Kernel list > FWIW I've been running current kernels on my niagara boxes without > incident for several weeks. It runs for me on Ultra 1, Ultra 5 IDE, Ultra 10 SCSI and Blade 100. Fails on E3500, V100 and Netra X1 so it's probably dependent on something in the device tree. I will try bisecting and the suggested printk's but it takes time since I will be away from computers most of today. -- Meelis Roos (mroos@linux.ee) ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88 2012-02-13 21:46 ` Grant Likely 2012-02-14 0:58 ` David Miller @ 2012-02-16 19:53 ` Meelis Roos 2012-02-16 21:23 ` Sam Ravnborg 2012-02-20 9:11 ` Meelis Roos 1 sibling, 2 replies; 46+ messages in thread From: Meelis Roos @ 2012-02-16 19:53 UTC (permalink / raw) To: Grant Likely; +Cc: Rob Herring, sparclinux, Linux Kernel list > Ugh; that looks bad. If it failed there, then the global device node list > is corrupted. I hate to ask you this, but would you be able to git bisect to > narrow down the commit that causes the problem? Finished bisecting on E2500 (the original machine where I found the problem). Bisecting leads to [0ee332c1451869963626bf9cac88f165a90990e1] memblock: Kill early_node_map[] So yes, it looks like memblock. -- Meelis Roos (mroos@linux.ee) ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88 2012-02-16 19:53 ` Meelis Roos @ 2012-02-16 21:23 ` Sam Ravnborg 2012-02-20 9:11 ` Meelis Roos 1 sibling, 0 replies; 46+ messages in thread From: Sam Ravnborg @ 2012-02-16 21:23 UTC (permalink / raw) To: Meelis Roos, Tejun Heo Cc: Grant Likely, Rob Herring, sparclinux, Linux Kernel list On Thu, Feb 16, 2012 at 09:53:14PM +0200, Meelis Roos wrote: > > Ugh; that looks bad. If it failed there, then the global device node list > > is corrupted. I hate to ask you this, but would you be able to git bisect to > > narrow down the commit that causes the problem? > > Finished bisecting on E2500 (the original machine where I found the > problem). Bisecting leads to > [0ee332c1451869963626bf9cac88f165a90990e1] memblock: Kill early_node_map[] > So yes, it looks like memblock. Added Tejun. Sam ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88 2012-02-16 19:53 ` Meelis Roos 2012-02-16 21:23 ` Sam Ravnborg @ 2012-02-20 9:11 ` Meelis Roos 2012-02-20 17:06 ` Tejun Heo 1 sibling, 1 reply; 46+ messages in thread From: Meelis Roos @ 2012-02-20 9:11 UTC (permalink / raw) To: Tejun Heo; +Cc: Grant Likely, Rob Herring, sparclinux, Linux Kernel list > So yes, it looks like memblock. Finished bisecting on the other machine too (Sun Fire V100 where strlen crashes): 7bd0b0f0da3b1ec11cbcc798eb0ef747a1184077 is the first bad commit commit 7bd0b0f0da3b1ec11cbcc798eb0ef747a1184077 Author: Tejun Heo <tj@kernel.org> Date: Thu Dec 8 10:22:09 2011 -0800 memblock: Reimplement memblock allocation using reverse free area iterator Now that all early memory information is in memblock when enabled, we can implement reverse free area iterator and use it to implement NUMA aware allocator which is then wrapped for simpler variants instead of the confusing and inefficient mending of information in separate NUMA aware allocator. Implement for_each_free_mem_range_reverse(), use it to reimplement memblock_find_in_range_node() which in turn is used by all allocators. The visible allocator interface is inconsistent and can probably use some cleanup too. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Yinghai Lu <yinghai@kernel.org> :040000 040000 f74f55a80162a0a1a45c135ca62a51b9af824d53 a2dc2bccf4a30ee516709d0fdcb33faae11059ff M include :040000 040000 e4c4292fe66c4d8d6aa89710ce9f538fbf550ae8 5677586fad018ae9978d53084ba5d617fe231a3d M mm -- Meelis Roos (mroos@linux.ee) ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88 2012-02-20 9:11 ` Meelis Roos @ 2012-02-20 17:06 ` Tejun Heo 2012-02-20 20:04 ` Meelis Roos 2012-02-20 22:32 ` Meelis Roos 0 siblings, 2 replies; 46+ messages in thread From: Tejun Heo @ 2012-02-20 17:06 UTC (permalink / raw) To: Meelis Roos; +Cc: Grant Likely, Rob Herring, sparclinux, Linux Kernel list, sam Hello, Meelis, Sam. Sorry about the delay. I've been pretty swamped lately. On Mon, Feb 20, 2012 at 11:11:05AM +0200, Meelis Roos wrote: > Finished bisecting on the other machine too (Sun Fire V100 where strlen > crashes): > > 7bd0b0f0da3b1ec11cbcc798eb0ef747a1184077 is the first bad commit > commit 7bd0b0f0da3b1ec11cbcc798eb0ef747a1184077 > Author: Tejun Heo <tj@kernel.org> > Date: Thu Dec 8 10:22:09 2011 -0800 > > memblock: Reimplement memblock allocation using reverse free area iterator > > Now that all early memory information is in memblock when enabled, we > can implement reverse free area iterator and use it to implement NUMA > aware allocator which is then wrapped for simpler variants instead of > the confusing and inefficient mending of information in separate NUMA > aware allocator. > > Implement for_each_free_mem_range_reverse(), use it to reimplement > memblock_find_in_range_node() which in turn is used by all allocators. > > The visible allocator interface is inconsistent and can probably use > some cleanup too. > > Signed-off-by: Tejun Heo <tj@kernel.org> > Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> > Cc: Yinghai Lu <yinghai@kernel.org> Hmmm.... So, different bisection results from two machines? That's a bit weird. I *think* this bisection result makes more sense. Can you please verify the bisection result on e2500 once more? Thanks. -- tejun ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88 2012-02-20 17:06 ` Tejun Heo @ 2012-02-20 20:04 ` Meelis Roos 2012-02-20 21:01 ` Tejun Heo 2012-02-20 22:32 ` Meelis Roos 1 sibling, 1 reply; 46+ messages in thread From: Meelis Roos @ 2012-02-20 20:04 UTC (permalink / raw) To: Tejun Heo; +Cc: Grant Likely, Rob Herring, sparclinux, Linux Kernel list, sam > Hmmm.... So, different bisection results from two machines? That's a > bit weird. I *think* this bisection result makes more sense. Can you > please verify the bisection result on e2500 once more? Will do. -- Meelis Roos (mroos@linux.ee) ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88 2012-02-20 20:04 ` Meelis Roos @ 2012-02-20 21:01 ` Tejun Heo 0 siblings, 0 replies; 46+ messages in thread From: Tejun Heo @ 2012-02-20 21:01 UTC (permalink / raw) To: Meelis Roos; +Cc: Grant Likely, Rob Herring, sparclinux, Linux Kernel list, sam Hello, On Mon, Feb 20, 2012 at 10:04:10PM +0200, Meelis Roos wrote: > > Hmmm.... So, different bisection results from two machines? That's a > > bit weird. I *think* this bisection result makes more sense. Can you > > please verify the bisection result on e2500 once more? > > Will do. Thanks a lot. I *suspect* that the memory backing the device tree is somehow not fully reserved and that the changed allocation logic is handing it out as part of a later allocation. I'll look through the change more and see if I can spot a bug in the new code, but I guess we'll probably have to print out some pointer values to find the offending address. Thanks. -- tejun ^ permalink raw reply [flat|nested] 46+ messages in thread
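For context, the patch at the top of this thread attributes the regression to the size no longer being rounded up to the alignment before the region is reserved. The arithmetic behind that can be sketched as below. The helper names are hypothetical. `round_up_pow2()` is written to give the same result as the kernel's round_up() for power-of-two alignments, which is an assumption about how the allocator is called rather than something taken from the memblock source.

```c
#include <stdint.h>

/* Round x up to the next multiple of align (align must be a power of two). */
static uint64_t round_up_pow2(uint64_t x, uint64_t align)
{
    return (x + align - 1) & ~(align - 1);
}

/* Number of bytes between the raw size and the aligned size. If only
 * the raw size is reserved, this tail stays free and can be handed to
 * the next caller, so code that writes past its requested size lands
 * in somebody else's allocation. */
static uint64_t unreserved_tail(uint64_t size, uint64_t align)
{
    return round_up_pow2(size, align) - size;
}
```

For a 100-byte request with 64-byte alignment the aligned size is 128, leaving a 28-byte tail that is covered only when the rounded size is the one passed to the reserve step.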
* Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88
  2012-02-20 17:06 ` Tejun Heo
  2012-02-20 20:04 ` Meelis Roos
@ 2012-02-20 22:32 ` Meelis Roos
  2012-02-21 1:05 ` Tejun Heo
  1 sibling, 1 reply; 46+ messages in thread
From: Meelis Roos @ 2012-02-20 22:32 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Grant Likely, Rob Herring, sparclinux, Linux Kernel list, sam

> On Mon, Feb 20, 2012 at 11:11:05AM +0200, Meelis Roos wrote:
> > Finished bisecting on the other machine too (Sun Fire V100 where strlen
> > crashes):
> >
> > 7bd0b0f0da3b1ec11cbcc798eb0ef747a1184077 is the first bad commit
> > commit 7bd0b0f0da3b1ec11cbcc798eb0ef747a1184077
> > Author: Tejun Heo <tj@kernel.org>
> > Date:   Thu Dec 8 10:22:09 2011 -0800
> >
> >     memblock: Reimplement memblock allocation using reverse free area iterator
> >
> >     Now that all early memory information is in memblock when enabled, we
> >     can implement reverse free area iterator and use it to implement NUMA
> >     aware allocator which is then wrapped for simpler variants instead of
> >     the confusing and inefficient mending of information in separate NUMA
> >     aware allocator.
> >
> >     Implement for_each_free_mem_range_reverse(), use it to reimplement
> >     memblock_find_in_range_node() which in turn is used by all allocators.
> >
> >     The visible allocator interface is inconsistent and can probably use
> >     some cleanup too.
> >
> >     Signed-off-by: Tejun Heo <tj@kernel.org>
> >     Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> >     Cc: Yinghai Lu <yinghai@kernel.org>
>
> Hmmm.... So, different bisection results from two machines? That's a
> bit weird. I *think* this bisection result makes more sense. Can you
> please verify the bisection result on e2500 once more?

You were right. The first machine now bisects down to the same commit -
I was confused by "0 revisions to test" and did not run the last step
when first bisecting.

--
Meelis Roos (mroos@linux.ee)

^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88 2012-02-20 22:32 ` Meelis Roos @ 2012-02-21 1:05 ` Tejun Heo 2012-02-22 0:36 ` Meelis Roos 2012-02-22 17:03 ` Sam Ravnborg 0 siblings, 2 replies; 46+ messages in thread From: Tejun Heo @ 2012-02-21 1:05 UTC (permalink / raw) To: Meelis Roos Cc: Grant Likely, Rob Herring, sparclinux, Linux Kernel list, sam, David S. Miller Hello, Meelis, can you please apply the following patch before & after the offending commit, boot with "memblock=debug" added as kernel param and post the boot log? The patch will generate some offset warnings after the commit but should work fine. Sam, David, as I'm not familiar with the code base, is it possible to tell which address is corrupted (zeroed, it seems)? i.e. can we add "if (XXX == NULL) printk("%p is corrupted\n"...);" somewhere? Thanks. diff --git a/mm/memblock.c b/mm/memblock.c index 1adbef0..dccfced 100644 --- a/mm/memblock.c +++ b/mm/memblock.c @@ -179,9 +179,15 @@ int __init_memblock memblock_reserve_reserved_regions(void) static void __init_memblock memblock_remove_region(struct memblock_type *type, unsigned long r) { - type->total_size -= type->regions[r].size; - memmove(&type->regions[r], &type->regions[r + 1], - (type->cnt - (r + 1)) * sizeof(type->regions[r])); + struct memblock_region *rgn = &type->regions[r]; + + memblock_dbg(" memblock %s: rm [%#016llx-%#016llx] node %d\n", + memblock_type_name(type), + (unsigned long long)rgn->base, + (unsigned long long)rgn->base + rgn->size, rgn->nid); + + type->total_size -= rgn->size; + memmove(rgn, rgn + 1, (type->cnt - (r + 1)) * sizeof(*rgn)); type->cnt--; /* Special case for empty arrays */ @@ -317,6 +323,9 @@ static void __init_memblock memblock_insert_region(struct memblock_type *type, memblock_set_region_node(rgn, nid); type->cnt++; type->total_size += size; + memblock_dbg(" memblock %s: add [%#016llx-%016llx] node %d @%d\n", + memblock_type_name(type), (unsigned long long)base, + (unsigned long long)base + size, nid,
idx); } /** @@ -342,6 +351,10 @@ static int __init_memblock memblock_add_region(struct memblock_type *type, phys_addr_t end = base + memblock_cap_size(base, &size); int i, nr_new; + memblock_dbg(" memblock %s: ADD [%#016llx-%#016llx] node %d\n", + memblock_type_name(type), (unsigned long long)base, + (unsigned long long)base + size, nid); + /* special case for empty array */ if (type->regions[0].size == 0) { WARN_ON(type->cnt != 1 || type->total_size); @@ -349,6 +362,8 @@ static int __init_memblock memblock_add_region(struct memblock_type *type, type->regions[0].size = size; memblock_set_region_node(&type->regions[0], nid); type->total_size = size; + memblock_dbg(" memblock %s: add first entry\n", + memblock_type_name(type)); return 0; } repeat: @@ -494,6 +509,10 @@ static int __init_memblock __memblock_remove(struct memblock_type *type, int start_rgn, end_rgn; int i, ret; + memblock_dbg(" memblock %s: RM [%#016llx-%016llx]\n", + memblock_type_name(type), (unsigned long long)base, + (unsigned long long)base + size); + ret = memblock_isolate_range(type, base, size, &start_rgn, &end_rgn); if (ret) return ret; ^ permalink raw reply related [flat|nested] 46+ messages in thread
* Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88 2012-02-21 1:05 ` Tejun Heo @ 2012-02-22 0:36 ` Meelis Roos 2012-02-22 17:48 ` Tejun Heo 2012-02-22 18:22 ` Richard Mortimer 2012-02-22 17:03 ` Sam Ravnborg 1 sibling, 2 replies; 46+ messages in thread From: Meelis Roos @ 2012-02-22 0:36 UTC (permalink / raw) To: Tejun Heo Cc: Grant Likely, Rob Herring, sparclinux, Linux Kernel list, sam, David S. Miller [-- Attachment #1: Type: TEXT/PLAIN, Size: 645 bytes --] > Meelis, can you please apply the following patch before & after the > offending commit, boot with "memblock=debug" added as kernel param and > post the boot log? The patch will generate some offset warnings after > the commit but should work fine. Before the commit (v3.2-rc3-75-g0ee332c): memblock1.gz (attached) After the commit (v3.2-rc3-76-g7bd0b0f): memblock2.gz (attached) In addition, a third type of sparc machines breaks in a third way - V210 and V240 just hang after telling console [tty0] enabled, bootconsole disabled and before calibrating the delay loop. Bisect has led to the same commit. -- Meelis Roos (mroos@linux.ee) [-- Attachment #2: Type: APPLICATION/octet-stream, Size: 49939 bytes --] [-- Attachment #3: Type: APPLICATION/octet-stream, Size: 39513 bytes --] ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88
  2012-02-22 0:36 ` Meelis Roos
@ 2012-02-22 17:48 ` Tejun Heo
  2012-02-22 18:25 ` Meelis Roos
  2012-02-22 20:44 ` David Miller
  2012-02-22 18:22 ` Richard Mortimer
  1 sibling, 2 replies; 46+ messages in thread
From: Tejun Heo @ 2012-02-22 17:48 UTC (permalink / raw)
  To: Meelis Roos
  Cc: Grant Likely, Rob Herring, sparclinux, Linux Kernel list, sam, David S. Miller

On Wed, Feb 22, 2012 at 02:36:13AM +0200, Meelis Roos wrote:
> > Meelis, can you please apply the following patch before & after the
> > offending commit, boot with "memblock=debug" added as kernel param and
> > post the boot log? The patch will generate some offset warnings after
> > the commit but should work fine.
>
> Before the commit (v3.2-rc3-75-g0ee332c): memblock1.gz (attached)
> After the commit (v3.2-rc3-76-g7bd0b0f): memblock2.gz (attached)

Can you please try the following patch? If it still fails to boot,
please attach the failing log. Thank you.

diff --git a/mm/memblock.c b/mm/memblock.c
index 77b5f22..99f2855 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -99,9 +99,6 @@ phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t start,
 	phys_addr_t this_start, this_end, cand;
 	u64 i;
 
-	/* align @size to avoid excessive fragmentation on reserved array */
-	size = round_up(size, align);
-
 	/* pump up @end */
 	if (end == MEMBLOCK_ALLOC_ACCESSIBLE)
 		end = memblock.current_limit;
@@ -731,6 +728,9 @@ static phys_addr_t __init memblock_alloc_base_nid(phys_addr_t size,
 {
 	phys_addr_t found;
 
+	/* align @size to avoid excessive fragmentation on reserved array */
+	size = round_up(size, align);
+
 	found = memblock_find_in_range_node(0, max_addr, size, align, nid);
 	if (found && !memblock_reserve(found, size))
 		return found;

^ permalink raw reply related [flat|nested] 46+ messages in thread
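For readers following along, the effect of moving the round_up() call
can be sketched in user space. This is only an illustration, not
kernel code: round_up_pow2(), reserved_before_fix() and
reserved_after_fix() are made-up names, and the sketch assumes a
power-of-two @align, which the kernel's round_up() also requires.

```c
#include <stdint.h>

typedef uint64_t phys_addr_t;

/* Round x up to the next multiple of align (align must be a power of two). */
static phys_addr_t round_up_pow2(phys_addr_t x, phys_addr_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/*
 * Before the fix: the finder rounds only its own local copy of size,
 * so the caller still passes the unrounded size to the reserve step.
 */
static phys_addr_t reserved_before_fix(phys_addr_t size, phys_addr_t align)
{
	phys_addr_t search_size = round_up_pow2(size, align); /* finder-local */

	(void)search_size;	/* never propagated back to the caller */
	return size;		/* size the reserve step is told about */
}

/* After the fix: the caller rounds first, so search and reserve agree. */
static phys_addr_t reserved_after_fix(phys_addr_t size, phys_addr_t align)
{
	size = round_up_pow2(size, align);
	return size;
}
```

With size 5 and align 8, the pre-fix path reserves 5 bytes although the
search assumed 8; after the fix both steps use 8.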
* Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88 2012-02-22 17:48 ` Tejun Heo @ 2012-02-22 18:25 ` Meelis Roos 2012-02-23 18:55 ` Tejun Heo 2012-02-22 20:44 ` David Miller 1 sibling, 1 reply; 46+ messages in thread From: Meelis Roos @ 2012-02-22 18:25 UTC (permalink / raw) To: Tejun Heo Cc: Grant Likely, Rob Herring, sparclinux, Linux Kernel list, sam, David S. Miller > Can you please try the following patch? If it still fails to boot, > please attach the failing log. Thank you. It works on E3500! Will try other machines tomorrow. -- Meelis Roos (mroos@linux.ee) ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88
  2012-02-22 18:25 ` Meelis Roos
@ 2012-02-23 18:55 ` Tejun Heo
  2012-02-23 23:31 ` David Miller
  2012-02-24 9:20 ` Meelis Roos
  0 siblings, 2 replies; 46+ messages in thread
From: Tejun Heo @ 2012-02-23 18:55 UTC (permalink / raw)
  To: Meelis Roos
  Cc: Grant Likely, Rob Herring, sparclinux, Linux Kernel list, sam, David S. Miller

Hello,

On Wed, Feb 22, 2012 at 08:25:32PM +0200, Meelis Roos wrote:
> > Can you please try the following patch? If it still fails to boot,
> > please attach the failing log. Thank you.
>
> It works on E3500! Will try other machines tomorrow.

Once confirmed, I'll push the patch through tip. It just hides the
underlying problem but we should be in no worse shape than before,
it's a two-line change so reproducing the problem again for proper
diagnosis isn't difficult, and we're getting a bit late in the release
cycle already.

Thanks.

--
tejun

^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88
  2012-02-23 18:55 ` Tejun Heo
@ 2012-02-23 23:31 ` David Miller
  2012-02-24 9:20 ` Meelis Roos
  1 sibling, 0 replies; 46+ messages in thread
From: David Miller @ 2012-02-23 23:31 UTC (permalink / raw)
  To: tj; +Cc: mroos, grant.likely, rob.herring, sparclinux, linux-kernel, sam

From: Tejun Heo <tj@kernel.org>
Date: Thu, 23 Feb 2012 10:55:03 -0800

> Hello,
>
> On Wed, Feb 22, 2012 at 08:25:32PM +0200, Meelis Roos wrote:
>> > Can you please try the following patch? If it still fails to boot,
>> > please attach the failing log. Thank you.
>>
>> It works on E3500! Will try other machines tomorrow.
>
> Once confirmed, I'll push the patch through tip. It just hides the
> underlying problem but we should be in no worse shape than before,
> it's a two-line change so reproducing the problem again for proper
> diagnosis isn't difficult, and we're getting a bit late in the release
> cycle already.

Ok.

^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88
  2012-02-23 18:55 ` Tejun Heo
  2012-02-23 23:31 ` David Miller
@ 2012-02-24 9:20 ` Meelis Roos
  2012-02-27 17:17 ` Meelis Roos
  1 sibling, 1 reply; 46+ messages in thread
From: Meelis Roos @ 2012-02-24 9:20 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Grant Likely, Rob Herring, sparclinux, Linux Kernel list, sam, David S. Miller

> > > Can you please try the following patch? If it still fails to boot,
> > > please attach the failing log. Thank you.
> >
> > It works on E3500! Will try other machines tomorrow.
>
> Once confirmed, I'll push the patch through tip. It just hides the
> underlying problem but we should be in no worse shape than before,
> it's a two-line change so reproducing the problem again for proper
> diagnosis isn't difficult, and we're getting a bit late in the release
> cycle already.

It cured the V210 too but I could not test V100 since it's offline until
Monday.

--
Meelis Roos (mroos@linux.ee)

^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88
  2012-02-24 9:20 ` Meelis Roos
@ 2012-02-27 17:17 ` Meelis Roos
  2012-02-27 19:43 ` Sam Ravnborg
  0 siblings, 1 reply; 46+ messages in thread
From: Meelis Roos @ 2012-02-27 17:17 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Grant Likely, Rob Herring, sparclinux, Linux Kernel list, sam, David S. Miller

> > > > Can you please try the following patch? If it still fails to boot,
> > > > please attach the failing log. Thank you.
> > >
> > > It works on E3500! Will try other machines tomorrow.
> >
> > Once confirmed, I'll push the patch through tip. It just hides the
> > underlying problem but we should be in no worse shape than before,
> > it's a two-line change so reproducing the problem again for proper
> > diagnosis isn't difficult, and we're getting a bit late in the release
> > cycle already.
>
> It cured the V210 too but I could not test V100 since it's offline until
> Monday.

Tested V100 too, success!

--
Meelis Roos (mroos@linux.ee)

^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88
  2012-02-27 17:17 ` Meelis Roos
@ 2012-02-27 19:43 ` Sam Ravnborg
  2012-02-27 21:25 ` Meelis Roos
  0 siblings, 1 reply; 46+ messages in thread
From: Sam Ravnborg @ 2012-02-27 19:43 UTC (permalink / raw)
  To: Meelis Roos
  Cc: Tejun Heo, Grant Likely, Rob Herring, sparclinux, Linux Kernel list, David S. Miller

On Mon, Feb 27, 2012 at 07:17:42PM +0200, Meelis Roos wrote:
> > > > > Can you please try the following patch? If it still fails to boot,
> > > > > please attach the failing log. Thank you.
> > > >
> > > > It works on E3500! Will try other machines tomorrow.
> > >
> > > Once confirmed, I'll push the patch through tip. It just hides the
> > > underlying problem but we should be in no worse shape than before,
> > > it's a two-line change so reproducing the problem again for proper
> > > diagnosis isn't difficult, and we're getting a bit late in the release
> > > cycle already.
> >
> > It cured the V210 too but I could not test V100 since it's offline until
> > Monday.
>
> Tested V100 too, success!

Hi Meelis.

I have tried to cook up a small patch that verifies the length
of what we read against the length we originally asked for.

Could you try to give this a quick spin and see if something
turns up? If you have time it would be good to try on a box
that worked before and one that was fixed by the patch from Tejun.

I have not looked much at the OF stuff - but this looked
like the right place to start.
I have no possibility to try it out myself...

	Sam

diff --git a/drivers/of/pdt.c b/drivers/of/pdt.c
index 07cc1d6..826204a 100644
--- a/drivers/of/pdt.c
+++ b/drivers/of/pdt.c
@@ -128,6 +128,10 @@ static struct property * __init of_pdt_build_one_prop(phandle node, char *prev,
 			p->value = prom_early_alloc(p->length + 1);
 			len = of_pdt_prom_ops->getproperty(node, p->name,
 					p->value, p->length);
+
+			if (len != p->length)
+				pr_err("prop: %s %d => %d\n", p->name, p->length, len);
+
 			if (len <= 0)
 				p->length = 0;
 			((unsigned char *)p->value)[p->length] = '\0';
@@ -161,8 +165,13 @@ static char * __init of_pdt_get_one_property(phandle node, const char *name)
 
 	len = of_pdt_prom_ops->getproplen(node, name);
 	if (len > 0) {
+		int proplen;
 		buf = prom_early_alloc(len);
-		len = of_pdt_prom_ops->getproperty(node, name, buf, len);
+		proplen = of_pdt_prom_ops->getproperty(node, name, buf, len);
+
+		if (proplen != len)
+			pr_err("prop: %s %d => %d\n", name, len, proplen);
+
 	}
 
 	return buf;

^ permalink raw reply related [flat|nested] 46+ messages in thread
* Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88
  2012-02-27 19:43 ` Sam Ravnborg
@ 2012-02-27 21:25 ` Meelis Roos
  2012-02-27 21:30 ` David Miller
  0 siblings, 1 reply; 46+ messages in thread
From: Meelis Roos @ 2012-02-27 21:25 UTC (permalink / raw)
  To: Sam Ravnborg
  Cc: Tejun Heo, Grant Likely, Rob Herring, sparclinux, Linux Kernel list, David S. Miller

> Could you try to give this a quick spin and see if something
> turns up? If you have time it would be good to try on a box
> that worked before and one that was fixed by the patch from Tejun.

Neither of the machines - the already working one and the "fixed with
the rounding patch" one - emits any prop: messages.

--
Meelis Roos (mroos@linux.ee)

^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88
  2012-02-27 21:25 ` Meelis Roos
@ 2012-02-27 21:30 ` David Miller
  2012-02-28 21:10 ` David Miller
  0 siblings, 1 reply; 46+ messages in thread
From: David Miller @ 2012-02-27 21:30 UTC (permalink / raw)
  To: mroos; +Cc: sam, tj, grant.likely, rob.herring, sparclinux, linux-kernel

From: Meelis Roos <mroos@linux.ee>
Date: Mon, 27 Feb 2012 23:25:11 +0200 (EET)

>> Could you try to give this a quick spin and see if something
>> turns up? If you have time it would be good to try on a box
>> that worked before and one that was fixed by the patch from Tejun.
>
> Neither of the machines - the already working one and the "fixed with
> the rounding patch" one - emits any prop: messages.

I think the issue is that OF writes past the end of the buffer even
though the length it reports is smaller than what it writes.

That's why we really need to fill the memblock memory with magic
numbers and scan every allocation for free memory with corrupted magic
values.

^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88 2012-02-27 21:30 ` David Miller @ 2012-02-28 21:10 ` David Miller 2012-02-28 21:36 ` Meelis Roos 0 siblings, 1 reply; 46+ messages in thread From: David Miller @ 2012-02-28 21:10 UTC (permalink / raw) To: mroos; +Cc: sam, tj, grant.likely, rob.herring, sparclinux, linux-kernel From: David Miller <davem@davemloft.net> Date: Mon, 27 Feb 2012 16:30:44 -0500 (EST) > I think the issue is that OF writes past the end of the buffer even > though the length it reports is smaller than what it writes. Meelis, can you get your tree back into a state where the crash happens and then add the following debugging patch and see what happens? Thanks! diff --git a/drivers/of/pdt.c b/drivers/of/pdt.c index 07cc1d6..367ef33 100644 --- a/drivers/of/pdt.c +++ b/drivers/of/pdt.c @@ -125,12 +125,31 @@ static struct property * __init of_pdt_build_one_prop(phandle node, char *prev, } else { int len; +#if 1 + int i; + p->value = prom_early_alloc(p->length + 1 + 64); + for (i = p->length + 1; i < p->length + 1 + 64; i++) + ((unsigned char *)p->value)[i] = 0xff; +#else p->value = prom_early_alloc(p->length + 1); +#endif len = of_pdt_prom_ops->getproperty(node, p->name, p->value, p->length); - if (len <= 0) + if (len <= 0) { + pr_info("OF BUG: getproperty(%s, %d) returns %d\n", + p->name, p->length, len); p->length = 0; + } ((unsigned char *)p->value)[p->length] = '\0'; +#if 1 + for (i = p->length + 1; i < p->length + 1 + 64; i++) { + if (((unsigned char *)p->value)[i] != 0xff) { + pr_info("OF BUG: Write past end of property buffer\n"); + pr_info("OF BUG: Property name [%s] length [%d] getprop len [%d]\n", + p->name, p->length, len); + } + } +#endif } } return p; @@ -161,7 +180,11 @@ static char * __init of_pdt_get_one_property(phandle node, const char *name) len = of_pdt_prom_ops->getproplen(node, name); if (len > 0) { +#if 1 + buf = prom_early_alloc(len + 64); +#else buf = prom_early_alloc(len); +#endif len = 
of_pdt_prom_ops->getproperty(node, name, buf, len); } ^ permalink raw reply related [flat|nested] 46+ messages in thread
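The guard-byte trick used in the debugging patch above can be tried out
in user space. What follows is a minimal sketch, assuming malloc() in
place of prom_early_alloc(); guarded_alloc() and guard_damage() are
hypothetical helpers, not kernel functions. Only the 64-byte pad and
the 0xff fill value are taken from the patch.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_LEN  64	/* pad size, as in the debugging patch */
#define GUARD_BYTE 0xff	/* fill value, as in the debugging patch */

/* Allocate len usable bytes followed by GUARD_LEN bytes of GUARD_BYTE. */
static unsigned char *guarded_alloc(size_t len)
{
	unsigned char *buf = malloc(len + GUARD_LEN);

	if (buf)
		memset(buf + len, GUARD_BYTE, GUARD_LEN);
	return buf;
}

/* Count the pad bytes behind buf[len - 1] that no longer hold GUARD_BYTE. */
static size_t guard_damage(const unsigned char *buf, size_t len)
{
	size_t i, bad = 0;

	for (i = 0; i < GUARD_LEN; i++)
		if (buf[len + i] != GUARD_BYTE)
			bad++;
	return bad;
}
```

A check like this only notices writes that land in the pad directly
behind the buffer and that store something other than 0xff; corruption
anywhere else goes unseen.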
* Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88 2012-02-28 21:10 ` David Miller @ 2012-02-28 21:36 ` Meelis Roos 2012-02-28 22:56 ` David Miller 0 siblings, 1 reply; 46+ messages in thread From: Meelis Roos @ 2012-02-28 21:36 UTC (permalink / raw) To: David Miller; +Cc: sam, tj, grant.likely, rob.herring, sparclinux, linux-kernel > Meelis, can you get your tree back into a state where the crash happens > and then add the following debugging patch and see what happens? Tried it, no obvious results in dmesg, except the crash is in a slightly different location. [ 0.000000] PROMLIB: Sun IEEE Boot Prom 'OBP 3.2.30 2002/10/25 14:03' [ 0.000000] PROMLIB: Root node compatible: [ 0.000000] Linux version 3.2.0-rc3-00076-g7bd0b0f-dirty (mroos@korvits) (gcc version 4.6.2 (Debian 4.6.2-14) ) #84 SMP Tue Feb 28 23:28:49 EET 2012 [ 0.000000] debug: ignoring loglevel setting. [ 0.000000] bootconsole [earlyprom0] enabled [ 0.000000] ARCH: SUN4U [ 0.000000] Ethernet address: 08:00:20:b6:ee:e2 [ 0.000000] Kernel: Using 4 locked TLB entries for main kernel image. [ 0.000000] Remapping the kernel... done. [ 0.000000] Unable to handle kernel paging request at virtual address 000000007fcf2000 [ 0.000000] tsk->{mm,active_mm}->context = 0000000000000000 [ 0.000000] tsk->{mm,active_mm}->pgd = fffff800007db7d0 [ 0.000000] \|/ ____ \|/ [ 0.000000] "@'/ .. 
\`@" [ 0.000000] /_| \__/ |_\ [ 0.000000] \__U_/ [ 0.000000] swapper(0): Oops [#1] [ 0.000000] TSTATE: 0000008880e01600 TPC: 000000000057b4c8 TNPC: 000000000057b4cc Y: 00000037 Not tainted [ 0.000000] TPC: <strcmp+0x8/0x60> [ 0.000000] g0: 000000000077f7f0 g1: 0000000000000000 g2: 000000000000002f g3: 00000000000000f0 [ 0.000000] g4: 000000000077f350 g5: 0000000000000000 g6: 0000000000760000 g7: 0000000000000050 [ 0.000000] o0: 000000000079dbc8 o1: 0000000000000000 o2: 0000000000000000 o3: 0000000000000002 [ 0.000000] o4: 0000000000000002 o5: 0000000000000000 sp: 0000000000763181 ret_pc: 00000000006a9984 [ 0.000000] RPC: <_raw_read_lock+0x24/0x40> [ 0.000000] l0: 0000000001028000 l1: fffff8007fcbc380 l2: 8000000000000000 l3: 0800000000000000 [ 0.000000] l4: 0000000000000080 l5: 0000000000000002 l6: 0000000000000000 l7: 0020280000000000 [ 0.000000] i0: 000000007fcf3c80 i1: fffff8007fcec480 i2: 0000000001010101 i3: 0000000080808080 [ 0.000000] i4: fffff8007fcb8ccd i5: 0000000000028337 i6: 0000000000763231 i7: 0000000000606250 [ 0.000000] I7: <of_find_node_by_path+0x30/0x80> [ 0.000000] Call Trace: [ 0.000000] [0000000000606250] of_find_node_by_path+0x30/0x80 [ 0.000000] [0000000000606e0c] of_alias_scan+0xcc/0x1c0 [ 0.000000] [00000000007c328c] of_pdt_build_devicetree+0x90/0xa0 [ 0.000000] [00000000007b0680] prom_build_devicetree+0x10/0x3c [ 0.000000] [00000000007b4614] paging_init+0x59c/0x6bc [ 0.000000] [00000000007afffc] setup_arch+0xf8/0x110 [ 0.000000] [00000000007ae514] start_kernel+0x84/0x32c [ 0.000000] [00000000006918c8] tlb_fixup_done+0xa0/0xa8 [ 0.000000] [0000000000000000] (null) [ 0.000000] Disabling lock debugging due to kernel taint [ 0.000000] Caller[0000000000606250]: of_find_node_by_path+0x30/0x80 [ 0.000000] Caller[0000000000606e0c]: of_alias_scan+0xcc/0x1c0 [ 0.000000] Caller[00000000007c328c]: of_pdt_build_devicetree+0x90/0xa0 [ 0.000000] Caller[00000000007b0680]: prom_build_devicetree+0x10/0x3c [ 0.000000] Caller[00000000007b4614]: 
paging_init+0x59c/0x6bc [ 0.000000] Caller[00000000007afffc]: setup_arch+0xf8/0x110 [ 0.000000] Caller[00000000007ae514]: start_kernel+0x84/0x32c [ 0.000000] Caller[00000000006918c8]: tlb_fixup_done+0xa0/0xa8 [ 0.000000] Caller[0000000000000000]: (null) [ 0.000000] Instruction DUMP: 01000000 9de3bf50 82102000 <c40e0001> c60e4001 80a08003 12400008 82006001 80a0a000 [ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task! [ 0.000000] Call Trace: [ 0.000000] [000000000069c7fc] panic+0x68/0x1e4 [ 0.000000] [0000000000461a30] do_exit+0x230/0x2c0 [ 0.000000] [00000000004292c0] die_if_kernel+0x180/0x260 [ 0.000000] [000000000069c224] unhandled_fault+0x8c/0x98 [ 0.000000] [0000000000445778] do_kernel_fault+0xd8/0x100 [ 0.000000] [000000000044584c] do_sparc64_fault+0xac/0x540 [ 0.000000] [0000000000407948] sparc64_realfault_common+0x10/0x20 [ 0.000000] [000000000057b4c8] strcmp+0x8/0x60 [ 0.000000] [0000000000606250] of_find_node_by_path+0x30/0x80 [ 0.000000] [0000000000606e0c] of_alias_scan+0xcc/0x1c0 [ 0.000000] [00000000007c328c] of_pdt_build_devicetree+0x90/0xa0 [ 0.000000] [00000000007b0680] prom_build_devicetree+0x10/0x3c [ 0.000000] [00000000007b4614] paging_init+0x59c/0x6bc [ 0.000000] [00000000007afffc] setup_arch+0xf8/0x110 [ 0.000000] [00000000007ae514] start_kernel+0x84/0x32c [ 0.000000] [00000000006918c8] tlb_fixup_done+0xa0/0xa8 [ 0.000000] Press Stop-A (L1-A) to return to the boot prom -- Meelis Roos (mroos@linux.ee) ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88 2012-02-28 21:36 ` Meelis Roos @ 2012-02-28 22:56 ` David Miller 2012-02-29 6:15 ` Meelis Roos 0 siblings, 1 reply; 46+ messages in thread From: David Miller @ 2012-02-28 22:56 UTC (permalink / raw) To: mroos; +Cc: sam, tj, grant.likely, rob.herring, sparclinux, linux-kernel From: Meelis Roos <mroos@linux.ee> Date: Tue, 28 Feb 2012 23:36:07 +0200 (EET) >> Meelis, can you get your tree back into a state where the crash happens >> and then add the following debugging patch and see what happens? > > Tried it, no obvious results in dmesg, except the crash is in a slightly > different location. Interesting, the corruption is a little bit different this time, yet similar to the ones we saw previously: > [ 0.000000] TPC: <strcmp+0x8/0x60> ... > [ 0.000000] i0: 000000007fcf3c80 i1: fffff8007fcec480 i2: 0000000001010101 i3: 0000000080808080 > [ 0.000000] i4: fffff8007fcb8ccd i5: 0000000000028337 i6: 0000000000763231 i7: 0000000000606250 This is strcmp(0x000000007fcf3c80, 0xfffff8007fcec480), the first arg is a bad pointer, somehow the top virtual address bits have been zero'd out. It comes from dp->full_name, so something walked all over the beginning of a device_node object. Let's see if we can figure out anything else about the nature of the corruption, please add this patch on top. 
diff --git a/drivers/of/base.c b/drivers/of/base.c index 133908a..7c0f7f4 100644 --- a/drivers/of/base.c +++ b/drivers/of/base.c @@ -376,6 +376,18 @@ struct device_node *of_find_node_by_path(const char *path) read_lock(&devtree_lock); for (; np; np = np->allnext) { + if (!np->full_name) + continue; + + if ((unsigned long)np->full_name < 0xfffff80000000000) { + pr_info("OF BUG: Bogus full_name pointer [%p]\n", + np->full_name); + pr_info("OF BUG: np[%p] np->name[%p] np->type[%p] np->phandle[0x%08x]\n", + np, np->name, np->type, (unsigned int) np->phandle); + pr_info("OF BUG: np->name(%s) np->type(%s)\n", + np->name, np->type); + } + if (np->full_name && (of_node_cmp(np->full_name, path) == 0) && of_node_get(np)) break; ^ permalink raw reply related [flat|nested] 46+ messages in thread
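The range check in the patch above can be exercised outside the kernel.
A rough sketch, assuming 64-bit addresses and taking the
0xfffff80000000000 cutoff straight from the patch;
looks_like_high_ptr() and with_high_bits() are hypothetical names.
Kernel image addresses, such as the TPC values in the oops above, also
sit below the cutoff, so a hit is a hint rather than proof of
corruption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Cutoff used by the debugging patch; anything below it gets reported. */
#define HIGH_CUTOFF 0xfffff80000000000ULL

/* True if addr is at or above the cutoff, i.e. the patch would not flag it. */
static bool looks_like_high_ptr(uint64_t addr)
{
	return addr >= HIGH_CUTOFF;
}

/*
 * What a flagged value would be if only its upper bits had been cleared.
 * Useful for comparing a bogus pointer against nearby valid ones.
 */
static uint64_t with_high_bits(uint64_t addr)
{
	return addr | HIGH_CUTOFF;
}
```

Applied to the two strcmp() arguments from the oops, the first
(0x000000007fcf3c80) is flagged and the second (0xfffff8007fcec480) is
not.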
* Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88 2012-02-28 22:56 ` David Miller @ 2012-02-29 6:15 ` Meelis Roos 2012-02-29 6:27 ` David Miller 0 siblings, 1 reply; 46+ messages in thread From: Meelis Roos @ 2012-02-29 6:15 UTC (permalink / raw) To: David Miller; +Cc: sam, tj, grant.likely, rob.herring, sparclinux, linux-kernel > > Tried it, no obvious results in dmesg, except the crash is in a slightly > > different location. > > Interesting, the corruption is a little bit different this time, yet similar > to the ones we saw previously: > > > [ 0.000000] TPC: <strcmp+0x8/0x60> > ... > > [ 0.000000] i0: 000000007fcf3c80 i1: fffff8007fcec480 i2: 0000000001010101 i3: 0000000080808080 > > [ 0.000000] i4: fffff8007fcb8ccd i5: 0000000000028337 i6: 0000000000763231 i7: 0000000000606250 > > This is strcmp(0x000000007fcf3c80, 0xfffff8007fcec480), the first arg is > a bad pointer, somehow the top virtual address bits have been zero'd out. > > It comes from dp->full_name, so something walked all over the beginning > of a device_node object. > > Let's see if we can figure out anything else about the nature of the > corruption, please add this patch on top. Here it is - triggers this time: [ 0.000000] PROMLIB: Sun IEEE Boot Prom 'OBP 3.2.30 2002/10/25 14:03' [ 0.000000] PROMLIB: Root node compatible: [ 0.000000] Linux version 3.2.0-rc3-00076-g7bd0b0f-dirty (mroos@korvits) (gcc version 4.6.2 (Debian 4.6.2-14) ) #85 SMP Wed Feb 29 08:06:38 EET 2012 [ 0.000000] debug: ignoring loglevel setting. [ 0.000000] bootconsole [earlyprom0] enabled [ 0.000000] ARCH: SUN4U [ 0.000000] Ethernet address: 08:00:20:b6:ee:e2 [ 0.000000] Kernel: Using 4 locked TLB entries for main kernel image. [ 0.000000] Remapping the kernel... done. 
[ 0.000000] OF BUG: Bogus full_name pointer [0000000000730e08] [ 0.000000] OF BUG: np[fffff8007fcf3f40] np->name[fffff8007fcf3ec0] np->type[0000000000756bf8] np->phandle[0xf0029c88] [ 0.000000] OF BUG: np->name(SUNW,Ultra-Enterprise) np->type(<NULL>) [ 0.000000] OF BUG: Bogus full_name pointer [0000000000730e08] [ 0.000000] OF BUG: np[fffff8007fcf3f40] np->name[fffff8007fcf3ec0] np->type[0000000000756bf8] np->phandle[0xf0029c88] [ 0.000000] OF BUG: np->name(SUNW,Ultra-Enterprise) np->type(<NULL>) [ 0.000000] OF BUG: Bogus full_name pointer [0000000000730e08] [ 0.000000] OF BUG: np[fffff8007fcf3f40] np->name[fffff8007fcf3ec0] np->type[0000000000756bf8] np->phandle[0xf0029c88] [ 0.000000] OF BUG: np->name(SUNW,Ultra-Enterprise) np->type(<NULL>) [ 0.000000] OF BUG: Bogus full_name pointer [000000007fcf3c80] [ 0.000000] OF BUG: np[fffff8007fceacc0] np->name[ (null)] np->type[ (null)] np->phandle[0x00000001] [ 0.000000] OF BUG: np->name((null)) np->type((null)) [ 0.000000] Unable to handle kernel paging request at virtual address 000000007fcf2000 [ 0.000000] tsk->{mm,active_mm}->context = 0000000000000000 [ 0.000000] tsk->{mm,active_mm}->pgd = fffff800007db7d0 [ 0.000000] \|/ ____ \|/ [ 0.000000] "@'/ .. 
\`@" [ 0.000000] /_| \__/ |_\ [ 0.000000] \__U_/ [ 0.000000] swapper(0): Oops [#1] [ 0.000000] TSTATE: 0000004480e01600 TPC: 000000000057b4c8 TNPC: 000000000057b4cc Y: 00000037 Not tainted [ 0.000000] TPC: <strcmp+0x8/0x60> [ 0.000000] g0: 000000000077f7f0 g1: 0000000000000000 g2: 0000000000000000 g3: 0000000000787950 [ 0.000000] g4: 000000000077f350 g5: 0000000000000000 g6: 0000000000760000 g7: 0000000000000040 [ 0.000000] o0: 000000000000003f o1: 0000000000763930 o2: 0000000000000003 o3: 00000000007879e4 [ 0.000000] o4: 000000000080ee45 o5: 000000000080ee1b sp: 0000000000763181 ret_pc: 000000000069cad0 [ 0.000000] RPC: <printk+0x24/0x38> [ 0.000000] l0: 0000000001028000 l1: fffff8007fcbc380 l2: 8000000000000000 l3: 0800000000000000 [ 0.000000] l4: 0000000000000080 l5: 0000000000000002 l6: 0000000000000000 l7: 0020280000000000 [ 0.000000] i0: 000000007fcf3c80 i1: fffff8007fcec480 i2: 0000000000000000 i3: 0000000000000000 [ 0.000000] i4: 0000000000000001 i5: 0000000000028337 i6: 0000000000763231 i7: 0000000000606278 [ 0.000000] I7: <of_find_node_by_path+0x58/0xe0> [ 0.000000] Call Trace: [ 0.000000] [0000000000606278] of_find_node_by_path+0x58/0xe0 [ 0.000000] [0000000000606e6c] of_alias_scan+0xcc/0x1c0 [ 0.000000] [00000000007c328c] of_pdt_build_devicetree+0x90/0xa0 [ 0.000000] [00000000007b0680] prom_build_devicetree+0x10/0x3c [ 0.000000] [00000000007b4614] paging_init+0x59c/0x6bc [ 0.000000] [00000000007afffc] setup_arch+0xf8/0x110 [ 0.000000] [00000000007ae514] start_kernel+0x84/0x32c [ 0.000000] [0000000000691928] tlb_fixup_done+0xa0/0xa8 [ 0.000000] [0000000000000000] (null) [ 0.000000] Disabling lock debugging due to kernel taint [ 0.000000] Caller[0000000000606278]: of_find_node_by_path+0x58/0xe0 [ 0.000000] Caller[0000000000606e6c]: of_alias_scan+0xcc/0x1c0 [ 0.000000] Caller[00000000007c328c]: of_pdt_build_devicetree+0x90/0xa0 [ 0.000000] Caller[00000000007b0680]: prom_build_devicetree+0x10/0x3c [ 0.000000] Caller[00000000007b4614]: 
paging_init+0x59c/0x6bc [ 0.000000] Caller[00000000007afffc]: setup_arch+0xf8/0x110 [ 0.000000] Caller[00000000007ae514]: start_kernel+0x84/0x32c [ 0.000000] Caller[0000000000691928]: tlb_fixup_done+0xa0/0xa8 [ 0.000000] Caller[0000000000000000]: (null) [ 0.000000] Instruction DUMP: 01000000 9de3bf50 82102000 <c40e0001> c60e4001 80a08003 12400008 82006001 80a0a000 [ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task! [ 0.000000] Call Trace: [ 0.000000] [000000000069c85c] panic+0x68/0x1e4 [ 0.000000] [0000000000461a30] do_exit+0x230/0x2c0 [ 0.000000] [00000000004292c0] die_if_kernel+0x180/0x260 [ 0.000000] [000000000069c284] unhandled_fault+0x8c/0x98 [ 0.000000] [0000000000445778] do_kernel_fault+0xd8/0x100 [ 0.000000] [000000000044584c] do_sparc64_fault+0xac/0x540 [ 0.000000] [0000000000407948] sparc64_realfault_common+0x10/0x20 [ 0.000000] [000000000057b4c8] strcmp+0x8/0x60 [ 0.000000] [0000000000606278] of_find_node_by_path+0x58/0xe0 [ 0.000000] [0000000000606e6c] of_alias_scan+0xcc/0x1c0 [ 0.000000] [00000000007c328c] of_pdt_build_devicetree+0x90/0xa0 [ 0.000000] [00000000007b0680] prom_build_devicetree+0x10/0x3c [ 0.000000] [00000000007b4614] paging_init+0x59c/0x6bc [ 0.000000] [00000000007afffc] setup_arch+0xf8/0x110 [ 0.000000] [00000000007ae514] start_kernel+0x84/0x32c [ 0.000000] [0000000000691928] tlb_fixup_done+0xa0/0xa8 [ 0.000000] Press Stop-A (L1-A) to return to the boot prom -- Meelis Roos (mroos@linux.ee) ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88
  2012-02-29 6:15 ` Meelis Roos
@ 2012-02-29 6:27 ` David Miller
  0 siblings, 0 replies; 46+ messages in thread
From: David Miller @ 2012-02-29 6:27 UTC (permalink / raw)
  To: mroos; +Cc: sam, tj, grant.likely, rob.herring, sparclinux, linux-kernel

From: Meelis Roos <mroos@linux.ee>
Date: Wed, 29 Feb 2012 08:15:06 +0200 (EET)

> Here it is - triggers this time:

Thanks a lot.

I need to add some more diagnostics to further narrow it down, I'll give
you a patch for that when I get a chance.

^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88
  2012-02-22 17:48 ` Tejun Heo
  2012-02-22 18:25 ` Meelis Roos
@ 2012-02-22 20:44 ` David Miller
  2012-02-22 21:00 ` Tejun Heo
  1 sibling, 1 reply; 46+ messages in thread
From: David Miller @ 2012-02-22 20:44 UTC (permalink / raw)
  To: tj; +Cc: mroos, grant.likely, rob.herring, sparclinux, linux-kernel, sam

From: Tejun Heo <tj@kernel.org>
Date: Wed, 22 Feb 2012 09:48:25 -0800

> On Wed, Feb 22, 2012 at 02:36:13AM +0200, Meelis Roos wrote:
>> > Meelis, can you please apply the following patch before & after the
>> > offending commit, boot with "memblock=debug" added as kernel param and
>> > post the boot log? The patch will generate some offset warnings after
>> > the commit but should work fine.
>>
>> Before the commit (v3.2-rc3-75-g0ee332c): memblock1.gz (attached)
>> After the commit (v3.2-rc3-76-g7bd0b0f): memblock2.gz (attached)
>
> Can you please try the following patch? If it still fails to boot,
> please attach the failing log. Thank you.

Interesting, but two things strike me.

First, this seems like it would only cause problems if the caller
specified a too small size parameter, and then wrote past the 'size'
bytes of the buffer. And if so, this means we have an improperly
sized allocation somewhere, probably in the OF tree fetching code.

For example, maybe we mis-calculate the size of an OF device node
property before we fetch it from the firmware, therefore allocate
too small a buffer, and the property fetch operation splats all
over the end of the buffer. Another possibility is that the
property length reported by the firmware is wrong and too small.

BTW, this kind of bug would be easy to catch, simply put a magic
number signature into all unallocated memblock memory then at
allocation time check that signature. If we signal an error when we
don't see the proper signature and turn on the OF tree building
logging, we can see exactly which operation writes past the end of a
buffer.

Second, you'd need similar handling in other call chains such as
memblock_double_array()'s invocation of memblock_find_in_range().
It seems a bad idea to hide how size is modified, so probably it's
best to pass the address of the size parameter and modify the
caller's value in that way so that the size used in the reserve
matches up.

^ permalink raw reply [flat|nested] 46+ messages in thread
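The signature check suggested in the message above can be modelled in a few lines of user-space C. This is only a sketch under stated assumptions, not memblock code: `arena`, `arena_init()`, `arena_alloc()` and the `POISON` byte are hypothetical names, and the toy allocator hands out memory bottom-up from a fixed array so that an overrun lands in bytes that are still unallocated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ARENA_SIZE 256
#define POISON     0x5a	/* hypothetical signature byte for unallocated memory */

static uint8_t arena[ARENA_SIZE];
static size_t arena_next;	/* offset of the first unallocated byte */

/* Fill the whole arena with the signature before anything is allocated. */
static void arena_init(void)
{
	memset(arena, POISON, sizeof(arena));
	arena_next = 0;
}

/*
 * Hand out @size bytes. Before doing so, verify that every byte about to be
 * handed out still carries the signature. Sets *@corrupt to 1 and returns
 * NULL when an earlier user wrote into memory it did not own.
 */
static void *arena_alloc(size_t size, int *corrupt)
{
	size_t i;

	assert(corrupt != NULL);
	*corrupt = 0;

	if (size == 0 || size > ARENA_SIZE - arena_next)
		return NULL;		/* nothing to do or out of space */

	for (i = arena_next; i < arena_next + size; i++) {
		if (arena[i] != POISON) {
			*corrupt = 1;	/* signature damaged */
			return NULL;
		}
	}

	arena_next += size;
	return &arena[arena_next - size];
}
```

In this model, allocating 14 bytes, writing 16 bytes into the buffer and then allocating again makes the second call fail with `*corrupt` set, which is the point at which a real implementation could log the offending operation.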
* Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88
  2012-02-22 20:44 ` David Miller
@ 2012-02-22 21:00 ` Tejun Heo
  0 siblings, 0 replies; 46+ messages in thread
From: Tejun Heo @ 2012-02-22 21:00 UTC (permalink / raw)
  To: David Miller
  Cc: mroos, grant.likely, rob.herring, sparclinux, linux-kernel, sam

Hello, David.

On Wed, Feb 22, 2012 at 03:44:17PM -0500, David Miller wrote:
> > Can you please try the following patch? If it still fails to boot,
> > please attach the failing log. Thank you.
>
> Interesting, but two things strike me.
>
> First, this seems like it would only cause problems if the caller
> specified a too small size parameter, and then wrote past the 'size'
> bytes of the buffer. And if so, this means we have an improperly
> sized allocation somewhere, probably in the OF tree fetching code.

There's another, less likely, possibility. It made the allocation
table much larger and the lowest address used ended up lower.
0x0000007fc8fa40 vs 0x0000007fc94000. Not much of a difference, and
just allocating some more memory should rule it out or confirm it.

> For example, maybe we mis-calculate the size of an OF device node
> property before we fetch it from the firmware, therefore allocate
> too small a buffer, and the property fetch operation splats all
> over the end of the buffer. Another possibility is that the
> property length reported by the firmware is wrong and too small.
>
> BTW, this kind of bug would be easy to catch, simply put a magic
> number signature into all unallocated memblock memory then at
> allocation time check that signature. If we signal an error when we
> don't see the proper signature and turn on the OF tree building
> logging, we can see exactly which operation writes past the end of a
> buffer.

Yeah, redzoning can definitely help but I'm not sure whether we want
to go full on allocation debugging and all for the early allocator.
The thing doesn't even support freeing.

> Second, you'd need similar handling in other call chains such as
> memblock_double_array()'s invocation of memblock_find_in_range().
> It seems a bad idea to hide how size is modified, so probably it's
> best to pass the address of the size parameter and modify the
> caller's value in that way so that the size used in the reserve
> matches up.

I suspect the size modification was added later to avoid expanding the
allocation table early during boot and we can do that only for
memblock_alloc*() calls as they don't have a matching free interface.
If we modify explicit reservations, we have to propagate the modified
size to each user and so on. Given that the allocation table is
discarded after boot completion and there aren't too many explicit
reservations, I don't think we need to expand size aligning to all
find_in_range users. I guess it all depends on how complete an
allocator we want for early boot.

Thanks.

-- 
tejun

^ permalink raw reply [flat|nested] 46+ messages in thread
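To make the size-aligning point concrete, here is a small user-space model of the two orderings discussed in this thread. It is a sketch, not the kernel's implementation: `find_top_down()` stands in for memblock_find_in_range_node() and assumes a single free region ending at `end` (with `end` at least as large as the rounded size), the struct and function names are hypothetical, and alignments are assumed to be powers of two.

```c
#include <assert.h>
#include <stdint.h>

struct range {
	uint64_t base;
	uint64_t size;	/* size that ends up recorded as reserved */
};

static uint64_t round_up_pow2(uint64_t x, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

static uint64_t round_down_pow2(uint64_t x, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return x & ~(align - 1);
}

/* Highest aligned base such that [base, base + size) ends at or below @end. */
static uint64_t find_top_down(uint64_t end, uint64_t size, uint64_t align)
{
	return round_down_pow2(end - size, align);
}

/* Size is rounded up only inside the search; the reserve sees the raw size. */
static struct range alloc_align_in_find(uint64_t end, uint64_t size,
					uint64_t align)
{
	struct range r;

	r.base = find_top_down(end, round_up_pow2(size, align), align);
	r.size = size;
	return r;
}

/* Size is rounded up first, so the search and the reserve agree on it. */
static struct range alloc_align_in_alloc(uint64_t end, uint64_t size,
					 uint64_t align)
{
	struct range r;

	size = round_up_pow2(size, align);
	r.base = find_top_down(end, size, align);
	r.size = size;
	return r;
}
```

With `end = 0x1000`, `size = 15` and `align = 64`, both variants pick base 0xfc0, but the first records 15 reserved bytes and the second 64. In the first case the 49 bytes after the buffer stay available to later allocations, so a write past the requested size can land in someone else's memory.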
* Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88
  2012-02-22 0:36 ` Meelis Roos
  2012-02-22 17:48 ` Tejun Heo
@ 2012-02-22 18:22 ` Richard Mortimer
  2012-02-22 20:26 ` David Miller
  1 sibling, 1 reply; 46+ messages in thread
From: Richard Mortimer @ 2012-02-22 18:22 UTC (permalink / raw)
  To: Meelis Roos
  Cc: Tejun Heo, Grant Likely, Rob Herring, sparclinux,
	Linux Kernel list, sam, David S. Miller

On 22/02/2012 00:36, Meelis Roos wrote:
>> Meelis, can you please apply the following patch before & after the
>> offending commit, boot with "memblock=debug" added as kernel param and
>> post the boot log? The patch will generate some offset warnings after
>> the commit but should work fine.
>
> Before the commit (v3.2-rc3-75-g0ee332c): memblock1.gz (attached)
> After the commit (v3.2-rc3-76-g7bd0b0f): memblock2.gz (attached)
>

It's a long time since I regularly had to worry about SPARC boxes (not)
booting, so this may just be the difference between virtual & physical
addresses, but I notice that some of the addresses in the register dump
have non-zero values in the upper 32 bits but the memblock values have
zero in the upper half.

memblock reserved: ADD [0x0000007fcc0a40-0x0000007fcc0a4e] node 1
memblock reserved: add [0x0000007fcc0a40-000000007fcc0a4e] node 1 @767

But a similar address in the registers has fffff800 in there.

o4: fffff8007fcc0a4d

I know that there are a number of explanations why things would be
different (32-bit accesses etc.) but it could explain things, plus we
would be talking 64-bit addresses in the kernel.

Just a thought.

Richard

> In addition, a third type of sparc machines breaks in a third way - V210
> and V240 just hang after telling
>
> console [tty0] enabled, bootconsole disabled
>
> and before calibrating the delay loop. Bisect has led to the same commit.
>

^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88
  2012-02-22 18:22 ` Richard Mortimer
@ 2012-02-22 20:26 ` David Miller
  0 siblings, 0 replies; 46+ messages in thread
From: David Miller @ 2012-02-22 20:26 UTC (permalink / raw)
  To: richm; +Cc: mroos, tj, grant.likely, rob.herring, sparclinux, linux-kernel, sam

From: Richard Mortimer <richm@oldelvet.org.uk>
Date: Wed, 22 Feb 2012 18:22:36 +0000

> memblock reserved: ADD [0x0000007fcc0a40-0x0000007fcc0a4e] node 1
> memblock reserved: add [0x0000007fcc0a40-000000007fcc0a4e] node 1 @767

These are physical addresses.

> But a similar address in the registers has fffff800 in there.
>
> o4: fffff8007fcc0a4d

All of physical memory is mapped linearly starting at 0xfffff80000000000
and this is such a virtual address.

^ permalink raw reply [flat|nested] 46+ messages in thread
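The relationship described in the reply above is a fixed offset, which a short user-space sketch can check against the quoted numbers. The helper names are hypothetical and this is not the kernel's own conversion code; the only value taken from the thread is the mapping base 0xfffff80000000000. The printed memblock range is treated as inclusive here, which is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Base of the linear mapping of physical memory, as stated in the reply. */
#define LINEAR_MAP_BASE 0xfffff80000000000ULL

/* Hypothetical helper: virtual address in the linear map -> physical. */
static uint64_t linear_virt_to_phys(uint64_t vaddr)
{
	assert(vaddr >= LINEAR_MAP_BASE);
	return vaddr - LINEAR_MAP_BASE;
}

/* Hypothetical helper: physical address -> virtual address in the map. */
static uint64_t linear_phys_to_virt(uint64_t paddr)
{
	return paddr + LINEAR_MAP_BASE;
}

/* Non-zero when @addr lies in the inclusive range [first, last]. */
static int in_range(uint64_t addr, uint64_t first, uint64_t last)
{
	return addr >= first && addr <= last;
}
```

Applied to the register value o4 = fffff8007fcc0a4d, the subtraction gives physical address 0x7fcc0a4d, which falls inside the reserved range 0x7fcc0a40-0x7fcc0a4e quoted from the memblock debug log.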
* Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88
  2012-02-21 1:05 ` Tejun Heo
  2012-02-22 0:36 ` Meelis Roos
@ 2012-02-22 17:03 ` Sam Ravnborg
  2012-02-22 17:12 ` Meelis Roos
  1 sibling, 1 reply; 46+ messages in thread
From: Sam Ravnborg @ 2012-02-22 17:03 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Meelis Roos, Grant Likely, Rob Herring, sparclinux,
	Linux Kernel list, David S. Miller

On Mon, Feb 20, 2012 at 05:05:37PM -0800, Tejun Heo wrote:
> Hello,
>
> Meelis, can you please apply the following patch before & after the
> offending commit, boot with "memblock=debug" added as kernel param and
> post the boot log? The patch will generate some offset warnings after
> the commit but should work fine.
>
> Sam, David, as I'm not familiar with the code base, is it possible to
> tell which address is corrupted (zeroed, it seems)? ie. can we add
> "if (XXX == NULL) printk("%p is corrupted\n"...);" somewhere?

No idea - sorry. I spend most of my time with sparc32 - which I do not
even feel familiar with yet :-(

One thing I noticed while working with memblock for sparc32 (*) is that
allocations are done top-down. So we may end up allocating memory with
a considerably higher address than we are used to.
This is obviously just a wild guess...

Meelis - do the affected boxes have any special memory configurations?
Could you try to boot with a sensible mem=xxx value to see if limiting
the memory helps.

(*) I have re-done the original patch-set and I have a quite good
feeling about it. HIGHMEM support is outstanding - I got a bit confused
when I looked at x86. But my ss5 crashes the first time I try to use
the allocated memory - so I assume I have some silly issue somewhere.
Nothing points at memblock in this case.

	Sam

^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88
  2012-02-22 17:03 ` Sam Ravnborg
@ 2012-02-22 17:12 ` Meelis Roos
  2012-02-22 17:21 ` Sam Ravnborg
  0 siblings, 1 reply; 46+ messages in thread
From: Meelis Roos @ 2012-02-22 17:12 UTC (permalink / raw)
  To: Sam Ravnborg
  Cc: Tejun Heo, Grant Likely, Rob Herring, sparclinux,
	Linux Kernel list, David S. Miller

> Meelis - do the affected boxes have any special memory configurations?

Nothing special to me. E3500 has 2G, V100 has 1G, V210 and V240 have 2G
and 1.5G.

> Could you try to boot with a sensible mem=xxx value to see if limiting the memory
> helps.

Like mem=256M? Will try.

-- 
Meelis Roos (mroos@ut.ee) http://www.cs.ut.ee/~mroos/

^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88
  2012-02-22 17:12 ` Meelis Roos
@ 2012-02-22 17:21 ` Sam Ravnborg
  2012-02-22 17:41 ` Meelis Roos
  0 siblings, 1 reply; 46+ messages in thread
From: Sam Ravnborg @ 2012-02-22 17:21 UTC (permalink / raw)
  To: Meelis Roos
  Cc: Tejun Heo, Grant Likely, Rob Herring, sparclinux,
	Linux Kernel list, David S. Miller

On Wed, Feb 22, 2012 at 07:12:06PM +0200, Meelis Roos wrote:
> > Meelis - do the affected boxes have any special memory configurations?
>
> Nothing special to me. E3500 has 2G, V100 has 1G, V210 and V240 have 2G
> and 1.5G.
>
> > Could you try to boot with a sensible mem=xxx value to see if limiting the memory
> > helps.
>
> Like mem=256M? Will try.

Think just a little more - I do not think this will help.
I confused myself with some of the sparc32 issues I have hit.

I have looked a little at the log files you included.
The only thing that looked different was that the faulty version
had a number after "@" which is higher than 1 - where the OK one
always has 1.
This is "idx" in memblock_insert_region() - but I did not look closer.

	Sam

^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88
  2012-02-22 17:21 ` Sam Ravnborg
@ 2012-02-22 17:41 ` Meelis Roos
  0 siblings, 0 replies; 46+ messages in thread
From: Meelis Roos @ 2012-02-22 17:41 UTC (permalink / raw)
  To: Sam Ravnborg
  Cc: Tejun Heo, Grant Likely, Rob Herring, sparclinux,
	Linux Kernel list, David S. Miller

> > > Could you try to boot with a sensible mem=xxx value to see if limiting the memory
> > > helps.
> >
> > Like mem=256M? Will try.
> Think just a little more - I do not think this will help.

Tried it on the 2G V210. It changes the picture. With 2G RAM, it just
hangs. With mem=256M it produces a crash in strlen and of_alias_scan
like in V100 with 1G. mem=512M results in the same strlen error. mem=1G
results in a stranger error:

[ 0.000000] Kernel panic - not syncing: ERROR: Failed to allocate 0x90 bytes below 0x0.
[ 0.000000]
[ 0.000000] Call Trace:
[ 0.000000] [00000000007a6a28] memblock_alloc_base+0x28/0x38
[ 0.000000] [000000000079ca50] prom_early_alloc+0xc/0x60
[ 0.000000] [00000000007ae090] of_pdt_create_node.part.0+0x4/0xe0
[ 0.000000] [00000000007ae250] of_pdt_build_devicetree+0x30/0xa0
[ 0.000000] [000000000079c4a8] prom_build_devicetree+0x18/0x38
[ 0.000000] [00000000007a03c0] paging_init+0x59c/0x6bc
[ 0.000000] [000000000079be50] setup_arch+0xf8/0x108
[ 0.000000] [000000000079a4e8] start_kernel+0x78/0x30c
[ 0.000000] [00000000006a3e80] tlb_fixup_done+0x98/0xa0
[ 0.000000] [0000000000000000] (null)

The working machines have 512M RAM, 834M RAM and 2G RAM so it's not
just the amount of RAM.

-- 
Meelis Roos (mroos@linux.ee)

^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88
  2012-02-13 8:06 ` Grant Likely
  2012-02-13 9:20 ` Meelis Roos
@ 2012-02-13 9:50 ` Meelis Roos
  2012-02-13 9:51 ` Meelis Roos
  2012-02-13 10:35 ` Meelis Roos
  1 sibling, 2 replies; 46+ messages in thread
From: Meelis Roos @ 2012-02-13 9:50 UTC (permalink / raw)
  To: Grant Likely; +Cc: Rob Herring, sparclinux, Linux Kernel list

Another variation of the crash, without the patch, but backtrace is
slightly different (strlen) - maybe fixed by the patch, maybe not.

[ 0.000000] Unable to handle kernel NULL pointer dereference
[ 0.000000] tsk->{mm,active_mm}->context = 0000000000000000
[ 0.000000] tsk->{mm,active_mm}->pgd = fffff800604ea3a8
[ 0.000000] \|/ ____ \|/
[ 0.000000] "@'/ .. \`@"
[ 0.000000] /_| \__/ |_\
[ 0.000000] \__U_/
[ 0.000000] swapper(0): Oops [#1]
[ 0.000000] TSTATE: 0000004480e01606 TPC: 00000000005be460 TNPC: 00000000005be464 Y: 00000037 Not d
[ 0.000000] TPC: <strlen+0x60/0xd4>
[ 0.000000] g0: 000000000000002f g1: 0000000000000001 g2: 0000000000000000 g3: 000000000073a700
[ 0.000000] g4: 000000000085ea50 g5: 0000000000000000 g6: 0000000000854000 g7: 0030a80000000000
[ 0.000000] o0: 0000000000000000 o1: 0000000000000000 o2: 0000000001010101 o3: 0000000080808080
[ 0.000000] o4: 0000000001010000 o5: fffff8006feae140 sp: 00000000008572c1 ret_pc: 0000000000655108
[ 0.000000] RPC: <of_alias_scan+0x68/0x200>
[ 0.000000] l0: 00000000008a4380 l1: fffff8006feae6b5 l2: fffff8006feae140 l3: fffff8006fe98e00
[ 0.000000] l4: 0000000000000000 l5: 0000000000000000 l6: 0000000000000000 l7: 00000000008678d0
[ 0.000000] i0: 00000000008c3f24 i1: 0000000000896ca0 i2: 00000000008268c0 i3: 00000000008268b8
[ 0.000000] i4: 00000000008038c8 i5: fffff8006feae5c0 i6: 0000000000857381 i7: 00000000008c4314
[ 0.000000] I7: <of_pdt_build_devicetree+0x90/0xa0>
[ 0.000000] Call Trace:
[ 0.000000] [00000000008c4314] of_pdt_build_devicetree+0x90/0xa0
[ 0.000000] [00000000008b0330] prom_build_devicetree+0x10/0x3c
[ 0.000000] [00000000008b3bb8] paging_init+0xa3c/0xde8
[ 0.000000] [00000000008af978] setup_arch+0x324/0x688
[ 0.000000] [00000000008ae4ec] start_kernel+0x80/0x338
[ 0.000000] [0000000000715b30] tlb_fixup_done+0x88/0x90
[ 0.000000] [0000000000000000] (null)
[ 0.000000] Disabling lock debugging due to kernel taint
[ 0.000000] Caller[00000000008c4314]: of_pdt_build_devicetree+0x90/0xa0
[ 0.000000] Caller[00000000008b0330]: prom_build_devicetree+0x10/0x3c
[ 0.000000] Caller[00000000008b3bb8]: paging_init+0xa3c/0xde8
[ 0.000000] Caller[00000000008af978]: setup_arch+0x324/0x688
[ 0.000000] Caller[00000000008ae4ec]: start_kernel+0x80/0x338
[ 0.000000] Caller[0000000000715b30]: tlb_fixup_done+0x88/0x90
[ 0.000000] Caller[0000000000000000]: (null)
[ 0.000000] Instruction DUMP: 96132080 19004040 94132101 <da020000> 9823400a 808b000b 024ffffd 9

-- 
Meelis Roos (mroos@linux.ee)

^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88
  2012-02-13 9:50 ` Meelis Roos
@ 2012-02-13 9:51 ` Meelis Roos
  2012-02-13 10:35 ` Meelis Roos
  1 sibling, 0 replies; 46+ messages in thread
From: Meelis Roos @ 2012-02-13 9:51 UTC (permalink / raw)
  To: Grant Likely; +Cc: Rob Herring, sparclinux, Linux Kernel list

> Another variation of the crash, without the patch, but backtrace is
> slightly different (strlen) - maybe fixed by the patch, maybe not.

This variation means it's from a different machine - sorry to be
confusing.

-- 
Meelis Roos (mroos@ut.ee) http://www.cs.ut.ee/~mroos/

^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88
  2012-02-13 9:50 ` Meelis Roos
  2012-02-13 9:51 ` Meelis Roos
@ 2012-02-13 10:35 ` Meelis Roos
  1 sibling, 0 replies; 46+ messages in thread
From: Meelis Roos @ 2012-02-13 10:35 UTC (permalink / raw)
  To: Grant Likely; +Cc: Rob Herring, sparclinux, Linux Kernel list

> Another variation of the crash, without the patch, but backtrace is
> slightly different (strlen) - maybe fixed by the patch, maybe not.

Tried this machine with the patch too, same backtrace to strlen.
prtconf below.

> [ 0.000000] Unable to handle kernel NULL pointer dereference
> [ 0.000000] tsk->{mm,active_mm}->context = 0000000000000000
> [ 0.000000] tsk->{mm,active_mm}->pgd = fffff800604ea3a8
> [ 0.000000] \|/ ____ \|/
> [ 0.000000] "@'/ .. \`@"
> [ 0.000000] /_| \__/ |_\
> [ 0.000000] \__U_/
> [ 0.000000] swapper(0): Oops [#1]
> [ 0.000000] TSTATE: 0000004480e01606 TPC: 00000000005be460 TNPC: 00000000005be464 Y: 00000037 Not d
> [ 0.000000] TPC: <strlen+0x60/0xd4>
> [ 0.000000] g0: 000000000000002f g1: 0000000000000001 g2: 0000000000000000 g3: 000000000073a700
> [ 0.000000] g4: 000000000085ea50 g5: 0000000000000000 g6: 0000000000854000 g7: 0030a80000000000
> [ 0.000000] o0: 0000000000000000 o1: 0000000000000000 o2: 0000000001010101 o3: 0000000080808080
> [ 0.000000] o4: 0000000001010000 o5: fffff8006feae140 sp: 00000000008572c1 ret_pc: 0000000000655108
> [ 0.000000] RPC: <of_alias_scan+0x68/0x200>
> [ 0.000000] l0: 00000000008a4380 l1: fffff8006feae6b5 l2: fffff8006feae140 l3: fffff8006fe98e00
> [ 0.000000] l4: 0000000000000000 l5: 0000000000000000 l6: 0000000000000000 l7: 00000000008678d0
> [ 0.000000] i0: 00000000008c3f24 i1: 0000000000896ca0 i2: 00000000008268c0 i3: 00000000008268b8
> [ 0.000000] i4: 00000000008038c8 i5: fffff8006feae5c0 i6: 0000000000857381 i7: 00000000008c4314
> [ 0.000000] I7: <of_pdt_build_devicetree+0x90/0xa0>
> [ 0.000000] Call Trace:
> [ 0.000000] [00000000008c4314] of_pdt_build_devicetree+0x90/0xa0
> [ 0.000000]
[00000000008b0330] prom_build_devicetree+0x10/0x3c > [ 0.000000] [00000000008b3bb8] paging_init+0xa3c/0xde8 > [ 0.000000] [00000000008af978] setup_arch+0x324/0x688 > [ 0.000000] [00000000008ae4ec] start_kernel+0x80/0x338 > [ 0.000000] [0000000000715b30] tlb_fixup_done+0x88/0x90 > [ 0.000000] [0000000000000000] (null) > [ 0.000000] Disabling lock debugging due to kernel taint > [ 0.000000] Caller[00000000008c4314]: of_pdt_build_devicetree+0x90/0xa0 > [ 0.000000] Caller[00000000008b0330]: prom_build_devicetree+0x10/0x3c > [ 0.000000] Caller[00000000008b3bb8]: paging_init+0xa3c/0xde8 > [ 0.000000] Caller[00000000008af978]: setup_arch+0x324/0x688 > [ 0.000000] Caller[00000000008ae4ec]: start_kernel+0x80/0x338 > [ 0.000000] Caller[0000000000715b30]: tlb_fixup_done+0x88/0x90 > [ 0.000000] Caller[0000000000000000]: (null) > [ 0.000000] Instruction DUMP: 96132080 19004040 94132101 <da020000> 9823400a 808b000b 024ffffd 9 System Configuration: Sun Microsystems sun4u Memory size: 1024 Megabytes System Peripherals (PROM Nodes): Node 0xf002a678 .node: f002a678 idprom: 01830003.ba11b371.000003ba.11b37182.00000000.00000000.00000000.00000000 scsi-initiator-id: 00000007 reset-reason: 'S-POR' breakpoint-trap: 0000007f #size-cells: 00000002 model: 'SUNW,375-3015' name: 'SUNW,UltraAX-i2' clock-frequency: 05f5e100 banner-name: 'Sun Fire V100 (UltraSPARC-IIe 500MHz)' compatible: 'sun4u' device_type: 'upa' stick-frequency: 0054c563 Node 0xf002d908 .node: f002d908 name: 'packages' Node 0xf0035e4c .node: f0035e4c iso6429-1983-colors: name: 'terminal-emulator' Node 0xf0038e7c .node: f0038e7c disk-write-fix: name: 'deblocker' Node 0xf00395c4 .node: f00395c4 name: 'obp-tftp' Node 0xf0044b08 .node: f0044b08 name: 'disk-label' Node 0xf0059f74 .node: f0059f74 name: 'SUNW,builtin-drivers' Node 0xf0062644 .node: f0062644 source: '/pci@1f,0/isa@7/flashprom@1f,0:' name: 'dropins' Node 0xf00730e0 .node: f00730e0 name: 'kbd-translator' Node 0xf002d978 .node: f002d978 mmu: fffe7ae0 memory: fffe7ce0 
bootargs: 00 bootpath: '/pci@1f,0/ide@d/disk@2,0:a' stdout: fffbd7b8 stdin: fffbda00 stdout-#lines: ffffffff name: 'chosen' Node 0xf002d9e4 .node: f002d9e4 version: 'OBP 4.0.18 2002/05/23 18:22' model: 'SUNW,4.0' aligned-allocator: relative-addressing: name: 'openprom' Node 0xf002da74 .node: f002da74 name: 'client-services' Node 0xf002db1c .node: f002db1c ras-shutdown-enabled?: 'false' shutdown-temp: '75' warning-temp: '70' env-monitor: 'enabled' diag-passes: '1' diag-continue?: '0' diag-targets: '0' diag-verbosity: '0' keyboard-click?: 'false' keymap: scsi-initiator-id: '7' #power-cycles: '100' system-board-serial#: system-board-date: ttyb-rts-dtr-off: 'false' ttyb-ignore-cd: 'true' ttya-rts-dtr-off: 'false' ttya-ignore-cd: 'true' ttyb-mode: '9600,8,n,1,-' ttya-mode: '9600,8,n,1,-' pci-probe-list: '7,3,c,5,a,d' mfg-mode: 'off' diag-level: 'max' fcode-debug?: 'false' output-device: 'ttya' input-device: 'ttya' load-base: '16384' auto-boot-retry?: 'false' boot-command: 'boot' auto-boot?: 'true' watchdog-reboot?: 'true' diag-file: diag-device: 'disk' boot-file: boot-device: 'disk net' local-mac-address?: 'false' net-timeout: '0' ansi-terminal?: 'true' screen-#columns: '80' screen-#rows: '34' silent-mode?: 'false' use-nvramrc?: 'false' nvramrc: security-mode: 'none' security-password: security-#badlogins: '0' oem-logo: oem-logo?: 'false' oem-banner: oem-banner?: 'false' hardware-revision: last-hardware-update: diag-switch?: 'true' name: 'options' Node 0xf002db8c .node: f002db8c disk: '/pci@1f,0/ide@d/disk@2,0' rtc: '/pci@1f,0/isa@7/rtc@0,70' usb: '/pci@1f,0/usb@a' flash: '/pci@1f,0/isa@7/flashprom@1f,0' lom: '/pci@1f,0/isa@7/SUNW,lomh@0,8010' i2c-nvram: '/pci@1f,0/pmu@3/i2c@0,0/i2c-nvram@0,aa' net1: '/pci@1f,0/ethernet@5' dload1: '/pci@1f,0/ethernet@5:,' dload: '/pci@1f,0/ethernet@c:,' net0: '/pci@1f,0/ethernet@c' net: '/pci@1f,0/ethernet@c' cdrom: '/pci@1f,0/ide@d/cdrom@3,0:f' disk3: '/pci@1f,0/ide@d/disk@3,0' disk2: '/pci@1f,0/ide@d/disk@2,0' disk1: 
'/pci@1f,0/ide@d/disk@1,0' disk0: '/pci@1f,0/ide@d/disk@0,0' ide: '/pci@1f,0/ide@d' floppy: '/pci@1f,0/isa@7/dma/floppy' ttyb: '/pci@1f,0/isa@7/serial@0,2e8' ttya: '/pci@1f,0/isa@7/serial@0,3f8' name: 'aliases' Node 0xf0050050 .node: f0050050 reg: 00000000.00000000.00000000.10000000.00000000.20000000.00000000.10000000.00000000.40000000.00000000.10000000.00000000.60000000.00000000.10000000 available: 00000000.6fec0000.00000000.00006000.00000000.6fe80000.00000000.00030000.00000000.6f000000.00000000.00e00000.00000000.60000000.00000000.0effe000.00000000.40000000.00000000.10000000.00000000.20000000.00000000.10000000.00000000.00000000.00000000.10000000 name: 'memory' Node 0xf0050634 .node: f0050634 translations: 00000000.fffe0000.00000000.00010000.80000000.6fef00b6.00000000.fffdc000.00000000.00004000.80000000.6fee40b6.00000000.fffd4000.00000000.00004000.80000000.6fede0b6.00000000.fffd2000.00000000.00002000.800001fe.0200808e.00000000.fffd0000.00000000.00002000.80000000.6fed60b6.00000000.fffce000.00000000.00002000.800001fe.0200008e.00000000.fffcc000.00000000.00002000.800001fe.0200208e.00000000.fffca000.00000000.00002000.800001fe.0200408e.00000000.fffc8000.00000000.00002000.80000000.6effe0b6.00000000.fffc6000.00000000.00002000.80000000.6fed20b6.00000000.fffc4000.00000000.00002000.80000000.6fedc0b6.00000000.fffc2000.00000000.00002000.800001fe.0200008e.00000000.fffbc000.00000000.00004000.80000000.6fec80b6.00000000.fff82000.00000000.00010000.800001fe.0000008e.00000000.fff7e000.00000000.00004000.80000000.6fed80b6.00000000.f0000000.00000000.00100000.80000000.6ff000b6.00000000.40000000.00000000.04000000.80000000.60000036.00000000.00400000.00000000.01000000.80000000.6000 0036.00000000.00002000.00000000.003fe000.80000000.00002036 existing: 00000000.00000000.00000800.00000000.fffff800.00000000.00000800.00000000 available: 
fffff800.00000000.000007fc.00000000.00000001.00000000.000007ff.00000000.00000000.ffff0000.00000000.0000e000.00000000.00000000.00000000.f0000000.00000000.fffc0000.00000000.00002000.00000000.fff92000.00000000.0002a000.00000000.fff00000.00000000.0007e000.00000000.f0f80000.00000000.0e080000.00000000.f0800000.00000000.00700000 page-size: 00002000 name: 'virtual-memory' Node 0xf0069d48 .node: f0069d48 available: 81000000.00000000.00010230.00000000.00bffdd0.82000000.00000000.00004000.00000000.0003c000.82000000.00000000.000c0000.00000000.00f40000.82000000.00000000.02000000.00000000.5e000000.82000000.00000000.80000000.00000000.40000000.82000000.00000000.e0000000.00000000.10000000 bus-range: 00000000.00000000 interrupt-map: 00006800.00000000.00000000.00000001.f0069d48.0000000c.00005000.00000000.00000000.00000001.f0069d48.00000024.00006000.00000000.00000000.00000001.f0069d48.00000006.00002800.00000000.00000000.00000001.f0069d48.0000001c.00003800.00000000.00000000.00000004.f0069d48.0000002b.00003800.00000000.00000000.00000005.f0069d48.00000023.00003800.00000000.00000000.00000001.f0069d48.0000002a.00001800.00000000.00000000.00000001.f0069d48.00000022 interrupt-map-mask: 00fff800.00000000.00000000.00000007 #interrupt-cells: 00000001 virtual-dma: 60000000.20000000 reg: 000001fe.00000000.00000000.00010000.000001fe.01000000.00000000.00000100 ranges: 00000000.00000000.00000000.000001fe.01000000.00000000.01000000.01000000.00000000.00000000.000001fe.02000000.00000000.01000000.02000000.00000000.00000000.000001ff.00000000.00000001.00000000.03000000.00000000.00000000.000001ff.00000000.00000001.00000000 #virtual-dma-size-cells: 00000001 #virtual-dma-addr-cells: 00000001 clock-frequency: 03ef1480 latency-timer: button-interrupt: no-streaming-cache: 66mhz-capable: interrupts: 00000030.0000002e.0000002f.00000025 upa-portid: 0000001f bus-parity-generated: compatible: 'pci108e,a001' model: 'SUNW,sabre' name: 'pci' device_type: 'pci' #address-cells: 00000003 #size-cells: 00000002 Node 
0xf0073e2c .node: f0073e2c cache-line-size: 00000000 latency-timer: 00000000 #size-cells: 00000001 #address-cells: 00000002 name: 'isa' ranges: 00000000.00000000.81003810.00000000.00000000.00010000.0000001f.00000000.82003814.00000000.f0000000.00080000 reg: 00003800.00000000.00000000.00000000.00000000.81003810.00000000.00000000.00000000.00010000.82003814.00000000.00000000.00000000.00100000 devsel-speed: 00000001 class-code: 00060100 max-latency: 00000000 min-grant: 00000000 subsystem-id: 00001533 subsystem-vendor-id: 000010b9 revision-id: 00000000 device-id: 00001533 vendor-id: 000010b9 Node 0xf00749f4 .node: f00749f4 reg: 00000000.00000000.00010000 interrupts: 00000001 compatible: 'isadma' name: 'dma' Node 0xf0074ccc .node: f0074ccc address: fffce070 reg: 00000000.00000070.00000002 compatible: 'm5819' model: 'm5819' name: 'rtc' Node 0xf009cac4 .node: f009cac4 device_type: 'tod' name: 'todm5819' Node 0xf007583c .node: f007583c compatible: 'acpi-power' button: interrupts: 00000005 reg: 00000000.00002000.00000008 name: 'power' Node 0xf00759d0 .node: f00759d0 reg: 00000000.00008010.00000002 interrupts: 00000001 device_type: 'block' name: 'SUNW,lomh' Node 0xf0076e0c .node: f0076e0c port-a-ignore-cd: nohupcl: 00 interrupt-priorities: 0000000c.0000000c reg: 00000000.000003f8.00000008 compatible: 73753136.35353000.737500 device_type: 'serial' name: 'serial' interrupts: 00000004 Node 0xf0078af8 .node: f0078af8 port-b-ignore-cd: nohupcl: 00 interrupt-priorities: 0000000c.0000000c reg: 00000000.000002e8.00000008 compatible: 73753136.35353000.737500 device_type: 'serial' name: 'serial' interrupts: 00000004 Node 0xf007ac10 .node: f007ac10 model: 'SUNW,258-7883' version: 'CORE 1.0.18 2002/05/23 18:22' name: 'flashprom' reg: 0000001f.00000000.00080000 Node 0xf007b6bc .node: f007b6bc name: 'pmu' ranges: 00000000.00000000.00001800.00000000.00000000.00000100.00000001.00000000.81001810.00000000.00004000.00000100.00000002.00000000.81001814.00000000.00000000.00000100 reg: 
00001800.00000000.00000000.00000000.00000000.81001810.00000000.00004000.00000000.00000010 compatible: 70636931.3062392c.37313031.00706369.636c6173.732c3030.30303030.00 #address-cells: 00000002 #size-cells: 00000001 devsel-speed: 00000001 class-code: 00000000 max-latency: 00000000 min-grant: 00000000 revision-id: 00000000 device-id: 00007101 vendor-id: 000010b9 Node 0xf007be84 .node: f007be84 reg: 00000000.00000000.00000100.00000001.00000000.00000100 #address-cells: 00000002 #size-cells: 00000000 interrupts: 00000001 compatible: 'i2c-smbus' name: 'i2c' Node 0xf007d31c .node: f007d31c compatible: 'i2c-max1617' name: 'temperature' reg: 00000000.00000030 Node 0xf007d48c .node: f007d48c compatible: 'i2c-at34c02' name: 'dimm' reg: 00000000.000000a8 Node 0xf007d544 .node: f007d544 compatible: 'i2c-at34c02' name: 'dimm' reg: 00000000.000000aa Node 0xf007d5fc .node: f007d5fc compatible: 'i2c-at34c02' name: 'dimm' reg: 00000000.000000ac Node 0xf007d6b4 .node: f007d6b4 compatible: 'i2c-at34c02' name: 'dimm' reg: 00000000.000000ae Node 0xf007d76c .node: f007d76c reg: 00000000.000000a0 #address-cells: 00000001 compatible: 'i2c-at24c64' device_type: 'nvram' name: 'i2c-nvram' Node 0xf007e284 .node: f007e284 reg: 00001fd8.00000028 device_type: 'idprom' name: 'idprom' Node 0xf007e538 .node: f007e538 reg: 00000000.000000a2 #address-cells: 00000001 compatible: 'i2c-at24c64' name: 'motherboard-fru' Node 0xf007f0d0 .node: f007f0d0 compatible: 'SUNW,smbus-ppm' name: 'ppm' register-mask: 00000000.00000001 reg: 00000000.000000b3.00000001.80000000.000000ba.00000001.00000000.000000bb.00000001 Node 0xf007f344 .node: f007f344 compatible: 'SUNW,smbus-beep' name: 'beep' reg: 00000000.000000b2.00000001.00000000.000000d3.00000001.00000002.00000042.00000002.00000002.00000061.00000001 Node 0xf007f45c .node: f007f45c compatible: 'SUNW,smbus-fan-control' name: 'fan-control' register-mask: 00000000.00000002 reg: 00000000.000000c8.00000004.80000000.000000ba.00000001 Node 0xf007f660 .node: f007f660 
name: 'lomp' reg: 00001800.00000000.00000000.00000000.00000000.81001810.00004000.00000000.00000000.00000010 Node 0xf007fae8 .node: f007fae8 local-mac-address: 0003ba11.b371 assigned-addresses: 81006010.00000000.00010000.00000000.00000100.82006014.00000000.00000000.00000000.00002000.82006030.00000000.00040000.00000000.00040000 version: '1.0' compatible: 70636934.3535342c.34333465.00706369.31323868.2c393130.32007063.69313238.322c3931.30320070.6369636c.6173732c.30323030.303000 device_type: 'network' subsystem-id: 0000434e subsystem-vendor-id: 00004554 reg: 00006000.00000000.00000000.00000000.00000000.01006010.00000000.00000000.00000000.00000100.02006014.00000000.00000000.00000000.00000100 name: 'ethernet' devsel-speed: 00000001 class-code: 00020000 interrupts: 00000001 max-latency: 00000028 min-grant: 00000014 revision-id: 00000031 device-id: 00009102 vendor-id: 00001282 Node 0xf0089634 .node: f0089634 local-mac-address: 0003ba11.b372 assigned-addresses: 81002810.00000000.00010100.00000000.00000100.82002814.00000000.00002000.00000000.00002000.82002830.00000000.00080000.00000000.00040000 version: '1.0' compatible: 70636934.3535342c.34333465.00706369.31323868.2c393130.32007063.69313238.322c3931.30320070.6369636c.6173732c.30323030.303000 device_type: 'network' subsystem-id: 0000434e subsystem-vendor-id: 00004554 reg: 00002800.00000000.00000000.00000000.00000000.01002810.00000000.00000000.00000000.00000100.02002814.00000000.00000000.00000000.00000100 name: 'ethernet' devsel-speed: 00000001 class-code: 00020000 interrupts: 00000001 max-latency: 00000028 min-grant: 00000014 revision-id: 00000031 device-id: 00009102 vendor-id: 00001282 Node 0xf0093180 .node: f0093180 assigned-addresses: 82005010.00000000.01000000.00000000.01000000 sunw,find-fcode: f009838c maximum-frame#: 0000ffff reg: 00005000.00000000.00000000.00000000.00000000.02005010.00000000.00000000.00000000.01000000 #size-cells: 00000000 #address-cells: 00000001 compatible: 
70636931.3062392c.35323337.2e330070.63693130.62392c35.32333700.70636963.6c617373.2c306330.33313000.70636963.6c617373.2c306330.3300 name: 'usb' fast-back-to-back: devsel-speed: 00000001 class-code: 000c0310 interrupts: 00000001 max-latency: 00000050 min-grant: 00000000 revision-id: 00000003 device-id: 00005237 vendor-id: 000010b9 Node 0xf0098ff8 .node: f0098ff8 assigned-addresses: 81006810.00000000.00010200.00000000.00000008.81006814.00000000.00010218.00000000.00000008.81006818.00000000.00010210.00000000.00000008.8100681c.00000000.00010208.00000000.00000008.81006820.00000000.00010220.00000000.00000010 reg: 00006800.00000000.00000000.00000000.00000000.01006810.00000000.00000000.00000000.00000008.01006814.00000000.00000000.00000000.00000004.01006818.00000000.00000000.00000000.00000008.0100681c.00000000.00000000.00000000.00000004.01006820.00000000.00000000.00000000.00000010 compatible: 70636931.3062392c.35323239.00706369.636c6173.732c3031.30316666.00 #address-cells: 00000002 device_type: 'ide' name: 'ide' fast-back-to-back: devsel-speed: 00000001 class-code: 000101ff interrupts: 00000001 max-latency: 00000004 min-grant: 00000002 revision-id: 000000c3 device-id: 00005229 vendor-id: 000010b9 Node 0xf009b86c .node: f009b86c device_type: 'block' name: 'disk' compatible: 'ide-disk' Node 0xf009bf18 .node: f009bf18 device_type: 'block' name: 'cdrom' compatible: 'ide-cdrom' Node 0xf0072d50 .node: f0072d50 manufacturer#: 00000017 implementation#: 00000013 mask#: 00000014 ecache-size: 00040000 clock-frequency: 1dcd6500 name: 'SUNW,UltraSPARC-IIe' sparc-version: 00000009 ecache-associativity: 00000001 ecache-line-size: 00000040 #dtlb-entries: 00000040 dcache-associativity: 00000001 dcache-line-size: 00000020 dcache-size: 00004000 #itlb-entries: 00000040 icache-associativity: 00000002 icache-line-size: 00000020 icache-size: 00004000 upa-portid: 00000000 reg: 000001c0.00000000.00000000.00000008 device_type: 'cpu' -- Meelis Roos (mroos@ut.ee) http://www.cs.ut.ee/~mroos/ ^ permalink 
raw reply [flat|nested] 46+ messages in thread
* [tip:core/urgent] memblock: Fix size aligning of memblock_alloc_base_nid()
  2012-02-13  7:45 ` OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88 Meelis Roos
  2012-02-13  8:06   ` Grant Likely
@ 2012-03-01 12:24   ` tip-bot for Tejun Heo
  1 sibling, 0 replies; 46+ messages in thread
From: tip-bot for Tejun Heo @ 2012-03-01 12:24 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, grant.likely, hpa, mingo, torvalds, davem, rob.herring, akpm, tj, mroos, tglx, mingo

Commit-ID:  847854f5988a04fe7e02d2fdd4fa0df9f96360fe
Gitweb:     http://git.kernel.org/tip/847854f5988a04fe7e02d2fdd4fa0df9f96360fe
Author:     Tejun Heo <tj@kernel.org>
AuthorDate: Wed, 29 Feb 2012 05:56:21 +0900
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Thu, 1 Mar 2012 10:53:18 +0100

memblock: Fix size aligning of memblock_alloc_base_nid()

memblock allocator aligns @size to @align to reduce the amount
of fragmentation.  Commit:

  7bd0b0f0da ("memblock: Reimplement memblock allocation using reverse free area iterator")

Broke it by incorrectly relocating @size aligning to
memblock_find_in_range_node().  As the aligned size is not
propagated back to memblock_alloc_base_nid(), the actually
reserved size isn't aligned.

While this increases memory use for memblock reserved array,
this shouldn't cause any critical failure; however, it seems
that the size aligning was hiding a use-beyond-allocation bug in
sparc64 and losing the aligning causes boot failure.

The underlying problem is currently being debugged but this is a
proper fix in itself, it's already pretty late in -rc cycle for
boot failures and reverting the change for debugging isn't
difficult. Restore the size aligning moving it to
memblock_alloc_base_nid().

Reported-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20120228205621.GC3252@dhcp-172-17-108-109.mtv.corp.google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <alpine.SOC.1.00.1202130942030.1488@math.ut.ee>
---
 mm/memblock.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/memblock.c b/mm/memblock.c
index 77b5f22..99f2855 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -99,9 +99,6 @@ phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t start,
 	phys_addr_t this_start, this_end, cand;
 	u64 i;
 
-	/* align @size to avoid excessive fragmentation on reserved array */
-	size = round_up(size, align);
-
 	/* pump up @end */
 	if (end == MEMBLOCK_ALLOC_ACCESSIBLE)
 		end = memblock.current_limit;
@@ -731,6 +728,9 @@ static phys_addr_t __init memblock_alloc_base_nid(phys_addr_t size,
 {
 	phys_addr_t found;
 
+	/* align @size to avoid excessive fragmentation on reserved array */
+	size = round_up(size, align);
+
 	found = memblock_find_in_range_node(0, max_addr, size, align, nid);
 	if (found && !memblock_reserve(found, size))
 		return found;
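For anyone reading the archive later, here is a minimal userspace sketch of why moving the round_up() into the find function lost the alignment of the reserved size. It is not kernel code: toy_find(), toy_reserve(), toy_alloc_broken(), toy_alloc_fixed(), TOY_LIMIT and the toy_reserved_* globals are hypothetical stand-ins invented for illustration, and ROUND_UP is only modeled on the kernel's round_up() (it assumes a power-of-two alignment). The point it tries to show is that @size is passed by value, so rounding it inside the find helper never reaches the reserve call in the caller.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;

/* Modeled on the kernel's round_up(); only valid when a is a power of two. */
#define ROUND_UP(x, a) ((((x) - 1) | ((phys_addr_t)(a) - 1)) + 1)

/* Hypothetical upper bound of usable memory in this toy model. */
#define TOY_LIMIT ((phys_addr_t)0x10000)

/* Last range handed to toy_reserve(), kept so a caller can inspect it. */
phys_addr_t toy_reserved_base, toy_reserved_size;

/* Hypothetical stand-in for memblock_reserve(): records the range, returns 0. */
int toy_reserve(phys_addr_t base, phys_addr_t size)
{
	toy_reserved_base = base;
	toy_reserved_size = size;
	return 0;
}

/*
 * Hypothetical stand-in for memblock_find_in_range_node(): returns the
 * highest aligned address below TOY_LIMIT that fits size bytes, or 0.
 * With align_inside set it rounds size up itself, as the code did after
 * 7bd0b0f0da. size is a by-value parameter, so the caller's copy is
 * left untouched.
 */
phys_addr_t toy_find(phys_addr_t size, phys_addr_t align, int align_inside)
{
	assert(align && (align & (align - 1)) == 0);
	if (align_inside)
		size = ROUND_UP(size, align);
	if (size > TOY_LIMIT)
		return 0;
	return (TOY_LIMIT - size) & ~(align - 1);
}

/* Rounding happens only inside toy_find(): the reserved size stays unaligned. */
phys_addr_t toy_alloc_broken(phys_addr_t size, phys_addr_t align)
{
	phys_addr_t found = toy_find(size, align, 1);

	if (found && !toy_reserve(found, size))
		return found;
	return 0;
}

/* Rounding happens in the caller, so find and reserve see the same size. */
phys_addr_t toy_alloc_fixed(phys_addr_t size, phys_addr_t align)
{
	phys_addr_t found;

	size = ROUND_UP(size, align);
	found = toy_find(size, align, 0);
	if (found && !toy_reserve(found, size))
		return found;
	return 0;
}
```

With size 100 and align 64, both variants return the same address in this model, but toy_alloc_broken() records a 100-byte reservation while toy_alloc_fixed() records 128 bytes. Whether that difference is what the sparc64 code was depending on is exactly what the thread was still debugging, so the sketch makes no claim about it.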
* Re: [PATCH v3.3-rc5] memblock: Fix size aligning of memblock_alloc_base_nid()
  2012-02-28 20:56 [PATCH v3.3-rc5] memblock: Fix size aligning of memblock_alloc_base_nid() Tejun Heo
  2012-02-13  7:45 ` OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88 Meelis Roos
@ 2012-02-28 22:16 ` Sam Ravnborg
  1 sibling, 0 replies; 46+ messages in thread
From: Sam Ravnborg @ 2012-02-28 22:16 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Ingo Molnar, H. Peter Anvin, David S. Miller, linux-kernel, Meelis Roos, Grant Likely, Rob Herring, sparclinux

On Wed, Feb 29, 2012 at 05:56:21AM +0900, Tejun Heo wrote:
> memblock allocator aligns @size to @align to reduce the amount of
> fragmentation.  7bd0b0f0da "memblock: Reimplement memblock allocation
> using reverse free area iterator" broke it by incorrectly relocating
> @size aligning to memblock_find_in_range_node().  As the aligned size
> is not propagated back to memblock_alloc_base_nid(), the actually
> reserved size isn't aligned.
>
> While this increases memory use for memblock reserved array, this
> shouldn't cause any critical failure; however, it seems that the size
> aligning was hiding a use-beyond-allocation bug in sparc64 and losing
> the aligning causes boot failure.
>
> The underlying problem is currently being debugged but this is a
> proper fix in itself, it's already pretty late in -rc cycle for boot
> failures and reverting the change for debugging isn't difficult.
> Restore the size aligning moving it to memblock_alloc_base_nid().
>
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Reported-by: Meelis Roos <mroos@linux.ee>
> Reported-by: Sam Ravnborg <sam@ravnborg.org>

Actually not :-(
I only fooled around with some clueless suggestions - I do not have any sparc64 boxes.
And my sparc32 box that is alive atm, does not exhibit this problem.

	Sam
end of thread, other threads:[~2012-03-01 12:25 UTC | newest]

Thread overview: 46+ messages
2012-02-28 20:56 [PATCH v3.3-rc5] memblock: Fix size aligning of memblock_alloc_base_nid() Tejun Heo
2012-02-13 7:45 ` OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88 Meelis Roos
2012-02-13 8:06 ` Grant Likely
2012-02-13 9:20 ` Meelis Roos
2012-02-13 21:46 ` Grant Likely
2012-02-14 0:58 ` David Miller
2012-02-14 2:30 ` Grant Likely
2012-02-14 2:41 ` Grant Likely
2012-02-16 21:08 ` mroos
2012-02-14 5:54 ` mroos
2012-02-16 19:53 ` Meelis Roos
2012-02-16 21:23 ` Sam Ravnborg
2012-02-20 9:11 ` Meelis Roos
2012-02-20 17:06 ` Tejun Heo
2012-02-20 20:04 ` Meelis Roos
2012-02-20 21:01 ` Tejun Heo
2012-02-20 22:32 ` Meelis Roos
2012-02-21 1:05 ` Tejun Heo
2012-02-22 0:36 ` Meelis Roos
2012-02-22 17:48 ` Tejun Heo
2012-02-22 18:25 ` Meelis Roos
2012-02-23 18:55 ` Tejun Heo
2012-02-23 23:31 ` David Miller
2012-02-24 9:20 ` Meelis Roos
2012-02-27 17:17 ` Meelis Roos
2012-02-27 19:43 ` Sam Ravnborg
2012-02-27 21:25 ` Meelis Roos
2012-02-27 21:30 ` David Miller
2012-02-28 21:10 ` David Miller
2012-02-28 21:36 ` Meelis Roos
2012-02-28 22:56 ` David Miller
2012-02-29 6:15 ` Meelis Roos
2012-02-29 6:27 ` David Miller
2012-02-22 20:44 ` David Miller
2012-02-22 21:00 ` Tejun Heo
2012-02-22 18:22 ` Richard Mortimer
2012-02-22 20:26 ` David Miller
2012-02-22 17:03 ` Sam Ravnborg
2012-02-22 17:12 ` Meelis Roos
2012-02-22 17:21 ` Sam Ravnborg
2012-02-22 17:41 ` Meelis Roos
2012-02-13 9:50 ` Meelis Roos
2012-02-13 9:51 ` Meelis Roos
2012-02-13 10:35 ` Meelis Roos
2012-03-01 12:24 ` [tip:core/urgent] memblock: Fix size aligning of memblock_alloc_base_nid() tip-bot for Tejun Heo
2012-02-28 22:16 ` [PATCH v3.3-rc5] " Sam Ravnborg