> On 13-Mar-2020, at 5:05 PM, Vlastimil Babka wrote:
>
> On 3/13/20 12:12 PM, Srikar Dronamraju wrote:
>> * Michael Ellerman [2020-03-13 21:48:06]:
>>
>>> Sachin Sant writes:
>>>>> The patch below might work. Sachin can you test this? I tried faking up
>>>>> a system with a memoryless node zero but couldn't get it to even start
>>>>> booting.
>>>>>
>>>> The patch did not help. The kernel crashed during
>>>> the boot with the same call trace.
>>>>
>>>> BUG_ON() introduced with the patch was not triggered.
>>>
>>> OK, that's weird.
>>>
>>> I eventually managed to get a memoryless node going in sim, and it
>>> appears to work there.
>>>
>>> eg in dmesg:
>>>
>>> [ 0.000000][ T0] numa: NODE_DATA [mem 0x2000fffa2f80-0x2000fffa7fff]
>>> [ 0.000000][ T0] numa: NODE_DATA(0) on node 1
>>> [ 0.000000][ T0] numa: NODE_DATA [mem 0x2000fff9df00-0x2000fffa2f7f]
>>> ...
>>> [ 0.000000][ T0] Early memory node ranges
>>> [ 0.000000][ T0] node 1: [mem 0x0000000000000000-0x00000000ffffffff]
>>> [ 0.000000][ T0] node 1: [mem 0x0000200000000000-0x00002000ffffffff]
>>> [ 0.000000][ T0] Could not find start_pfn for node 0
>>> [ 0.000000][ T0] Initmem setup node 0 [mem 0x0000000000000000-0x0000000000000000]
>>> [ 0.000000][ T0] On node 0 totalpages: 0
>>> [ 0.000000][ T0] Initmem setup node 1 [mem 0x0000000000000000-0x00002000ffffffff]
>>> [ 0.000000][ T0] On node 1 totalpages: 131072
>>>
>>> # dmesg | grep set_numa
>>> [ 0.000000][ T0] set_numa_mem: mem node for 0 = 1
>>> [ 0.005654][ T0] set_numa_mem: mem node for 1 = 1
>>>
>>> So is the problem more than just node zero having no memory?
>>>

I tried with just the patch Michael suggested on top of March 13 next tree.
I still see the same failure.
Here is a snippet from the log

[ 0.000000] numa: NODE_DATA [mem 0x8bfedc900-0x8bfee3fff]
[ 0.000000] numa: NODE_DATA(0) on node 1
[ 0.000000] numa: NODE_DATA [mem 0x8bfed5200-0x8bfedc8ff]
[ 0.000000] rfi-flush: fallback displacement flush available
[ 0.000000] rfi-flush: mttrig type flush available
[ 0.000000] link-stack-flush: software flush enabled.
[ 0.000000] count-cache-flush: software flush disabled.
[ 0.000000] stf-barrier: eieio barrier available
[ 0.000000] lpar: H_BLOCK_REMOVE supports base psize:0 psize:0 block size:8
[ 0.000000] lpar: H_BLOCK_REMOVE supports base psize:0 psize:2 block size:8
[ 0.000000] lpar: H_BLOCK_REMOVE supports base psize:0 psize:10 block size:8
[ 0.000000] lpar: H_BLOCK_REMOVE supports base psize:2 psize:2 block size:8
[ 0.000000] lpar: H_BLOCK_REMOVE supports base psize:2 psize:10 block size:8
[ 0.000000] PPC64 nvram contains 15360 bytes
[ 0.000000] barrier-nospec: using ORI speculation barrier
[ 0.000000] Zone ranges:
[ 0.000000] Normal [mem 0x0000000000000000-0x00000008bfffffff]
[ 0.000000] Device empty
[ 0.000000] Movable zone start for each node
[ 0.000000] Early memory node ranges
[ 0.000000] node 1: [mem 0x0000000000000000-0x00000008bfffffff]
[ 0.000000] Could not find start_pfn for node 0
[ 0.000000] Initmem setup node 0 [mem 0x0000000000000000-0x0000000000000000]
[ 0.000000] Initmem setup node 1 [mem 0x0000000000000000-0x00000008bfffffff]
[ 0.000000] percpu: Embedded 11 pages/cpu s624024 r0 d96872 u1048576
[ 0.000000] Built 2 zonelists, mobility grouping on. Total pages: 572880

Have attached the complete boot log.

Thanks
-Sachin