Subject: Re: [5.6.0-rc2-next-20200218/powerpc] Boot failure on POWER9
To: Sachin Sant, Michal Hocko
Cc: Pekka Enberg, Linux-Next Mailing List, David Rientjes,
 Christopher Lameter, linuxppc-dev@lists.ozlabs.org, Joonsoo Kim,
 Kirill Tkhai
References: <20200218142620.GF4151@dhcp22.suse.cz>
 <35EE65CF-40E3-4870-AEBC-D326977176DA@linux.vnet.ibm.com>
 <20200218152441.GH4151@dhcp22.suse.cz>
 <20200224085812.GB22443@dhcp22.suse.cz>
 <20200226184152.GQ3771@dhcp22.suse.cz>
 <20200227121214.GE3771@dhcp22.suse.cz>
 <52EF4673-7292-4C4C-B459-AF583951BA48@linux.vnet.ibm.com>
From: Vlastimil Babka
Message-ID: <9a86f865-50b5-7483-9257-dbb08fecd62b@suse.cz>
Date: Thu, 27 Feb 2020 17:16:41 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.5.0
MIME-Version: 1.0
In-Reply-To: <52EF4673-7292-4C4C-B459-AF583951BA48@linux.vnet.ibm.com>
Content-Type: text/plain; charset=windows-1252
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Sender: linux-next-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-next@vger.kernel.org

On 2/27/20 5:00 PM, Sachin Sant wrote:
>
>
>> On 27-Feb-2020, at 5:42 PM, Michal Hocko wrote:
>>
>> A very good hint indeed. I would do this
>> diff --git a/include/linux/topology.h b/include/linux/topology.h
>> index eb2fe6edd73c..d9f1b6737e4d 100644
>> --- a/include/linux/topology.h
>> +++ b/include/linux/topology.h
>> @@ -137,6 +137,8 @@ static inline void set_numa_mem(int node)
>>  {
>>          this_cpu_write(_numa_mem_, node);
>>          _node_numa_mem_[numa_node_id()] = node;
>> +        pr_info("%s %d -> %d\n", __FUNCTION__, numa_node_id(), node);
>> +        dump_stack();
>>  }
>>  #endif
>>
>> Btw. it would be also helpful to get
>> `faddr2line ___slab_alloc+0x334' from your kernel Sachin.
>
> [linux-next]# ./scripts/faddr2line ./vmlinux ___slab_alloc+0x334
> ___slab_alloc+0x334/0x760:
> new_slab_objects at mm/slub.c:2478
> (inlined by) ___slab_alloc at mm/slub.c:2628
> [linux-next]#

Hmm that doesn't look relevant, but that address was marked as unreliable, no?
Don't we actually need this one?

[    8.768727] NIP [c0000000003d55f4] ___slab_alloc+0x1f4/0x760

> I have also attached boot log with a kernel that include about change.
> I see the following o/p during boot:
>
> [    0.005269] set_numa_mem 1 -> 1

So there's no "set_numa_mem 0 -> X", specifically not
"set_numa_mem 0 -> 1" which I would have expected. That seems to confirm my
suspicion that the arch code doesn't set up the memoryless node 0 properly.

> [    0.005270] CPU: 12 PID: 0 Comm: swapper/12 Not tainted 5.6.0-rc3-next-20200227-autotest+ #6
> [    0.005271] Call Trace:
> [    0.005272] [c0000008b37dfe80] [c000000000b5d948] dump_stack+0xbc/0x104 (unreliable)
> [    0.005274] [c0000008b37dfec0] [c000000000059320] start_secondary+0x600/0x6e0
> [    0.005277] [c0000008b37dff90] [c00000000000ac54] start_secondary_prolog+0x10/0x14
>
> Thanks
> -Sachin
>
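
To illustrate why a missing "set_numa_mem 0 -> 1" matters, here is a small
userspace sketch (not kernel code). It assumes, as the boot log suggests, that
node 0 is memoryless and node 1 has memory. All model_* names and
MODEL_MAX_NODES are hypothetical stand-ins; only the table-write pattern is
taken from the set_numa_mem() hunk quoted above. Because the table is static,
an entry that nobody writes stays 0, so a lookup for node 0 would hand back
node 0 itself.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical node count, kept small for the illustration. */
#define MODEL_MAX_NODES 2

/*
 * Userspace stand-in for the per-node table that set_numa_mem() writes
 * in the quoted diff (_node_numa_mem_[]). Static storage, so every
 * entry starts out as 0.
 */
static int model_node_numa_mem[MODEL_MAX_NODES];

/* Model of set_numa_mem(): record mem_node as the memory node for cpu_node. */
static void model_set_numa_mem(int cpu_node, int mem_node)
{
    assert(cpu_node >= 0 && cpu_node < MODEL_MAX_NODES);
    model_node_numa_mem[cpu_node] = mem_node;
    /* Same message format as the debug pr_info() in the quoted diff. */
    printf("set_numa_mem %d -> %d\n", cpu_node, mem_node);
}

/* Model of a lookup that trusts the table without checking for memory. */
static int model_node_to_mem_node(int node)
{
    assert(node >= 0 && node < MODEL_MAX_NODES);
    return model_node_numa_mem[node];
}
```

In this model, calling model_set_numa_mem(1, 1) and nothing else mirrors the
single line seen in the boot log. A later model_node_to_mem_node(0) then
returns 0, i.e. the memoryless node, rather than node 1. Whether the real
allocator path reads the table in exactly this way is an assumption here.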