linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
From: Michael Ellerman <mpe@ellerman.id.au>
To: Michal Hocko <mhocko@kernel.org>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Paul Mackerras <paulus@samba.org>
Cc: Sachin Sant <sachinp@linux.vnet.ibm.com>,
	Pekka Enberg <penberg@kernel.org>,
	Linux-Next Mailing List <linux-next@vger.kernel.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	David Rientjes <rientjes@google.com>,
	Christopher Lameter <cl@linux.com>,
	linuxppc-dev@lists.ozlabs.org,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Kirill Tkhai <ktkhai@virtuozzo.com>
Subject: Re: [5.6.0-rc2-next-20200218/powerpc] Boot failure on POWER9
Date: Thu, 12 Mar 2020 23:18:42 +1100	[thread overview]
Message-ID: <87a74lix5p.fsf@mpe.ellerman.id.au> (raw)
In-Reply-To: <20200310150114.GO8447@dhcp22.suse.cz>

Michal Hocko <mhocko@kernel.org> writes:
> On Thu 27-02-20 19:26:54, Michal Hocko wrote:
>> [Cc ppc maintainers]
> [...]
>> Please have a look at http://lkml.kernel.org/r/52EF4673-7292-4C4C-B459-AF583951BA48@linux.vnet.ibm.com
>> for the boot log with the debugging patch which tracks set_numa_mem.
>> This seems to lead to a crash in the slab allocator bebcause
>> node_to_mem_node(0) for memory less node resolves to the memory less
>> node http://lkml.kernel.org/r/dd450314-d428-6776-af07-f92c04c7b967@suse.cz.
>> The original report is http://lkml.kernel.org/r/3381CD91-AB3D-4773-BA04-E7A072A63968@linux.vnet.ibm.com
>
> ping 

The obvious fix is:

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 37c12e3bab9e..33b1fca0b258 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -892,6 +892,7 @@ void smp_prepare_boot_cpu(void)
 	paca_ptrs[boot_cpuid]->__current = current;
 #endif
 	set_numa_node(numa_cpu_lookup_table[boot_cpuid]);
+	set_numa_mem(local_memory_node(numa_cpu_lookup_table[boot_cpuid]));
 	current_set[boot_cpuid] = current;
 }


But that doesn't work because smp_prepare_boot_cpu() is called too
early:

asmlinkage __visible void __init start_kernel(void)
{
	...
	smp_prepare_boot_cpu();	/* arch-specific boot-cpu hooks */
	boot_cpu_hotplug_init();

	build_all_zonelists(NULL);


And local_memory_node() uses first_zones_zonelist() which doesn't work
prior to build_all_zonelists() being called.


The patch below might work. Sachin can you test this? I tried faking up
a system with a memoryless node zero but couldn't get it to even start
booting.

cheers


diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index 9b4f5fb719e0..d1f11437f6c4 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -282,6 +282,9 @@ void __init mem_init(void)
 	 */
 	BUILD_BUG_ON(MMU_PAGE_COUNT > 16);
 
+	BUG_ON(smp_processor_id() != boot_cpuid);
+	set_numa_mem(local_memory_node(numa_cpu_lookup_table[boot_cpuid]));
+
 #ifdef CONFIG_SWIOTLB
 	/*
 	 * Some platforms (e.g. 85xx) limit DMA-able memory way below

  reply	other threads:[~2020-03-12 12:22 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-18 10:45 [5.6.0-rc2-next-20200218/powerpc] Boot failure on POWER9 Sachin Sant
2020-02-18 10:50 ` Kirill Tkhai
2020-02-18 11:01   ` Kirill Tkhai
2020-02-18 11:35     ` Kirill Tkhai
2020-02-18 11:40     ` Sachin Sant
2020-02-18 11:55       ` Michal Hocko
2020-02-18 14:00         ` Sachin Sant
2020-02-18 14:26           ` Michal Hocko
2020-02-18 15:11             ` Sachin Sant
2020-02-18 15:24               ` Michal Hocko
2020-02-22  3:38                 ` Christopher Lameter
2020-02-24  8:58                   ` Michal Hocko
2020-02-26 18:25                     ` Christopher Lameter
2020-02-26 18:41                       ` Michal Hocko
2020-02-26 18:44                         ` Christopher Lameter
2020-02-26 19:01                           ` Michal Hocko
2020-02-26 20:31                             ` David Rientjes
2020-02-26 20:52                               ` Michal Hocko
2020-02-26 21:45                         ` Vlastimil Babka
2020-02-26 22:29                           ` Vlastimil Babka
2020-02-27 12:12                             ` Michal Hocko
2020-02-27 16:00                               ` Sachin Sant
2020-02-27 16:16                                 ` Vlastimil Babka
2020-02-27 18:26                                   ` Michal Hocko
2020-03-10 15:01                                     ` Michal Hocko
2020-03-12 12:18                                       ` Michael Ellerman [this message]
2020-03-12 16:51                                         ` Sachin Sant
2020-03-13 10:48                                           ` Michael Ellerman
2020-03-13 11:12                                             ` Srikar Dronamraju
2020-03-13 11:35                                               ` Vlastimil Babka
2020-03-14  8:10                                                 ` Sachin Sant
2020-02-27 12:02                           ` Michal Hocko
2020-02-18 11:38   ` Sachin Sant
2020-02-18 11:53     ` Kirill Tkhai
2020-03-17 13:17 ` [PATCH 0/4] Fix kmalloc_node on offline nodes Srikar Dronamraju
2020-03-17 13:17   ` [PATCH 1/4] mm: Check for node_online in node_present_pages Srikar Dronamraju
2020-03-17 13:37     ` Srikar Dronamraju
2020-03-17 13:17   ` [PATCH 2/4] mm/slub: Use mem_node to allocate a new slab Srikar Dronamraju
2020-03-17 13:34     ` Vlastimil Babka
2020-03-17 13:45       ` Srikar Dronamraju
2020-03-17 13:53         ` Vlastimil Babka
2020-03-17 14:51           ` Srikar Dronamraju
2020-03-17 15:29             ` Vlastimil Babka
2020-03-18  7:29               ` Srikar Dronamraju
2020-03-17 16:41       ` Srikar Dronamraju
2020-03-17 13:17   ` [PATCH 3/4] mm: Implement reset_numa_mem Srikar Dronamraju
2020-03-17 13:17   ` [PATCH 4/4] powerpc/numa: Set fallback nodes for offline nodes Srikar Dronamraju
2020-03-17 14:22     ` Bharata B Rao
2020-03-17 14:29       ` Srikar Dronamraju

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87a74lix5p.fsf@mpe.ellerman.id.au \
    --to=mpe@ellerman.id.au \
    --cc=benh@kernel.crashing.org \
    --cc=cl@linux.com \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=ktkhai@virtuozzo.com \
    --cc=linux-next@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mhocko@kernel.org \
    --cc=paulus@samba.org \
    --cc=penberg@kernel.org \
    --cc=rientjes@google.com \
    --cc=sachinp@linux.vnet.ibm.com \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).