Date: Wed, 31 Jul 2019 16:41:14 +0200
From: Michal Hocko
To: Mike Rapoport
Cc: Hoan Tran OS, Will Deacon, Catalin Marinas, Heiko Carstens,
    "open list:MEMORY MANAGEMENT", Paul Mackerras, "H . Peter Anvin",
    "sparclinux@vger.kernel.org", Alexander Duyck,
    "linux-s390@vger.kernel.org", Michael Ellerman, "x86@kernel.org",
    Christian Borntraeger, Ingo Molnar, Vlastimil Babka,
    Benjamin Herrenschmidt, Open Source Submission, Pavel Tatashin,
    Vasily Gorbik, Will Deacon, Borislav Petkov, Thomas Gleixner,
    "linux-arm-kernel@lists.infradead.org", Oscar Salvador,
    "linux-kernel@vger.kernel.org", Andrew Morton,
    "linuxppc-dev@lists.ozlabs.org", "David S . Miller",
    "willy@infradead.org"
Subject: Re: microblaze HAVE_MEMBLOCK_NODE_MAP dependency (was Re: [PATCH v2
    0/5] mm: Enable CONFIG_NODES_SPAN_OTHER_NODES by default for NUMA)
Message-ID: <20190731144114.GY9330@dhcp22.suse.cz>
References: <20190712150007.GU29483@dhcp22.suse.cz>
    <730368c5-1711-89ae-e3ef-65418b17ddc9@os.amperecomputing.com>
    <20190730081415.GN9330@dhcp22.suse.cz>
    <20190731062420.GC21422@rapoport-lnx>
    <20190731080309.GZ9330@dhcp22.suse.cz>
    <20190731111422.GA14538@rapoport-lnx>
    <20190731114016.GI9330@dhcp22.suse.cz>
    <20190731122631.GB14538@rapoport-lnx>
    <20190731130037.GN9330@dhcp22.suse.cz>
    <20190731142129.GA24998@rapoport-lnx>
In-Reply-To: <20190731142129.GA24998@rapoport-lnx>
User-Agent: Mutt/1.10.1 (2018-07-13)
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed 31-07-19 17:21:29, Mike Rapoport wrote:
> On Wed, Jul 31, 2019 at 03:00:37PM +0200, Michal Hocko wrote:
> > On Wed 31-07-19 15:26:32, Mike Rapoport wrote:
> > > On Wed, Jul 31, 2019 at 01:40:16PM +0200, Michal Hocko wrote:
> > > > On Wed 31-07-19 14:14:22, Mike Rapoport wrote:
> > > > > On Wed, Jul 31, 2019 at 10:03:09AM +0200, Michal Hocko wrote:
> > > > > > On Wed 31-07-19 09:24:21, Mike Rapoport wrote:
> > > > > > > [ sorry for a late reply too, somehow I missed this thread before ]
> > > > > > >
> > > > > > > On Tue, Jul 30, 2019 at 10:14:15AM +0200, Michal Hocko wrote:
> > > > > > > > [Sorry for a late reply]
> > > > > > > >
> > > > > > > > On Mon 15-07-19 17:55:07, Hoan Tran OS wrote:
> > > > > > > > > Hi,
> > > > > > > > >
> > > > > > > > > On 7/12/19 10:00 PM, Michal Hocko wrote:
> > > > > > > > [...]
> > > > > > > > > > Hmm, I thought this was selectable. But I am obviously wrong here.
> > > > > > > > > > Looking more closely, it seems that this is indeed only about
> > > > > > > > > > __early_pfn_to_nid and as such not something that should add a config
> > > > > > > > > > symbol. This should have been called out in the changelog though.
> > > > > > > > >
> > > > > > > > > Yes, do you have any other comments about my patch?
> > > > > > > >
> > > > > > > > Not really. Just make sure to explicitly state that
> > > > > > > > CONFIG_NODES_SPAN_OTHER_NODES is only about __early_pfn_to_nid and that
> > > > > > > > doesn't really deserve its own config and can be pulled under NUMA.
> > > > > > > >
> > > > > > > > > > Also while at it, does HAVE_MEMBLOCK_NODE_MAP fall into a similar
> > > > > > > > > > bucket? Do we have any NUMA architecture that doesn't enable it?
> > > > > > > > > >
> > > > > > >
> > > > > > > HAVE_MEMBLOCK_NODE_MAP makes huge difference in node/zone initialization
> > > > > > > sequence so it's not only about a single function.
> > > > > >
> > > > > > The question is whether we want to have this as a config option or enable
> > > > > > it unconditionally for each NUMA system.
> > > > >
> > > > > We can make it 'default NUMA', but we can't drop it completely because
> > > > > microblaze uses sparse_memory_present_with_active_regions() which is
> > > > > unavailable when HAVE_MEMBLOCK_NODE_MAP=n.
> > > >
> > > > I suppose you mean that microblaze is using
> > > > sparse_memory_present_with_active_regions even without CONFIG_NUMA,
> > > > right?
> > >
> > > Yes.
> > >
> > > > I have to confess I do not understand that code. What is the deal
> > > > with setting node id there?
> > >
> > > The sparse_memory_present_with_active_regions() iterates over
> > > memblock.memory regions and uses the node id of each region as the
> > > parameter to memory_present(). The assumption here is that sometime before
> > > each region was assigned a proper non-negative node id.
> > >
> > > microblaze uses device tree for memory enumeration and the current FDT code
> > > does memblock_add() that implicitly sets nid in memblock.memory regions to -1.
> > >
> > > So in order to have proper node id passed to memory_present() microblaze
> > > has to call memblock_set_node() before it can use
> > > sparse_memory_present_with_active_regions().
> >
> > I am sorry, but I still do not follow. Who is consuming that node id
> > information when NUMA=n? In other words why cannot we simply do
>
> We can, I think nobody cared to change it.

It would be great if somebody with the actual HW could try it out.
I can throw a patch but I do not even have a cross compiler in my
toolbox.

>
> > diff --git a/arch/microblaze/mm/init.c b/arch/microblaze/mm/init.c
> > index a015a951c8b7..3a47e8db8d1c 100644
> > --- a/arch/microblaze/mm/init.c
> > +++ b/arch/microblaze/mm/init.c
> > @@ -175,14 +175,9 @@ void __init setup_memory(void)
> >
> >  		start_pfn = memblock_region_memory_base_pfn(reg);
> >  		end_pfn = memblock_region_memory_end_pfn(reg);
> > -		memblock_set_node(start_pfn << PAGE_SHIFT,
> > -				  (end_pfn - start_pfn) << PAGE_SHIFT,
> > -				  &memblock.memory, 0);
> > +		memory_present(0, start_pfn << PAGE_SHIFT, end_pfn << PAGE_SHIFT);
>
> memory_present() expects pfns, the shift is not needed.

Right.
-- 
Michal Hocko
SUSE Labs
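
To illustrate the pfn-versus-address point raised in the review comment, here
is a minimal userspace sketch in C. It is not kernel code and has not been run
on microblaze. memory_present_stub(), struct present_range, the helper names
and the 4 KiB PAGE_SHIFT are assumptions made up for this example; only the
(nid, start_pfn, end_pfn) argument order follows memory_present() as it is
used in the thread.

```c
#include <assert.h>

/* Assumption for illustration only: 4 KiB pages. */
#define PAGE_SHIFT 12

/* Hypothetical record of what the caller asked to mark present. */
struct present_range {
	int nid;
	unsigned long long start_pfn;
	unsigned long long end_pfn;
};

static struct present_range last_call;

/*
 * Hypothetical stand-in for memory_present(nid, start_pfn, end_pfn).
 * It only records its arguments; both range arguments are interpreted
 * as page frame numbers, not byte addresses.
 */
static void memory_present_stub(int nid, unsigned long long start_pfn,
				unsigned long long end_pfn)
{
	assert(end_pfn >= start_pfn);
	last_call.nid = nid;
	last_call.start_pfn = start_pfn;
	last_call.end_pfn = end_pfn;
}

/* Number of pages covered by the most recent call. */
static unsigned long long pages_marked(void)
{
	return last_call.end_pfn - last_call.start_pfn;
}

/* Passing pfns directly, as the review comment asks for. */
static unsigned long long mark_with_pfns(unsigned long long start_pfn,
					 unsigned long long end_pfn)
{
	memory_present_stub(0, start_pfn, end_pfn);
	return pages_marked();
}

/*
 * Passing byte addresses (pfn << PAGE_SHIFT), as in the draft diff.
 * The stub still treats them as pfns, so the range starts at the wrong
 * place and is 2^PAGE_SHIFT times too long.
 */
static unsigned long long mark_with_shifted(unsigned long long start_pfn,
					    unsigned long long end_pfn)
{
	memory_present_stub(0, start_pfn << PAGE_SHIFT, end_pfn << PAGE_SHIFT);
	return pages_marked();
}
```

For a 256 MiB region starting at 0x80000000 (pfns 0x80000 to 0x90000 under the
4 KiB assumption), mark_with_pfns() covers 0x10000 pages, while
mark_with_shifted() claims 0x10000000 pages beginning at pfn 0x80000000.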