From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-3.8 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5D071C5ACCC for ; Tue, 16 Oct 2018 20:49:17 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 2929E2145D for ; Tue, 16 Oct 2018 20:49:17 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 2929E2145D Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727081AbeJQEl0 (ORCPT ); Wed, 17 Oct 2018 00:41:26 -0400 Received: from mga14.intel.com ([192.55.52.115]:4660 "EHLO mga14.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726048AbeJQEl0 (ORCPT ); Wed, 17 Oct 2018 00:41:26 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga103.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 16 Oct 2018 13:49:15 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.54,389,1534834800"; d="scan'208";a="92542532" Received: from ahduyck-mobl.amr.corp.intel.com (HELO [10.7.198.154]) ([10.7.198.154]) by orsmga003.jf.intel.com with ESMTP; 16 Oct 2018 13:49:15 -0700 Subject: Re: [mm PATCH v3 2/6] mm: Drop meminit_pfn_in_nid as it is redundant To: Pavel Tatashin , linux-mm@kvack.org, akpm@linux-foundation.org Cc: pavel.tatashin@microsoft.com, mhocko@suse.com, dave.jiang@intel.com, linux-kernel@vger.kernel.org, willy@infradead.org, davem@davemloft.net, yi.z.zhang@linux.intel.com, khalid.aziz@oracle.com, rppt@linux.vnet.ibm.com, vbabka@suse.cz, sparclinux@vger.kernel.org, dan.j.williams@intel.com, ldufour@linux.vnet.ibm.com, mgorman@techsingularity.net, mingo@kernel.org, kirill.shutemov@linux.intel.com References: <20181015202456.2171.88406.stgit@localhost.localdomain> <20181015202703.2171.40829.stgit@localhost.localdomain> From: Alexander Duyck Message-ID: Date: Tue, 16 Oct 2018 13:49:14 -0700 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:60.0) Gecko/20100101 Thunderbird/60.2.1 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 10/16/2018 1:33 PM, Pavel Tatashin wrote: > > > On 10/15/18 4:27 PM, Alexander Duyck wrote: >> As best as I can tell the meminit_pfn_in_nid call is completely redundant. >> The deferred memory initialization is already making use of >> for_each_free_mem_range which in turn will call into __next_mem_range which >> will only return a memory range if it matches the node ID provided assuming >> it is not NUMA_NO_NODE. >> >> I am operating on the assumption that there are no zones or pgdata_t >> structures that have a NUMA node of NUMA_NO_NODE associated with them. If >> that is the case then __next_mem_range will never return a memory range >> that doesn't match the zone's node ID and as such the check is redundant. >> >> So one piece I would like to verfy on this is if this works for ia64. >> Technically it was using a different approach to get the node ID, but it >> seems to have the node ID also encoded into the memblock. So I am >> assuming this is okay, but would like to get confirmation on that. >> >> Signed-off-by: Alexander Duyck > > If I am not mistaken, this code is for systems with memory interleaving. > Quick looks shows that x86, powerpc, s390, and sparc have it set. > > I am not sure about other arches, but at least on SPARC, there are some > processors with memory interleaving feature: > > http://www.fujitsu.com/global/products/computing/servers/unix/sparc-enterprise/technology/performance/memory.html > > Pavel I get what it is for. However as best I can tell the check is actually redundant. In the case of the deferred page initialization we are already pulling the memory regions via "for_each_free_mem_range". That function is already passed a NUMA node ID. Because of that we are already checking the memory range to determine if it is in the node or not. As such it doesn't really make sense to go through for each PFN and then go back to the memory range and see if the node matches or not. You can take a look at __next_mem_range which is called by for_each_free_mem_range and passed &memblock.memory and &memblock.reserved to avoid: https://elixir.bootlin.com/linux/latest/source/mm/memblock.c#L899 Then you can work your way through: meminit_pfn_in_nid(pfn, node, state) __early_pfn_to_nid(pfn, state) memblock_search_pfn_nid(pfn, &start_pfn, &end_pfn) memblock_search(&memblock.memory, pfn) From what I can tell the deferred init is going back through the memblock.memory list we pulled this range from and just validating it against itself. This makes sense for the standard init as that is just going from start_pfn->end_pfn, but for the deferred init we are pulling the memory ranges ahead of time so we shouldn't need to re-validate the memory that is contained within that range.