From: Pavel Tatashin <pasha.tatashin@gmail.com>
To: Alexander Duyck <alexander.h.duyck@linux.intel.com>,
linux-mm@kvack.org, akpm@linux-foundation.org
Cc: pavel.tatashin@microsoft.com, mhocko@suse.com,
dave.jiang@intel.com, linux-kernel@vger.kernel.org,
willy@infradead.org, davem@davemloft.net,
yi.z.zhang@linux.intel.com, khalid.aziz@oracle.com,
rppt@linux.vnet.ibm.com, vbabka@suse.cz,
sparclinux@vger.kernel.org, dan.j.williams@intel.com,
ldufour@linux.vnet.ibm.com, mgorman@techsingularity.net,
mingo@kernel.org, kirill.shutemov@linux.intel.com
Subject: Re: [mm PATCH v3 2/6] mm: Drop meminit_pfn_in_nid as it is redundant
Date: Tue, 16 Oct 2018 16:33:10 -0400 [thread overview]
Message-ID: <a21c1827-10ad-8ff5-c8b6-e34a3f1e8b80@gmail.com> (raw)
In-Reply-To: <20181015202703.2171.40829.stgit@localhost.localdomain>
On 10/15/18 4:27 PM, Alexander Duyck wrote:
> As best as I can tell the meminit_pfn_in_nid call is completely redundant.
> The deferred memory initialization is already making use of
> for_each_free_mem_range which in turn will call into __next_mem_range which
> will only return a memory range if it matches the node ID provided assuming
> it is not NUMA_NO_NODE.
>
> I am operating on the assumption that there are no zones or pgdata_t
> structures that have a NUMA node of NUMA_NO_NODE associated with them. If
> that is the case then __next_mem_range will never return a memory range
> that doesn't match the zone's node ID and as such the check is redundant.
>
> So one piece I would like to verfy on this is if this works for ia64.
> Technically it was using a different approach to get the node ID, but it
> seems to have the node ID also encoded into the memblock. So I am
> assuming this is okay, but would like to get confirmation on that.
>
> Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
If I am not mistaken, this code is for systems with memory interleaving.
Quick looks shows that x86, powerpc, s390, and sparc have it set.
I am not sure about other arches, but at least on SPARC, there are some
processors with memory interleaving feature:
http://www.fujitsu.com/global/products/computing/servers/unix/sparc-enterprise/technology/performance/memory.html
Pavel
> ---
> mm/page_alloc.c | 50 ++++++++++++++------------------------------------
> 1 file changed, 14 insertions(+), 36 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 4bd858d1c3ba..a766a15fad81 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1301,36 +1301,22 @@ int __meminit early_pfn_to_nid(unsigned long pfn)
> #endif
>
> #ifdef CONFIG_NODES_SPAN_OTHER_NODES
> -static inline bool __meminit __maybe_unused
> -meminit_pfn_in_nid(unsigned long pfn, int node,
> - struct mminit_pfnnid_cache *state)
> +/* Only safe to use early in boot when initialisation is single-threaded */
> +static inline bool __meminit early_pfn_in_nid(unsigned long pfn, int node)
> {
> int nid;
>
> - nid = __early_pfn_to_nid(pfn, state);
> + nid = __early_pfn_to_nid(pfn, &early_pfnnid_cache);
> if (nid >= 0 && nid != node)
> return false;
> return true;
> }
>
> -/* Only safe to use early in boot when initialisation is single-threaded */
> -static inline bool __meminit early_pfn_in_nid(unsigned long pfn, int node)
> -{
> - return meminit_pfn_in_nid(pfn, node, &early_pfnnid_cache);
> -}
> -
> #else
> -
> static inline bool __meminit early_pfn_in_nid(unsigned long pfn, int node)
> {
> return true;
> }
> -static inline bool __meminit __maybe_unused
> -meminit_pfn_in_nid(unsigned long pfn, int node,
> - struct mminit_pfnnid_cache *state)
> -{
> - return true;
> -}
> #endif
>
>
> @@ -1459,21 +1445,13 @@ static inline void __init pgdat_init_report_one_done(void)
> *
> * Then, we check if a current large page is valid by only checking the validity
> * of the head pfn.
> - *
> - * Finally, meminit_pfn_in_nid is checked on systems where pfns can interleave
> - * within a node: a pfn is between start and end of a node, but does not belong
> - * to this memory node.
> */
> -static inline bool __init
> -deferred_pfn_valid(int nid, unsigned long pfn,
> - struct mminit_pfnnid_cache *nid_init_state)
> +static inline bool __init deferred_pfn_valid(unsigned long pfn)
> {
> if (!pfn_valid_within(pfn))
> return false;
> if (!(pfn & (pageblock_nr_pages - 1)) && !pfn_valid(pfn))
> return false;
> - if (!meminit_pfn_in_nid(pfn, nid, nid_init_state))
> - return false;
> return true;
> }
>
> @@ -1481,15 +1459,14 @@ static inline void __init pgdat_init_report_one_done(void)
> * Free pages to buddy allocator. Try to free aligned pages in
> * pageblock_nr_pages sizes.
> */
> -static void __init deferred_free_pages(int nid, int zid, unsigned long pfn,
> +static void __init deferred_free_pages(unsigned long pfn,
> unsigned long end_pfn)
> {
> - struct mminit_pfnnid_cache nid_init_state = { };
> unsigned long nr_pgmask = pageblock_nr_pages - 1;
> unsigned long nr_free = 0;
>
> for (; pfn < end_pfn; pfn++) {
> - if (!deferred_pfn_valid(nid, pfn, &nid_init_state)) {
> + if (!deferred_pfn_valid(pfn)) {
> deferred_free_range(pfn - nr_free, nr_free);
> nr_free = 0;
> } else if (!(pfn & nr_pgmask)) {
> @@ -1509,17 +1486,18 @@ static void __init deferred_free_pages(int nid, int zid, unsigned long pfn,
> * by performing it only once every pageblock_nr_pages.
> * Return number of pages initialized.
> */
> -static unsigned long __init deferred_init_pages(int nid, int zid,
> +static unsigned long __init deferred_init_pages(struct zone *zone,
> unsigned long pfn,
> unsigned long end_pfn)
> {
> - struct mminit_pfnnid_cache nid_init_state = { };
> unsigned long nr_pgmask = pageblock_nr_pages - 1;
> + int nid = zone_to_nid(zone);
> unsigned long nr_pages = 0;
> + int zid = zone_idx(zone);
> struct page *page = NULL;
>
> for (; pfn < end_pfn; pfn++) {
> - if (!deferred_pfn_valid(nid, pfn, &nid_init_state)) {
> + if (!deferred_pfn_valid(pfn)) {
> page = NULL;
> continue;
> } else if (!page || !(pfn & nr_pgmask)) {
> @@ -1582,12 +1560,12 @@ static int __init deferred_init_memmap(void *data)
> for_each_free_mem_range(i, nid, MEMBLOCK_NONE, &spa, &epa, NULL) {
> spfn = max_t(unsigned long, first_init_pfn, PFN_UP(spa));
> epfn = min_t(unsigned long, zone_end_pfn(zone), PFN_DOWN(epa));
> - nr_pages += deferred_init_pages(nid, zid, spfn, epfn);
> + nr_pages += deferred_init_pages(zone, spfn, epfn);
> }
> for_each_free_mem_range(i, nid, MEMBLOCK_NONE, &spa, &epa, NULL) {
> spfn = max_t(unsigned long, first_init_pfn, PFN_UP(spa));
> epfn = min_t(unsigned long, zone_end_pfn(zone), PFN_DOWN(epa));
> - deferred_free_pages(nid, zid, spfn, epfn);
> + deferred_free_pages(spfn, epfn);
> }
> pgdat_resize_unlock(pgdat, &flags);
>
> @@ -1676,7 +1654,7 @@ static int __init deferred_init_memmap(void *data)
> while (spfn < epfn && nr_pages < nr_pages_needed) {
> t = ALIGN(spfn + PAGES_PER_SECTION, PAGES_PER_SECTION);
> first_deferred_pfn = min(t, epfn);
> - nr_pages += deferred_init_pages(nid, zid, spfn,
> + nr_pages += deferred_init_pages(zone, spfn,
> first_deferred_pfn);
> spfn = first_deferred_pfn;
> }
> @@ -1688,7 +1666,7 @@ static int __init deferred_init_memmap(void *data)
> for_each_free_mem_range(i, nid, MEMBLOCK_NONE, &spa, &epa, NULL) {
> spfn = max_t(unsigned long, first_init_pfn, PFN_UP(spa));
> epfn = min_t(unsigned long, first_deferred_pfn, PFN_DOWN(epa));
> - deferred_free_pages(nid, zid, spfn, epfn);
> + deferred_free_pages(spfn, epfn);
>
> if (first_deferred_pfn == epfn)
> break;
>
next prev parent reply other threads:[~2018-10-16 20:33 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-10-15 20:26 [mm PATCH v3 0/6] Deferred page init improvements Alexander Duyck
2018-10-15 20:26 ` [mm PATCH v3 1/6] mm: Use mm_zero_struct_page from SPARC on all 64b architectures Alexander Duyck
2018-10-16 19:01 ` Pavel Tatashin
2018-10-17 7:30 ` Mike Rapoport
2018-10-17 14:52 ` Alexander Duyck
2018-10-17 8:47 ` Michal Hocko
2018-10-17 15:07 ` Alexander Duyck
2018-10-17 15:12 ` Pavel Tatashin
2018-10-17 15:40 ` David Laight
2018-10-17 16:31 ` Alexander Duyck
2018-10-17 17:08 ` Pavel Tatashin
2018-10-17 16:34 ` Michal Hocko
2018-10-15 20:27 ` [mm PATCH v3 2/6] mm: Drop meminit_pfn_in_nid as it is redundant Alexander Duyck
2018-10-16 20:33 ` Pavel Tatashin [this message]
2018-10-16 20:49 ` Alexander Duyck
2018-10-16 21:06 ` Pavel Tatashin
2018-10-17 9:04 ` Michal Hocko
2018-10-15 20:27 ` [mm PATCH v3 3/6] mm: Use memblock/zone specific iterator for handling deferred page init Alexander Duyck
2018-10-17 9:11 ` Michal Hocko
2018-10-17 15:17 ` Alexander Duyck
2018-10-17 16:42 ` Mike Rapoport
2018-10-15 20:27 ` [mm PATCH v3 4/6] mm: Move hot-plug specific memory init into separate functions and optimize Alexander Duyck
2018-10-17 9:18 ` Michal Hocko
2018-10-17 15:26 ` Alexander Duyck
2018-10-24 12:36 ` Michal Hocko
2018-10-24 15:08 ` Alexander Duyck
2018-10-24 15:27 ` Michal Hocko
2018-10-24 17:35 ` Alexander Duyck
2018-10-25 12:41 ` Michal Hocko
2018-10-15 20:27 ` [mm PATCH v3 5/6] mm: Use common iterator for deferred_init_pages and deferred_free_pages Alexander Duyck
2018-10-15 20:27 ` [mm PATCH v3 6/6] mm: Add reserved flag setting to set_page_links Alexander Duyck
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=a21c1827-10ad-8ff5-c8b6-e34a3f1e8b80@gmail.com \
--to=pasha.tatashin@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=alexander.h.duyck@linux.intel.com \
--cc=dan.j.williams@intel.com \
--cc=dave.jiang@intel.com \
--cc=davem@davemloft.net \
--cc=khalid.aziz@oracle.com \
--cc=kirill.shutemov@linux.intel.com \
--cc=ldufour@linux.vnet.ibm.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@techsingularity.net \
--cc=mhocko@suse.com \
--cc=mingo@kernel.org \
--cc=pavel.tatashin@microsoft.com \
--cc=rppt@linux.vnet.ibm.com \
--cc=sparclinux@vger.kernel.org \
--cc=vbabka@suse.cz \
--cc=willy@infradead.org \
--cc=yi.z.zhang@linux.intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).