From: David Hildenbrand <david@redhat.com>
To: Anshuman Khandual <anshuman.khandual@arm.com>, linux-mm@kvack.org
Cc: Russell King <linux@armlinux.org.uk>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Mike Rapoport <rppt@kernel.org>,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC] mm: Enable generic pfn_valid() to handle early sections with memmap holes
Date: Wed, 17 Mar 2021 12:20:15 +0100
Message-ID: <2541b182-1f1c-c1ed-c15c-9d71160eb6fe@redhat.com>
In-Reply-To: <72902ace-5f00-b484-aa71-e6841fb7d082@arm.com>

On 11.03.21 05:29, Anshuman Khandual wrote:
> 
> 
> On 3/8/21 2:07 PM, David Hildenbrand wrote:
>> On 08.03.21 04:27, Anshuman Khandual wrote:
>>> Platforms like arm and arm64 have redefined pfn_valid() because their early
>>> memory sections might contain memmap holes caused by memblock areas tagged
>>> with MEMBLOCK_NOMAP (for example, firmware-reserved regions on arm64), which
>>> must be skipped when validating a pfn for struct page backing. This scenario
>>> can be captured with a new config option,
>>> CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES, and the generic pfn_valid() can then
>>> be improved to accommodate such platforms. This reduces the overall code
>>> footprint and also improves maintainability.
>>>
>>> Commit 4f5b0c178996 ("arm, arm64: move free_unused_memmap() to generic mm")
>>> used CONFIG_HAVE_ARCH_PFN_VALID to gate free_unused_memmap(), which in turn
>>> expanded its scope to new platforms like arc and m68k. Rather, let's restrict
>>> the scope of free_unused_memmap() back to the arm and arm64 platforms via
>>> this new config option, i.e. CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES.
>>>
>>> While here, export the symbol memblock_is_map_memory() so that drivers which
>>> depend on pfn_valid() but lack the required visibility can still be built.
>>> Once this new config is in place, drop CONFIG_HAVE_ARCH_PFN_VALID from both
>>> the arm and arm64 platforms.
>>>
>>> Cc: Russell King <linux@armlinux.org.uk>
>>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>>> Cc: Will Deacon <will@kernel.org>
>>> Cc: Andrew Morton <akpm@linux-foundation.org>
>>> Cc: Mike Rapoport <rppt@kernel.org>
>>> Cc: David Hildenbrand <david@redhat.com>
>>> Cc: linux-arm-kernel@lists.infradead.org
>>> Cc: linux-kernel@vger.kernel.org
>>> Cc: linux-mm@kvack.org
>>> Suggested-by: David Hildenbrand <david@redhat.com>
>>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>>> ---
>>> This applies on 5.12-rc2 along with the arm64 pfn_valid() fix patches [1] and
>>> has been lightly tested on the arm64 platform. The idea of representing this
>>> unique situation on the arm and arm64 platforms with a config option was
>>> proposed by David H during an earlier discussion [2]. This still does not
>>> build on the arm platform due to pfn_valid() resolution errors. Nonetheless,
>>> I wanted some early feedback on whether the overall approach here is
>>> acceptable.
>>
>> It might make sense to keep the arm variant for now. The arm64 variant is where the magic happens and where we missed updates when working on the generic variant.
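>>
>> For reference, the arm64 variant being generalized here is essentially the
>> following (a from-memory sketch of the ~v5.11 arch/arm64/mm/init.c code;
>> exact details may differ):
>>
>>     int pfn_valid(unsigned long pfn)
>>     {
>>         phys_addr_t addr = pfn << PAGE_SHIFT;
>>
>>         /* reject pfns whose upper bits were truncated by the shift */
>>         if ((addr >> PAGE_SHIFT) != pfn)
>>             return 0;
>>
>>     #ifdef CONFIG_SPARSEMEM
>>         if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
>>             return 0;
>>
>>         if (!valid_section(__pfn_to_section(pfn)))
>>             return 0;
>>     #endif
>>         /* the arm64-specific part: reject MEMBLOCK_NOMAP ranges */
>>         return memblock_is_map_memory(addr);
>>     }
>>
>> That trailing memblock_is_map_memory() check is what the new config option
>> has to preserve in the generic variant.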
> 
> Sure, will drop the changes on arm.
> 
>>
>> The generic variant really only applies to 64-bit targets where we have SPARSEMEM. See x86 as an example.
> 
> Okay.
> 
>>
>> [...]
>>
>>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>>> index 47946cec7584..93532994113f 100644
>>> --- a/include/linux/mmzone.h
>>> +++ b/include/linux/mmzone.h
>>> @@ -1409,8 +1409,23 @@ static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
>>>    }
>>>    #endif
>>>
>>> +bool memblock_is_map_memory(phys_addr_t addr);
>>> +
>>>    #ifndef CONFIG_HAVE_ARCH_PFN_VALID
>>>    static inline int pfn_valid(unsigned long pfn)
>>> +{
>>> +    phys_addr_t addr = PFN_PHYS(pfn);
>>> +
>>> +    /*
>>> +     * Ensure the upper PAGE_SHIFT bits are clear in the
>>> +     * pfn. Else it might lead to false positives when
>>> +     * some of the upper bits are set, but the lower bits
>>> +     * match a valid pfn.
>>> +     */
>>> +    if (PHYS_PFN(addr) != pfn)
>>> +        return 0;
>>
>> I think this should be fine for other archs as well.
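>>
>> (To illustrate the failure mode this guards against, assume a hypothetical
>> config with a 32-bit phys_addr_t and PAGE_SHIFT == 12: for a bogus pfn of
>> 0x100001, PFN_PHYS() computes 0x100001000, which gets truncated to 0x1000.
>> A later memblock_is_map_memory(addr) lookup would then test 0x1000, likely
>> valid memory, and report the bogus pfn as valid. The round-trip check
>> PHYS_PFN(0x1000) == 0x1 != 0x100001 catches exactly this.)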
>>
>>> +
>>> +#ifdef CONFIG_SPARSEMEM
>>
>> Why do we need the ifdef now? If that's to cover the arm case, then please consider the arm64 case only for now.
> 
> Yes, it is not needed.
> 
>>
>>>    {
>>>        struct mem_section *ms;
>>>
>>> @@ -1423,7 +1438,14 @@ static inline int pfn_valid(unsigned long pfn)
>>>         * Traditionally early sections always returned pfn_valid() for
>>>         * the entire section-sized span.
>>>         */
>>> -    return early_section(ms) || pfn_section_valid(ms, pfn);
>>> +    if (early_section(ms))
>>> +        return IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES) ?
>>> +            memblock_is_map_memory(pfn << PAGE_SHIFT) : 1;
>>> +
>>> +    return pfn_section_valid(ms, pfn);
>>> +}
>>> +#endif
>>> +    return 1;
>>>    }
>>>    #endif
>>>
>>> diff --git a/mm/Kconfig b/mm/Kconfig
>>> index 24c045b24b95..0ec20f661b3f 100644
>>> --- a/mm/Kconfig
>>> +++ b/mm/Kconfig
>>> @@ -135,6 +135,16 @@ config HAVE_FAST_GUP
>>>    config ARCH_KEEP_MEMBLOCK
>>>        bool
>>>
>>> +config HAVE_EARLY_SECTION_MEMMAP_HOLES
>>> +    depends on ARCH_KEEP_MEMBLOCK && SPARSEMEM_VMEMMAP
>>> +    def_bool n
>>> +    help
>>> +      Early sections on certain platforms might have portions which are
>>> +      not backed with struct page mapping as their memblock entries are
>>> +      marked with MEMBLOCK_NOMAP. When subscribed, this option enables
>>> +      specific handling for those memory sections in certain situations
>>> +      such as pfn_valid().
>>> +
>>>    # Keep arch NUMA mapping infrastructure post-init.
>>>    config NUMA_KEEP_MEMINFO
>>>        bool
>>> diff --git a/mm/memblock.c b/mm/memblock.c
>>> index afaefa8fc6ab..d9fa2e62ab7a 100644
>>> --- a/mm/memblock.c
>>> +++ b/mm/memblock.c
>>> @@ -1744,6 +1744,7 @@ bool __init_memblock memblock_is_map_memory(phys_addr_t addr)
>>>            return false;
>>>        return !memblock_is_nomap(&memblock.memory.regions[i]);
>>>    }
>>> +EXPORT_SYMBOL(memblock_is_map_memory);
>>>
>>>  int __init_memblock memblock_search_pfn_nid(unsigned long pfn,
>>>                 unsigned long *start_pfn, unsigned long *end_pfn)
>>> @@ -1926,7 +1927,7 @@ static void __init free_unused_memmap(void)
>>>        unsigned long start, end, prev_end = 0;
>>>        int i;
>>>
>>> -    if (!IS_ENABLED(CONFIG_HAVE_ARCH_PFN_VALID) ||
>>> +    if (!IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES) ||
>>>            IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP))
>>>            return;
>>>   
>>
>> With
>>
>> commit 1f90a3477df3ff1a91e064af554cdc887c8f9e5e
>> Author: Dan Williams <dan.j.williams@intel.com>
>> Date:   Thu Feb 25 17:17:05 2021 -0800
>>
>>      mm: teach pfn_to_online_page() about ZONE_DEVICE section collisions
>>
>> (still in -next I think)
> 
> Already merged into mainline.
> 
>>
>> You'll also have to take care of pfn_to_online_page().
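>>
>> Otherwise an early section that is online but contains MEMBLOCK_NOMAP holes
>> could still have pfn_to_online_page() hand out a struct page for a pfn
>> inside such a hole, since the fast path only consults online_section() and
>> pfn_section_valid() (my reading, at least).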
>>
> 
> Something like this would work?
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 5ba51a8bdaeb..19812deb807f 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -309,6 +309,11 @@ struct page *pfn_to_online_page(unsigned long pfn)
>           * Save some code text when online_section() +
>           * pfn_section_valid() are sufficient.
>           */
> +       if (IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES)) {
> +               if (early_section(ms) && !memblock_is_map_memory(PFN_PHYS(pfn)))
> +                       return NULL;
> +       }
> +
>          if (IS_ENABLED(CONFIG_HAVE_ARCH_PFN_VALID) && !pfn_valid(pfn))
>                  return NULL;
> 

Sorry for the late reply, just stumbled over this again.

I think, yes. I do wonder if we then still need the
CONFIG_HAVE_ARCH_PFN_VALID handling below - are there any custom
pfn_valid() implementations with SPARSEMEM remaining? I don't think so.
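
Untested, but something along these lines should be enough to double-check
(the patterns are my assumptions about how arches wire this up):

  git grep -n "select HAVE_ARCH_PFN_VALID" -- arch/
  git grep -n "define pfn_valid" -- arch/

and then cross-checking any hits against the SPARSEMEM selections in the
respective arch Kconfig files.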

-- 
Thanks,

David / dhildenb


