linux-mm.kvack.org archive mirror
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
To: linux-mm@kvack.org, akpm@linux-foundation.org,
	mpe@ellerman.id.au, linuxppc-dev@lists.ozlabs.org,
	npiggin@gmail.com, christophe.leroy@csgroup.eu
Cc: Oscar Salvador <osalvador@suse.de>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Joao Martins <joao.m.martins@oracle.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Muchun Song <muchun.song@linux.dev>,
	Will Deacon <will@kernel.org>
Subject: Re: [PATCH v5 10/13] powerpc/book3s64/vmemmap: Switch radix to use a different vmemmap handling function
Date: Mon, 24 Jul 2023 23:59:27 +0530	[thread overview]
Message-ID: <87edkx2mew.fsf@linux.ibm.com> (raw)
In-Reply-To: <20230718022934.90447-11-aneesh.kumar@linux.ibm.com>

"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> writes:

> This is in preparation to update radix to implement vmemmap optimization
> for devdax. Below are the rules for radix vmemmap mapping:
>
> 1. First try to map things using PMD (2M)
> 2. With altmap, if the altmap cross-boundary check returns true, fall
>    back to PAGE_SIZE
> 3. If we can't allocate PMD_SIZE backing memory for vmemmap, fall back to
>    PAGE_SIZE
>
> On removing vmemmap mapping, check if every subsection that is using the
> vmemmap area is invalid. If found to be invalid, that implies we can safely
> free the vmemmap area. We don't use the PAGE_UNUSED pattern used by x86
> because with 64K page size, we need to do the above check even at the
> PAGE_SIZE granularity.
>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
>  arch/powerpc/include/asm/book3s/64/radix.h |   2 +
>  arch/powerpc/include/asm/pgtable.h         |   4 +
>  arch/powerpc/mm/book3s64/radix_pgtable.c   | 326 +++++++++++++++++++--
>  arch/powerpc/mm/init_64.c                  |  26 +-
>  4 files changed, 327 insertions(+), 31 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
> index 2ef92f36340f..f1461289643a 100644
> --- a/arch/powerpc/include/asm/book3s/64/radix.h
> +++ b/arch/powerpc/include/asm/book3s/64/radix.h
> @@ -331,6 +331,8 @@ extern int __meminit radix__vmemmap_create_mapping(unsigned long start,
>  					     unsigned long phys);
>  int __meminit radix__vmemmap_populate(unsigned long start, unsigned long end,
>  				      int node, struct vmem_altmap *altmap);
> +void __ref radix__vmemmap_free(unsigned long start, unsigned long end,
> +			       struct vmem_altmap *altmap);
>  extern void radix__vmemmap_remove_mapping(unsigned long start,
>  				    unsigned long page_size);
>  
> diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
> index 6a88bfdaa69b..68817ea7f994 100644
> --- a/arch/powerpc/include/asm/pgtable.h
> +++ b/arch/powerpc/include/asm/pgtable.h
> @@ -165,6 +165,10 @@ static inline bool is_ioremap_addr(const void *x)
>  
>  	return addr >= IOREMAP_BASE && addr < IOREMAP_END;
>  }
> +
> +int __meminit vmemmap_populated(unsigned long vmemmap_addr, int vmemmap_map_size);
> +bool altmap_cross_boundary(struct vmem_altmap *altmap, unsigned long start,
> +			   unsigned long page_size);
>  #endif /* CONFIG_PPC64 */
>  
>  #endif /* __ASSEMBLY__ */
> diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
> index 227fea53c217..9a7f3707b6fb 100644
> --- a/arch/powerpc/mm/book3s64/radix_pgtable.c
> +++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
> @@ -744,8 +744,59 @@ static void free_pud_table(pud_t *pud_start, p4d_t *p4d)
>  	p4d_clear(p4d);
>  }
>  
> +#ifdef CONFIG_SPARSEMEM_VMEMMAP
> +static bool __meminit vmemmap_pmd_is_unused(unsigned long addr, unsigned long end)
> +{
> +	unsigned long start = ALIGN_DOWN(addr, PMD_SIZE);
> +
> +	return !vmemmap_populated(start, PMD_SIZE);
> +}
> +
> +static bool __meminit vmemmap_page_is_unused(unsigned long addr, unsigned long end)
> +{
> +	unsigned long start = ALIGN_DOWN(addr, PAGE_SIZE);
> +
> +	return !vmemmap_populated(start, PAGE_SIZE);
> +
> +}
> +#endif
> +
> +static void __meminit free_vmemmap_pages(struct page *page,
> +					 struct vmem_altmap *altmap,
> +					 int order)
> +{
> +	unsigned int nr_pages = 1 << order;
> +
> +	if (altmap) {
> +		unsigned long alt_start, alt_end;
> +		unsigned long base_pfn = page_to_pfn(page);
> +
> +		/*
> +		 * with 2M vmemmap mapping we can have things set up
> +		 * such that even though altmap is specified we never
> +		 * used altmap.
> +		 */
> +		alt_start = altmap->base_pfn;
> +		alt_end = altmap->base_pfn + altmap->reserve +
> +			altmap->free + altmap->alloc + altmap->align;
> +
> +		if (base_pfn >= alt_start && base_pfn < alt_end) {
> +			vmem_altmap_free(altmap, nr_pages);
> +			return;
> +		}
> +	}
> +

Please apply this diff on top of this patch when adding this series to
-mm.

commit 613569d9517be60611a86bf4b9821b150c4c4954
Author: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Date:   Mon Jul 24 22:49:29 2023 +0530

    powerpc/mm/altmap: Fix altmap boundary check
    
    altmap->free already covers the entire free space from which altmap
    blocks can be allocated. So when checking whether the kernel is freeing
    an altmap block, compute the end of the altmap range as
    base_pfn + reserve + free, without adding alloc and align on top.
    
    Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>

diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
index 7761c2e93bff..ed63c2953b54 100644
--- a/arch/powerpc/mm/book3s64/radix_pgtable.c
+++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
@@ -766,8 +766,7 @@ static void __meminit free_vmemmap_pages(struct page *page,
 		 * used altmap.
 		 */
 		alt_start = altmap->base_pfn;
-		alt_end = altmap->base_pfn + altmap->reserve +
-			altmap->free + altmap->alloc + altmap->align;
+		alt_end = altmap->base_pfn + altmap->reserve + altmap->free;
 
 		if (base_pfn >= alt_start && base_pfn < alt_end) {
 			vmem_altmap_free(altmap, nr_pages);
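
To make the effect of the boundary fix concrete, here is a minimal user-space sketch of the two alt_end computations. It is not kernel code: struct altmap_model and the helper names are hypothetical and only mirror the struct vmem_altmap fields that appear in the diff above, with field descriptions as I understand them.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of the struct vmem_altmap fields in the diff. */
struct altmap_model {
	unsigned long base_pfn; /* base pfn of the device mapping */
	unsigned long reserve;  /* pages reserved at the start of the mapping */
	unsigned long free;     /* pages set aside for memmap storage */
	unsigned long align;    /* pages consumed to meet allocation alignment */
	unsigned long alloc;    /* pages allocated so far */
};

/* End of the altmap range as computed before the fix. */
static unsigned long alt_end_old(const struct altmap_model *a)
{
	return a->base_pfn + a->reserve + a->free + a->alloc + a->align;
}

/* End of the altmap range as computed after the fix. */
static unsigned long alt_end_fixed(const struct altmap_model *a)
{
	return a->base_pfn + a->reserve + a->free;
}

/* Same range test as in free_vmemmap_pages(): [base_pfn, alt_end). */
static bool pfn_in_altmap(const struct altmap_model *a, unsigned long pfn,
			  unsigned long alt_end)
{
	assert(a);
	return pfn >= a->base_pfn && pfn < alt_end;
}
```

With made-up values base_pfn = 1000, reserve = 2, free = 100, alloc = 50 and align = 3, the old computation ends the range at pfn 1155 while the fixed one ends it at 1102, so a page at pfn 1120 would have been treated as an altmap page only by the old check.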

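The three mapping rules quoted from the patch description can also be read as a small decision function. The following is only a sketch of that description, not the logic in radix_pgtable.c: the function name, its boolean parameters and the MODEL_* constants are hypothetical, and the sizes assume the 2M PMD and 64K page size mentioned in the quoted text.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed sizes, taken from the quoted description (2M PMD, 64K pages). */
#define MODEL_PMD_SIZE  (2UL * 1024 * 1024)
#define MODEL_PAGE_SIZE (64UL * 1024)

/*
 * Pick the vmemmap mapping size following the quoted rules:
 *   1. prefer a PMD mapping,
 *   2. with an altmap, fall back to PAGE_SIZE if the cross-boundary
 *      check returns true,
 *   3. fall back to PAGE_SIZE if PMD_SIZE backing memory is unavailable.
 */
static size_t model_vmemmap_map_size(bool has_altmap,
				     bool altmap_crosses_boundary,
				     bool pmd_backing_available)
{
	if (has_altmap && altmap_crosses_boundary)
		return MODEL_PAGE_SIZE;
	if (!pmd_backing_available)
		return MODEL_PAGE_SIZE;
	return MODEL_PMD_SIZE;
}
```

In this model the cross-boundary result only matters when an altmap is present, which matches how rule 2 is worded.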
