From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> To: Dan Williams <dan.j.williams@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au>, linuxppc-dev <linuxppc-dev@lists.ozlabs.org>, linux-nvdimm <linux-nvdimm@lists.01.org> Subject: Re: [PATCH 2/2] powerpc/nvdimm: Update vmemmap_populated to check sub-section range Date: Tue, 17 Sep 2019 15:24:36 +0530 [thread overview] Message-ID: <cc1e080a-eb95-dab7-a77b-a05e12624578@linux.ibm.com> (raw) In-Reply-To: <CAPcyv4ia0_GUu+=j-ecCuJkqaE5dVENNQxK_S-mO_KBmuA=9hw@mail.gmail.com> On 9/16/19 11:16 PM, Dan Williams wrote: > On Mon, Sep 9, 2019 at 11:29 PM Aneesh Kumar K.V > <aneesh.kumar@linux.ibm.com> wrote: >> >> With commit: 7cc7867fb061 ("mm/devm_memremap_pages: enable sub-section remap") >> pmem namespaces are remapped in 2M chunks. On architectures like ppc64 we >> can map the memmap area using 16MB hugepage size and that can cover >> a memory range of 16G. >> >> While enabling new pmem namespaces, since memory is added in sub-section chunks, >> before creating a new memmap mapping, kernel should check whether there is an >> existing memmap mapping covering the new pmem namespace. Currently, this is >> validated by checking whether the section covering the range is already >> initialized or not. Considering there can be multiple namespaces in the same >> section this can result in wrong validation. Update this to check for >> sub-sections in the range. This is done by checking for all pfns in the range we >> are mapping. >> >> We could optimize this by checking only just one pfn in each sub-section. But >> since this is not fast-path we keep this simple. >> >> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> >> --- >> arch/powerpc/mm/init_64.c | 45 ++++++++++++++++++++------------------- >> 1 file changed, 23 insertions(+), 22 deletions(-) >> >> diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c >> index 4e08246acd79..7710ccdc19a2 100644 >> --- a/arch/powerpc/mm/init_64.c >> +++ b/arch/powerpc/mm/init_64.c >> @@ -70,30 +70,24 @@ EXPORT_SYMBOL_GPL(kernstart_addr); >> >> #ifdef CONFIG_SPARSEMEM_VMEMMAP >> /* >> - * Given an address within the vmemmap, determine the pfn of the page that >> - * represents the start of the section it is within. Note that we have to >> - * do this by hand as the proffered address may not be correctly aligned. >> - * Subtraction of non-aligned pointers produces undefined results. >> - */ >> -static unsigned long __meminit vmemmap_section_start(unsigned long page) >> -{ >> - unsigned long offset = page - ((unsigned long)(vmemmap)); >> - >> - /* Return the pfn of the start of the section. */ >> - return (offset / sizeof(struct page)) & PAGE_SECTION_MASK; >> -} >> - >> -/* >> - * Check if this vmemmap page is already initialised. If any section >> + * Check if this vmemmap page is already initialised. If any sub section >> * which overlaps this vmemmap page is initialised then this page is >> * initialised already. >> */ >> -static int __meminit vmemmap_populated(unsigned long start, int page_size) >> + >> +static int __meminit vmemmap_populated(unsigned long start, int size) >> { >> - unsigned long end = start + page_size; >> - start = (unsigned long)(pfn_to_page(vmemmap_section_start(start))); >> + unsigned long end = start + size; >> >> - for (; start < end; start += (PAGES_PER_SECTION * sizeof(struct page))) >> + /* start is size aligned and it is always > sizeof(struct page) */ >> + VM_BUG_ON(start & sizeof(struct page)); > > If start is size aligned why not include that assumption in the VM_BUG_ON()? > > Otherwise it seems this patch could be reduced simply by: > > s/PAGE_SECTION_MASK/PAGE_SUBSECTION_MASK/ > s/PAGES_PER_SECTION/PAGES_PER_SUBSECTION/ > > ...and leave the vmemmap_section_start() function in place? In other > words this path used to guarantee that 'start' was aligned to the > minimum mem-hotplug granularity, the change looks ok on the surface, > but it seems a subtle change in semantics. > How about this? /* * Given an address within the vmemmap, determine the page that * represents the start of the subsection it is within. Note that we have to * do this by hand as the proffered address may not be correctly aligned. * Subtraction of non-aligned pointers produces undefined results. */ static struct page * __meminit vmemmap_subsection_start(unsigned long vmemmap_addr) { unsigned long start_pfn; unsigned long offset = vmemmap_addr - ((unsigned long)(vmemmap)); /* Return the pfn of the start of the section. */ start_pfn = (offset / sizeof(struct page)) & PAGE_SUBSECTION_MASK; return pfn_to_page(start_pfn); } /* * Since memory is added in sub-section chunks, before creating a new vmemmap * mapping, the kernel should check whether there is an existing memmap mapping * covering the new subsection added. This is needed because kernel can map * vmemmap area using 16MB pages which will cover a memory range of 16G. Such * a range covers multiple subsections (2M) * * If any subsection in the 16G range mapped by vmemmap is valid we consider the * vmemmap populated (There is a page table entry already present). We can't do * a page table lookup here because with the hash translation we don't keep * vmemmap details in linux page table. */ static int __meminit vmemmap_populated(unsigned long vmemmap_addr, int vmemmap_map_size) { struct page *start; unsigned long vmemmap_end = vmemmap_addr + vmemmap_map_size; start = vmemmap_subsection_start(vmemmap_addr); for (; (unsigned long)start < vmemmap_end; start += PAGES_PER_SUBSECTION) /* * pfn valid check here is intended to really check * whether we have any subsection already initialized * in this range. */ if (pfn_valid(page_to_pfn(start))) return 1; return 0; } > Can you get an ack from a powerpc maintainer, or maybe this patch > should route through the powerpc tree? > I guess this can go via powerpc tree. I will send an update V2 patch for this alone and work with Michael to pick that up. > I'll take patch1 through the nvdimm tree. > Thanks. -aneesh _______________________________________________ Linux-nvdimm mailing list Linux-nvdimm@lists.01.org https://lists.01.org/mailman/listinfo/linux-nvdimm
WARNING: multiple messages have this Message-ID (diff)
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> To: Dan Williams <dan.j.williams@intel.com> Cc: Oliver O'Halloran <oohall@gmail.com>, linuxppc-dev <linuxppc-dev@lists.ozlabs.org>, linux-nvdimm <linux-nvdimm@lists.01.org> Subject: Re: [PATCH 2/2] powerpc/nvdimm: Update vmemmap_populated to check sub-section range Date: Tue, 17 Sep 2019 15:24:36 +0530 [thread overview] Message-ID: <cc1e080a-eb95-dab7-a77b-a05e12624578@linux.ibm.com> (raw) In-Reply-To: <CAPcyv4ia0_GUu+=j-ecCuJkqaE5dVENNQxK_S-mO_KBmuA=9hw@mail.gmail.com> On 9/16/19 11:16 PM, Dan Williams wrote: > On Mon, Sep 9, 2019 at 11:29 PM Aneesh Kumar K.V > <aneesh.kumar@linux.ibm.com> wrote: >> >> With commit: 7cc7867fb061 ("mm/devm_memremap_pages: enable sub-section remap") >> pmem namespaces are remapped in 2M chunks. On architectures like ppc64 we >> can map the memmap area using 16MB hugepage size and that can cover >> a memory range of 16G. >> >> While enabling new pmem namespaces, since memory is added in sub-section chunks, >> before creating a new memmap mapping, kernel should check whether there is an >> existing memmap mapping covering the new pmem namespace. Currently, this is >> validated by checking whether the section covering the range is already >> initialized or not. Considering there can be multiple namespaces in the same >> section this can result in wrong validation. Update this to check for >> sub-sections in the range. This is done by checking for all pfns in the range we >> are mapping. >> >> We could optimize this by checking only just one pfn in each sub-section. But >> since this is not fast-path we keep this simple. >> >> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> >> --- >> arch/powerpc/mm/init_64.c | 45 ++++++++++++++++++++------------------- >> 1 file changed, 23 insertions(+), 22 deletions(-) >> >> diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c >> index 4e08246acd79..7710ccdc19a2 100644 >> --- a/arch/powerpc/mm/init_64.c >> +++ b/arch/powerpc/mm/init_64.c >> @@ -70,30 +70,24 @@ EXPORT_SYMBOL_GPL(kernstart_addr); >> >> #ifdef CONFIG_SPARSEMEM_VMEMMAP >> /* >> - * Given an address within the vmemmap, determine the pfn of the page that >> - * represents the start of the section it is within. Note that we have to >> - * do this by hand as the proffered address may not be correctly aligned. >> - * Subtraction of non-aligned pointers produces undefined results. >> - */ >> -static unsigned long __meminit vmemmap_section_start(unsigned long page) >> -{ >> - unsigned long offset = page - ((unsigned long)(vmemmap)); >> - >> - /* Return the pfn of the start of the section. */ >> - return (offset / sizeof(struct page)) & PAGE_SECTION_MASK; >> -} >> - >> -/* >> - * Check if this vmemmap page is already initialised. If any section >> + * Check if this vmemmap page is already initialised. If any sub section >> * which overlaps this vmemmap page is initialised then this page is >> * initialised already. >> */ >> -static int __meminit vmemmap_populated(unsigned long start, int page_size) >> + >> +static int __meminit vmemmap_populated(unsigned long start, int size) >> { >> - unsigned long end = start + page_size; >> - start = (unsigned long)(pfn_to_page(vmemmap_section_start(start))); >> + unsigned long end = start + size; >> >> - for (; start < end; start += (PAGES_PER_SECTION * sizeof(struct page))) >> + /* start is size aligned and it is always > sizeof(struct page) */ >> + VM_BUG_ON(start & sizeof(struct page)); > > If start is size aligned why not include that assumption in the VM_BUG_ON()? > > Otherwise it seems this patch could be reduced simply by: > > s/PAGE_SECTION_MASK/PAGE_SUBSECTION_MASK/ > s/PAGES_PER_SECTION/PAGES_PER_SUBSECTION/ > > ...and leave the vmemmap_section_start() function in place? In other > words this path used to guarantee that 'start' was aligned to the > minimum mem-hotplug granularity, the change looks ok on the surface, > but it seems a subtle change in semantics. > How about this? /* * Given an address within the vmemmap, determine the page that * represents the start of the subsection it is within. Note that we have to * do this by hand as the proffered address may not be correctly aligned. * Subtraction of non-aligned pointers produces undefined results. */ static struct page * __meminit vmemmap_subsection_start(unsigned long vmemmap_addr) { unsigned long start_pfn; unsigned long offset = vmemmap_addr - ((unsigned long)(vmemmap)); /* Return the pfn of the start of the section. */ start_pfn = (offset / sizeof(struct page)) & PAGE_SUBSECTION_MASK; return pfn_to_page(start_pfn); } /* * Since memory is added in sub-section chunks, before creating a new vmemmap * mapping, the kernel should check whether there is an existing memmap mapping * covering the new subsection added. This is needed because kernel can map * vmemmap area using 16MB pages which will cover a memory range of 16G. Such * a range covers multiple subsections (2M) * * If any subsection in the 16G range mapped by vmemmap is valid we consider the * vmemmap populated (There is a page table entry already present). We can't do * a page table lookup here because with the hash translation we don't keep * vmemmap details in linux page table. */ static int __meminit vmemmap_populated(unsigned long vmemmap_addr, int vmemmap_map_size) { struct page *start; unsigned long vmemmap_end = vmemmap_addr + vmemmap_map_size; start = vmemmap_subsection_start(vmemmap_addr); for (; (unsigned long)start < vmemmap_end; start += PAGES_PER_SUBSECTION) /* * pfn valid check here is intended to really check * whether we have any subsection already initialized * in this range. */ if (pfn_valid(page_to_pfn(start))) return 1; return 0; } > Can you get an ack from a powerpc maintainer, or maybe this patch > should route through the powerpc tree? > I guess this can go via powerpc tree. I will send an update V2 patch for this alone and work with Michael to pick that up. > I'll take patch1 through the nvdimm tree. > Thanks. -aneesh
next prev parent reply other threads:[~2019-09-17 9:55 UTC|newest] Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top 2019-09-10 6:28 [PATCH 1/2] libnvdimm/altmap: Track namespace boundaries in altmap Aneesh Kumar K.V 2019-09-10 6:28 ` Aneesh Kumar K.V 2019-09-10 6:28 ` [PATCH 2/2] powerpc/nvdimm: Update vmemmap_populated to check sub-section range Aneesh Kumar K.V 2019-09-10 6:28 ` Aneesh Kumar K.V 2019-09-16 17:46 ` Dan Williams 2019-09-16 17:46 ` Dan Williams 2019-09-17 9:54 ` Aneesh Kumar K.V [this message] 2019-09-17 9:54 ` Aneesh Kumar K.V 2019-09-17 5:47 ` Oliver O'Halloran 2019-09-17 5:47 ` Oliver O'Halloran 2019-09-10 7:40 ` [PATCH 1/2] libnvdimm/altmap: Track namespace boundaries in altmap Pankaj Gupta 2019-09-10 7:40 ` Pankaj Gupta 2019-09-10 8:10 ` Dan Williams 2019-09-10 8:10 ` Dan Williams 2019-09-10 8:30 ` Aneesh Kumar K.V 2019-09-10 8:30 ` Aneesh Kumar K.V 2019-09-10 9:08 ` Dan Williams 2019-09-10 9:08 ` Dan Williams 2019-09-10 8:29 ` Santosh Sivaraj 2019-09-10 8:29 ` Santosh Sivaraj 2019-09-11 12:03 ` Johannes Thumshirn 2019-09-11 12:03 ` Johannes Thumshirn 2019-09-16 17:58 ` Dan Williams 2019-09-16 17:58 ` Dan Williams 2019-09-17 7:39 ` Aneesh Kumar K.V 2019-09-17 7:39 ` Aneesh Kumar K.V 2019-09-17 14:24 ` Dan Williams 2019-09-17 14:24 ` Dan Williams
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=cc1e080a-eb95-dab7-a77b-a05e12624578@linux.ibm.com \ --to=aneesh.kumar@linux.ibm.com \ --cc=dan.j.williams@intel.com \ --cc=linux-nvdimm@lists.01.org \ --cc=linuxppc-dev@lists.ozlabs.org \ --cc=mpe@ellerman.id.au \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.