linux-nvdimm.lists.01.org archive mirror
 help / color / mirror / Atom feed
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>,
	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
	linux-nvdimm <linux-nvdimm@lists.01.org>
Subject: Re: [PATCH 2/2] powerpc/nvdimm: Update vmemmap_populated to check sub-section range
Date: Tue, 17 Sep 2019 15:24:36 +0530	[thread overview]
Message-ID: <cc1e080a-eb95-dab7-a77b-a05e12624578@linux.ibm.com> (raw)
In-Reply-To: <CAPcyv4ia0_GUu+=j-ecCuJkqaE5dVENNQxK_S-mO_KBmuA=9hw@mail.gmail.com>

On 9/16/19 11:16 PM, Dan Williams wrote:
> On Mon, Sep 9, 2019 at 11:29 PM Aneesh Kumar K.V
> <aneesh.kumar@linux.ibm.com> wrote:
>>
>> With commit: 7cc7867fb061 ("mm/devm_memremap_pages: enable sub-section remap")
>> pmem namespaces are remapped in 2M chunks. On architectures like ppc64 we
>> can map the memmap area using 16MB hugepage size and that can cover
>> a memory range of 16G.
>>
>> While enabling new pmem namespaces, since memory is added in sub-section chunks,
>> before creating a new memmap mapping, kernel should check whether there is an
>> existing memmap mapping covering the new pmem namespace. Currently, this is
>> validated by checking whether the section covering the range is already
>> initialized or not. Considering there can be multiple namespaces in the same
>> section this can result in wrong validation. Update this to check for
>> sub-sections in the range. This is done by checking for all pfns in the range we
>> are mapping.
>>
>> We could optimize this by checking only just one pfn in each sub-section. But
>> since this is not fast-path we keep this simple.
>>
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>> ---
>>   arch/powerpc/mm/init_64.c | 45 ++++++++++++++++++++-------------------
>>   1 file changed, 23 insertions(+), 22 deletions(-)
>>
>> diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
>> index 4e08246acd79..7710ccdc19a2 100644
>> --- a/arch/powerpc/mm/init_64.c
>> +++ b/arch/powerpc/mm/init_64.c
>> @@ -70,30 +70,24 @@ EXPORT_SYMBOL_GPL(kernstart_addr);
>>
>>   #ifdef CONFIG_SPARSEMEM_VMEMMAP
>>   /*
>> - * Given an address within the vmemmap, determine the pfn of the page that
>> - * represents the start of the section it is within.  Note that we have to
>> - * do this by hand as the proffered address may not be correctly aligned.
>> - * Subtraction of non-aligned pointers produces undefined results.
>> - */
>> -static unsigned long __meminit vmemmap_section_start(unsigned long page)
>> -{
>> -       unsigned long offset = page - ((unsigned long)(vmemmap));
>> -
>> -       /* Return the pfn of the start of the section. */
>> -       return (offset / sizeof(struct page)) & PAGE_SECTION_MASK;
>> -}
>> -
>> -/*
>> - * Check if this vmemmap page is already initialised.  If any section
>> + * Check if this vmemmap page is already initialised.  If any sub section
>>    * which overlaps this vmemmap page is initialised then this page is
>>    * initialised already.
>>    */
>> -static int __meminit vmemmap_populated(unsigned long start, int page_size)
>> +
>> +static int __meminit vmemmap_populated(unsigned long start, int size)
>>   {
>> -       unsigned long end = start + page_size;
>> -       start = (unsigned long)(pfn_to_page(vmemmap_section_start(start)));
>> +       unsigned long end = start + size;
>>
>> -       for (; start < end; start += (PAGES_PER_SECTION * sizeof(struct page)))
>> +       /* start is size aligned and it is always > sizeof(struct page) */
>> +       VM_BUG_ON(start & sizeof(struct page));
> 
> If start is size aligned why not include that assumption in the VM_BUG_ON()?
> 
> Otherwise it seems this patch could be reduced simply by:
> 
> s/PAGE_SECTION_MASK/PAGE_SUBSECTION_MASK/
> s/PAGES_PER_SECTION/PAGES_PER_SUBSECTION/
> 
> ...and leave the vmemmap_section_start() function in place? In other
> words this path used to guarantee that 'start' was aligned to the
> minimum mem-hotplug granularity, the change looks ok on the surface,
> but it seems a subtle change in semantics.
> 


How about this?

/*
  * Given an address within the vmemmap, determine the page that
  * represents the start of the subsection it is within.  Note that we 
have to
  * do this by hand as the proffered address may not be correctly aligned.
  * Subtraction of non-aligned pointers produces undefined results.
  */
static struct page * __meminit vmemmap_subsection_start(unsigned long 
vmemmap_addr)
{
	unsigned long start_pfn;
	unsigned long offset = vmemmap_addr - ((unsigned long)(vmemmap));

	/* Return the pfn of the start of the section. */
	start_pfn = (offset / sizeof(struct page)) & PAGE_SUBSECTION_MASK;
	return pfn_to_page(start_pfn);
}

/*
  * Since memory is added in sub-section chunks, before creating a new 
vmemmap
  * mapping, the kernel should check whether there is an existing memmap 
mapping
  * covering the new subsection added. This is needed because kernel can map
  * vmemmap area using 16MB pages which will cover a memory range of 
16G. Such
  * a range covers multiple subsections (2M)
  *
  * If any subsection in the 16G range mapped by vmemmap is valid we 
consider the
  * vmemmap populated (There is a page table entry already present). We 
can't do
  * a page table lookup here because with the hash translation we don't keep
  * vmemmap details in linux page table.
  */
static int __meminit vmemmap_populated(unsigned long vmemmap_addr, int 
vmemmap_map_size)
{
	struct page *start;
	unsigned long vmemmap_end = vmemmap_addr + vmemmap_map_size;
	start = vmemmap_subsection_start(vmemmap_addr);

	for (; (unsigned long)start < vmemmap_end; start += PAGES_PER_SUBSECTION)
		/*
		 * pfn valid check here is intended to really check
		 * whether we have any subsection already initialized
		 * in this range.
		 */
		if (pfn_valid(page_to_pfn(start)))
			return 1;

	return 0;
}




> Can you get an ack from a powerpc maintainer, or maybe this patch
> should route through the powerpc tree?
> 

I guess this can go via powerpc tree. I will send an update V2 patch for 
this alone and work with Michael to pick that up.

> I'll take patch1 through the nvdimm tree.
> 

Thanks.

-aneesh



_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

  reply	other threads:[~2019-09-17  9:55 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-10  6:28 [PATCH 1/2] libnvdimm/altmap: Track namespace boundaries in altmap Aneesh Kumar K.V
2019-09-10  6:28 ` [PATCH 2/2] powerpc/nvdimm: Update vmemmap_populated to check sub-section range Aneesh Kumar K.V
2019-09-16 17:46   ` Dan Williams
2019-09-17  9:54     ` Aneesh Kumar K.V [this message]
2019-09-17  5:47   ` Oliver O'Halloran
2019-09-10  7:40 ` [PATCH 1/2] libnvdimm/altmap: Track namespace boundaries in altmap Pankaj Gupta
2019-09-10  8:10 ` Dan Williams
2019-09-10  8:30   ` Aneesh Kumar K.V
2019-09-10  9:08     ` Dan Williams
2019-09-10  8:29 ` Santosh Sivaraj
2019-09-11 12:03 ` Johannes Thumshirn
2019-09-16 17:58 ` Dan Williams
2019-09-17  7:39   ` Aneesh Kumar K.V
2019-09-17 14:24     ` Dan Williams

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cc1e080a-eb95-dab7-a77b-a05e12624578@linux.ibm.com \
    --to=aneesh.kumar@linux.ibm.com \
    --cc=dan.j.williams@intel.com \
    --cc=linux-nvdimm@lists.01.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mpe@ellerman.id.au \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).