From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
To: Dan Williams <dan.j.williams@intel.com>, Jeff Moyer <jmoyer@redhat.com>
Cc: linux-nvdimm <linux-nvdimm@lists.01.org>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Paul Mackerras <paulus@samba.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
Subject: Re: [PATCH v2 1/4] mm/memremap_pages: Introduce memremap_compat_align()
Date: Fri, 14 Feb 2020 08:56:22 +0530
Message-ID: <878sl5x31d.fsf@linux.ibm.com>
In-Reply-To: <CAPcyv4i8xNEsdX=8c2+ehf24U2AFcc-sKmAPS9UoVvm8z0aRng@mail.gmail.com>

Dan Williams <dan.j.williams@intel.com> writes:

> On Thu, Feb 13, 2020 at 8:58 AM Jeff Moyer <jmoyer@redhat.com> wrote:
>>
>> Dan Williams <dan.j.williams@intel.com> writes:
>>
>> > The "sub-section memory hotplug" facility allows memremap_pages() users
>> > like libnvdimm to compensate for hardware platforms like x86 that have a
>> > section size larger than their hardware memory mapping granularity.  The
>> > compensation that sub-section support affords is being tolerant of
>> > physical memory resources shifting by units smaller (64MiB on x86) than
>> > the memory-hotplug section size (128 MiB). Where the platform
>> > physical-memory mapping granularity is limited by the number and
>> > capability of address-decode-registers in the memory controller.
>> >
>> > While the sub-section support allows memremap_pages() to operate on
>> > sub-section (2MiB) granularity, the Power architecture may still
>> > require 16MiB alignment on "!radix_enabled()" platforms.
>> >
>> > In order for libnvdimm to be able to detect and manage this per-arch
>> > limitation, introduce memremap_compat_align() as a common minimum
>> > alignment across all driver-facing memory-mapping interfaces, and let
>> > Power override it to 16MiB in the "!radix_enabled()" case.
>> >
>> > The assumption / requirement for 16MiB to be a viable
>> > memremap_compat_align() value is that Power does not have platforms
>> > where its equivalent of address-decode-registers ever hardware remaps a
>> > persistent memory resource on smaller than 16MiB boundaries. Note that I
>> > tried my best to not add a new Kconfig symbol, but header include
>> > entanglements defeated the #ifndef memremap_compat_align design pattern
>> > and the need to export it defeats the __weak design pattern for arch
>> > overrides.
>> >
>> > Based on an initial patch by Aneesh.
>>
>> I have just a couple of questions.
>>
>> First, can you please add a comment above the generic implementation of
>> memremap_compat_align describing its purpose, and why a platform might
>> want to override it?
>
> Sure, how about:
>
> /*
>  * The memremap() and memremap_pages() interfaces are alternately used
>  * to map persistent memory namespaces. These interfaces place different
>  * constraints on the alignment and size of the mapping (namespace).
>  * memremap() can map individual PAGE_SIZE pages. memremap_pages() can
>  * only map subsections (2MB), and on at least one architecture (PowerPC)
>  * the minimum mapping granularity of memremap_pages() is 16MB.
>  *
>  * The role of memremap_compat_align() is to communicate the minimum
>  * arch supported alignment of a namespace such that it can freely
>  * switch modes without violating the arch constraint. Namely, do not
>  * allow a namespace to be PAGE_SIZE aligned since that namespace may be
>  * reconfigured into a mode that requires SUBSECTION_SIZE alignment.
>  */
>
>> Second, I will take it at face value that the power architecture
>> requires a 16MB alignment, but it's not clear to me why mmu_linear_psize
>> was chosen to represent that.  What's the relationship, there, and can
>> we please have a comment explaining it?
>
> Aneesh, can you help here?

With hash translation, we map the direct-map range with just one page
size. Depending on the restrictions described in htab_init_page_sizes(),
we can end up choosing 16M, 64K or even 4K. The variable
mmu_linear_psize indicates which page size was used for the direct-map
range.

I.e., we should do:

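 +/*
 + * With hash translation the direct map uses a single page size,
 + * recorded in mmu_linear_psize, so a namespace must be aligned to at
 + * least that size. Radix only needs sub-section alignment.
 + */
 +unsigned long arch_namespace_align_size(void)
 +{
 +	unsigned long sub_section_size = (1UL << SUBSECTION_SHIFT);
 +
 +	if (radix_enabled())
 +		return sub_section_size;
 +	return max(sub_section_size, (1UL << mmu_psize_defs[mmu_linear_psize].shift));
 +}
 +EXPORT_SYMBOL_GPL(arch_namespace_align_size);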

as done here:

https://lore.kernel.org/linux-nvdimm/20200120140749.69549-4-aneesh.kumar@linux.ibm.com/
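
To make the arithmetic concrete, here is a rough user-space model of the
helper above. It is only a sketch: namespace_align_size() and
self_check() are hypothetical names, the MMU mode and linear page shift
are passed in as parameters instead of being read from radix_enabled()
and mmu_psize_defs[], and the shifts (21 for a 2MiB sub-section, 12/16/24
for 4K/64K/16M pages) are hard-coded assumptions.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed for illustration: 2MiB sub-sections, as in the quoted comment. */
#define SUBSECTION_SHIFT 21

/*
 * Hypothetical stand-in for arch_namespace_align_size(). The caller
 * supplies the MMU mode and the shift of the page size used for the
 * direct-map range, which the kernel would look up itself.
 */
unsigned long namespace_align_size(bool radix, unsigned int linear_shift)
{
	unsigned long sub_section_size = 1UL << SUBSECTION_SHIFT;
	unsigned long linear_size = 1UL << linear_shift;

	if (radix)
		return sub_section_size;
	/* Hash: never go below the sub-section size. */
	return linear_size > sub_section_size ? linear_size : sub_section_size;
}

/* Walk through the three hash page sizes mentioned above, plus radix. */
void self_check(void)
{
	assert(namespace_align_size(false, 24) == (16UL << 20)); /* 16M */
	assert(namespace_align_size(false, 16) == (2UL << 20));  /* 64K */
	assert(namespace_align_size(false, 12) == (2UL << 20));  /* 4K */
	assert(namespace_align_size(true, 24) == (2UL << 20));   /* radix */
}
```

So only a 16M linear mapping raises the alignment above the sub-section
size. With 64K or 4K the max() leaves it at 2MiB, which is why
hard-coding 16MiB for every "!radix_enabled()" platform would be
stricter than needed.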

Dan, can you update the powerpc definition?

-aneesh

Thread overview:
2020-02-13  0:48 [PATCH v2 0/4] libnvdimm: Cross-arch compatible namespace alignment Dan Williams
2020-02-13  0:48 ` [PATCH v2 1/4] mm/memremap_pages: Introduce memremap_compat_align() Dan Williams
2020-02-13 16:57   ` Jeff Moyer
2020-02-13 18:26     ` Dan Williams
2020-02-14  3:26       ` Aneesh Kumar K.V [this message]
2020-02-14 20:59       ` Jeff Moyer
2020-02-14 23:05         ` Dan Williams
2020-02-13  0:48 ` [PATCH v2 2/4] libnvdimm/namespace: Enforce memremap_compat_align() Dan Williams
2020-02-13 19:16   ` Jeff Moyer
2020-02-13 21:55   ` Jeff Moyer
2020-02-13 22:43     ` Dan Williams
2020-02-14 16:44       ` Jeff Moyer
2020-02-14 16:55         ` Aneesh Kumar K.V
2020-02-13  0:48 ` [PATCH v2 3/4] libnvdimm/region: Introduce NDD_LABELING Dan Williams
2020-02-13 19:12   ` Jeff Moyer
2020-02-13  0:48 ` [PATCH v2 4/4] libnvdimm/region: Introduce an 'align' attribute Dan Williams
2020-02-14 20:19   ` Jeff Moyer
2020-02-14 21:03 ` [PATCH v2 0/4] libnvdimm: Cross-arch compatible namespace alignment Jeff Moyer
