linux-nvdimm.lists.01.org archive mirror
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
To: Dan Williams <dan.j.williams@intel.com>, Jeff Moyer <jmoyer@redhat.com>
Cc: linux-nvdimm <linux-nvdimm@lists.01.org>,
	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
Subject: Re: [PATCH v4 1/6] libnvdimm/namespace: Make namespace size validation arch dependent
Date: Sun, 26 Jan 2020 17:11:02 +0530
Message-ID: <871rrmqwc1.fsf@linux.ibm.com>
In-Reply-To: <CAPcyv4jPBfhC4t5+e2gxhzKfaLdQi_qTKfLEcXdo-MjTA5a+aw@mail.gmail.com>

Dan Williams <dan.j.williams@intel.com> writes:

> On Fri, Jan 24, 2020 at 9:07 AM Aneesh Kumar K.V
> <aneesh.kumar@linux.ibm.com> wrote:
>>
>> On 1/24/20 10:15 PM, Dan Williams wrote:
>> > On Thu, Jan 23, 2020 at 11:34 PM Aneesh Kumar K.V
>> > <aneesh.kumar@linux.ibm.com> wrote:
>> >>
>> >> On 1/24/20 11:27 AM, Dan Williams wrote:
>> >>> On Mon, Jan 20, 2020 at 6:08 AM Aneesh Kumar K.V
>> >>>
>> >>
>> >> ....
>> >>
>> >>>>
>> >>>> +unsigned long arch_namespace_map_size(void)
>> >>>> +{
>> >>>> +       return PAGE_SIZE;
>> >>>> +}
>> >>>> +EXPORT_SYMBOL_GPL(arch_namespace_map_size);
>> >>>> +
>> >>>> +
>> >>>>    static void __cpa_flush_all(void *arg)
>> >>>>    {
>> >>>>           unsigned long cache = (unsigned long)arg;
>> >>>> diff --git a/include/linux/libnvdimm.h b/include/linux/libnvdimm.h
>> >>>> index 9df091bd30ba..a3476dbd2656 100644
>> >>>> --- a/include/linux/libnvdimm.h
>> >>>> +++ b/include/linux/libnvdimm.h
>> >>>> @@ -284,4 +284,5 @@ static inline void arch_invalidate_pmem(void *addr, size_t size)
>> >>>>    }
>> >>>>    #endif
>> >>>>
>> >>>> +unsigned long arch_namespace_map_size(void);
>> >>>
>> >>> This property is more generic than the nvdimm namespace mapping size,
>> >>> it's more the fundamental remap granularity that the architecture
>> >>> supports. So I would expect this to be defined in core header files.
>> >>> Something like:
>> >>>
>> >>> diff --git a/include/linux/io.h b/include/linux/io.h
>> >>> index a59834bc0a11..58b3b2091dbb 100644
>> >>> --- a/include/linux/io.h
>> >>> +++ b/include/linux/io.h
>> >>> @@ -155,6 +155,13 @@ enum {
>> >>>    void *memremap(resource_size_t offset, size_t size, unsigned long flags);
>> >>>    void memunmap(void *addr);
>> >>>
>> >>> +#ifndef memremap_min_align
>> >>> +static inline unsigned int memremap_min_align(void)
>> >>> +{
>> >>> +       return PAGE_SIZE;
>> >>> +}
>> >>> +#endif
>> >>> +
>> >>
>> >>
>> >> Should that be memremap_pages_min_align()?
>> >
>> > No, and on second look it needs to be a common value that results in
>> > properly aligned / sized namespaces across architectures.
>> >
>> > What would it take for Power to make its minimum mapping granularity
>> > SUBSECTION_SIZE? The minute that the minimum alignment changes across
>> > architectures we lose compatibility.
>> >
>> > The namespaces need to be sized such that the mode can be changed freely.
>> >
>>
>> Linux on ppc64 with hash translation uses just one page size for direct
>> mapping, and that is 16MB.
>
> Ok, I think this means that the dream of SUBSECTION_SIZE being the
> minimum compat alignment is dead, or at least a dream deferred.
>
> Let's do this, change the name of this function to:
>
>     memremap_compat_align()
>
> ...and define it to be the max() of all the alignment constraints that
> the arch may require through either memremap(), or memremap_pages().
> Then, teach ndctl to make its default alignment compatible by default,
> 16MiB, with an override to allow namespace creation with the current
> architecture's memremap_compat_align(), exported via sysfs, if it
> happens to be less than 16MiB. Finally, cross our fingers and hope
> that Power remains the only arch that violates the SUBSECTION_SIZE
> minimum value for memremap_compat_align().

We do have two issues related to alignment here.

1) With the upstream kernel, we don't validate the namespace start and size
values, and hence we can end up creating a namespace that is not aligned to
SUBSECTION_SIZE. This was observed by Jeff Moyer in his testing. That means
we will fail to enable already-created namespaces if we use
SUBSECTION_SIZE to validate their alignment.

The solution I came up with was arch_namespace_map_size(), which depends on the
direct-map mapping page size. On an architecture like ppc64, this value can
be 16MB.
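
To make the failure mode in (1) concrete, here is a minimal user-space
sketch, not the patch itself. struct ns_range and both helpers are
hypothetical names used only for illustration; SUBSECTION_SIZE is
taken as 2MiB (SUBSECTION_SHIFT of 21), and the 4K value in the usage
note assumes an architecture whose map size is a 4K PAGE_SIZE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 2 MiB, assuming SUBSECTION_SHIFT == 21 as in current kernels. */
#define SUBSECTION_SIZE (UINT64_C(1) << 21)

/* Hypothetical stand-in for a namespace's resource range. */
struct ns_range {
	uint64_t start;
	uint64_t size;
};

static bool is_aligned(uint64_t value, uint64_t align)
{
	/* The mask test below is only valid for power-of-two alignments. */
	assert(align != 0 && (align & (align - 1)) == 0);
	return (value & (align - 1)) == 0;
}

/* True when both the start address and the size are multiples of align. */
static bool ns_range_is_aligned(const struct ns_range *ns, uint64_t align)
{
	return is_aligned(ns->start, align) && is_aligned(ns->size, align);
}
```

A namespace created without validation, say with a size of 1GiB + 4K,
fails ns_range_is_aligned() against SUBSECTION_SIZE but passes against
a 4K map size, which is why validating existing namespaces against
SUBSECTION_SIZE would refuse to enable them.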

2) For new namespaces, we can now ensure they are properly
aligned. For architectures other than ppc64 the required alignment is
SUBSECTION_SIZE; i.e., the resource start address and the size should both
be aligned to SUBSECTION_SIZE. For ppc64 this value should be 16MB, because
a range that is not 16MB aligned cannot be direct-mapped.
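
As a rough sketch of what creation-time alignment could look like (an
illustration under stated assumptions, not the code in this series):
enum arch_kind, compat_align_for() and trim_to_align() are hypothetical
names, SUBSECTION_SIZE is assumed to be 2MiB, and the 16MB value is the
ppc64 hash direct-map page size mentioned above.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_2M  (UINT64_C(2) << 20)   /* assumed SUBSECTION_SIZE */
#define SZ_16M (UINT64_C(16) << 20)  /* ppc64 hash direct-map page size */

/* Hypothetical arch selector, only for this illustration. */
enum arch_kind { ARCH_GENERIC, ARCH_PPC64_HASH };

/* Largest alignment the arch needs, never below SUBSECTION_SIZE. */
static uint64_t compat_align_for(enum arch_kind arch)
{
	uint64_t direct_map = (arch == ARCH_PPC64_HASH) ? SZ_16M : 0;

	return direct_map > SZ_2M ? direct_map : SZ_2M;
}

/*
 * Shrink [*start, *start + *size) to the largest sub-range whose start
 * and size are both multiples of align. Returns 0 on success, or -1 if
 * no aligned sub-range fits. Assumes *start + *size does not overflow.
 */
static int trim_to_align(uint64_t *start, uint64_t *size, uint64_t align)
{
	uint64_t end, new_start, new_end;

	assert(align != 0 && (align & (align - 1)) == 0);

	end = *start + *size;
	new_start = (*start + align - 1) & ~(align - 1); /* round up */
	new_end = end & ~(align - 1);                    /* round down */
	if (new_start >= new_end)
		return -1;

	*start = new_start;
	*size = new_end - new_start;
	return 0;
}
```

With a free range that starts 2MiB past a 16MiB boundary, the generic
alignment leaves the range untouched, while the ppc64 alignment moves
the start up and trims the tail so that the result can be direct-mapped.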

I guess this can be memremap_compat_align(), and we expose this value via a
namespace attribute. By default, all architectures will now try to
align things to 16MB unless --nocompat is specified as an ndctl
create-namespace command-line option. When it is specified, we use the
architecture-specific sysfs value (SUBSECTION_SIZE) to align things
correctly.
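
The selection logic on the ndctl side might look roughly like the
sketch below. This is only an illustration: pick_namespace_align() is a
hypothetical helper, --nocompat is just the option proposed in this
thread, and arch_align stands for whatever value the kernel would
export via sysfs. Never going below the arch value in the default case
is my assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Cross-architecture default discussed in this thread: 16MiB. */
#define COMPAT_ALIGN_DEFAULT (UINT64_C(16) << 20)

/*
 * arch_align: alignment reported by the running kernel (hypothetical
 *             sysfs attribute backed by memremap_compat_align()).
 * nocompat:   true when the user passed the proposed --nocompat option.
 */
static uint64_t pick_namespace_align(uint64_t arch_align, bool nocompat)
{
	if (nocompat)
		return arch_align;

	/* Assumption: the default never drops below what the arch needs. */
	return arch_align > COMPAT_ALIGN_DEFAULT ? arch_align
						 : COMPAT_ALIGN_DEFAULT;
}
```

On an architecture that reports 2MiB, the default yields 16MiB and
--nocompat yields 2MiB; on ppc64 with hash translation both cases
yield 16MiB.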


-aneesh
