From: Jeff Moyer <jmoyer@redhat.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: linux-nvdimm <linux-nvdimm@lists.01.org>,
"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
Subject: Re: [PATCH v2 2/4] libnvdimm/namespace: Enforce memremap_compat_align()
Date: Fri, 14 Feb 2020 11:44:38 -0500
Message-ID: <x49h7ztdsp5.fsf@segfault.boston.devel.redhat.com>
In-Reply-To: <CAPcyv4hQouRNBcJ4uZ2mysr_aKstLhvUf66gRQ_3QoQNyOy72g@mail.gmail.com> (Dan Williams's message of "Thu, 13 Feb 2020 14:43:28 -0800")
Dan Williams <dan.j.williams@intel.com> writes:
> On Thu, Feb 13, 2020 at 1:55 PM Jeff Moyer <jmoyer@redhat.com> wrote:
>>
>> Dan Williams <dan.j.williams@intel.com> writes:
>>
>> > The pmem driver on PowerPC crashes with the following signature when
>> > instantiating misaligned namespaces that map their capacity via
>> > memremap_pages().
>> >
>> > BUG: Unable to handle kernel data access at 0xc001000406000000
>> > Faulting instruction address: 0xc000000000090790
>> > NIP [c000000000090790] arch_add_memory+0xc0/0x130
>> > LR [c000000000090744] arch_add_memory+0x74/0x130
>> > Call Trace:
>> > arch_add_memory+0x74/0x130 (unreliable)
>> > memremap_pages+0x74c/0xa30
>> > devm_memremap_pages+0x3c/0xa0
>> > pmem_attach_disk+0x188/0x770
>> > nvdimm_bus_probe+0xd8/0x470
>> >
>> > With the assumption that only memremap_pages() has alignment
>> > constraints, enforce memremap_compat_align() for
>> > pmem_should_map_pages(), nd_pfn, or nd_dax cases.
>> >
>> > Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>> > Cc: Jeff Moyer <jmoyer@redhat.com>
>> > Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>> > Link: https://lore.kernel.org/r/158041477336.3889308.4581652885008605170.stgit@dwillia2-desk3.amr.corp.intel.com
>> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>> > ---
>> > drivers/nvdimm/namespace_devs.c | 10 ++++++++++
>> > 1 file changed, 10 insertions(+)
>> >
>> > diff --git a/drivers/nvdimm/namespace_devs.c b/drivers/nvdimm/namespace_devs.c
>> > index 032dc61725ff..aff1f32fdb4f 100644
>> > --- a/drivers/nvdimm/namespace_devs.c
>> > +++ b/drivers/nvdimm/namespace_devs.c
>> > @@ -1739,6 +1739,16 @@ struct nd_namespace_common *nvdimm_namespace_common_probe(struct device *dev)
>> > return ERR_PTR(-ENODEV);
>> > }
>> >
>> > + if (pmem_should_map_pages(dev) || nd_pfn || nd_dax) {
>> > + struct nd_namespace_io *nsio = to_nd_namespace_io(&ndns->dev);
>> > + resource_size_t start = nsio->res.start;
>> > +
>> > + if (!IS_ALIGNED(start | size, memremap_compat_align())) {
>> > + dev_dbg(&ndns->dev, "misaligned, unable to map\n");
>> > + return ERR_PTR(-EOPNOTSUPP);
>> > + }
>> > + }
>> > +
>> > if (is_namespace_pmem(&ndns->dev)) {
>> > struct nd_namespace_pmem *nspm;
>> >
>>
>> Actually, I take back my ack. :) This prevents a previously working
>> namespace from being successfully probed/setup.
>
> Do you have a test case handy? I can see a potential gap with a
> namespace that used internal padding to fix up the alignment.
# ndctl list -v -n namespace0.0
[
{
"dev":"namespace0.0",
"mode":"fsdax",
"map":"dev",
"size":52846133248,
"uuid":"b99f6f6a-2909-4189-9bfa-6eeebd95d40e",
"raw_uuid":"aff43777-015b-493f-bbf9-7c7b0fe33519",
"sector_size":512,
"align":4096,
"blockdev":"pmem0",
"numa_node":0
}
]
# cat /sys/bus/nd/devices/region0/mappings
6
# grep namespace0.0 /proc/iomem
1860000000-24e0003fff : namespace0.0
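For reference, the arithmetic behind the failed probe can be sketched in
userspace. This is only an illustration, not the kernel code: is_aligned(),
range_size(), namespace_mappable() and SUBSECTION_SIZE_ASSUMED are
hypothetical names, and the 2 MiB value assumes the x86_64 SUBSECTION_SIZE
(SUBSECTION_SHIFT of 21).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 2 MiB, i.e. 1 << SUBSECTION_SHIFT with a shift of 21 on x86_64. */
#define SUBSECTION_SIZE_ASSUMED (UINT64_C(1) << 21)

/* Userspace stand-in for the kernel's IS_ALIGNED(); align must be a power of two. */
bool is_aligned(uint64_t x, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x & (align - 1)) == 0;
}

/* Size in bytes of an inclusive "start-end" range as printed in /proc/iomem. */
uint64_t range_size(uint64_t start, uint64_t end)
{
	return end - start + 1;
}

/* Mirrors the shape of the patch's test: start and size must both be aligned. */
bool namespace_mappable(uint64_t start, uint64_t end, uint64_t align)
{
	return is_aligned(start | range_size(start, end), align);
}
```

With these helpers, namespace_mappable(0x1860000000, 0x24e0003fff,
SUBSECTION_SIZE_ASSUMED) evaluates to false: the start address is 2 MiB
aligned, but the size works out to 0xc80004000, which leaves a 16 KiB
remainder.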
> The goal of this check is to catch cases that are just going to fail
> devm_memremap_pages(), and the expectation is that it could not have
> worked before unless it was ported from another platform, or someone
> flipped the page-size switch on PowerPC.
On x86, creation and probing of the namespace worked fine before this
patch. What *doesn't* work is creating another fsdax namespace after
this one. Sector-mode namespaces can still be created, though:
[
{
"dev":"namespace0.1",
"mode":"sector",
"size":53270768640,
"uuid":"67ea2c74-d4b1-4fc9-9c1a-a7d2a6c2a4a7",
"sector_size":512,
"blockdev":"pmem0.1s"
},
# grep namespace0.1 /proc/iomem
24e0004000-3160007fff : namespace0.1
>> I thought we were only going to enforce the alignment for a newly
>> created namespace? This should only check whether the alignment
>> works for the current platform.
>
> The model is a new default 16MB alignment is enforced at creation
> time, but if you need to support previously created namespaces then
> you can manually trim that alignment requirement to no less than
> memremap_compat_align() because that's the point at which
> devm_memremap_pages() will start failing or crashing.
The problem is that older kernels did not enforce alignment to
SUBSECTION_SIZE. We shouldn't prevent those namespaces from being
accessed. The probe itself will not cause the WARN_ON to trigger.
Creating new namespaces at misaligned addresses could, but you've
altered the free space allocation such that we won't hit that anymore.
If I drop this patch, the probe will still work, and allocating new
namespaces will also work:
# ndctl list
[
{
"dev":"namespace0.1",
"mode":"sector",
"size":53270768640,
"uuid":"67ea2c74-d4b1-4fc9-9c1a-a7d2a6c2a4a7",
"sector_size":512,
"blockdev":"pmem0.1s"
},
{
"dev":"namespace0.0",
"mode":"fsdax",
"map":"dev",
"size":52846133248,
"uuid":"b99f6f6a-2909-4189-9bfa-6eeebd95d40e",
"sector_size":512,
"align":4096,
"blockdev":"pmem0"
}
]
# ndctl create-namespace -m fsdax -s 36g -r 0
{
"dev":"namespace0.2",
"mode":"fsdax",
"map":"dev",
"size":"35.44 GiB (38.05 GB)",
"uuid":"7893264c-c7ef-4cbe-95e1-ccf2aff041fb",
"sector_size":512,
"align":2097152,
"blockdev":"pmem0.2"
}
/proc/iomem:
1860000000-d55fffffff : Persistent Memory
1860000000-24e0003fff : namespace0.0
24e0004000-3160007fff : namespace0.1
3162000000-3a61ffffff : namespace0.2
So, maybe the right thing is to make memremap_compat_align() return
PAGE_SIZE for x86 instead of SUBSECTION_SIZE?
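
To make the difference between the two candidate alignments concrete, here
is a rough userspace sketch that applies the same start-and-size test to
the three ranges from the /proc/iomem listing above. The names ns_range,
iomem_ranges, range_passes() and count_passing() are hypothetical, and the
values 4096 and 2 MiB assume the x86_64 PAGE_SIZE and SUBSECTION_SIZE. Note
that the patch only applies the check to the pmem_should_map_pages(),
nd_pfn and nd_dax cases, so the sector-mode namespace0.1 is included purely
to show the arithmetic.

```c
#include <stddef.h>
#include <stdint.h>

/* One inclusive address range, as printed in /proc/iomem. */
struct ns_range {
	const char *name;
	uint64_t start;
	uint64_t end;
};

/* Ranges copied from the /proc/iomem output in this message. */
const struct ns_range iomem_ranges[] = {
	{ "namespace0.0", 0x1860000000ULL, 0x24e0003fffULL },
	{ "namespace0.1", 0x24e0004000ULL, 0x3160007fffULL },
	{ "namespace0.2", 0x3162000000ULL, 0x3a61ffffffULL },
};

/* Nonzero when both start and size are multiples of align (a power of two). */
int range_passes(const struct ns_range *r, uint64_t align)
{
	uint64_t size = r->end - r->start + 1;

	return ((r->start | size) & (align - 1)) == 0;
}

/* Number of ranges in iomem_ranges that satisfy the given alignment. */
size_t count_passing(uint64_t align)
{
	size_t i, n = 0;

	for (i = 0; i < sizeof(iomem_ranges) / sizeof(iomem_ranges[0]); i++)
		n += range_passes(&iomem_ranges[i], align) ? 1 : 0;
	return n;
}
```

Under the assumed 2 MiB value only namespace0.2 passes, while all three
ranges pass with 4096.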
-Jeff