From: Dan Williams <dan.j.williams@intel.com>
To: akpm@linux-foundation.org
Cc: david@redhat.com, "Bjorn Helgaas" <bhelgaas@google.com>,
	"Dave Hansen" <dave.hansen@linux.intel.com>,
	"David Airlie" <airlied@linux.ie>, joao.m.martins@oracle.com,
	"Jonathan Cameron" <Jonathan.Cameron@huawei.com>,
	"Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
	"Pavel Tatashin" <pasha.tatashin@soleen.com>,
	"Hulk Robot" <hulkci@huawei.com>, "Ben Skeggs" <bskeggs@redhat.com>,
	"Benjamin Herrenschmidt" <benh@kernel.crashing.org>,
	"Jia He" <justin.he@arm.com>, "Jérôme Glisse" <jglisse@redhat.com>,
	"Jason Yan" <yanaijie@huawei.com>,
	"Paul Mackerras" <paulus@ozlabs.org>,
	"Boris Ostrovsky" <boris.ostrovsky@oracle.com>,
	"Brice Goglin" <Brice.Goglin@inria.fr>,
	"Stefano Stabellini" <sstabellini@kernel.org>,
	"Michael Ellerman" <mpe@ellerman.id.au>,
	"Dan Carpenter" <dan.carpenter@oracle.com>,
	"Juergen Gross" <jgross@suse.com>, "Daniel Vetter" <daniel@ffwll.ch>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	linux-nvdimm@lists.01.org
Subject: [PATCH v6 00/11] device-dax: support sub-dividing soft-reserved ranges
Date: Mon, 05 Oct 2020 23:54:44 -0700
Message-ID: <160196728453.2166475.12832711415715687418.stgit@dwillia2-desk3.amr.corp.intel.com>

Changes since v5 [1]:
- (David) Introduce range_len() to include/linux/range.h immediately in
  "device-dax: make pgmap optional for instance creation" rather than
  wait until "mm/memremap_pages: convert to 'struct range'" to move it.

- (David) David points out that release_mem_region() cannot be used in
  the kmem driver since it depends on the resource range being busy at
  free. The dance the driver does to hand off busy/free management to
  add_memory_driver_managed() breaks request_mem_region()'s assumptions
  and requires the driver to continue to use an open-coded
  release_resource() + kfree() sequence. For the new multi-range case,
  expand the driver-data to hold all the resulting 'struct resource'
  instances from mapping the ranges.

- (Boris) Consolidate pgmap manipulation code in the
  xen_alloc_unpopulated_pages() path. Since this touched
  "mm/memremap_pages: convert to 'struct range'" with the pending fix
  from Dan, I folded in that fix and gave him a Reported-by credit.

[1]: http://lore.kernel.org/r/160106109960.30709.7379926726669669398.stgit@dwillia2-desk3.amr.corp.intel.com

---

Hi Andrew,

As before, patches that are in your tree and did not change as a result
of these updates are not re-sent. This set replaces:

device-dax-make-pgmap-optional-for-instance-creation.patch

...through...

device-dax-add-dis-contiguous-resource-support.patch

...in your stack.

I let this soak over the weekend in a kbuild-robot visible tree and it
received a build success notification over 160 configs, and no other
regression notices.

---

The device-dax facility allows an address range to be directly mapped
through a chardev, or optionally hotplugged to the core kernel page
allocator as System-RAM. It is the mechanism for converting persistent
memory (pmem) to be used as another volatile memory pool, i.e. the
current Memory Tiering hot topic on linux-mm.

In the case of pmem the nvdimm-namespace-label mechanism can sub-divide
it, but that labeling mechanism is not available / applicable to
soft-reserved ("EFI specific purpose") memory [2]. This series provides
a sysfs-mechanism for the daxctl utility to enable provisioning of
volatile-soft-reserved memory ranges.

The motivations for this facility are:

1/ Allow performance differentiated memory ranges to be split between
   kernel-managed and directly-accessed use cases.

2/ Allow physical memory to be provisioned along performance relevant
   address boundaries. For example, divide a memory-side cache [3] along
   cache-color boundaries.

3/ Parcel out soft-reserved memory to VMs using device-dax as a security
   / permissions boundary [4]. Specifically I have seen people (ab)using
   memmap=nn!ss (mark System-RAM as Persistent Memory) just to get the
   device-dax interface on custom address ranges. A follow-on for the VM
   use case is to teach device-dax to dynamically allocate 'struct page'
   at runtime to reduce the duplication of 'struct page' space in both
   the guest and the host kernel for the same physical pages.

[2]: http://lore.kernel.org/r/157309097008.1579826.12818463304589384434.stgit@dwillia2-desk3.amr.corp.intel.com
[3]: http://lore.kernel.org/r/154899811738.3165233.12325692939590944259.stgit@dwillia2-desk3.amr.corp.intel.com
[4]: http://lore.kernel.org/r/20200110190313.17144-1-joao.m.martins@oracle.com

---

Dan Williams (11):
      device-dax: make pgmap optional for instance creation
      device-dax/kmem: introduce dax_kmem_range()
      device-dax/kmem: move resource tracking to drvdata
      device-dax: add an allocation interface for device-dax instances
      device-dax: introduce 'struct dev_dax' typed-driver operations
      device-dax: introduce 'seed' devices
      drivers/base: make device_find_child_by_name() compatible with sysfs inputs
      device-dax: add resize support
      mm/memremap_pages: convert to 'struct range'
      mm/memremap_pages: support multiple ranges per invocation
      device-dax: add dis-contiguous resource support

 arch/powerpc/kvm/book3s_hv_uvmem.c     |   14 -
 drivers/base/core.c                    |    2
 drivers/dax/bus.c                      |  708 ++++++++++++++++++++++++++++++--
 drivers/dax/bus.h                      |   11
 drivers/dax/dax-private.h              |   23 +
 drivers/dax/device.c                   |   71 ++-
 drivers/dax/hmem/hmem.c                |   14 -
 drivers/dax/kmem.c                     |  198 ++++++---
 drivers/dax/pmem/compat.c              |    2
 drivers/dax/pmem/core.c                |   14 -
 drivers/gpu/drm/nouveau/nouveau_dmem.c |   15 -
 drivers/nvdimm/badrange.c              |   26 +
 drivers/nvdimm/claim.c                 |   13 -
 drivers/nvdimm/nd.h                    |    3
 drivers/nvdimm/pfn_devs.c              |   13 -
 drivers/nvdimm/pmem.c                  |   27 +
 drivers/nvdimm/region.c                |   21 +
 drivers/pci/p2pdma.c                   |   12 -
 drivers/xen/unpopulated-alloc.c        |   49 +-
 include/linux/memremap.h               |   11
 include/linux/range.h                  |    6
 lib/test_hmm.c                         |   51 +-
 mm/memremap.c                          |  299 ++++++++------
 tools/testing/nvdimm/dax-dev.c         |   22 +
 tools/testing/nvdimm/test/iomap.c      |    2
 25 files changed, 1216 insertions(+), 411 deletions(-)

base-commit: d524ed85683d657593ac1e58098407bed0601a84
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org
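
For readers unfamiliar with the range_len() helper named in the changelog, the following is a minimal userspace sketch, not code from this series. It re-creates 'struct range' with an inclusive 'end' and the same length arithmetic, using uint64_t in place of the kernel's u64. The split_range() helper is hypothetical and exists only to illustrate sub-dividing one range into fixed-size pieces; it is not the allocation logic of drivers/dax/bus.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's 'struct range'; 'end' is inclusive. */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Length of an inclusive range, hence the +1. */
static inline uint64_t range_len(const struct range *range)
{
	return range->end - range->start + 1;
}

/*
 * Hypothetical helper (an assumption for illustration, not from the
 * series): carve 'parent' into pieces of at most 'chunk' bytes and
 * store up to 'max' of them in 'out'. Returns the number stored.
 */
static size_t split_range(const struct range *parent, uint64_t chunk,
			  struct range *out, size_t max)
{
	uint64_t remaining = range_len(parent);
	uint64_t start = parent->start;
	size_t n = 0;

	assert(chunk > 0);
	while (remaining && n < max) {
		uint64_t len = remaining < chunk ? remaining : chunk;

		out[n].start = start;
		out[n].end = start + len - 1;	/* keep 'end' inclusive */
		start += len;
		remaining -= len;
		n++;
	}
	return n;
}
```

Under these assumptions, splitting a 4 GiB range that starts at 0x100000000 with a 1 GiB chunk size yields four sub-ranges, the last of which ends at 0x1ffffffff.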