All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dan Williams <dan.j.williams@intel.com>
To: linux-mm@kvack.org
Cc: David Hildenbrand <david@redhat.com>,
	Jason Gunthorpe <jgg@ziepe.ca>,
	Bjorn Helgaas <bhelgaas@google.com>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Ben Skeggs <bskeggs@redhat.com>,
	Paul Mackerras <paulus@ozlabs.org>,
	Christoph Hellwig <hch@lst.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Pavel Tatashin <pasha.tatashin@soleen.com>,
	Joao Martins <joao.m.martins@oracle.com>,
	linux-nvdimm@lists.01.org, linux-kernel@vger.kernel.org
Subject: [PATCH 00/12] device-dax: Support sub-dividing soft-reserved ranges
Date: Mon, 23 Mar 2020 16:54:31 -0700	[thread overview]
Message-ID: <158500767138.2088294.17131646259803932461.stgit@dwillia2-desk3.amr.corp.intel.com> (raw)

The device-dax facility allows an address range to be directly mapped
through a chardev, or turned around and hotplugged to the core kernel
page allocator as System-RAM. It is the baseline mechanism for
converting persistent memory (pmem) to be used as another volatile
memory pool i.e. the current Memory Tiering hot topic on linux-mm.

In the case of pmem the nvdimm-namespace-label mechanism can sub-divide
it, but that labeling mechanism is not available / applicable to
soft-reserved ("EFI specific purpose") memory [1]. This series provides
a sysfs-mechanism for the daxctl utility to enable provisioning of
volatile-soft-reserved memory ranges.

The motivations for this facility are:

1/ Allow performance differentiated memory ranges to be split between
   kernel-managed and directly-accessed use cases.

2/ Allow physical memory to be provisioned along performance relevant
   address boundaries. For example, divide a memory-side cache [2] along
   cache-color boundaries.

3/ Parcel out soft-reserved memory to VMs using device-dax as a security
   / permissions boundary [3]. Specifically I have seen people (ab)using
   memmap=nn!ss (mark System-RAM as Peristent Memory) just to get the
   device-dax interface on custom address ranges.

The baseline for this series is today's next/master + "[PATCH v2 0/6]
Manual definition of Soft Reserved memory devices" [4].

Big thanks to Joao for the early testing and feedback on this series!

Given the dependencies on the memremap_pages() reworks in Andrew's tree
and the proximity to v5.7 this is clearly v5.8 material. The patches in
most need of a second opinion are the memremap_pages() reworks to switch
from 'struct resource' to 'struct range' and allow for an array of
ranges to be mapped at once.

[1]: https://lore.kernel.org/r/157309097008.1579826.12818463304589384434.stgit@dwillia2-desk3.amr.corp.intel.com/
[2]: https://lore.kernel.org/r/154899811738.3165233.12325692939590944259.stgit@dwillia2-desk3.amr.corp.intel.com/
[3]: https://lore.kernel.org/r/20200110190313.17144-1-joao.m.martins@oracle.com/
[4]: http://lore.kernel.org/r/158489354353.1457606.8327903161927980740.stgit@dwillia2-desk3.amr.corp.intel.com/

---

Dan Williams (12):
      device-dax: Drop the dax_region.pfn_flags attribute
      device-dax: Move instance creation parameters to 'struct dev_dax_data'
      device-dax: Make pgmap optional for instance creation
      device-dax: Kill dax_kmem_res
      device-dax: Add an allocation interface for device-dax instances
      device-dax: Introduce seed devices
      drivers/base: Make device_find_child_by_name() compatible with sysfs inputs
      device-dax: Add resize support
      mm/memremap_pages: Convert to 'struct range'
      mm/memremap_pages: Support multiple ranges per invocation
      device-dax: Add dis-contiguous resource support
      device-dax: Introduce 'mapping' devices


 arch/powerpc/kvm/book3s_hv_uvmem.c     |   14 -
 drivers/base/core.c                    |    2 
 drivers/dax/bus.c                      |  877 ++++++++++++++++++++++++++++++--
 drivers/dax/bus.h                      |   28 +
 drivers/dax/dax-private.h              |   36 +
 drivers/dax/device.c                   |   97 ++--
 drivers/dax/hmem/hmem.c                |   18 -
 drivers/dax/kmem.c                     |  170 +++---
 drivers/dax/pmem/compat.c              |    2 
 drivers/dax/pmem/core.c                |   22 +
 drivers/gpu/drm/nouveau/nouveau_dmem.c |    4 
 drivers/nvdimm/badrange.c              |   26 -
 drivers/nvdimm/claim.c                 |   13 
 drivers/nvdimm/nd.h                    |    3 
 drivers/nvdimm/pfn_devs.c              |   13 
 drivers/nvdimm/pmem.c                  |   27 +
 drivers/nvdimm/region.c                |   21 -
 drivers/pci/p2pdma.c                   |   12 
 include/linux/memremap.h               |    9 
 include/linux/range.h                  |    6 
 mm/memremap.c                          |  297 ++++++-----
 tools/testing/nvdimm/dax-dev.c         |   22 +
 tools/testing/nvdimm/test/iomap.c      |    2 
 23 files changed, 1318 insertions(+), 403 deletions(-)
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

WARNING: multiple messages have this Message-ID (diff)
From: Dan Williams <dan.j.williams@intel.com>
To: linux-mm@kvack.org
Cc: David Hildenbrand <david@redhat.com>,
	Jason Gunthorpe <jgg@ziepe.ca>, Ira Weiny <ira.weiny@intel.com>,
	Bjorn Helgaas <bhelgaas@google.com>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Vishal Verma <vishal.l.verma@intel.com>,
	Ben Skeggs <bskeggs@redhat.com>,
	Paul Mackerras <paulus@ozlabs.org>,
	Logan Gunthorpe <logang@deltatee.com>,
	Christoph Hellwig <hch@lst.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Pavel Tatashin <pasha.tatashin@soleen.com>,
	Joao Martins <joao.m.martins@oracle.com>,
	linux-nvdimm@lists.01.org, linux-kernel@vger.kernel.org,
	jmoyer@redhat.com
Subject: [PATCH 00/12] device-dax: Support sub-dividing soft-reserved ranges
Date: Mon, 23 Mar 2020 16:54:31 -0700	[thread overview]
Message-ID: <158500767138.2088294.17131646259803932461.stgit@dwillia2-desk3.amr.corp.intel.com> (raw)

The device-dax facility allows an address range to be directly mapped
through a chardev, or turned around and hotplugged to the core kernel
page allocator as System-RAM. It is the baseline mechanism for
converting persistent memory (pmem) to be used as another volatile
memory pool i.e. the current Memory Tiering hot topic on linux-mm.

In the case of pmem the nvdimm-namespace-label mechanism can sub-divide
it, but that labeling mechanism is not available / applicable to
soft-reserved ("EFI specific purpose") memory [1]. This series provides
a sysfs-mechanism for the daxctl utility to enable provisioning of
volatile-soft-reserved memory ranges.

The motivations for this facility are:

1/ Allow performance differentiated memory ranges to be split between
   kernel-managed and directly-accessed use cases.

2/ Allow physical memory to be provisioned along performance relevant
   address boundaries. For example, divide a memory-side cache [2] along
   cache-color boundaries.

3/ Parcel out soft-reserved memory to VMs using device-dax as a security
   / permissions boundary [3]. Specifically I have seen people (ab)using
   memmap=nn!ss (mark System-RAM as Peristent Memory) just to get the
   device-dax interface on custom address ranges.

The baseline for this series is today's next/master + "[PATCH v2 0/6]
Manual definition of Soft Reserved memory devices" [4].

Big thanks to Joao for the early testing and feedback on this series!

Given the dependencies on the memremap_pages() reworks in Andrew's tree
and the proximity to v5.7 this is clearly v5.8 material. The patches in
most need of a second opinion are the memremap_pages() reworks to switch
from 'struct resource' to 'struct range' and allow for an array of
ranges to be mapped at once.

[1]: https://lore.kernel.org/r/157309097008.1579826.12818463304589384434.stgit@dwillia2-desk3.amr.corp.intel.com/
[2]: https://lore.kernel.org/r/154899811738.3165233.12325692939590944259.stgit@dwillia2-desk3.amr.corp.intel.com/
[3]: https://lore.kernel.org/r/20200110190313.17144-1-joao.m.martins@oracle.com/
[4]: http://lore.kernel.org/r/158489354353.1457606.8327903161927980740.stgit@dwillia2-desk3.amr.corp.intel.com/

---

Dan Williams (12):
      device-dax: Drop the dax_region.pfn_flags attribute
      device-dax: Move instance creation parameters to 'struct dev_dax_data'
      device-dax: Make pgmap optional for instance creation
      device-dax: Kill dax_kmem_res
      device-dax: Add an allocation interface for device-dax instances
      device-dax: Introduce seed devices
      drivers/base: Make device_find_child_by_name() compatible with sysfs inputs
      device-dax: Add resize support
      mm/memremap_pages: Convert to 'struct range'
      mm/memremap_pages: Support multiple ranges per invocation
      device-dax: Add dis-contiguous resource support
      device-dax: Introduce 'mapping' devices


 arch/powerpc/kvm/book3s_hv_uvmem.c     |   14 -
 drivers/base/core.c                    |    2 
 drivers/dax/bus.c                      |  877 ++++++++++++++++++++++++++++++--
 drivers/dax/bus.h                      |   28 +
 drivers/dax/dax-private.h              |   36 +
 drivers/dax/device.c                   |   97 ++--
 drivers/dax/hmem/hmem.c                |   18 -
 drivers/dax/kmem.c                     |  170 +++---
 drivers/dax/pmem/compat.c              |    2 
 drivers/dax/pmem/core.c                |   22 +
 drivers/gpu/drm/nouveau/nouveau_dmem.c |    4 
 drivers/nvdimm/badrange.c              |   26 -
 drivers/nvdimm/claim.c                 |   13 
 drivers/nvdimm/nd.h                    |    3 
 drivers/nvdimm/pfn_devs.c              |   13 
 drivers/nvdimm/pmem.c                  |   27 +
 drivers/nvdimm/region.c                |   21 -
 drivers/pci/p2pdma.c                   |   12 
 include/linux/memremap.h               |    9 
 include/linux/range.h                  |    6 
 mm/memremap.c                          |  297 ++++++-----
 tools/testing/nvdimm/dax-dev.c         |   22 +
 tools/testing/nvdimm/test/iomap.c      |    2 
 23 files changed, 1318 insertions(+), 403 deletions(-)

             reply	other threads:[~2020-03-24  0:10 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-23 23:54 Dan Williams [this message]
2020-03-23 23:54 ` [PATCH 00/12] device-dax: Support sub-dividing soft-reserved ranges Dan Williams
2020-03-23 23:54 ` [PATCH 01/12] device-dax: Drop the dax_region.pfn_flags attribute Dan Williams
2020-03-23 23:54   ` Dan Williams
2020-03-23 23:54 ` [PATCH 02/12] device-dax: Move instance creation parameters to 'struct dev_dax_data' Dan Williams
2020-03-23 23:54   ` Dan Williams
2020-03-23 23:54 ` [PATCH 03/12] device-dax: Make pgmap optional for instance creation Dan Williams
2020-03-23 23:54   ` Dan Williams
2020-03-23 23:54 ` [PATCH 04/12] device-dax: Kill dax_kmem_res Dan Williams
2020-03-23 23:54   ` Dan Williams
2020-03-23 23:55 ` [PATCH 05/12] device-dax: Add an allocation interface for device-dax instances Dan Williams
2020-03-23 23:55   ` Dan Williams
2020-03-23 23:55 ` [PATCH 06/12] device-dax: Introduce seed devices Dan Williams
2020-03-23 23:55   ` Dan Williams
2020-03-23 23:55 ` [PATCH 07/12] drivers/base: Make device_find_child_by_name() compatible with sysfs inputs Dan Williams
2020-03-23 23:55   ` Dan Williams
2020-03-23 23:55 ` [PATCH 08/12] device-dax: Add resize support Dan Williams
2020-03-23 23:55   ` Dan Williams
2020-03-23 23:55 ` [PATCH 09/12] mm/memremap_pages: Convert to 'struct range' Dan Williams
2020-03-23 23:55   ` Dan Williams
2020-03-23 23:55 ` [PATCH 10/12] mm/memremap_pages: Support multiple ranges per invocation Dan Williams
2020-03-23 23:55   ` Dan Williams
2020-03-23 23:55 ` [PATCH 11/12] device-dax: Add dis-contiguous resource support Dan Williams
2020-03-23 23:55   ` Dan Williams
2020-03-24 16:12   ` Joao Martins
2020-03-24 16:12     ` Joao Martins
2020-03-25 10:35     ` Joao Martins
2020-03-25 10:35       ` Joao Martins
2020-03-25 17:48       ` Dan Williams
2020-03-25 17:48         ` Dan Williams
2020-03-25 17:48         ` Dan Williams
2020-03-26 17:49         ` Joao Martins
2020-03-26 17:49           ` Joao Martins
2020-07-11  0:44     ` Dan Williams
2020-07-11  0:44       ` Dan Williams
2020-07-11  0:44       ` Dan Williams
2020-04-06 10:43   ` Joao Martins
2020-04-06 10:43     ` Joao Martins
2020-04-06 20:22     ` Dan Williams
2020-04-06 20:22       ` Dan Williams
2020-04-06 20:22       ` Dan Williams
2020-07-11  0:47       ` Dan Williams
2020-07-11  0:47         ` Dan Williams
2020-07-11  0:47         ` Dan Williams
2020-05-12 14:36   ` Joao Martins
2020-05-12 14:36     ` Joao Martins
2020-07-11  0:52     ` Dan Williams
2020-07-11  0:52       ` Dan Williams
2020-07-11  0:52       ` Dan Williams
2020-03-23 23:55 ` [PATCH 12/12] device-dax: Introduce 'mapping' devices Dan Williams
2020-03-23 23:55   ` Dan Williams
2020-03-24 16:27   ` Joao Martins
2020-03-24 16:27     ` Joao Martins
2020-03-24 23:51     ` Dan Williams
2020-03-24 23:51       ` Dan Williams
2020-03-24 23:51       ` Dan Williams

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=158500767138.2088294.17131646259803932461.stgit@dwillia2-desk3.amr.corp.intel.com \
    --to=dan.j.williams@intel.com \
    --cc=bhelgaas@google.com \
    --cc=bskeggs@redhat.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=david@redhat.com \
    --cc=hch@lst.de \
    --cc=jgg@ziepe.ca \
    --cc=joao.m.martins@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-nvdimm@lists.01.org \
    --cc=mpe@ellerman.id.au \
    --cc=pasha.tatashin@soleen.com \
    --cc=paulus@ozlabs.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.