From: Dan Williams <dan.j.williams@intel.com>
To: akpm@linux-foundation.org
Cc: "David Hildenbrand" <david@redhat.com>,
	"Ira Weiny" <ira.weiny@intel.com>,
	"Bjorn Helgaas" <bhelgaas@google.com>,
	"Vishal Verma" <vishal.l.verma@intel.com>,
	"Dave Hansen" <dave.hansen@linux.intel.com>,
	"David Airlie" <airlied@linux.ie>,
	"Vivek Goyal" <vgoyal@redhat.com>,
	"Joao Martins" <joao.m.martins@oracle.com>,
	"Dave Jiang" <dave.jiang@intel.com>,
	"Jonathan Cameron" <Jonathan.Cameron@huawei.com>,
	"Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
	"Pavel Tatashin" <pasha.tatashin@soleen.com>,
	"Hulk Robot" <hulkci@huawei.com>,
	"Ben Skeggs" <bskeggs@redhat.com>,
	"Benjamin Herrenschmidt" <benh@kernel.crashing.org>,
	"Jia He" <justin.he@arm.com>,
	"Jérôme Glisse" <jglisse@redhat.com>,
	"Jason Yan" <yanaijie@huawei.com>,
	"Paul Mackerras" <paulus@ozlabs.org>,
	"Boris Ostrovsky" <boris.ostrovsky@oracle.com>,
	"Brice Goglin" <Brice.Goglin@inria.fr>,
	"Stefano Stabellini" <sstabellini@kernel.org>,
	"Michael Ellerman" <mpe@ellerman.id.au>,
	"Dan Carpenter" <dan.carpenter@oracle.com>,
	"Juergen Gross" <jgross@suse.com>,
	"Daniel Vetter" <daniel@ffwll.ch>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	linux-nvdimm@lists.01.org, david@redhat.com,
	joao.m.martins@oracle.com
Subject: [PATCH v6 00/11] device-dax: support sub-dividing soft-reserved ranges
Date: Mon, 05 Oct 2020 23:54:44 -0700
Message-ID: <160196728453.2166475.12832711415715687418.stgit@dwillia2-desk3.amr.corp.intel.com>

Changes since v5 [1]:
- (David) Introduce range_len() to include/linux/range.h immediately in
  "device-dax: make pgmap optional for instance creation" rather than
  wait until "mm/memremap_pages: convert to 'struct range'" to move it.

- (David) David points out that release_mem_region() cannot be used in
  the kmem driver since it depends on the resource range being busy at
  free. The dance the driver does to hand off busy/free management to
  add_memory_driver_managed() breaks request_mem_region()'s assumptions
  and requires the driver to continue to use an open-coded
  release_resource() + kfree() sequence. For the new multi-range case,
  expand the driver-data to hold all the resulting 'struct resource'
  instances from mapping the ranges.

- (Boris) consolidate pgmap manipulation code in the
  xen_alloc_unpopulated_pages() path. Since this touched
  "mm/memremap_pages: convert to 'struct range'" with the pending fix from
  Dan, I folded in that fix and gave him a Reported-by credit.

[1]: http://lore.kernel.org/r/160106109960.30709.7379926726669669398.stgit@dwillia2-desk3.amr.corp.intel.com
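
For readers who have not seen the range_len() helper named in the first
item above, here is a minimal userspace sketch of it. It assumes the
kernel's 'struct range' convention of a {start, end} pair where 'end' is
inclusive. The uint64_t fields stand in for the kernel's u64, and the
assert() is an addition for illustration only, not part of the kernel
helper.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's 'struct range'; 'end' is inclusive. */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Length in bytes of an inclusive range, i.e. end - start + 1. */
static inline uint64_t range_len(const struct range *range)
{
	/* Illustration-only sanity check, assumed not present in the kernel. */
	assert(range->end >= range->start);
	return range->end - range->start + 1;
}
```

Because 'end' is inclusive, a range covering 0x1000 through 0x1fff has a
length of 0x1000 bytes, not 0xfff.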

---

Hi Andrew,

As before, patches that are in your tree and did not change as a result
of these updates are not re-sent. This set replaces:

device-dax-make-pgmap-optional-for-instance-creation.patch

...through...

device-dax-add-dis-contiguous-resource-support.patch

...in your stack.

I let this soak over the weekend in a kbuild-robot visible tree and it
received a build success notification across 160 configs, and no other
regression notices.

---

The device-dax facility allows an address range to be directly mapped
through a chardev, or optionally hotplugged to the core kernel page
allocator as System-RAM. It is the mechanism for converting persistent
memory (pmem) into another volatile memory pool, i.e. the current
Memory Tiering hot topic on linux-mm.

In the case of pmem, the nvdimm-namespace-label mechanism can sub-divide
it, but that labeling mechanism is not available / applicable to
soft-reserved ("EFI specific purpose") memory [2]. This series provides
a sysfs mechanism for the daxctl utility to enable provisioning of
volatile soft-reserved memory ranges.

The motivations for this facility are:

1/ Allow performance differentiated memory ranges to be split between
   kernel-managed and directly-accessed use cases.

2/ Allow physical memory to be provisioned along performance relevant
   address boundaries. For example, divide a memory-side cache [3] along
   cache-color boundaries.

3/ Parcel out soft-reserved memory to VMs using device-dax as a security
   / permissions boundary [4]. Specifically I have seen people (ab)using
   memmap=nn!ss (mark System-RAM as Persistent Memory) just to get the
   device-dax interface on custom address ranges. A follow-on for the VM
   use case is to teach device-dax to dynamically allocate 'struct page' at
   runtime to reduce the duplication of 'struct page' space in both the
   guest and the host kernel for the same physical pages.
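
To make motivation 2/ concrete, here is a rough userspace sketch of
carving a parent range into sub-ranges that each start on a fixed
boundary. It is not code from this series: carve_aligned() is a
hypothetical helper, the power-of-two 'chunk' size is an assumed
stand-in for whatever boundary (e.g. a cache color) matters on a given
platform, and 'struct range' follows the inclusive-end convention.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for an inclusive-end range. */
struct range {
	uint64_t start;
	uint64_t end;
};

/*
 * Hypothetical helper: fill 'out' with up to 'max' sub-ranges of
 * 'parent', each 'chunk' bytes long and starting on a 'chunk' boundary.
 * 'chunk' must be a power of two. Returns the number of sub-ranges.
 */
static size_t carve_aligned(const struct range *parent, uint64_t chunk,
			    struct range *out, size_t max)
{
	uint64_t start;
	size_t n = 0;

	assert(parent && out);
	if (chunk == 0 || (chunk & (chunk - 1)))
		return 0;

	/* Round the parent start up to the next boundary. */
	start = (parent->start + chunk - 1) & ~(chunk - 1);
	if (start < parent->start)
		return 0; /* the round-up wrapped past the top of the address space */

	while (n < max && start <= parent->end &&
	       parent->end - start >= chunk - 1) {
		out[n].start = start;
		out[n].end = start + chunk - 1;
		n++;
		/* Stop when this chunk ends exactly at the parent end. */
		if (parent->end - start == chunk - 1)
			break;
		start += chunk;
	}
	return n;
}
```

Any unaligned head or tail of the parent range is simply left
unallocated in this sketch. What the series itself does with such
remainders is not covered here.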

[2]: http://lore.kernel.org/r/157309097008.1579826.12818463304589384434.stgit@dwillia2-desk3.amr.corp.intel.com
[3]: http://lore.kernel.org/r/154899811738.3165233.12325692939590944259.stgit@dwillia2-desk3.amr.corp.intel.com
[4]: http://lore.kernel.org/r/20200110190313.17144-1-joao.m.martins@oracle.com

---

Dan Williams (11):
      device-dax: make pgmap optional for instance creation
      device-dax/kmem: introduce dax_kmem_range()
      device-dax/kmem: move resource tracking to drvdata
      device-dax: add an allocation interface for device-dax instances
      device-dax: introduce 'struct dev_dax' typed-driver operations
      device-dax: introduce 'seed' devices
      drivers/base: make device_find_child_by_name() compatible with sysfs inputs
      device-dax: add resize support
      mm/memremap_pages: convert to 'struct range'
      mm/memremap_pages: support multiple ranges per invocation
      device-dax: add dis-contiguous resource support


 arch/powerpc/kvm/book3s_hv_uvmem.c     |   14 -
 drivers/base/core.c                    |    2 
 drivers/dax/bus.c                      |  708 ++++++++++++++++++++++++++++++--
 drivers/dax/bus.h                      |   11 
 drivers/dax/dax-private.h              |   23 +
 drivers/dax/device.c                   |   71 ++-
 drivers/dax/hmem/hmem.c                |   14 -
 drivers/dax/kmem.c                     |  198 ++++++---
 drivers/dax/pmem/compat.c              |    2 
 drivers/dax/pmem/core.c                |   14 -
 drivers/gpu/drm/nouveau/nouveau_dmem.c |   15 -
 drivers/nvdimm/badrange.c              |   26 +
 drivers/nvdimm/claim.c                 |   13 -
 drivers/nvdimm/nd.h                    |    3 
 drivers/nvdimm/pfn_devs.c              |   13 -
 drivers/nvdimm/pmem.c                  |   27 +
 drivers/nvdimm/region.c                |   21 +
 drivers/pci/p2pdma.c                   |   12 -
 drivers/xen/unpopulated-alloc.c        |   49 +-
 include/linux/memremap.h               |   11 
 include/linux/range.h                  |    6 
 lib/test_hmm.c                         |   51 +-
 mm/memremap.c                          |  299 ++++++++------
 tools/testing/nvdimm/dax-dev.c         |   22 +
 tools/testing/nvdimm/test/iomap.c      |    2 
 25 files changed, 1216 insertions(+), 411 deletions(-)

base-commit: d524ed85683d657593ac1e58098407bed0601a84


Thread overview: 14+ messages
2020-10-06  6:54 Dan Williams [this message]
2020-10-06  6:54 ` [PATCH v6 01/11] device-dax: make pgmap optional for instance creation Dan Williams
2020-10-06  6:54 ` [PATCH v6 02/11] device-dax/kmem: introduce dax_kmem_range() Dan Williams
2020-10-06  6:55 ` [PATCH v6 03/11] device-dax/kmem: move resource tracking to drvdata Dan Williams
2020-10-06  8:14   ` David Hildenbrand
2020-10-06  6:55 ` [PATCH v6 04/11] device-dax: add an allocation interface for device-dax instances Dan Williams
2020-10-06  6:55 ` [PATCH v6 05/11] device-dax: introduce 'struct dev_dax' typed-driver operations Dan Williams
2020-10-06  6:55 ` [PATCH v6 06/11] device-dax: introduce 'seed' devices Dan Williams
2020-10-06  6:55 ` [PATCH v6 07/11] drivers/base: make device_find_child_by_name() compatible with sysfs inputs Dan Williams
2020-10-06  6:55 ` [PATCH v6 08/11] device-dax: add resize support Dan Williams
2020-10-06  6:55 ` [PATCH v6 09/11] mm/memremap_pages: convert to 'struct range' Dan Williams
2020-10-08 19:52   ` boris.ostrovsky
2020-10-06  6:55 ` [PATCH v6 10/11] mm/memremap_pages: support multiple ranges per invocation Dan Williams
2020-10-06  6:55 ` [PATCH v6 11/11] device-dax: add dis-contiguous resource support Dan Williams
