Linux-ACPI Archive on lore.kernel.org
* [PATCH v3 00/23] device-dax: Support sub-dividing soft-reserved ranges
@ 2020-08-01  3:24 Dan Williams
From: Dan Williams @ 2020-08-01  3:24 UTC
  To: akpm
  Cc: David Hildenbrand, Ira Weiny, Ard Biesheuvel, Mike Rapoport,
	Borislav Petkov, Vishal Verma, David Airlie, Will Deacon,
	Catalin Marinas, Ard Biesheuvel, Joao Martins, Tom Lendacky,
	Dave Jiang, Rafael J. Wysocki, Jonathan Cameron, Wei Yang, x86,
	H. Peter Anvin, Thomas Gleixner, Greg Kroah-Hartman,
	Pavel Tatashin, Peter Zijlstra, Ben Skeggs,
	Benjamin Herrenschmidt, Jason Gunthorpe, Jia He, Ingo Molnar,
	Dave Hansen, Paul Mackerras, Brice Goglin, Jeff Moyer,
	Michael Ellerman, Rafael J. Wysocki, Daniel Vetter,
	Andy Lutomirski, Rafael J. Wysocki, vishal.l.verma, linux-mm,
	linux-nvdimm, joao.m.martins, linux-kernel, linux-acpi,
	dri-devel

Changes since v2 [1]:
- Rebase on next/master to resolve conflicts with pending mem-hotplug
  and memremap_pages() changes in -mm

- Drop attempt at a generic phys_to_target_node() implementation and
  just follow the default fallback approach taken with
  memory_add_physaddr_to_nid() (Mike)

- Fix test_hmm and other compilation fixups (Ralph)

- Integrate Joao's extensions to the device-dax sub-division interface
  (per-device align, user-directed extent allocation). (Joao)

[1]: http://lore.kernel.org/r/159457116473.754248.7879464730875147365.stgit@dwillia2-desk3.amr.corp.intel.com

---
Merge notes:

Andrew, this series is rebased on today's next/master to resolve
conflicts with some pending patches in -mm. I'd like to take it through
your tree given the intersections with memremap_pages() and memory
hotplug. If at all possible I'd like to see it in v5.10, but I realize
time is short. Outside of the Intel-identified use cases for this, Joao
has identified a use case for Oracle as well.

I would have sent this earlier save for the fact I am mostly offline
tending to a newborn these days. Vishal has stepped up to take on care
and feeding of this patchset if additional review / integration fixups
are needed.

The one piece of test feedback this wants is from Justin
(justin.he@arm.com): does this light up dax_kmem, and now dax_hmem, for
him on arm64?
Otherwise, Joao has written unit tests for this in his enabling of the
daxctl userspace utility [2].

---
Cover:

The device-dax facility allows an address range to be directly mapped
through a chardev, or optionally hotplugged to the core kernel page
allocator as System-RAM. It is the mechanism for converting persistent
memory (pmem) into another volatile memory pool, i.e. the current
Memory Tiering hot topic on linux-mm.

In the case of pmem, the nvdimm-namespace-label mechanism can sub-divide
it, but that labeling mechanism is not available / applicable to
soft-reserved ("EFI specific purpose") memory [3]. This series provides
a sysfs mechanism that lets the daxctl utility provision volatile
soft-reserved memory ranges.
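
As a rough illustration of what "sub-dividing" a range means, the sketch
below carves aligned, non-overlapping sub-ranges out of one parent range
using a first-fit search. It is a toy userspace model and not the kernel
implementation: the Range class, the carve() helper, the 2 MiB alignment
and the example addresses are all hypothetical.

```python
from dataclasses import dataclass

MiB = 1 << 20
GiB = 1 << 30


@dataclass(frozen=True)
class Range:
    """A physical address range; both start and end are inclusive."""
    start: int
    end: int

    @property
    def size(self) -> int:
        return self.end - self.start + 1


def align_up(value: int, align: int) -> int:
    """Round value up to the next multiple of align."""
    return (value + align - 1) // align * align


def carve(parent: Range, allocated: list, size: int, align: int):
    """First-fit: return an aligned sub-range of `size` bytes inside
    `parent` that overlaps nothing in `allocated`, or None if no gap fits.
    (Hypothetical helper for illustration only.)"""
    if size <= 0 or size % align:
        raise ValueError("size must be a positive multiple of align")
    cursor = align_up(parent.start, align)
    for used in sorted(allocated, key=lambda r: r.start):
        if cursor + size - 1 < used.start:
            break  # the gap in front of this allocation is big enough
        cursor = max(cursor, align_up(used.end + 1, align))
    if cursor + size - 1 > parent.end:
        return None
    return Range(cursor, cursor + size - 1)


# Example: a made-up 4 GiB soft-reserved range starting at the 4 GiB mark.
region = Range(4 * GiB, 8 * GiB - 1)
first = carve(region, [], 1 * GiB, 2 * MiB)
second = carve(region, [first], 2 * GiB, 2 * MiB)
print(hex(first.start), hex(first.end))
print(hex(second.start), hex(second.end))
```

A request that no longer fits returns None instead of raising, which
loosely mirrors an allocation attempt failing once the region has run out
of available space.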

The motivations for this facility are:

1/ Allow performance-differentiated memory ranges to be split between
   kernel-managed and directly-accessed use cases.

2/ Allow physical memory to be provisioned along performance-relevant
   address boundaries. For example, divide a memory-side cache [4] along
   cache-color boundaries.

3/ Parcel out soft-reserved memory to VMs using device-dax as a security
   / permissions boundary [5]. Specifically I have seen people (ab)using
   memmap=nn!ss (mark System-RAM as Persistent Memory) just to get the
   device-dax interface on custom address ranges. A follow-on for the VM
   use case is to teach device-dax to dynamically allocate 'struct page' at
   runtime to reduce the duplication of 'struct page' space in both the
   guest and the host kernel for the same physical pages.
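
To put a rough number on the 'struct page' duplication mentioned in 3/,
the sketch below estimates the memmap cost of a range. It assumes a
64-byte 'struct page' and 4 KiB base pages, which is typical for x86_64
but is an assumption here and not something stated above; memmap_bytes()
is a hypothetical helper.

```python
PAGE_SIZE = 4096        # assumed base page size in bytes
STRUCT_PAGE_SIZE = 64   # assumed sizeof(struct page) in bytes

GiB = 1 << 30
TiB = 1 << 40


def memmap_bytes(range_bytes: int,
                 page_size: int = PAGE_SIZE,
                 struct_page_size: int = STRUCT_PAGE_SIZE) -> int:
    """Bytes of 'struct page' metadata needed to describe range_bytes."""
    pages = -(-range_bytes // page_size)  # ceiling division
    return pages * struct_page_size


# Metadata for a 1 TiB range, paid once per kernel that maps it.
per_kernel = memmap_bytes(1 * TiB)
# If both the host and the guest describe the same pages, the cost doubles.
host_plus_guest = 2 * per_kernel
print(per_kernel // GiB, host_plus_guest // GiB)
```

Under these assumptions the metadata is about 1.6% of the range per
kernel, so avoiding one of the two copies is where the saving in the
follow-on idea would come from.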

[2]: http://lore.kernel.org/r/20200713160837.13774-11-joao.m.martins@oracle.com
[3]: http://lore.kernel.org/r/157309097008.1579826.12818463304589384434.stgit@dwillia2-desk3.amr.corp.intel.com
[4]: http://lore.kernel.org/r/154899811738.3165233.12325692939590944259.stgit@dwillia2-desk3.amr.corp.intel.com
[5]: http://lore.kernel.org/r/20200110190313.17144-1-joao.m.martins@oracle.com

---

Dan Williams (19):
      x86/numa: Cleanup configuration dependent command-line options
      x86/numa: Add 'nohmat' option
      efi/fake_mem: Arrange for a resource entry per efi_fake_mem instance
      ACPI: HMAT: Refactor hmat_register_target_device to hmem_register_device
      resource: Report parent to walk_iomem_res_desc() callback
      mm/memory_hotplug: Introduce default phys_to_target_node() implementation
      ACPI: HMAT: Attach a device for each soft-reserved range
      device-dax: Drop the dax_region.pfn_flags attribute
      device-dax: Move instance creation parameters to 'struct dev_dax_data'
      device-dax: Make pgmap optional for instance creation
      device-dax: Kill dax_kmem_res
      device-dax: Add an allocation interface for device-dax instances
      device-dax: Introduce 'seed' devices
      drivers/base: Make device_find_child_by_name() compatible with sysfs inputs
      device-dax: Add resize support
      mm/memremap_pages: Convert to 'struct range'
      mm/memremap_pages: Support multiple ranges per invocation
      device-dax: Add dis-contiguous resource support
      device-dax: Introduce 'mapping' devices

Joao Martins (4):
      device-dax: Make align a per-device property
      device-dax: Add an 'align' attribute
      dax/hmem: Introduce dax_hmem.region_idle parameter
      device-dax: Add a range mapping allocation attribute


 arch/powerpc/kvm/book3s_hv_uvmem.c     |   14 
 arch/x86/include/asm/numa.h            |    8 
 arch/x86/kernel/e820.c                 |   16 
 arch/x86/mm/numa.c                     |   11 
 arch/x86/mm/numa_emulation.c           |    3 
 arch/x86/xen/enlighten_pv.c            |    2 
 drivers/acpi/numa/hmat.c               |   76 --
 drivers/acpi/numa/srat.c               |    9 
 drivers/base/core.c                    |    2 
 drivers/dax/Kconfig                    |    4 
 drivers/dax/Makefile                   |    3 
 drivers/dax/bus.c                      | 1055 ++++++++++++++++++++++++++++++--
 drivers/dax/bus.h                      |   28 +
 drivers/dax/dax-private.h              |   40 +
 drivers/dax/device.c                   |  132 ++--
 drivers/dax/hmem.c                     |   56 --
 drivers/dax/hmem/Makefile              |    6 
 drivers/dax/hmem/device.c              |  100 +++
 drivers/dax/hmem/hmem.c                |   65 ++
 drivers/dax/kmem.c                     |  199 +++---
 drivers/dax/pmem/compat.c              |    2 
 drivers/dax/pmem/core.c                |   22 -
 drivers/firmware/efi/x86_fake_mem.c    |   12 
 drivers/gpu/drm/nouveau/nouveau_dmem.c |   15 
 drivers/nvdimm/badrange.c              |   26 -
 drivers/nvdimm/claim.c                 |   13 
 drivers/nvdimm/nd.h                    |    3 
 drivers/nvdimm/pfn_devs.c              |   13 
 drivers/nvdimm/pmem.c                  |   27 -
 drivers/nvdimm/region.c                |   21 -
 drivers/pci/p2pdma.c                   |   12 
 include/acpi/acpi_numa.h               |   14 
 include/linux/dax.h                    |    8 
 include/linux/memory_hotplug.h         |    5 
 include/linux/memremap.h               |   11 
 include/linux/range.h                  |    6 
 kernel/resource.c                      |   11 
 lib/test_hmm.c                         |   15 
 mm/memory_hotplug.c                    |   10 
 mm/memremap.c                          |  299 +++++----
 tools/testing/nvdimm/dax-dev.c         |   22 -
 tools/testing/nvdimm/test/iomap.c      |    2 
 42 files changed, 1810 insertions(+), 588 deletions(-)
 delete mode 100644 drivers/dax/hmem.c
 create mode 100644 drivers/dax/hmem/Makefile
 create mode 100644 drivers/dax/hmem/device.c
 create mode 100644 drivers/dax/hmem/hmem.c

base-commit: 01830e6c042e8eb6eb202e05d7df8057135b4c26

Thread overview: 30+ messages
2020-08-01  3:24 [PATCH v3 00/23] device-dax: Support sub-dividing soft-reserved ranges Dan Williams
2020-08-01  3:25 ` [PATCH v3 01/23] x86/numa: Cleanup configuration dependent command-line options Dan Williams
2020-08-01  3:25 ` [PATCH v3 02/23] x86/numa: Add 'nohmat' option Dan Williams
2020-08-01  3:51   ` Randy Dunlap
2020-08-01 16:36     ` Dan Williams
2020-08-01  3:25 ` [PATCH v3 03/23] efi/fake_mem: Arrange for a resource entry per efi_fake_mem instance Dan Williams
2020-08-01  3:25 ` [PATCH v3 04/23] ACPI: HMAT: Refactor hmat_register_target_device to hmem_register_device Dan Williams
2020-08-01  3:25 ` [PATCH v3 05/23] resource: Report parent to walk_iomem_res_desc() callback Dan Williams
2020-08-01  3:25 ` [PATCH v3 06/23] mm/memory_hotplug: Introduce default phys_to_target_node() implementation Dan Williams
2020-08-01  3:25 ` [PATCH v3 07/23] ACPI: HMAT: Attach a device for each soft-reserved range Dan Williams
2020-08-01  3:25 ` [PATCH v3 08/23] device-dax: Drop the dax_region.pfn_flags attribute Dan Williams
2020-08-01  3:25 ` [PATCH v3 09/23] device-dax: Move instance creation parameters to 'struct dev_dax_data' Dan Williams
2020-08-01  3:25 ` [PATCH v3 10/23] device-dax: Make pgmap optional for instance creation Dan Williams
2020-08-01  3:26 ` [PATCH v3 11/23] device-dax: Kill dax_kmem_res Dan Williams
2020-08-01  3:26 ` [PATCH v3 12/23] device-dax: Add an allocation interface for device-dax instances Dan Williams
2020-08-01  3:26 ` [PATCH v3 13/23] device-dax: Introduce 'seed' devices Dan Williams
2020-08-01  3:26 ` [PATCH v3 14/23] drivers/base: Make device_find_child_by_name() compatible with sysfs inputs Dan Williams
2020-08-01  3:26 ` [PATCH v3 15/23] device-dax: Add resize support Dan Williams
2020-08-01  3:26 ` [PATCH v3 16/23] mm/memremap_pages: Convert to 'struct range' Dan Williams
2020-08-01  3:26 ` [PATCH v3 17/23] mm/memremap_pages: Support multiple ranges per invocation Dan Williams
2020-08-01  3:26 ` [PATCH v3 18/23] device-dax: Add dis-contiguous resource support Dan Williams
2020-08-01  3:26 ` [PATCH v3 19/23] device-dax: Introduce 'mapping' devices Dan Williams
2020-08-01  3:26 ` [PATCH v3 20/23] device-dax: Make align a per-device property Dan Williams
2020-08-01  7:23   ` kernel test robot
2020-08-01  3:26 ` [PATCH v3 21/23] device-dax: Add an 'align' attribute Dan Williams
2020-08-01  6:14   ` kernel test robot
2020-08-01  6:18   ` kernel test robot
2020-08-01  3:27 ` [PATCH v3 22/23] dax/hmem: Introduce dax_hmem.region_idle parameter Dan Williams
2020-08-01  3:27 ` [PATCH v3 23/23] device-dax: Add a range mapping allocation attribute Dan Williams
2020-08-04 17:02 ` [PATCH v3 00/23] device-dax: Support sub-dividing soft-reserved ranges Jason Gunthorpe
