From: Dan Williams <dan.j.williams@intel.com>
To: linux-acpi@vger.kernel.org
Cc: Jason Gunthorpe <jgg@ziepe.ca>,
	"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ard Biesheuvel <ardb@kernel.org>,
	Jonathan Cameron <Jonathan.Cameron@huawei.com>,
	Borislav Petkov <bp@alien8.de>,
	x86@kernel.org,
	"H. Peter Anvin" <hpa@zytor.com>,
	Brice Goglin <Brice.Goglin@inria.fr>,
	Thomas Gleixner <tglx@linutronix.de>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Ingo Molnar <mingo@redhat.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Will Deacon <will@kernel.org>,
	"Rafael J. Wysocki" <rjw@rjwysocki.net>,
	Ard Biesheuvel <ard.biesheuvel@linaro.org>,
	Andy Lutomirski <luto@kernel.org>,
	Tom Lendacky <thomas.lendacky@amd.com>,
	linux-nvdimm@lists.01.org,
	linux-kernel@vger.kernel.org,
	joao.m.martins@oracle.com
Subject: [PATCH v2 0/6] Manual definition of Soft Reserved memory devices
Date: Sun, 22 Mar 2020 09:12:23 -0700
Message-ID: <158489354353.1457606.8327903161927980740.stgit@dwillia2-desk3.amr.corp.intel.com>

Changes since v1 [1]:
- Kill the ifdef'ery in arch/x86/mm/numa.c (Rafael)

- Add a dummy phys_to_target_node() for ARM64 (0day-robot)

- Initialize ->child and ->sibling to NULL in the resource returned by
  find_next_iomem_res() (Inspired by Tom's feedback even though it does
  not set them like he suggested)

- Collect Ard's Ack

[1]: http://lore.kernel.org/r/158318759687.2216124.4684754859068906007.stgit@dwillia2-desk3.amr.corp.intel.com

---

My primary motivation is making the dax_kmem facility useful to
shipping platforms that have performance differentiated memory, but
may not have EFI-defined soft-reservations / HMAT (or
non-EFI-ACPI-platform equivalent). I'm anticipating HMAT enabled
platforms where the platform firmware policy for what is
soft-reserved, or not, is not the policy the system owner would pick.
I'd also highlight Joao's work [2] (see the TODO section) as an
indication of the demand for custom carving memory resources and
applying the device-dax memory management interface.

Given the current dearth of systems that supply an ACPI HMAT table, and
the utility of being able to manually define device-dax "hmem" instances
via the efi_fake_mem= option, relax the requirements for creating these
devices. Specifically, add an option (numa=nohmat) to optionally disable
consideration of the HMAT and update efi_fake_mem= to behave like
memmap=nn!ss in terms of delimiting device boundaries.

[2]: https://lore.kernel.org/lkml/20200110190313.17144-1-joao.m.martins@oracle.com/

With Ard's and Rafael's ack I'd feel ok taking this through the nvdimm
tree, please holler if anything still needs some fixups.

Dependencies:

b2ca916ce392 ACPI: NUMA: Up-level "map to online node" functionality
4fcbe96e4d0b mm/numa: Skip NUMA_NO_NODE and online nodes in numa_map_to_online_node()
575e23b6e13c powerpc/papr_scm: Switch to numa_map_to_online_node()
1e5d8e1e47af x86/mm: Introduce CONFIG_NUMA_KEEP_MEMINFO
5d30f92e7631 x86/NUMA: Provide a range-to-target_node lookup facility
7b27a8622f80 libnvdimm/e820: Retrieve and populate correct 'target_node' info

Tested with:

        numa=nohmat efi_fake_mem=4G@9G:0x40000,4G@13G:0x40000

...to create two device-dax instances:

        # daxctl list -RDu
        [
          {
            "path":"\/platform\/hmem.1",
            "id":1,
            "size":"4.00 GiB (4.29 GB)",
            "align":2097152,
            "devices":[
              {
                "chardev":"dax1.0",
                "size":"4.00 GiB (4.29 GB)",
                "target_node":3,
                "mode":"devdax"
              }
            ]
          },
          {
            "path":"\/platform\/hmem.0",
            "id":0,
            "size":"4.00 GiB (4.29 GB)",
            "align":2097152,
            "devices":[
              {
                "chardev":"dax0.0",
                "size":"4.00 GiB (4.29 GB)",
                "target_node":2,
                "mode":"devdax"
              }
            ]
          }
        ]

---

Dan Williams (6):
      x86/numa: Cleanup configuration dependent command-line options
      x86/numa: Add 'nohmat' option
      efi/fake_mem: Arrange for a resource entry per efi_fake_mem instance
      ACPI: HMAT: Refactor hmat_register_target_device to hmem_register_device
      resource: Report parent to walk_iomem_res_desc() callback
      ACPI: HMAT: Attach a device for each soft-reserved range


 arch/arm64/mm/numa.c                |   13 +++++
 arch/x86/include/asm/numa.h         |    8 +++
 arch/x86/kernel/e820.c              |   16 +++++-
 arch/x86/mm/numa.c                  |   10 +---
 arch/x86/mm/numa_emulation.c        |    3 +
 arch/x86/xen/enlighten_pv.c         |    2 -
 drivers/acpi/numa/hmat.c            |   76 +++++----------------------
 drivers/acpi/numa/srat.c            |    9 +++
 drivers/dax/Kconfig                 |    5 ++
 drivers/dax/Makefile                |    3 -
 drivers/dax/hmem/Makefile           |    6 ++
 drivers/dax/hmem/device.c           |   97 +++++++++++++++++++++++++++++++++++
 drivers/dax/hmem/hmem.c             |    2 -
 drivers/firmware/efi/x86_fake_mem.c |   12 +++-
 include/acpi/acpi_numa.h            |   14 +++++
 include/linux/dax.h                 |    8 +++
 kernel/resource.c                   |   11 +++-
 17 files changed, 209 insertions(+), 86 deletions(-)
 create mode 100644 drivers/dax/hmem/Makefile
 create mode 100644 drivers/dax/hmem/device.c
 rename drivers/dax/{hmem.c => hmem/hmem.c} (98%)

base-commit: 7b27a8622f802761d5c6abd6c37b22312a35343c
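
As an illustration of the "Tested with" line above, here is a minimal Python sketch (not kernel code, and not part of the series) that decodes an efi_fake_mem= value of the form nn[KMG]@ss[KMG]:aa into address ranges. The helper name parse_efi_fake_mem and the tables it uses are hypothetical. The sketch assumes decimal sizes and offsets with an optional K/M/G suffix and a hex attribute, which is narrower than what the kernel's own parser accepts. 0x40000 is the EFI_MEMORY_SP ("specific purpose") attribute bit.

```python
import re

# EFI_MEMORY_SP attribute bit; this is the 0x40000 passed on the command line.
EFI_MEMORY_SP = 0x40000

# Hypothetical helper tables for this sketch (not kernel identifiers).
_UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}
_ENTRY = re.compile(r"^(\d+)([KMG]?)@(\d+)([KMG]?):(0[xX][0-9a-fA-F]+)$")


def parse_efi_fake_mem(value):
    """Return a list of (start, end_inclusive, attribute) tuples.

    Simplified: only decimal numbers with an optional K/M/G suffix are
    accepted for size and start, and the attribute must be hex.
    """
    ranges = []
    for entry in value.split(","):
        m = _ENTRY.match(entry.strip())
        if m is None:
            raise ValueError(f"malformed efi_fake_mem entry: {entry!r}")
        size = int(m.group(1)) * _UNITS[m.group(2)]
        start = int(m.group(3)) * _UNITS[m.group(4)]
        attr = int(m.group(5), 16)
        ranges.append((start, start + size - 1, attr))
    return ranges


ranges = parse_efi_fake_mem("4G@9G:0x40000,4G@13G:0x40000")
for start, end, attr in ranges:
    kind = "soft-reserved" if attr & EFI_MEMORY_SP else "other"
    print(f"{start:#x}-{end:#x} {kind}")
```

Note that the two tested ranges are physically adjacent (the first ends exactly where the second begins). That is presumably the case where delimiting device boundaries per efi_fake_mem= instance matters, since the test reports two separate hmem devices rather than one merged 8G range.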