From: Dan Williams <dan.j.williams@intel.com>
To: linux-acpi@vger.kernel.org
Cc: Jason Gunthorpe <jgg@ziepe.ca>,
    "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
    Peter Zijlstra <peterz@infradead.org>,
    Ard Biesheuvel <ardb@kernel.org>,
    Jonathan Cameron <Jonathan.Cameron@huawei.com>,
    Borislav Petkov <bp@alien8.de>, x86@kernel.org,
    "H. Peter Anvin" <hpa@zytor.com>,
    Brice Goglin <Brice.Goglin@inria.fr>,
    Thomas Gleixner <tglx@linutronix.de>,
    Catalin Marinas <catalin.marinas@arm.com>,
    Ingo Molnar <mingo@redhat.com>,
    Dave Hansen <dave.hansen@linux.intel.com>,
    Will Deacon <will@kernel.org>,
    "Rafael J. Wysocki" <rjw@rjwysocki.net>,
    Ard Biesheuvel <ard.biesheuvel@linaro.org>,
    Andy Lutomirski <luto@kernel.org>,
    Tom Lendacky <thomas.lendacky@amd.com>,
    linux-nvdimm@lists.01.org, linux-kernel@vger.kernel.org,
    joao.m.martins@oracle.com
Subject: [PATCH v2 0/6] Manual definition of Soft Reserved memory devices
Date: Sun, 22 Mar 2020 09:12:23 -0700
Message-ID: <158489354353.1457606.8327903161927980740.stgit@dwillia2-desk3.amr.corp.intel.com>

Changes since v1 [1]:
- Kill the ifdef'ery in arch/x86/mm/numa.c (Rafael)
- Add a dummy phys_to_target_node() for ARM64 (0day-robot)
- Initialize ->child and ->sibling to NULL in the resource returned by
  find_next_iomem_res() (Inspired by Tom's feedback even though it does
  not set them like he suggested)
- Collect Ard's Ack

[1]: http://lore.kernel.org/r/158318759687.2216124.4684754859068906007.stgit@dwillia2-desk3.amr.corp.intel.com

---

My primary motivation is making the dax_kmem facility useful to
shipping platforms that have performance differentiated memory, but
may not have EFI-defined soft-reservations / HMAT (or
non-EFI-ACPI-platform equivalent). I'm anticipating HMAT enabled
platforms where the platform firmware policy for what is
soft-reserved, or not, is not the policy the system owner would pick.
I'd also highlight Joao's work [2] (see the TODO section) as an
indication of the demand for custom carving of memory resources and
applying the device-dax memory management interface.

Given the current dearth of systems that supply an ACPI HMAT table,
and the utility of being able to manually define device-dax "hmem"
instances via the efi_fake_mem= option, relax the requirements for
creating these devices. Specifically, add an option (numa=nohmat) to
optionally disable consideration of the HMAT and update efi_fake_mem=
to behave like memmap=nn!ss in terms of delimiting device boundaries.

[2]: https://lore.kernel.org/lkml/20200110190313.17144-1-joao.m.martins@oracle.com/

With Ard's and Rafael's ack I'd feel ok taking this through the nvdimm
tree; please holler if anything still needs some fixups.

Dependencies:

b2ca916ce392 ACPI: NUMA: Up-level "map to online node" functionality
4fcbe96e4d0b mm/numa: Skip NUMA_NO_NODE and online nodes in numa_map_to_online_node()
575e23b6e13c powerpc/papr_scm: Switch to numa_map_to_online_node()
1e5d8e1e47af x86/mm: Introduce CONFIG_NUMA_KEEP_MEMINFO
5d30f92e7631 x86/NUMA: Provide a range-to-target_node lookup facility
7b27a8622f80 libnvdimm/e820: Retrieve and populate correct 'target_node' info

Tested with:

        numa=nohmat efi_fake_mem=4G@9G:0x40000,4G@13G:0x40000

...to create two device-dax instances:

        # daxctl list -RDu
        [
          {
            "path":"\/platform\/hmem.1",
            "id":1,
            "size":"4.00 GiB (4.29 GB)",
            "align":2097152,
            "devices":[
              {
                "chardev":"dax1.0",
                "size":"4.00 GiB (4.29 GB)",
                "target_node":3,
                "mode":"devdax"
              }
            ]
          },
          {
            "path":"\/platform\/hmem.0",
            "id":0,
            "size":"4.00 GiB (4.29 GB)",
            "align":2097152,
            "devices":[
              {
                "chardev":"dax0.0",
                "size":"4.00 GiB (4.29 GB)",
                "target_node":2,
                "mode":"devdax"
              }
            ]
          }
        ]

---

Dan Williams (6):
      x86/numa: Cleanup configuration dependent command-line options
      x86/numa: Add 'nohmat' option
      efi/fake_mem: Arrange for a resource entry per efi_fake_mem instance
      ACPI: HMAT: Refactor hmat_register_target_device to hmem_register_device
      resource: Report parent to walk_iomem_res_desc() callback
      ACPI: HMAT: Attach a device for each soft-reserved range

 arch/arm64/mm/numa.c                |   13 +++++
 arch/x86/include/asm/numa.h         |    8 +++
 arch/x86/kernel/e820.c              |   16 +++++-
 arch/x86/mm/numa.c                  |   10 +---
 arch/x86/mm/numa_emulation.c        |    3 +
 arch/x86/xen/enlighten_pv.c         |    2 -
 drivers/acpi/numa/hmat.c            |   76 +++++----------------------
 drivers/acpi/numa/srat.c            |    9 +++
 drivers/dax/Kconfig                 |    5 ++
 drivers/dax/Makefile                |    3 -
 drivers/dax/hmem/Makefile           |    6 ++
 drivers/dax/hmem/device.c           |   97 +++++++++++++++++++++++++++++++++++
 drivers/dax/hmem/hmem.c             |    2 -
 drivers/firmware/efi/x86_fake_mem.c |   12 +++-
 include/acpi/acpi_numa.h            |   14 +++++
 include/linux/dax.h                 |    8 +++
 kernel/resource.c                   |   11 +++-
 17 files changed, 209 insertions(+), 86 deletions(-)
 create mode 100644 drivers/dax/hmem/Makefile
 create mode 100644 drivers/dax/hmem/device.c
 rename drivers/dax/{hmem.c => hmem/hmem.c} (98%)

base-commit: 7b27a8622f802761d5c6abd6c37b22312a35343c
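
Editorial illustration (not part of the posted series): the "Tested
with" command line above defines two 4G ranges that are physically
adjacent, which is why per-instance device boundaries matter. The
sketch below decodes an efi_fake_mem= value into address ranges. It
assumes the nn[KMG]@ss[KMG]:aa syntax for efi_fake_mem= and that
0x40000 is the EFI_MEMORY_SP ("specific purpose") attribute bit; the
helper names parse_size(), parse_efi_fake_mem() and FakeMemRange are
hypothetical and only loosely mimic the kernel's own parsing.

```python
from dataclasses import dataclass

# Assumption: EFI_MEMORY_SP attribute bit, as used in the test command line.
EFI_MEMORY_SP = 0x40000

# Binary size suffixes, loosely modelled on the kernel's memparse().
_SUFFIX_SHIFT = {"K": 10, "M": 20, "G": 30, "T": 40}


def parse_size(text):
    """Parse a size such as '4G' or '0x1000' into a byte count."""
    text = text.strip()
    shift = 0
    if text and text[-1].upper() in _SUFFIX_SHIFT:
        shift = _SUFFIX_SHIFT[text[-1].upper()]
        text = text[:-1]
    return int(text, 0) << shift


@dataclass(frozen=True)
class FakeMemRange:
    """One size@start:attribute entry (hypothetical helper type)."""
    start: int
    size: int
    attribute: int

    @property
    def end(self):
        # Inclusive end address, in the style of /proc/iomem.
        return self.start + self.size - 1

    @property
    def soft_reserved(self):
        return bool(self.attribute & EFI_MEMORY_SP)


def parse_efi_fake_mem(value):
    """Split a comma-separated efi_fake_mem= value into ranges."""
    ranges = []
    for entry in value.split(","):
        extent, attr = entry.split(":")
        size, start = extent.split("@")
        ranges.append(
            FakeMemRange(parse_size(start), parse_size(size), int(attr, 0))
        )
    return ranges


if __name__ == "__main__":
    for r in parse_efi_fake_mem("4G@9G:0x40000,4G@13G:0x40000"):
        print(f"{r.start:#x}-{r.end:#x} soft_reserved={r.soft_reserved}")
```

Running it on the tested value shows that the first range ends exactly
where the second begins, so the two hmem devices in the daxctl output
can only be told apart if each efi_fake_mem instance keeps its own
resource entry.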