From: Dan Williams <dan.j.williams@intel.com>
To: linux-nvdimm@lists.01.org
Cc: Dave Hansen <dave.hansen@linux.intel.com>,
	Andy Lutomirski <luto@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	David Hildenbrand <david@redhat.com>,
	Michal Hocko <mhocko@suse.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH 16/16] libnvdimm/e820: Retrieve and populate correct 'target_node' info
Date: Wed, 06 Nov 2019 19:58:03 -0800
Message-ID: <157309908326.1582359.13665017314935413372.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <157309899529.1582359.15358067933360719580.stgit@dwillia2-desk3.amr.corp.intel.com>

Use the new memory_add_physaddr_to_target_node() and
numa_map_to_online_node() helpers to retrieve the correct id for the
'numa_node' (online initiator) and 'target_node' (offline target memory
node) sysfs attributes.

Below is an example from a 4 numa node system where all the memory on
node2 is pmem / reserved. It should be noted that with the arrival of
the ACPI HMAT table and EFI Specific Purpose Memory the kernel will
start to see more platforms with reserved / performance differentiated
memory in its own numa node. Hence all the stakeholders on the Cc for
what is ostensibly a libnvdimm local patch.

=== Before ===

/* Notice no online memory on node2 at start */

# numactl --hardware
available: 3 nodes (0-1,3)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
node 0 size: 3958 MB
node 0 free: 3708 MB
node 1 cpus: 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
node 1 size: 4027 MB
node 1 free: 3871 MB
node 3 cpus:
node 3 size: 3994 MB
node 3 free: 3971 MB
node distances:
node   0   1   3
  0:  10  21  21
  1:  21  10  21
  3:  21  21  10

/*
 * Put the pmem namespace into devdax mode so it can be assigned to the
 * kmem driver
 */

# ndctl create-namespace -e namespace0.0 -m devdax -f
{
  "dev":"namespace0.0",
  "mode":"devdax",
  "map":"dev",
  "size":"3.94 GiB (4.23 GB)",
  "uuid":"1650af9b-9ba3-4704-acd6-10178399d9a3",
  [..]
}

/* Online Persistent Memory as System RAM */

# daxctl reconfigure-device --mode=system-ram dax0.0
libdaxctl: memblock_in_dev: dax0.0: memory0: Unable to determine phys_index: Success
libdaxctl: memblock_in_dev: dax0.0: memory0: Unable to determine phys_index: Success
libdaxctl: memblock_in_dev: dax0.0: memory0: Unable to determine phys_index: Success
libdaxctl: memblock_in_dev: dax0.0: memory0: Unable to determine phys_index: Success
[
  {
    "chardev":"dax0.0",
    "size":4225761280,
    "target_node":0,
    "mode":"system-ram"
  }
]
reconfigured 1 device

/* Note that the memory is onlined by default to the wrong node, node0 */

# numactl --hardware
available: 3 nodes (0-1,3)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
node 0 size: 7926 MB
node 0 free: 7655 MB
node 1 cpus: 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
node 1 size: 4027 MB
node 1 free: 3871 MB
node 3 cpus:
node 3 size: 3994 MB
node 3 free: 3971 MB
node distances:
node   0   1   3
  0:  10  21  21
  1:  21  10  21
  3:  21  21  10

=== After ===

/* Notice that the "phys_index" error messages are gone */

# daxctl reconfigure-device --mode=system-ram dax0.0
[
  {
    "chardev":"dax0.0",
    "size":4225761280,
    "target_node":2,
    "mode":"system-ram"
  }
]
reconfigured 1 device

/* Notice that node2 is now correctly populated */

# numactl --hardware
available: 4 nodes (0-3)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
node 0 size: 3958 MB
node 0 free: 3793 MB
node 1 cpus: 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
node 1 size: 4027 MB
node 1 free: 3851 MB
node 2 cpus:
node 2 size: 3968 MB
node 2 free: 3968 MB
node 3 cpus:
node 3 size: 3994 MB
node 3 free: 3908 MB
node distances:
node   0   1   2   3
  0:  10  21  21  21
  1:  21  10  21  21
  2:  21  21  10  21
  3:  21  21  21  10

Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/nvdimm/e820.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/nvdimm/e820.c b/drivers/nvdimm/e820.c
index b802291bcde1..23121dd6e494 100644
--- a/drivers/nvdimm/e820.c
+++ b/drivers/nvdimm/e820.c
@@ -20,11 +20,12 @@ static int e820_register_one(struct resource *res, void *data)
 {
 	struct nd_region_desc ndr_desc;
 	struct nvdimm_bus *nvdimm_bus = data;
+	int nid = memory_add_physaddr_to_target_node(res->start);
 
 	memset(&ndr_desc, 0, sizeof(ndr_desc));
 	ndr_desc.res = res;
-	ndr_desc.numa_node = memory_add_physaddr_to_nid(res->start);
-	ndr_desc.target_node = ndr_desc.numa_node;
+	ndr_desc.numa_node = numa_map_to_online_node(nid);
+	ndr_desc.target_node = nid;
 	set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
 	if (!nvdimm_pmem_region_create(nvdimm_bus, &ndr_desc))
 		return -ENXIO;
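
To make the numa_node / target_node split in the hunk above concrete, here
is a minimal userspace sketch. It is not the kernel implementation: the
sketch_* names, the NR_NODES constant and the two tables are hypothetical
stand-ins, with the distances copied from the "After" numactl output and
node2 marked offline as it is at boot in the example. It assumes that
mapping to an online node means picking the online node with the smallest
distance, and it breaks ties in favor of the lowest-numbered node; the
kernel helper may choose differently.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define NR_NODES     4      /* hypothetical: the 4 node example system */
#define NUMA_NO_NODE (-1)

/* Distances copied from the "After" numactl --hardware output. */
static const int node_distance[NR_NODES][NR_NODES] = {
	{ 10, 21, 21, 21 },
	{ 21, 10, 21, 21 },
	{ 21, 21, 10, 21 },
	{ 21, 21, 21, 10 },
};

/* Boot-time state from the example: node2 has no online memory. */
static const bool node_online[NR_NODES] = { true, true, false, true };

/*
 * Hypothetical stand-in for numa_map_to_online_node(): return @node
 * unchanged if it is NUMA_NO_NODE or already online, otherwise the
 * closest online node. Ties go to the lowest-numbered node.
 */
static int sketch_map_to_online_node(int node)
{
	int min_dist = INT_MAX, min_node = node;

	if (node == NUMA_NO_NODE)
		return node;
	assert(node >= 0 && node < NR_NODES);
	if (node_online[node])
		return node;

	for (int n = 0; n < NR_NODES; n++) {
		if (!node_online[n])
			continue;
		if (node_distance[node][n] < min_dist) {
			min_dist = node_distance[node][n];
			min_node = n;
		}
	}
	return min_node;
}

/* Only the two fields the patch touches; not struct nd_region_desc. */
struct sketch_region_desc {
	int numa_node;   /* online initiator node */
	int target_node; /* node the memory belongs to once onlined */
};

/* Mirrors the two assignments added by the patch, given a looked-up nid. */
static struct sketch_region_desc sketch_register(int nid)
{
	struct sketch_region_desc desc = {
		.numa_node = sketch_map_to_online_node(nid),
		.target_node = nid,
	};
	return desc;
}
```

With nid = 2 the sketch keeps target_node at 2 while numa_node falls back
to an online node (node 0 under the tie-break chosen here), which is the
separation the "After" output relies on: the memory is onlined to node2
rather than being folded into node0.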