From: Jia He <justin.he@arm.com>
To: Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>, Tony Luck <tony.luck@intel.com>,
	Fenghua Yu <fenghua.yu@intel.com>,
	Yoshinori Sato <ysato@users.sourceforge.jp>,
	Rich Felker <dalias@libc.org>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Andy Lutomirski <luto@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	David Hildenbrand <david@redhat.com>
Cc: x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Baoquan He <bhe@redhat.com>, Chuhong Yuan <hslester96@gmail.com>,
	Mike Rapoport <rppt@linux.ibm.com>,
	Masahiro Yamada <masahiroy@kernel.org>,
	Michal Hocko <mhocko@suse.com>,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, linux-ia64@vger.kernel.org,
	linux-sh@vger.kernel.org, linux-nvdimm@lists.01.org,
	linux-mm@kvack.org,
	Jonathan Cameron <Jonathan.Cameron@Huawei.com>,
	Kaly Xin <Kaly.Xin@arm.com>, Jia He <justin.he@arm.com>
Subject: [PATCH v3 5/6] device-dax: use fallback nid when numa_node is invalid
Date: Thu,  9 Jul 2020 10:06:28 +0800
Message-ID: <20200709020629.91671-6-justin.he@arm.com>
In-Reply-To: <20200709020629.91671-1-justin.he@arm.com>

numa_off is set unconditionally at the end of dummy_numa_init(), even
when a fake NUMA node is set up. acpi_map_pxm_to_node() treats numa_off
as NUMA being disabled and therefore returns NUMA_NO_NODE (-1) as the
node id. Hence dev_dax->target_node is NUMA_NO_NODE on arm64 with fake
NUMA.

Without this patch, pmem cannot be probed as a RAM device on arm64 when
the SRAT table is not present:
$ ndctl create-namespace -fe namespace0.0 --mode=devdax --map=dev -s 1g -a 64K
kmem dax0.0: rejecting DAX region [mem 0x240400000-0x2bfffffff] with invalid node: -1
kmem: probe of dax0.0 failed with error -22

Fix this by falling back to memory_add_physaddr_to_nid() to obtain the
nid when target_node is invalid.

Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Jia He <justin.he@arm.com>
---
 drivers/dax/kmem.c | 21 +++++++++++++--------
 1 file changed, 13 insertions(+), 8 deletions(-)
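
The node selection this patch applies in both probe and remove can be
sketched outside the kernel as below. This is an illustrative user-space
model, not the driver code: resolve_nid() and fallback_physaddr_to_nid()
are hypothetical names, and the fallback is assumed to report node 0,
standing in for whatever memory_add_physaddr_to_nid() returns on the
running architecture.

```c
#define NUMA_NO_NODE (-1)

/*
 * Hypothetical stand-in for the kernel's memory_add_physaddr_to_nid().
 * Assumption for this sketch: the fallback always reports node 0.
 */
static int fallback_physaddr_to_nid(unsigned long long start)
{
	(void)start;	/* the assumed fallback ignores the address */
	return 0;
}

/*
 * Pick the node id for hot-adding or removing the region: keep a valid
 * target_node as is, and ask the fallback only when it is negative.
 */
static int resolve_nid(int target_node, unsigned long long kmem_start)
{
	if (target_node < 0)
		return fallback_physaddr_to_nid(kmem_start);
	return target_node;
}
```

Under that assumption, resolve_nid(NUMA_NO_NODE, 0x240400000ULL) yields 0
instead of the probe being rejected, while a valid target_node such as 1
is passed through unchanged. Using the same helper on both paths keeps
the nid given to remove_memory() consistent with the one used at probe
time.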

diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c
index 275aa5f87399..218f66057994 100644
--- a/drivers/dax/kmem.c
+++ b/drivers/dax/kmem.c
@@ -31,22 +31,23 @@ int dev_dax_kmem_probe(struct device *dev)
 	int numa_node;
 	int rc;
 
+	/* Hotplug starting at the beginning of the next block: */
+	kmem_start = ALIGN(res->start, memory_block_size_bytes());
+
 	/*
 	 * Ensure good NUMA information for the persistent memory.
 	 * Without this check, there is a risk that slow memory
 	 * could be mixed in a node with faster memory, causing
-	 * unavoidable performance issues.
+	 * unavoidable performance issues. Furthermore, fallback node
+	 * id can be used when numa_node is invalid.
 	 */
 	numa_node = dev_dax->target_node;
 	if (numa_node < 0) {
-		dev_warn(dev, "rejecting DAX region %pR with invalid node: %d\n",
-			 res, numa_node);
-		return -EINVAL;
+		numa_node = memory_add_physaddr_to_nid(kmem_start);
+		dev_info(dev, "using nid %d for DAX region with undefined nid %pR\n",
+			numa_node, res);
 	}
 
-	/* Hotplug starting at the beginning of the next block: */
-	kmem_start = ALIGN(res->start, memory_block_size_bytes());
-
 	kmem_size = resource_size(res);
 	/* Adjust the size down to compensate for moving up kmem_start: */
 	kmem_size -= kmem_start - res->start;
@@ -100,15 +101,19 @@ static int dev_dax_kmem_remove(struct device *dev)
 	resource_size_t kmem_start = res->start;
 	resource_size_t kmem_size = resource_size(res);
 	const char *res_name = res->name;
+	int numa_node = dev_dax->target_node;
 	int rc;
 
+	if (numa_node < 0)
+		numa_node = memory_add_physaddr_to_nid(kmem_start);
+
 	/*
 	 * We have one shot for removing memory, if some memory blocks were not
 	 * offline prior to calling this function remove_memory() will fail, and
 	 * there is no way to hotremove this memory until reboot because device
 	 * unbind will succeed even if we return failure.
 	 */
-	rc = remove_memory(dev_dax->target_node, kmem_start, kmem_size);
+	rc = remove_memory(numa_node, kmem_start, kmem_size);
 	if (rc) {
 		any_hotremove_failed = true;
 		dev_err(dev,
-- 
2.17.1
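
For readers following the hunk in dev_dax_kmem_probe(), the start and
size arithmetic that the patch moves ahead of the node check can be
worked through with the region from the log above. This is a user-space
sketch only: align_up() and kmem_range() are hypothetical helpers, the
1 GiB memory block size is an assumed value for illustration, and only
the two adjustments visible in the hunk are modelled.

```c
#include <stdint.h>

/* Round x up to the next multiple of a; a must be a power of two. */
static uint64_t align_up(uint64_t x, uint64_t a)
{
	return (x + a - 1) & ~(a - 1);
}

struct range {
	uint64_t start;
	uint64_t size;
};

/*
 * Compute the hotplug range for an inclusive resource
 * [res_start, res_end]. Assumes the aligned start still lies inside
 * the resource; otherwise the subtraction would wrap around.
 */
static struct range kmem_range(uint64_t res_start, uint64_t res_end,
			       uint64_t block_size)
{
	struct range r;

	/* Hotplug starts at the beginning of the next block. */
	r.start = align_up(res_start, block_size);
	/* Shrink the size by the amount the start was moved up. */
	r.size = (res_end - res_start + 1) - (r.start - res_start);
	return r;
}
```

With res_start 0x240400000, res_end 0x2bfffffff and an assumed block
size of 0x40000000, the start moves up to 0x280000000 and the size drops
from 0x7fc00000 to 0x40000000. Because the fallback lookup is keyed on
this aligned start, kmem_start has to be computed before the node check,
which is why the patch reorders the two steps.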





Thread overview:
2020-07-09  2:06 [PATCH v3 0/6] Fix and enable pmem as RAM device on arm64 Jia He
2020-07-09  2:06 ` [PATCH v3 1/6] mm/memory_hotplug: introduce default dummy memory_add_physaddr_to_nid() Jia He
2020-07-09  9:10   ` David Hildenbrand
2020-07-09  2:06 ` [PATCH v3 2/6] arm64/mm: use " Jia He
2020-07-09  9:10   ` David Hildenbrand
2020-07-09  2:06 ` [PATCH v3 3/6] sh/mm: " Jia He
2020-07-09  9:10   ` David Hildenbrand
2020-07-20 21:39     ` Rich Felker
2020-07-21  3:23   ` Pankaj Gupta
2020-07-09  2:06 ` [PATCH v3 4/6] mm: don't export memory_add_physaddr_to_nid in arch specific directory Jia He
2020-07-09  2:11   ` Matthew Wilcox
2020-07-09  2:16     ` Justin He
2020-07-09  9:18     ` Mike Rapoport
2020-07-09  9:18       ` David Hildenbrand
2020-07-09  9:36         ` Justin He
2020-07-09  2:06 ` [PATCH v3 5/6] device-dax: use fallback nid when numa_node is invalid Jia He [this message]
2020-07-09  3:38   ` Dan Williams
2020-07-09  5:13     ` Justin He
2020-07-09  2:06 ` [PATCH v3 6/6] mm/memory_hotplug: fix unpaired mem_hotplug_begin/done Jia He
2020-07-09  9:11   ` David Hildenbrand
2020-07-31 15:28 ` [PATCH v3 0/6] Fix and enable pmem as RAM device on arm64 Dan Williams
