From: Jia He <justin.he@arm.com>
To: Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	Dan Williams <dan.j.williams@intel.com>,
	Vishal Verma <vishal.l.verma@intel.com>,
	Dave Jiang <dave.jiang@intel.com>
Cc: Michal Hocko <mhocko@suse.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Mike Rapoport <rppt@linux.ibm.com>, Baoquan He <bhe@redhat.com>,
	Chuhong Yuan <hslester96@gmail.com>,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-nvdimm@lists.01.org, Kaly Xin <Kaly.Xin@arm.com>,
	Jia He <justin.he@arm.com>
Subject: [RFC PATCH v2 2/3] device-dax: use fallback nid when numa_node is invalid
Date: Tue,  7 Jul 2020 13:59:16 +0800	[thread overview]
Message-ID: <20200707055917.143653-3-justin.he@arm.com> (raw)
In-Reply-To: <20200707055917.143653-1-justin.he@arm.com>

Previously, numa_off is set unconditionally at the end of dummy_numa_init(),
even with a fake NUMA node. ACPI then reports the node id as NUMA_NO_NODE (-1)
in acpi_map_pxm_to_node() because it treats numa_off as NUMA being turned off.
Hence dev_dax->target_node is NUMA_NO_NODE on arm64 with fake NUMA.

Without this patch, pmem can't be probed as a RAM device on arm64 if the SRAT
table isn't present:
$ ndctl create-namespace -fe namespace0.0 --mode=devdax --map=dev -s 1g -a 64K
kmem dax0.0: rejecting DAX region [mem 0x240400000-0x2bfffffff] with invalid node: -1
kmem: probe of dax0.0 failed with error -22

Fix this by falling back to memory_add_physaddr_to_nid() to obtain the nid.

Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Jia He <justin.he@arm.com>
---
I noticed that on powerpc memory_add_physaddr_to_nid() is not exported for
modular drivers. This patch is marked RFC because of that concern.
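
For readers following along outside the kernel tree, the fallback described
in the commit message can be modeled in plain C. This is only a minimal
sketch, not the driver code: model_physaddr_to_nid() is a hypothetical
stand-in for memory_add_physaddr_to_nid() that assumes a single node 0, and
resolve_nid() is an illustrative helper name that does not exist in kmem.c.

```c
#include <assert.h>

#define NUMA_NO_NODE (-1)

/*
 * Hypothetical stand-in for memory_add_physaddr_to_nid(). This model
 * assumes a single-node system and reports node 0 for every address.
 */
static int model_physaddr_to_nid(unsigned long long start)
{
	(void)start;
	return 0;
}

/*
 * Illustrative helper (not in kmem.c): keep target_node when it is
 * valid, otherwise derive a node id from the physical start address.
 */
static int resolve_nid(int target_node, unsigned long long start)
{
	int nid = target_node;

	if (nid < 0)
		nid = model_physaddr_to_nid(start);

	/* The fallback is expected to yield a usable node id. */
	assert(nid >= 0);
	return nid;
}
```

Under these assumptions, resolve_nid(NUMA_NO_NODE, 0x240400000ULL) yields
node 0 instead of rejecting the region, while a valid target_node is passed
through unchanged. Both the probe and remove paths need the same resolution
so that memory is removed from the node it was added to.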

 drivers/dax/kmem.c | 22 ++++++++++++++--------
 1 file changed, 14 insertions(+), 8 deletions(-)

diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c
index 275aa5f87399..68e693ca6d59 100644
--- a/drivers/dax/kmem.c
+++ b/drivers/dax/kmem.c
@@ -28,20 +28,22 @@ int dev_dax_kmem_probe(struct device *dev)
 	resource_size_t kmem_end;
 	struct resource *new_res;
 	const char *new_res_name;
-	int numa_node;
+	int numa_node, new_node;
 	int rc;
 
 	/*
 	 * Ensure good NUMA information for the persistent memory.
-	 * Without this check, there is a risk that slow memory
-	 * could be mixed in a node with faster memory, causing
-	 * unavoidable performance issues.
+	 * Without this check, there is a risk but not fatal that slow
+	 * memory could be mixed in a node with faster memory, causing
+	 * unavoidable performance issues. Furthermore, fallback node
+	 * id can be used when numa_node is invalid.
 	 */
 	numa_node = dev_dax->target_node;
 	if (numa_node < 0) {
-		dev_warn(dev, "rejecting DAX region %pR with invalid node: %d\n",
-			 res, numa_node);
-		return -EINVAL;
+		new_node = memory_add_physaddr_to_nid(kmem_start);
+		dev_info(dev, "changing nid from %d to %d for DAX region %pR\n",
+			numa_node, new_node, res);
+		numa_node = new_node;
 	}
 
 	/* Hotplug starting at the beginning of the next block: */
@@ -100,6 +102,7 @@ static int dev_dax_kmem_remove(struct device *dev)
 	resource_size_t kmem_start = res->start;
 	resource_size_t kmem_size = resource_size(res);
 	const char *res_name = res->name;
+	int numa_node = dev_dax->target_node;
 	int rc;
 
 	/*
@@ -108,7 +111,10 @@ static int dev_dax_kmem_remove(struct device *dev)
 	 * there is no way to hotremove this memory until reboot because device
 	 * unbind will succeed even if we return failure.
 	 */
-	rc = remove_memory(dev_dax->target_node, kmem_start, kmem_size);
+	if (numa_node < 0)
+		numa_node = memory_add_physaddr_to_nid(kmem_start);
+
+	rc = remove_memory(numa_node, kmem_start, kmem_size);
 	if (rc) {
 		any_hotremove_failed = true;
 		dev_err(dev,
-- 
2.17.1

