linux-mm.kvack.org archive mirror
* [PATCH -V6 RESEND 0/3] numa balancing: Migrate on fault among multiple bound nodes
@ 2020-12-02  8:42 Huang Ying
  2020-12-02  8:42 ` [PATCH -V6 RESEND 1/3] " Huang Ying
                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Huang Ying @ 2020-12-02  8:42 UTC (permalink / raw)
  To: Peter Zijlstra, Mel Gorman
  Cc: linux-mm, linux-kernel, Huang Ying, Matthew Wilcox (Oracle),
	Rafael Aquini, Andrew Morton, Ingo Molnar, Rik van Riel,
	Johannes Weiner, Dave Hansen, Andi Kleen, Michal Hocko,
	David Rientjes, linux-api

This series makes it possible to optimize cross-socket memory access
with NUMA balancing even if the memory of the application is bound to
multiple NUMA nodes.

Patches [2/3] and [3/3] are NOT kernel patches.  Instead, they are
patches for man-pages and numactl, respectively.  They are sent
together to make it easier to review the newly added kernel API.

Changes:

v6:

- Rebased on latest upstream kernel 5.10-rc5

- Added some benchmark data and an example to the patch description of [1/3]

- Renamed AutoNUMA to NUMA balancing

- Added patches for man-pages [2/3] and numactl [3/3]

v5:

- Remove mbind() support, because it's not clear that it's necessary.

v4:

- Use new flags instead of reusing MPOL_MF_LAZY.

v3:

- Rebased on latest upstream (v5.10-rc3)

- Revised the change log.

v2:

- Rebased on latest upstream (v5.10-rc1)

Best Regards,
Huang, Ying



* [PATCH -V6 RESEND 1/3] numa balancing: Migrate on fault among multiple bound nodes
  2020-12-02  8:42 [PATCH -V6 RESEND 0/3] numa balancing: Migrate on fault among multiple bound nodes Huang Ying
@ 2020-12-02  8:42 ` Huang Ying
  2020-12-02 11:40   ` Mel Gorman
  2020-12-02  8:42 ` [PATCH -V6 RESEND 2/3] NOT kernel/man-pages: man2/set_mempolicy.2: Add mode flag MPOL_F_NUMA_BALANCING Huang Ying
  2020-12-02  8:42 ` [PATCH -V6 RESEND 3/3] NOT kernel/numactl: Support to enable Linux kernel NUMA balancing Huang Ying
  2 siblings, 1 reply; 17+ messages in thread
From: Huang Ying @ 2020-12-02  8:42 UTC (permalink / raw)
  To: Peter Zijlstra, Mel Gorman
  Cc: linux-mm, linux-kernel, Huang Ying, Andrew Morton, Ingo Molnar,
	Rik van Riel, Johannes Weiner, Matthew Wilcox (Oracle),
	Dave Hansen, Andi Kleen, Michal Hocko, David Rientjes, linux-api

Now, NUMA balancing can only optimize the page placement among the
NUMA nodes if the default memory policy is used.  Because the memory
policy specified explicitly should take precedence.  But this seems
too strict in some situations.  For example, on a system with 4 NUMA
nodes, if the memory of an application is bound to the node 0 and 1,
NUMA balancing can potentially migrate the pages between the node 0
and 1 to reduce cross-node accessing without breaking the explicit
memory binding policy.

So in this patch, we add the MPOL_F_NUMA_BALANCING mode flag to
set_mempolicy().  With the flag specified, NUMA balancing will be
enabled within the thread to optimize the page placement within the
constraints of the specified memory binding policy.  With the newly
added flag, the NUMA balancing control mechanism becomes:

- The sysctl knob numa_balancing enables/disables NUMA balancing
  globally.

- Even if sysctl numa_balancing is enabled, NUMA balancing is disabled
  by default for memory areas or applications that specify an explicit
  memory policy.

- MPOL_F_NUMA_BALANCING can be used to enable NUMA balancing for an
  application that specifies an explicit memory policy (see the sketch
  below).
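
A small sketch of how these levels combine from user space follows (a
sketch only: the /proc path is the usual interface of the sysctl knob,
and the helper names are made up for illustration):

	#include <stdio.h>
	#include <numaif.h>

	#ifndef MPOL_F_NUMA_BALANCING
	#define MPOL_F_NUMA_BALANCING	(1 << 13)	/* from this patch */
	#endif

	/* Global knob: 1 if NUMA balancing is enabled, 0 if disabled,
	   -1 if the knob cannot be read. */
	static int global_numa_balancing(void)
	{
		int val = -1;
		FILE *f = fopen("/proc/sys/kernel/numa_balancing", "r");

		if (f) {
			if (fscanf(f, "%d", &val) != 1)
				val = -1;
			fclose(f);
		}
		return val;
	}

	/* Per-thread opt-in: bind to the nodes in maskp and allow NUMA
	   balancing among them.  Without MPOL_F_NUMA_BALANCING, the
	   explicit policy would disable NUMA balancing for this thread. */
	static long bind_with_balancing(unsigned long *maskp, unsigned long maxnode)
	{
		return set_mempolicy(MPOL_BIND | MPOL_F_NUMA_BALANCING,
				     maskp, maxnode);
	}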

Various page placement optimizations based on NUMA balancing can be
done with these flags.  As a first step, in this patch, if the memory
of the application is bound to multiple nodes (MPOL_BIND) and the
accessing node is in the policy nodemask when the hint page fault is
handled, the page will be migrated to the accessing node to reduce
cross-node accesses.

If the newly added MPOL_F_NUMA_BALANCING flag is specified by an
application on an old kernel version that does not support it,
set_mempolicy() will return -1 and errno will be set to EINVAL.  The
application can use this behavior to run on both old and new kernel
versions.

In a previous version of the patch, we tried to reuse MPOL_MF_LAZY
for mbind().  But that flag is tied to MPOL_MF_MOVE.*, so it does not
seem to be a good API/ABI for the purpose of this patch.

Because it's not clear whether it's necessary to enable NUMA
balancing for a specific memory area inside an application, we only
add the flag at the thread level (set_mempolicy()) instead of the
memory area level (mbind()).  We can do that when it becomes necessary.

To test the patch, we run a test case as follows on a 4-node machine
with 192 GB memory (48 GB per node).

1. Change the pmbench memory access benchmark to call set_mempolicy()
   to bind its memory to nodes 1 and 3 and enable NUMA balancing.  Some
   related code snippets are as follows:

     #include <errno.h>
     #include <stdio.h>
     #include <stdlib.h>
     #include <numaif.h>
     #include <numa.h>

	struct bitmask *bmp;
	int ret;

	bmp = numa_parse_nodestring("1,3");
	ret = set_mempolicy(MPOL_BIND | MPOL_F_NUMA_BALANCING,
			    bmp->maskp, bmp->size + 1);
	/* If MPOL_F_NUMA_BALANCING isn't supported, fall back to MPOL_BIND */
	if (ret < 0 && errno == EINVAL)
		ret = set_mempolicy(MPOL_BIND, bmp->maskp, bmp->size + 1);
	if (ret < 0) {
		perror("Failed to call set_mempolicy");
		exit(-1);
	}

2. Run a memory eater on node 3 to use 40 GB memory before running pmbench.

3. Run pmbench with 64 processes; the working-set size of each process
   is 640 MB, so the total working-set size is 64 * 640 MB = 40 GB.  The
   CPUs and the memory (as in step 1) of all pmbench processes are bound
   to nodes 1 and 3.  So, after CPU usage is balanced, some pmbench
   processes running on the CPUs of node 3 will access the memory of
   node 1.

4. After the pmbench processes run for 100 seconds, kill the memory
   eater.  Now it's possible for some pmbench processes to migrate
   their pages from node 1 to node 3 to reduce cross-node accessing.

Test results show that, with the patch, pages can be migrated from
node 1 to node 3 after the memory eater is killed, and the pmbench
score increases by about 17.5%.

Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@surriel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: David Rientjes <rientjes@google.com>
Cc: linux-api@vger.kernel.org
---
 include/uapi/linux/mempolicy.h | 4 +++-
 mm/mempolicy.c                 | 9 +++++++++
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
index 3354774af61e..8948467b3992 100644
--- a/include/uapi/linux/mempolicy.h
+++ b/include/uapi/linux/mempolicy.h
@@ -28,12 +28,14 @@ enum {
 /* Flags for set_mempolicy */
 #define MPOL_F_STATIC_NODES	(1 << 15)
 #define MPOL_F_RELATIVE_NODES	(1 << 14)
+#define MPOL_F_NUMA_BALANCING	(1 << 13) /* Optimize with NUMA balancing if possible */
 
 /*
  * MPOL_MODE_FLAGS is the union of all possible optional mode flags passed to
  * either set_mempolicy() or mbind().
  */
-#define MPOL_MODE_FLAGS	(MPOL_F_STATIC_NODES | MPOL_F_RELATIVE_NODES)
+#define MPOL_MODE_FLAGS							\
+	(MPOL_F_STATIC_NODES | MPOL_F_RELATIVE_NODES | MPOL_F_NUMA_BALANCING)
 
 /* Flags for get_mempolicy */
 #define MPOL_F_NODE	(1<<0)	/* return next IL mode instead of node mask */
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 3ca4898f3f24..f74d863a9ad3 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -875,6 +875,9 @@ static long do_set_mempolicy(unsigned short mode, unsigned short flags,
 		goto out;
 	}
 
+	if (new && new->mode == MPOL_BIND && (flags & MPOL_F_NUMA_BALANCING))
+		new->flags |= (MPOL_F_MOF | MPOL_F_MORON);
+
 	ret = mpol_set_nodemask(new, nodes, scratch);
 	if (ret) {
 		mpol_put(new);
@@ -2490,6 +2493,12 @@ int mpol_misplaced(struct page *page, struct vm_area_struct *vma, unsigned long
 		break;
 
 	case MPOL_BIND:
+		/* Optimize placement among multiple nodes via NUMA balancing */
+		if (pol->flags & MPOL_F_MORON) {
+			if (node_isset(thisnid, pol->v.nodes))
+				break;
+			goto out;
+		}
 
 		/*
 		 * allows binding to multiple nodes.
-- 
2.29.2




* [PATCH -V6 RESEND 2/3] NOT kernel/man-pages: man2/set_mempolicy.2: Add mode flag MPOL_F_NUMA_BALANCING
  2020-12-02  8:42 [PATCH -V6 RESEND 0/3] numa balancing: Migrate on fault among multiple bound nodes Huang Ying
  2020-12-02  8:42 ` [PATCH -V6 RESEND 1/3] " Huang Ying
@ 2020-12-02  8:42 ` Huang Ying
  2020-12-02 11:43   ` Mel Gorman
  2020-12-02 12:33   ` Alejandro Colomar (mailing lists; readonly)
  2020-12-02  8:42 ` [PATCH -V6 RESEND 3/3] NOT kernel/numactl: Support to enable Linux kernel NUMA balancing Huang Ying
  2 siblings, 2 replies; 17+ messages in thread
From: Huang Ying @ 2020-12-02  8:42 UTC (permalink / raw)
  To: Peter Zijlstra, Mel Gorman
  Cc: linux-mm, linux-kernel, Huang Ying, Matthew Wilcox (Oracle),
	Rafael Aquini, Andrew Morton, Ingo Molnar, Rik van Riel,
	Johannes Weiner, Dave Hansen, Andi Kleen, Michal Hocko,
	David Rientjes, linux-api

Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
---
 man2/set_mempolicy.2 | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/man2/set_mempolicy.2 b/man2/set_mempolicy.2
index 68011eecb..3754b3e12 100644
--- a/man2/set_mempolicy.2
+++ b/man2/set_mempolicy.2
@@ -113,6 +113,12 @@ A nonempty
 .I nodemask
 specifies node IDs that are relative to the set of
 node IDs allowed by the process's current cpuset.
+.TP
+.BR MPOL_F_NUMA_BALANCING " (since Linux 5.11)"
+Enable the Linux kernel NUMA balancing for the task if it is supported
+by kernel.
+If the flag isn't supported by Linux kernel, return -1 and errno is
+set to EINVAL.
 .PP
 .I nodemask
 points to a bit mask of node IDs that contains up to
@@ -293,6 +299,9 @@ argument specified both
 .B MPOL_F_STATIC_NODES
 and
 .BR MPOL_F_RELATIVE_NODES .
+Or, the
+.B MPOL_F_NUMA_BALANCING
+isn't supported by the Linux kernel.
 .TP
 .B ENOMEM
 Insufficient kernel memory was available.
-- 
2.29.2




* [PATCH -V6 RESEND 3/3] NOT kernel/numactl: Support to enable Linux kernel NUMA balancing
  2020-12-02  8:42 [PATCH -V6 RESEND 0/3] numa balancing: Migrate on fault among multiple bound nodes Huang Ying
  2020-12-02  8:42 ` [PATCH -V6 RESEND 1/3] " Huang Ying
  2020-12-02  8:42 ` [PATCH -V6 RESEND 2/3] NOT kernel/man-pages: man2/set_mempolicy.2: Add mode flag MPOL_F_NUMA_BALANCING Huang Ying
@ 2020-12-02  8:42 ` Huang Ying
  2020-12-02 11:45   ` Mel Gorman
  2 siblings, 1 reply; 17+ messages in thread
From: Huang Ying @ 2020-12-02  8:42 UTC (permalink / raw)
  To: Peter Zijlstra, Mel Gorman
  Cc: linux-mm, linux-kernel, Huang Ying, Matthew Wilcox (Oracle),
	Rafael Aquini, Andrew Morton, Ingo Molnar, Rik van Riel,
	Johannes Weiner, Dave Hansen, Andi Kleen, Michal Hocko,
	David Rientjes, linux-api

A new API, numa_set_membind_balancing(), is added to libnuma.  It is
the same as numa_set_membind() except that Linux kernel NUMA balancing
will be enabled for the task if the feature is supported by the
kernel.

At the same time, a new option, --balancing (-b), is added to numactl.
It can be used before the memory policy options on the command line.
With it, Linux kernel NUMA balancing will be enabled for the process
if the feature is supported by the kernel.
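
A short usage sketch of the new libnuma call follows (a sketch only:
the node string "1,3" is arbitrary, and the fallback for kernels
without MPOL_F_NUMA_BALANCING happens inside the library, which takes
the plain numa_set_membind() path when the kernel returns EINVAL):

	#include <numa.h>

	int main(void)
	{
		struct bitmask *bmp;

		if (numa_available() < 0)
			return 1;

		bmp = numa_parse_nodestring("1,3");
		if (!bmp)
			return 1;

		/* Bind memory allocation to nodes 1 and 3 and, when the
		   kernel supports it, enable NUMA balancing among them. */
		numa_set_membind_balancing(bmp);
		numa_bitmask_free(bmp);

		/* ... allocate and use memory as usual ... */
		return 0;
	}

From the command line, "numactl --balancing --membind=1,3 <command>"
gives the same behavior, as in the numactl.8 example below.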

Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
---
 libnuma.c         | 14 ++++++++++++++
 numa.3            | 15 +++++++++++++++
 numa.h            |  4 ++++
 numactl.8         |  9 +++++++++
 numactl.c         | 17 ++++++++++++++---
 numaif.h          |  3 +++
 versions.ldscript |  8 ++++++++
 7 files changed, 67 insertions(+), 3 deletions(-)

diff --git a/libnuma.c b/libnuma.c
index 88f479b..f073c50 100644
--- a/libnuma.c
+++ b/libnuma.c
@@ -1064,6 +1064,20 @@ numa_set_membind_v2(struct bitmask *bmp)
 
 make_internal_alias(numa_set_membind_v2);
 
+void
+numa_set_membind_balancing(struct bitmask *bmp)
+{
+	/* MPOL_F_NUMA_BALANCING: ignore if unsupported */
+	if (set_mempolicy(MPOL_BIND | MPOL_F_NUMA_BALANCING,
+			  bmp->maskp, bmp->size + 1) < 0) {
+		if (errno == EINVAL) {
+			errno = 0;
+			numa_set_membind_v2(bmp);
+		} else
+			numa_error("set_mempolicy");
+	}
+}
+
 /*
  * copy a bitmask map body to a numa.h nodemask_t structure
  */
diff --git a/numa.3 b/numa.3
index 3e18098..af01c8f 100644
--- a/numa.3
+++ b/numa.3
@@ -80,6 +80,8 @@ numa \- NUMA policy library
 .br
 .BI "void numa_set_membind(struct bitmask *" nodemask );
 .br
+.BI "void numa_set_membind_balancing(struct bitmask *" nodemask );
+.br
 .B struct bitmask *numa_get_membind(void);
 .sp
 .BI "void *numa_alloc_onnode(size_t " size ", int " node );
@@ -538,6 +540,19 @@ that contains nodes other than those in the mask returned by
 .IR numa_get_mems_allowed ()
 will result in an error.
 
+.BR numa_set_membind_balancing ()
+sets the memory allocation mask and enables the Linux kernel NUMA
+balancing for the task if the feature is supported by the kernel.
+The task will only allocate memory from the nodes set in
+.IR nodemask .
+Passing an empty
+.I nodemask
+or a
+.I nodemask
+that contains nodes other than those in the mask returned by
+.IR numa_get_mems_allowed ()
+will result in an error.
+
 .BR numa_get_membind ()
 returns the mask of nodes from which memory can currently be allocated.
 If the returned mask is equal to
diff --git a/numa.h b/numa.h
index bd1d676..5d8543a 100644
--- a/numa.h
+++ b/numa.h
@@ -192,6 +192,10 @@ void numa_set_localalloc(void);
 /* Only allocate memory from the nodes set in mask. 0 to turn off */
 void numa_set_membind(struct bitmask *nodemask);
 
+/* Only allocate memory from the nodes set in mask. Optimize page
+   placement with Linux kernel NUMA balancing if possible. 0 to turn off */
+void numa_set_membind_balancing(struct bitmask *bmp);
+
 /* Return current membind */
 struct bitmask *numa_get_membind(void);
 
diff --git a/numactl.8 b/numactl.8
index f3bb22b..109dd8f 100644
--- a/numactl.8
+++ b/numactl.8
@@ -25,6 +25,8 @@ numactl \- Control NUMA policy for processes or shared memory
 [
 .B \-\-all
 ] [
+.B \-\-balancing
+] [
 .B \-\-interleave nodes
 ] [
 .B \-\-preferred node 
@@ -168,6 +170,9 @@ but if memory cannot be allocated there fall back to other nodes.
 This option takes only a single node number.
 Relative notation may be used.
 .TP
+.B \-\-balancing, \-b
+Enable Linux kernel NUMA balancing for the process if it is supported by kernel.
+.TP
 .B \-\-show, \-s
 Show NUMA policy settings of the current process. 
 .TP
@@ -278,6 +283,10 @@ numactl \-\-cpunodebind=0 \-\-membind=0,1 -- process -l
 Run process as above, but with an option (-l) that would be confused with
 a numactl option.
 
+numactl \-\-cpunodebind=0 \-\-balancing \-\-membind=0,1 process
+Run process on node 0 with memory allocated on node 0 and 1.  Optimize the
+page placement with Linux kernel NUMA balancing mechanism if possible.
+
 numactl \-\-cpunodebind=netdev:eth0 \-\-membind=netdev:eth0 network-server
 Run network-server on the node of network device eth0 with its memory
 also in the same node.
diff --git a/numactl.c b/numactl.c
index df9dbcb..5a9d2df 100644
--- a/numactl.c
+++ b/numactl.c
@@ -45,6 +45,7 @@ struct option opts[] = {
 	{"membind", 1, 0, 'm'},
 	{"show", 0, 0, 's' },
 	{"localalloc", 0,0, 'l'},
+	{"balancing", 0, 0, 'b'},
 	{"hardware", 0,0,'H' },
 
 	{"shm", 1, 0, 'S'},
@@ -65,9 +66,10 @@ struct option opts[] = {
 void usage(void)
 {
 	fprintf(stderr,
-		"usage: numactl [--all | -a] [--interleave= | -i <nodes>] [--preferred= | -p <node>]\n"
-		"               [--physcpubind= | -C <cpus>] [--cpunodebind= | -N <nodes>]\n"
-		"               [--membind= | -m <nodes>] [--localalloc | -l] command args ...\n"
+		"usage: numactl [--all | -a] [--balancing | -b] [--interleave= | -i <nodes>]\n"
+		"               [--preferred= | -p <node>] [--physcpubind= | -C <cpus>]\n"
+		"               [--cpunodebind= | -N <nodes>] [--membind= | -m <nodes>]\n"
+		"               [--localalloc | -l] command args ...\n"
 		"       numactl [--show | -s]\n"
 		"       numactl [--hardware | -H]\n"
 		"       numactl [--length | -l <length>] [--offset | -o <offset>] [--shmmode | -M <shmmode>]\n"
@@ -90,6 +92,8 @@ void usage(void)
 		"all numbers and ranges can be made cpuset-relative with +\n"
 		"the old --cpubind argument is deprecated.\n"
 		"use --cpunodebind or --physcpubind instead\n"
+		"use --balancing | -b to enable Linux kernel NUMA balancing\n"
+		"for the process if it is supported by kernel\n"
 		"<length> can have g (GB), m (MB) or k (KB) suffixes\n");
 	exit(1);
 }
@@ -338,6 +342,7 @@ int do_dump = 0;
 int shmattached = 0;
 int did_node_cpu_parse = 0;
 int parse_all = 0;
+int numa_balancing = 0;
 char *shmoption;
 
 void check_cpubind(int flag)
@@ -431,6 +436,10 @@ int main(int ac, char **av)
 			nopolicy();
 			hardware();
 			exit(0);
+		case 'b': /* --balancing  */
+			nopolicy();
+			numa_balancing = 1;
+			break;
 		case 'i': /* --interleave */
 			checknuma();
 			if (parse_all)
@@ -507,6 +516,8 @@ int main(int ac, char **av)
 			numa_set_bind_policy(1);
 			if (shmfd >= 0) {
 				numa_tonodemask_memory(shmptr, shmlen, mask);
+			} else if (numa_balancing) {
+				numa_set_membind_balancing(mask);
 			} else {
 				numa_set_membind(mask);
 			}
diff --git a/numaif.h b/numaif.h
index 91aa230..32c12c3 100644
--- a/numaif.h
+++ b/numaif.h
@@ -29,6 +29,9 @@ extern long move_pages(int pid, unsigned long count,
 #define MPOL_LOCAL       4
 #define MPOL_MAX         5
 
+/* Flags for set_mempolicy, specified in mode */
+#define MPOL_F_NUMA_BALANCING	(1 << 13) /* Optimize with NUMA balancing if possible */
+
 /* Flags for get_mem_policy */
 #define MPOL_F_NODE    (1<<0)   /* return next il node or node of address */
 				/* Warning: MPOL_F_NODE is unsupported and
diff --git a/versions.ldscript b/versions.ldscript
index 23074a0..358eeeb 100644
--- a/versions.ldscript
+++ b/versions.ldscript
@@ -146,3 +146,11 @@ libnuma_1.4 {
   local:
     *;
 } libnuma_1.3;
+
+# New interface for membind with NUMA balancing optimization
+libnuma_1.5 {
+  global:
+    numa_set_membind_balancing;
+  local:
+    *;
+} libnuma_1.4;
-- 
2.29.2




* Re: [PATCH -V6 RESEND 1/3] numa balancing: Migrate on fault among multiple bound nodes
  2020-12-02  8:42 ` [PATCH -V6 RESEND 1/3] " Huang Ying
@ 2020-12-02 11:40   ` Mel Gorman
  2020-12-03 10:25     ` Peter Zijlstra
  0 siblings, 1 reply; 17+ messages in thread
From: Mel Gorman @ 2020-12-02 11:40 UTC (permalink / raw)
  To: Huang Ying
  Cc: Peter Zijlstra, linux-mm, linux-kernel, Andrew Morton,
	Ingo Molnar, Rik van Riel, Johannes Weiner,
	Matthew Wilcox (Oracle),
	Dave Hansen, Andi Kleen, Michal Hocko, David Rientjes, linux-api

On Wed, Dec 02, 2020 at 04:42:32PM +0800, Huang Ying wrote:
> Now, NUMA balancing can only optimize the page placement among the
> NUMA nodes if the default memory policy is used.  Because the memory
> policy specified explicitly should take precedence.  But this seems
> too strict in some situations.  For example, on a system with 4 NUMA
> nodes, if the memory of an application is bound to the node 0 and 1,
> NUMA balancing can potentially migrate the pages between the node 0
> and 1 to reduce cross-node accessing without breaking the explicit
> memory binding policy.
> 

Ok, I think this part is ok and while the test case is somewhat
superficial, it at least demonstrated that the NUMA balancing overhead
did not offset any potential benefit

Acked-by: Mel Gorman <mgorman@suse.de>

-- 
Mel Gorman
SUSE Labs



* Re: [PATCH -V6 RESEND 2/3] NOT kernel/man-pages: man2/set_mempolicy.2: Add mode flag MPOL_F_NUMA_BALANCING
  2020-12-02  8:42 ` [PATCH -V6 RESEND 2/3] NOT kernel/man-pages: man2/set_mempolicy.2: Add mode flag MPOL_F_NUMA_BALANCING Huang Ying
@ 2020-12-02 11:43   ` Mel Gorman
  2020-12-03  1:49     ` Huang, Ying
  2020-12-02 12:33   ` Alejandro Colomar (mailing lists; readonly)
  1 sibling, 1 reply; 17+ messages in thread
From: Mel Gorman @ 2020-12-02 11:43 UTC (permalink / raw)
  To: Huang Ying
  Cc: Peter Zijlstra, linux-mm, linux-kernel, Matthew Wilcox (Oracle),
	Rafael Aquini, Andrew Morton, Ingo Molnar, Rik van Riel,
	Johannes Weiner, Dave Hansen, Andi Kleen, Michal Hocko,
	David Rientjes, linux-api

On Wed, Dec 02, 2020 at 04:42:33PM +0800, Huang Ying wrote:
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> ---
>  man2/set_mempolicy.2 | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/man2/set_mempolicy.2 b/man2/set_mempolicy.2
> index 68011eecb..3754b3e12 100644
> --- a/man2/set_mempolicy.2
> +++ b/man2/set_mempolicy.2
> @@ -113,6 +113,12 @@ A nonempty
>  .I nodemask
>  specifies node IDs that are relative to the set of
>  node IDs allowed by the process's current cpuset.
> +.TP
> +.BR MPOL_F_NUMA_BALANCING " (since Linux 5.11)"
> +Enable the Linux kernel NUMA balancing for the task if it is supported
> +by kernel.
> +If the flag isn't supported by Linux kernel, return -1 and errno is
> +set to EINVAL.
>  .PP
>  .I nodemask
>  points to a bit mask of node IDs that contains up to
> @@ -293,6 +299,9 @@ argument specified both

Should this be expanded more to clarify it applies to MPOL_BIND
specifically?

Maybe the first patch should be expanded more and explicitly fail if
MPOL_F_NUMA_BALANCING is used with anything other than MPOL_BIND?

>  .B MPOL_F_STATIC_NODES
>  and
>  .BR MPOL_F_RELATIVE_NODES .
> +Or, the
> +.B MPOL_F_NUMA_BALANCING
> +isn't supported by the Linux kernel.

This will be difficult for an app to distinguish but we can't go back in
time and make this ENOSYS :(

The linux-api people might have more guidance but it may go to the
extent of including a small test program in the manual page for a
sequence that tests whether MPOL_F_NUMA_BALANCING works. They might have
a better recommendation on how it should be handled.
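
For illustration, one minimal probe along those lines (a sketch only,
not from the patch set: it relies solely on the EINVAL behaviour
described above, links against libnuma for the set_mempolicy() wrapper,
and resets the thread to MPOL_DEFAULT afterwards, so it only suits a
thread that still runs the default policy):

	#include <errno.h>
	#include <numaif.h>

	#ifndef MPOL_F_NUMA_BALANCING
	#define MPOL_F_NUMA_BALANCING	(1 << 13)
	#endif

	/* 1: flag supported, 0: not supported, -1: probe itself failed */
	static int mpol_numa_balancing_supported(void)
	{
		unsigned long mask = 1;	/* probe with node 0 only */
		int ret, err;

		/* Make sure plain MPOL_BIND works with these arguments,
		   so a later EINVAL can only mean "unknown flag". */
		if (set_mempolicy(MPOL_BIND, &mask, sizeof(mask) * 8 + 1) < 0)
			return -1;

		ret = set_mempolicy(MPOL_BIND | MPOL_F_NUMA_BALANCING,
				    &mask, sizeof(mask) * 8 + 1);
		err = errno;
		set_mempolicy(MPOL_DEFAULT, NULL, 0);

		if (ret == 0)
			return 1;
		return err == EINVAL ? 0 : -1;
	}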

-- 
Mel Gorman
SUSE Labs



* Re: [PATCH -V6 RESEND 3/3] NOT kernel/numactl: Support to enable Linux kernel NUMA balancing
  2020-12-02  8:42 ` [PATCH -V6 RESEND 3/3] NOT kernel/numactl: Support to enable Linux kernel NUMA balancing Huang Ying
@ 2020-12-02 11:45   ` Mel Gorman
  0 siblings, 0 replies; 17+ messages in thread
From: Mel Gorman @ 2020-12-02 11:45 UTC (permalink / raw)
  To: Huang Ying
  Cc: Peter Zijlstra, linux-mm, linux-kernel, Matthew Wilcox (Oracle),
	Rafael Aquini, Andrew Morton, Ingo Molnar, Rik van Riel,
	Johannes Weiner, Dave Hansen, Andi Kleen, Michal Hocko,
	David Rientjes, linux-api

On Wed, Dec 02, 2020 at 04:42:34PM +0800, Huang Ying wrote:
> A new API, numa_set_membind_balancing(), is added to libnuma.  It is
> the same as numa_set_membind() except that Linux kernel NUMA balancing
> will be enabled for the task if the feature is supported by the
> kernel.
>
> At the same time, a new option, --balancing (-b), is added to numactl.
> It can be used before the memory policy options on the command line.
> With it, Linux kernel NUMA balancing will be enabled for the process
> if the feature is supported by the kernel.
> 
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> index f3bb22b..109dd8f 100644
> --- a/numactl.8
> +++ b/numactl.8
> @@ -25,6 +25,8 @@ numactl \- Control NUMA policy for processes or shared memory
>  [
>  .B \-\-all
>  ] [
> +.B \-\-balancing
> +] [

--balancing is a bit vague, maybe --balance-bind? The intent is to hint
that it's specific to MPOL_BIND at this time.

-- 
Mel Gorman
SUSE Labs


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH -V6 RESEND 2/3] NOT kernel/man-pages: man2/set_mempolicy.2: Add mode flag MPOL_F_NUMA_BALANCING
  2020-12-02  8:42 ` [PATCH -V6 RESEND 2/3] NOT kernel/man-pages: man2/set_mempolicy.2: Add mode flag MPOL_F_NUMA_BALANCING Huang Ying
  2020-12-02 11:43   ` Mel Gorman
@ 2020-12-02 12:33   ` Alejandro Colomar (mailing lists; readonly)
  2020-12-08  8:13     ` Huang, Ying
  1 sibling, 1 reply; 17+ messages in thread
From: Alejandro Colomar (mailing lists; readonly) @ 2020-12-02 12:33 UTC (permalink / raw)
  To: Huang Ying, Peter Zijlstra, Mel Gorman
  Cc: linux-mm, linux-kernel, Matthew Wilcox (Oracle),
	Rafael Aquini, Andrew Morton, Ingo Molnar, Rik van Riel,
	Johannes Weiner, Dave Hansen, Andi Kleen, Michal Hocko,
	David Rientjes, linux-api

Hi Huang Ying,

Please see a few fixes below.

Michael, as always, some question for you too ;)

Thanks,

Alex

On 12/2/20 9:42 AM, Huang Ying wrote:
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> ---
>  man2/set_mempolicy.2 | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/man2/set_mempolicy.2 b/man2/set_mempolicy.2
> index 68011eecb..3754b3e12 100644
> --- a/man2/set_mempolicy.2
> +++ b/man2/set_mempolicy.2
> @@ -113,6 +113,12 @@ A nonempty
>  .I nodemask
>  specifies node IDs that are relative to the set of
>  node IDs allowed by the process's current cpuset.
> +.TP
> +.BR MPOL_F_NUMA_BALANCING " (since Linux 5.11)"

I'd prefer it to be in alphabetical order (rather than just adding at
the bottom).

That way, when lists grow, it's easier to find things.

> +Enable the Linux kernel NUMA balancing for the task if it is supported
> +by kernel.

I'd s/Linux kernel/kernel/ when it doesn't specifically refer to the
Linux kernel to differentiate it from other kernels.  It only adds noise
(IMHO).  mtk?

wfix:

... supported by _the_ kernel.

> +If the flag isn't supported by Linux kernel, return -1 and errno is

wfix:

If the flag isn't supported by _the_ kernel, ...

> +set to EINVAL.

errno and EINVAL should use .I and .B respectively

>  .PP
>  .I nodemask
>  points to a bit mask of node IDs that contains up to
> @@ -293,6 +299,9 @@ argument specified both
>  .B MPOL_F_STATIC_NODES
>  and
>  .BR MPOL_F_RELATIVE_NODES .
> +Or, the
> +.B MPOL_F_NUMA_BALANCING
> +isn't supported by the Linux kernel.
>  .TP
>  .B ENOMEM
>  Insufficient kernel memory was available.
> 



* Re: [PATCH -V6 RESEND 2/3] NOT kernel/man-pages: man2/set_mempolicy.2: Add mode flag MPOL_F_NUMA_BALANCING
  2020-12-02 11:43   ` Mel Gorman
@ 2020-12-03  1:49     ` Huang, Ying
  2020-12-03  9:37       ` Mel Gorman
  0 siblings, 1 reply; 17+ messages in thread
From: Huang, Ying @ 2020-12-03  1:49 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Peter Zijlstra, linux-mm, linux-kernel, Matthew Wilcox (Oracle),
	Rafael Aquini, Andrew Morton, Ingo Molnar, Rik van Riel,
	Johannes Weiner, Dave Hansen, Andi Kleen, Michal Hocko,
	David Rientjes, linux-api

Mel Gorman <mgorman@suse.de> writes:

> On Wed, Dec 02, 2020 at 04:42:33PM +0800, Huang Ying wrote:
>> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
>> ---
>>  man2/set_mempolicy.2 | 9 +++++++++
>>  1 file changed, 9 insertions(+)
>> 
>> diff --git a/man2/set_mempolicy.2 b/man2/set_mempolicy.2
>> index 68011eecb..3754b3e12 100644
>> --- a/man2/set_mempolicy.2
>> +++ b/man2/set_mempolicy.2
>> @@ -113,6 +113,12 @@ A nonempty
>>  .I nodemask
>>  specifies node IDs that are relative to the set of
>>  node IDs allowed by the process's current cpuset.
>> +.TP
>> +.BR MPOL_F_NUMA_BALANCING " (since Linux 5.11)"
>> +Enable the Linux kernel NUMA balancing for the task if it is supported
>> +by kernel.
>> +If the flag isn't supported by Linux kernel, return -1 and errno is
>> +set to EINVAL.
>>  .PP
>>  .I nodemask
>>  points to a bit mask of node IDs that contains up to
>> @@ -293,6 +299,9 @@ argument specified both
>
> Should this be expanded more to clarify it applies to MPOL_BIND
> specifically?
>
> Maybe the first patch should be expanded more and explicitly fail if
> MPOL_F_NUMA_BALANCING is used with anything other than MPOL_BIND?

For MPOL_PREFERRED, why could we not use NUMA balancing to migrate pages
to the accessing local node if it is same as the preferred node?  We
have a way to turn off NUMA balancing already, why could we not provide
a way to enable it if that's intended?

Even for MPOL_INTERLEAVE, if the target node is the same as the
accessing local node, can we use NUMA balancing to migrate pages?

So, I prefer to make MPOL_F_NUMA_BALANCING to be

  Optimizing with NUMA balancing if possible, and we may add more
  optimization in the future.

Do you agree?

Best Regards,
Huang, Ying

>>  .B MPOL_F_STATIC_NODES
>>  and
>>  .BR MPOL_F_RELATIVE_NODES .
>> +Or, the
>> +.B MPOL_F_NUMA_BALANCING
>> +isn't supported by the Linux kernel.
>
> This will be difficult for an app to distinguish but we can't go back in
> time and make this ENOSYS :(
>
> The linux-api people might have more guidance but it may go to the
> extent of including a small test program in the manual page for a
> sequence that tests whether MPOL_F_NUMA_BALANCING works. They might have
> a better recommendation on how it should be handled.



* Re: [PATCH -V6 RESEND 2/3] NOT kernel/man-pages: man2/set_mempolicy.2: Add mode flag MPOL_F_NUMA_BALANCING
  2020-12-03  1:49     ` Huang, Ying
@ 2020-12-03  9:37       ` Mel Gorman
  0 siblings, 0 replies; 17+ messages in thread
From: Mel Gorman @ 2020-12-03  9:37 UTC (permalink / raw)
  To: Huang, Ying
  Cc: Peter Zijlstra, linux-mm, linux-kernel, Matthew Wilcox (Oracle),
	Rafael Aquini, Andrew Morton, Ingo Molnar, Rik van Riel,
	Johannes Weiner, Dave Hansen, Andi Kleen, Michal Hocko,
	David Rientjes, linux-api

On Thu, Dec 03, 2020 at 09:49:02AM +0800, Huang, Ying wrote:
> >> diff --git a/man2/set_mempolicy.2 b/man2/set_mempolicy.2
> >> index 68011eecb..3754b3e12 100644
> >> --- a/man2/set_mempolicy.2
> >> +++ b/man2/set_mempolicy.2
> >> @@ -113,6 +113,12 @@ A nonempty
> >>  .I nodemask
> >>  specifies node IDs that are relative to the set of
> >>  node IDs allowed by the process's current cpuset.
> >> +.TP
> >> +.BR MPOL_F_NUMA_BALANCING " (since Linux 5.11)"
> >> +Enable the Linux kernel NUMA balancing for the task if it is supported
> >> +by kernel.
> >> +If the flag isn't supported by Linux kernel, return -1 and errno is
> >> +set to EINVAL.
> >>  .PP
> >>  .I nodemask
> >>  points to a bit mask of node IDs that contains up to
> >> @@ -293,6 +299,9 @@ argument specified both
> >
> > Should this be expanded more to clarify it applies to MPOL_BIND
> > specifically?
> >
> > Maybe the first patch should be expanded more and explicitly fail if
> > MPOL_F_NUMA_BALANCING is used with anything other than MPOL_BIND?
> 
> For MPOL_PREFERRED, why could we not use NUMA balancing to migrate pages
> to the accessing local node if it is same as the preferred node? 

You could but the kernel patch does not do that by making preferred_nid
stick to the preferred node when hinting faults are trapped on that VMA.
It would have to be a separate patch coupled with a man page update. If
you wanted to go in this direction in the future, then the patch should
explicitly return an error *now* if MPOL_PREFERRED is or'd with
MPOL_F_NUMA_BALANCING, so that an application that is aware of
MPOL_F_NUMA_BALANCING can detect whether support exists in the
currently running kernel.

> Even for MPOL_INTERLEAVE, if the target node is the same as the
> accessing local node, can we use NUMA balancing to migrate pages?
> 

The intent of MPOL_INTERLEAVE is to average the costs of the memory
access so the average cost across the VMA is roughly similar across the
entire range. This may be particularly important if the VMA is shared
between multiple threads that are spread out on multiple nodes. A change
in semantics there should be clearly documented.

Similarly, if you want to go in this direction, MPOL_F_NUMA_BALANCING
should be checked against MPOL_INTERLEAVE and explicitly fail now so
support can be detected at runtime.
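
For illustration, such an explicit rejection could be a small extension
of the do_set_mempolicy() hunk in patch [1/3] (a sketch of the
suggested behaviour reusing that function's "ret" and "out" label, not
what the posted patch does):

	if (flags & MPOL_F_NUMA_BALANCING) {
		/* Only MPOL_BIND is optimized so far; reject other modes
		   explicitly so applications can detect support per mode. */
		if (new && new->mode == MPOL_BIND) {
			new->flags |= (MPOL_F_MOF | MPOL_F_MORON);
		} else {
			ret = -EINVAL;
			mpol_put(new);
			goto out;
		}
	}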

> So, I prefer to make MPOL_F_NUMA_BALANCING to be
> 
>   Optimizing with NUMA balancing if possible, and we may add more
>   optimization in the future.
> 

Maybe, but I think it's best that the actual behaviour of the kernel is
documented instead of desired behaviour or future planning.

-- 
Mel Gorman
SUSE Labs



* Re: [PATCH -V6 RESEND 1/3] numa balancing: Migrate on fault among multiple bound nodes
  2020-12-02 11:40   ` Mel Gorman
@ 2020-12-03 10:25     ` Peter Zijlstra
  2020-12-03 10:53       ` Mel Gorman
  2020-12-04  9:19       ` Huang, Ying
  0 siblings, 2 replies; 17+ messages in thread
From: Peter Zijlstra @ 2020-12-03 10:25 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Huang Ying, linux-mm, linux-kernel, Andrew Morton, Ingo Molnar,
	Rik van Riel, Johannes Weiner, Matthew Wilcox (Oracle),
	Dave Hansen, Andi Kleen, Michal Hocko, David Rientjes, linux-api

On Wed, Dec 02, 2020 at 11:40:54AM +0000, Mel Gorman wrote:
> On Wed, Dec 02, 2020 at 04:42:32PM +0800, Huang Ying wrote:
> > Now, NUMA balancing can only optimize the page placement among the
> > NUMA nodes if the default memory policy is used.  Because the memory
> > policy specified explicitly should take precedence.  But this seems
> > too strict in some situations.  For example, on a system with 4 NUMA
> > nodes, if the memory of an application is bound to the node 0 and 1,
> > NUMA balancing can potentially migrate the pages between the node 0
> > and 1 to reduce cross-node accessing without breaking the explicit
> > memory binding policy.
> > 
> 
> Ok, I think this part is ok and while the test case is somewhat
> superficial, it at least demonstrated that the NUMA balancing overhead
> did not offset any potential benefit
> 
> Acked-by: Mel Gorman <mgorman@suse.de>

Who do we expect to merge this, me through tip/sched/core or akpm ?



* Re: [PATCH -V6 RESEND 1/3] numa balancing: Migrate on fault among multiple bound nodes
  2020-12-03 10:25     ` Peter Zijlstra
@ 2020-12-03 10:53       ` Mel Gorman
  2020-12-04  9:19       ` Huang, Ying
  1 sibling, 0 replies; 17+ messages in thread
From: Mel Gorman @ 2020-12-03 10:53 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Huang Ying, linux-mm, linux-kernel, Andrew Morton, Ingo Molnar,
	Rik van Riel, Johannes Weiner, Matthew Wilcox (Oracle),
	Dave Hansen, Andi Kleen, Michal Hocko, David Rientjes, linux-api

On Thu, Dec 03, 2020 at 11:25:50AM +0100, Peter Zijlstra wrote:
> On Wed, Dec 02, 2020 at 11:40:54AM +0000, Mel Gorman wrote:
> > On Wed, Dec 02, 2020 at 04:42:32PM +0800, Huang Ying wrote:
> > > Now, NUMA balancing can only optimize the page placement among the
> > > NUMA nodes if the default memory policy is used.  Because the memory
> > > policy specified explicitly should take precedence.  But this seems
> > > too strict in some situations.  For example, on a system with 4 NUMA
> > > nodes, if the memory of an application is bound to the node 0 and 1,
> > > NUMA balancing can potentially migrate the pages between the node 0
> > > and 1 to reduce cross-node accessing without breaking the explicit
> > > memory binding policy.
> > > 
> > 
> > Ok, I think this part is ok and while the test case is somewhat
> > superficial, it at least demonstrated that the NUMA balancing overhead
> > did not offset any potential benefit
> > 
> > Acked-by: Mel Gorman <mgorman@suse.de>
> 
> Who do we expect to merge this, me through tip/sched/core or akpm ?

I would expect akpm; it's much more on the mm side because it affects
the semantics of memory policies.  It should also have more mm-orientated
review than just mine, because it affects user-visible semantics, and the
ability to detect whether the feature is available or not needs to be
treated with care.

-- 
Mel Gorman
SUSE Labs



* Re: [PATCH -V6 RESEND 1/3] numa balancing: Migrate on fault among multiple bound nodes
  2020-12-03 10:25     ` Peter Zijlstra
  2020-12-03 10:53       ` Mel Gorman
@ 2020-12-04  9:19       ` Huang, Ying
  2020-12-10  8:21         ` Huang, Ying
  1 sibling, 1 reply; 17+ messages in thread
From: Huang, Ying @ 2020-12-04  9:19 UTC (permalink / raw)
  To: Peter Zijlstra, Andrew Morton
  Cc: Mel Gorman, linux-mm, linux-kernel, Ingo Molnar, Rik van Riel,
	Johannes Weiner, Matthew Wilcox (Oracle),
	Dave Hansen, Andi Kleen, Michal Hocko, David Rientjes, linux-api

Peter Zijlstra <peterz@infradead.org> writes:

> On Wed, Dec 02, 2020 at 11:40:54AM +0000, Mel Gorman wrote:
>> On Wed, Dec 02, 2020 at 04:42:32PM +0800, Huang Ying wrote:
>> > Now, NUMA balancing can only optimize the page placement among the
>> > NUMA nodes if the default memory policy is used.  Because the memory
>> > policy specified explicitly should take precedence.  But this seems
>> > too strict in some situations.  For example, on a system with 4 NUMA
>> > nodes, if the memory of an application is bound to the node 0 and 1,
>> > NUMA balancing can potentially migrate the pages between the node 0
>> > and 1 to reduce cross-node accessing without breaking the explicit
>> > memory binding policy.
>> > 
>> 
>> Ok, I think this part is ok and while the test case is somewhat
>> superficial, it at least demonstrated that the NUMA balancing overhead
>> did not offset any potential benefit
>> 
>> Acked-by: Mel Gorman <mgorman@suse.de>
>
> Who do we expect to merge this, me through tip/sched/core or akpm ?

Hi, Peter,

Per my understanding, this is NUMA balancing related, so could go
through your tree.

BTW: I have just sent -V7 with some small changes per Mel's latest
comments.

Hi, Andrew,

Do you agree?

Best Regards,
Huang, Ying



* Re: [PATCH -V6 RESEND 2/3] NOT kernel/man-pages: man2/set_mempolicy.2: Add mode flag MPOL_F_NUMA_BALANCING
  2020-12-02 12:33   ` Alejandro Colomar (mailing lists; readonly)
@ 2020-12-08  8:13     ` Huang, Ying
  2020-12-18 10:21       ` Alejandro Colomar (mailing lists; readonly)
  0 siblings, 1 reply; 17+ messages in thread
From: Huang, Ying @ 2020-12-08  8:13 UTC (permalink / raw)
  To: Alejandro Colomar (mailing lists; readonly)
  Cc: Peter Zijlstra, Mel Gorman, linux-mm, linux-kernel,
	Matthew Wilcox (Oracle),
	Rafael Aquini, Andrew Morton, Ingo Molnar, Rik van Riel,
	Johannes Weiner, Dave Hansen, Andi Kleen, Michal Hocko,
	David Rientjes, linux-api

Hi, Alex,

Sorry for the late reply, I just noticed this email today.

"Alejandro Colomar (mailing lists; readonly)"
<alx.mailinglists@gmail.com> writes:

> Hi Huang Ying,
>
> Please see a few fixes below.
>
> Michael, as always, some question for you too ;)
>
> Thanks,
>
> Alex
>
> On 12/2/20 9:42 AM, Huang Ying wrote:
>> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
>> ---
>>  man2/set_mempolicy.2 | 9 +++++++++
>>  1 file changed, 9 insertions(+)
>> 
>> diff --git a/man2/set_mempolicy.2 b/man2/set_mempolicy.2
>> index 68011eecb..3754b3e12 100644
>> --- a/man2/set_mempolicy.2
>> +++ b/man2/set_mempolicy.2
>> @@ -113,6 +113,12 @@ A nonempty
>>  .I nodemask
>>  specifies node IDs that are relative to the set of
>>  node IDs allowed by the process's current cpuset.
>> +.TP
>> +.BR MPOL_F_NUMA_BALANCING " (since Linux 5.11)"
>
> I'd prefer it to be in alphabetical order (rather than just adding at
> the bottom).

That's OK for me.  But it's better to be done in another patch to
distinguish contents from pure order change?

> That way, when lists grow, it's easier to find things.
>
>> +Enable the Linux kernel NUMA balancing for the task if it is supported
>> +by kernel.
>
> I'd s/Linux kernel/kernel/ when it doesn't specifically refer to the
> Linux kernel to differentiate it from other kernels.  It only adds noise
> (IMHO).  mtk?

Sure.  Will fix this and all following comments below.  Thanks a lot for
your help!  I am new to man pages.

Best Regards,
Huang, Ying



* Re: [PATCH -V6 RESEND 1/3] numa balancing: Migrate on fault among multiple bound nodes
  2020-12-04  9:19       ` Huang, Ying
@ 2020-12-10  8:21         ` Huang, Ying
  0 siblings, 0 replies; 17+ messages in thread
From: Huang, Ying @ 2020-12-10  8:21 UTC (permalink / raw)
  To: Peter Zijlstra, Andrew Morton
  Cc: Mel Gorman, linux-mm, linux-kernel, Ingo Molnar, Rik van Riel,
	Johannes Weiner, Matthew Wilcox (Oracle),
	Dave Hansen, Andi Kleen, Michal Hocko, David Rientjes, linux-api

"Huang, Ying" <ying.huang@intel.com> writes:

> Peter Zijlstra <peterz@infradead.org> writes:
>
>> On Wed, Dec 02, 2020 at 11:40:54AM +0000, Mel Gorman wrote:
>>> On Wed, Dec 02, 2020 at 04:42:32PM +0800, Huang Ying wrote:
>>> > Now, NUMA balancing can only optimize the page placement among the
>>> > NUMA nodes if the default memory policy is used.  Because the memory
>>> > policy specified explicitly should take precedence.  But this seems
>>> > too strict in some situations.  For example, on a system with 4 NUMA
>>> > nodes, if the memory of an application is bound to the node 0 and 1,
>>> > NUMA balancing can potentially migrate the pages between the node 0
>>> > and 1 to reduce cross-node accessing without breaking the explicit
>>> > memory binding policy.
>>> > 
>>> 
>>> Ok, I think this part is ok and while the test case is somewhat
>>> superficial, it at least demonstrated that the NUMA balancing overhead
>>> did not offset any potential benefit
>>> 
>>> Acked-by: Mel Gorman <mgorman@suse.de>
>>
>> Who do we expect to merge this, me through tip/sched/core or akpm ?
>
> Hi, Peter,
>
> Per my understanding, this is NUMA balancing related, so could go
> through your tree.
>
> BTW: I have just sent -V7 with some small changes per Mel's latest
> comments.
>
> Hi, Andrew,
>
> Do you agree?

So, what's the conclusion here?  Both paths work for me.  I will update
2/3 per Alejandro Colomar's comments.  But that's for man-pages only,
not for the kernel.  So, we can merge this one into the kernel if you
think it's appropriate.

Best Regards,
Huang, Ying



* Re: [PATCH -V6 RESEND 2/3] NOT kernel/man-pages: man2/set_mempolicy.2: Add mode flag MPOL_F_NUMA_BALANCING
  2020-12-08  8:13     ` Huang, Ying
@ 2020-12-18 10:21       ` Alejandro Colomar (mailing lists; readonly)
  2020-12-21  1:31         ` Huang, Ying
  0 siblings, 1 reply; 17+ messages in thread
From: Alejandro Colomar (mailing lists; readonly) @ 2020-12-18 10:21 UTC (permalink / raw)
  To: Huang, Ying
  Cc: Peter Zijlstra, Mel Gorman, linux-mm, linux-kernel,
	Matthew Wilcox (Oracle),
	Rafael Aquini, Andrew Morton, Ingo Molnar, Rik van Riel,
	Johannes Weiner, Dave Hansen, Andi Kleen, Michal Hocko,
	David Rientjes, linux-api

Hi Huang, Ying,

Sorry I forgot to answer.
See below.

BTW, Linux 5.10 has been released recently;
is this series already merged for 5.11?
If not yet, could you just write '5.??' and we'll fix it (and add a
commit number in a comment) when we know the definitive version?

Thanks,

Alex

On 12/8/20 9:13 AM, Huang, Ying wrote:
> Hi, Alex,
> 
> Sorry for the late reply, I just noticed this email today.
> 
> "Alejandro Colomar (mailing lists; readonly)"
> <alx.mailinglists@gmail.com> writes:
> 
>> Hi Huang Ying,
>>
>> Please see a few fixes below.
>>
>> Michael, as always, some question for you too ;)
>>
>> Thanks,
>>
>> Alex
>>
>> On 12/2/20 9:42 AM, Huang Ying wrote:
>>> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
>>> ---
>>>  man2/set_mempolicy.2 | 9 +++++++++
>>>  1 file changed, 9 insertions(+)
>>>
>>> diff --git a/man2/set_mempolicy.2 b/man2/set_mempolicy.2
>>> index 68011eecb..3754b3e12 100644
>>> --- a/man2/set_mempolicy.2
>>> +++ b/man2/set_mempolicy.2
>>> @@ -113,6 +113,12 @@ A nonempty
>>>  .I nodemask
>>>  specifies node IDs that are relative to the set of
>>>  node IDs allowed by the process's current cpuset.
>>> +.TP
>>> +.BR MPOL_F_NUMA_BALANCING " (since Linux 5.11)"
>>
>> I'd prefer it to be in alphabetical order (rather than just adding at
>> the bottom).
> 
> That's OK for me.  But it's better to be done in another patch to
> distinguish contents from pure order change?

Yes, if you could do a series of 2 patches with a reordering first, it
would be great.

> 
>> That way, when lists grow, it's easier to find things.
>>
>>> +Enable the Linux kernel NUMA balancing for the task if it is supported
>>> +by kernel.
>>
>> I'd s/Linux kernel/kernel/ when it doesn't specifically refer to the
>> Linux kernel to differentiate it from other kernels.  It only adds noise
>> (IMHO).  mtk?
> 
> Sure.  Will fix this and all following comments below.  Thanks a lot for
> your help!  I am new to man pages.

Thank you!

> 
> Best Regards,
> Huang, Ying
> 



* Re: [PATCH -V6 RESEND 2/3] NOT kernel/man-pages: man2/set_mempolicy.2: Add mode flag MPOL_F_NUMA_BALANCING
  2020-12-18 10:21       ` Alejandro Colomar (mailing lists; readonly)
@ 2020-12-21  1:31         ` Huang, Ying
  0 siblings, 0 replies; 17+ messages in thread
From: Huang, Ying @ 2020-12-21  1:31 UTC (permalink / raw)
  To: Alejandro Colomar (mailing lists; readonly)
  Cc: Peter Zijlstra, Mel Gorman, linux-mm, linux-kernel,
	Matthew Wilcox (Oracle),
	Rafael Aquini, Andrew Morton, Ingo Molnar, Rik van Riel,
	Johannes Weiner, Dave Hansen, Andi Kleen, Michal Hocko,
	David Rientjes, linux-api

"Alejandro Colomar (mailing lists; readonly)"
<alx.mailinglists@gmail.com> writes:

> Hi Huang, Ying,
>
> Sorry I forgot to answer.
> See below.
>
> BTW, Linux 5.10 has been released recently;
> is this series already merged for 5.11?
> If not yet, could you just write '5.??' and we'll fix it (and add a
> commit number in a comment) when we know the definitive version?

Sure.  Will replace it with 5.12.  Thanks for the reminder!

Best Regards,
Huang, Ying

> Thanks,
>
> Alex
>
> On 12/8/20 9:13 AM, Huang, Ying wrote:
>> Hi, Alex,
>> 
>> Sorry for the late reply, I just noticed this email today.
>> 
>> "Alejandro Colomar (mailing lists; readonly)"
>> <alx.mailinglists@gmail.com> writes:
>> 
>>> Hi Huang Ying,
>>>
>>> Please see a few fixes below.
>>>
>>> Michael, as always, some question for you too ;)
>>>
>>> Thanks,
>>>
>>> Alex
>>>
>>> On 12/2/20 9:42 AM, Huang Ying wrote:
>>>> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
>>>> ---
>>>>  man2/set_mempolicy.2 | 9 +++++++++
>>>>  1 file changed, 9 insertions(+)
>>>>
>>>> diff --git a/man2/set_mempolicy.2 b/man2/set_mempolicy.2
>>>> index 68011eecb..3754b3e12 100644
>>>> --- a/man2/set_mempolicy.2
>>>> +++ b/man2/set_mempolicy.2
>>>> @@ -113,6 +113,12 @@ A nonempty
>>>>  .I nodemask
>>>>  specifies node IDs that are relative to the set of
>>>>  node IDs allowed by the process's current cpuset.
>>>> +.TP
>>>> +.BR MPOL_F_NUMA_BALANCING " (since Linux 5.11)"
>>>
>>> I'd prefer it to be in alphabetical order (rather than just adding at
>>> the bottom).
>> 
>> That's OK for me.  But it's better to be done in another patch to
>> distinguish contents from pure order change?
>
> Yes, if you could do a series of 2 patches with a reordering first, it
> would be great.
>
>> 
>>> That way, when lists grow, it's easier to find things.
>>>
>>>> +Enable the Linux kernel NUMA balancing for the task if it is supported
>>>> +by kernel.
>>>
>>> I'd s/Linux kernel/kernel/ when it doesn't specifically refer to the
>>> Linux kernel to differentiate it from other kernels.  It only adds noise
>>> (IMHO).  mtk?
>> 
>> Sure.  Will fix this and all following comments below.  Thanks a lot for
>> your help!  I am new to man pages.
>
> Thank you!
>
>> 
>> Best Regards,
>> Huang, Ying
>> 

