All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH RFC v2 0/4] some fixes and clean up for mempolicy
@ 2017-10-27 10:14 ` Yisheng Xie
  0 siblings, 0 replies; 61+ messages in thread
From: Yisheng Xie @ 2017-10-27 10:14 UTC (permalink / raw)
  To: akpm, vbabka, mhocko, mingo, rientjes, n-horiguchi, salls
  Cc: linux-mm, linux-kernel, tanxiaojun, linux-api

This patchset is triggered by Xiojun's report of ltp test fail[1],
and I have sent a patch to resolve it by check nodes_empty of new_nodes[2].

The new version is to follow Vlastimil's suggestion, which fix by checking
the new_nodes value in function get_nodes. And I just split them to small
patches for easy to review and discussion. For more detail, please look
into each patches.

Change logs of v2:
 * fix get_nodes's mask miscalculation
 * remove redundant check in get_nodes
 * fix the check of nodemask from user - per Vlastimil

Any comment and complain is welome.

Thanks
Yisheng Xie

[1] https://patchwork.kernel.org/patch/10012005/
[2] https://patchwork.kernel.org/patch/10013329/

Yisheng Xie (4):
  mm/mempolicy: Fix get_nodes() mask miscalculation
  mm/mempolicy: remove redundant check in get_nodes
  mm/mempolicy: fix the check of nodemask from user
  mm/mempolicy: add nodes_empty check in SYSC_migrate_pages

 mm/mempolicy.c | 31 +++++++++++++++++++++++++------
 1 file changed, 25 insertions(+), 6 deletions(-)

-- 
1.7.12.4

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH RFC v2 0/4] some fixes and clean up for mempolicy
@ 2017-10-27 10:14 ` Yisheng Xie
  0 siblings, 0 replies; 61+ messages in thread
From: Yisheng Xie @ 2017-10-27 10:14 UTC (permalink / raw)
  To: akpm, vbabka, mhocko, mingo, rientjes, n-horiguchi, salls
  Cc: linux-mm, linux-kernel, tanxiaojun, linux-api

This patchset is triggered by Xiojun's report of ltp test fail[1],
and I have sent a patch to resolve it by check nodes_empty of new_nodes[2].

The new version is to follow Vlastimil's suggestion, which fix by checking
the new_nodes value in function get_nodes. And I just split them to small
patches for easy to review and discussion. For more detail, please look
into each patches.

Change logs of v2:
 * fix get_nodes's mask miscalculation
 * remove redundant check in get_nodes
 * fix the check of nodemask from user - per Vlastimil

Any comment and complain is welome.

Thanks
Yisheng Xie

[1] https://patchwork.kernel.org/patch/10012005/
[2] https://patchwork.kernel.org/patch/10013329/

Yisheng Xie (4):
  mm/mempolicy: Fix get_nodes() mask miscalculation
  mm/mempolicy: remove redundant check in get_nodes
  mm/mempolicy: fix the check of nodemask from user
  mm/mempolicy: add nodes_empty check in SYSC_migrate_pages

 mm/mempolicy.c | 31 +++++++++++++++++++++++++------
 1 file changed, 25 insertions(+), 6 deletions(-)

-- 
1.7.12.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH RFC v2 1/4] mm/mempolicy: Fix get_nodes() mask miscalculation
@ 2017-10-27 10:14   ` Yisheng Xie
  0 siblings, 0 replies; 61+ messages in thread
From: Yisheng Xie @ 2017-10-27 10:14 UTC (permalink / raw)
  To: akpm, vbabka, mhocko, mingo, rientjes, n-horiguchi, salls
  Cc: linux-mm, linux-kernel, tanxiaojun, linux-api

It appears there is a nodemask miscalculation in the get_nodes()
function in mm/mempolicy.c.  This bug has two effects:

1. It is impossible to specify a length 1 nodemask.
2. It is impossible to specify a nodemask containing the last node.

Brent have submmit a patch before v2.6.12, however, Andi revert his
changed for ABI problem. I just resent this patch as RFC, for do not
clear about what's the problem Andi have met.

As manpage of set_mempolicy, If the value of maxnode is zero, the
nodemask argument is ignored. but we should not ignore the nodemask
when maxnode is 1.

Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
---
 mm/mempolicy.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index a2af6d5..613e9d0 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1265,7 +1265,6 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
 	unsigned long nlongs;
 	unsigned long endmask;
 
-	--maxnode;
 	nodes_clear(*nodes);
 	if (maxnode == 0 || !nmask)
 		return 0;
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH RFC v2 1/4] mm/mempolicy: Fix get_nodes() mask miscalculation
@ 2017-10-27 10:14   ` Yisheng Xie
  0 siblings, 0 replies; 61+ messages in thread
From: Yisheng Xie @ 2017-10-27 10:14 UTC (permalink / raw)
  To: akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, vbabka-AlSwsSmVLrQ,
	mhocko-IBi9RG/b67k, mingo-DgEjT+Ai2ygdnm+yROfE0A,
	rientjes-hpIqsD4AKlfQT0dZR+AlfA,
	n-horiguchi-PaJj6Psr51x8UrSeD/g0lQ, salls-b3bnyZ7c9ISVc3sceRu5cw
  Cc: linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	tanxiaojun-hv44wF8Li93QT0dZR+AlfA,
	linux-api-u79uwXL29TY76Z2rM5mHXA

It appears there is a nodemask miscalculation in the get_nodes()
function in mm/mempolicy.c.  This bug has two effects:

1. It is impossible to specify a length 1 nodemask.
2. It is impossible to specify a nodemask containing the last node.

Brent have submmit a patch before v2.6.12, however, Andi revert his
changed for ABI problem. I just resent this patch as RFC, for do not
clear about what's the problem Andi have met.

As manpage of set_mempolicy, If the value of maxnode is zero, the
nodemask argument is ignored. but we should not ignore the nodemask
when maxnode is 1.

Signed-off-by: Yisheng Xie <xieyisheng1-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
---
 mm/mempolicy.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index a2af6d5..613e9d0 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1265,7 +1265,6 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
 	unsigned long nlongs;
 	unsigned long endmask;
 
-	--maxnode;
 	nodes_clear(*nodes);
 	if (maxnode == 0 || !nmask)
 		return 0;
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH RFC v2 1/4] mm/mempolicy: Fix get_nodes() mask miscalculation
@ 2017-10-27 10:14   ` Yisheng Xie
  0 siblings, 0 replies; 61+ messages in thread
From: Yisheng Xie @ 2017-10-27 10:14 UTC (permalink / raw)
  To: akpm, vbabka, mhocko, mingo, rientjes, n-horiguchi, salls
  Cc: linux-mm, linux-kernel, tanxiaojun, linux-api

It appears there is a nodemask miscalculation in the get_nodes()
function in mm/mempolicy.c.  This bug has two effects:

1. It is impossible to specify a length 1 nodemask.
2. It is impossible to specify a nodemask containing the last node.

Brent have submmit a patch before v2.6.12, however, Andi revert his
changed for ABI problem. I just resent this patch as RFC, for do not
clear about what's the problem Andi have met.

As manpage of set_mempolicy, If the value of maxnode is zero, the
nodemask argument is ignored. but we should not ignore the nodemask
when maxnode is 1.

Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
---
 mm/mempolicy.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index a2af6d5..613e9d0 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1265,7 +1265,6 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
 	unsigned long nlongs;
 	unsigned long endmask;
 
-	--maxnode;
 	nodes_clear(*nodes);
 	if (maxnode == 0 || !nmask)
 		return 0;
-- 
1.7.12.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH RFC v2 2/4] mm/mempolicy: remove redundant check in get_nodes
@ 2017-10-27 10:14   ` Yisheng Xie
  0 siblings, 0 replies; 61+ messages in thread
From: Yisheng Xie @ 2017-10-27 10:14 UTC (permalink / raw)
  To: akpm, vbabka, mhocko, mingo, rientjes, n-horiguchi, salls
  Cc: linux-mm, linux-kernel, tanxiaojun, linux-api

We have already checked whether maxnode is a page worth of bits, by:
    maxnode > PAGE_SIZE*BITS_PER_BYTE

So no need to check it once more.

Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
---
 mm/mempolicy.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 613e9d0..3b51bb3 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1280,8 +1280,6 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
 	/* When the user specified more nodes than supported just check
 	   if the non supported part is all zero. */
 	if (nlongs > BITS_TO_LONGS(MAX_NUMNODES)) {
-		if (nlongs > PAGE_SIZE/sizeof(long))
-			return -EINVAL;
 		for (k = BITS_TO_LONGS(MAX_NUMNODES); k < nlongs; k++) {
 			unsigned long t;
 			if (get_user(t, nmask + k))
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH RFC v2 2/4] mm/mempolicy: remove redundant check in get_nodes
@ 2017-10-27 10:14   ` Yisheng Xie
  0 siblings, 0 replies; 61+ messages in thread
From: Yisheng Xie @ 2017-10-27 10:14 UTC (permalink / raw)
  To: akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, vbabka-AlSwsSmVLrQ,
	mhocko-IBi9RG/b67k, mingo-DgEjT+Ai2ygdnm+yROfE0A,
	rientjes-hpIqsD4AKlfQT0dZR+AlfA,
	n-horiguchi-PaJj6Psr51x8UrSeD/g0lQ, salls-b3bnyZ7c9ISVc3sceRu5cw
  Cc: linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	tanxiaojun-hv44wF8Li93QT0dZR+AlfA,
	linux-api-u79uwXL29TY76Z2rM5mHXA

We have already checked whether maxnode is a page worth of bits, by:
    maxnode > PAGE_SIZE*BITS_PER_BYTE

So no need to check it once more.

Signed-off-by: Yisheng Xie <xieyisheng1-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
---
 mm/mempolicy.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 613e9d0..3b51bb3 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1280,8 +1280,6 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
 	/* When the user specified more nodes than supported just check
 	   if the non supported part is all zero. */
 	if (nlongs > BITS_TO_LONGS(MAX_NUMNODES)) {
-		if (nlongs > PAGE_SIZE/sizeof(long))
-			return -EINVAL;
 		for (k = BITS_TO_LONGS(MAX_NUMNODES); k < nlongs; k++) {
 			unsigned long t;
 			if (get_user(t, nmask + k))
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH RFC v2 2/4] mm/mempolicy: remove redundant check in get_nodes
@ 2017-10-27 10:14   ` Yisheng Xie
  0 siblings, 0 replies; 61+ messages in thread
From: Yisheng Xie @ 2017-10-27 10:14 UTC (permalink / raw)
  To: akpm, vbabka, mhocko, mingo, rientjes, n-horiguchi, salls
  Cc: linux-mm, linux-kernel, tanxiaojun, linux-api

We have already checked whether maxnode is a page worth of bits, by:
    maxnode > PAGE_SIZE*BITS_PER_BYTE

So no need to check it once more.

Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
---
 mm/mempolicy.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 613e9d0..3b51bb3 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1280,8 +1280,6 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
 	/* When the user specified more nodes than supported just check
 	   if the non supported part is all zero. */
 	if (nlongs > BITS_TO_LONGS(MAX_NUMNODES)) {
-		if (nlongs > PAGE_SIZE/sizeof(long))
-			return -EINVAL;
 		for (k = BITS_TO_LONGS(MAX_NUMNODES); k < nlongs; k++) {
 			unsigned long t;
 			if (get_user(t, nmask + k))
-- 
1.7.12.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH RFC v2 3/4] mm/mempolicy: fix the check of nodemask from user
  2017-10-27 10:14 ` Yisheng Xie
@ 2017-10-27 10:14   ` Yisheng Xie
  -1 siblings, 0 replies; 61+ messages in thread
From: Yisheng Xie @ 2017-10-27 10:14 UTC (permalink / raw)
  To: akpm, vbabka, mhocko, mingo, rientjes, n-horiguchi, salls
  Cc: linux-mm, linux-kernel, tanxiaojun, linux-api

As Xiaojun reported the ltp of migrate_pages01 will failed on ARCH arm64
system which has 4 nodes[0...3], all have memory and CONFIG_NODES_SHIFT=2:

migrate_pages01    0  TINFO  :  test_invalid_nodes
migrate_pages01   14  TFAIL  :  migrate_pages_common.c:45: unexpected failure - returned value = 0, expected: -1
migrate_pages01   15  TFAIL  :  migrate_pages_common.c:55: call succeeded unexpectedly

In this case the test_invalid_nodes of migrate_pages01 will call:
SYSC_migrate_pages as:

migrate_pages(0, , {0x0000000000000001}, 64, , {0x0000000000000010}, 64) = 0

The new nodes specifies one or more node IDs that are greater than the
maximum supported node ID, however, the errno is not set to EINVAL as
expected.

As man pages of set_mempolicy[1], mbind[2], and migrate_pages[3] memtioned,
when nodemask specifies one or more node IDs that are greater than the
maximum supported node ID, the errno should set to EINVAL. However, get_nodes
only check whether the part of bits [BITS_PER_LONG*BITS_TO_LONGS(MAX_NUMNODES),
maxnode) is zero or not,  and remain [MAX_NUMNODES, BITS_PER_LONG*BITS_TO_LONGS(MAX_NUMNODES)
unchecked.

This patch is to check the bits of [MAX_NUMNODES, maxnode) in get_nodes to
let migrate_pages set the errno to EINVAL when nodemask specifies one or
more node IDs that are greater than the maximum supported node ID, which
follows the manpage's guide.

[1] http://man7.org/linux/man-pages/man2/set_mempolicy.2.html
[2] http://man7.org/linux/man-pages/man2/mbind.2.html
[3] http://man7.org/linux/man-pages/man2/migrate_pages.2.html

Reported-by: Tan Xiaojun <tanxiaojun@huawei.com>
Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
---
 mm/mempolicy.c | 23 ++++++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 3b51bb3..8798ecb 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1262,6 +1262,7 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
 		     unsigned long maxnode)
 {
 	unsigned long k;
+	unsigned long t;
 	unsigned long nlongs;
 	unsigned long endmask;
 
@@ -1277,11 +1278,17 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
 	else
 		endmask = (1UL << (maxnode % BITS_PER_LONG)) - 1;
 
-	/* When the user specified more nodes than supported just check
-	   if the non supported part is all zero. */
+	/*
+	 * When the user specified more nodes than supported just check
+	 * if the non supported part is all zero.
+	 *
+	 * If maxnode have more longs than MAX_NUMNODES, check
+	 * the bits in that area first. And then go through to
+	 * check the rest bits which equal or bigger than MAX_NUMNODES.
+	 * Otherwise, just check bits [MAX_NUMNODES, maxnode).
+	 */
 	if (nlongs > BITS_TO_LONGS(MAX_NUMNODES)) {
 		for (k = BITS_TO_LONGS(MAX_NUMNODES); k < nlongs; k++) {
-			unsigned long t;
 			if (get_user(t, nmask + k))
 				return -EFAULT;
 			if (k == nlongs - 1) {
@@ -1294,6 +1301,16 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
 		endmask = ~0UL;
 	}
 
+	if (maxnode > MAX_NUMNODES && MAX_NUMNODES % BITS_PER_LONG != 0) {
+		unsigned long valid_mask = endmask;
+
+		valid_mask &= ~((1UL << (MAX_NUMNODES % BITS_PER_LONG)) - 1);
+		if (get_user(t, nmask + nlongs - 1))
+			return -EFAULT;
+		if (t & valid_mask)
+			return -EINVAL;
+	}
+
 	if (copy_from_user(nodes_addr(*nodes), nmask, nlongs*sizeof(unsigned long)))
 		return -EFAULT;
 	nodes_addr(*nodes)[nlongs-1] &= endmask;
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH RFC v2 3/4] mm/mempolicy: fix the check of nodemask from user
@ 2017-10-27 10:14   ` Yisheng Xie
  0 siblings, 0 replies; 61+ messages in thread
From: Yisheng Xie @ 2017-10-27 10:14 UTC (permalink / raw)
  To: akpm, vbabka, mhocko, mingo, rientjes, n-horiguchi, salls
  Cc: linux-mm, linux-kernel, tanxiaojun, linux-api

As Xiaojun reported the ltp of migrate_pages01 will failed on ARCH arm64
system which has 4 nodes[0...3], all have memory and CONFIG_NODES_SHIFT=2:

migrate_pages01    0  TINFO  :  test_invalid_nodes
migrate_pages01   14  TFAIL  :  migrate_pages_common.c:45: unexpected failure - returned value = 0, expected: -1
migrate_pages01   15  TFAIL  :  migrate_pages_common.c:55: call succeeded unexpectedly

In this case the test_invalid_nodes of migrate_pages01 will call:
SYSC_migrate_pages as:

migrate_pages(0, , {0x0000000000000001}, 64, , {0x0000000000000010}, 64) = 0

The new nodes specifies one or more node IDs that are greater than the
maximum supported node ID, however, the errno is not set to EINVAL as
expected.

As man pages of set_mempolicy[1], mbind[2], and migrate_pages[3] memtioned,
when nodemask specifies one or more node IDs that are greater than the
maximum supported node ID, the errno should set to EINVAL. However, get_nodes
only check whether the part of bits [BITS_PER_LONG*BITS_TO_LONGS(MAX_NUMNODES),
maxnode) is zero or not,  and remain [MAX_NUMNODES, BITS_PER_LONG*BITS_TO_LONGS(MAX_NUMNODES)
unchecked.

This patch is to check the bits of [MAX_NUMNODES, maxnode) in get_nodes to
let migrate_pages set the errno to EINVAL when nodemask specifies one or
more node IDs that are greater than the maximum supported node ID, which
follows the manpage's guide.

[1] http://man7.org/linux/man-pages/man2/set_mempolicy.2.html
[2] http://man7.org/linux/man-pages/man2/mbind.2.html
[3] http://man7.org/linux/man-pages/man2/migrate_pages.2.html

Reported-by: Tan Xiaojun <tanxiaojun@huawei.com>
Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
---
 mm/mempolicy.c | 23 ++++++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 3b51bb3..8798ecb 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1262,6 +1262,7 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
 		     unsigned long maxnode)
 {
 	unsigned long k;
+	unsigned long t;
 	unsigned long nlongs;
 	unsigned long endmask;
 
@@ -1277,11 +1278,17 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
 	else
 		endmask = (1UL << (maxnode % BITS_PER_LONG)) - 1;
 
-	/* When the user specified more nodes than supported just check
-	   if the non supported part is all zero. */
+	/*
+	 * When the user specified more nodes than supported just check
+	 * if the non supported part is all zero.
+	 *
+	 * If maxnode have more longs than MAX_NUMNODES, check
+	 * the bits in that area first. And then go through to
+	 * check the rest bits which equal or bigger than MAX_NUMNODES.
+	 * Otherwise, just check bits [MAX_NUMNODES, maxnode).
+	 */
 	if (nlongs > BITS_TO_LONGS(MAX_NUMNODES)) {
 		for (k = BITS_TO_LONGS(MAX_NUMNODES); k < nlongs; k++) {
-			unsigned long t;
 			if (get_user(t, nmask + k))
 				return -EFAULT;
 			if (k == nlongs - 1) {
@@ -1294,6 +1301,16 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
 		endmask = ~0UL;
 	}
 
+	if (maxnode > MAX_NUMNODES && MAX_NUMNODES % BITS_PER_LONG != 0) {
+		unsigned long valid_mask = endmask;
+
+		valid_mask &= ~((1UL << (MAX_NUMNODES % BITS_PER_LONG)) - 1);
+		if (get_user(t, nmask + nlongs - 1))
+			return -EFAULT;
+		if (t & valid_mask)
+			return -EINVAL;
+	}
+
 	if (copy_from_user(nodes_addr(*nodes), nmask, nlongs*sizeof(unsigned long)))
 		return -EFAULT;
 	nodes_addr(*nodes)[nlongs-1] &= endmask;
-- 
1.7.12.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH RFC v2 4/4] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
  2017-10-27 10:14 ` Yisheng Xie
@ 2017-10-27 10:14   ` Yisheng Xie
  -1 siblings, 0 replies; 61+ messages in thread
From: Yisheng Xie @ 2017-10-27 10:14 UTC (permalink / raw)
  To: akpm, vbabka, mhocko, mingo, rientjes, n-horiguchi, salls
  Cc: linux-mm, linux-kernel, tanxiaojun, linux-api

As manpage of migrate_pages, the errno should be set to EINVAL when none
of the specified nodes contain memory. However, when new_nodes is null,
i.e. the specified nodes also do not have memory, as the following case:

	new_nodes = 0;
	old_nodes = 0xf;
	ret = migrate_pages(pid, old_nodes, new_nodes, MAX);

The ret will be 0 and no errno is set.

This patch is to add nodes_empty check to fix above case.

Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
---
 mm/mempolicy.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 8798ecb..58352cc 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1402,6 +1402,11 @@ static int copy_nodes_to_user(unsigned long __user *mask, unsigned long maxnode,
 	if (err)
 		goto out;
 
+	if (nodes_empty(*new)) {
+		err = -EINVAL;
+		goto out;
+	}
+
 	/* Find the mm_struct */
 	rcu_read_lock();
 	task = pid ? find_task_by_vpid(pid) : current;
-- 
1.7.12.4

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH RFC v2 4/4] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
@ 2017-10-27 10:14   ` Yisheng Xie
  0 siblings, 0 replies; 61+ messages in thread
From: Yisheng Xie @ 2017-10-27 10:14 UTC (permalink / raw)
  To: akpm, vbabka, mhocko, mingo, rientjes, n-horiguchi, salls
  Cc: linux-mm, linux-kernel, tanxiaojun, linux-api

As manpage of migrate_pages, the errno should be set to EINVAL when none
of the specified nodes contain memory. However, when new_nodes is null,
i.e. the specified nodes also do not have memory, as the following case:

	new_nodes = 0;
	old_nodes = 0xf;
	ret = migrate_pages(pid, old_nodes, new_nodes, MAX);

The ret will be 0 and no errno is set.

This patch is to add nodes_empty check to fix above case.

Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
---
 mm/mempolicy.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 8798ecb..58352cc 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1402,6 +1402,11 @@ static int copy_nodes_to_user(unsigned long __user *mask, unsigned long maxnode,
 	if (err)
 		goto out;
 
+	if (nodes_empty(*new)) {
+		err = -EINVAL;
+		goto out;
+	}
+
 	/* Find the mm_struct */
 	rcu_read_lock();
 	task = pid ? find_task_by_vpid(pid) : current;
-- 
1.7.12.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 1/4] mm/mempolicy: Fix get_nodes() mask miscalculation
  2017-10-27 10:14   ` Yisheng Xie
@ 2017-10-31  8:34     ` Vlastimil Babka
  -1 siblings, 0 replies; 61+ messages in thread
From: Vlastimil Babka @ 2017-10-31  8:34 UTC (permalink / raw)
  To: Yisheng Xie, akpm, mhocko, mingo, rientjes, n-horiguchi, salls
  Cc: linux-mm, linux-kernel, tanxiaojun, linux-api, Andi Kleen

On 10/27/2017 12:14 PM, Yisheng Xie wrote:
> It appears there is a nodemask miscalculation in the get_nodes()
> function in mm/mempolicy.c.  This bug has two effects:
> 
> 1. It is impossible to specify a length 1 nodemask.
> 2. It is impossible to specify a nodemask containing the last node.

This should be more specific, which syscalls are you talking about?
I assume it's set_mempolicy() and mbind() and it's the same issue that
was discussed at https://marc.info/?l=linux-mm&m=150732591909576&w=2 ?

> Brent have submmit a patch before v2.6.12, however, Andi revert his
> changed for ABI problem. I just resent this patch as RFC, for do not
> clear about what's the problem Andi have met.

You should have CC'd Andi. As was discussed in the other thread, this
would make existing programs potentially unsafe, so we can't change it.
Instead it should be documented.

> As manpage of set_mempolicy, If the value of maxnode is zero, the
> nodemask argument is ignored. but we should not ignore the nodemask
> when maxnode is 1.
> 
> Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
> ---
>  mm/mempolicy.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index a2af6d5..613e9d0 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -1265,7 +1265,6 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
>  	unsigned long nlongs;
>  	unsigned long endmask;
>  
> -	--maxnode;
>  	nodes_clear(*nodes);
>  	if (maxnode == 0 || !nmask)
>  		return 0;
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 1/4] mm/mempolicy: Fix get_nodes() mask miscalculation
@ 2017-10-31  8:34     ` Vlastimil Babka
  0 siblings, 0 replies; 61+ messages in thread
From: Vlastimil Babka @ 2017-10-31  8:34 UTC (permalink / raw)
  To: Yisheng Xie, akpm, mhocko, mingo, rientjes, n-horiguchi, salls
  Cc: linux-mm, linux-kernel, tanxiaojun, linux-api, Andi Kleen

On 10/27/2017 12:14 PM, Yisheng Xie wrote:
> It appears there is a nodemask miscalculation in the get_nodes()
> function in mm/mempolicy.c.  This bug has two effects:
> 
> 1. It is impossible to specify a length 1 nodemask.
> 2. It is impossible to specify a nodemask containing the last node.

This should be more specific, which syscalls are you talking about?
I assume it's set_mempolicy() and mbind() and it's the same issue that
was discussed at https://marc.info/?l=linux-mm&m=150732591909576&w=2 ?

> Brent have submmit a patch before v2.6.12, however, Andi revert his
> changed for ABI problem. I just resent this patch as RFC, for do not
> clear about what's the problem Andi have met.

You should have CC'd Andi. As was discussed in the other thread, this
would make existing programs potentially unsafe, so we can't change it.
Instead it should be documented.

> As manpage of set_mempolicy, If the value of maxnode is zero, the
> nodemask argument is ignored. but we should not ignore the nodemask
> when maxnode is 1.
> 
> Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
> ---
>  mm/mempolicy.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index a2af6d5..613e9d0 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -1265,7 +1265,6 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
>  	unsigned long nlongs;
>  	unsigned long endmask;
>  
> -	--maxnode;
>  	nodes_clear(*nodes);
>  	if (maxnode == 0 || !nmask)
>  		return 0;
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 2/4] mm/mempolicy: remove redundant check in get_nodes
  2017-10-27 10:14   ` Yisheng Xie
@ 2017-10-31  8:55     ` Vlastimil Babka
  -1 siblings, 0 replies; 61+ messages in thread
From: Vlastimil Babka @ 2017-10-31  8:55 UTC (permalink / raw)
  To: Yisheng Xie, akpm, mhocko, mingo, rientjes, n-horiguchi, salls
  Cc: linux-mm, linux-kernel, tanxiaojun, linux-api

On 10/27/2017 12:14 PM, Yisheng Xie wrote:
> We have already checked whether maxnode is a page worth of bits, by:
>     maxnode > PAGE_SIZE*BITS_PER_BYTE
> 
> So no need to check it once more.
> 
> Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

> ---
>  mm/mempolicy.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 613e9d0..3b51bb3 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -1280,8 +1280,6 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
>  	/* When the user specified more nodes than supported just check
>  	   if the non supported part is all zero. */
>  	if (nlongs > BITS_TO_LONGS(MAX_NUMNODES)) {
> -		if (nlongs > PAGE_SIZE/sizeof(long))
> -			return -EINVAL;
>  		for (k = BITS_TO_LONGS(MAX_NUMNODES); k < nlongs; k++) {
>  			unsigned long t;
>  			if (get_user(t, nmask + k))
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 2/4] mm/mempolicy: remove redundant check in get_nodes
@ 2017-10-31  8:55     ` Vlastimil Babka
  0 siblings, 0 replies; 61+ messages in thread
From: Vlastimil Babka @ 2017-10-31  8:55 UTC (permalink / raw)
  To: Yisheng Xie, akpm, mhocko, mingo, rientjes, n-horiguchi, salls
  Cc: linux-mm, linux-kernel, tanxiaojun, linux-api

On 10/27/2017 12:14 PM, Yisheng Xie wrote:
> We have already checked whether maxnode is a page worth of bits, by:
>     maxnode > PAGE_SIZE*BITS_PER_BYTE
> 
> So no need to check it once more.
> 
> Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

> ---
>  mm/mempolicy.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 613e9d0..3b51bb3 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -1280,8 +1280,6 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
>  	/* When the user specified more nodes than supported just check
>  	   if the non supported part is all zero. */
>  	if (nlongs > BITS_TO_LONGS(MAX_NUMNODES)) {
> -		if (nlongs > PAGE_SIZE/sizeof(long))
> -			return -EINVAL;
>  		for (k = BITS_TO_LONGS(MAX_NUMNODES); k < nlongs; k++) {
>  			unsigned long t;
>  			if (get_user(t, nmask + k))
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 3/4] mm/mempolicy: fix the check of nodemask from user
  2017-10-27 10:14   ` Yisheng Xie
@ 2017-10-31  9:30     ` Vlastimil Babka
  -1 siblings, 0 replies; 61+ messages in thread
From: Vlastimil Babka @ 2017-10-31  9:30 UTC (permalink / raw)
  To: Yisheng Xie, akpm, mhocko, mingo, rientjes, n-horiguchi, salls
  Cc: linux-mm, linux-kernel, tanxiaojun, linux-api

On 10/27/2017 12:14 PM, Yisheng Xie wrote:
> As Xiaojun reported the ltp of migrate_pages01 will failed on ARCH arm64
> system which has 4 nodes[0...3], all have memory and CONFIG_NODES_SHIFT=2:
> 
> migrate_pages01    0  TINFO  :  test_invalid_nodes
> migrate_pages01   14  TFAIL  :  migrate_pages_common.c:45: unexpected failure - returned value = 0, expected: -1
> migrate_pages01   15  TFAIL  :  migrate_pages_common.c:55: call succeeded unexpectedly
> 
> In this case the test_invalid_nodes of migrate_pages01 will call:
> SYSC_migrate_pages as:
> 
> migrate_pages(0, , {0x0000000000000001}, 64, , {0x0000000000000010}, 64) = 0
> 
> The new nodes specifies one or more node IDs that are greater than the
> maximum supported node ID, however, the errno is not set to EINVAL as
> expected.
> 
> As man pages of set_mempolicy[1], mbind[2], and migrate_pages[3] memtioned,
> when nodemask specifies one or more node IDs that are greater than the
> maximum supported node ID, the errno should set to EINVAL. However, get_nodes
> only check whether the part of bits [BITS_PER_LONG*BITS_TO_LONGS(MAX_NUMNODES),
> maxnode) is zero or not,  and remain [MAX_NUMNODES, BITS_PER_LONG*BITS_TO_LONGS(MAX_NUMNODES)
> unchecked.
> 
> This patch is to check the bits of [MAX_NUMNODES, maxnode) in get_nodes to
> let migrate_pages set the errno to EINVAL when nodemask specifies one or
> more node IDs that are greater than the maximum supported node ID, which
> follows the manpage's guide.
> 
> [1] http://man7.org/linux/man-pages/man2/set_mempolicy.2.html
> [2] http://man7.org/linux/man-pages/man2/mbind.2.html
> [3] http://man7.org/linux/man-pages/man2/migrate_pages.2.html
> 
> Reported-by: Tan Xiaojun <tanxiaojun@huawei.com>
> Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
> ---
>  mm/mempolicy.c | 23 ++++++++++++++++++++---
>  1 file changed, 20 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 3b51bb3..8798ecb 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -1262,6 +1262,7 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
>  		     unsigned long maxnode)
>  {
>  	unsigned long k;
> +	unsigned long t;
>  	unsigned long nlongs;
>  	unsigned long endmask;
>  
> @@ -1277,11 +1278,17 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
>  	else
>  		endmask = (1UL << (maxnode % BITS_PER_LONG)) - 1;
>  
> -	/* When the user specified more nodes than supported just check
> -	   if the non supported part is all zero. */
> +	/*
> +	 * When the user specified more nodes than supported just check
> +	 * if the non supported part is all zero.
> +	 *
> +	 * If maxnode have more longs than MAX_NUMNODES, check
> +	 * the bits in that area first. And then go through to
> +	 * check the rest bits which equal or bigger than MAX_NUMNODES.
> +	 * Otherwise, just check bits [MAX_NUMNODES, maxnode).
> +	 */
>  	if (nlongs > BITS_TO_LONGS(MAX_NUMNODES)) {
>  		for (k = BITS_TO_LONGS(MAX_NUMNODES); k < nlongs; k++) {
> -			unsigned long t;
>  			if (get_user(t, nmask + k))
>  				return -EFAULT;
>  			if (k == nlongs - 1) {
> @@ -1294,6 +1301,16 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
>  		endmask = ~0UL;
>  	}
>  
> +	if (maxnode > MAX_NUMNODES && MAX_NUMNODES % BITS_PER_LONG != 0) {
> +		unsigned long valid_mask = endmask;
> +
> +		valid_mask &= ~((1UL << (MAX_NUMNODES % BITS_PER_LONG)) - 1);

I'm not sure if the combination with endmask works in this case:

0      BITS_PER_LONG  2xBITS_PER_LONG
|____________|____________|
       |             |
  MAX_NUMNODES      maxnode

endmask will contain bits between 0 and maxnode
but here we want to check bits between MAX_NUMNODES and BITS_PER_LONG
and endmask should not be mixed up with that?


Vlastimil

> +		if (get_user(t, nmask + nlongs - 1))
> +			return -EFAULT;
> +		if (t & valid_mask)
> +			return -EINVAL;
> +	}
> +
>  	if (copy_from_user(nodes_addr(*nodes), nmask, nlongs*sizeof(unsigned long)))
>  		return -EFAULT;
>  	nodes_addr(*nodes)[nlongs-1] &= endmask;
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 3/4] mm/mempolicy: fix the check of nodemask from user
@ 2017-10-31  9:30     ` Vlastimil Babka
  0 siblings, 0 replies; 61+ messages in thread
From: Vlastimil Babka @ 2017-10-31  9:30 UTC (permalink / raw)
  To: Yisheng Xie, akpm, mhocko, mingo, rientjes, n-horiguchi, salls
  Cc: linux-mm, linux-kernel, tanxiaojun, linux-api

On 10/27/2017 12:14 PM, Yisheng Xie wrote:
> As Xiaojun reported the ltp of migrate_pages01 will failed on ARCH arm64
> system which has 4 nodes[0...3], all have memory and CONFIG_NODES_SHIFT=2:
> 
> migrate_pages01    0  TINFO  :  test_invalid_nodes
> migrate_pages01   14  TFAIL  :  migrate_pages_common.c:45: unexpected failure - returned value = 0, expected: -1
> migrate_pages01   15  TFAIL  :  migrate_pages_common.c:55: call succeeded unexpectedly
> 
> In this case the test_invalid_nodes of migrate_pages01 will call:
> SYSC_migrate_pages as:
> 
> migrate_pages(0, , {0x0000000000000001}, 64, , {0x0000000000000010}, 64) = 0
> 
> The new nodes specifies one or more node IDs that are greater than the
> maximum supported node ID, however, the errno is not set to EINVAL as
> expected.
> 
> As man pages of set_mempolicy[1], mbind[2], and migrate_pages[3] memtioned,
> when nodemask specifies one or more node IDs that are greater than the
> maximum supported node ID, the errno should set to EINVAL. However, get_nodes
> only check whether the part of bits [BITS_PER_LONG*BITS_TO_LONGS(MAX_NUMNODES),
> maxnode) is zero or not,  and remain [MAX_NUMNODES, BITS_PER_LONG*BITS_TO_LONGS(MAX_NUMNODES)
> unchecked.
> 
> This patch is to check the bits of [MAX_NUMNODES, maxnode) in get_nodes to
> let migrate_pages set the errno to EINVAL when nodemask specifies one or
> more node IDs that are greater than the maximum supported node ID, which
> follows the manpage's guide.
> 
> [1] http://man7.org/linux/man-pages/man2/set_mempolicy.2.html
> [2] http://man7.org/linux/man-pages/man2/mbind.2.html
> [3] http://man7.org/linux/man-pages/man2/migrate_pages.2.html
> 
> Reported-by: Tan Xiaojun <tanxiaojun@huawei.com>
> Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
> ---
>  mm/mempolicy.c | 23 ++++++++++++++++++++---
>  1 file changed, 20 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 3b51bb3..8798ecb 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -1262,6 +1262,7 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
>  		     unsigned long maxnode)
>  {
>  	unsigned long k;
> +	unsigned long t;
>  	unsigned long nlongs;
>  	unsigned long endmask;
>  
> @@ -1277,11 +1278,17 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
>  	else
>  		endmask = (1UL << (maxnode % BITS_PER_LONG)) - 1;
>  
> -	/* When the user specified more nodes than supported just check
> -	   if the non supported part is all zero. */
> +	/*
> +	 * When the user specified more nodes than supported just check
> +	 * if the non supported part is all zero.
> +	 *
> +	 * If maxnode have more longs than MAX_NUMNODES, check
> +	 * the bits in that area first. And then go through to
> +	 * check the rest bits which equal or bigger than MAX_NUMNODES.
> +	 * Otherwise, just check bits [MAX_NUMNODES, maxnode).
> +	 */
>  	if (nlongs > BITS_TO_LONGS(MAX_NUMNODES)) {
>  		for (k = BITS_TO_LONGS(MAX_NUMNODES); k < nlongs; k++) {
> -			unsigned long t;
>  			if (get_user(t, nmask + k))
>  				return -EFAULT;
>  			if (k == nlongs - 1) {
> @@ -1294,6 +1301,16 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
>  		endmask = ~0UL;
>  	}
>  
> +	if (maxnode > MAX_NUMNODES && MAX_NUMNODES % BITS_PER_LONG != 0) {
> +		unsigned long valid_mask = endmask;
> +
> +		valid_mask &= ~((1UL << (MAX_NUMNODES % BITS_PER_LONG)) - 1);

I'm not sure if the combination with endmask works in this case:

0      BITS_PER_LONG  2xBITS_PER_LONG
|____________|____________|
       |             |
  MAX_NUMNODES      maxnode

endmask will contain bits between 0 and maxnode
but here we want to check bits between MAX_NUMNODES and BITS_PER_LONG
and endmask should not be mixed up with that?


Vlastimil

> +		if (get_user(t, nmask + nlongs - 1))
> +			return -EFAULT;
> +		if (t & valid_mask)
> +			return -EINVAL;
> +	}
> +
>  	if (copy_from_user(nodes_addr(*nodes), nmask, nlongs*sizeof(unsigned long)))
>  		return -EFAULT;
>  	nodes_addr(*nodes)[nlongs-1] &= endmask;
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 4/4] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
@ 2017-10-31  9:46     ` Vlastimil Babka
  0 siblings, 0 replies; 61+ messages in thread
From: Vlastimil Babka @ 2017-10-31  9:46 UTC (permalink / raw)
  To: Yisheng Xie, akpm, mhocko, mingo, rientjes, n-horiguchi, salls
  Cc: linux-mm, linux-kernel, tanxiaojun, linux-api, Andi Kleen,
	Christoph Lameter

+CC Andi and Christoph

On 10/27/2017 12:14 PM, Yisheng Xie wrote:
> As manpage of migrate_pages, the errno should be set to EINVAL when none
> of the specified nodes contain memory. However, when new_nodes is null,
> i.e. the specified nodes also do not have memory, as the following case:
> 
> 	new_nodes = 0;
> 	old_nodes = 0xf;
> 	ret = migrate_pages(pid, old_nodes, new_nodes, MAX);
> 
> The ret will be 0 and no errno is set.
> 
> This patch is to add nodes_empty check to fix above case.

Hmm, I think we have a bigger problem than "empty set is a subset of
anything" here.

The existing checks are:

        task_nodes = cpuset_mems_allowed(task);
        /* Is the user allowed to access the target nodes? */
        if (!nodes_subset(*new, task_nodes) && !capable(CAP_SYS_NICE)) {
                err = -EPERM;
                goto out_put;
        }

        if (!nodes_subset(*new, node_states[N_MEMORY])) {
                err = -EINVAL;
                goto out_put;
        }

And manpage says:

       EINVAL The value specified by maxnode exceeds a kernel-imposed
limit.  Or, old_nodes or new_nodes specifies one or more node IDs that
are greater than the maximum supported node
              ID.  *Or, none of the node IDs specified by new_nodes are
on-line and allowed by the process's current cpuset context, or none of
the specified nodes contain memory.*

       EPERM  Insufficient privilege (CAP_SYS_NICE) to move pages of the
process specified by pid, or insufficient privilege (CAP_SYS_NICE) to
access the specified target nodes.

- it says "none ... are allowed", but checking for subset means we check
if "all ... are allowed". Shouldn't we be checking for a non-empty
intersection?
- there doesn't seem to be any EINVAL check for "process's current
cpuset context", there's just an EPERM check for "target process's
cpuset context".

> 
> Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
> ---
>  mm/mempolicy.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 8798ecb..58352cc 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -1402,6 +1402,11 @@ static int copy_nodes_to_user(unsigned long __user *mask, unsigned long maxnode,
>  	if (err)
>  		goto out;
>  
> +	if (nodes_empty(*new)) {
> +		err = -EINVAL;
> +		goto out;
> +	}
> +
>  	/* Find the mm_struct */
>  	rcu_read_lock();
>  	task = pid ? find_task_by_vpid(pid) : current;
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 4/4] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
@ 2017-10-31  9:46     ` Vlastimil Babka
  0 siblings, 0 replies; 61+ messages in thread
From: Vlastimil Babka @ 2017-10-31  9:46 UTC (permalink / raw)
  To: Yisheng Xie, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	mhocko-IBi9RG/b67k, mingo-DgEjT+Ai2ygdnm+yROfE0A,
	rientjes-hpIqsD4AKlfQT0dZR+AlfA,
	n-horiguchi-PaJj6Psr51x8UrSeD/g0lQ, salls-b3bnyZ7c9ISVc3sceRu5cw
  Cc: linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	tanxiaojun-hv44wF8Li93QT0dZR+AlfA,
	linux-api-u79uwXL29TY76Z2rM5mHXA, Andi Kleen, Christoph Lameter

+CC Andi and Christoph

On 10/27/2017 12:14 PM, Yisheng Xie wrote:
> As manpage of migrate_pages, the errno should be set to EINVAL when none
> of the specified nodes contain memory. However, when new_nodes is null,
> i.e. the specified nodes also do not have memory, as the following case:
> 
> 	new_nodes = 0;
> 	old_nodes = 0xf;
> 	ret = migrate_pages(pid, old_nodes, new_nodes, MAX);
> 
> The ret will be 0 and no errno is set.
> 
> This patch is to add nodes_empty check to fix above case.

Hmm, I think we have a bigger problem than "empty set is a subset of
anything" here.

The existing checks are:

        task_nodes = cpuset_mems_allowed(task);
        /* Is the user allowed to access the target nodes? */
        if (!nodes_subset(*new, task_nodes) && !capable(CAP_SYS_NICE)) {
                err = -EPERM;
                goto out_put;
        }

        if (!nodes_subset(*new, node_states[N_MEMORY])) {
                err = -EINVAL;
                goto out_put;
        }

And manpage says:

       EINVAL The value specified by maxnode exceeds a kernel-imposed
limit.  Or, old_nodes or new_nodes specifies one or more node IDs that
are greater than the maximum supported node
              ID.  *Or, none of the node IDs specified by new_nodes are
on-line and allowed by the process's current cpuset context, or none of
the specified nodes contain memory.*

       EPERM  Insufficient privilege (CAP_SYS_NICE) to move pages of the
process specified by pid, or insufficient privilege (CAP_SYS_NICE) to
access the specified target nodes.

- it says "none ... are allowed", but checking for subset means we check
if "all ... are allowed". Shouldn't we be checking for a non-empty
intersection?
- there doesn't seem to be any EINVAL check for "process's current
cpuset context", there's just an EPERM check for "target process's
cpuset context".

> 
> Signed-off-by: Yisheng Xie <xieyisheng1-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
> ---
>  mm/mempolicy.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 8798ecb..58352cc 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -1402,6 +1402,11 @@ static int copy_nodes_to_user(unsigned long __user *mask, unsigned long maxnode,
>  	if (err)
>  		goto out;
>  
> +	if (nodes_empty(*new)) {
> +		err = -EINVAL;
> +		goto out;
> +	}
> +
>  	/* Find the mm_struct */
>  	rcu_read_lock();
>  	task = pid ? find_task_by_vpid(pid) : current;
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 4/4] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
@ 2017-10-31  9:46     ` Vlastimil Babka
  0 siblings, 0 replies; 61+ messages in thread
From: Vlastimil Babka @ 2017-10-31  9:46 UTC (permalink / raw)
  To: Yisheng Xie, akpm, mhocko, mingo, rientjes, n-horiguchi, salls
  Cc: linux-mm, linux-kernel, tanxiaojun, linux-api, Andi Kleen,
	Christoph Lameter

+CC Andi and Christoph

On 10/27/2017 12:14 PM, Yisheng Xie wrote:
> As manpage of migrate_pages, the errno should be set to EINVAL when none
> of the specified nodes contain memory. However, when new_nodes is null,
> i.e. the specified nodes also do not have memory, as the following case:
> 
> 	new_nodes = 0;
> 	old_nodes = 0xf;
> 	ret = migrate_pages(pid, old_nodes, new_nodes, MAX);
> 
> The ret will be 0 and no errno is set.
> 
> This patch is to add nodes_empty check to fix above case.

Hmm, I think we have a bigger problem than "empty set is a subset of
anything" here.

The existing checks are:

        task_nodes = cpuset_mems_allowed(task);
        /* Is the user allowed to access the target nodes? */
        if (!nodes_subset(*new, task_nodes) && !capable(CAP_SYS_NICE)) {
                err = -EPERM;
                goto out_put;
        }

        if (!nodes_subset(*new, node_states[N_MEMORY])) {
                err = -EINVAL;
                goto out_put;
        }

And manpage says:

       EINVAL The value specified by maxnode exceeds a kernel-imposed
limit.  Or, old_nodes or new_nodes specifies one or more node IDs that
are greater than the maximum supported node
              ID.  *Or, none of the node IDs specified by new_nodes are
on-line and allowed by the process's current cpuset context, or none of
the specified nodes contain memory.*

       EPERM  Insufficient privilege (CAP_SYS_NICE) to move pages of the
process specified by pid, or insufficient privilege (CAP_SYS_NICE) to
access the specified target nodes.

- it says "none ... are allowed", but checking for subset means we check
if "all ... are allowed". Shouldn't we be checking for a non-empty
intersection?
- there doesn't seem to be any EINVAL check for "process's current
cpuset context", there's just an EPERM check for "target process's
cpuset context".

> 
> Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
> ---
>  mm/mempolicy.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 8798ecb..58352cc 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -1402,6 +1402,11 @@ static int copy_nodes_to_user(unsigned long __user *mask, unsigned long maxnode,
>  	if (err)
>  		goto out;
>  
> +	if (nodes_empty(*new)) {
> +		err = -EINVAL;
> +		goto out;
> +	}
> +
>  	/* Find the mm_struct */
>  	rcu_read_lock();
>  	task = pid ? find_task_by_vpid(pid) : current;
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 3/4] mm/mempolicy: fix the check of nodemask from user
  2017-10-31  9:30     ` Vlastimil Babka
  (?)
@ 2017-10-31 11:01       ` Yisheng Xie
  -1 siblings, 0 replies; 61+ messages in thread
From: Yisheng Xie @ 2017-10-31 11:01 UTC (permalink / raw)
  To: Vlastimil Babka, akpm, mhocko, mingo, rientjes, n-horiguchi, salls
  Cc: linux-mm, linux-kernel, tanxiaojun, linux-api



On 2017/10/31 17:30, Vlastimil Babka wrote:
> On 10/27/2017 12:14 PM, Yisheng Xie wrote:
>> +	/*
>> +	 * When the user specified more nodes than supported just check
>> +	 * if the non supported part is all zero.
>> +	 *
>> +	 * If maxnode have more longs than MAX_NUMNODES, check
>> +	 * the bits in that area first. And then go through to
>> +	 * check the rest bits which equal or bigger than MAX_NUMNODES.
>> +	 * Otherwise, just check bits [MAX_NUMNODES, maxnode).
>> +	 */
>>  	if (nlongs > BITS_TO_LONGS(MAX_NUMNODES)) {
>>  		for (k = BITS_TO_LONGS(MAX_NUMNODES); k < nlongs; k++) {
>> -			unsigned long t;
>>  			if (get_user(t, nmask + k))
>>  				return -EFAULT;
>>  			if (k == nlongs - 1) {
>> @@ -1294,6 +1301,16 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
>>  		endmask = ~0UL;
>>  	}
>>  
>> +	if (maxnode > MAX_NUMNODES && MAX_NUMNODES % BITS_PER_LONG != 0) {
>> +		unsigned long valid_mask = endmask;
>> +
>> +		valid_mask &= ~((1UL << (MAX_NUMNODES % BITS_PER_LONG)) - 1);
> 
> I'm not sure if the combination with endmask works in this case:
> 
> 0      BITS_PER_LONG  2xBITS_PER_LONG
> |____________|____________|
>        |             |
>   MAX_NUMNODES      maxnode
> 
> endmask will contain bits between 0 and maxnode

In the case, BITS_TO_LONGS(maxnode) > BITS_TO_LONGS(MAX_NUMNODES), right?
And after checking BITS_PER_LONG to 2xBITS_PER_LONG,endmask will set to
"~0UL". e.g. endmask will be 0xffff ffff ffff ffff if
unsigned long is 64bit.

Then the valid_mask will just contain bits MAX_NUMNODES to BITS_PER_LONG.

Thanks
Yisheng Xie

> but here we want to check bits between MAX_NUMNODES and BITS_PER_LONG
> and endmask should not be mixed up with that?
> 
> 
> Vlastimil
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 3/4] mm/mempolicy: fix the check of nodemask from user
@ 2017-10-31 11:01       ` Yisheng Xie
  0 siblings, 0 replies; 61+ messages in thread
From: Yisheng Xie @ 2017-10-31 11:01 UTC (permalink / raw)
  To: Vlastimil Babka, akpm, mhocko, mingo, rientjes, n-horiguchi, salls
  Cc: linux-mm, linux-kernel, tanxiaojun, linux-api



On 2017/10/31 17:30, Vlastimil Babka wrote:
> On 10/27/2017 12:14 PM, Yisheng Xie wrote:
>> +	/*
>> +	 * When the user specified more nodes than supported just check
>> +	 * if the non supported part is all zero.
>> +	 *
>> +	 * If maxnode have more longs than MAX_NUMNODES, check
>> +	 * the bits in that area first. And then go through to
>> +	 * check the rest bits which equal or bigger than MAX_NUMNODES.
>> +	 * Otherwise, just check bits [MAX_NUMNODES, maxnode).
>> +	 */
>>  	if (nlongs > BITS_TO_LONGS(MAX_NUMNODES)) {
>>  		for (k = BITS_TO_LONGS(MAX_NUMNODES); k < nlongs; k++) {
>> -			unsigned long t;
>>  			if (get_user(t, nmask + k))
>>  				return -EFAULT;
>>  			if (k == nlongs - 1) {
>> @@ -1294,6 +1301,16 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
>>  		endmask = ~0UL;
>>  	}
>>  
>> +	if (maxnode > MAX_NUMNODES && MAX_NUMNODES % BITS_PER_LONG != 0) {
>> +		unsigned long valid_mask = endmask;
>> +
>> +		valid_mask &= ~((1UL << (MAX_NUMNODES % BITS_PER_LONG)) - 1);
> 
> I'm not sure if the combination with endmask works in this case:
> 
> 0      BITS_PER_LONG  2xBITS_PER_LONG
> |____________|____________|
>        |             |
>   MAX_NUMNODES      maxnode
> 
> endmask will contain bits between 0 and maxnode

In the case, BITS_TO_LONGS(maxnode) > BITS_TO_LONGS(MAX_NUMNODES), right?
And after checking BITS_PER_LONG to 2xBITS_PER_LONG,endmask will set to
"~0UL". e.g. endmask will be 0xffff ffff ffff ffff if
unsigned long is 64bit.

Then the valid_mask will just contain bits MAX_NUMNODES to BITS_PER_LONG.

Thanks
Yisheng Xie

> but here we want to check bits between MAX_NUMNODES and BITS_PER_LONG
> and endmask should not be mixed up with that?
> 
> 
> Vlastimil
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 3/4] mm/mempolicy: fix the check of nodemask from user
@ 2017-10-31 11:01       ` Yisheng Xie
  0 siblings, 0 replies; 61+ messages in thread
From: Yisheng Xie @ 2017-10-31 11:01 UTC (permalink / raw)
  To: Vlastimil Babka, akpm, mhocko, mingo, rientjes, n-horiguchi, salls
  Cc: linux-mm, linux-kernel, tanxiaojun, linux-api



On 2017/10/31 17:30, Vlastimil Babka wrote:
> On 10/27/2017 12:14 PM, Yisheng Xie wrote:
>> +	/*
>> +	 * When the user specified more nodes than supported just check
>> +	 * if the non supported part is all zero.
>> +	 *
>> +	 * If maxnode have more longs than MAX_NUMNODES, check
>> +	 * the bits in that area first. And then go through to
>> +	 * check the rest bits which equal or bigger than MAX_NUMNODES.
>> +	 * Otherwise, just check bits [MAX_NUMNODES, maxnode).
>> +	 */
>>  	if (nlongs > BITS_TO_LONGS(MAX_NUMNODES)) {
>>  		for (k = BITS_TO_LONGS(MAX_NUMNODES); k < nlongs; k++) {
>> -			unsigned long t;
>>  			if (get_user(t, nmask + k))
>>  				return -EFAULT;
>>  			if (k == nlongs - 1) {
>> @@ -1294,6 +1301,16 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
>>  		endmask = ~0UL;
>>  	}
>>  
>> +	if (maxnode > MAX_NUMNODES && MAX_NUMNODES % BITS_PER_LONG != 0) {
>> +		unsigned long valid_mask = endmask;
>> +
>> +		valid_mask &= ~((1UL << (MAX_NUMNODES % BITS_PER_LONG)) - 1);
> 
> I'm not sure if the combination with endmask works in this case:
> 
> 0      BITS_PER_LONG  2xBITS_PER_LONG
> |____________|____________|
>        |             |
>   MAX_NUMNODES      maxnode
> 
> endmask will contain bits between 0 and maxnode

In the case, BITS_TO_LONGS(maxnode) > BITS_TO_LONGS(MAX_NUMNODES), right?
And after checking BITS_PER_LONG to 2xBITS_PER_LONGi 1/4 ?endmask will set to
"~0UL". e.g. endmask will be 0xffff ffff ffff ffff if
unsigned long is 64bit.

Then the valid_mask will just contain bits MAX_NUMNODES to BITS_PER_LONG.

Thanks
Yisheng Xie

> but here we want to check bits between MAX_NUMNODES and BITS_PER_LONG
> and endmask should not be mixed up with that?
> 
> 
> Vlastimil
> 


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 3/4] mm/mempolicy: fix the check of nodemask from user
@ 2017-10-31 11:28         ` Vlastimil Babka
  0 siblings, 0 replies; 61+ messages in thread
From: Vlastimil Babka @ 2017-10-31 11:28 UTC (permalink / raw)
  To: Yisheng Xie, akpm, mhocko, mingo, rientjes, n-horiguchi, salls
  Cc: linux-mm, linux-kernel, tanxiaojun, linux-api

On 10/31/2017 12:01 PM, Yisheng Xie wrote:
> 
> 
> On 2017/10/31 17:30, Vlastimil Babka wrote:
>> On 10/27/2017 12:14 PM, Yisheng Xie wrote:
>>> +	/*
>>> +	 * When the user specified more nodes than supported just check
>>> +	 * if the non supported part is all zero.
>>> +	 *
>>> +	 * If maxnode have more longs than MAX_NUMNODES, check
>>> +	 * the bits in that area first. And then go through to
>>> +	 * check the rest bits which equal or bigger than MAX_NUMNODES.
>>> +	 * Otherwise, just check bits [MAX_NUMNODES, maxnode).
>>> +	 */
>>>  	if (nlongs > BITS_TO_LONGS(MAX_NUMNODES)) {
>>>  		for (k = BITS_TO_LONGS(MAX_NUMNODES); k < nlongs; k++) {
>>> -			unsigned long t;
>>>  			if (get_user(t, nmask + k))
>>>  				return -EFAULT;
>>>  			if (k == nlongs - 1) {
>>> @@ -1294,6 +1301,16 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
>>>  		endmask = ~0UL;
>>>  	}
>>>  
>>> +	if (maxnode > MAX_NUMNODES && MAX_NUMNODES % BITS_PER_LONG != 0) {
>>> +		unsigned long valid_mask = endmask;
>>> +
>>> +		valid_mask &= ~((1UL << (MAX_NUMNODES % BITS_PER_LONG)) - 1);
>>
>> I'm not sure if the combination with endmask works in this case:
>>
>> 0      BITS_PER_LONG  2xBITS_PER_LONG
>> |____________|____________|
>>        |             |
>>   MAX_NUMNODES      maxnode
>>
>> endmask will contain bits between 0 and maxnode
> 
> In the case, BITS_TO_LONGS(maxnode) > BITS_TO_LONGS(MAX_NUMNODES), right?
> And after checking BITS_PER_LONG to 2xBITS_PER_LONG,endmask will set to
> "~0UL". e.g. endmask will be 0xffff ffff ffff ffff if
> unsigned long is 64bit.
> 
> Then the valid_mask will just contain bits MAX_NUMNODES to BITS_PER_LONG.

Ugh, right. I missed that. This code is not simple...

> Thanks
> Yisheng Xie
> 
>> but here we want to check bits between MAX_NUMNODES and BITS_PER_LONG
>> and endmask should not be mixed up with that?
>>
>>
>> Vlastimil
>>
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-api" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 3/4] mm/mempolicy: fix the check of nodemask from user
@ 2017-10-31 11:28         ` Vlastimil Babka
  0 siblings, 0 replies; 61+ messages in thread
From: Vlastimil Babka @ 2017-10-31 11:28 UTC (permalink / raw)
  To: Yisheng Xie, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	mhocko-IBi9RG/b67k, mingo-DgEjT+Ai2ygdnm+yROfE0A,
	rientjes-hpIqsD4AKlfQT0dZR+AlfA,
	n-horiguchi-PaJj6Psr51x8UrSeD/g0lQ, salls-b3bnyZ7c9ISVc3sceRu5cw
  Cc: linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	tanxiaojun-hv44wF8Li93QT0dZR+AlfA,
	linux-api-u79uwXL29TY76Z2rM5mHXA

On 10/31/2017 12:01 PM, Yisheng Xie wrote:
> 
> 
> On 2017/10/31 17:30, Vlastimil Babka wrote:
>> On 10/27/2017 12:14 PM, Yisheng Xie wrote:
>>> +	/*
>>> +	 * When the user specified more nodes than supported just check
>>> +	 * if the non supported part is all zero.
>>> +	 *
>>> +	 * If maxnode have more longs than MAX_NUMNODES, check
>>> +	 * the bits in that area first. And then go through to
>>> +	 * check the rest bits which equal or bigger than MAX_NUMNODES.
>>> +	 * Otherwise, just check bits [MAX_NUMNODES, maxnode).
>>> +	 */
>>>  	if (nlongs > BITS_TO_LONGS(MAX_NUMNODES)) {
>>>  		for (k = BITS_TO_LONGS(MAX_NUMNODES); k < nlongs; k++) {
>>> -			unsigned long t;
>>>  			if (get_user(t, nmask + k))
>>>  				return -EFAULT;
>>>  			if (k == nlongs - 1) {
>>> @@ -1294,6 +1301,16 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
>>>  		endmask = ~0UL;
>>>  	}
>>>  
>>> +	if (maxnode > MAX_NUMNODES && MAX_NUMNODES % BITS_PER_LONG != 0) {
>>> +		unsigned long valid_mask = endmask;
>>> +
>>> +		valid_mask &= ~((1UL << (MAX_NUMNODES % BITS_PER_LONG)) - 1);
>>
>> I'm not sure if the combination with endmask works in this case:
>>
>> 0      BITS_PER_LONG  2xBITS_PER_LONG
>> |____________|____________|
>>        |             |
>>   MAX_NUMNODES      maxnode
>>
>> endmask will contain bits between 0 and maxnode
> 
> In the case, BITS_TO_LONGS(maxnode) > BITS_TO_LONGS(MAX_NUMNODES), right?
> And after checking BITS_PER_LONG to 2xBITS_PER_LONG,endmask will set to
> "~0UL". e.g. endmask will be 0xffff ffff ffff ffff if
> unsigned long is 64bit.
> 
> Then the valid_mask will just contain bits MAX_NUMNODES to BITS_PER_LONG.

Ugh, right. I missed that. This code is not simple...

> Thanks
> Yisheng Xie
> 
>> but here we want to check bits between MAX_NUMNODES and BITS_PER_LONG
>> and endmask should not be mixed up with that?
>>
>>
>> Vlastimil
>>
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-api" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 3/4] mm/mempolicy: fix the check of nodemask from user
@ 2017-10-31 11:28         ` Vlastimil Babka
  0 siblings, 0 replies; 61+ messages in thread
From: Vlastimil Babka @ 2017-10-31 11:28 UTC (permalink / raw)
  To: Yisheng Xie, akpm, mhocko, mingo, rientjes, n-horiguchi, salls
  Cc: linux-mm, linux-kernel, tanxiaojun, linux-api

On 10/31/2017 12:01 PM, Yisheng Xie wrote:
> 
> 
> On 2017/10/31 17:30, Vlastimil Babka wrote:
>> On 10/27/2017 12:14 PM, Yisheng Xie wrote:
>>> +	/*
>>> +	 * When the user specified more nodes than supported just check
>>> +	 * if the non supported part is all zero.
>>> +	 *
>>> +	 * If maxnode have more longs than MAX_NUMNODES, check
>>> +	 * the bits in that area first. And then go through to
>>> +	 * check the rest bits which equal or bigger than MAX_NUMNODES.
>>> +	 * Otherwise, just check bits [MAX_NUMNODES, maxnode).
>>> +	 */
>>>  	if (nlongs > BITS_TO_LONGS(MAX_NUMNODES)) {
>>>  		for (k = BITS_TO_LONGS(MAX_NUMNODES); k < nlongs; k++) {
>>> -			unsigned long t;
>>>  			if (get_user(t, nmask + k))
>>>  				return -EFAULT;
>>>  			if (k == nlongs - 1) {
>>> @@ -1294,6 +1301,16 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
>>>  		endmask = ~0UL;
>>>  	}
>>>  
>>> +	if (maxnode > MAX_NUMNODES && MAX_NUMNODES % BITS_PER_LONG != 0) {
>>> +		unsigned long valid_mask = endmask;
>>> +
>>> +		valid_mask &= ~((1UL << (MAX_NUMNODES % BITS_PER_LONG)) - 1);
>>
>> I'm not sure if the combination with endmask works in this case:
>>
>> 0      BITS_PER_LONG  2xBITS_PER_LONG
>> |____________|____________|
>>        |             |
>>   MAX_NUMNODES      maxnode
>>
>> endmask will contain bits between 0 and maxnode
> 
> In the case, BITS_TO_LONGS(maxnode) > BITS_TO_LONGS(MAX_NUMNODES), right?
> And after checking BITS_PER_LONG to 2xBITS_PER_LONGi 1/4 ?endmask will set to
> "~0UL". e.g. endmask will be 0xffff ffff ffff ffff if
> unsigned long is 64bit.
> 
> Then the valid_mask will just contain bits MAX_NUMNODES to BITS_PER_LONG.

Ugh, right. I missed that. This code is not simple...

> Thanks
> Yisheng Xie
> 
>> but here we want to check bits between MAX_NUMNODES and BITS_PER_LONG
>> and endmask should not be mixed up with that?
>>
>>
>> Vlastimil
>>
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-api" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 1/4] mm/mempolicy: Fix get_nodes() mask miscalculation
@ 2017-11-01  9:37       ` Yisheng Xie
  0 siblings, 0 replies; 61+ messages in thread
From: Yisheng Xie @ 2017-11-01  9:37 UTC (permalink / raw)
  To: Vlastimil Babka, akpm, mhocko, mingo, rientjes, n-horiguchi, salls
  Cc: linux-mm, linux-kernel, tanxiaojun, linux-api, Andi Kleen

Hi Vlastimil,

Thanks for comment!
On 2017/10/31 16:34, Vlastimil Babka wrote:
> On 10/27/2017 12:14 PM, Yisheng Xie wrote:
>> It appears there is a nodemask miscalculation in the get_nodes()
>> function in mm/mempolicy.c.  This bug has two effects:
>>
>> 1. It is impossible to specify a length 1 nodemask.
>> 2. It is impossible to specify a nodemask containing the last node.
> 
> This should be more specific, which syscalls are you talking about?
> I assume it's set_mempolicy() and mbind() and it's the same issue that
> was discussed at https://marc.info/?l=linux-mm&m=150732591909576&w=2 ?

I just missed this thread, sorry about that. Not only set_mempolicy() and
mbind(), but migrate_pages() also suffers this problem. Maybe related
manpage should documented this as your mentioned below.

Thanks
Yisheng Xie

> 
>> Brent have submmit a patch before v2.6.12, however, Andi revert his
>> changed for ABI problem. I just resent this patch as RFC, for do not
>> clear about what's the problem Andi have met.
> 
> You should have CC'd Andi. As was discussed in the other thread, this
> would make existing programs potentially unsafe, so we can't change it.
> Instead it should be documented.
> 
>> As manpage of set_mempolicy, If the value of maxnode is zero, the
>> nodemask argument is ignored. but we should not ignore the nodemask
>> when maxnode is 1.
>>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 1/4] mm/mempolicy: Fix get_nodes() mask miscalculation
@ 2017-11-01  9:37       ` Yisheng Xie
  0 siblings, 0 replies; 61+ messages in thread
From: Yisheng Xie @ 2017-11-01  9:37 UTC (permalink / raw)
  To: Vlastimil Babka, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	mhocko-IBi9RG/b67k, mingo-DgEjT+Ai2ygdnm+yROfE0A,
	rientjes-hpIqsD4AKlfQT0dZR+AlfA,
	n-horiguchi-PaJj6Psr51x8UrSeD/g0lQ, salls-b3bnyZ7c9ISVc3sceRu5cw
  Cc: linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	tanxiaojun-hv44wF8Li93QT0dZR+AlfA,
	linux-api-u79uwXL29TY76Z2rM5mHXA, Andi Kleen

Hi Vlastimil,

Thanks for comment!
On 2017/10/31 16:34, Vlastimil Babka wrote:
> On 10/27/2017 12:14 PM, Yisheng Xie wrote:
>> It appears there is a nodemask miscalculation in the get_nodes()
>> function in mm/mempolicy.c.  This bug has two effects:
>>
>> 1. It is impossible to specify a length 1 nodemask.
>> 2. It is impossible to specify a nodemask containing the last node.
> 
> This should be more specific, which syscalls are you talking about?
> I assume it's set_mempolicy() and mbind() and it's the same issue that
> was discussed at https://marc.info/?l=linux-mm&m=150732591909576&w=2 ?

I just missed this thread, sorry about that. Not only set_mempolicy() and
mbind(), but migrate_pages() also suffers this problem. Maybe related
manpage should documented this as your mentioned below.

Thanks
Yisheng Xie

> 
>> Brent have submmit a patch before v2.6.12, however, Andi revert his
>> changed for ABI problem. I just resent this patch as RFC, for do not
>> clear about what's the problem Andi have met.
> 
> You should have CC'd Andi. As was discussed in the other thread, this
> would make existing programs potentially unsafe, so we can't change it.
> Instead it should be documented.
> 
>> As manpage of set_mempolicy, If the value of maxnode is zero, the
>> nodemask argument is ignored. but we should not ignore the nodemask
>> when maxnode is 1.
>>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 1/4] mm/mempolicy: Fix get_nodes() mask miscalculation
@ 2017-11-01  9:37       ` Yisheng Xie
  0 siblings, 0 replies; 61+ messages in thread
From: Yisheng Xie @ 2017-11-01  9:37 UTC (permalink / raw)
  To: Vlastimil Babka, akpm, mhocko, mingo, rientjes, n-horiguchi, salls
  Cc: linux-mm, linux-kernel, tanxiaojun, linux-api, Andi Kleen

Hi Vlastimil,

Thanks for comment!
On 2017/10/31 16:34, Vlastimil Babka wrote:
> On 10/27/2017 12:14 PM, Yisheng Xie wrote:
>> It appears there is a nodemask miscalculation in the get_nodes()
>> function in mm/mempolicy.c.  This bug has two effects:
>>
>> 1. It is impossible to specify a length 1 nodemask.
>> 2. It is impossible to specify a nodemask containing the last node.
> 
> This should be more specific, which syscalls are you talking about?
> I assume it's set_mempolicy() and mbind() and it's the same issue that
> was discussed at https://marc.info/?l=linux-mm&m=150732591909576&w=2 ?

I just missed this thread, sorry about that. Not only set_mempolicy() and
mbind(), but migrate_pages() also suffers this problem. Maybe related
manpage should documented this as your mentioned below.

Thanks
Yisheng Xie

> 
>> Brent have submmit a patch before v2.6.12, however, Andi revert his
>> changed for ABI problem. I just resent this patch as RFC, for do not
>> clear about what's the problem Andi have met.
> 
> You should have CC'd Andi. As was discussed in the other thread, this
> would make existing programs potentially unsafe, so we can't change it.
> Instead it should be documented.
> 
>> As manpage of set_mempolicy, If the value of maxnode is zero, the
>> nodemask argument is ignored. but we should not ignore the nodemask
>> when maxnode is 1.
>>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 4/4] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
@ 2017-11-06  1:31       ` Yisheng Xie
  0 siblings, 0 replies; 61+ messages in thread
From: Yisheng Xie @ 2017-11-06  1:31 UTC (permalink / raw)
  To: Vlastimil Babka, akpm, mhocko, mingo, rientjes, n-horiguchi, salls
  Cc: linux-mm, linux-kernel, tanxiaojun, linux-api, Andi Kleen,
	Christoph Lameter

Hi Vlastimil,

On 2017/10/31 17:46, Vlastimil Babka wrote:
> +CC Andi and Christoph
> 
> On 10/27/2017 12:14 PM, Yisheng Xie wrote:
>> As manpage of migrate_pages, the errno should be set to EINVAL when none
>> of the specified nodes contain memory. However, when new_nodes is null,
>> i.e. the specified nodes also do not have memory, as the following case:
>>
>> 	new_nodes = 0;
>> 	old_nodes = 0xf;
>> 	ret = migrate_pages(pid, old_nodes, new_nodes, MAX);
>>
>> The ret will be 0 and no errno is set.
>>
>> This patch is to add nodes_empty check to fix above case.
> 
> Hmm, I think we have a bigger problem than "empty set is a subset of
> anything" here.
> 
> The existing checks are:
> 
>         task_nodes = cpuset_mems_allowed(task);
>         if (!nodes_subset(*new, task_nodes) && !capable(CAP_SYS_NICE)) {
>                 err = -EPERM;
>                 goto out_put;
>         }
> 
>         if (!nodes_subset(*new, node_states[N_MEMORY])) {
>                 err = -EINVAL;
>                 goto out_put;
>         }
> 
> 
> And manpage says:
> 
>        EINVAL The value specified by maxnode exceeds a kernel-imposed
> limit.  Or, old_nodes or new_nodes specifies one or more node IDs that
> are greater than the maximum supported node
>               ID.  *Or, none of the node IDs specified by new_nodes are
> on-line and allowed by the process's current cpuset context, or none of
> the specified nodes contain memory.*
> 
>        EPERM  Insufficient privilege (CAP_SYS_NICE) to move pages of the
> process specified by pid, or insufficient privilege (CAP_SYS_NICE) to
> access the specified target nodes.
> 
> - it says "none ... are allowed", but checking for subset means we check
> if "all ... are allowed". Shouldn't we be checking for a non-empty
> intersection?

You are absolutely right. To follow the manpage, we should check non-empty
of intersection instead of subset. I mean:
         nodes_and(*new, *new, task_nodes);
         if (!node_empty(*new) && !capable(CAP_SYS_NICE)) {
                 err = -EPERM;
                 goto out_put;
         }

         nodes_and(*new, *new, node_states[N_MEMORY]);
         if (!node_empty(*new)) {
                 err = -EINVAL;
                 goto out_put;
         }

So finally, we should only migrate the smallest intersection of all the node
set, right?

> - there doesn't seem to be any EINVAL check for "process's current
> cpuset context", there's just an EPERM check for "target process's
> cpuset context".

This also need to be checked as manpage.

Thanks
Yisheng Xie

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 4/4] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
@ 2017-11-06  1:31       ` Yisheng Xie
  0 siblings, 0 replies; 61+ messages in thread
From: Yisheng Xie @ 2017-11-06  1:31 UTC (permalink / raw)
  To: Vlastimil Babka, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	mhocko-IBi9RG/b67k, mingo-DgEjT+Ai2ygdnm+yROfE0A,
	rientjes-hpIqsD4AKlfQT0dZR+AlfA,
	n-horiguchi-PaJj6Psr51x8UrSeD/g0lQ, salls-b3bnyZ7c9ISVc3sceRu5cw
  Cc: linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	tanxiaojun-hv44wF8Li93QT0dZR+AlfA,
	linux-api-u79uwXL29TY76Z2rM5mHXA, Andi Kleen, Christoph Lameter

Hi Vlastimil,

On 2017/10/31 17:46, Vlastimil Babka wrote:
> +CC Andi and Christoph
> 
> On 10/27/2017 12:14 PM, Yisheng Xie wrote:
>> As manpage of migrate_pages, the errno should be set to EINVAL when none
>> of the specified nodes contain memory. However, when new_nodes is null,
>> i.e. the specified nodes also do not have memory, as the following case:
>>
>> 	new_nodes = 0;
>> 	old_nodes = 0xf;
>> 	ret = migrate_pages(pid, old_nodes, new_nodes, MAX);
>>
>> The ret will be 0 and no errno is set.
>>
>> This patch is to add nodes_empty check to fix above case.
> 
> Hmm, I think we have a bigger problem than "empty set is a subset of
> anything" here.
> 
> The existing checks are:
> 
>         task_nodes = cpuset_mems_allowed(task);
>         if (!nodes_subset(*new, task_nodes) && !capable(CAP_SYS_NICE)) {
>                 err = -EPERM;
>                 goto out_put;
>         }
> 
>         if (!nodes_subset(*new, node_states[N_MEMORY])) {
>                 err = -EINVAL;
>                 goto out_put;
>         }
> 
> 
> And manpage says:
> 
>        EINVAL The value specified by maxnode exceeds a kernel-imposed
> limit.  Or, old_nodes or new_nodes specifies one or more node IDs that
> are greater than the maximum supported node
>               ID.  *Or, none of the node IDs specified by new_nodes are
> on-line and allowed by the process's current cpuset context, or none of
> the specified nodes contain memory.*
> 
>        EPERM  Insufficient privilege (CAP_SYS_NICE) to move pages of the
> process specified by pid, or insufficient privilege (CAP_SYS_NICE) to
> access the specified target nodes.
> 
> - it says "none ... are allowed", but checking for subset means we check
> if "all ... are allowed". Shouldn't we be checking for a non-empty
> intersection?

You are absolutely right. To follow the manpage, we should check non-empty
of intersection instead of subset. I mean:
         nodes_and(*new, *new, task_nodes);
         if (!node_empty(*new) && !capable(CAP_SYS_NICE)) {
                 err = -EPERM;
                 goto out_put;
         }

         nodes_and(*new, *new, node_states[N_MEMORY]);
         if (!node_empty(*new)) {
                 err = -EINVAL;
                 goto out_put;
         }

So finally, we should only migrate the smallest intersection of all the node
set, right?

> - there doesn't seem to be any EINVAL check for "process's current
> cpuset context", there's just an EPERM check for "target process's
> cpuset context".

This also need to be checked as manpage.

Thanks
Yisheng Xie

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 4/4] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
@ 2017-11-06  1:31       ` Yisheng Xie
  0 siblings, 0 replies; 61+ messages in thread
From: Yisheng Xie @ 2017-11-06  1:31 UTC (permalink / raw)
  To: Vlastimil Babka, akpm, mhocko, mingo, rientjes, n-horiguchi, salls
  Cc: linux-mm, linux-kernel, tanxiaojun, linux-api, Andi Kleen,
	Christoph Lameter

Hi Vlastimil,

On 2017/10/31 17:46, Vlastimil Babka wrote:
> +CC Andi and Christoph
> 
> On 10/27/2017 12:14 PM, Yisheng Xie wrote:
>> As manpage of migrate_pages, the errno should be set to EINVAL when none
>> of the specified nodes contain memory. However, when new_nodes is null,
>> i.e. the specified nodes also do not have memory, as the following case:
>>
>> 	new_nodes = 0;
>> 	old_nodes = 0xf;
>> 	ret = migrate_pages(pid, old_nodes, new_nodes, MAX);
>>
>> The ret will be 0 and no errno is set.
>>
>> This patch is to add nodes_empty check to fix above case.
> 
> Hmm, I think we have a bigger problem than "empty set is a subset of
> anything" here.
> 
> The existing checks are:
> 
>         task_nodes = cpuset_mems_allowed(task);
>         if (!nodes_subset(*new, task_nodes) && !capable(CAP_SYS_NICE)) {
>                 err = -EPERM;
>                 goto out_put;
>         }
> 
>         if (!nodes_subset(*new, node_states[N_MEMORY])) {
>                 err = -EINVAL;
>                 goto out_put;
>         }
> 
> 
> And manpage says:
> 
>        EINVAL The value specified by maxnode exceeds a kernel-imposed
> limit.  Or, old_nodes or new_nodes specifies one or more node IDs that
> are greater than the maximum supported node
>               ID.  *Or, none of the node IDs specified by new_nodes are
> on-line and allowed by the process's current cpuset context, or none of
> the specified nodes contain memory.*
> 
>        EPERM  Insufficient privilege (CAP_SYS_NICE) to move pages of the
> process specified by pid, or insufficient privilege (CAP_SYS_NICE) to
> access the specified target nodes.
> 
> - it says "none ... are allowed", but checking for subset means we check
> if "all ... are allowed". Shouldn't we be checking for a non-empty
> intersection?

You are absolutely right. To follow the manpage, we should check non-empty
of intersection instead of subset. I meani 1/4 ?
         nodes_and(*new, *new, task_nodes);
         if (!node_empty(*new) && !capable(CAP_SYS_NICE)) {
                 err = -EPERM;
                 goto out_put;
         }

         nodes_and(*new, *new, node_states[N_MEMORY]);
         if (!node_empty(*new)) {
                 err = -EINVAL;
                 goto out_put;
         }

So finally, we should only migrate the smallest intersection of all the node
set, right?

> - there doesn't seem to be any EINVAL check for "process's current
> cpuset context", there's just an EPERM check for "target process's
> cpuset context".

This also need to be checked as manpage.

Thanks
Yisheng Xie

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 4/4] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
@ 2017-11-06  7:39         ` Vlastimil Babka
  0 siblings, 0 replies; 61+ messages in thread
From: Vlastimil Babka @ 2017-11-06  7:39 UTC (permalink / raw)
  To: Yisheng Xie, akpm, mhocko, mingo, rientjes, n-horiguchi, salls
  Cc: linux-mm, linux-kernel, tanxiaojun, linux-api, Andi Kleen,
	Christoph Lameter

On 11/06/2017 02:31 AM, Yisheng Xie wrote:
> Hi Vlastimil,
> 
> On 2017/10/31 17:46, Vlastimil Babka wrote:
>> +CC Andi and Christoph
>>
>> On 10/27/2017 12:14 PM, Yisheng Xie wrote:
>>> As manpage of migrate_pages, the errno should be set to EINVAL when none
>>> of the specified nodes contain memory. However, when new_nodes is null,
>>> i.e. the specified nodes also do not have memory, as the following case:
>>>
>>> 	new_nodes = 0;
>>> 	old_nodes = 0xf;
>>> 	ret = migrate_pages(pid, old_nodes, new_nodes, MAX);
>>>
>>> The ret will be 0 and no errno is set.
>>>
>>> This patch is to add nodes_empty check to fix above case.
>>
>> Hmm, I think we have a bigger problem than "empty set is a subset of
>> anything" here.
>>
>> The existing checks are:
>>
>>         task_nodes = cpuset_mems_allowed(task);
>>         if (!nodes_subset(*new, task_nodes) && !capable(CAP_SYS_NICE)) {
>>                 err = -EPERM;
>>                 goto out_put;
>>         }
>>
>>         if (!nodes_subset(*new, node_states[N_MEMORY])) {
>>                 err = -EINVAL;
>>                 goto out_put;
>>         }
>>
>>
>> And manpage says:
>>
>>        EINVAL The value specified by maxnode exceeds a kernel-imposed
>> limit.  Or, old_nodes or new_nodes specifies one or more node IDs that
>> are greater than the maximum supported node
>>               ID.  *Or, none of the node IDs specified by new_nodes are
>> on-line and allowed by the process's current cpuset context, or none of
>> the specified nodes contain memory.*
>>
>>        EPERM  Insufficient privilege (CAP_SYS_NICE) to move pages of the
>> process specified by pid, or insufficient privilege (CAP_SYS_NICE) to
>> access the specified target nodes.
>>
>> - it says "none ... are allowed", but checking for subset means we check
>> if "all ... are allowed". Shouldn't we be checking for a non-empty
>> intersection?
> 
> You are absolutely right. To follow the manpage, we should check non-empty
> of intersection instead of subset. I mean:
>          nodes_and(*new, *new, task_nodes);
>          if (!node_empty(*new) && !capable(CAP_SYS_NICE)) {
>                  err = -EPERM;
>                  goto out_put;
>          }
> 
>          nodes_and(*new, *new, node_states[N_MEMORY]);
>          if (!node_empty(*new)) {
>                  err = -EINVAL;
>                  goto out_put;
>          }

Maybe not exactly like this, see below.

> So finally, we should only migrate the smallest intersection of all the node
> set, right?

That's right.

So if new_nodes AND task_nodes AND node_states[N_MEMORY] is empty, then
EINVAL.

I'm not sure what exactly is the EPERM intention. Should really the
capability of THIS process override the cpuset restriction of the TARGET
process? Maybe yes. Then, does "insufficient privilege (CAP_SYS_NICE) to
access the specified target nodes." mean that at least some nodes must
be allowed, or all of them? Maybe the subset check is after all OK for
the EPERM check, but still wrong for the EINVAL check.

>> - there doesn't seem to be any EINVAL check for "process's current
>> cpuset context", there's just an EPERM check for "target process's
>> cpuset context".
> 
> This also need to be checked as manpage.
> 
> Thanks
> Yisheng Xie
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 4/4] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
@ 2017-11-06  7:39         ` Vlastimil Babka
  0 siblings, 0 replies; 61+ messages in thread
From: Vlastimil Babka @ 2017-11-06  7:39 UTC (permalink / raw)
  To: Yisheng Xie, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	mhocko-IBi9RG/b67k, mingo-DgEjT+Ai2ygdnm+yROfE0A,
	rientjes-hpIqsD4AKlfQT0dZR+AlfA,
	n-horiguchi-PaJj6Psr51x8UrSeD/g0lQ, salls-b3bnyZ7c9ISVc3sceRu5cw
  Cc: linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	tanxiaojun-hv44wF8Li93QT0dZR+AlfA,
	linux-api-u79uwXL29TY76Z2rM5mHXA, Andi Kleen, Christoph Lameter

On 11/06/2017 02:31 AM, Yisheng Xie wrote:
> Hi Vlastimil,
> 
> On 2017/10/31 17:46, Vlastimil Babka wrote:
>> +CC Andi and Christoph
>>
>> On 10/27/2017 12:14 PM, Yisheng Xie wrote:
>>> As manpage of migrate_pages, the errno should be set to EINVAL when none
>>> of the specified nodes contain memory. However, when new_nodes is null,
>>> i.e. the specified nodes also do not have memory, as the following case:
>>>
>>> 	new_nodes = 0;
>>> 	old_nodes = 0xf;
>>> 	ret = migrate_pages(pid, old_nodes, new_nodes, MAX);
>>>
>>> The ret will be 0 and no errno is set.
>>>
>>> This patch is to add nodes_empty check to fix above case.
>>
>> Hmm, I think we have a bigger problem than "empty set is a subset of
>> anything" here.
>>
>> The existing checks are:
>>
>>         task_nodes = cpuset_mems_allowed(task);
>>         if (!nodes_subset(*new, task_nodes) && !capable(CAP_SYS_NICE)) {
>>                 err = -EPERM;
>>                 goto out_put;
>>         }
>>
>>         if (!nodes_subset(*new, node_states[N_MEMORY])) {
>>                 err = -EINVAL;
>>                 goto out_put;
>>         }
>>
>>
>> And manpage says:
>>
>>        EINVAL The value specified by maxnode exceeds a kernel-imposed
>> limit.  Or, old_nodes or new_nodes specifies one or more node IDs that
>> are greater than the maximum supported node
>>               ID.  *Or, none of the node IDs specified by new_nodes are
>> on-line and allowed by the process's current cpuset context, or none of
>> the specified nodes contain memory.*
>>
>>        EPERM  Insufficient privilege (CAP_SYS_NICE) to move pages of the
>> process specified by pid, or insufficient privilege (CAP_SYS_NICE) to
>> access the specified target nodes.
>>
>> - it says "none ... are allowed", but checking for subset means we check
>> if "all ... are allowed". Shouldn't we be checking for a non-empty
>> intersection?
> 
> You are absolutely right. To follow the manpage, we should check non-empty
> of intersection instead of subset. I mean:
>          nodes_and(*new, *new, task_nodes);
>          if (!node_empty(*new) && !capable(CAP_SYS_NICE)) {
>                  err = -EPERM;
>                  goto out_put;
>          }
> 
>          nodes_and(*new, *new, node_states[N_MEMORY]);
>          if (!node_empty(*new)) {
>                  err = -EINVAL;
>                  goto out_put;
>          }

Maybe not exactly like this, see below.

> So finally, we should only migrate the smallest intersection of all the node
> set, right?

That's right.

So if new_nodes AND task_nodes AND node_states[N_MEMORY] is empty, then
EINVAL.

I'm not sure what exactly is the EPERM intention. Should really the
capability of THIS process override the cpuset restriction of the TARGET
process? Maybe yes. Then, does "insufficient privilege (CAP_SYS_NICE) to
access the specified target nodes." mean that at least some nodes must
be allowed, or all of them? Maybe the subset check is after all OK for
the EPERM check, but still wrong for the EINVAL check.

>> - there doesn't seem to be any EINVAL check for "process's current
>> cpuset context", there's just an EPERM check for "target process's
>> cpuset context".
> 
> This also need to be checked as manpage.
> 
> Thanks
> Yisheng Xie
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 4/4] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
@ 2017-11-06  7:39         ` Vlastimil Babka
  0 siblings, 0 replies; 61+ messages in thread
From: Vlastimil Babka @ 2017-11-06  7:39 UTC (permalink / raw)
  To: Yisheng Xie, akpm, mhocko, mingo, rientjes, n-horiguchi, salls
  Cc: linux-mm, linux-kernel, tanxiaojun, linux-api, Andi Kleen,
	Christoph Lameter

On 11/06/2017 02:31 AM, Yisheng Xie wrote:
> Hi Vlastimil,
> 
> On 2017/10/31 17:46, Vlastimil Babka wrote:
>> +CC Andi and Christoph
>>
>> On 10/27/2017 12:14 PM, Yisheng Xie wrote:
>>> As manpage of migrate_pages, the errno should be set to EINVAL when none
>>> of the specified nodes contain memory. However, when new_nodes is null,
>>> i.e. the specified nodes also do not have memory, as the following case:
>>>
>>> 	new_nodes = 0;
>>> 	old_nodes = 0xf;
>>> 	ret = migrate_pages(pid, old_nodes, new_nodes, MAX);
>>>
>>> The ret will be 0 and no errno is set.
>>>
>>> This patch is to add nodes_empty check to fix above case.
>>
>> Hmm, I think we have a bigger problem than "empty set is a subset of
>> anything" here.
>>
>> The existing checks are:
>>
>>         task_nodes = cpuset_mems_allowed(task);
>>         if (!nodes_subset(*new, task_nodes) && !capable(CAP_SYS_NICE)) {
>>                 err = -EPERM;
>>                 goto out_put;
>>         }
>>
>>         if (!nodes_subset(*new, node_states[N_MEMORY])) {
>>                 err = -EINVAL;
>>                 goto out_put;
>>         }
>>
>>
>> And manpage says:
>>
>>        EINVAL The value specified by maxnode exceeds a kernel-imposed
>> limit.  Or, old_nodes or new_nodes specifies one or more node IDs that
>> are greater than the maximum supported node
>>               ID.  *Or, none of the node IDs specified by new_nodes are
>> on-line and allowed by the process's current cpuset context, or none of
>> the specified nodes contain memory.*
>>
>>        EPERM  Insufficient privilege (CAP_SYS_NICE) to move pages of the
>> process specified by pid, or insufficient privilege (CAP_SYS_NICE) to
>> access the specified target nodes.
>>
>> - it says "none ... are allowed", but checking for subset means we check
>> if "all ... are allowed". Shouldn't we be checking for a non-empty
>> intersection?
> 
> You are absolutely right. To follow the manpage, we should check non-empty
> of intersection instead of subset. I meani 1/4 ?
>          nodes_and(*new, *new, task_nodes);
>          if (!node_empty(*new) && !capable(CAP_SYS_NICE)) {
>                  err = -EPERM;
>                  goto out_put;
>          }
> 
>          nodes_and(*new, *new, node_states[N_MEMORY]);
>          if (!node_empty(*new)) {
>                  err = -EINVAL;
>                  goto out_put;
>          }

Maybe not exactly like this, see below.

> So finally, we should only migrate the smallest intersection of all the node
> set, right?

That's right.

So if new_nodes AND task_nodes AND node_states[N_MEMORY] is empty, then
EINVAL.

I'm not sure what exactly is the EPERM intention. Should really the
capability of THIS process override the cpuset restriction of the TARGET
process? Maybe yes. Then, does "insufficient privilege (CAP_SYS_NICE) to
access the specified target nodes." mean that at least some nodes must
be allowed, or all of them? Maybe the subset check is after all OK for
the EPERM check, but still wrong for the EINVAL check.

>> - there doesn't seem to be any EINVAL check for "process's current
>> cpuset context", there's just an EPERM check for "target process's
>> cpuset context".
> 
> This also need to be checked as manpage.
> 
> Thanks
> Yisheng Xie
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 4/4] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
@ 2017-11-06 15:29           ` Christopher Lameter
  0 siblings, 0 replies; 61+ messages in thread
From: Christopher Lameter @ 2017-11-06 15:29 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Yisheng Xie, akpm, mhocko, mingo, rientjes, n-horiguchi, salls,
	linux-mm, linux-kernel, tanxiaojun, linux-api, Andi Kleen

On Mon, 6 Nov 2017, Vlastimil Babka wrote:

> I'm not sure what exactly is the EPERM intention. Should really the
> capability of THIS process override the cpuset restriction of the TARGET
> process? Maybe yes. Then, does "insufficient privilege (CAP_SYS_NICE) to

CAP_SYS_NICE never overrides cpuset restrictions. The cap can be used to
migrate pages that are *also* mapped by other processes (and thus move
pages of another process which may have different cpu set restrictions!).
The cap should not allow migrating pages to nodes that are not allowed by
the cpuset of the current process.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 4/4] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
@ 2017-11-06 15:29           ` Christopher Lameter
  0 siblings, 0 replies; 61+ messages in thread
From: Christopher Lameter @ 2017-11-06 15:29 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Yisheng Xie, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	mhocko-IBi9RG/b67k, mingo-DgEjT+Ai2ygdnm+yROfE0A,
	rientjes-hpIqsD4AKlfQT0dZR+AlfA,
	n-horiguchi-PaJj6Psr51x8UrSeD/g0lQ, salls-b3bnyZ7c9ISVc3sceRu5cw,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	tanxiaojun-hv44wF8Li93QT0dZR+AlfA,
	linux-api-u79uwXL29TY76Z2rM5mHXA, Andi Kleen

On Mon, 6 Nov 2017, Vlastimil Babka wrote:

> I'm not sure what exactly is the EPERM intention. Should really the
> capability of THIS process override the cpuset restriction of the TARGET
> process? Maybe yes. Then, does "insufficient privilege (CAP_SYS_NICE) to

CAP_SYS_NICE never overrides cpuset restrictions. The cap can be used to
migrate pages that are *also* mapped by other processes (and thus move
pages of another process which may have different cpu set restrictions!).
The cap should not allow migrating pages to nodes that are not allowed by
the cpuset of the current process.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 4/4] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
@ 2017-11-06 15:29           ` Christopher Lameter
  0 siblings, 0 replies; 61+ messages in thread
From: Christopher Lameter @ 2017-11-06 15:29 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Yisheng Xie, akpm, mhocko, mingo, rientjes, n-horiguchi, salls,
	linux-mm, linux-kernel, tanxiaojun, linux-api, Andi Kleen

On Mon, 6 Nov 2017, Vlastimil Babka wrote:

> I'm not sure what exactly is the EPERM intention. Should really the
> capability of THIS process override the cpuset restriction of the TARGET
> process? Maybe yes. Then, does "insufficient privilege (CAP_SYS_NICE) to

CAP_SYS_NICE never overrides cpuset restrictions. The cap can be used to
migrate pages that are *also* mapped by other processes (and thus move
pages of another process which may have different cpu set restrictions!).
The cap should not allow migrating pages to nodes that are not allowed by
the cpuset of the current process.


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 4/4] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
  2017-11-06 15:29           ` Christopher Lameter
  (?)
@ 2017-11-07 11:23             ` Yisheng Xie
  -1 siblings, 0 replies; 61+ messages in thread
From: Yisheng Xie @ 2017-11-07 11:23 UTC (permalink / raw)
  To: Christopher Lameter, Vlastimil Babka
  Cc: akpm, mhocko, mingo, rientjes, n-horiguchi, salls, linux-mm,
	linux-kernel, tanxiaojun, linux-api, Andi Kleen

hi Christopher and Vlastimil,

Thanks for your comment!
On 2017/11/6 23:29, Christopher Lameter wrote:
> On Mon, 6 Nov 2017, Vlastimil Babka wrote:
> 
>> I'm not sure what exactly is the EPERM intention. Should really the
>> capability of THIS process override the cpuset restriction of the TARGET
>> process? Maybe yes. Then, does "insufficient privilege (CAP_SYS_NICE) to
> 
> CAP_SYS_NICE never overrides cpuset restrictions. The cap can be used to
> migrate pages that are *also* mapped by other processes (and thus move
> pages of another process which may have different cpu set restrictions!).

So you means the specified nodes should be a subset of target cpu set, right?

Thanks
Yisheng Xie
> The cap should not allow migrating pages to nodes that are not allowed by
> the cpuset of the current process.
> 
> 
> 
> .
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 4/4] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
@ 2017-11-07 11:23             ` Yisheng Xie
  0 siblings, 0 replies; 61+ messages in thread
From: Yisheng Xie @ 2017-11-07 11:23 UTC (permalink / raw)
  To: Christopher Lameter, Vlastimil Babka
  Cc: akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, mhocko-IBi9RG/b67k,
	mingo-DgEjT+Ai2ygdnm+yROfE0A, rientjes-hpIqsD4AKlfQT0dZR+AlfA,
	n-horiguchi-PaJj6Psr51x8UrSeD/g0lQ, salls-b3bnyZ7c9ISVc3sceRu5cw,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	tanxiaojun-hv44wF8Li93QT0dZR+AlfA,
	linux-api-u79uwXL29TY76Z2rM5mHXA, Andi Kleen

hi Christopher and Vlastimil,

Thanks for your comment!
On 2017/11/6 23:29, Christopher Lameter wrote:
> On Mon, 6 Nov 2017, Vlastimil Babka wrote:
> 
>> I'm not sure what exactly is the EPERM intention. Should really the
>> capability of THIS process override the cpuset restriction of the TARGET
>> process? Maybe yes. Then, does "insufficient privilege (CAP_SYS_NICE) to
> 
> CAP_SYS_NICE never overrides cpuset restrictions. The cap can be used to
> migrate pages that are *also* mapped by other processes (and thus move
> pages of another process which may have different cpu set restrictions!).

So you means the specified nodes should be a subset of target cpu set, right?

Thanks
Yisheng Xie
> The cap should not allow migrating pages to nodes that are not allowed by
> the cpuset of the current process.
> 
> 
> 
> .
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 4/4] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
@ 2017-11-07 11:23             ` Yisheng Xie
  0 siblings, 0 replies; 61+ messages in thread
From: Yisheng Xie @ 2017-11-07 11:23 UTC (permalink / raw)
  To: Christopher Lameter, Vlastimil Babka
  Cc: akpm, mhocko, mingo, rientjes, n-horiguchi, salls, linux-mm,
	linux-kernel, tanxiaojun, linux-api, Andi Kleen

hi Christopher and Vlastimil,

Thanks for your comment!
On 2017/11/6 23:29, Christopher Lameter wrote:
> On Mon, 6 Nov 2017, Vlastimil Babka wrote:
> 
>> I'm not sure what exactly is the EPERM intention. Should really the
>> capability of THIS process override the cpuset restriction of the TARGET
>> process? Maybe yes. Then, does "insufficient privilege (CAP_SYS_NICE) to
> 
> CAP_SYS_NICE never overrides cpuset restrictions. The cap can be used to
> migrate pages that are *also* mapped by other processes (and thus move
> pages of another process which may have different cpu set restrictions!).

So you means the specified nodes should be a subset of target cpu set, right?

Thanks
Yisheng Xie
> The cap should not allow migrating pages to nodes that are not allowed by
> the cpuset of the current process.
> 
> 
> 
> .
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 4/4] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
@ 2017-11-07 14:54               ` Christopher Lameter
  0 siblings, 0 replies; 61+ messages in thread
From: Christopher Lameter @ 2017-11-07 14:54 UTC (permalink / raw)
  To: Yisheng Xie
  Cc: Vlastimil Babka, akpm, mhocko, mingo, rientjes, n-horiguchi,
	salls, linux-mm, linux-kernel, tanxiaojun, linux-api, Andi Kleen

On Tue, 7 Nov 2017, Yisheng Xie wrote:

> On 2017/11/6 23:29, Christopher Lameter wrote:
> > On Mon, 6 Nov 2017, Vlastimil Babka wrote:
> >
> >> I'm not sure what exactly is the EPERM intention. Should really the
> >> capability of THIS process override the cpuset restriction of the TARGET
> >> process? Maybe yes. Then, does "insufficient privilege (CAP_SYS_NICE) to
> >
> > CAP_SYS_NICE never overrides cpuset restrictions. The cap can be used to
> > migrate pages that are *also* mapped by other processes (and thus move
> > pages of another process which may have different cpu set restrictions!).
>
> So you means the specified nodes should be a subset of target cpu set, right?

The specified nodes need to be part of the *current* cpu set.

Migrate pages moves the pages of a single process there is no TARGET
process.

Thus thehe *target* nodes need to be a subset of the current cpu set.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 4/4] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
@ 2017-11-07 14:54               ` Christopher Lameter
  0 siblings, 0 replies; 61+ messages in thread
From: Christopher Lameter @ 2017-11-07 14:54 UTC (permalink / raw)
  To: Yisheng Xie
  Cc: Vlastimil Babka, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	mhocko-IBi9RG/b67k, mingo-DgEjT+Ai2ygdnm+yROfE0A,
	rientjes-hpIqsD4AKlfQT0dZR+AlfA,
	n-horiguchi-PaJj6Psr51x8UrSeD/g0lQ, salls-b3bnyZ7c9ISVc3sceRu5cw,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	tanxiaojun-hv44wF8Li93QT0dZR+AlfA,
	linux-api-u79uwXL29TY76Z2rM5mHXA, Andi Kleen

On Tue, 7 Nov 2017, Yisheng Xie wrote:

> On 2017/11/6 23:29, Christopher Lameter wrote:
> > On Mon, 6 Nov 2017, Vlastimil Babka wrote:
> >
> >> I'm not sure what exactly is the EPERM intention. Should really the
> >> capability of THIS process override the cpuset restriction of the TARGET
> >> process? Maybe yes. Then, does "insufficient privilege (CAP_SYS_NICE) to
> >
> > CAP_SYS_NICE never overrides cpuset restrictions. The cap can be used to
> > migrate pages that are *also* mapped by other processes (and thus move
> > pages of another process which may have different cpu set restrictions!).
>
> So you means the specified nodes should be a subset of target cpu set, right?

The specified nodes need to be part of the *current* cpu set.

Migrate pages moves the pages of a single process there is no TARGET
process.

Thus thehe *target* nodes need to be a subset of the current cpu set.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 4/4] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
@ 2017-11-07 14:54               ` Christopher Lameter
  0 siblings, 0 replies; 61+ messages in thread
From: Christopher Lameter @ 2017-11-07 14:54 UTC (permalink / raw)
  To: Yisheng Xie
  Cc: Vlastimil Babka, akpm, mhocko, mingo, rientjes, n-horiguchi,
	salls, linux-mm, linux-kernel, tanxiaojun, linux-api, Andi Kleen

On Tue, 7 Nov 2017, Yisheng Xie wrote:

> On 2017/11/6 23:29, Christopher Lameter wrote:
> > On Mon, 6 Nov 2017, Vlastimil Babka wrote:
> >
> >> I'm not sure what exactly is the EPERM intention. Should really the
> >> capability of THIS process override the cpuset restriction of the TARGET
> >> process? Maybe yes. Then, does "insufficient privilege (CAP_SYS_NICE) to
> >
> > CAP_SYS_NICE never overrides cpuset restrictions. The cap can be used to
> > migrate pages that are *also* mapped by other processes (and thus move
> > pages of another process which may have different cpu set restrictions!).
>
> So you means the specified nodes should be a subset of target cpu set, right?

The specified nodes need to be part of the *current* cpu set.

Migrate pages moves the pages of a single process there is no TARGET
process.

Thus thehe *target* nodes need to be a subset of the current cpu set.


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 4/4] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
  2017-11-07 14:54               ` Christopher Lameter
@ 2017-11-07 15:05                 ` Vlastimil Babka
  -1 siblings, 0 replies; 61+ messages in thread
From: Vlastimil Babka @ 2017-11-07 15:05 UTC (permalink / raw)
  To: Christopher Lameter, Yisheng Xie
  Cc: akpm, mhocko, mingo, rientjes, n-horiguchi, salls, linux-mm,
	linux-kernel, tanxiaojun, linux-api, Andi Kleen

On 11/07/2017 03:54 PM, Christopher Lameter wrote:
> On Tue, 7 Nov 2017, Yisheng Xie wrote:
> 
>> On 2017/11/6 23:29, Christopher Lameter wrote:
>>> On Mon, 6 Nov 2017, Vlastimil Babka wrote:
>>>
>>>> I'm not sure what exactly is the EPERM intention. Should really the
>>>> capability of THIS process override the cpuset restriction of the TARGET
>>>> process? Maybe yes. Then, does "insufficient privilege (CAP_SYS_NICE) to
>>>
>>> CAP_SYS_NICE never overrides cpuset restrictions. The cap can be used to
>>> migrate pages that are *also* mapped by other processes (and thus move
>>> pages of another process which may have different cpu set restrictions!).
>>
>> So you means the specified nodes should be a subset of target cpu set, right?
> 
> The specified nodes need to be part of the *current* cpu set.
> 
> Migrate pages moves the pages of a single process there is no TARGET
> process.

migrate_pages(2) takes a pid argument

"migrate_pages()  attempts  to  move all pages of the process pid that
are in memory nodes old_nodes to the memory nodes in new_nodes. "

> Thus thehe *target* nodes need to be a subset of the current cpu set.
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-api" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 4/4] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
@ 2017-11-07 15:05                 ` Vlastimil Babka
  0 siblings, 0 replies; 61+ messages in thread
From: Vlastimil Babka @ 2017-11-07 15:05 UTC (permalink / raw)
  To: Christopher Lameter, Yisheng Xie
  Cc: akpm, mhocko, mingo, rientjes, n-horiguchi, salls, linux-mm,
	linux-kernel, tanxiaojun, linux-api, Andi Kleen

On 11/07/2017 03:54 PM, Christopher Lameter wrote:
> On Tue, 7 Nov 2017, Yisheng Xie wrote:
> 
>> On 2017/11/6 23:29, Christopher Lameter wrote:
>>> On Mon, 6 Nov 2017, Vlastimil Babka wrote:
>>>
>>>> I'm not sure what exactly is the EPERM intention. Should really the
>>>> capability of THIS process override the cpuset restriction of the TARGET
>>>> process? Maybe yes. Then, does "insufficient privilege (CAP_SYS_NICE) to
>>>
>>> CAP_SYS_NICE never overrides cpuset restrictions. The cap can be used to
>>> migrate pages that are *also* mapped by other processes (and thus move
>>> pages of another process which may have different cpu set restrictions!).
>>
>> So you means the specified nodes should be a subset of target cpu set, right?
> 
> The specified nodes need to be part of the *current* cpu set.
> 
> Migrate pages moves the pages of a single process there is no TARGET
> process.

migrate_pages(2) takes a pid argument

"migrate_pages()  attempts  to  move all pages of the process pid that
are in memory nodes old_nodes to the memory nodes in new_nodes. "

> Thus thehe *target* nodes need to be a subset of the current cpu set.
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-api" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 4/4] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
  2017-11-07 15:05                 ` Vlastimil Babka
@ 2017-11-07 15:55                   ` Christopher Lameter
  -1 siblings, 0 replies; 61+ messages in thread
From: Christopher Lameter @ 2017-11-07 15:55 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Yisheng Xie, akpm, mhocko, mingo, rientjes, n-horiguchi, salls,
	linux-mm, linux-kernel, tanxiaojun, linux-api, Andi Kleen

On Tue, 7 Nov 2017, Vlastimil Babka wrote:

> > Migrate pages moves the pages of a single process there is no TARGET
> > process.
>
> migrate_pages(2) takes a pid argument
>
> "migrate_pages()  attempts  to  move all pages of the process pid that
> are in memory nodes old_nodes to the memory nodes in new_nodes. "

Ok missed that. Most use cases here are on the current process.

Fundamentally a process can have shared pages outside of the cpuset that
a process is restricted to. Thus I would think that migration to any of
the allowed nodes of the current process that is calling migrate pages
is ok. The caller wants this and the caller has a right to allocate on
these nodes. It would be strange if migrate_pages would allow allocation
outside of the current cpuset.

> > Thus thehe *target* nodes need to be a subset of the current cpu set.

And therefore the above still holds.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 4/4] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
@ 2017-11-07 15:55                   ` Christopher Lameter
  0 siblings, 0 replies; 61+ messages in thread
From: Christopher Lameter @ 2017-11-07 15:55 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Yisheng Xie, akpm, mhocko, mingo, rientjes, n-horiguchi, salls,
	linux-mm, linux-kernel, tanxiaojun, linux-api, Andi Kleen

On Tue, 7 Nov 2017, Vlastimil Babka wrote:

> > Migrate pages moves the pages of a single process there is no TARGET
> > process.
>
> migrate_pages(2) takes a pid argument
>
> "migrate_pages()  attempts  to  move all pages of the process pid that
> are in memory nodes old_nodes to the memory nodes in new_nodes. "

Ok missed that. Most use cases here are on the current process.

Fundamentally a process can have shared pages outside of the cpuset that
a process is restricted to. Thus I would think that migration to any of
the allowed nodes of the current process that is calling migrate pages
is ok. The caller wants this and the caller has a right to allocate on
these nodes. It would be strange if migrate_pages would allow allocation
outside of the current cpuset.

> > Thus thehe *target* nodes need to be a subset of the current cpu set.

And therefore the above still holds.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 4/4] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
  2017-11-07 15:55                   ` Christopher Lameter
  (?)
@ 2017-11-08  1:38                     ` Yisheng Xie
  -1 siblings, 0 replies; 61+ messages in thread
From: Yisheng Xie @ 2017-11-08  1:38 UTC (permalink / raw)
  To: Christopher Lameter, Vlastimil Babka
  Cc: akpm, mhocko, mingo, rientjes, n-horiguchi, salls, linux-mm,
	linux-kernel, tanxiaojun, linux-api, Andi Kleen

Hi Christopher,

On 2017/11/7 23:55, Christopher Lameter wrote:
> On Tue, 7 Nov 2017, Vlastimil Babka wrote:
> 
>>> Migrate pages moves the pages of a single process there is no TARGET
>>> process.
>>
>> migrate_pages(2) takes a pid argument
>>
>> "migrate_pages()  attempts  to  move all pages of the process pid that
>> are in memory nodes old_nodes to the memory nodes in new_nodes. "
> 
> Ok missed that. Most use cases here are on the current process.

Yeah, so most case current process is the same as target process.
But maybe I still miss someting, see below:

> 
> Fundamentally a process can have shared pages outside of the cpuset that
> a process is restricted to. Thus I would think that migration to any of
> the allowed nodes of the current process that is calling migrate pages
> is ok. The caller wants this and the caller has a right to allocate on
> these nodes. It would be strange if migrate_pages would allow allocation
> outside of the current cpuset.
> 
>>> Thus thehe *target* nodes need to be a subset of the current cpu set.
> 
> And therefore the above still holds.

Another case is current process is *not* the same as target process, and
when current process try to migrate pages of target process from old_nodes
to new_nodes, the new_nodes should be a subset of target process cpuset.
CAP_SYS_NICE will insure that current process have the privilege, and will
not overwrite the restriction of target process cpuset.

However, for the current cpuset restriction, as manpage says :
  EINVAL... Or, _none_ of the node IDs specified by new_nodes are
  on-line and allowed by the process's current cpuset context, or none of
  the specified nodes contain memory.

So for current cpuset restriction, an intersection check should be enough
instead of subset? And it will also make sure migrate_pages will not
allocate pages outside of the current cpuset.

> 
> .
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 4/4] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
@ 2017-11-08  1:38                     ` Yisheng Xie
  0 siblings, 0 replies; 61+ messages in thread
From: Yisheng Xie @ 2017-11-08  1:38 UTC (permalink / raw)
  To: Christopher Lameter, Vlastimil Babka
  Cc: akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, mhocko-IBi9RG/b67k,
	mingo-DgEjT+Ai2ygdnm+yROfE0A, rientjes-hpIqsD4AKlfQT0dZR+AlfA,
	n-horiguchi-PaJj6Psr51x8UrSeD/g0lQ, salls-b3bnyZ7c9ISVc3sceRu5cw,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	tanxiaojun-hv44wF8Li93QT0dZR+AlfA,
	linux-api-u79uwXL29TY76Z2rM5mHXA, Andi Kleen

Hi Christopher,

On 2017/11/7 23:55, Christopher Lameter wrote:
> On Tue, 7 Nov 2017, Vlastimil Babka wrote:
> 
>>> Migrate pages moves the pages of a single process there is no TARGET
>>> process.
>>
>> migrate_pages(2) takes a pid argument
>>
>> "migrate_pages()  attempts  to  move all pages of the process pid that
>> are in memory nodes old_nodes to the memory nodes in new_nodes. "
> 
> Ok missed that. Most use cases here are on the current process.

Yeah, so most case current process is the same as target process.
But maybe I still miss someting, see below:

> 
> Fundamentally a process can have shared pages outside of the cpuset that
> a process is restricted to. Thus I would think that migration to any of
> the allowed nodes of the current process that is calling migrate pages
> is ok. The caller wants this and the caller has a right to allocate on
> these nodes. It would be strange if migrate_pages would allow allocation
> outside of the current cpuset.
> 
>>> Thus thehe *target* nodes need to be a subset of the current cpu set.
> 
> And therefore the above still holds.

Another case is current process is *not* the same as target process, and
when current process try to migrate pages of target process from old_nodes
to new_nodes, the new_nodes should be a subset of target process cpuset.
CAP_SYS_NICE will insure that current process have the privilege, and will
not overwrite the restriction of target process cpuset.

However, for the current cpuset restriction, as manpage says :
  EINVAL... Or, _none_ of the node IDs specified by new_nodes are
  on-line and allowed by the process's current cpuset context, or none of
  the specified nodes contain memory.

So for current cpuset restriction, an intersection check should be enough
instead of subset? And it will also make sure migrate_pages will not
allocate pages outside of the current cpuset.

> 
> .
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 4/4] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
@ 2017-11-08  1:38                     ` Yisheng Xie
  0 siblings, 0 replies; 61+ messages in thread
From: Yisheng Xie @ 2017-11-08  1:38 UTC (permalink / raw)
  To: Christopher Lameter, Vlastimil Babka
  Cc: akpm, mhocko, mingo, rientjes, n-horiguchi, salls, linux-mm,
	linux-kernel, tanxiaojun, linux-api, Andi Kleen

Hi Christopher,

On 2017/11/7 23:55, Christopher Lameter wrote:
> On Tue, 7 Nov 2017, Vlastimil Babka wrote:
> 
>>> Migrate pages moves the pages of a single process there is no TARGET
>>> process.
>>
>> migrate_pages(2) takes a pid argument
>>
>> "migrate_pages()  attempts  to  move all pages of the process pid that
>> are in memory nodes old_nodes to the memory nodes in new_nodes. "
> 
> Ok missed that. Most use cases here are on the current process.

Yeah, so most case current process is the same as target process.
But maybe I still miss someting, see below:

> 
> Fundamentally a process can have shared pages outside of the cpuset that
> a process is restricted to. Thus I would think that migration to any of
> the allowed nodes of the current process that is calling migrate pages
> is ok. The caller wants this and the caller has a right to allocate on
> these nodes. It would be strange if migrate_pages would allow allocation
> outside of the current cpuset.
> 
>>> Thus thehe *target* nodes need to be a subset of the current cpu set.
> 
> And therefore the above still holds.

Another case is current process is *not* the same as target process, and
when current process try to migrate pages of target process from old_nodes
to new_nodes, the new_nodes should be a subset of target process cpuset.
CAP_SYS_NICE will insure that current process have the privilege, and will
not overwrite the restriction of target process cpuset.

However, for the current cpuset restriction, as manpage says :
  EINVAL... Or, _none_ of the node IDs specified by new_nodes are
  on-line and allowed by the process's current cpuset context, or none of
  the specified nodes contain memory.

So for current cpuset restriction, an intersection check should be enough
instead of subset? And it will also make sure migrate_pages will not
allocate pages outside of the current cpuset.

> 
> .
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 4/4] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
@ 2017-11-08 15:02                       ` Christopher Lameter
  0 siblings, 0 replies; 61+ messages in thread
From: Christopher Lameter @ 2017-11-08 15:02 UTC (permalink / raw)
  To: Yisheng Xie
  Cc: Vlastimil Babka, akpm, mhocko, mingo, rientjes, n-horiguchi,
	salls, linux-mm, linux-kernel, tanxiaojun, linux-api, Andi Kleen

On Wed, 8 Nov 2017, Yisheng Xie wrote:

> Another case is current process is *not* the same as target process, and
> when current process try to migrate pages of target process from old_nodes
> to new_nodes, the new_nodes should be a subset of target process cpuset.

The caller of migrate_pages should be able to migrate the target process
pages anywhere the caller can allocate memory. If that is outside the
target processes cpuset then that is fine. Pagecache pages that are not
allocated by the target process already are not subject to the target
processes restriction. So this is not that unusual.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 4/4] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
@ 2017-11-08 15:02                       ` Christopher Lameter
  0 siblings, 0 replies; 61+ messages in thread
From: Christopher Lameter @ 2017-11-08 15:02 UTC (permalink / raw)
  To: Yisheng Xie
  Cc: Vlastimil Babka, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	mhocko-IBi9RG/b67k, mingo-DgEjT+Ai2ygdnm+yROfE0A,
	rientjes-hpIqsD4AKlfQT0dZR+AlfA,
	n-horiguchi-PaJj6Psr51x8UrSeD/g0lQ, salls-b3bnyZ7c9ISVc3sceRu5cw,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	tanxiaojun-hv44wF8Li93QT0dZR+AlfA,
	linux-api-u79uwXL29TY76Z2rM5mHXA, Andi Kleen

On Wed, 8 Nov 2017, Yisheng Xie wrote:

> Another case is current process is *not* the same as target process, and
> when current process try to migrate pages of target process from old_nodes
> to new_nodes, the new_nodes should be a subset of target process cpuset.

The caller of migrate_pages should be able to migrate the target process
pages anywhere the caller can allocate memory. If that is outside the
target processes cpuset then that is fine. Pagecache pages that are not
allocated by the target process already are not subject to the target
processes restriction. So this is not that unusual.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 4/4] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
@ 2017-11-08 15:02                       ` Christopher Lameter
  0 siblings, 0 replies; 61+ messages in thread
From: Christopher Lameter @ 2017-11-08 15:02 UTC (permalink / raw)
  To: Yisheng Xie
  Cc: Vlastimil Babka, akpm, mhocko, mingo, rientjes, n-horiguchi,
	salls, linux-mm, linux-kernel, tanxiaojun, linux-api, Andi Kleen

On Wed, 8 Nov 2017, Yisheng Xie wrote:

> Another case is current process is *not* the same as target process, and
> when current process try to migrate pages of target process from old_nodes
> to new_nodes, the new_nodes should be a subset of target process cpuset.

The caller of migrate_pages should be able to migrate the target process
pages anywhere the caller can allocate memory. If that is outside the
target processes cpuset then that is fine. Pagecache pages that are not
allocated by the target process already are not subject to the target
processes restriction. So this is not that unusual.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 4/4] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
  2017-11-08 15:02                       ` Christopher Lameter
  (?)
@ 2017-11-09 10:54                         ` Yisheng Xie
  -1 siblings, 0 replies; 61+ messages in thread
From: Yisheng Xie @ 2017-11-09 10:54 UTC (permalink / raw)
  To: Christopher Lameter
  Cc: Vlastimil Babka, akpm, mhocko, mingo, rientjes, n-horiguchi,
	salls, linux-mm, linux-kernel, tanxiaojun, linux-api, Andi Kleen

Hi Christopher,

On 2017/11/8 23:02, Christopher Lameter wrote:
> On Wed, 8 Nov 2017, Yisheng Xie wrote:
> 
>> Another case is current process is *not* the same as target process, and
>> when current process try to migrate pages of target process from old_nodes
>> to new_nodes, the new_nodes should be a subset of target process cpuset.
> 
> The caller of migrate_pages should be able to migrate the target process
> pages anywhere the caller can allocate memory. If that is outside the
> target processes cpuset then that is fine. Pagecache pages that are not
> allocated by the target process already are not subject to the target
> processes restriction. So this is not that unusual.

So there is no need to check the restriction of target process cpuset, right?
I hope that I do not miss anything :)

Thanks
Yisheng Xie
> 
> .
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 4/4] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
@ 2017-11-09 10:54                         ` Yisheng Xie
  0 siblings, 0 replies; 61+ messages in thread
From: Yisheng Xie @ 2017-11-09 10:54 UTC (permalink / raw)
  To: Christopher Lameter
  Cc: Vlastimil Babka, akpm, mhocko, mingo, rientjes, n-horiguchi,
	salls, linux-mm, linux-kernel, tanxiaojun, linux-api, Andi Kleen

Hi Christopher,

On 2017/11/8 23:02, Christopher Lameter wrote:
> On Wed, 8 Nov 2017, Yisheng Xie wrote:
> 
>> Another case is current process is *not* the same as target process, and
>> when current process try to migrate pages of target process from old_nodes
>> to new_nodes, the new_nodes should be a subset of target process cpuset.
> 
> The caller of migrate_pages should be able to migrate the target process
> pages anywhere the caller can allocate memory. If that is outside the
> target processes cpuset then that is fine. Pagecache pages that are not
> allocated by the target process already are not subject to the target
> processes restriction. So this is not that unusual.

So there is no need to check the restriction of target process cpuset, right?
I hope that I do not miss anything :)

Thanks
Yisheng Xie
> 
> .
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 4/4] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
@ 2017-11-09 10:54                         ` Yisheng Xie
  0 siblings, 0 replies; 61+ messages in thread
From: Yisheng Xie @ 2017-11-09 10:54 UTC (permalink / raw)
  To: Christopher Lameter
  Cc: Vlastimil Babka, akpm, mhocko, mingo, rientjes, n-horiguchi,
	salls, linux-mm, linux-kernel, tanxiaojun, linux-api, Andi Kleen

Hi Christopher,

On 2017/11/8 23:02, Christopher Lameter wrote:
> On Wed, 8 Nov 2017, Yisheng Xie wrote:
> 
>> Another case is current process is *not* the same as target process, and
>> when current process try to migrate pages of target process from old_nodes
>> to new_nodes, the new_nodes should be a subset of target process cpuset.
> 
> The caller of migrate_pages should be able to migrate the target process
> pages anywhere the caller can allocate memory. If that is outside the
> target processes cpuset then that is fine. Pagecache pages that are not
> allocated by the target process already are not subject to the target
> processes restriction. So this is not that unusual.

So there is no need to check the restriction of target process cpuset, right?
I hope that I do not miss anything :)

Thanks
Yisheng Xie
> 
> .
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 4/4] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
@ 2017-11-09 15:46                           ` Christopher Lameter
  0 siblings, 0 replies; 61+ messages in thread
From: Christopher Lameter @ 2017-11-09 15:46 UTC (permalink / raw)
  To: Yisheng Xie
  Cc: Vlastimil Babka, akpm, mhocko, mingo, rientjes, n-horiguchi,
	salls, linux-mm, linux-kernel, tanxiaojun, linux-api, Andi Kleen

On Thu, 9 Nov 2017, Yisheng Xie wrote:

> > The caller of migrate_pages should be able to migrate the target process
> > pages anywhere the caller can allocate memory. If that is outside the
> > target processes cpuset then that is fine. Pagecache pages that are not
> > allocated by the target process already are not subject to the target
> > processes restriction. So this is not that unusual.
>
> So there is no need to check the restriction of target process cpuset, right?
> I hope that I do not miss anything :)

Exactly.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 4/4] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
@ 2017-11-09 15:46                           ` Christopher Lameter
  0 siblings, 0 replies; 61+ messages in thread
From: Christopher Lameter @ 2017-11-09 15:46 UTC (permalink / raw)
  To: Yisheng Xie
  Cc: Vlastimil Babka, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	mhocko-IBi9RG/b67k, mingo-DgEjT+Ai2ygdnm+yROfE0A,
	rientjes-hpIqsD4AKlfQT0dZR+AlfA,
	n-horiguchi-PaJj6Psr51x8UrSeD/g0lQ, salls-b3bnyZ7c9ISVc3sceRu5cw,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	tanxiaojun-hv44wF8Li93QT0dZR+AlfA,
	linux-api-u79uwXL29TY76Z2rM5mHXA, Andi Kleen

On Thu, 9 Nov 2017, Yisheng Xie wrote:

> > The caller of migrate_pages should be able to migrate the target process
> > pages anywhere the caller can allocate memory. If that is outside the
> > target processes cpuset then that is fine. Pagecache pages that are not
> > allocated by the target process already are not subject to the target
> > processes restriction. So this is not that unusual.
>
> So there is no need to check the restriction of target process cpuset, right?
> I hope that I do not miss anything :)

Exactly.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH RFC v2 4/4] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
@ 2017-11-09 15:46                           ` Christopher Lameter
  0 siblings, 0 replies; 61+ messages in thread
From: Christopher Lameter @ 2017-11-09 15:46 UTC (permalink / raw)
  To: Yisheng Xie
  Cc: Vlastimil Babka, akpm, mhocko, mingo, rientjes, n-horiguchi,
	salls, linux-mm, linux-kernel, tanxiaojun, linux-api, Andi Kleen

On Thu, 9 Nov 2017, Yisheng Xie wrote:

> > The caller of migrate_pages should be able to migrate the target process
> > pages anywhere the caller can allocate memory. If that is outside the
> > target processes cpuset then that is fine. Pagecache pages that are not
> > allocated by the target process already are not subject to the target
> > processes restriction. So this is not that unusual.
>
> So there is no need to check the restriction of target process cpuset, right?
> I hope that I do not miss anything :)

Exactly.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 61+ messages in thread

end of thread, other threads:[~2017-11-09 15:46 UTC | newest]

Thread overview: 61+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-10-27 10:14 [PATCH RFC v2 0/4] some fixes and clean up for mempolicy Yisheng Xie
2017-10-27 10:14 ` Yisheng Xie
2017-10-27 10:14 ` [PATCH RFC v2 1/4] mm/mempolicy: Fix get_nodes() mask miscalculation Yisheng Xie
2017-10-27 10:14   ` Yisheng Xie
2017-10-27 10:14   ` Yisheng Xie
2017-10-31  8:34   ` Vlastimil Babka
2017-10-31  8:34     ` Vlastimil Babka
2017-11-01  9:37     ` Yisheng Xie
2017-11-01  9:37       ` Yisheng Xie
2017-11-01  9:37       ` Yisheng Xie
2017-10-27 10:14 ` [PATCH RFC v2 2/4] mm/mempolicy: remove redundant check in get_nodes Yisheng Xie
2017-10-27 10:14   ` Yisheng Xie
2017-10-27 10:14   ` Yisheng Xie
2017-10-31  8:55   ` Vlastimil Babka
2017-10-31  8:55     ` Vlastimil Babka
2017-10-27 10:14 ` [PATCH RFC v2 3/4] mm/mempolicy: fix the check of nodemask from user Yisheng Xie
2017-10-27 10:14   ` Yisheng Xie
2017-10-31  9:30   ` Vlastimil Babka
2017-10-31  9:30     ` Vlastimil Babka
2017-10-31 11:01     ` Yisheng Xie
2017-10-31 11:01       ` Yisheng Xie
2017-10-31 11:01       ` Yisheng Xie
2017-10-31 11:28       ` Vlastimil Babka
2017-10-31 11:28         ` Vlastimil Babka
2017-10-31 11:28         ` Vlastimil Babka
2017-10-27 10:14 ` [PATCH RFC v2 4/4] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages Yisheng Xie
2017-10-27 10:14   ` Yisheng Xie
2017-10-31  9:46   ` Vlastimil Babka
2017-10-31  9:46     ` Vlastimil Babka
2017-10-31  9:46     ` Vlastimil Babka
2017-11-06  1:31     ` Yisheng Xie
2017-11-06  1:31       ` Yisheng Xie
2017-11-06  1:31       ` Yisheng Xie
2017-11-06  7:39       ` Vlastimil Babka
2017-11-06  7:39         ` Vlastimil Babka
2017-11-06  7:39         ` Vlastimil Babka
2017-11-06 15:29         ` Christopher Lameter
2017-11-06 15:29           ` Christopher Lameter
2017-11-06 15:29           ` Christopher Lameter
2017-11-07 11:23           ` Yisheng Xie
2017-11-07 11:23             ` Yisheng Xie
2017-11-07 11:23             ` Yisheng Xie
2017-11-07 14:54             ` Christopher Lameter
2017-11-07 14:54               ` Christopher Lameter
2017-11-07 14:54               ` Christopher Lameter
2017-11-07 15:05               ` Vlastimil Babka
2017-11-07 15:05                 ` Vlastimil Babka
2017-11-07 15:55                 ` Christopher Lameter
2017-11-07 15:55                   ` Christopher Lameter
2017-11-08  1:38                   ` Yisheng Xie
2017-11-08  1:38                     ` Yisheng Xie
2017-11-08  1:38                     ` Yisheng Xie
2017-11-08 15:02                     ` Christopher Lameter
2017-11-08 15:02                       ` Christopher Lameter
2017-11-08 15:02                       ` Christopher Lameter
2017-11-09 10:54                       ` Yisheng Xie
2017-11-09 10:54                         ` Yisheng Xie
2017-11-09 10:54                         ` Yisheng Xie
2017-11-09 15:46                         ` Christopher Lameter
2017-11-09 15:46                           ` Christopher Lameter
2017-11-09 15:46                           ` Christopher Lameter

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.