All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/3] some fixes and clean up for mempolicy
@ 2017-11-17  1:37 ` Yisheng Xie
  0 siblings, 0 replies; 23+ messages in thread
From: Yisheng Xie @ 2017-11-17  1:37 UTC (permalink / raw)
  To: akpm, vbabka, mhocko, mingo, rientjes, n-horiguchi, salls, ak, cl
  Cc: linux-mm, linux-kernel, linux-api, tanxiaojun

This patchset is triggered by Xiojun's report of ltp test fail[1],
and I have sent a patch to resolve it by check nodes_empty of new_nodes[2].

The new version is to follow Vlastimil's suggestion, which fix by checking
the new_nodes value in function get_nodes. And I just split them to small
patches for easy to review and discussion. For more detail, please look
into each patches.

Change logs of v3:
 * remove patch get_nodes's mask miscalculation
 * check whether node is empty after AND current task node, and then nodes
   which have memory.

Change logs of v2:
 * fix get_nodes's mask miscalculation
 * remove redundant check in get_nodes
 * fix the check of nodemask from user - per Vlastimil

Any comment and complain is welome.

Thanks
Yisheng Xie

[1] https://patchwork.kernel.org/patch/10012005/
[2] https://patchwork.kernel.org/patch/10013329/

Yisheng Xie (3):
  mm/mempolicy: remove redundant check in get_nodes
  mm/mempolicy: fix the check of nodemask from user
  mm/mempolicy: add nodes_empty check in SYSC_migrate_pages

 mm/mempolicy.c | 35 +++++++++++++++++++++++++++--------
 1 file changed, 27 insertions(+), 8 deletions(-)

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v3 0/3] some fixes and clean up for mempolicy
@ 2017-11-17  1:37 ` Yisheng Xie
  0 siblings, 0 replies; 23+ messages in thread
From: Yisheng Xie @ 2017-11-17  1:37 UTC (permalink / raw)
  To: akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, vbabka-AlSwsSmVLrQ,
	mhocko-IBi9RG/b67k, mingo-DgEjT+Ai2ygdnm+yROfE0A,
	rientjes-hpIqsD4AKlfQT0dZR+AlfA,
	n-horiguchi-PaJj6Psr51x8UrSeD/g0lQ, salls-b3bnyZ7c9ISVc3sceRu5cw,
	ak-VuQAYsv1563Yd54FQh9/CA, cl-vYTEC60ixJUAvxtiuMwx3w
  Cc: linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	tanxiaojun-hv44wF8Li93QT0dZR+AlfA

This patchset is triggered by Xiojun's report of ltp test fail[1],
and I have sent a patch to resolve it by check nodes_empty of new_nodes[2].

The new version is to follow Vlastimil's suggestion, which fix by checking
the new_nodes value in function get_nodes. And I just split them to small
patches for easy to review and discussion. For more detail, please look
into each patches.

Change logs of v3:
 * remove patch get_nodes's mask miscalculation
 * check whether node is empty after AND current task node, and then nodes
   which have memory.

Change logs of v2:
 * fix get_nodes's mask miscalculation
 * remove redundant check in get_nodes
 * fix the check of nodemask from user - per Vlastimil

Any comment and complain is welome.

Thanks
Yisheng Xie

[1] https://patchwork.kernel.org/patch/10012005/
[2] https://patchwork.kernel.org/patch/10013329/

Yisheng Xie (3):
  mm/mempolicy: remove redundant check in get_nodes
  mm/mempolicy: fix the check of nodemask from user
  mm/mempolicy: add nodes_empty check in SYSC_migrate_pages

 mm/mempolicy.c | 35 +++++++++++++++++++++++++++--------
 1 file changed, 27 insertions(+), 8 deletions(-)

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v3 0/3] some fixes and clean up for mempolicy
@ 2017-11-17  1:37 ` Yisheng Xie
  0 siblings, 0 replies; 23+ messages in thread
From: Yisheng Xie @ 2017-11-17  1:37 UTC (permalink / raw)
  To: akpm, vbabka, mhocko, mingo, rientjes, n-horiguchi, salls, ak, cl
  Cc: linux-mm, linux-kernel, linux-api, tanxiaojun

This patchset is triggered by Xiojun's report of ltp test fail[1],
and I have sent a patch to resolve it by check nodes_empty of new_nodes[2].

The new version is to follow Vlastimil's suggestion, which fix by checking
the new_nodes value in function get_nodes. And I just split them to small
patches for easy to review and discussion. For more detail, please look
into each patches.

Change logs of v3:
 * remove patch get_nodes's mask miscalculation
 * check whether node is empty after AND current task node, and then nodes
   which have memory.

Change logs of v2:
 * fix get_nodes's mask miscalculation
 * remove redundant check in get_nodes
 * fix the check of nodemask from user - per Vlastimil

Any comment and complain is welome.

Thanks
Yisheng Xie

[1] https://patchwork.kernel.org/patch/10012005/
[2] https://patchwork.kernel.org/patch/10013329/

Yisheng Xie (3):
  mm/mempolicy: remove redundant check in get_nodes
  mm/mempolicy: fix the check of nodemask from user
  mm/mempolicy: add nodes_empty check in SYSC_migrate_pages

 mm/mempolicy.c | 35 +++++++++++++++++++++++++++--------
 1 file changed, 27 insertions(+), 8 deletions(-)

-- 
1.8.3.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v3 1/3] mm/mempolicy: remove redundant check in get_nodes
  2017-11-17  1:37 ` Yisheng Xie
@ 2017-11-17  1:37   ` Yisheng Xie
  -1 siblings, 0 replies; 23+ messages in thread
From: Yisheng Xie @ 2017-11-17  1:37 UTC (permalink / raw)
  To: akpm, vbabka, mhocko, mingo, rientjes, n-horiguchi, salls, ak, cl
  Cc: linux-mm, linux-kernel, linux-api, tanxiaojun

We have already checked whether maxnode is a page worth of bits, by:
    maxnode > PAGE_SIZE*BITS_PER_BYTE

So no need to check it once more.

Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
---
 mm/mempolicy.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 4ce44d3..6e867a8 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1282,8 +1282,6 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
 	/* When the user specified more nodes than supported just check
 	   if the non supported part is all zero. */
 	if (nlongs > BITS_TO_LONGS(MAX_NUMNODES)) {
-		if (nlongs > PAGE_SIZE/sizeof(long))
-			return -EINVAL;
 		for (k = BITS_TO_LONGS(MAX_NUMNODES); k < nlongs; k++) {
 			unsigned long t;
 			if (get_user(t, nmask + k))
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 1/3] mm/mempolicy: remove redundant check in get_nodes
@ 2017-11-17  1:37   ` Yisheng Xie
  0 siblings, 0 replies; 23+ messages in thread
From: Yisheng Xie @ 2017-11-17  1:37 UTC (permalink / raw)
  To: akpm, vbabka, mhocko, mingo, rientjes, n-horiguchi, salls, ak, cl
  Cc: linux-mm, linux-kernel, linux-api, tanxiaojun

We have already checked whether maxnode is a page worth of bits, by:
    maxnode > PAGE_SIZE*BITS_PER_BYTE

So no need to check it once more.

Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
---
 mm/mempolicy.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 4ce44d3..6e867a8 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1282,8 +1282,6 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
 	/* When the user specified more nodes than supported just check
 	   if the non supported part is all zero. */
 	if (nlongs > BITS_TO_LONGS(MAX_NUMNODES)) {
-		if (nlongs > PAGE_SIZE/sizeof(long))
-			return -EINVAL;
 		for (k = BITS_TO_LONGS(MAX_NUMNODES); k < nlongs; k++) {
 			unsigned long t;
 			if (get_user(t, nmask + k))
-- 
1.8.3.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 2/3] mm/mempolicy: fix the check of nodemask from user
  2017-11-17  1:37 ` Yisheng Xie
@ 2017-11-17  1:37   ` Yisheng Xie
  -1 siblings, 0 replies; 23+ messages in thread
From: Yisheng Xie @ 2017-11-17  1:37 UTC (permalink / raw)
  To: akpm, vbabka, mhocko, mingo, rientjes, n-horiguchi, salls, ak, cl
  Cc: linux-mm, linux-kernel, linux-api, tanxiaojun

As Xiaojun reported the ltp of migrate_pages01 will failed on ARCH arm64
system which has 4 nodes[0...3], all have memory and CONFIG_NODES_SHIFT=2:

migrate_pages01    0  TINFO  :  test_invalid_nodes
migrate_pages01   14  TFAIL  :  migrate_pages_common.c:45: unexpected failure - returned value = 0, expected: -1
migrate_pages01   15  TFAIL  :  migrate_pages_common.c:55: call succeeded unexpectedly

In this case the test_invalid_nodes of migrate_pages01 will call:
SYSC_migrate_pages as:

migrate_pages(0, , {0x0000000000000001}, 64, , {0x0000000000000010}, 64) = 0

The new nodes specifies one or more node IDs that are greater than the
maximum supported node ID, however, the errno is not set to EINVAL as
expected.

As man pages of set_mempolicy[1], mbind[2], and migrate_pages[3] memtioned,
when nodemask specifies one or more node IDs that are greater than the
maximum supported node ID, the errno should set to EINVAL. However, get_nodes
only check whether the part of bits [BITS_PER_LONG*BITS_TO_LONGS(MAX_NUMNODES),
maxnode) is zero or not,  and remain [MAX_NUMNODES, BITS_PER_LONG*BITS_TO_LONGS(MAX_NUMNODES)
unchecked.

This patch is to check the bits of [MAX_NUMNODES, maxnode) in get_nodes to
let migrate_pages set the errno to EINVAL when nodemask specifies one or
more node IDs that are greater than the maximum supported node ID, which
follows the manpage's guide.

[1] http://man7.org/linux/man-pages/man2/set_mempolicy.2.html
[2] http://man7.org/linux/man-pages/man2/mbind.2.html
[3] http://man7.org/linux/man-pages/man2/migrate_pages.2.html

Reported-by: Tan Xiaojun <tanxiaojun@huawei.com>
Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
---
 mm/mempolicy.c | 23 ++++++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 6e867a8..65df28d 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1263,6 +1263,7 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
 		     unsigned long maxnode)
 {
 	unsigned long k;
+	unsigned long t;
 	unsigned long nlongs;
 	unsigned long endmask;
 
@@ -1279,11 +1280,17 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
 	else
 		endmask = (1UL << (maxnode % BITS_PER_LONG)) - 1;
 
-	/* When the user specified more nodes than supported just check
-	   if the non supported part is all zero. */
+	/*
+	 * When the user specified more nodes than supported just check
+	 * if the non supported part is all zero.
+	 *
+	 * If maxnode have more longs than MAX_NUMNODES, check
+	 * the bits in that area first. And then go through to
+	 * check the rest bits which equal or bigger than MAX_NUMNODES.
+	 * Otherwise, just check bits [MAX_NUMNODES, maxnode).
+	 */
 	if (nlongs > BITS_TO_LONGS(MAX_NUMNODES)) {
 		for (k = BITS_TO_LONGS(MAX_NUMNODES); k < nlongs; k++) {
-			unsigned long t;
 			if (get_user(t, nmask + k))
 				return -EFAULT;
 			if (k == nlongs - 1) {
@@ -1296,6 +1303,16 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
 		endmask = ~0UL;
 	}
 
+	if (maxnode > MAX_NUMNODES && MAX_NUMNODES % BITS_PER_LONG != 0) {
+		unsigned long valid_mask = endmask;
+
+		valid_mask &= ~((1UL << (MAX_NUMNODES % BITS_PER_LONG)) - 1);
+		if (get_user(t, nmask + nlongs - 1))
+			return -EFAULT;
+		if (t & valid_mask)
+			return -EINVAL;
+	}
+
 	if (copy_from_user(nodes_addr(*nodes), nmask, nlongs*sizeof(unsigned long)))
 		return -EFAULT;
 	nodes_addr(*nodes)[nlongs-1] &= endmask;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 2/3] mm/mempolicy: fix the check of nodemask from user
@ 2017-11-17  1:37   ` Yisheng Xie
  0 siblings, 0 replies; 23+ messages in thread
From: Yisheng Xie @ 2017-11-17  1:37 UTC (permalink / raw)
  To: akpm, vbabka, mhocko, mingo, rientjes, n-horiguchi, salls, ak, cl
  Cc: linux-mm, linux-kernel, linux-api, tanxiaojun

As Xiaojun reported the ltp of migrate_pages01 will failed on ARCH arm64
system which has 4 nodes[0...3], all have memory and CONFIG_NODES_SHIFT=2:

migrate_pages01    0  TINFO  :  test_invalid_nodes
migrate_pages01   14  TFAIL  :  migrate_pages_common.c:45: unexpected failure - returned value = 0, expected: -1
migrate_pages01   15  TFAIL  :  migrate_pages_common.c:55: call succeeded unexpectedly

In this case the test_invalid_nodes of migrate_pages01 will call:
SYSC_migrate_pages as:

migrate_pages(0, , {0x0000000000000001}, 64, , {0x0000000000000010}, 64) = 0

The new nodes specifies one or more node IDs that are greater than the
maximum supported node ID, however, the errno is not set to EINVAL as
expected.

As man pages of set_mempolicy[1], mbind[2], and migrate_pages[3] memtioned,
when nodemask specifies one or more node IDs that are greater than the
maximum supported node ID, the errno should set to EINVAL. However, get_nodes
only check whether the part of bits [BITS_PER_LONG*BITS_TO_LONGS(MAX_NUMNODES),
maxnode) is zero or not,  and remain [MAX_NUMNODES, BITS_PER_LONG*BITS_TO_LONGS(MAX_NUMNODES)
unchecked.

This patch is to check the bits of [MAX_NUMNODES, maxnode) in get_nodes to
let migrate_pages set the errno to EINVAL when nodemask specifies one or
more node IDs that are greater than the maximum supported node ID, which
follows the manpage's guide.

[1] http://man7.org/linux/man-pages/man2/set_mempolicy.2.html
[2] http://man7.org/linux/man-pages/man2/mbind.2.html
[3] http://man7.org/linux/man-pages/man2/migrate_pages.2.html

Reported-by: Tan Xiaojun <tanxiaojun@huawei.com>
Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
---
 mm/mempolicy.c | 23 ++++++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 6e867a8..65df28d 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1263,6 +1263,7 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
 		     unsigned long maxnode)
 {
 	unsigned long k;
+	unsigned long t;
 	unsigned long nlongs;
 	unsigned long endmask;
 
@@ -1279,11 +1280,17 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
 	else
 		endmask = (1UL << (maxnode % BITS_PER_LONG)) - 1;
 
-	/* When the user specified more nodes than supported just check
-	   if the non supported part is all zero. */
+	/*
+	 * When the user specified more nodes than supported just check
+	 * if the non supported part is all zero.
+	 *
+	 * If maxnode have more longs than MAX_NUMNODES, check
+	 * the bits in that area first. And then go through to
+	 * check the rest bits which equal or bigger than MAX_NUMNODES.
+	 * Otherwise, just check bits [MAX_NUMNODES, maxnode).
+	 */
 	if (nlongs > BITS_TO_LONGS(MAX_NUMNODES)) {
 		for (k = BITS_TO_LONGS(MAX_NUMNODES); k < nlongs; k++) {
-			unsigned long t;
 			if (get_user(t, nmask + k))
 				return -EFAULT;
 			if (k == nlongs - 1) {
@@ -1296,6 +1303,16 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
 		endmask = ~0UL;
 	}
 
+	if (maxnode > MAX_NUMNODES && MAX_NUMNODES % BITS_PER_LONG != 0) {
+		unsigned long valid_mask = endmask;
+
+		valid_mask &= ~((1UL << (MAX_NUMNODES % BITS_PER_LONG)) - 1);
+		if (get_user(t, nmask + nlongs - 1))
+			return -EFAULT;
+		if (t & valid_mask)
+			return -EINVAL;
+	}
+
 	if (copy_from_user(nodes_addr(*nodes), nmask, nlongs*sizeof(unsigned long)))
 		return -EFAULT;
 	nodes_addr(*nodes)[nlongs-1] &= endmask;
-- 
1.8.3.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 3/3] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
  2017-11-17  1:37 ` Yisheng Xie
@ 2017-11-17  1:37   ` Yisheng Xie
  -1 siblings, 0 replies; 23+ messages in thread
From: Yisheng Xie @ 2017-11-17  1:37 UTC (permalink / raw)
  To: akpm, vbabka, mhocko, mingo, rientjes, n-horiguchi, salls, ak, cl
  Cc: linux-mm, linux-kernel, linux-api, tanxiaojun

As manpage of migrate_pages, the errno should be set to EINVAL when
none of the node IDs specified by new_nodes are on-line and allowed
by the process's current cpuset context, or none of the specified
nodes contain memory. However, when test by following case:

	new_nodes = 0;
	old_nodes = 0xf;
	ret = migrate_pages(pid, old_nodes, new_nodes, MAX);

The ret will be 0 and no errno is set. As the new_nodes is empty,
we should expect EINVAL as documented.

To fix the case like above, this patch check whether target nodes
AND current task_nodes is empty, and then check whether AND
node_states[N_MEMORY] is empty.

Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
---
 mm/mempolicy.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 65df28d..f604b22 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1433,10 +1433,14 @@ static int copy_nodes_to_user(unsigned long __user *mask, unsigned long maxnode,
 		goto out_put;
 	}
 
-	if (!nodes_subset(*new, node_states[N_MEMORY])) {
-		err = -EINVAL;
+	task_nodes = cpuset_mems_allowed(current);
+	nodes_and(*new, *new, task_nodes);
+	if (nodes_empty(*new))
+		goto out_put;
+
+	nodes_and(*new, *new, node_states[N_MEMORY]);
+	if (nodes_empty(*new))
 		goto out_put;
-	}
 
 	err = security_task_movememory(task);
 	if (err)
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 3/3] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
@ 2017-11-17  1:37   ` Yisheng Xie
  0 siblings, 0 replies; 23+ messages in thread
From: Yisheng Xie @ 2017-11-17  1:37 UTC (permalink / raw)
  To: akpm, vbabka, mhocko, mingo, rientjes, n-horiguchi, salls, ak, cl
  Cc: linux-mm, linux-kernel, linux-api, tanxiaojun

As manpage of migrate_pages, the errno should be set to EINVAL when
none of the node IDs specified by new_nodes are on-line and allowed
by the process's current cpuset context, or none of the specified
nodes contain memory. However, when test by following case:

	new_nodes = 0;
	old_nodes = 0xf;
	ret = migrate_pages(pid, old_nodes, new_nodes, MAX);

The ret will be 0 and no errno is set. As the new_nodes is empty,
we should expect EINVAL as documented.

To fix the case like above, this patch check whether target nodes
AND current task_nodes is empty, and then check whether AND
node_states[N_MEMORY] is empty.

Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
---
 mm/mempolicy.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 65df28d..f604b22 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1433,10 +1433,14 @@ static int copy_nodes_to_user(unsigned long __user *mask, unsigned long maxnode,
 		goto out_put;
 	}
 
-	if (!nodes_subset(*new, node_states[N_MEMORY])) {
-		err = -EINVAL;
+	task_nodes = cpuset_mems_allowed(current);
+	nodes_and(*new, *new, task_nodes);
+	if (nodes_empty(*new))
+		goto out_put;
+
+	nodes_and(*new, *new, node_states[N_MEMORY]);
+	if (nodes_empty(*new))
 		goto out_put;
-	}
 
 	err = security_task_movememory(task);
 	if (err)
-- 
1.8.3.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 1/3] mm/mempolicy: remove redundant check in get_nodes
@ 2017-11-20 22:24     ` David Rientjes
  0 siblings, 0 replies; 23+ messages in thread
From: David Rientjes @ 2017-11-20 22:24 UTC (permalink / raw)
  To: Yisheng Xie
  Cc: akpm, vbabka, mhocko, mingo, n-horiguchi, salls, ak, cl,
	linux-mm, linux-kernel, linux-api, tanxiaojun

On Fri, 17 Nov 2017, Yisheng Xie wrote:

> We have already checked whether maxnode is a page worth of bits, by:
>     maxnode > PAGE_SIZE*BITS_PER_BYTE
> 
> So no need to check it once more.
> 
> Acked-by: Vlastimil Babka <vbabka@suse.cz>
> Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>

Acked-by: David Rientjes <rientjes@google.com>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 1/3] mm/mempolicy: remove redundant check in get_nodes
@ 2017-11-20 22:24     ` David Rientjes
  0 siblings, 0 replies; 23+ messages in thread
From: David Rientjes @ 2017-11-20 22:24 UTC (permalink / raw)
  To: Yisheng Xie
  Cc: akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, vbabka-AlSwsSmVLrQ,
	mhocko-IBi9RG/b67k, mingo-DgEjT+Ai2ygdnm+yROfE0A,
	n-horiguchi-PaJj6Psr51x8UrSeD/g0lQ, salls-b3bnyZ7c9ISVc3sceRu5cw,
	ak-VuQAYsv1563Yd54FQh9/CA, cl-vYTEC60ixJUAvxtiuMwx3w,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	tanxiaojun-hv44wF8Li93QT0dZR+AlfA

On Fri, 17 Nov 2017, Yisheng Xie wrote:

> We have already checked whether maxnode is a page worth of bits, by:
>     maxnode > PAGE_SIZE*BITS_PER_BYTE
> 
> So no need to check it once more.
> 
> Acked-by: Vlastimil Babka <vbabka-AlSwsSmVLrQ@public.gmane.org>
> Signed-off-by: Yisheng Xie <xieyisheng1-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>

Acked-by: David Rientjes <rientjes-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 1/3] mm/mempolicy: remove redundant check in get_nodes
@ 2017-11-20 22:24     ` David Rientjes
  0 siblings, 0 replies; 23+ messages in thread
From: David Rientjes @ 2017-11-20 22:24 UTC (permalink / raw)
  To: Yisheng Xie
  Cc: akpm, vbabka, mhocko, mingo, n-horiguchi, salls, ak, cl,
	linux-mm, linux-kernel, linux-api, tanxiaojun

On Fri, 17 Nov 2017, Yisheng Xie wrote:

> We have already checked whether maxnode is a page worth of bits, by:
>     maxnode > PAGE_SIZE*BITS_PER_BYTE
> 
> So no need to check it once more.
> 
> Acked-by: Vlastimil Babka <vbabka@suse.cz>
> Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>

Acked-by: David Rientjes <rientjes@google.com>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 2/3] mm/mempolicy: fix the check of nodemask from user
  2017-11-17  1:37   ` Yisheng Xie
@ 2017-11-27 17:05     ` Vlastimil Babka
  -1 siblings, 0 replies; 23+ messages in thread
From: Vlastimil Babka @ 2017-11-27 17:05 UTC (permalink / raw)
  To: Yisheng Xie, akpm, mhocko, mingo, rientjes, n-horiguchi, salls, ak, cl
  Cc: linux-mm, linux-kernel, linux-api, tanxiaojun

On 11/17/2017 02:37 AM, Yisheng Xie wrote:
> As Xiaojun reported the ltp of migrate_pages01 will failed on ARCH arm64
> system which has 4 nodes[0...3], all have memory and CONFIG_NODES_SHIFT=2:
> 
> migrate_pages01    0  TINFO  :  test_invalid_nodes
> migrate_pages01   14  TFAIL  :  migrate_pages_common.c:45: unexpected failure - returned value = 0, expected: -1
> migrate_pages01   15  TFAIL  :  migrate_pages_common.c:55: call succeeded unexpectedly
> 
> In this case the test_invalid_nodes of migrate_pages01 will call:
> SYSC_migrate_pages as:
> 
> migrate_pages(0, , {0x0000000000000001}, 64, , {0x0000000000000010}, 64) = 0
> 
> The new nodes specifies one or more node IDs that are greater than the
> maximum supported node ID, however, the errno is not set to EINVAL as
> expected.
> 
> As man pages of set_mempolicy[1], mbind[2], and migrate_pages[3] memtioned,
> when nodemask specifies one or more node IDs that are greater than the
> maximum supported node ID, the errno should set to EINVAL. However, get_nodes
> only check whether the part of bits [BITS_PER_LONG*BITS_TO_LONGS(MAX_NUMNODES),
> maxnode) is zero or not,  and remain [MAX_NUMNODES, BITS_PER_LONG*BITS_TO_LONGS(MAX_NUMNODES)
> unchecked.
> 
> This patch is to check the bits of [MAX_NUMNODES, maxnode) in get_nodes to
> let migrate_pages set the errno to EINVAL when nodemask specifies one or
> more node IDs that are greater than the maximum supported node ID, which
> follows the manpage's guide.
> 
> [1] http://man7.org/linux/man-pages/man2/set_mempolicy.2.html
> [2] http://man7.org/linux/man-pages/man2/mbind.2.html
> [3] http://man7.org/linux/man-pages/man2/migrate_pages.2.html
> 
> Reported-by: Tan Xiaojun <tanxiaojun@huawei.com>
> Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

> ---
>  mm/mempolicy.c | 23 ++++++++++++++++++++---
>  1 file changed, 20 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 6e867a8..65df28d 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -1263,6 +1263,7 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
>  		     unsigned long maxnode)
>  {
>  	unsigned long k;
> +	unsigned long t;
>  	unsigned long nlongs;
>  	unsigned long endmask;
>  
> @@ -1279,11 +1280,17 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
>  	else
>  		endmask = (1UL << (maxnode % BITS_PER_LONG)) - 1;
>  
> -	/* When the user specified more nodes than supported just check
> -	   if the non supported part is all zero. */
> +	/*
> +	 * When the user specified more nodes than supported just check
> +	 * if the non supported part is all zero.
> +	 *
> +	 * If maxnode have more longs than MAX_NUMNODES, check
> +	 * the bits in that area first. And then go through to
> +	 * check the rest bits which equal or bigger than MAX_NUMNODES.
> +	 * Otherwise, just check bits [MAX_NUMNODES, maxnode).
> +	 */
>  	if (nlongs > BITS_TO_LONGS(MAX_NUMNODES)) {
>  		for (k = BITS_TO_LONGS(MAX_NUMNODES); k < nlongs; k++) {
> -			unsigned long t;
>  			if (get_user(t, nmask + k))
>  				return -EFAULT;
>  			if (k == nlongs - 1) {
> @@ -1296,6 +1303,16 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
>  		endmask = ~0UL;
>  	}
>  
> +	if (maxnode > MAX_NUMNODES && MAX_NUMNODES % BITS_PER_LONG != 0) {
> +		unsigned long valid_mask = endmask;
> +
> +		valid_mask &= ~((1UL << (MAX_NUMNODES % BITS_PER_LONG)) - 1);
> +		if (get_user(t, nmask + nlongs - 1))
> +			return -EFAULT;
> +		if (t & valid_mask)
> +			return -EINVAL;
> +	}
> +
>  	if (copy_from_user(nodes_addr(*nodes), nmask, nlongs*sizeof(unsigned long)))
>  		return -EFAULT;
>  	nodes_addr(*nodes)[nlongs-1] &= endmask;
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 2/3] mm/mempolicy: fix the check of nodemask from user
@ 2017-11-27 17:05     ` Vlastimil Babka
  0 siblings, 0 replies; 23+ messages in thread
From: Vlastimil Babka @ 2017-11-27 17:05 UTC (permalink / raw)
  To: Yisheng Xie, akpm, mhocko, mingo, rientjes, n-horiguchi, salls, ak, cl
  Cc: linux-mm, linux-kernel, linux-api, tanxiaojun

On 11/17/2017 02:37 AM, Yisheng Xie wrote:
> As Xiaojun reported the ltp of migrate_pages01 will failed on ARCH arm64
> system which has 4 nodes[0...3], all have memory and CONFIG_NODES_SHIFT=2:
> 
> migrate_pages01    0  TINFO  :  test_invalid_nodes
> migrate_pages01   14  TFAIL  :  migrate_pages_common.c:45: unexpected failure - returned value = 0, expected: -1
> migrate_pages01   15  TFAIL  :  migrate_pages_common.c:55: call succeeded unexpectedly
> 
> In this case the test_invalid_nodes of migrate_pages01 will call:
> SYSC_migrate_pages as:
> 
> migrate_pages(0, , {0x0000000000000001}, 64, , {0x0000000000000010}, 64) = 0
> 
> The new nodes specifies one or more node IDs that are greater than the
> maximum supported node ID, however, the errno is not set to EINVAL as
> expected.
> 
> As man pages of set_mempolicy[1], mbind[2], and migrate_pages[3] memtioned,
> when nodemask specifies one or more node IDs that are greater than the
> maximum supported node ID, the errno should set to EINVAL. However, get_nodes
> only check whether the part of bits [BITS_PER_LONG*BITS_TO_LONGS(MAX_NUMNODES),
> maxnode) is zero or not,  and remain [MAX_NUMNODES, BITS_PER_LONG*BITS_TO_LONGS(MAX_NUMNODES)
> unchecked.
> 
> This patch is to check the bits of [MAX_NUMNODES, maxnode) in get_nodes to
> let migrate_pages set the errno to EINVAL when nodemask specifies one or
> more node IDs that are greater than the maximum supported node ID, which
> follows the manpage's guide.
> 
> [1] http://man7.org/linux/man-pages/man2/set_mempolicy.2.html
> [2] http://man7.org/linux/man-pages/man2/mbind.2.html
> [3] http://man7.org/linux/man-pages/man2/migrate_pages.2.html
> 
> Reported-by: Tan Xiaojun <tanxiaojun@huawei.com>
> Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

> ---
>  mm/mempolicy.c | 23 ++++++++++++++++++++---
>  1 file changed, 20 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 6e867a8..65df28d 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -1263,6 +1263,7 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
>  		     unsigned long maxnode)
>  {
>  	unsigned long k;
> +	unsigned long t;
>  	unsigned long nlongs;
>  	unsigned long endmask;
>  
> @@ -1279,11 +1280,17 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
>  	else
>  		endmask = (1UL << (maxnode % BITS_PER_LONG)) - 1;
>  
> -	/* When the user specified more nodes than supported just check
> -	   if the non supported part is all zero. */
> +	/*
> +	 * When the user specified more nodes than supported just check
> +	 * if the non supported part is all zero.
> +	 *
> +	 * If maxnode have more longs than MAX_NUMNODES, check
> +	 * the bits in that area first. And then go through to
> +	 * check the rest bits which equal or bigger than MAX_NUMNODES.
> +	 * Otherwise, just check bits [MAX_NUMNODES, maxnode).
> +	 */
>  	if (nlongs > BITS_TO_LONGS(MAX_NUMNODES)) {
>  		for (k = BITS_TO_LONGS(MAX_NUMNODES); k < nlongs; k++) {
> -			unsigned long t;
>  			if (get_user(t, nmask + k))
>  				return -EFAULT;
>  			if (k == nlongs - 1) {
> @@ -1296,6 +1303,16 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
>  		endmask = ~0UL;
>  	}
>  
> +	if (maxnode > MAX_NUMNODES && MAX_NUMNODES % BITS_PER_LONG != 0) {
> +		unsigned long valid_mask = endmask;
> +
> +		valid_mask &= ~((1UL << (MAX_NUMNODES % BITS_PER_LONG)) - 1);
> +		if (get_user(t, nmask + nlongs - 1))
> +			return -EFAULT;
> +		if (t & valid_mask)
> +			return -EINVAL;
> +	}
> +
>  	if (copy_from_user(nodes_addr(*nodes), nmask, nlongs*sizeof(unsigned long)))
>  		return -EFAULT;
>  	nodes_addr(*nodes)[nlongs-1] &= endmask;
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 3/3] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
@ 2017-11-27 17:25     ` Vlastimil Babka
  0 siblings, 0 replies; 23+ messages in thread
From: Vlastimil Babka @ 2017-11-27 17:25 UTC (permalink / raw)
  To: Yisheng Xie, akpm, mhocko, mingo, rientjes, n-horiguchi, salls, ak, cl
  Cc: linux-mm, linux-kernel, linux-api, tanxiaojun

On 11/17/2017 02:37 AM, Yisheng Xie wrote:
> As manpage of migrate_pages, the errno should be set to EINVAL when
> none of the node IDs specified by new_nodes are on-line and allowed
> by the process's current cpuset context, or none of the specified
> nodes contain memory. However, when test by following case:
> 
> 	new_nodes = 0;
> 	old_nodes = 0xf;
> 	ret = migrate_pages(pid, old_nodes, new_nodes, MAX);
> 
> The ret will be 0 and no errno is set. As the new_nodes is empty,
> we should expect EINVAL as documented.
> 
> To fix the case like above, this patch check whether target nodes
> AND current task_nodes is empty, and then check whether AND
> node_states[N_MEMORY] is empty.
> 
> Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
> ---
>  mm/mempolicy.c | 10 +++++++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 65df28d..f604b22 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -1433,10 +1433,14 @@ static int copy_nodes_to_user(unsigned long __user *mask, unsigned long maxnode,
>  		goto out_put;
>  	}

Let me add the whole preceding that ends on the lines above:

        task_nodes = cpuset_mems_allowed(task);
        /* Is the user allowed to access the target nodes? */
        if (!nodes_subset(*new, task_nodes) && !capable(CAP_SYS_NICE)) {
                err = -EPERM;
                goto out_put;
        }

>  
> -	if (!nodes_subset(*new, node_states[N_MEMORY])) {
> -		err = -EINVAL;
> +	task_nodes = cpuset_mems_allowed(current);
> +	nodes_and(*new, *new, task_nodes);
> +	if (nodes_empty(*new))
> +		goto out_put;

So if we have CAP_SYS_NICE, we pass (or rather skip) the EPERM check
above, but the current cpuset restriction still applies regardless. This
doesn't make sense to me? If I get Christoph right in the v2 discussion,
then CAP_SYS_NICE should not allow current cpuset escape. In that case,
we should remove the CAP_SYS_NICE check from the EPERM check? Also
should it be a subset check, or a non-empty-intersection check?

Note there's still a danger that we are breaking existing code so this
will have to be reverted in any case...

> +
> +	nodes_and(*new, *new, node_states[N_MEMORY]);
> +	if (nodes_empty(*new))
>  		goto out_put;
> -	}
>  
>  	err = security_task_movememory(task);
>  	if (err)
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 3/3] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
@ 2017-11-27 17:25     ` Vlastimil Babka
  0 siblings, 0 replies; 23+ messages in thread
From: Vlastimil Babka @ 2017-11-27 17:25 UTC (permalink / raw)
  To: Yisheng Xie, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	mhocko-IBi9RG/b67k, mingo-DgEjT+Ai2ygdnm+yROfE0A,
	rientjes-hpIqsD4AKlfQT0dZR+AlfA,
	n-horiguchi-PaJj6Psr51x8UrSeD/g0lQ, salls-b3bnyZ7c9ISVc3sceRu5cw,
	ak-VuQAYsv1563Yd54FQh9/CA, cl-vYTEC60ixJUAvxtiuMwx3w
  Cc: linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	tanxiaojun-hv44wF8Li93QT0dZR+AlfA

On 11/17/2017 02:37 AM, Yisheng Xie wrote:
> As manpage of migrate_pages, the errno should be set to EINVAL when
> none of the node IDs specified by new_nodes are on-line and allowed
> by the process's current cpuset context, or none of the specified
> nodes contain memory. However, when test by following case:
> 
> 	new_nodes = 0;
> 	old_nodes = 0xf;
> 	ret = migrate_pages(pid, old_nodes, new_nodes, MAX);
> 
> The ret will be 0 and no errno is set. As the new_nodes is empty,
> we should expect EINVAL as documented.
> 
> To fix the case like above, this patch check whether target nodes
> AND current task_nodes is empty, and then check whether AND
> node_states[N_MEMORY] is empty.
> 
> Signed-off-by: Yisheng Xie <xieyisheng1-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
> ---
>  mm/mempolicy.c | 10 +++++++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 65df28d..f604b22 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -1433,10 +1433,14 @@ static int copy_nodes_to_user(unsigned long __user *mask, unsigned long maxnode,
>  		goto out_put;
>  	}

Let me add the whole preceding that ends on the lines above:

        task_nodes = cpuset_mems_allowed(task);
        /* Is the user allowed to access the target nodes? */
        if (!nodes_subset(*new, task_nodes) && !capable(CAP_SYS_NICE)) {
                err = -EPERM;
                goto out_put;
        }

>  
> -	if (!nodes_subset(*new, node_states[N_MEMORY])) {
> -		err = -EINVAL;
> +	task_nodes = cpuset_mems_allowed(current);
> +	nodes_and(*new, *new, task_nodes);
> +	if (nodes_empty(*new))
> +		goto out_put;

So if we have CAP_SYS_NICE, we pass (or rather skip) the EPERM check
above, but the current cpuset restriction still applies regardless. This
doesn't make sense to me? If I get Christoph right in the v2 discussion,
then CAP_SYS_NICE should not allow current cpuset escape. In that case,
we should remove the CAP_SYS_NICE check from the EPERM check? Also
should it be a subset check, or a non-empty-intersection check?

Note there's still a danger that we are breaking existing code so this
will have to be reverted in any case...

> +
> +	nodes_and(*new, *new, node_states[N_MEMORY]);
> +	if (nodes_empty(*new))
>  		goto out_put;
> -	}
>  
>  	err = security_task_movememory(task);
>  	if (err)
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 3/3] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
@ 2017-11-27 17:25     ` Vlastimil Babka
  0 siblings, 0 replies; 23+ messages in thread
From: Vlastimil Babka @ 2017-11-27 17:25 UTC (permalink / raw)
  To: Yisheng Xie, akpm, mhocko, mingo, rientjes, n-horiguchi, salls, ak, cl
  Cc: linux-mm, linux-kernel, linux-api, tanxiaojun

On 11/17/2017 02:37 AM, Yisheng Xie wrote:
> As manpage of migrate_pages, the errno should be set to EINVAL when
> none of the node IDs specified by new_nodes are on-line and allowed
> by the process's current cpuset context, or none of the specified
> nodes contain memory. However, when test by following case:
> 
> 	new_nodes = 0;
> 	old_nodes = 0xf;
> 	ret = migrate_pages(pid, old_nodes, new_nodes, MAX);
> 
> The ret will be 0 and no errno is set. As the new_nodes is empty,
> we should expect EINVAL as documented.
> 
> To fix the case like above, this patch check whether target nodes
> AND current task_nodes is empty, and then check whether AND
> node_states[N_MEMORY] is empty.
> 
> Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
> ---
>  mm/mempolicy.c | 10 +++++++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 65df28d..f604b22 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -1433,10 +1433,14 @@ static int copy_nodes_to_user(unsigned long __user *mask, unsigned long maxnode,
>  		goto out_put;
>  	}

Let me add the whole preceding that ends on the lines above:

        task_nodes = cpuset_mems_allowed(task);
        /* Is the user allowed to access the target nodes? */
        if (!nodes_subset(*new, task_nodes) && !capable(CAP_SYS_NICE)) {
                err = -EPERM;
                goto out_put;
        }

>  
> -	if (!nodes_subset(*new, node_states[N_MEMORY])) {
> -		err = -EINVAL;
> +	task_nodes = cpuset_mems_allowed(current);
> +	nodes_and(*new, *new, task_nodes);
> +	if (nodes_empty(*new))
> +		goto out_put;

So if we have CAP_SYS_NICE, we pass (or rather skip) the EPERM check
above, but the current cpuset restriction still applies regardless. This
doesn't make sense to me? If I get Christoph right in the v2 discussion,
then CAP_SYS_NICE should not allow current cpuset escape. In that case,
we should remove the CAP_SYS_NICE check from the EPERM check? Also
should it be a subset check, or a non-empty-intersection check?

Note there's still a danger that we are breaking existing code so this
will have to be reverted in any case...

> +
> +	nodes_and(*new, *new, node_states[N_MEMORY]);
> +	if (nodes_empty(*new))
>  		goto out_put;
> -	}
>  
>  	err = security_task_movememory(task);
>  	if (err)
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 3/3] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
@ 2017-11-28  2:03       ` Yisheng Xie
  0 siblings, 0 replies; 23+ messages in thread
From: Yisheng Xie @ 2017-11-28  2:03 UTC (permalink / raw)
  To: Vlastimil Babka, akpm, mhocko, mingo, rientjes, n-horiguchi,
	salls, ak, cl
  Cc: linux-mm, linux-kernel, linux-api, tanxiaojun

Hi Vlastimil,

Thanks for your comment!
On 2017/11/28 1:25, Vlastimil Babka wrote:
> On 11/17/2017 02:37 AM, Yisheng Xie wrote:
>> As manpage of migrate_pages, the errno should be set to EINVAL when
>> none of the node IDs specified by new_nodes are on-line and allowed
>> by the process's current cpuset context, or none of the specified
>> nodes contain memory. However, when test by following case:
>>
>> 	new_nodes = 0;
>> 	old_nodes = 0xf;
>> 	ret = migrate_pages(pid, old_nodes, new_nodes, MAX);
>>
>> The ret will be 0 and no errno is set. As the new_nodes is empty,
>> we should expect EINVAL as documented.
>>
>> To fix the case like above, this patch check whether target nodes
>> AND current task_nodes is empty, and then check whether AND
>> node_states[N_MEMORY] is empty.
>>
>> Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
>> ---
>>  mm/mempolicy.c | 10 +++++++---
>>  1 file changed, 7 insertions(+), 3 deletions(-)
>>
>> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
>> index 65df28d..f604b22 100644
>> --- a/mm/mempolicy.c
>> +++ b/mm/mempolicy.c
>> @@ -1433,10 +1433,14 @@ static int copy_nodes_to_user(unsigned long __user *mask, unsigned long maxnode,
>>  		goto out_put;
>>  	}
> 
> Let me add the whole preceding that ends on the lines above:
> 
>         task_nodes = cpuset_mems_allowed(task);
>         /* Is the user allowed to access the target nodes? */
>         if (!nodes_subset(*new, task_nodes) && !capable(CAP_SYS_NICE)) {
>                 err = -EPERM;
>                 goto out_put;
>         }
> 
>>  
>> -	if (!nodes_subset(*new, node_states[N_MEMORY])) {
>> -		err = -EINVAL;
>> +	task_nodes = cpuset_mems_allowed(current);
>> +	nodes_and(*new, *new, task_nodes);
>> +	if (nodes_empty(*new))
>> +		goto out_put;
> 
> So if we have CAP_SYS_NICE, we pass (or rather skip) the EPERM check
> above, but the current cpuset restriction still applies regardless. This
> doesn't make sense to me? If I get Christoph right in the v2 discussion,
> then CAP_SYS_NICE should not allow current cpuset escape. 
hmm, maybe I do not get what you mean, the patch seems do not *escape* the
current cpuset?  if CAP_SYS_NICE it also check current cpuset, right?

> In that case,
> we should remove the CAP_SYS_NICE check from the EPERM check? Also
> should it be a subset check, or a non-empty-intersection check?

So you mean:
1. we should remove the EPERM check above?
2. Not sure we should use subset check, or a non-empty-intersection for current cpuset?
(Please let me know, if have other points.)

For 1: I have checked the manpage of capabilities[1]:
CAP_SYS_NICE
	[...]
	*apply migrate_pages(2) to arbitrary processes* and allow
        processes to be migrated to arbitrary nodes;

	apply move_pages(2) to arbitrary processes;
	[...]

Therefore, IMO, EPERM check should be something like:
	if (currtent->mm != task->mm && !capable(CAP_SYS_NICE)) { // or if (currtent != task && !capable(CAP_SYS_NICE)) ?
		err = -EPERM;
		goto out_put;
	}
And I kept it as unchanged to follow the original code's meaning.(For move_pages
also use the the logical to check EPERM). I also did not want to break the existing code. :)

For 2: we should follow the manpage of migrate_pages about EINVAL, as your listed in
the former discussion:
	  EINVAL... Or, _none_ of the node IDs specified by new_nodes are
 	 	on-line and allowed by the process's current cpuset context, or none of
  		the specified nodes contain memory.

So a non-empty-intersection check for current cpuset should be enough, right?
And Christoph seems do _not oppose_ this point. (I not sure whether he is *agree* or not).

[1] http://man7.org/linux/man-pages/man7/capabilities.7.html
> 
> Note there's still a danger that we are breaking existing code so this
> will have to be reverted in any case...

I am not oppose if you want to revert this patch, but we should find a
correct way to fix the case above, right? Maybe anther version or a fix to fold?

Thanks
Yisheng Xie
> 
>> +
>> +	nodes_and(*new, *new, node_states[N_MEMORY]);
>> +	if (nodes_empty(*new))
>>  		goto out_put;
>> -	}
>>  
>>  	err = security_task_movememory(task);
>>  	if (err)
>>
> 
> 
> .
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 3/3] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
@ 2017-11-28  2:03       ` Yisheng Xie
  0 siblings, 0 replies; 23+ messages in thread
From: Yisheng Xie @ 2017-11-28  2:03 UTC (permalink / raw)
  To: Vlastimil Babka, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	mhocko-IBi9RG/b67k, mingo-DgEjT+Ai2ygdnm+yROfE0A,
	rientjes-hpIqsD4AKlfQT0dZR+AlfA,
	n-horiguchi-PaJj6Psr51x8UrSeD/g0lQ, salls-b3bnyZ7c9ISVc3sceRu5cw,
	ak-VuQAYsv1563Yd54FQh9/CA, cl-vYTEC60ixJUAvxtiuMwx3w
  Cc: linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	tanxiaojun-hv44wF8Li93QT0dZR+AlfA

Hi Vlastimil,

Thanks for your comment!
On 2017/11/28 1:25, Vlastimil Babka wrote:
> On 11/17/2017 02:37 AM, Yisheng Xie wrote:
>> As manpage of migrate_pages, the errno should be set to EINVAL when
>> none of the node IDs specified by new_nodes are on-line and allowed
>> by the process's current cpuset context, or none of the specified
>> nodes contain memory. However, when test by following case:
>>
>> 	new_nodes = 0;
>> 	old_nodes = 0xf;
>> 	ret = migrate_pages(pid, old_nodes, new_nodes, MAX);
>>
>> The ret will be 0 and no errno is set. As the new_nodes is empty,
>> we should expect EINVAL as documented.
>>
>> To fix the case like above, this patch check whether target nodes
>> AND current task_nodes is empty, and then check whether AND
>> node_states[N_MEMORY] is empty.
>>
>> Signed-off-by: Yisheng Xie <xieyisheng1-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
>> ---
>>  mm/mempolicy.c | 10 +++++++---
>>  1 file changed, 7 insertions(+), 3 deletions(-)
>>
>> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
>> index 65df28d..f604b22 100644
>> --- a/mm/mempolicy.c
>> +++ b/mm/mempolicy.c
>> @@ -1433,10 +1433,14 @@ static int copy_nodes_to_user(unsigned long __user *mask, unsigned long maxnode,
>>  		goto out_put;
>>  	}
> 
> Let me add the whole preceding that ends on the lines above:
> 
>         task_nodes = cpuset_mems_allowed(task);
>         /* Is the user allowed to access the target nodes? */
>         if (!nodes_subset(*new, task_nodes) && !capable(CAP_SYS_NICE)) {
>                 err = -EPERM;
>                 goto out_put;
>         }
> 
>>  
>> -	if (!nodes_subset(*new, node_states[N_MEMORY])) {
>> -		err = -EINVAL;
>> +	task_nodes = cpuset_mems_allowed(current);
>> +	nodes_and(*new, *new, task_nodes);
>> +	if (nodes_empty(*new))
>> +		goto out_put;
> 
> So if we have CAP_SYS_NICE, we pass (or rather skip) the EPERM check
> above, but the current cpuset restriction still applies regardless. This
> doesn't make sense to me? If I get Christoph right in the v2 discussion,
> then CAP_SYS_NICE should not allow current cpuset escape. 
hmm, maybe I do not get what you mean, the patch seems do not *escape* the
current cpuset?  if CAP_SYS_NICE it also check current cpuset, right?

> In that case,
> we should remove the CAP_SYS_NICE check from the EPERM check? Also
> should it be a subset check, or a non-empty-intersection check?

So you mean:
1. we should remove the EPERM check above?
2. Not sure we should use subset check, or a non-empty-intersection for current cpuset?
(Please let me know, if have other points.)

For 1: I have checked the manpage of capabilities[1]:
CAP_SYS_NICE
	[...]
	*apply migrate_pages(2) to arbitrary processes* and allow
        processes to be migrated to arbitrary nodes;

	apply move_pages(2) to arbitrary processes;
	[...]

Therefore, IMO, EPERM check should be something like:
	if (currtent->mm != task->mm && !capable(CAP_SYS_NICE)) { // or if (currtent != task && !capable(CAP_SYS_NICE)) ?
		err = -EPERM;
		goto out_put;
	}
And I kept it as unchanged to follow the original code's meaning.(For move_pages
also use the the logical to check EPERM). I also did not want to break the existing code. :)

For 2: we should follow the manpage of migrate_pages about EINVAL, as your listed in
the former discussion:
	  EINVAL... Or, _none_ of the node IDs specified by new_nodes are
 	 	on-line and allowed by the process's current cpuset context, or none of
  		the specified nodes contain memory.

So a non-empty-intersection check for current cpuset should be enough, right?
And Christoph seems do _not oppose_ this point. (I not sure whether he is *agree* or not).

[1] http://man7.org/linux/man-pages/man7/capabilities.7.html
> 
> Note there's still a danger that we are breaking existing code so this
> will have to be reverted in any case...

I am not oppose if you want to revert this patch, but we should find a
correct way to fix the case above, right? Maybe anther version or a fix to fold?

Thanks
Yisheng Xie
> 
>> +
>> +	nodes_and(*new, *new, node_states[N_MEMORY]);
>> +	if (nodes_empty(*new))
>>  		goto out_put;
>> -	}
>>  
>>  	err = security_task_movememory(task);
>>  	if (err)
>>
> 
> 
> .
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 3/3] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
@ 2017-11-28  2:03       ` Yisheng Xie
  0 siblings, 0 replies; 23+ messages in thread
From: Yisheng Xie @ 2017-11-28  2:03 UTC (permalink / raw)
  To: Vlastimil Babka, akpm, mhocko, mingo, rientjes, n-horiguchi,
	salls, ak, cl
  Cc: linux-mm, linux-kernel, linux-api, tanxiaojun

Hi Vlastimil,

Thanks for your comment!
On 2017/11/28 1:25, Vlastimil Babka wrote:
> On 11/17/2017 02:37 AM, Yisheng Xie wrote:
>> As manpage of migrate_pages, the errno should be set to EINVAL when
>> none of the node IDs specified by new_nodes are on-line and allowed
>> by the process's current cpuset context, or none of the specified
>> nodes contain memory. However, when test by following case:
>>
>> 	new_nodes = 0;
>> 	old_nodes = 0xf;
>> 	ret = migrate_pages(pid, old_nodes, new_nodes, MAX);
>>
>> The ret will be 0 and no errno is set. As the new_nodes is empty,
>> we should expect EINVAL as documented.
>>
>> To fix the case like above, this patch check whether target nodes
>> AND current task_nodes is empty, and then check whether AND
>> node_states[N_MEMORY] is empty.
>>
>> Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
>> ---
>>  mm/mempolicy.c | 10 +++++++---
>>  1 file changed, 7 insertions(+), 3 deletions(-)
>>
>> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
>> index 65df28d..f604b22 100644
>> --- a/mm/mempolicy.c
>> +++ b/mm/mempolicy.c
>> @@ -1433,10 +1433,14 @@ static int copy_nodes_to_user(unsigned long __user *mask, unsigned long maxnode,
>>  		goto out_put;
>>  	}
> 
> Let me add the whole preceding that ends on the lines above:
> 
>         task_nodes = cpuset_mems_allowed(task);
>         /* Is the user allowed to access the target nodes? */
>         if (!nodes_subset(*new, task_nodes) && !capable(CAP_SYS_NICE)) {
>                 err = -EPERM;
>                 goto out_put;
>         }
> 
>>  
>> -	if (!nodes_subset(*new, node_states[N_MEMORY])) {
>> -		err = -EINVAL;
>> +	task_nodes = cpuset_mems_allowed(current);
>> +	nodes_and(*new, *new, task_nodes);
>> +	if (nodes_empty(*new))
>> +		goto out_put;
> 
> So if we have CAP_SYS_NICE, we pass (or rather skip) the EPERM check
> above, but the current cpuset restriction still applies regardless. This
> doesn't make sense to me? If I get Christoph right in the v2 discussion,
> then CAP_SYS_NICE should not allow current cpuset escape. 
hmm, maybe I do not get what you mean, the patch seems do not *escape* the
current cpuset?  if CAP_SYS_NICE it also check current cpuset, right?

> In that case,
> we should remove the CAP_SYS_NICE check from the EPERM check? Also
> should it be a subset check, or a non-empty-intersection check?

So you mean:
1. we should remove the EPERM check above?
2. Not sure we should use subset check, or a non-empty-intersection for current cpuset?
(Please let me know, if have other points.)

For 1: I have checked the manpage of capabilities[1]:
CAP_SYS_NICE
	[...]
	*apply migrate_pages(2) to arbitrary processes* and allow
        processes to be migrated to arbitrary nodes;

	apply move_pages(2) to arbitrary processes;
	[...]

Therefore, IMO, EPERM check should be something like:
	if (currtent->mm != task->mm && !capable(CAP_SYS_NICE)) { // or if (currtent != task && !capable(CAP_SYS_NICE)) ?
		err = -EPERM;
		goto out_put;
	}
And I kept it as unchanged to follow the original code's meaning.(For move_pages
also use the the logical to check EPERM). I also did not want to break the existing code. :)

For 2: we should follow the manpage of migrate_pages about EINVAL, as your listed in
the former discussion:
	  EINVAL... Or, _none_ of the node IDs specified by new_nodes are
 	 	on-line and allowed by the process's current cpuset context, or none of
  		the specified nodes contain memory.

So a non-empty-intersection check for current cpuset should be enough, right?
And Christoph seems do _not oppose_ this point. (I not sure whether he is *agree* or not).

[1] http://man7.org/linux/man-pages/man7/capabilities.7.html
> 
> Note there's still a danger that we are breaking existing code so this
> will have to be reverted in any case...

I am not oppose if you want to revert this patch, but we should find a
correct way to fix the case above, right? Maybe anther version or a fix to fold?

Thanks
Yisheng Xie
> 
>> +
>> +	nodes_and(*new, *new, node_states[N_MEMORY]);
>> +	if (nodes_empty(*new))
>>  		goto out_put;
>> -	}
>>  
>>  	err = security_task_movememory(task);
>>  	if (err)
>>
> 
> 
> .
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 3/3] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
@ 2017-11-28  8:04         ` Yisheng Xie
  0 siblings, 0 replies; 23+ messages in thread
From: Yisheng Xie @ 2017-11-28  8:04 UTC (permalink / raw)
  To: Vlastimil Babka, akpm, mhocko, mingo, rientjes, n-horiguchi,
	salls, ak, cl
  Cc: linux-mm, linux-kernel, linux-api, tanxiaojun



On 2017/11/28 10:03, Yisheng Xie wrote:
> Hi Vlastimil,
> 
> Thanks for your comment!
> On 2017/11/28 1:25, Vlastimil Babka wrote:
>> On 11/17/2017 02:37 AM, Yisheng Xie wrote:
>>> As manpage of migrate_pages, the errno should be set to EINVAL when
>>> none of the node IDs specified by new_nodes are on-line and allowed
>>> by the process's current cpuset context, or none of the specified
>>> nodes contain memory. However, when test by following case:
>>>
>>> 	new_nodes = 0;
>>> 	old_nodes = 0xf;
>>> 	ret = migrate_pages(pid, old_nodes, new_nodes, MAX);
>>>
>>> The ret will be 0 and no errno is set. As the new_nodes is empty,
>>> we should expect EINVAL as documented.
>>>
>>> To fix the case like above, this patch check whether target nodes
>>> AND current task_nodes is empty, and then check whether AND
>>> node_states[N_MEMORY] is empty.
>>>
>>> Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
>>> ---
>>>  mm/mempolicy.c | 10 +++++++---
>>>  1 file changed, 7 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
>>> index 65df28d..f604b22 100644
>>> --- a/mm/mempolicy.c
>>> +++ b/mm/mempolicy.c
>>> @@ -1433,10 +1433,14 @@ static int copy_nodes_to_user(unsigned long __user *mask, unsigned long maxnode,
>>>  		goto out_put;
>>>  	}
>>
>> Let me add the whole preceding that ends on the lines above:
>>
>>         task_nodes = cpuset_mems_allowed(task);
>>         /* Is the user allowed to access the target nodes? */
>>         if (!nodes_subset(*new, task_nodes) && !capable(CAP_SYS_NICE)) {
>>                 err = -EPERM;
>>                 goto out_put;
>>         }
>>
>>>  
>>> -	if (!nodes_subset(*new, node_states[N_MEMORY])) {
>>> -		err = -EINVAL;
>>> +	task_nodes = cpuset_mems_allowed(current);
>>> +	nodes_and(*new, *new, task_nodes);
>>> +	if (nodes_empty(*new))
>>> +		goto out_put;
>>
>> So if we have CAP_SYS_NICE, we pass (or rather skip) the EPERM check
>> above, but the current cpuset restriction still applies regardless. This
>> doesn't make sense to me? If I get Christoph right in the v2 discussion,
>> then CAP_SYS_NICE should not allow current cpuset escape. 
> hmm, maybe I do not get what you mean, the patch seems do not *escape* the
> current cpuset?  if CAP_SYS_NICE it also check current cpuset, right?
> 
>> In that case,
>> we should remove the CAP_SYS_NICE check from the EPERM check? Also
>> should it be a subset check, or a non-empty-intersection check?
> 
> So you mean:
> 1. we should remove the EPERM check above?
> 2. Not sure we should use subset check, or a non-empty-intersection for current cpuset?
> (Please let me know, if have other points.)
> 
> For 1: I have checked the manpage of capabilities[1]:
> CAP_SYS_NICE
> 	[...]
> 	*apply migrate_pages(2) to arbitrary processes* and allow
>         processes to be migrated to arbitrary nodes;
> 
> 	apply move_pages(2) to arbitrary processes;
> 	[...]
> 
> Therefore, IMO, EPERM check should be something like:
> 	if (currtent->mm != task->mm && !capable(CAP_SYS_NICE)) { // or if (currtent != task && !capable(CAP_SYS_NICE)) ?
> 		err = -EPERM;
> 		goto out_put;
> 	}
> And I kept it as unchanged to follow the original code's meaning.(For move_pages
> also use the the logical to check EPERM). I also did not want to break the existing code. :)

Please forget about move_pages part, it has different logical, I am just confused.
Sorry about that. Anyway, I means we should do some check about EPERM, maybe not
as original code, but can not just remove it.

> 
> For 2: we should follow the manpage of migrate_pages about EINVAL, as your listed in
> the former discussion:
> 	  EINVAL... Or, _none_ of the node IDs specified by new_nodes are
>  	 	on-line and allowed by the process's current cpuset context, or none of
>   		the specified nodes contain memory.
> 
> So a non-empty-intersection check for current cpuset should be enough, right?
> And Christoph seems do _not oppose_ this point. (I not sure whether he is *agree* or not).
> 
> [1] http://man7.org/linux/man-pages/man7/capabilities.7.html
>>
>> Note there's still a danger that we are breaking existing code so this
>> will have to be reverted in any case...
> 
> I am not oppose if you want to revert this patch, but we should find a
> correct way to fix the case above, right? Maybe anther version or a fix to fold?
> 
> Thanks
> Yisheng Xie
>>
>>> +
>>> +	nodes_and(*new, *new, node_states[N_MEMORY]);
>>> +	if (nodes_empty(*new))
>>>  		goto out_put;
>>> -	}
>>>  
>>>  	err = security_task_movememory(task);
>>>  	if (err)
>>>
>>
>>
>> .
>>
> 
> 
> .
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 3/3] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
@ 2017-11-28  8:04         ` Yisheng Xie
  0 siblings, 0 replies; 23+ messages in thread
From: Yisheng Xie @ 2017-11-28  8:04 UTC (permalink / raw)
  To: Vlastimil Babka, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	mhocko-IBi9RG/b67k, mingo-DgEjT+Ai2ygdnm+yROfE0A,
	rientjes-hpIqsD4AKlfQT0dZR+AlfA,
	n-horiguchi-PaJj6Psr51x8UrSeD/g0lQ, salls-b3bnyZ7c9ISVc3sceRu5cw,
	ak-VuQAYsv1563Yd54FQh9/CA, cl-vYTEC60ixJUAvxtiuMwx3w
  Cc: linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	tanxiaojun-hv44wF8Li93QT0dZR+AlfA



On 2017/11/28 10:03, Yisheng Xie wrote:
> Hi Vlastimil,
> 
> Thanks for your comment!
> On 2017/11/28 1:25, Vlastimil Babka wrote:
>> On 11/17/2017 02:37 AM, Yisheng Xie wrote:
>>> As manpage of migrate_pages, the errno should be set to EINVAL when
>>> none of the node IDs specified by new_nodes are on-line and allowed
>>> by the process's current cpuset context, or none of the specified
>>> nodes contain memory. However, when test by following case:
>>>
>>> 	new_nodes = 0;
>>> 	old_nodes = 0xf;
>>> 	ret = migrate_pages(pid, old_nodes, new_nodes, MAX);
>>>
>>> The ret will be 0 and no errno is set. As the new_nodes is empty,
>>> we should expect EINVAL as documented.
>>>
>>> To fix the case like above, this patch check whether target nodes
>>> AND current task_nodes is empty, and then check whether AND
>>> node_states[N_MEMORY] is empty.
>>>
>>> Signed-off-by: Yisheng Xie <xieyisheng1-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
>>> ---
>>>  mm/mempolicy.c | 10 +++++++---
>>>  1 file changed, 7 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
>>> index 65df28d..f604b22 100644
>>> --- a/mm/mempolicy.c
>>> +++ b/mm/mempolicy.c
>>> @@ -1433,10 +1433,14 @@ static int copy_nodes_to_user(unsigned long __user *mask, unsigned long maxnode,
>>>  		goto out_put;
>>>  	}
>>
>> Let me add the whole preceding that ends on the lines above:
>>
>>         task_nodes = cpuset_mems_allowed(task);
>>         /* Is the user allowed to access the target nodes? */
>>         if (!nodes_subset(*new, task_nodes) && !capable(CAP_SYS_NICE)) {
>>                 err = -EPERM;
>>                 goto out_put;
>>         }
>>
>>>  
>>> -	if (!nodes_subset(*new, node_states[N_MEMORY])) {
>>> -		err = -EINVAL;
>>> +	task_nodes = cpuset_mems_allowed(current);
>>> +	nodes_and(*new, *new, task_nodes);
>>> +	if (nodes_empty(*new))
>>> +		goto out_put;
>>
>> So if we have CAP_SYS_NICE, we pass (or rather skip) the EPERM check
>> above, but the current cpuset restriction still applies regardless. This
>> doesn't make sense to me? If I get Christoph right in the v2 discussion,
>> then CAP_SYS_NICE should not allow current cpuset escape. 
> hmm, maybe I do not get what you mean, the patch seems do not *escape* the
> current cpuset?  if CAP_SYS_NICE it also check current cpuset, right?
> 
>> In that case,
>> we should remove the CAP_SYS_NICE check from the EPERM check? Also
>> should it be a subset check, or a non-empty-intersection check?
> 
> So you mean:
> 1. we should remove the EPERM check above?
> 2. Not sure we should use subset check, or a non-empty-intersection for current cpuset?
> (Please let me know, if have other points.)
> 
> For 1: I have checked the manpage of capabilities[1]:
> CAP_SYS_NICE
> 	[...]
> 	*apply migrate_pages(2) to arbitrary processes* and allow
>         processes to be migrated to arbitrary nodes;
> 
> 	apply move_pages(2) to arbitrary processes;
> 	[...]
> 
> Therefore, IMO, EPERM check should be something like:
> 	if (currtent->mm != task->mm && !capable(CAP_SYS_NICE)) { // or if (currtent != task && !capable(CAP_SYS_NICE)) ?
> 		err = -EPERM;
> 		goto out_put;
> 	}
> And I kept it as unchanged to follow the original code's meaning.(For move_pages
> also use the the logical to check EPERM). I also did not want to break the existing code. :)

Please forget about move_pages part, it has different logical, I am just confused.
Sorry about that. Anyway, I means we should do some check about EPERM, maybe not
as original code, but can not just remove it.

> 
> For 2: we should follow the manpage of migrate_pages about EINVAL, as your listed in
> the former discussion:
> 	  EINVAL... Or, _none_ of the node IDs specified by new_nodes are
>  	 	on-line and allowed by the process's current cpuset context, or none of
>   		the specified nodes contain memory.
> 
> So a non-empty-intersection check for current cpuset should be enough, right?
> And Christoph seems do _not oppose_ this point. (I not sure whether he is *agree* or not).
> 
> [1] http://man7.org/linux/man-pages/man7/capabilities.7.html
>>
>> Note there's still a danger that we are breaking existing code so this
>> will have to be reverted in any case...
> 
> I am not oppose if you want to revert this patch, but we should find a
> correct way to fix the case above, right? Maybe anther version or a fix to fold?
> 
> Thanks
> Yisheng Xie
>>
>>> +
>>> +	nodes_and(*new, *new, node_states[N_MEMORY]);
>>> +	if (nodes_empty(*new))
>>>  		goto out_put;
>>> -	}
>>>  
>>>  	err = security_task_movememory(task);
>>>  	if (err)
>>>
>>
>>
>> .
>>
> 
> 
> .
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3 3/3] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
@ 2017-11-28  8:04         ` Yisheng Xie
  0 siblings, 0 replies; 23+ messages in thread
From: Yisheng Xie @ 2017-11-28  8:04 UTC (permalink / raw)
  To: Vlastimil Babka, akpm, mhocko, mingo, rientjes, n-horiguchi,
	salls, ak, cl
  Cc: linux-mm, linux-kernel, linux-api, tanxiaojun



On 2017/11/28 10:03, Yisheng Xie wrote:
> Hi Vlastimil,
> 
> Thanks for your comment!
> On 2017/11/28 1:25, Vlastimil Babka wrote:
>> On 11/17/2017 02:37 AM, Yisheng Xie wrote:
>>> As manpage of migrate_pages, the errno should be set to EINVAL when
>>> none of the node IDs specified by new_nodes are on-line and allowed
>>> by the process's current cpuset context, or none of the specified
>>> nodes contain memory. However, when test by following case:
>>>
>>> 	new_nodes = 0;
>>> 	old_nodes = 0xf;
>>> 	ret = migrate_pages(pid, old_nodes, new_nodes, MAX);
>>>
>>> The ret will be 0 and no errno is set. As the new_nodes is empty,
>>> we should expect EINVAL as documented.
>>>
>>> To fix the case like above, this patch check whether target nodes
>>> AND current task_nodes is empty, and then check whether AND
>>> node_states[N_MEMORY] is empty.
>>>
>>> Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
>>> ---
>>>  mm/mempolicy.c | 10 +++++++---
>>>  1 file changed, 7 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
>>> index 65df28d..f604b22 100644
>>> --- a/mm/mempolicy.c
>>> +++ b/mm/mempolicy.c
>>> @@ -1433,10 +1433,14 @@ static int copy_nodes_to_user(unsigned long __user *mask, unsigned long maxnode,
>>>  		goto out_put;
>>>  	}
>>
>> Let me add the whole preceding that ends on the lines above:
>>
>>         task_nodes = cpuset_mems_allowed(task);
>>         /* Is the user allowed to access the target nodes? */
>>         if (!nodes_subset(*new, task_nodes) && !capable(CAP_SYS_NICE)) {
>>                 err = -EPERM;
>>                 goto out_put;
>>         }
>>
>>>  
>>> -	if (!nodes_subset(*new, node_states[N_MEMORY])) {
>>> -		err = -EINVAL;
>>> +	task_nodes = cpuset_mems_allowed(current);
>>> +	nodes_and(*new, *new, task_nodes);
>>> +	if (nodes_empty(*new))
>>> +		goto out_put;
>>
>> So if we have CAP_SYS_NICE, we pass (or rather skip) the EPERM check
>> above, but the current cpuset restriction still applies regardless. This
>> doesn't make sense to me? If I get Christoph right in the v2 discussion,
>> then CAP_SYS_NICE should not allow current cpuset escape. 
> hmm, maybe I do not get what you mean, the patch seems do not *escape* the
> current cpuset?  if CAP_SYS_NICE it also check current cpuset, right?
> 
>> In that case,
>> we should remove the CAP_SYS_NICE check from the EPERM check? Also
>> should it be a subset check, or a non-empty-intersection check?
> 
> So you mean:
> 1. we should remove the EPERM check above?
> 2. Not sure we should use subset check, or a non-empty-intersection for current cpuset?
> (Please let me know, if have other points.)
> 
> For 1: I have checked the manpage of capabilities[1]:
> CAP_SYS_NICE
> 	[...]
> 	*apply migrate_pages(2) to arbitrary processes* and allow
>         processes to be migrated to arbitrary nodes;
> 
> 	apply move_pages(2) to arbitrary processes;
> 	[...]
> 
> Therefore, IMO, EPERM check should be something like:
> 	if (currtent->mm != task->mm && !capable(CAP_SYS_NICE)) { // or if (currtent != task && !capable(CAP_SYS_NICE)) ?
> 		err = -EPERM;
> 		goto out_put;
> 	}
> And I kept it as unchanged to follow the original code's meaning.(For move_pages
> also use the the logical to check EPERM). I also did not want to break the existing code. :)

Please forget about move_pages part, it has different logical, I am just confused.
Sorry about that. Anyway, I means we should do some check about EPERM, maybe not
as original code, but can not just remove it.

> 
> For 2: we should follow the manpage of migrate_pages about EINVAL, as your listed in
> the former discussion:
> 	  EINVAL... Or, _none_ of the node IDs specified by new_nodes are
>  	 	on-line and allowed by the process's current cpuset context, or none of
>   		the specified nodes contain memory.
> 
> So a non-empty-intersection check for current cpuset should be enough, right?
> And Christoph seems do _not oppose_ this point. (I not sure whether he is *agree* or not).
> 
> [1] http://man7.org/linux/man-pages/man7/capabilities.7.html
>>
>> Note there's still a danger that we are breaking existing code so this
>> will have to be reverted in any case...
> 
> I am not oppose if you want to revert this patch, but we should find a
> correct way to fix the case above, right? Maybe anther version or a fix to fold?
> 
> Thanks
> Yisheng Xie
>>
>>> +
>>> +	nodes_and(*new, *new, node_states[N_MEMORY]);
>>> +	if (nodes_empty(*new))
>>>  		goto out_put;
>>> -	}
>>>  
>>>  	err = security_task_movememory(task);
>>>  	if (err)
>>>
>>
>>
>> .
>>
> 
> 
> .
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2017-11-28  8:04 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-17  1:37 [PATCH v3 0/3] some fixes and clean up for mempolicy Yisheng Xie
2017-11-17  1:37 ` Yisheng Xie
2017-11-17  1:37 ` Yisheng Xie
2017-11-17  1:37 ` [PATCH v3 1/3] mm/mempolicy: remove redundant check in get_nodes Yisheng Xie
2017-11-17  1:37   ` Yisheng Xie
2017-11-20 22:24   ` David Rientjes
2017-11-20 22:24     ` David Rientjes
2017-11-20 22:24     ` David Rientjes
2017-11-17  1:37 ` [PATCH v3 2/3] mm/mempolicy: fix the check of nodemask from user Yisheng Xie
2017-11-17  1:37   ` Yisheng Xie
2017-11-27 17:05   ` Vlastimil Babka
2017-11-27 17:05     ` Vlastimil Babka
2017-11-17  1:37 ` [PATCH v3 3/3] mm/mempolicy: add nodes_empty check in SYSC_migrate_pages Yisheng Xie
2017-11-17  1:37   ` Yisheng Xie
2017-11-27 17:25   ` Vlastimil Babka
2017-11-27 17:25     ` Vlastimil Babka
2017-11-27 17:25     ` Vlastimil Babka
2017-11-28  2:03     ` Yisheng Xie
2017-11-28  2:03       ` Yisheng Xie
2017-11-28  2:03       ` Yisheng Xie
2017-11-28  8:04       ` Yisheng Xie
2017-11-28  8:04         ` Yisheng Xie
2017-11-28  8:04         ` Yisheng Xie

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.