* [PATCH] numa: Change get_mempolicy() to use nr_node_ids instead of MAX_NUMNODES
@ 2019-02-11 18:02 rcampbell
  2019-02-11 19:27 ` Andrew Morton
  0 siblings, 1 reply; 5+ messages in thread
From: rcampbell @ 2019-02-11 18:02 UTC (permalink / raw)
  To: linux-mm; +Cc: Ralph Campbell, Andrew Morton

From: Ralph Campbell <rcampbell@nvidia.com>

The system call, get_mempolicy() [1], passes an unsigned long *nodemask
pointer and an unsigned long maxnode argument which specifies the
length of the user's nodemask array in bits (which is rounded up).
The manual page says that if the maxnode value is too small,
get_mempolicy will return EINVAL but there is no system call to return
this minimum value. To determine this value, some programs search
/proc/<pid>/status for a line starting with "Mems_allowed:" and use
the number of digits in the mask to determine the minimum value.
A recent change to the way this line is formatted [2] causes these
programs to compute a value less than MAX_NUMNODES so get_mempolicy()
returns EINVAL.
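
For illustration only, a minimal sketch of the digit-counting approach
described above. The helper name mems_allowed_bits() is hypothetical and is
not taken from any particular program; it assumes 4 mask bits per hex digit
and that commas merely separate groups of digits.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Given one line of /proc/<pid>/status, return the number of mask bits
 * implied by a "Mems_allowed:" line (4 bits per hex digit, separators
 * skipped), or 0 if the line is a different field.
 * Hypothetical helper, for illustration only.
 */
static unsigned long mems_allowed_bits(const char *line)
{
	static const char key[] = "Mems_allowed:";
	unsigned long digits = 0;
	const char *p;

	/* The trailing ':' keeps "Mems_allowed_list:" from matching. */
	if (strncmp(line, key, sizeof(key) - 1) != 0)
		return 0;

	for (p = line + sizeof(key) - 1; *p != '\0'; p++)
		if (isxdigit((unsigned char)*p))
			digits++;

	return digits * 4;
}
```

A caller would read /proc/self/status line by line and pass the first
nonzero result as maxnode. With the shortened format a two-node system may
print a single digit, so the computed value can fall well below
MAX_NUMNODES.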

Change get_mempolicy(), the older compat version of get_mempolicy(), and
the copy_nodes_to_user() function to use nr_node_ids instead of
MAX_NUMNODES, thus preserving the de facto method of computing the
minimum size for the nodemask array and the maxnode argument.

[1] http://man7.org/linux/man-pages/man2/get_mempolicy.2.html
[2] https://lore.kernel.org/lkml/1545405631-6808-1-git-send-email-longman@redhat.com

Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Suggested-by: Alexander Duyck <alexander.duyck@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 mm/mempolicy.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 1da2f1f09675..af171ccb56a2 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1314,7 +1314,7 @@ static int copy_nodes_to_user(unsigned long __user *mask, unsigned long maxnode,
 			      nodemask_t *nodes)
 {
 	unsigned long copy = ALIGN(maxnode-1, 64) / 8;
-	const int nbytes = BITS_TO_LONGS(MAX_NUMNODES) * sizeof(long);
+	unsigned int nbytes = BITS_TO_LONGS(nr_node_ids) * sizeof(long);
 
 	if (copy > nbytes) {
 		if (copy > PAGE_SIZE)
@@ -1491,7 +1491,7 @@ static int kernel_get_mempolicy(int __user *policy,
 	int uninitialized_var(pval);
 	nodemask_t nodes;
 
-	if (nmask != NULL && maxnode < MAX_NUMNODES)
+	if (nmask != NULL && maxnode < nr_node_ids)
 		return -EINVAL;
 
 	err = do_get_mempolicy(&pval, &nodes, addr, flags);
@@ -1527,7 +1527,7 @@ COMPAT_SYSCALL_DEFINE5(get_mempolicy, int __user *, policy,
 	unsigned long nr_bits, alloc_size;
 	DECLARE_BITMAP(bm, MAX_NUMNODES);
 
-	nr_bits = min_t(unsigned long, maxnode-1, MAX_NUMNODES);
+	nr_bits = min_t(unsigned long, maxnode-1, nr_node_ids);
 	alloc_size = ALIGN(nr_bits, BITS_PER_LONG) / 8;
 
 	if (nmask)
-- 
2.17.2
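
For illustration only, a userspace sketch of the size arithmetic touched by
the hunks above. The macros are stand-ins for the kernel's ALIGN() and
BITS_TO_LONGS(), and the function names are hypothetical. Any concrete
values passed in (for example 1024 for MAX_NUMNODES or 2 for nr_node_ids)
are assumptions that depend on the kernel configuration and the machine.

```c
#include <stddef.h>

/* Userspace stand-ins for the kernel macros (assumed equivalents). */
#define U_BITS_PER_LONG    (8 * sizeof(unsigned long))
#define U_ALIGN(x, a)      ((((x) + (a) - 1) / (a)) * (a))
#define U_BITS_TO_LONGS(n) (((n) + U_BITS_PER_LONG - 1) / U_BITS_PER_LONG)

/* Bytes of the user buffer considered: ALIGN(maxnode-1, 64) / 8. */
static unsigned long user_copy_bytes(unsigned long maxnode)
{
	return U_ALIGN(maxnode - 1, 64UL) / 8;
}

/* Bytes of kernel nodemask copied out: BITS_TO_LONGS(nbits) * sizeof(long). */
static unsigned long kernel_mask_bytes(unsigned long nbits)
{
	return (unsigned long)(U_BITS_TO_LONGS(nbits) * sizeof(long));
}

/*
 * Mirrors the check "maxnode < limit => -EINVAL", where limit is
 * MAX_NUMNODES before the patch and nr_node_ids after it.
 */
static int maxnode_accepted(unsigned long maxnode, unsigned long limit)
{
	return maxnode >= limit;
}
```

With the assumed values, a caller passing maxnode = 4 is rejected against a
limit of 1024 but accepted against a limit of 2, which is the behavioural
difference the patch is after.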



* Re: [PATCH] numa: Change get_mempolicy() to use nr_node_ids instead of MAX_NUMNODES
  2019-02-11 18:02 [PATCH] numa: Change get_mempolicy() to use nr_node_ids instead of MAX_NUMNODES rcampbell
@ 2019-02-11 19:27 ` Andrew Morton
  2019-02-27 18:38   ` Vlastimil Babka
  0 siblings, 1 reply; 5+ messages in thread
From: Andrew Morton @ 2019-02-11 19:27 UTC (permalink / raw)
  To: rcampbell; +Cc: linux-mm, Waiman Long

On Mon, 11 Feb 2019 10:02:45 -0800 <rcampbell@nvidia.com> wrote:

> From: Ralph Campbell <rcampbell@nvidia.com>
> 
> The system call, get_mempolicy() [1], passes an unsigned long *nodemask
> pointer and an unsigned long maxnode argument which specifies the
> length of the user's nodemask array in bits (which is rounded up).
> The manual page says that if the maxnode value is too small,
> get_mempolicy will return EINVAL but there is no system call to return
> this minimum value. To determine this value, some programs search
> /proc/<pid>/status for a line starting with "Mems_allowed:" and use
> the number of digits in the mask to determine the minimum value.
> A recent change to the way this line is formatted [2] causes these
> programs to compute a value less than MAX_NUMNODES so get_mempolicy()
> returns EINVAL.
> 
> Change get_mempolicy(), the older compat version of get_mempolicy(), and
> the copy_nodes_to_user() function to use nr_node_ids instead of
> MAX_NUMNODES, thus preserving the de facto method of computing the
> minimum size for the nodemask array and the maxnode argument.
> 
> [1] http://man7.org/linux/man-pages/man2/get_mempolicy.2.html
> [2] https://lore.kernel.org/lkml/1545405631-6808-1-git-send-email-longman@redhat.com
> 

Ugh, what a mess.

For a start, that's a crazy interface.  I wish that had been brought to
our attention so we could have provided a sane way for userspace to
determine MAX_NUMNODES.

Secondly, 4fb8e5b89bcbbb ("include/linux/nodemask.h: use nr_node_ids
(not MAX_NUMNODES) in __nodemask_pr_numnodes()") introduced a
regression.  The proposed get_mempolicy() change appears to be a good
one, but is a strange way of addressing the regression.  I suppose it's
acceptable, as long as this change is backported into kernels which
have 4fb8e5b89bcbbb.



* Re: [PATCH] numa: Change get_mempolicy() to use nr_node_ids instead of MAX_NUMNODES
  2019-02-11 19:27 ` Andrew Morton
@ 2019-02-27 18:38   ` Vlastimil Babka
  2019-02-28 19:11     ` Andrew Morton
  0 siblings, 1 reply; 5+ messages in thread
From: Vlastimil Babka @ 2019-02-27 18:38 UTC (permalink / raw)
  To: Andrew Morton, rcampbell
  Cc: linux-mm, Waiman Long, Linux API, Alexander Duyck, Andi Kleen,
	Florian Weimer, Linus Torvalds, stable

On 2/11/19 8:27 PM, Andrew Morton wrote:
> On Mon, 11 Feb 2019 10:02:45 -0800 <rcampbell@nvidia.com> wrote:
> 
>> From: Ralph Campbell <rcampbell@nvidia.com>
>> 
>> The system call, get_mempolicy() [1], passes an unsigned long *nodemask
>> pointer and an unsigned long maxnode argument which specifies the
>> length of the user's nodemask array in bits (which is rounded up).
>> The manual page says that if the maxnode value is too small,
>> get_mempolicy will return EINVAL but there is no system call to return
>> this minimum value. To determine this value, some programs search
>> /proc/<pid>/status for a line starting with "Mems_allowed:" and use
>> the number of digits in the mask to determine the minimum value.
>> A recent change to the way this line is formatted [2] causes these
>> programs to compute a value less than MAX_NUMNODES so get_mempolicy()
>> returns EINVAL.
>> 
>> Change get_mempolicy(), the older compat version of get_mempolicy(), and
>> the copy_nodes_to_user() function to use nr_node_ids instead of
>> MAX_NUMNODES, thus preserving the de facto method of computing the
>> minimum size for the nodemask array and the maxnode argument.
>> 
>> [1] http://man7.org/linux/man-pages/man2/get_mempolicy.2.html
>> [2] https://lore.kernel.org/lkml/1545405631-6808-1-git-send-email-longman@redhat.com

Next time, please include linux-api and the people involved in the previous
thread [1] in the CC list. There should probably have been a Suggested-by: for
Alexander as well.

>> 
> 
> Ugh, what a mess.

I'm afraid it's an even worse mess now.

> For a start, that's a crazy interface.  I wish that had been brought to
> our attention so we could have provided a sane way for userspace to
> determine MAX_NUMNODES.
> 
> Secondly, 4fb8e5b89bcbbb ("include/linux/nodemask.h: use nr_node_ids
> (not MAX_NUMNODES) in __nodemask_pr_numnodes()") introduced a

There's no such commit; that sha was probably from linux-next. The patch is
still in mmotm [2]. Luckily, I would say. Maybe Linus or some automation could
run some script to check for bogus Fixes tags before accepting patches?

> regression.  The proposed get_mempolicy() change appears to be a good
> one, but is a strange way of addressing the regression.  I suppose it's
> acceptable, as long as this change is backported into kernels which
> have 4fb8e5b89bcbbb.

Given the nonexistent sha, hopefully it wasn't backported anywhere, but
maybe some AI did so anyway. Ah, it seems it indeed made it as far back as 4.9,
as a fix for a nonexistent commit and without proper linux-api consideration :(
I guess it's too late to revert it for 5.0. Hopefully the change is really safe
and won't break anything, i.e. hopefully nobody was determining MAX_NUMNODES by
increasing the buffer size until get_mempolicy() stopped returning EINVAL. Or
there could be other problems, e.g. in a CRIU context.
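
For illustration only, a minimal sketch of the kind of probing loop meant
above: double maxnode until the query stops failing with EINVAL. All names
here are hypothetical. fake_query() is a stand-in for a kernel that demands
at least fake_min_bits bits, and the get_mempolicy() wrapper is built only
where SYS_get_mempolicy is available.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#ifdef __linux__
#include <sys/syscall.h>
#include <unistd.h>
#endif

/* Returns 0 on success, -1 with errno set on failure. */
typedef int (*mask_query_fn)(unsigned long *mask, unsigned long maxnode);

/*
 * Try maxnode = 64, 128, 256, ... up to limit. Return the first value
 * the query accepts, or 0 if it fails with anything other than EINVAL
 * or the limit is exceeded. Keep limit modest so doubling cannot wrap.
 */
static unsigned long probe_maxnode(mask_query_fn query, unsigned long limit)
{
	unsigned long maxnode;

	for (maxnode = 64; maxnode <= limit; maxnode *= 2) {
		/* maxnode is a multiple of 64, so maxnode / 8 bytes suffice. */
		unsigned long *mask = calloc(maxnode / 8, 1);
		int ret, saved_errno;

		if (mask == NULL)
			return 0;
		ret = query(mask, maxnode);
		saved_errno = errno;
		free(mask);

		if (ret == 0)
			return maxnode;
		if (saved_errno != EINVAL)
			return 0;
	}
	return 0;
}

/* Hypothetical stand-in for a kernel requiring maxnode >= fake_min_bits. */
static unsigned long fake_min_bits = 1024;

static int fake_query(unsigned long *mask, unsigned long maxnode)
{
	(void)mask;
	if (maxnode < fake_min_bits) {
		errno = EINVAL;
		return -1;
	}
	return 0;
}

#ifdef SYS_get_mempolicy
/* Real query: the calling thread's policy, with addr = NULL and flags = 0. */
static int query_get_mempolicy(unsigned long *mask, unsigned long maxnode)
{
	int mode;

	return (int)syscall(SYS_get_mempolicy, &mode, mask, maxnode, NULL, 0UL);
}
#endif
```

A program written this way against a MAX_NUMNODES-based check would start
seeing a smaller accepted value once the check uses nr_node_ids, which is
the compatibility concern raised here.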

What about the manpage? It says "The value specified by maxnode is less than
the number of node IDs supported by the system.", which could perhaps apply
to either nr_node_ids or MAX_NUMNODES. Or should we update it?

[1]
https://lore.kernel.org/linux-mm/631c44cc-df2d-40d4-a537-d24864df0679@nvidia.com/T/#u
[2]
https://www.ozlabs.org/~akpm/mmotm/broken-out/include-linux-nodemaskh-use-nr_node_ids-not-max_numnodes-in-__nodemask_pr_numnodes.patch


* Re: [PATCH] numa: Change get_mempolicy() to use nr_node_ids instead of MAX_NUMNODES
  2019-02-27 18:38   ` Vlastimil Babka
@ 2019-02-28 19:11     ` Andrew Morton
  2019-02-28 20:43       ` Vlastimil Babka
  0 siblings, 1 reply; 5+ messages in thread
From: Andrew Morton @ 2019-02-28 19:11 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: rcampbell, linux-mm, Waiman Long, Linux API, Alexander Duyck,
	Andi Kleen, Florian Weimer, Linus Torvalds, stable

On Wed, 27 Feb 2019 19:38:47 +0100 Vlastimil Babka <vbabka@suse.cz> wrote:

> On 2/11/19 8:27 PM, Andrew Morton wrote:
> > On Mon, 11 Feb 2019 10:02:45 -0800 <rcampbell@nvidia.com> wrote:
> > 
> >> From: Ralph Campbell <rcampbell@nvidia.com>
> >> 
> >> The system call, get_mempolicy() [1], passes an unsigned long *nodemask
> >> pointer and an unsigned long maxnode argument which specifies the
> >> length of the user's nodemask array in bits (which is rounded up).
> >> The manual page says that if the maxnode value is too small,
> >> get_mempolicy will return EINVAL but there is no system call to return
> >> this minimum value. To determine this value, some programs search
> >> /proc/<pid>/status for a line starting with "Mems_allowed:" and use
> >> the number of digits in the mask to determine the minimum value.
> >> A recent change to the way this line is formatted [2] causes these
> >> programs to compute a value less than MAX_NUMNODES so get_mempolicy()
> >> returns EINVAL.
> >> 
> >> Change get_mempolicy(), the older compat version of get_mempolicy(), and
> >> the copy_nodes_to_user() function to use nr_node_ids instead of
> >> MAX_NUMNODES, thus preserving the de facto method of computing the
> >> minimum size for the nodemask array and the maxnode argument.
> >> 
> >> [1] http://man7.org/linux/man-pages/man2/get_mempolicy.2.html
> >> [2] https://lore.kernel.org/lkml/1545405631-6808-1-git-send-email-longman@redhat.com
> 
> Next time, please include linux-api and the people involved in the previous
> thread [1] in the CC list. There should probably have been a Suggested-by: for
> Alexander as well.
> 
> >> 
> > 
> > Ugh, what a mess.
> 
> I'm afraid it's an even worse mess now.
> 
> > For a start, that's a crazy interface.  I wish that had been brought to
> > our attention so we could have provided a sane way for userspace to
> > determine MAX_NUMNODES.
> > 
> > Secondly, 4fb8e5b89bcbbb ("include/linux/nodemask.h: use nr_node_ids
> > (not MAX_NUMNODES) in __nodemask_pr_numnodes()") introduced a
> 
> There's no such commit; that sha was probably from linux-next. The patch is
> still in mmotm [2]. Luckily, I would say. Maybe Linus or some automation could
> run some script to check for bogus Fixes tags before accepting patches?

Ah, that's a relief.

How about we just drop "include/linux/nodemask.h: use nr_node_ids (not
MAX_NUMNODES) in __nodemask_pr_numnodes()"
(https://ozlabs.org/~akpm/mmotm/broken-out/include-linux-nodemaskh-use-nr_node_ids-not-max_numnodes-in-__nodemask_pr_numnodes.patch)?
It's just a cosmetic thing, really.




* Re: [PATCH] numa: Change get_mempolicy() to use nr_node_ids instead of MAX_NUMNODES
  2019-02-28 19:11     ` Andrew Morton
@ 2019-02-28 20:43       ` Vlastimil Babka
  0 siblings, 0 replies; 5+ messages in thread
From: Vlastimil Babka @ 2019-02-28 20:43 UTC (permalink / raw)
  To: Andrew Morton
  Cc: rcampbell, linux-mm, Waiman Long, Linux API, Alexander Duyck,
	Andi Kleen, Florian Weimer, Linus Torvalds, stable

On 2/28/19 8:11 PM, Andrew Morton wrote:
>>> Secondly, 4fb8e5b89bcbbb ("include/linux/nodemask.h: use nr_node_ids
>>> (not MAX_NUMNODES) in __nodemask_pr_numnodes()") introduced a
>>
>> There's no such commit; that sha was probably from linux-next. The patch is
>> still in mmotm [2]. Luckily, I would say. Maybe Linus or some automation could
>> run some script to check for bogus Fixes tags before accepting patches?
> 
> Ah, that's a relief.
> 
> How about we just drop "include/linux/nodemask.h: use nr_node_ids (not
> MAX_NUMNODES) in __nodemask_pr_numnodes()"
> (https://ozlabs.org/~akpm/mmotm/broken-out/include-linux-nodemaskh-use-nr_node_ids-not-max_numnodes-in-__nodemask_pr_numnodes.patch)?
> It's just a cosmetic thing, really.

Yeah, the risk of breaking something is not worth it, IMHO.


