All of lore.kernel.org
 help / color / mirror / Atom feed
* cpumask: fix compat getaffinity
@ 2010-05-07 12:45 Arnd Bergmann
  2010-05-08  8:30 ` Rusty Russell
  0 siblings, 1 reply; 22+ messages in thread
From: Arnd Bergmann @ 2010-05-07 12:45 UTC (permalink / raw)
  To: linux-kernel; +Cc: stable, Rusty Russell, Andi Kleen, Ken Werner

Commit a45185d2d "cpumask: convert kernel/compat.c" broke
libnuma, which abuses sched_getaffinity to find out NR_CPUS
in order to parse /sys/devices/system/node/node*/cpumap.

On NUMA systems with less than 32 possibly CPUs, the
current compat_sys_sched_getaffinity now returns '4'
instead of the actual NR_CPUS/8, which makes libnuma
bail out when parsing the cpumap.

This restores the original return value for now.
If we ever get around to changing cpumask_size
to return only possibly CPUs, we will also need to
make the format of the cpumap file.

We should probably also make libnuma able to deal with
the modified kernel interface, so it can operate on
all kernels.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reported-by: Ken Werner <ken.werner@web.de>
Cc: stable@kernel.org
Cc: Andi Kleen <andi@firstfloor.org>

--- a/kernel/compat.c
+++ b/kernel/compat.c
@@ -497,7 +497,7 @@ asmlinkage long compat_sys_sched_getaffinity(compat_pid_t pid, unsigned int len,
 	unsigned long *k;
 	unsigned int min_length = cpumask_size();
 
-	if (nr_cpu_ids <= BITS_PER_COMPAT_LONG)
+	if (NR_CPUS <= BITS_PER_COMPAT_LONG)
 		min_length = sizeof(compat_ulong_t);
 
 	if (len < min_length)

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: cpumask: fix compat getaffinity
  2010-05-07 12:45 cpumask: fix compat getaffinity Arnd Bergmann
@ 2010-05-08  8:30 ` Rusty Russell
  2010-05-08  9:11   ` Arnd Bergmann
  0 siblings, 1 reply; 22+ messages in thread
From: Rusty Russell @ 2010-05-08  8:30 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: linux-kernel, stable, Andi Kleen, Ken Werner

On Fri, 7 May 2010 10:15:49 pm Arnd Bergmann wrote:
> Commit a45185d2d "cpumask: convert kernel/compat.c" broke
> libnuma, which abuses sched_getaffinity to find out NR_CPUS
> in order to parse /sys/devices/system/node/node*/cpumap.
> 
> On NUMA systems with less than 32 possibly CPUs, the
> current compat_sys_sched_getaffinity now returns '4'
> instead of the actual NR_CPUS/8, which makes libnuma
> bail out when parsing the cpumap.

Really?  AFAICT the cpumap is printed using nr_cpu_ids too.  Can you
give an example of what cpumap is on this system?

Confused,
Rusty.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: cpumask: fix compat getaffinity
  2010-05-08  8:30 ` Rusty Russell
@ 2010-05-08  9:11   ` Arnd Bergmann
  2010-05-10 23:43     ` Rusty Russell
  0 siblings, 1 reply; 22+ messages in thread
From: Arnd Bergmann @ 2010-05-08  9:11 UTC (permalink / raw)
  To: Rusty Russell; +Cc: linux-kernel, stable, Andi Kleen, Ken Werner

On Saturday 08 May 2010 10:30:47 Rusty Russell wrote:
> 
> On Fri, 7 May 2010 10:15:49 pm Arnd Bergmann wrote:
> > Commit a45185d2d "cpumask: convert kernel/compat.c" broke
> > libnuma, which abuses sched_getaffinity to find out NR_CPUS
> > in order to parse /sys/devices/system/node/node*/cpumap.
> > 
> > On NUMA systems with less than 32 possibly CPUs, the
> > current compat_sys_sched_getaffinity now returns '4'
> > instead of the actual NR_CPUS/8, which makes libnuma
> > bail out when parsing the cpumap.
> 
> Really?  AFAICT the cpumap is printed using nr_cpu_ids too.  Can you
> give an example of what cpumap is on this system?

On Ken's PS3 running the Fedora 12 kernel (2.6.32-something), the output
from my memory is 00000000,00000000,00000000,00000003\n. NR_CPUs
is 128, nr_cpu_ids is most likely 2.

	Arnd

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: cpumask: fix compat getaffinity
  2010-05-08  9:11   ` Arnd Bergmann
@ 2010-05-10 23:43     ` Rusty Russell
  2010-05-11  1:47       ` KOSAKI Motohiro
  2010-05-11  9:05       ` Arnd Bergmann
  0 siblings, 2 replies; 22+ messages in thread
From: Rusty Russell @ 2010-05-10 23:43 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: linux-kernel, stable, Andi Kleen, Ken Werner

On Sat, 8 May 2010 06:41:08 pm Arnd Bergmann wrote:
> On Saturday 08 May 2010 10:30:47 Rusty Russell wrote:
> > 
> > On Fri, 7 May 2010 10:15:49 pm Arnd Bergmann wrote:
> > > Commit a45185d2d "cpumask: convert kernel/compat.c" broke
> > > libnuma, which abuses sched_getaffinity to find out NR_CPUS
> > > in order to parse /sys/devices/system/node/node*/cpumap.
> > > 
> > > On NUMA systems with less than 32 possibly CPUs, the
> > > current compat_sys_sched_getaffinity now returns '4'
> > > instead of the actual NR_CPUS/8, which makes libnuma
> > > bail out when parsing the cpumap.
> > 
> > Really?  AFAICT the cpumap is printed using nr_cpu_ids too.  Can you
> > give an example of what cpumap is on this system?
> 
> On Ken's PS3 running the Fedora 12 kernel (2.6.32-something), the output
> from my memory is 00000000,00000000,00000000,00000003\n. NR_CPUs
> is 128, nr_cpu_ids is most likely 2.

Ah, I see.  CONFIG_CPUMASK_OFFSTACK=n.

The nr_cpumask_bits is really an optimization for cpumask_first etc: making
it a constant (NR_CPUS) rather than a variable (nr_cpu_ids) for small NR_CPUS
makes it slightly faster.

However, we also use this for scnprintf et al, revealing this inconsistency.
Your patch would break libnuma on CONFIG_CPUMASK_OFFSTACK=y.

How's this?

cpumask: use nr_cpu_ids for printing and parsing cpumasks

Commit a45185d2d "cpumask: convert kernel/compat.c" broke
libnuma, which abuses sched_getaffinity to find out NR_CPUS
in order to parse /sys/devices/system/node/node*/cpumap.

However, the result now returned reflects nr_cpu_ids, and
cpumask_scnprintf et al. use nr_cpumask_bits which is NR_CPUS (for
CONFIG_CPUMASK_OFFSTACK=n) or nr_cpu_ids (for
CONFIG_CPUMASK_OFFSTACK=y).

We should use nr_cpu_ids consistently.

Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@kernel.org

diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -529,7 +529,7 @@ static inline void cpumask_copy(struct c
 static inline int cpumask_scnprintf(char *buf, int len,
 				    const struct cpumask *srcp)
 {
-	return bitmap_scnprintf(buf, len, cpumask_bits(srcp), nr_cpumask_bits);
+	return bitmap_scnprintf(buf, len, cpumask_bits(srcp), nr_cpu_ids);
 }
 
 /**
@@ -543,7 +543,7 @@ static inline int cpumask_scnprintf(char
 static inline int cpumask_parse_user(const char __user *buf, int len,
 				     struct cpumask *dstp)
 {
-	return bitmap_parse_user(buf, len, cpumask_bits(dstp), nr_cpumask_bits);
+	return bitmap_parse_user(buf, len, cpumask_bits(dstp), nr_cpu_ids);
 }
 
 /**
@@ -558,8 +558,7 @@ static inline int cpumask_parse_user(con
 static inline int cpulist_scnprintf(char *buf, int len,
 				    const struct cpumask *srcp)
 {
-	return bitmap_scnlistprintf(buf, len, cpumask_bits(srcp),
-				    nr_cpumask_bits);
+	return bitmap_scnlistprintf(buf, len, cpumask_bits(srcp), nr_cpu_ids);
 }
 
 /**
@@ -572,7 +571,7 @@ static inline int cpulist_scnprintf(char
  */
 static inline int cpulist_parse(const char *buf, struct cpumask *dstp)
 {
-	return bitmap_parselist(buf, cpumask_bits(dstp), nr_cpumask_bits);
+	return bitmap_parselist(buf, cpumask_bits(dstp), nr_cpu_ids);
 }
 
 /**

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: cpumask: fix compat getaffinity
  2010-05-10 23:43     ` Rusty Russell
@ 2010-05-11  1:47       ` KOSAKI Motohiro
  2010-05-11  3:13         ` [stable] " Greg KH
  2010-05-11  9:05       ` Arnd Bergmann
  1 sibling, 1 reply; 22+ messages in thread
From: KOSAKI Motohiro @ 2010-05-11  1:47 UTC (permalink / raw)
  To: Rusty Russell
  Cc: kosaki.motohiro, Arnd Bergmann, linux-kernel, stable, Andi Kleen,
	Ken Werner

> How's this?
> 
> cpumask: use nr_cpu_ids for printing and parsing cpumasks
> 
> Commit a45185d2d "cpumask: convert kernel/compat.c" broke
> libnuma, which abuses sched_getaffinity to find out NR_CPUS
> in order to parse /sys/devices/system/node/node*/cpumap.
> 
> However, the result now returned reflects nr_cpu_ids, and
> cpumask_scnprintf et al. use nr_cpumask_bits which is NR_CPUS (for
> CONFIG_CPUMASK_OFFSTACK=n) or nr_cpu_ids (for
> CONFIG_CPUMASK_OFFSTACK=y).
> 
> We should use nr_cpu_ids consistently.
>
> Reported-by: Arnd Bergmann <arnd@arndb.de>
> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
> Cc: stable@kernel.org

Well, This patch seems to have ABI change. please don't send abi-change to -stable.

note: I'm not against this patch itself.


> 
> diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> --- a/include/linux/cpumask.h
> +++ b/include/linux/cpumask.h
> @@ -529,7 +529,7 @@ static inline void cpumask_copy(struct c
>  static inline int cpumask_scnprintf(char *buf, int len,
>  				    const struct cpumask *srcp)
>  {
> -	return bitmap_scnprintf(buf, len, cpumask_bits(srcp), nr_cpumask_bits);
> +	return bitmap_scnprintf(buf, len, cpumask_bits(srcp), nr_cpu_ids);
>  }
>  
>  /**
> @@ -543,7 +543,7 @@ static inline int cpumask_scnprintf(char
>  static inline int cpumask_parse_user(const char __user *buf, int len,
>  				     struct cpumask *dstp)
>  {
> -	return bitmap_parse_user(buf, len, cpumask_bits(dstp), nr_cpumask_bits);
> +	return bitmap_parse_user(buf, len, cpumask_bits(dstp), nr_cpu_ids);
>  }
>  
>  /**
> @@ -558,8 +558,7 @@ static inline int cpumask_parse_user(con
>  static inline int cpulist_scnprintf(char *buf, int len,
>  				    const struct cpumask *srcp)
>  {
> -	return bitmap_scnlistprintf(buf, len, cpumask_bits(srcp),
> -				    nr_cpumask_bits);
> +	return bitmap_scnlistprintf(buf, len, cpumask_bits(srcp), nr_cpu_ids);
>  }
>  
>  /**
> @@ -572,7 +571,7 @@ static inline int cpulist_scnprintf(char
>   */
>  static inline int cpulist_parse(const char *buf, struct cpumask *dstp)
>  {
> -	return bitmap_parselist(buf, cpumask_bits(dstp), nr_cpumask_bits);
> +	return bitmap_parselist(buf, cpumask_bits(dstp), nr_cpu_ids);
>  }
>  
>  /**
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/




^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [stable] cpumask: fix compat getaffinity
  2010-05-11  1:47       ` KOSAKI Motohiro
@ 2010-05-11  3:13         ` Greg KH
  2010-05-11  5:51           ` Rusty Russell
                             ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Greg KH @ 2010-05-11  3:13 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Rusty Russell, Arnd Bergmann, linux-kernel, Ken Werner,
	Andi Kleen, stable

On Tue, May 11, 2010 at 10:47:03AM +0900, KOSAKI Motohiro wrote:
> > How's this?
> > 
> > cpumask: use nr_cpu_ids for printing and parsing cpumasks
> > 
> > Commit a45185d2d "cpumask: convert kernel/compat.c" broke
> > libnuma, which abuses sched_getaffinity to find out NR_CPUS
> > in order to parse /sys/devices/system/node/node*/cpumap.
> > 
> > However, the result now returned reflects nr_cpu_ids, and
> > cpumask_scnprintf et al. use nr_cpumask_bits which is NR_CPUS (for
> > CONFIG_CPUMASK_OFFSTACK=n) or nr_cpu_ids (for
> > CONFIG_CPUMASK_OFFSTACK=y).
> > 
> > We should use nr_cpu_ids consistently.
> >
> > Reported-by: Arnd Bergmann <arnd@arndb.de>
> > Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
> > Cc: stable@kernel.org
> 
> Well, This patch seems to have ABI change. please don't send abi-change to -stable.

Why?  There is no such thing as a "stable" internal abi in the kernel,
and that includes the -stable kernel releases.

If it fixes a bug, that's all the requirement is.

> note: I'm not against this patch itself.

Great, then it should be just fine, right?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [stable] cpumask: fix compat getaffinity
  2010-05-11  3:13         ` [stable] " Greg KH
@ 2010-05-11  5:51           ` Rusty Russell
  2010-05-11  6:25             ` KOSAKI Motohiro
  2010-05-11  6:20           ` KOSAKI Motohiro
  2010-05-12  8:30           ` Milton Miller
  2 siblings, 1 reply; 22+ messages in thread
From: Rusty Russell @ 2010-05-11  5:51 UTC (permalink / raw)
  To: Greg KH
  Cc: KOSAKI Motohiro, Arnd Bergmann, linux-kernel, Ken Werner,
	Andi Kleen, stable

On Tue, 11 May 2010 12:43:54 pm Greg KH wrote:
> On Tue, May 11, 2010 at 10:47:03AM +0900, KOSAKI Motohiro wrote:
> > > How's this?
> > > 
> > > cpumask: use nr_cpu_ids for printing and parsing cpumasks
> > > 
> > > Commit a45185d2d "cpumask: convert kernel/compat.c" broke
> > > libnuma, which abuses sched_getaffinity to find out NR_CPUS
> > > in order to parse /sys/devices/system/node/node*/cpumap.
> > > 
> > > However, the result now returned reflects nr_cpu_ids, and
> > > cpumask_scnprintf et al. use nr_cpumask_bits which is NR_CPUS (for
> > > CONFIG_CPUMASK_OFFSTACK=n) or nr_cpu_ids (for
> > > CONFIG_CPUMASK_OFFSTACK=y).
> > > 
> > > We should use nr_cpu_ids consistently.
> > >
> > > Reported-by: Arnd Bergmann <arnd@arndb.de>
> > > Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
> > > Cc: stable@kernel.org
> > 
> > Well, This patch seems to have ABI change. please don't send abi-change to -stable.
> 
> Why?  There is no such thing as a "stable" internal abi in the kernel,
> and that includes the -stable kernel releases.

He's referring to the change in sysfs output.  However, the ABI involved is
already defined to be robust against change of NR_CPUS, so changing it to
nr_cpu_ids is OK.

Of course, if libnuma weren't abusing the ABI, this change wouldn't be
necessary :(

Cheers,
Rusty.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [stable] cpumask: fix compat getaffinity
  2010-05-11  3:13         ` [stable] " Greg KH
  2010-05-11  5:51           ` Rusty Russell
@ 2010-05-11  6:20           ` KOSAKI Motohiro
  2010-05-11 15:20             ` Greg KH
  2010-05-12  8:30           ` Milton Miller
  2 siblings, 1 reply; 22+ messages in thread
From: KOSAKI Motohiro @ 2010-05-11  6:20 UTC (permalink / raw)
  To: Greg KH
  Cc: kosaki.motohiro, Rusty Russell, Arnd Bergmann, linux-kernel,
	Ken Werner, Andi Kleen, stable

> On Tue, May 11, 2010 at 10:47:03AM +0900, KOSAKI Motohiro wrote:
> > > How's this?
> > > 
> > > cpumask: use nr_cpu_ids for printing and parsing cpumasks
> > > 
> > > Commit a45185d2d "cpumask: convert kernel/compat.c" broke
> > > libnuma, which abuses sched_getaffinity to find out NR_CPUS
> > > in order to parse /sys/devices/system/node/node*/cpumap.
> > > 
> > > However, the result now returned reflects nr_cpu_ids, and
> > > cpumask_scnprintf et al. use nr_cpumask_bits which is NR_CPUS (for
> > > CONFIG_CPUMASK_OFFSTACK=n) or nr_cpu_ids (for
> > > CONFIG_CPUMASK_OFFSTACK=y).
> > > 
> > > We should use nr_cpu_ids consistently.
> > >
> > > Reported-by: Arnd Bergmann <arnd@arndb.de>
> > > Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
> > > Cc: stable@kernel.org
> > 
> > Well, This patch seems to have ABI change. please don't send abi-change to -stable.
> 
> Why?  There is no such thing as a "stable" internal abi in the kernel,
> and that includes the -stable kernel releases.
> 
> If it fixes a bug, that's all the requirement is.

AFAIK, -stable is mainly used for distro and they have many and many
packages. we can't assume no app read /sys files directly.
IOW, To change /sys printing format is a bit risky change. frequently
compatibility breaking natually break code stability.


> > note: I'm not against this patch itself.
> 
> Great, then it should be just fine, right?

Yup, fine.


Thanks.




^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [stable] cpumask: fix compat getaffinity
  2010-05-11  5:51           ` Rusty Russell
@ 2010-05-11  6:25             ` KOSAKI Motohiro
  0 siblings, 0 replies; 22+ messages in thread
From: KOSAKI Motohiro @ 2010-05-11  6:25 UTC (permalink / raw)
  To: Rusty Russell
  Cc: kosaki.motohiro, Greg KH, Arnd Bergmann, linux-kernel,
	Ken Werner, Andi Kleen, stable

> On Tue, 11 May 2010 12:43:54 pm Greg KH wrote:
> > On Tue, May 11, 2010 at 10:47:03AM +0900, KOSAKI Motohiro wrote:
> > > > How's this?
> > > > 
> > > > cpumask: use nr_cpu_ids for printing and parsing cpumasks
> > > > 
> > > > Commit a45185d2d "cpumask: convert kernel/compat.c" broke
> > > > libnuma, which abuses sched_getaffinity to find out NR_CPUS
> > > > in order to parse /sys/devices/system/node/node*/cpumap.
> > > > 
> > > > However, the result now returned reflects nr_cpu_ids, and
> > > > cpumask_scnprintf et al. use nr_cpumask_bits which is NR_CPUS (for
> > > > CONFIG_CPUMASK_OFFSTACK=n) or nr_cpu_ids (for
> > > > CONFIG_CPUMASK_OFFSTACK=y).
> > > > 
> > > > We should use nr_cpu_ids consistently.
> > > >
> > > > Reported-by: Arnd Bergmann <arnd@arndb.de>
> > > > Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
> > > > Cc: stable@kernel.org
> > > 
> > > Well, This patch seems to have ABI change. please don't send abi-change to -stable.
> > 
> > Why?  There is no such thing as a "stable" internal abi in the kernel,
> > and that includes the -stable kernel releases.
> 
> He's referring to the change in sysfs output.  However, the ABI involved is
> already defined to be robust against change of NR_CPUS, so changing it to
> nr_cpu_ids is OK.

hmm...

ok, I'm cancel my previous claim. generically I think I'm right. but in this case
I agree the patch's risk is very low.

sorry for noise.

> Of course, if libnuma weren't abusing the ABI, this change wouldn't be
> necessary :(

yes yes...





^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: cpumask: fix compat getaffinity
  2010-05-10 23:43     ` Rusty Russell
  2010-05-11  1:47       ` KOSAKI Motohiro
@ 2010-05-11  9:05       ` Arnd Bergmann
  2010-05-12  0:52         ` KOSAKI Motohiro
  1 sibling, 1 reply; 22+ messages in thread
From: Arnd Bergmann @ 2010-05-11  9:05 UTC (permalink / raw)
  To: Rusty Russell
  Cc: linux-kernel, stable, Andi Kleen, Ken Werner, KOSAKI Motohiro

On Tuesday 11 May 2010, Rusty Russell wrote:
> cpumask: use nr_cpu_ids for printing and parsing cpumasks
> 
> Commit a45185d2d "cpumask: convert kernel/compat.c" broke
> libnuma, which abuses sched_getaffinity to find out NR_CPUS
> in order to parse /sys/devices/system/node/node*/cpumap.
> 
> However, the result now returned reflects nr_cpu_ids, and
> cpumask_scnprintf et al. use nr_cpumask_bits which is NR_CPUS (for
> CONFIG_CPUMASK_OFFSTACK=n) or nr_cpu_ids (for
> CONFIG_CPUMASK_OFFSTACK=y).
> 
> We should use nr_cpu_ids consistently.
> 
> Reported-by: Arnd Bergmann <arnd@arndb.de>
> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
> Cc: stable@kernel.org

It looks like this changes fixes the compat case, but now
the native system call would be inconsistent with the
sysfs output. 

Wouldn't you also need something like the change below
to make the return value of sched_getaffinity() independent
of NR_CPUS?

Signed-off-by: Arnd Bergmann <arnd@arndb.de>

--- a/kernel/compat.c
+++ b/kernel/compat.c
@@ -495,12 +495,9 @@ asmlinkage long compat_sys_sched_getaffinity(compat_pid_t pid, unsigned int len,
 	int ret;
 	cpumask_var_t mask;
 	unsigned long *k;
-	unsigned int min_length = cpumask_size();
+	size_t size = BITS_TO_COMPAT_LONGS(nr_cpu_ids) * sizeof(compat_ulong_t);
 
-	if (nr_cpu_ids <= BITS_PER_COMPAT_LONG)
-		min_length = sizeof(compat_ulong_t);
-
-	if (len < min_length)
+	if (len < size)
 		return -EINVAL;
 
 	if (!alloc_cpumask_var(&mask, GFP_KERNEL))
@@ -511,9 +508,9 @@ asmlinkage long compat_sys_sched_getaffinity(compat_pid_t pid, unsigned int len,
 		goto out;
 
 	k = cpumask_bits(mask);
-	ret = compat_put_bitmap(user_mask_ptr, k, min_length * 8);
+	ret = compat_put_bitmap(user_mask_ptr, k, size * 8);
 	if (ret == 0)
-		ret = min_length;
+		ret = size;
 
 out:
 	free_cpumask_var(mask);
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -6693,8 +6693,9 @@ SYSCALL_DEFINE3(sched_getaffinity, pid_t, pid, unsigned int, len,
 {
 	int ret;
 	cpumask_var_t mask;
+	size_t size = BITS_TO_LONGS(nr_cpu_ids) * sizeof(long);
 
-	if (len < cpumask_size())
+	if (len < size)
 		return -EINVAL;
 
 	if (!alloc_cpumask_var(&mask, GFP_KERNEL))
@@ -6702,10 +6703,10 @@ SYSCALL_DEFINE3(sched_getaffinity, pid_t, pid, unsigned int, len,
 
 	ret = sched_getaffinity(pid, mask);
 	if (ret == 0) {
-		if (copy_to_user(user_mask_ptr, mask, cpumask_size()))
+		if (copy_to_user(user_mask_ptr, mask, size))
 			ret = -EFAULT;
 		else
-			ret = cpumask_size();
+			ret = size;
 	}
 	free_cpumask_var(mask);
 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [stable] cpumask: fix compat getaffinity
  2010-05-11  6:20           ` KOSAKI Motohiro
@ 2010-05-11 15:20             ` Greg KH
  2010-05-11 18:13               ` Arnd Bergmann
  0 siblings, 1 reply; 22+ messages in thread
From: Greg KH @ 2010-05-11 15:20 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Rusty Russell, Arnd Bergmann, linux-kernel, Ken Werner,
	Andi Kleen, stable

On Tue, May 11, 2010 at 03:20:45PM +0900, KOSAKI Motohiro wrote:
> > On Tue, May 11, 2010 at 10:47:03AM +0900, KOSAKI Motohiro wrote:
> > > > How's this?
> > > > 
> > > > cpumask: use nr_cpu_ids for printing and parsing cpumasks
> > > > 
> > > > Commit a45185d2d "cpumask: convert kernel/compat.c" broke
> > > > libnuma, which abuses sched_getaffinity to find out NR_CPUS
> > > > in order to parse /sys/devices/system/node/node*/cpumap.
> > > > 
> > > > However, the result now returned reflects nr_cpu_ids, and
> > > > cpumask_scnprintf et al. use nr_cpumask_bits which is NR_CPUS (for
> > > > CONFIG_CPUMASK_OFFSTACK=n) or nr_cpu_ids (for
> > > > CONFIG_CPUMASK_OFFSTACK=y).
> > > > 
> > > > We should use nr_cpu_ids consistently.
> > > >
> > > > Reported-by: Arnd Bergmann <arnd@arndb.de>
> > > > Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
> > > > Cc: stable@kernel.org
> > > 
> > > Well, This patch seems to have ABI change. please don't send abi-change to -stable.
> > 
> > Why?  There is no such thing as a "stable" internal abi in the kernel,
> > and that includes the -stable kernel releases.
> > 
> > If it fixes a bug, that's all the requirement is.
> 
> AFAIK, -stable is mainly used for distro and they have many and many
> packages. we can't assume no app read /sys files directly.
> IOW, To change /sys printing format is a bit risky change. frequently
> compatibility breaking natually break code stability.

Ah, the sysfs user/kernel API is what you are referring to here, right?
If so, remember, any changes need to be documented in Documentation/API
:)

And as for distros using -stable for their releases, that's fine, they
well know the risks/benefits for that and handle changes on their own.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [stable] cpumask: fix compat getaffinity
  2010-05-11 15:20             ` Greg KH
@ 2010-05-11 18:13               ` Arnd Bergmann
  2010-05-11 21:36                 ` Greg KH
  0 siblings, 1 reply; 22+ messages in thread
From: Arnd Bergmann @ 2010-05-11 18:13 UTC (permalink / raw)
  To: Greg KH
  Cc: KOSAKI Motohiro, Rusty Russell, linux-kernel, Ken Werner,
	Andi Kleen, stable

On Tuesday 11 May 2010 17:20:02 Greg KH wrote:
> > AFAIK, -stable is mainly used for distro and they have many and many
> > packages. we can't assume no app read /sys files directly.
> > IOW, To change /sys printing format is a bit risky change. frequently
> > compatibility breaking natually break code stability.
> 
> Ah, the sysfs user/kernel API is what you are referring to here, right?
> If so, remember, any changes need to be documented in Documentation/API
> :)
> And as for distros using -stable for their releases, that's fine, they
> well know the risks/benefits for that and handle changes on their own.

Would it make sense to use my initial patch for -stable, which reverts
the ABI back to before the change that caused the problem, but apply
the correct fix (changing the ABI throughout) for future releases?

	Arnd

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [stable] cpumask: fix compat getaffinity
  2010-05-11 18:13               ` Arnd Bergmann
@ 2010-05-11 21:36                 ` Greg KH
  0 siblings, 0 replies; 22+ messages in thread
From: Greg KH @ 2010-05-11 21:36 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: KOSAKI Motohiro, Rusty Russell, linux-kernel, Ken Werner,
	Andi Kleen, stable

On Tue, May 11, 2010 at 08:13:55PM +0200, Arnd Bergmann wrote:
> On Tuesday 11 May 2010 17:20:02 Greg KH wrote:
> > > AFAIK, -stable is mainly used for distro and they have many and many
> > > packages. we can't assume no app read /sys files directly.
> > > IOW, To change /sys printing format is a bit risky change. frequently
> > > compatibility breaking natually break code stability.
> > 
> > Ah, the sysfs user/kernel API is what you are referring to here, right?
> > If so, remember, any changes need to be documented in Documentation/API
> > :)
> > And as for distros using -stable for their releases, that's fine, they
> > well know the risks/benefits for that and handle changes on their own.
> 
> Would it make sense to use my initial patch for -stable, which reverts
> the ABI back to before the change that caused the problem, but apply
> the correct fix (changing the ABI throughout) for future releases?

I do not know, as I really don't know what the issue is here.

I trust that you, and the others, can come up with a fix for mainline
properly, and if it applies to the stable tree, feel free to forward it
to stable@kernel.org for submission then.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: cpumask: fix compat getaffinity
  2010-05-11  9:05       ` Arnd Bergmann
@ 2010-05-12  0:52         ` KOSAKI Motohiro
  0 siblings, 0 replies; 22+ messages in thread
From: KOSAKI Motohiro @ 2010-05-12  0:52 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: kosaki.motohiro, Rusty Russell, linux-kernel, stable, Andi Kleen,
	Ken Werner

> On Tuesday 11 May 2010, Rusty Russell wrote:
> > cpumask: use nr_cpu_ids for printing and parsing cpumasks
> > 
> > Commit a45185d2d "cpumask: convert kernel/compat.c" broke
> > libnuma, which abuses sched_getaffinity to find out NR_CPUS
> > in order to parse /sys/devices/system/node/node*/cpumap.
> > 
> > However, the result now returned reflects nr_cpu_ids, and
> > cpumask_scnprintf et al. use nr_cpumask_bits which is NR_CPUS (for
> > CONFIG_CPUMASK_OFFSTACK=n) or nr_cpu_ids (for
> > CONFIG_CPUMASK_OFFSTACK=y).
> > 
> > We should use nr_cpu_ids consistently.
> > 
> > Reported-by: Arnd Bergmann <arnd@arndb.de>
> > Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
> > Cc: stable@kernel.org
> 
> It looks like this changes fixes the compat case, but now
> the native system call would be inconsistent with the
> sysfs output. 
> 
> Wouldn't you also need something like the change below
> to make the return value of sched_getaffinity() independent
> of NR_CPUS?
> 
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>

Looks good. but please rebase it on latest linus-tree. in linus tree,
we are already using nr_cpu_ids instead NR_CPUS.

And I'm sorry. at making such change, I've forgot to care compat
syscall, your patch fixes it ;)




^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: cpumask: fix compat getaffinity
  2010-05-11  3:13         ` [stable] " Greg KH
  2010-05-11  5:51           ` Rusty Russell
  2010-05-11  6:20           ` KOSAKI Motohiro
@ 2010-05-12  8:30           ` Milton Miller
  2010-05-14 12:39             ` Rusty Russell
  2010-06-22 23:30             ` Rusty Russell
  2 siblings, 2 replies; 22+ messages in thread
From: Milton Miller @ 2010-05-12  8:30 UTC (permalink / raw)
  To: Arnd Bergmann, KOSAKI Motohiro, Rusty Russell, Greg KH
  Cc: linux-kernel, stable


At least for parsing, we need to allocate and parse NR_CPUS until
all places like arch/powerpc/platforms/pseries/xics.c that compare a
user-supplied mask to CPUMASK_ALL are eliminated.


> Would it make sense to use my initial patch for -stable, which reverts
> the ABI back to before the change that caused the problem, but apply
> the correct fix (changing the ABI throughout) for future releases?

This would definitly be the conservative fix.

milton

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: cpumask: fix compat getaffinity
  2010-05-12  8:30           ` Milton Miller
@ 2010-05-14 12:39             ` Rusty Russell
  2010-05-17  6:04               ` KOSAKI Motohiro
  2010-06-22 23:30             ` Rusty Russell
  1 sibling, 1 reply; 22+ messages in thread
From: Rusty Russell @ 2010-05-14 12:39 UTC (permalink / raw)
  To: Milton Miller
  Cc: Arnd Bergmann, KOSAKI Motohiro, Greg KH, linux-kernel, stable, anton

On Wed, 12 May 2010 06:00:45 pm Milton Miller wrote:
> 
> At least for parsing, we need to allocate and parse NR_CPUS until
> all places like arch/powerpc/platforms/pseries/xics.c that compare a
> user-supplied mask to CPUMASK_ALL are eliminated.

Good point.  Anton will want to fix those anyway for CONFIG_CPUMASK_OFFSTACK,
too, but that's the reason the parsing uses nr_cpumask_bits.

> > Would it make sense to use my initial patch for -stable, which reverts
> > the ABI back to before the change that caused the problem, but apply
> > the correct fix (changing the ABI throughout) for future releases?
> 
> This would definitly be the conservative fix.

Instead of changing back to NR_CPUS which will break libnuma for
CPUMASK_OFFSTACK, how about changing it to nr_cpumask_bits and having an
explicit comment above it:

	/* libnuma assumes we match scnprintf for /sys/.../node/node*/cpumap */

Thanks,
Rusty.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: cpumask: fix compat getaffinity
  2010-05-14 12:39             ` Rusty Russell
@ 2010-05-17  6:04               ` KOSAKI Motohiro
  2010-05-17 18:58                 ` Arnd Bergmann
                                   ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: KOSAKI Motohiro @ 2010-05-17  6:04 UTC (permalink / raw)
  To: Rusty Russell
  Cc: kosaki.motohiro, Milton Miller, Arnd Bergmann, Greg KH,
	linux-kernel, stable, anton

> On Wed, 12 May 2010 06:00:45 pm Milton Miller wrote:
> > 
> > At least for parsing, we need to allocate and parse NR_CPUS until
> > all places like arch/powerpc/platforms/pseries/xics.c that compare a
> > user-supplied mask to CPUMASK_ALL are eliminated.
> 
> Good point.  Anton will want to fix those anyway for CONFIG_CPUMASK_OFFSTACK,
> too, but that's the reason the parsing uses nr_cpumask_bits.
> 
> > > Would it make sense to use my initial patch for -stable, which reverts
> > > the ABI back to before the change that caused the problem, but apply
> > > the correct fix (changing the ABI throughout) for future releases?
> > 
> > This would definitly be the conservative fix.
> 
> Instead of changing back to NR_CPUS which will break libnuma for
> CPUMASK_OFFSTACK, how about changing it to nr_cpumask_bits and having an
> explicit comment above it:

Yes and No.

1) sched_getaffinity syscall is used from glibc and libnuma.
2) glibc doesn't use the return value almostly. glibc emulate it as NR_CPUS=1024.
3) Now, both sched_getaffinity() and compat_sys_sched_getaffinity() have nr_cpu_ids thing.
4) But only compat_sys_sched_getaffinity() hit libnuma problem.

I think It mean compat_sys_sched_getaffinity() should behave as sched_getaffinity().
IOW, libnuma assume compat_sys_sched_getaffinity() return len args or NR_CPUS.
then, following patch do it. I confirmed the patch works with or without CPUMASK_OFFSTACK.  

So, My proposal is
	1) merge both mine and yours to linus tree
	2) but backport only mine


How do you think?


==========================================================
From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Subject: [PATCH] cpumask: fix compat getaffinity

Commit a45185d2d "cpumask: convert kernel/compat.c" broke
libnuma, which abuses sched_getaffinity to find out NR_CPUS
in order to parse /sys/devices/system/node/node*/cpumap.

On NUMA systems with less than 32 possibly CPUs, the
current compat_sys_sched_getaffinity now returns '4'
instead of the actual NR_CPUS/8, which makes libnuma
bail out when parsing the cpumap.

The libnuma call sched_getaffinity(0, bitmap, 4096)
at first. It mean the libnuma expect the return value of
sched_getaffinity() is either len argument or NR_CPUS.
But it doesn't expect to return nr_cpu_ids.

Strictly speaking, userland requirement are

1) Glibc assume the return value mean the lengh of initialized
   of mask argument. E.g. if sched_getaffinity(1024) return 128,
   glibc make zero fill rest 896 byte.
2) Libnuma assume the return value can be used to guess NR_CPUS
   in kernel. It assume len-arg<NR_CPUS makes -EINVAL. But
   it try len=4096 at first and 4096 is always bigger than
   NR_CPUS. Then, if we remove strange min_length normalization,
   we never hit -EINVAL case.

sched_getaffinity() already solved this issue. This patch
adapt compat_sys_sched_getaffinity() as it.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
---
 kernel/compat.c |   25 +++++++++++--------------
 1 files changed, 11 insertions(+), 14 deletions(-)

diff --git a/kernel/compat.c b/kernel/compat.c
index 7f40e92..5adab05 100644
--- a/kernel/compat.c
+++ b/kernel/compat.c
@@ -495,29 +495,26 @@ asmlinkage long compat_sys_sched_getaffinity(compat_pid_t pid, unsigned int len,
 {
 	int ret;
 	cpumask_var_t mask;
-	unsigned long *k;
-	unsigned int min_length = cpumask_size();
-
-	if (nr_cpu_ids <= BITS_PER_COMPAT_LONG)
-		min_length = sizeof(compat_ulong_t);
 
-	if (len < min_length)
+	if ((len * BITS_PER_BYTE) < nr_cpu_ids)
+		return -EINVAL;
+	if (len & (sizeof(compat_ulong_t)-1))
 		return -EINVAL;
 
 	if (!alloc_cpumask_var(&mask, GFP_KERNEL))
 		return -ENOMEM;
 
 	ret = sched_getaffinity(pid, mask);
-	if (ret < 0)
-		goto out;
+	if (ret == 0) {
+		size_t retlen = min_t(size_t, len, cpumask_size());
 
-	k = cpumask_bits(mask);
-	ret = compat_put_bitmap(user_mask_ptr, k, min_length * 8);
-	if (ret == 0)
-		ret = min_length;
-
-out:
+		if (compat_put_bitmap(user_mask_ptr, cpumask_bits(mask), retlen * 8))
+			ret = -EFAULT;
+		else
+			ret = retlen;
+	}
 	free_cpumask_var(mask);
+
 	return ret;
 }
 
-- 
1.6.5.2








^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: cpumask: fix compat getaffinity
  2010-05-17  6:04               ` KOSAKI Motohiro
@ 2010-05-17 18:58                 ` Arnd Bergmann
  2010-05-18  0:57                 ` Rusty Russell
  2010-12-14 16:59                 ` Josh Hunt
  2 siblings, 0 replies; 22+ messages in thread
From: Arnd Bergmann @ 2010-05-17 18:58 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Rusty Russell, Milton Miller, Greg KH, linux-kernel, stable, anton

On Monday 17 May 2010, KOSAKI Motohiro wrote:
> From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Subject: [PATCH] cpumask: fix compat getaffinity
> 
> Commit a45185d2d "cpumask: convert kernel/compat.c" broke
> libnuma, which abuses sched_getaffinity to find out NR_CPUS
> in order to parse /sys/devices/system/node/node*/cpumap.
> 
> On NUMA systems with less than 32 possibly CPUs, the
> current compat_sys_sched_getaffinity now returns '4'
> instead of the actual NR_CPUS/8, which makes libnuma
> bail out when parsing the cpumap.
> 
> The libnuma call sched_getaffinity(0, bitmap, 4096)
> at first. It mean the libnuma expect the return value of
> sched_getaffinity() is either len argument or NR_CPUS.
> But it doesn't expect to return nr_cpu_ids.
> 
> Strictly speaking, userland requirement are
> 
> 1) Glibc assume the return value mean the lengh of initialized
>    of mask argument. E.g. if sched_getaffinity(1024) return 128,
>    glibc make zero fill rest 896 byte.
> 2) Libnuma assume the return value can be used to guess NR_CPUS
>    in kernel. It assume len-arg<NR_CPUS makes -EINVAL. But
>    it try len=4096 at first and 4096 is always bigger than
>    NR_CPUS. Then, if we remove strange min_length normalization,
>    we never hit -EINVAL case.
> 
> sched_getaffinity() already solved this issue. This patch
> adapt compat_sys_sched_getaffinity() as it.
> 
> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

Looks good to me,

Acked-by: Arnd Bergmann <arnd@arndb.de>

You should also add a Cc:stable@vger.kernel.org line to the
description so Greg's scripts catch it when it gets merged
upstream.

	Arnd

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: cpumask: fix compat getaffinity
  2010-05-17  6:04               ` KOSAKI Motohiro
  2010-05-17 18:58                 ` Arnd Bergmann
@ 2010-05-18  0:57                 ` Rusty Russell
  2010-12-14 16:59                 ` Josh Hunt
  2 siblings, 0 replies; 22+ messages in thread
From: Rusty Russell @ 2010-05-18  0:57 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Milton Miller, Arnd Bergmann, Greg KH, linux-kernel, stable, anton

On Mon, 17 May 2010 03:34:44 pm KOSAKI Motohiro wrote:
> > On Wed, 12 May 2010 06:00:45 pm Milton Miller wrote:
> > > 
> > > At least for parsing, we need to allocate and parse NR_CPUS until
> > > all places like arch/powerpc/platforms/pseries/xics.c that compare a
> > > user-supplied mask to CPUMASK_ALL are eliminated.
> > 
> > Good point.  Anton will want to fix those anyway for CONFIG_CPUMASK_OFFSTACK,
> > too, but that's the reason the parsing uses nr_cpumask_bits.
> > 
> > > > Would it make sense to use my initial patch for -stable, which reverts
> > > > the ABI back to before the change that caused the problem, but apply
> > > > the correct fix (changing the ABI throughout) for future releases?
> > > 
> > > This would definitly be the conservative fix.
> > 
> > Instead of changing back to NR_CPUS which will break libnuma for
> > CPUMASK_OFFSTACK, how about changing it to nr_cpumask_bits and having an
> > explicit comment above it:
> 
> Yes and No.
> 
> 1) sched_getaffinity syscall is used from glibc and libnuma.
> 2) glibc doesn't use the return value almostly. glibc emulate it as NR_CPUS=1024.
> 3) Now, both sched_getaffinity() and compat_sys_sched_getaffinity() have nr_cpu_ids thing.
> 4) But only compat_sys_sched_getaffinity() hit libnuma problem.
> 
> I think It mean compat_sys_sched_getaffinity() should behave as sched_getaffinity().
> IOW, libnuma assume compat_sys_sched_getaffinity() return len args or NR_CPUS.
> then, following patch do it. I confirmed the patch works with or without CPUMASK_OFFSTACK.  
> 
> So, My proposal is
> 	1) merge both mine and yours to linus tree
> 	2) but backport only mine

I think we should just take yours; it seems sufficient.

Acked-by: Rusty Russell <rusty@rustcorp.com.au>

We're going to break at > 4096 cpus whatever we do; I argued with Ingo about
it when the affinity syscalls were added but noone thought we'd hit it
so soon.

Thanks,
Rusty.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: cpumask: fix compat getaffinity
  2010-05-12  8:30           ` Milton Miller
  2010-05-14 12:39             ` Rusty Russell
@ 2010-06-22 23:30             ` Rusty Russell
  1 sibling, 0 replies; 22+ messages in thread
From: Rusty Russell @ 2010-06-22 23:30 UTC (permalink / raw)
  To: Milton Miller; +Cc: Arnd Bergmann, KOSAKI Motohiro, Greg KH, linux-kernel

On Wed, 12 May 2010 06:00:45 pm Milton Miller wrote:
> 
> At least for parsing, we need to allocate and parse NR_CPUS until
> all places like arch/powerpc/platforms/pseries/xics.c that compare a
> user-supplied mask to CPUMASK_ALL are eliminated.

Hi, this has been bugging me...

We're OK in this case: cpumask_equal() will only iterate to nr_cpus.  That's
one reason for all the cpumask churn...

I'm about to submit the cpumask shrinkage to Ingo.  I'll cc you...

Cheers,
Rusty.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: cpumask: fix compat getaffinity
  2010-05-17  6:04               ` KOSAKI Motohiro
  2010-05-17 18:58                 ` Arnd Bergmann
  2010-05-18  0:57                 ` Rusty Russell
@ 2010-12-14 16:59                 ` Josh Hunt
  2010-12-15 14:40                   ` Arnd Bergmann
  2 siblings, 1 reply; 22+ messages in thread
From: Josh Hunt @ 2010-12-14 16:59 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Rusty Russell, Milton Miller, Arnd Bergmann, Greg KH,
	linux-kernel, anton

I realize this thread is fairly old, but I have a question about this
statement in the changelog:

> On NUMA systems with less than 32 possibly CPUs, the
> current compat_sys_sched_getaffinity now returns '4'
> instead of the actual NR_CPUS/8, which makes libnuma
> bail out when parsing the cpumap.

This does not seem accurate, at least not on my system with NR_CPUS=32
and nr_cpu_ids=8. The smallest value retlen will ever be set to is
sizeof(long) which is 8 on most modern systems. This breaks the
statement that this function will now return NR_CPUS/8.

Thanks
Josh


On Sun, May 16, 2010 at 11:04 PM, KOSAKI Motohiro
<kosaki.motohiro@jp.fujitsu.com> wrote:
>> On Wed, 12 May 2010 06:00:45 pm Milton Miller wrote:
>> >
>> > At least for parsing, we need to allocate and parse NR_CPUS until
>> > all places like arch/powerpc/platforms/pseries/xics.c that compare a
>> > user-supplied mask to CPUMASK_ALL are eliminated.
>>
>> Good point.  Anton will want to fix those anyway for CONFIG_CPUMASK_OFFSTACK,
>> too, but that's the reason the parsing uses nr_cpumask_bits.
>>
>> > > Would it make sense to use my initial patch for -stable, which reverts
>> > > the ABI back to before the change that caused the problem, but apply
>> > > the correct fix (changing the ABI throughout) for future releases?
>> >
>> > This would definitly be the conservative fix.
>>
>> Instead of changing back to NR_CPUS which will break libnuma for
>> CPUMASK_OFFSTACK, how about changing it to nr_cpumask_bits and having an
>> explicit comment above it:
>
> Yes and No.
>
> 1) sched_getaffinity syscall is used from glibc and libnuma.
> 2) glibc doesn't use the return value almostly. glibc emulate it as NR_CPUS=1024.
> 3) Now, both sched_getaffinity() and compat_sys_sched_getaffinity() have nr_cpu_ids thing.
> 4) But only compat_sys_sched_getaffinity() hit libnuma problem.
>
> I think It mean compat_sys_sched_getaffinity() should behave as sched_getaffinity().
> IOW, libnuma assume compat_sys_sched_getaffinity() return len args or NR_CPUS.
> then, following patch do it. I confirmed the patch works with or without CPUMASK_OFFSTACK.
>
> So, My proposal is
>        1) merge both mine and yours to linus tree
>        2) but backport only mine
>
>
> How do you think?
>
>
> ==========================================================
> From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Subject: [PATCH] cpumask: fix compat getaffinity
>
> Commit a45185d2d "cpumask: convert kernel/compat.c" broke
> libnuma, which abuses sched_getaffinity to find out NR_CPUS
> in order to parse /sys/devices/system/node/node*/cpumap.
>
> On NUMA systems with less than 32 possibly CPUs, the
> current compat_sys_sched_getaffinity now returns '4'
> instead of the actual NR_CPUS/8, which makes libnuma
> bail out when parsing the cpumap.
>
> The libnuma call sched_getaffinity(0, bitmap, 4096)
> at first. It mean the libnuma expect the return value of
> sched_getaffinity() is either len argument or NR_CPUS.
> But it doesn't expect to return nr_cpu_ids.
>
> Strictly speaking, userland requirement are
>
> 1) Glibc assume the return value mean the lengh of initialized
>   of mask argument. E.g. if sched_getaffinity(1024) return 128,
>   glibc make zero fill rest 896 byte.
> 2) Libnuma assume the return value can be used to guess NR_CPUS
>   in kernel. It assume len-arg<NR_CPUS makes -EINVAL. But
>   it try len=4096 at first and 4096 is always bigger than
>   NR_CPUS. Then, if we remove strange min_length normalization,
>   we never hit -EINVAL case.
>
> sched_getaffinity() already solved this issue. This patch
> adapt compat_sys_sched_getaffinity() as it.
>
> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> ---
>  kernel/compat.c |   25 +++++++++++--------------
>  1 files changed, 11 insertions(+), 14 deletions(-)
>
> diff --git a/kernel/compat.c b/kernel/compat.c
> index 7f40e92..5adab05 100644
> --- a/kernel/compat.c
> +++ b/kernel/compat.c
> @@ -495,29 +495,26 @@ asmlinkage long compat_sys_sched_getaffinity(compat_pid_t pid, unsigned int len,
>  {
>        int ret;
>        cpumask_var_t mask;
> -       unsigned long *k;
> -       unsigned int min_length = cpumask_size();
> -
> -       if (nr_cpu_ids <= BITS_PER_COMPAT_LONG)
> -               min_length = sizeof(compat_ulong_t);
>
> -       if (len < min_length)
> +       if ((len * BITS_PER_BYTE) < nr_cpu_ids)
> +               return -EINVAL;
> +       if (len & (sizeof(compat_ulong_t)-1))
>                return -EINVAL;
>
>        if (!alloc_cpumask_var(&mask, GFP_KERNEL))
>                return -ENOMEM;
>
>        ret = sched_getaffinity(pid, mask);
> -       if (ret < 0)
> -               goto out;
> +       if (ret == 0) {
> +               size_t retlen = min_t(size_t, len, cpumask_size());
>
> -       k = cpumask_bits(mask);
> -       ret = compat_put_bitmap(user_mask_ptr, k, min_length * 8);
> -       if (ret == 0)
> -               ret = min_length;
> -
> -out:
> +               if (compat_put_bitmap(user_mask_ptr, cpumask_bits(mask), retlen * 8))
> +                       ret = -EFAULT;
> +               else
> +                       ret = retlen;
> +       }
>        free_cpumask_var(mask);
> +
>        return ret;
>  }
>
> --
> 1.6.5.2
>
>
>
>
>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: cpumask: fix compat getaffinity
  2010-12-14 16:59                 ` Josh Hunt
@ 2010-12-15 14:40                   ` Arnd Bergmann
  0 siblings, 0 replies; 22+ messages in thread
From: Arnd Bergmann @ 2010-12-15 14:40 UTC (permalink / raw)
  To: Josh Hunt
  Cc: KOSAKI Motohiro, Rusty Russell, Milton Miller, Greg KH,
	linux-kernel, anton

On Tuesday 14 December 2010, Josh Hunt wrote:
> 
> I realize this thread is fairly old, but I have a question about this
> statement in the changelog:
> 
> > On NUMA systems with less than 32 possibly CPUs, the
> > current compat_sys_sched_getaffinity now returns '4'
> > instead of the actual NR_CPUS/8, which makes libnuma
> > bail out when parsing the cpumap.
> 
> This does not seem accurate, at least not on my system with NR_CPUS=32
> and nr_cpu_ids=8. The smallest value retlen will ever be set to is
> sizeof(long) which is 8 on most modern systems. This breaks the
> statement that this function will now return NR_CPUS/8.

Yes, I think the description is a bit misleading there.
The point was that it should return the number that the
sysfs files are based on, instead of the smallest multiple
of sizeof(compat_long) that can hold the installed CPUs.

	Arnd

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2010-12-15 14:41 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-05-07 12:45 cpumask: fix compat getaffinity Arnd Bergmann
2010-05-08  8:30 ` Rusty Russell
2010-05-08  9:11   ` Arnd Bergmann
2010-05-10 23:43     ` Rusty Russell
2010-05-11  1:47       ` KOSAKI Motohiro
2010-05-11  3:13         ` [stable] " Greg KH
2010-05-11  5:51           ` Rusty Russell
2010-05-11  6:25             ` KOSAKI Motohiro
2010-05-11  6:20           ` KOSAKI Motohiro
2010-05-11 15:20             ` Greg KH
2010-05-11 18:13               ` Arnd Bergmann
2010-05-11 21:36                 ` Greg KH
2010-05-12  8:30           ` Milton Miller
2010-05-14 12:39             ` Rusty Russell
2010-05-17  6:04               ` KOSAKI Motohiro
2010-05-17 18:58                 ` Arnd Bergmann
2010-05-18  0:57                 ` Rusty Russell
2010-12-14 16:59                 ` Josh Hunt
2010-12-15 14:40                   ` Arnd Bergmann
2010-06-22 23:30             ` Rusty Russell
2010-05-11  9:05       ` Arnd Bergmann
2010-05-12  0:52         ` KOSAKI Motohiro

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.