linux-kernel.vger.kernel.org archive mirror
* [RFC/PATCH] RLIMIT_ARG_MAX
@ 2008-02-27 13:37 Peter Zijlstra
  2008-02-29 16:05 ` Linus Torvalds
  0 siblings, 1 reply; 27+ messages in thread
From: Peter Zijlstra @ 2008-02-27 13:37 UTC (permalink / raw)
  To: aaw, Andrew Morton, michael.kerrisk, carlos, Linus Torvalds, Alan Cox
  Cc: linux-kernel

Hi Linus,

Raised by: http://bugzilla.kernel.org/show_bug.cgi?id=10095 , there is
the question of whether we want to separate the env+arg arrays from the
stack proper.

Currently these arrays are considered part of the stack, and
RLIMIT_STACK includes them. However POSIX does not specify it must be
so.

The complaint is that sysconf(_SC_ARG_MAX) returns a hard coded value
(which is not obtained from the kernel) and might, depending on the
RLIMIT_STACK setting, be invalid.

POSIX disallows sysconf() variables to change during the execution of a
process, so even if it would ask the kernel for a value, we could not
give a sane answer.

The suggestion is to introduce a new RLIMIT_ARG_MAX which takes over the
role of sysconf(_SC_ARG_MAX), however this would require we either
separate these values into their own vma, or subtract 
  mm->env_end - mm->env_start + mm->arg_end - mm->arg_start
from the computed vma size when we test RLIMIT_STACK.

I'm still of two minds on this issue.. but fwiw here is a patch
implementing RLIMIT_ARG_MAX - utterly untested and doesn't consider 
!MMU.

---
Subject: RLIMIT_ARG_MAX

Having this rlimit allows userspace to determine how large argv arrays
can be (after they bother to calculate the env size).

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 fs/exec.c                      |    2 +-
 fs/proc/base.c                 |    1 +
 include/asm-generic/resource.h |    4 +++-
 mm/mmap.c                      |    6 +++++-
 4 files changed, 10 insertions(+), 3 deletions(-)

Index: linux-2.6/fs/exec.c
===================================================================
--- linux-2.6.orig/fs/exec.c
+++ linux-2.6/fs/exec.c
@@ -183,7 +183,7 @@ static struct page *get_arg_page(struct 
 		 *  - the program will have a reasonable amount of stack left
 		 *    to work from.
 		 */
-		if (size > rlim[RLIMIT_STACK].rlim_cur / 4) {
+		if (size > rlim[RLIMIT_ARG_MAX].rlim_cur) {
 			put_page(page);
 			return NULL;
 		}
Index: linux-2.6/fs/proc/base.c
===================================================================
--- linux-2.6.orig/fs/proc/base.c
+++ linux-2.6/fs/proc/base.c
@@ -412,6 +412,7 @@ static const struct limit_names lnames[R
 	[RLIMIT_NICE] = {"Max nice priority", NULL},
 	[RLIMIT_RTPRIO] = {"Max realtime priority", NULL},
 	[RLIMIT_RTTIME] = {"Max realtime timeout", "us"},
+	[RLIMIT_ARG_MAX] = {"Max env+arg space", "bytes"},
 };
 
 /* Display limits for a process */
Index: linux-2.6/include/asm-generic/resource.h
===================================================================
--- linux-2.6.orig/include/asm-generic/resource.h
+++ linux-2.6/include/asm-generic/resource.h
@@ -45,7 +45,8 @@
 					   0-39 for nice level 19 .. -20 */
 #define RLIMIT_RTPRIO		14	/* maximum realtime priority */
 #define RLIMIT_RTTIME		15	/* timeout for RT tasks in us */
-#define RLIM_NLIMITS		16
+#define RLIMIT_ARG_MAX		16	/* maximum env+arg space */
+#define RLIM_NLIMITS		17
 
 /*
  * SuS says limits have to be unsigned.
@@ -87,6 +88,7 @@
 	[RLIMIT_NICE]		= { 0, 0 },				\
 	[RLIMIT_RTPRIO]		= { 0, 0 },				\
 	[RLIMIT_RTTIME]		= {  RLIM_INFINITY,  RLIM_INFINITY },	\
+	[RLIMIT_ARG_MAX]	= { 32*PAGE_SIZE, _STK_LIM/4 },	\
 }
 
 #endif	/* __KERNEL__ */
Index: linux-2.6/mm/mmap.c
===================================================================
--- linux-2.6.orig/mm/mmap.c
+++ linux-2.6/mm/mmap.c
@@ -1516,13 +1516,17 @@ static int acct_stack_growth(struct vm_a
 	struct mm_struct *mm = vma->vm_mm;
 	struct rlimit *rlim = current->signal->rlim;
 	unsigned long new_start;
+	unsigned long env_arg_size;
 
 	/* address space limit tests */
 	if (!may_expand_vm(mm, grow))
 		return -ENOMEM;
 
+	env_arg_size = mm->env_end - mm->env_start +
+		       mm->arg_end - mm->arg_start;
+
 	/* Stack limit test */
-	if (size > rlim[RLIMIT_STACK].rlim_cur)
+	if (size - env_arg_size > rlim[RLIMIT_STACK].rlim_cur)
 		return -ENOMEM;
 
 	/* mlock limit tests */




^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC/PATCH] RLIMIT_ARG_MAX
  2008-02-27 13:37 [RFC/PATCH] RLIMIT_ARG_MAX Peter Zijlstra
@ 2008-02-29 16:05 ` Linus Torvalds
  2008-02-29 16:58   ` Michael Kerrisk
  0 siblings, 1 reply; 27+ messages in thread
From: Linus Torvalds @ 2008-02-29 16:05 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: aaw, Andrew Morton, michael.kerrisk, carlos, Alan Cox, linux-kernel



On Wed, 27 Feb 2008, Peter Zijlstra wrote:
> 
> Currently these arrays are considered part of the stack, and
> RLIMIT_STACK includes them. However POSIX does not specify it must be
> so.

What's the real advantage of this? I'm not seeing it. Just an extra 
complexity "niceness" that nobody can rely on anyway since it's not even 
specified, and older kernels won't do it.

		Linus


* Re: [RFC/PATCH] RLIMIT_ARG_MAX
  2008-02-29 16:05 ` Linus Torvalds
@ 2008-02-29 16:58   ` Michael Kerrisk
  2008-02-29 17:12     ` Linus Torvalds
  2008-02-29 17:14     ` Peter Zijlstra
  0 siblings, 2 replies; 27+ messages in thread
From: Michael Kerrisk @ 2008-02-29 16:58 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Peter Zijlstra, aaw, Andrew Morton, michael.kerrisk, carlos,
	Alan Cox, linux-kernel, drepper, mtk.manpages

[Adding Ulrich D to the CC]

On Fri, Feb 29, 2008 at 5:05 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
>
>  On Wed, 27 Feb 2008, Peter Zijlstra wrote:
>  >
>  > Currently these arrays are considered part of the stack, and
>  > RLIMIT_STACK includes them. However POSIX does not specify it must be
>  > so.
>
>  What's the real advantage of this? I'm not seeing it. Just an extra
>  complexity "niceness" that nobody can rely on anyway since it's not even
>  specified, and older kernels won't do it.

The advantages are the following:

1. We don't break the ABI.  In 2.6.23, RLIMIT_STACK acquired an
additional semantic: RLIMIT_STACK/4 specified the size for
argv+environ.  aaw@google.com added this feature to allow processes to
have much larger argument lists.  However, if the user sets
RLIMIT_STACK to less than 512k, then the amount of space for
argv+environ falls below the space guaranteed by kernel 2.6.22 and
earlier.    (Older kernels guaranteed at least 128k for argv+environ.)
 Manipulating RLIMIT_STACK did not previously have this effect.  (One
place this matters is with NPTL, where, if RLIMIT_STACK is set to
anything other than unlimited, then it is used as the default stack
size when creating new threads.  When creating many threads, it may
well be desirable to set RLIMIT_STACK to a value lower than 512k.)

While the new functionality provided by aaw@google.com's work is
useful, RLIMIT_STACK really should not have been overloaded with a
second meaning, since it is no longer possible to control stack size
without also changing the limit on argv+environ space.    Hence the
proposal of a new resource limit.

2. It provides a sane mechanism for an application to determine the
space available for argv+environ.  Formerly this space was an
invariant, advertised via sysconf(_SC_ARG_MAX).

3. The implementation details about stack size and size/location of
argv+environ can be decoupled.

Cheers,

Michael


* Re: [RFC/PATCH] RLIMIT_ARG_MAX
  2008-02-29 16:58   ` Michael Kerrisk
@ 2008-02-29 17:12     ` Linus Torvalds
  2008-02-29 17:18       ` Peter Zijlstra
  2008-02-29 17:14     ` Peter Zijlstra
  1 sibling, 1 reply; 27+ messages in thread
From: Linus Torvalds @ 2008-02-29 17:12 UTC (permalink / raw)
  To: Michael Kerrisk
  Cc: Peter Zijlstra, aaw, Andrew Morton, michael.kerrisk, carlos,
	Alan Cox, linux-kernel, drepper, mtk.manpages



On Fri, 29 Feb 2008, Michael Kerrisk wrote:

> >  What's the real advantage of this? I'm not seeing it. Just an extra
> >  complexity "niceness" that nobody can rely on anyway since it's not even
> >  specified, and older kernels won't do it.
> 
> The advantages are the following:
> 
> 1. We don't break the ABI.  In 2.6.23, RLIMIT_STACK acquired an
> additional semantic: RLIMIT_STACK/4 specified the size for
> argv+environ.

So maybe we should change *that* then, and just allow arg/env to be more 
than 25%.

> 2. It provides a sane mechanism for an application to determine the
> space available for argv+environ.  Formerly this space was an
> invariant, advertised via sysconf(_SC_ARG_MAX).

.. and what's the point? We've never had it before, nobody has ever cared, 
and the whole notion is just stupid. Why would we want to limit it? The 
only thing that the kernel *cares* about is the stack size - any other 
size limits are always going to be arbitrary.

> 3. The implementation details about stack size and size/location of
> argv+environ can be decoupled.

Now, this is a potentially interesting argument, but is it true (ie don't 
we have programs that know about the status quo) and are people actually 
planning on doing that (for what reason?) or is it just a theoretical one?

		Linus


* Re: [RFC/PATCH] RLIMIT_ARG_MAX
  2008-02-29 16:58   ` Michael Kerrisk
  2008-02-29 17:12     ` Linus Torvalds
@ 2008-02-29 17:14     ` Peter Zijlstra
  2008-02-29 17:35       ` Linus Torvalds
  1 sibling, 1 reply; 27+ messages in thread
From: Peter Zijlstra @ 2008-02-29 17:14 UTC (permalink / raw)
  To: Michael Kerrisk
  Cc: Linus Torvalds, aaw, Andrew Morton, michael.kerrisk, carlos,
	Alan Cox, linux-kernel, drepper, mtk.manpages


On Fri, 2008-02-29 at 17:58 +0100, Michael Kerrisk wrote:
> [Adding Ulrich D to the CC]
> 
> On Fri, Feb 29, 2008 at 5:05 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> >
> >  On Wed, 27 Feb 2008, Peter Zijlstra wrote:
> >  >
> >  > Currently these arrays are considered part of the stack, and
> >  > RLIMIT_STACK includes them. However POSIX does not specify it must be
> >  > so.
> >
> >  What's the real advantage of this? I'm not seeing it. Just an extra
> >  complexity "niceness" that nobody can rely on anyway since it's not even
> >  specified, and older kernels won't do it.
> 
> The advantages are the following:
> 
> 1. We don't break the ABI.  In 2.6.23, RLIMIT_STACK acquired an
> additional semantic: RLIMIT_STACK/4 specified the size for
> argv+environ.  aaw@google.com added this feature to allow processes to
> have much larger argument lists.  However, if the user sets
> RLIMIT_STACK to less than 512k, then the amount of space for
> argv+environ falls below the space guaranteed by kernel 2.6.22 and
> earlier.    (Older kernels guaranteed at least 128k for argv+environ.)
>  Manipulating RLIMIT_STACK did not previously have this effect.  (One
> place this matters is with NPTL, where, if RLIMIT_STACK is set to
> anything other than unlimited, then it is used as the default stack
> size when creating new threads.  When creating many threads, it may
> well be desirable to set RLIMIT_STACK to a value lower than 512k.)
> 
> While the new functionality provided by aaw@google.com's work is
> useful, RLIMIT_STACK really should not have been overloaded with a
> second meaning, since it is no longer possible to control stack size
> without also changing the limit on argv+environ space.    Hence the
> proposal of a new resource limit.
> 
> 2. It provides a sane mechanism for an application to determine the
> space available for argv+environ.  Formerly this space was an
> invariant, advertised via sysconf(_SC_ARG_MAX).
> 
> 3. The implementation details about stack size and size/location of
> argv+environ can be decoupled.

You fail to mention that <23 will still fault the first time it tries to
grow the stack when you set rlimit_stack to 128k and actually supply
128k of env+arg.





* Re: [RFC/PATCH] RLIMIT_ARG_MAX
  2008-02-29 17:12     ` Linus Torvalds
@ 2008-02-29 17:18       ` Peter Zijlstra
  2008-02-29 17:29         ` Linus Torvalds
  0 siblings, 1 reply; 27+ messages in thread
From: Peter Zijlstra @ 2008-02-29 17:18 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Michael Kerrisk, aaw, Andrew Morton, michael.kerrisk, carlos,
	Alan Cox, linux-kernel, drepper, mtk.manpages


On Fri, 2008-02-29 at 09:12 -0800, Linus Torvalds wrote:

> > 2. It provides a sane mechanism for an application to determine the
> > space available for argv+environ.  Formerly this space was an
> > invariant, advertised via sysconf(_SC_ARG_MAX).
> 
> ... and what's the point? We've never had it before, nobody has ever cared, 
> and the whole notion is just stupid. Why would we want to limit it? The 
> only thing that the kernel *cares* about is the stack size - any other 
> size limits are always going to be arbitrary.

Well, don't think of limiting it, but querying the limit.

Programs like xargs would need to know how much to stuff into argv
before starting a new invocation.




* Re: [RFC/PATCH] RLIMIT_ARG_MAX
  2008-02-29 17:18       ` Peter Zijlstra
@ 2008-02-29 17:29         ` Linus Torvalds
  2008-02-29 17:42           ` Peter Zijlstra
  2008-03-04 20:07           ` Pavel Machek
  0 siblings, 2 replies; 27+ messages in thread
From: Linus Torvalds @ 2008-02-29 17:29 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Michael Kerrisk, aaw, Andrew Morton, michael.kerrisk, carlos,
	Alan Cox, linux-kernel, drepper, mtk.manpages



On Fri, 29 Feb 2008, Peter Zijlstra wrote:
>
> > ... and what's the point? We've never had it before, nobody has ever cared, 
> > and the whole notion is just stupid. Why would we want to limit it? The 
> > only thing that the kernel *cares* about is the stack size - any other 
> > size limits are always going to be arbitrary.
> 
> Well, don't think of limiting it, but querying the limit.
> 
> Programs like xargs would need to know how much to stuff into argv
> before starting a new invocation.

But they already can't really do that. More importantly, isn't it better 
to just use the whole stack size then (or just return "stack size / 4" or 
whatever)?

			Linus


* Re: [RFC/PATCH] RLIMIT_ARG_MAX
  2008-02-29 17:14     ` Peter Zijlstra
@ 2008-02-29 17:35       ` Linus Torvalds
  2008-02-29 17:55         ` Peter Zijlstra
  0 siblings, 1 reply; 27+ messages in thread
From: Linus Torvalds @ 2008-02-29 17:35 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Michael Kerrisk, aaw, Andrew Morton, michael.kerrisk, carlos,
	Alan Cox, linux-kernel, drepper, mtk.manpages



On Fri, 29 Feb 2008, Peter Zijlstra wrote:
> 
> You fail to mention that <23 will still fault the first time it tries to
> grow the stack when you set rlimit_stack to 128k and actually supply
> 128k of env+arg.

So? That's what rlimit_stack has always meant (and not just on Linux 
either, afaik). That's not a bug, it's a feature. If the system has a 
limited stack, it has a limited stack. That's what RLIMIT_STACK means.

		Linus


* Re: [RFC/PATCH] RLIMIT_ARG_MAX
  2008-02-29 17:29         ` Linus Torvalds
@ 2008-02-29 17:42           ` Peter Zijlstra
  2008-02-29 18:12             ` Linus Torvalds
  2008-03-04 20:07           ` Pavel Machek
  1 sibling, 1 reply; 27+ messages in thread
From: Peter Zijlstra @ 2008-02-29 17:42 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Michael Kerrisk, aaw, Andrew Morton, michael.kerrisk, carlos,
	Alan Cox, linux-kernel, drepper, mtk.manpages


On Fri, 2008-02-29 at 09:29 -0800, Linus Torvalds wrote:
> 
> On Fri, 29 Feb 2008, Peter Zijlstra wrote:
> >
> > > ... and what's the point? We've never had it before, nobody has ever cared, 
> > > and the whole notion is just stupid. Why would we want to limit it? The 
> > > only thing that the kernel *cares* about is the stack size - any other 
> > > size limits are always going to be arbitrary.
> > 
> > Well, don't think of limiting it, but querying the limit.
> > 
> > Programs like xargs would need to know how much to stuff into argv
> > before starting a new invocation.
> 
> But they already can't really do that. 

I think they used to use sysconf(_SC_ARG_MAX) to do that.

> More importantly, isn't it better to just use the whole stack size then 

Well, we ran into trouble of freshly spawned tasks faulting on the first
stack grow. The /4 thing was to avoid that situation.

> (or just return "stack size / 4" or whatever)?

I'm all for that, trouble is that the POSIX folks specified that the
sysconf() value must be consistent during the lifetime of a process.
Which isn't true, because we can change rlimit_stack after asking. And
the linux implementation doesn't even seem to bother asking the kernel -
so there just isn't much we _can_ do here.

My suggestion was a kernel version check along with sysconf or
rlimit_stack. But I guess that made the userspace people puke :-)



* Re: [RFC/PATCH] RLIMIT_ARG_MAX
  2008-02-29 17:35       ` Linus Torvalds
@ 2008-02-29 17:55         ` Peter Zijlstra
  2008-02-29 18:14           ` Linus Torvalds
                             ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Peter Zijlstra @ 2008-02-29 17:55 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Michael Kerrisk, aaw, Andrew Morton, michael.kerrisk, carlos,
	Alan Cox, linux-kernel, drepper, mtk.manpages


On Fri, 2008-02-29 at 09:35 -0800, Linus Torvalds wrote:
> 
> On Fri, 29 Feb 2008, Peter Zijlstra wrote:
> > 
> > You fail to mention that <23 will still fault the first time it tries to
> > grow the stack when you set rlimit_stack to 128k and actually supply
> > 128k of env+arg.
> 
> So? That's what rlimit_stack has always meant (and not just on Linux 
> either, afaik). That's not a bug, it's a feature. If the system has a 
> limited stack, it has a limited stack. That's what RLIMIT_STACK means.

Well, I agree with that point. It's just that apparently POSIX does not.
According to Michael POSIX does not consider the arg+env array part of
the stack proper.





* Re: [RFC/PATCH] RLIMIT_ARG_MAX
  2008-02-29 17:42           ` Peter Zijlstra
@ 2008-02-29 18:12             ` Linus Torvalds
  2008-02-29 19:01               ` Ollie Wild
  0 siblings, 1 reply; 27+ messages in thread
From: Linus Torvalds @ 2008-02-29 18:12 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Michael Kerrisk, aaw, Andrew Morton, michael.kerrisk, carlos,
	Alan Cox, linux-kernel, drepper, mtk.manpages



On Fri, 29 Feb 2008, Peter Zijlstra wrote:
> 
> > More importantly, isn't it better to just use the whole stack size then 
> 
> Well, we ran into trouble of freshly spawned tasks faulting on the first
> stack grow. The /4 thing was to avoid that situation.

Yeah, I do see the point of wanting slop, but maybe the right thing to do 
is simply to just not make the slop be 75% of it all ;)

The thing is, RLIMIT_STACK has never been "exact" in the first place (ie 
it has *always* contained the argument and environment as a part of it), 
and that is really traditional behaviour even outside of Linux.

And I seriously doubt that RLIMIT_ARG_MAX really buys people anything 
truly wonderful, and it definitely adds just another thing you can screw 
up and make the system just behave differently depending on a config value 
that doesn't even matter to the kernel. In fact, even with that patch, 
it's *still* not going to handle the difference between the actual string 
space and the pointers themselves, or even all the other setup stuff that 
the binary loaders will put on the stack.

So it's not *going* to be exact even with RLIMIT_ARG_MAX, because it's 
going to have all those other issues to contend with - on a 64-bit 
architecture, the argument _pointers_ are often within an order of 
magnitude of the argument strings themselves, and I don't think your patch 
counted them as part of the argument/environment size (I was too lazy to 
check the sources, but I'm pretty sure argv/env_start/end is just the 
string space, not the pointers).

So rather than introduce a new thing that is not going to be trustworthy 
anyway, I'd much rather just remove the limit entirely.

Also, it all boils down to the fact that the whole argument is utter crap:

> POSIX disallows sysconf() variables to change during the execution of a
> process, so even if it would ask the kernel for a value, we could not
> give a sane answer.

If the resource limits change, then it makes not a whit of a difference 
whether _SC_ARG_MAX changes or not, because it's not going to reflect 
reality. So you might as well just continue to give the value we've always 
given (128k? I don't remember). Because it's not going to be the "real" 
value *anyway*.

This whole argument seems pointless. Has anybody ever really cared? Why 
not just keep _SC_ARG_MAX at the old (small) limit, and then the fact that 
99% of all programs won't even care, and that they can actually use much 
larger limits in real life is just gravy.

A *good* implementation would generally just do the execve() with the 
maximal arguments, and only bother to see "oh, maybe I can split it up" if 
it returns ENOMEM or whatever it does. 

So I don't see the practical value here - _SC_ARG_MAX is not worth having 
another tweaking value for that people will just always get wrong anyway 
because there is no right answer (except "I don't want my stack to grow 
too large" where it's just one of the relevant things)

			Linus


* Re: [RFC/PATCH] RLIMIT_ARG_MAX
  2008-02-29 17:55         ` Peter Zijlstra
@ 2008-02-29 18:14           ` Linus Torvalds
  2008-02-29 18:18           ` Michael Kerrisk
  2008-02-29 18:40           ` Alan Cox
  2 siblings, 0 replies; 27+ messages in thread
From: Linus Torvalds @ 2008-02-29 18:14 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Michael Kerrisk, aaw, Andrew Morton, michael.kerrisk, carlos,
	Alan Cox, linux-kernel, drepper, mtk.manpages



On Fri, 29 Feb 2008, Peter Zijlstra wrote:
> 
> Well, I agree with that point. It's just that apparently POSIX does not.
> According to Michael POSIX does not consider the arg+env array part of
> the stack proper.

I don't think that's true.

POSIX has always been guided on "what you can depend on", not "we make up 
new features". And if that has changed, then it's a problem for POSIX, not 
for Linux.

IOW, I think somebody is either reading the standard wrong, or the 
standard is not worth reading.

		Linus


* Re: [RFC/PATCH] RLIMIT_ARG_MAX
  2008-02-29 17:55         ` Peter Zijlstra
  2008-02-29 18:14           ` Linus Torvalds
@ 2008-02-29 18:18           ` Michael Kerrisk
  2008-02-29 18:39             ` Linus Torvalds
  2008-03-01  8:42             ` Geoff Clare
  2008-02-29 18:40           ` Alan Cox
  2 siblings, 2 replies; 27+ messages in thread
From: Michael Kerrisk @ 2008-02-29 18:18 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Linus Torvalds, Michael Kerrisk, aaw, Andrew Morton, carlos,
	Alan Cox, linux-kernel, drepper, mtk.manpages, Geoff Clare

Peter Zijlstra wrote:
> On Fri, 2008-02-29 at 09:35 -0800, Linus Torvalds wrote:
>> On Fri, 29 Feb 2008, Peter Zijlstra wrote:
>>> You fail to mention that <23 will still fault the first time it tries to
>>> grow the stack when you set rlimit_stack to 128k and actually supply
>>> 128k of env+arg.
>> So? That's what rlimit_stack has always meant (and not just on Linux 
>> either, afaik). That's not a bug, it's a feature. If the system has a 
>> limited stack, it has a limited stack. That's what RLIMIT_STACK means.
> 
> Well, I agree with that point. It's just that apparently POSIX does not.
> According to Michael POSIX does not consider the arg+env array part of
> the stack proper.

AFAIK, POSIX.1 makes no requirement here.  Most (all?) Unix systems have 
traditionally placed argv+environ just above the stack, but that isn't 
required.

My reading of POSIX.1 (and POSIX doesn't seem very explicit on this 
point), is that the limits on argv+environ and on stack are decoupled, 
since POSIX specifies RLIMIT_STACK and sysconf(_SC_ARG_MAX) and doesn't 
specify any relationship between the two.


* Re: [RFC/PATCH] RLIMIT_ARG_MAX
  2008-02-29 18:18           ` Michael Kerrisk
@ 2008-02-29 18:39             ` Linus Torvalds
  2008-02-29 19:49               ` Michael Kerrisk
  2008-03-01  8:42             ` Geoff Clare
  1 sibling, 1 reply; 27+ messages in thread
From: Linus Torvalds @ 2008-02-29 18:39 UTC (permalink / raw)
  To: Michael Kerrisk
  Cc: Peter Zijlstra, aaw, Andrew Morton, carlos, Alan Cox,
	linux-kernel, drepper, mtk.manpages, Geoff Clare



On Fri, 29 Feb 2008, Michael Kerrisk wrote:
> 
> My reading of POSIX.1 (and POSIX doesn't seem very explicit on this point), is
> that the limits on argv+environ and on stack are decoupled, since POSIX
> specifies RLIMIT_STACK and sysconf(_SC_ARG_MAX) and doesn't specify any
> relationship between the two.

I agree. And clearly there _are_ relationships and always have been, but 
equally clearly they simply haven't been a big issue in practice, and 
nobody really cares.

Usually, _SC_ARG_MAX is just so much smaller than RLIMIT_STACK that it 
makes no possible difference.  Which I would actually argue we should just 
continue with: just keep _SC_ARG_MAX a smallish, irrelevant constant.

We still have to have the compile-time ARG_MAX constant (as in *real* 
constant - a #define) anyway, for traditional programs, and you might as 
well make sysconf(_SC_ARG_MAX) always just match ARG_MAX.

It's not like there is likely a single user of _SC_ARG_MAX that cares.

			Linus


* Re: [RFC/PATCH] RLIMIT_ARG_MAX
  2008-02-29 17:55         ` Peter Zijlstra
  2008-02-29 18:14           ` Linus Torvalds
  2008-02-29 18:18           ` Michael Kerrisk
@ 2008-02-29 18:40           ` Alan Cox
  2 siblings, 0 replies; 27+ messages in thread
From: Alan Cox @ 2008-02-29 18:40 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Linus Torvalds, Michael Kerrisk, aaw, Andrew Morton,
	michael.kerrisk, carlos, linux-kernel, drepper, mtk.manpages

On Fri, 29 Feb 2008 18:55:56 +0100
Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:

> 
> On Fri, 2008-02-29 at 09:35 -0800, Linus Torvalds wrote:
> > 
> > On Fri, 29 Feb 2008, Peter Zijlstra wrote:
> > > 
> > > You fail to mention that <23 will still fault the first time it tries to
> > > grow the stack when you set rlimit_stack to 128k and actually supply
> > > 128k of env+arg.
> > 
> > So? That's what rlimit_stack has always meant (and not just on Linux 
> > either, afaik). That's not a bug, it's a feature. If the system has a 
> > limited stack, it has a limited stack. That's what RLIMIT_STACK means.
> 
> > Well, I agree with that point. It's just that apparently POSIX does not.
> According to Michael POSIX does not consider the arg+env array part of
> the stack proper.

As far as I can see POSIX and SuS do not care. In all the ABIs some of
your stack is already used by stuff. Posix doesn't seem to consider it
either way. By some undefined magic main() gets argc, argv, envp. Quite
frankly it could read them from a pipe before main is called.

Alan


* Re: [RFC/PATCH] RLIMIT_ARG_MAX
  2008-02-29 18:12             ` Linus Torvalds
@ 2008-02-29 19:01               ` Ollie Wild
  2008-02-29 19:09                 ` Jakub Jelinek
  0 siblings, 1 reply; 27+ messages in thread
From: Ollie Wild @ 2008-02-29 19:01 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Peter Zijlstra, Michael Kerrisk, Andrew Morton, michael.kerrisk,
	carlos, Alan Cox, linux-kernel, drepper, mtk.manpages

On Fri, Feb 29, 2008 at 10:12 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> So it's not *going* to be exact even with RLIMIT_ARG_MAX, because it's
> going to have all those other issues to contend with - on a 64-bit
> architecture, the argument _pointers_ are often within an order of
> magnitude of the argument strings themselves, and I don't think your patch
> counted them as part of the argument/environment size (I was too lazy to
> check the sources, but I'm pretty sure argv/env_start/end is just the
> string space, not the pointers).

This is precisely why I picked 25% as the maximum argument size ratio.
 In practice, that 25% can easily mean 50% or more.  If people want to
increase this, it can probably be tweaked somewhat, but switching it
to, say, 50% probably isn't a good idea.

Ollie


* Re: [RFC/PATCH] RLIMIT_ARG_MAX
  2008-02-29 19:01               ` Ollie Wild
@ 2008-02-29 19:09                 ` Jakub Jelinek
  2008-02-29 19:50                   ` Linus Torvalds
  0 siblings, 1 reply; 27+ messages in thread
From: Jakub Jelinek @ 2008-02-29 19:09 UTC (permalink / raw)
  To: Ollie Wild
  Cc: Linus Torvalds, Peter Zijlstra, Michael Kerrisk, Andrew Morton,
	michael.kerrisk, carlos, Alan Cox, linux-kernel, drepper,
	mtk.manpages

On Fri, Feb 29, 2008 at 11:01:38AM -0800, Ollie Wild wrote:
> On Fri, Feb 29, 2008 at 10:12 AM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > So it's not *going* to be exact even with RLIMIT_ARG_MAX, because it's
> > going to have all those other issues to contend with - on a 64-bit
> > architecture, the argument _pointers_ are often within an order of
> > magnitude of the argument strings themselves, and I don't think your patch
> > counted them as part of the argument/environment size (I was too lazy to
> > check the sources, but I'm pretty sure argv/env_start/end is just the
> > string space, not the pointers).
> 
> This is precisely why I picked 25% as the maximum argument size ratio.
>  In practice, that 25% can easily mean 50% or more.  If people want to
> increase this, it can probably be tweaked somewhat, but switching it
> to, say, 50% probably isn't a good idea.

I think 50% would be still fine.  And, ideally make that
MAX (RLIMIT_STACK / 2, 128KB) to avoid regressions for programs which assume
they can pass ARG_MAX args+env, even if they have say 192KB stack limit.

	Jakub


* Re: [RFC/PATCH] RLIMIT_ARG_MAX
  2008-02-29 18:39             ` Linus Torvalds
@ 2008-02-29 19:49               ` Michael Kerrisk
  2008-02-29 20:07                 ` Linus Torvalds
  0 siblings, 1 reply; 27+ messages in thread
From: Michael Kerrisk @ 2008-02-29 19:49 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Peter Zijlstra, aaw, Andrew Morton, carlos, Alan Cox,
	linux-kernel, drepper, mtk.manpages, Geoff Clare

On Fri, Feb 29, 2008 at 7:39 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
>
>  On Fri, 29 Feb 2008, Michael Kerrisk wrote:
>  >
>  > My reading of POSIX.1 (and POSIX doesn't seem very explicit on this point), is
>  > that the limits on argv+environ and on stack are decoupled, since POSIX
>  > specifies RLIMIT_STACK and sysconf(_SC_ARG_MAX) and doesn't specify any
>  > relationship between the two.
>
>  I agree. And clearly there _are_ relationships and always have been, but
>  equally clearly they simply haven't been a big issue in practice, and
>  nobody really cares.

Do we know that for sure?

>  Usually, _SC_ARG_MAX is just so much smaller than RLIMIT_STACK that it
>  makes no possible difference.  Which I would actually argue we should just
>  continue with: just keep _SC_ARG_MAX a smallish, irrelevant constant.
>
>  We still have to have the compile-time ARG_MAX constant (as in *real*
>  constant - a #define) anyway, for traditional programs, and you might as
>  well make sysconf(_SC_ARG_MAX) always just match ARG_MAX.
>
>  It's not like there is likely a single user of _SC_ARG_MAX that cares.

In my initial reply, I pointed out one example where users *may* care:
NPTL uses RLIMIT_STACK to determine the size of per-thread stacks.  It
is conceivable that users might want to set RLIMIT_STACK < 512k, and
that would have the effect of lowering the amount of space for
argv+environ below what the kernel has historically guaranteed.  That's
an ABI change, though it's unclear whether it would impact anyone in
practice.


* Re: [RFC/PATCH] RLIMIT_ARG_MAX
  2008-02-29 19:09                 ` Jakub Jelinek
@ 2008-02-29 19:50                   ` Linus Torvalds
  2008-02-29 20:03                     ` Ollie Wild
  0 siblings, 1 reply; 27+ messages in thread
From: Linus Torvalds @ 2008-02-29 19:50 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Ollie Wild, Peter Zijlstra, Michael Kerrisk, Andrew Morton,
	michael.kerrisk, carlos, Alan Cox, linux-kernel, drepper,
	mtk.manpages



On Fri, 29 Feb 2008, Jakub Jelinek wrote:
> On Fri, Feb 29, 2008 at 11:01:38AM -0800, Ollie Wild wrote:
> > 
> > This is precisely why I picked 25% as the maximum argument size ratio.
> >  In practice, that 25% can easily mean 50% or more.  If people want to
> > increase this, it can probably be tweaked somewhat, but switching it
> > to, say, 50% probably isn't a good idea.
> 
> I think 50% would be still fine.  And, ideally make that
> MAX (RLIMIT_STACK / 2, 128KB) to avoid regressions for programs which assume
> they can pass ARG_MAX args+env, even if they have say 192KB stack limit.

It would certainly be worth at least testing that as an approach.

Another thing we could decide to do is to just check the size of the stack 
that is left at the end of all the stack setup code, and just say "if it's 
less than X bytes, just return ENOMEM rather than set up a process with a 
really unusably small stack".

			Linus
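
A minimal sketch of the "stack left after setup" idea, as a userspace model
rather than kernel code. The function name check_stack_left and all three
parameters are hypothetical; in particular min_left stands in for the "X
bytes" threshold, which the thread does not pin down.

```c
#include <errno.h>

/*
 * Hypothetical check: stack_limit is the soft RLIMIT_STACK value,
 * stack_used is what argument setup has already consumed, and
 * min_left is the unspecified "X bytes" threshold.
 * Returns 0 if enough stack remains, -ENOMEM otherwise.
 */
static int check_stack_left(unsigned long stack_limit,
			    unsigned long stack_used,
			    unsigned long min_left)
{
	/* Setup already overran the limit: nothing is left at all. */
	if (stack_used > stack_limit)
		return -ENOMEM;

	/* Refuse to start a process with an unusably small stack. */
	if (stack_limit - stack_used < min_left)
		return -ENOMEM;

	return 0;
}
```

The overrun test comes first so that the subtraction cannot wrap around on
unsigned values.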


* Re: [RFC/PATCH] RLIMIT_ARG_MAX
  2008-02-29 19:50                   ` Linus Torvalds
@ 2008-02-29 20:03                     ` Ollie Wild
  0 siblings, 0 replies; 27+ messages in thread
From: Ollie Wild @ 2008-02-29 20:03 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jakub Jelinek, Peter Zijlstra, Michael Kerrisk, Andrew Morton,
	michael.kerrisk, carlos, Alan Cox, linux-kernel, drepper,
	mtk.manpages

On Fri, Feb 29, 2008 at 11:50 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Another thing we could decide to do is to just check the size of the stack
> that is left at the end of all the stack setup code, and just say "if it's
> less than X bytes, just return ENOMEM rather than set up a process with a
> really unusably small stack".

What would be a reasonable value, though?  Whereas argument space is
static and known at process execution time, required stack space is
program dependent.  If the program is going to crash, I'd rather it do
so at exec time.  Otherwise, we end up with corrupted files, partially
committed database transactions, and so forth.  I'd rather err on
the side of small argument space.

In the common situation, a 25% argument allocation is vastly larger
than the pre-2.6.23 limit.  We're really only talking about cases
where the limits have been set to unusually small values.  I'd be
interested in hearing from people that do this in practice.  Why do
they do this?  What are their expectations?

Ollie


* Re: [RFC/PATCH] RLIMIT_ARG_MAX
  2008-02-29 19:49               ` Michael Kerrisk
@ 2008-02-29 20:07                 ` Linus Torvalds
  2008-02-29 20:43                   ` Michael Kerrisk
  2008-02-29 21:57                   ` Linus Torvalds
  0 siblings, 2 replies; 27+ messages in thread
From: Linus Torvalds @ 2008-02-29 20:07 UTC (permalink / raw)
  To: Michael Kerrisk
  Cc: Peter Zijlstra, aaw, Andrew Morton, carlos, Alan Cox,
	linux-kernel, drepper, mtk.manpages, Geoff Clare



On Fri, 29 Feb 2008, Michael Kerrisk wrote:

> On Fri, Feb 29, 2008 at 7:39 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>
> >  I agree. And clearly there _are_ relationships and always have been, but
> >  equally clearly they simply haven't been a big issue in practice, and
> >  nobody really cares.
> 
> Do we know that for sure?

We *do* know for sure that the relationship has always been there. At 
least in Linux, and I bet in 99% of all other Unixes too. The arguments 
simply have traditionally been counted as part of the stack size.

Or did you mean the latter part?

The fact is, we *also* know for sure that anybody that depends on 
_SC_ARG_MAX being exact has always - and will continue to be - broken. 
Again, because of not only older kernels but also because even with the 
patch in question, we don't count argument sizes exactly.

> In my initial reply, I pointed out one example where users *may* care:
> NPTL uses RLIMIT_STACK to determine the size of per-thread stacks.  It
> is conceivable that users might want to set RLIMIT_STACK < 512k, and
> that would have the effect of lowering the amount of space for
> argv+environ below what the kernel has historically guaranteed.  That's
> an ABI change, though it's unclear whether it would impact anyone in
> practice.

I do agree that we should at least make the "MAX(stacksize/4, 128k)" 
change for backwards compatibility. That is actually a potential 
regression, but it has nothing to do with a new _SC_ARG_SIZE, because 
quite frankly, it's a regression *regardless* of whether we'd expose a new 
rlimit or not!

And one of the reasons I'm so down on new resource limits is that nobody 
then has the code to actually update them. You won't see it in "ulimit -a" 
until you have a newly compiled bash that cares etc etc, so as far as I'm 
concerned, I see that RLIMIT_ARG_MAX as nothing but a pain. It actually 
makes the code more complex, makes it work less like we and others have 
*always* worked, and doesn't even help users - quite the reverse.

I don't like unnecessary complexity. If the RLIMIT_STACK/4 check is truly 
so troublesome, let's just *remove* it, rather than add more crap and 
complexity on top of it!

Problem solved.

		Linus


* Re: [RFC/PATCH] RLIMIT_ARG_MAX
  2008-02-29 20:07                 ` Linus Torvalds
@ 2008-02-29 20:43                   ` Michael Kerrisk
  2008-02-29 21:34                     ` Linus Torvalds
  2008-02-29 21:57                   ` Linus Torvalds
  1 sibling, 1 reply; 27+ messages in thread
From: Michael Kerrisk @ 2008-02-29 20:43 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Peter Zijlstra, aaw, Andrew Morton, carlos, Alan Cox,
	linux-kernel, drepper, mtk.manpages, Geoff Clare

On Fri, Feb 29, 2008 at 9:07 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
>
>  On Fri, 29 Feb 2008, Michael Kerrisk wrote:
>
>  > On Fri, Feb 29, 2008 at 7:39 PM, Linus Torvalds
>  > <torvalds@linux-foundation.org> wrote:
>  >
>
>  > >  I agree. And clearly there _are_ relationships and always have been, but
>  > >  equally clearly they simply haven't been a big issue in practice, and
>  > >  nobody really cares.
>  >
>  > Do we know that for sure?
>
>  We *do* know for sure that the relationship has always been there. At
>  least in Linux, and I bet in 99% of all other Unixes too. The arguments
>  simply have traditionally been counted as part of the stack size.
>
>  Or did you mean the latter part?

I meant: do we know for sure that no one really cares?

>  The fact is, we *also* know for sure that anybody that depends on
>  _SC_ARG_MAX being exact has always - and will continue to be - broken.
>  Again, because of not only older kernels but also because even with the
>  patch in question, we don't count argument sizes exactly.
>
>
>  > In my initial reply, I pointed out one example where users *may* care:
>  > NPTL uses RLIMIT_STACK to determine the size of per-thread stacks.  It
>  > is conceivable that users might want to set RLIMIT_STACK < 512k, and
>  > that would have the effect of lowering the amount of space for
>  > argv+environ below what the kernel has historically guaranteed.  That's
>  > an ABI change, though it's unclear whether it would impact anyone in
>  > practice.
>
>  I do agree that we should at least make the "MAX(stacksize/4, 128k)"
>  change for backwards compatibility.

Good -- because that's probably the most important point, IMO.

> That is actually a potential
>  regression, but it has nothing to do with a new _SC_ARG_SIZE, because
>  quite frankly, it's a regression *regardless* of whether we'd expose a new
>  rlimit or not!

Agreed.

The new rlimit is primarily for the (supposed) applications that care
about knowing (at least approximately) what _SC_ARG_MAX is.  I raised
the initial bug report against glibc because applications can no
longer (post 2.6.23) do this, but I haven't done the investigation
about how many applications actually care.

Cheers,

Michael


* Re: [RFC/PATCH] RLIMIT_ARG_MAX
  2008-02-29 20:43                   ` Michael Kerrisk
@ 2008-02-29 21:34                     ` Linus Torvalds
  0 siblings, 0 replies; 27+ messages in thread
From: Linus Torvalds @ 2008-02-29 21:34 UTC (permalink / raw)
  To: Michael Kerrisk
  Cc: Peter Zijlstra, aaw, Andrew Morton, carlos, Alan Cox,
	linux-kernel, drepper, mtk.manpages, Geoff Clare



On Fri, 29 Feb 2008, Michael Kerrisk wrote:

> On Fri, Feb 29, 2008 at 9:07 PM, Linus Torvalds
> >
> >  > >  I agree. And clearly there _are_ relationships and always have been, but
> >  > >  equally clearly they simply haven't been a big issue in practice, and
> >  > >  nobody really cares.
> >  >
> >  > Do we know that for sure?
> >
> >  We *do* know for sure that the relationship has always been there. At
> >  least in Linux, and I bet in 99% of all other Unixes too. The arguments
> >  simply have traditionally been counted as part of the stack size.
> >
> >  Or did you mean the latter part?
> 
> I meant: do we know for sure that no one really cares?

Well, what I have tried to argue is that even if they care, the patch 
won't actually really help. It just moves existing behaviour around a bit, 
but leaves all the fundamental issues totally untouched in that it may 
count the strings, but not the pointers themselves etc.

More importantly, anybody who would depend on any new behaviour would 
still be screwed on all other platforms - including older Linux ones - in 
that they'd depend on some very specific behaviour that simply isn't going 
to be there in other cases.

So yeah, I can see that people could care, but they *shouldn't*.

> The new rlimit is primarily for the (supposed) applications that care
> about knowing (at least approximately) what _SC_ARG_MAX is.  I raised
> the initial bug report against glibc because applications can no
> longer (post 2.6.23) do this, but I haven't done the investigation
> about how many applications actually care.

Very few reasonably can. The thing is, in order to care, you have to count 
things like your own environment space etc, and you have to know that 
there is something you can even *do* about it if the counts go wrong.

So in practice, I think it's just about things like "xargs" and very few 
actual applications. 

I did try to do a google codesearch on "sysconf(_SC_ARG_MAX)" and it 
exists, but there wasn't a whole lot. The most logical one (and the one 
that didn't prefer the ARG_MAX #define) was the built-in xargs in ksh.

But I really didn't look very hard, just a few screenfuls of codesearch.

Realistically, "xargs" really is the main user. *Most* users of execve() 
simply either want all their arguments or none. It's not that common that 
somebody says "ok, I have a ton of arguments, but if you limit them I'll 
just use a fraction of them".

			Linus
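
The counting described above (your own environment, plus the pointers and
not only the strings) can be sketched as follows. Both helper names and the
headroom parameter are hypothetical. The sketch deliberately counts one
pointer per entry so that it over-estimates rather than under-estimates.

```c
#include <stddef.h>
#include <string.h>

/*
 * Bytes a NULL-terminated string vector (argv or envp) occupies:
 * each string with its terminating NUL, plus one pointer per entry
 * and one more for the trailing NULL.
 */
static size_t vector_bytes(char *const vec[])
{
	size_t total = sizeof(char *);	/* trailing NULL pointer */

	for (size_t i = 0; vec[i] != NULL; i++)
		total += strlen(vec[i]) + 1 + sizeof(char *);
	return total;
}

/*
 * Hypothetical budget: what remains of `limit` for new arguments once
 * the environment and a safety margin (`headroom`) are subtracted.
 * Returns 0 when nothing is left.
 */
static size_t arg_budget(size_t limit, char *const envp[], size_t headroom)
{
	size_t used = vector_bytes(envp) + headroom;

	return used < limit ? limit - used : 0;
}
```

An xargs-like caller might pass the value of sysconf(_SC_ARG_MAX) as limit
and the global environ as envp; a suitable headroom is a guess, since the
exact accounting differs between kernels.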


* Re: [RFC/PATCH] RLIMIT_ARG_MAX
  2008-02-29 20:07                 ` Linus Torvalds
  2008-02-29 20:43                   ` Michael Kerrisk
@ 2008-02-29 21:57                   ` Linus Torvalds
  2008-03-01 14:21                     ` Carlos O'Donell
  1 sibling, 1 reply; 27+ messages in thread
From: Linus Torvalds @ 2008-02-29 21:57 UTC (permalink / raw)
  To: Michael Kerrisk
  Cc: Peter Zijlstra, aaw, Andrew Morton, carlos, Alan Cox,
	linux-kernel, drepper, mtk.manpages, Geoff Clare



On Fri, 29 Feb 2008, Linus Torvalds wrote:
> 
> I do agree that we should at least make the "MAX(stacksize/4, 128k)" 
> change for backwards compatibility.

How about something like this?

The alternative is to just remove that size check entirely, and depend on 
get_user_pages() doing the stack limit check (among all the *other* checks 
it does when it does the acct_stack_growth() thing).

I'd almost prefer that simpler approach, but I don't have any really 
strong preferences. Anybody?

		Linus

---
 fs/exec.c |   10 +++++++++-
 1 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index a44b142..e91f9cb 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -173,8 +173,15 @@ static struct page *get_arg_page(struct linux_binprm *bprm, unsigned long pos,
 		return NULL;
 
 	if (write) {
-		struct rlimit *rlim = current->signal->rlim;
 		unsigned long size = bprm->vma->vm_end - bprm->vma->vm_start;
+		struct rlimit *rlim;
+
+		/*
+		 * We've historically supported up to 32 pages of argument
+		 * strings even with small stacks
+		 */
+		if (size <= 32*PAGE_SIZE)
+			return page;
 
 		/*
 		 * Limit to 1/4-th the stack size for the argv+env strings.
@@ -183,6 +190,7 @@ static struct page *get_arg_page(struct linux_binprm *bprm, unsigned long pos,
 		 *  - the program will have a reasonable amount of stack left
 		 *    to work from.
 		 */
+		rlim = current->signal->rlim;
 		if (size > rlim[RLIMIT_STACK].rlim_cur / 4) {
 			put_page(page);
 			return NULL;
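
The decision made by the patched check can be modelled in userspace as below.
This is a sketch, not the kernel code: the function name arg_size_allowed is
hypothetical, and the page size is passed in as a parameter so the model can
be exercised without kernel headers.

```c
#include <stdbool.h>

/*
 * Userspace model of the check in the patch above.
 * size:        current size of the argument area in bytes
 * stack_limit: soft RLIMIT_STACK value in bytes
 * page_size:   stands in for PAGE_SIZE
 */
static bool arg_size_allowed(unsigned long size,
			     unsigned long stack_limit,
			     unsigned long page_size)
{
	/* Up to 32 pages are always accepted, whatever the stack limit. */
	if (size <= 32 * page_size)
		return true;

	/* Beyond that, argument strings may use at most 1/4 of the stack. */
	return size <= stack_limit / 4;
}
```

The effective limit is therefore the larger of 32 pages and a quarter of the
stack limit, which with 4 KB pages matches "MAX(stacksize/4, 128k)".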


* Re: [RFC/PATCH] RLIMIT_ARG_MAX
  2008-02-29 18:18           ` Michael Kerrisk
  2008-02-29 18:39             ` Linus Torvalds
@ 2008-03-01  8:42             ` Geoff Clare
  1 sibling, 0 replies; 27+ messages in thread
From: Geoff Clare @ 2008-03-01  8:42 UTC (permalink / raw)
  To: Michael Kerrisk
  Cc: Peter Zijlstra, Linus Torvalds, aaw, Andrew Morton, carlos,
	Alan Cox, linux-kernel, drepper, mtk.manpages

Michael Kerrisk <michael.kerrisk@googlemail.com> wrote, on 29 Feb 2008:
>
> My reading of POSIX.1 (and POSIX doesn't seem very explicit on this 
> point), is that the limits on argv+environ and on stack are decoupled, 
> since POSIX specifies RLIMIT_STACK and sysconf(_SC_ARG_MAX) and doesn't 
> specify any relationship between the two.

POSIX doesn't specify any relationship between them because (as far
as POSIX is concerned) they are limits on entirely different things.
sysconf(_SC_ARG_MAX) is a limit on how much arg+env a process can
_pass_ to the exec*() functions.  The RLIMIT_* limits are limits on
the process itself.
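
To make the distinction concrete, the two values come from two unrelated
interfaces. A minimal sketch that reads both; the struct and function names
are hypothetical, while sysconf() and getrlimit() are the standard calls.

```c
#include <sys/resource.h>
#include <unistd.h>

struct exec_limits {
	long arg_max;		/* sysconf(_SC_ARG_MAX), or -1 if indeterminate */
	rlim_t stack_cur;	/* soft RLIMIT_STACK, may be RLIM_INFINITY */
};

/* Fills *out with both values; returns 0 on success, -1 on failure. */
static int query_exec_limits(struct exec_limits *out)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_STACK, &rl) != 0)
		return -1;

	/* Limit on what may be passed to exec*(). */
	out->arg_max = sysconf(_SC_ARG_MAX);
	/* Limit on the calling process itself. */
	out->stack_cur = rl.rlim_cur;
	return 0;
}
```

Nothing in either interface ties one value to the other; any coupling is an
implementation matter.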

-- 
Geoff Clare <g.clare@opengroup.org>
The Open Group, Thames Tower, Station Road, Reading, RG1 1LX, England


* Re: [RFC/PATCH] RLIMIT_ARG_MAX
  2008-02-29 21:57                   ` Linus Torvalds
@ 2008-03-01 14:21                     ` Carlos O'Donell
  0 siblings, 0 replies; 27+ messages in thread
From: Carlos O'Donell @ 2008-03-01 14:21 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Michael Kerrisk, Peter Zijlstra, aaw, Andrew Morton, Alan Cox,
	linux-kernel, drepper, mtk.manpages, Geoff Clare

Linus Torvalds wrote:
> On Fri, 29 Feb 2008, Linus Torvalds wrote:
>> I do agree that we should at least make the "MAX(stacksize/4, 128k)" 
>> change for backwards compatibility.
> 
> How about something like this?

This is perfect. As the original submitter of the bug my primary 
interest is in having the regression fixed.

> The alternative is to just remove that size check entirely, and depend on 
> get_user_pages() doing the stack limit check (among all the *other* checks 
> it does when it does the acct_stack_growth() thing).
> 
> I'd almost prefer that simpler approach, but I don't have any really 
> strong preferences. Anybody?
> 
> 		Linus
> 
> ---
>  fs/exec.c |   10 +++++++++-
>  1 files changed, 9 insertions(+), 1 deletions(-)
> 
> diff --git a/fs/exec.c b/fs/exec.c
> index a44b142..e91f9cb 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -173,8 +173,15 @@ static struct page *get_arg_page(struct linux_binprm *bprm, unsigned long pos,
>  		return NULL;
>  
>  	if (write) {
> -		struct rlimit *rlim = current->signal->rlim;
>  		unsigned long size = bprm->vma->vm_end - bprm->vma->vm_start;
> +		struct rlimit *rlim;
> +
> +		/*
> +		 * We've historically supported up to 32 pages of argument
> +		 * strings even with small stacks
> +		 */
> +		if (size <= 32*PAGE_SIZE)
> +			return page;

Could you use ARG_MAX as defined in include/linux/limits.h?
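
One way to read this question: a floor of 32 pages scales with the page size,
whereas ARG_MAX is a fixed byte count. A small sketch of the difference; the
function names are hypothetical and the constant is copied by hand from
include/linux/limits.h so that it builds without kernel headers.

```c
/* Value of ARG_MAX in include/linux/limits.h, copied for illustration. */
#define LEGACY_ARG_MAX 131072UL

/* Floor as posted in the patch: 32 pages, so it grows with page size. */
static unsigned long floor_in_pages(unsigned long page_size)
{
	return 32 * page_size;
}

/* Floor expressed as a fixed byte count, independent of page size. */
static unsigned long floor_in_bytes(void)
{
	return LEGACY_ARG_MAX;
}
```

The two agree for 4 KB pages; with 64 KB pages the page-based floor would be
2 MB while the byte-based one stays at 128 KB.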

>  
>  		/*
>  		 * Limit to 1/4-th the stack size for the argv+env strings.
> @@ -183,6 +190,7 @@ static struct page *get_arg_page(struct linux_binprm *bprm, unsigned long pos,
>  		 *  - the program will have a reasonable amount of stack left
>  		 *    to work from.
>  		 */
> +		rlim = current->signal->rlim;
>  		if (size > rlim[RLIMIT_STACK].rlim_cur / 4) {
>  			put_page(page);
>  			return NULL;

Cheers,
Carlos.
-- 
Carlos O'Donell
CodeSourcery
carlos@codesourcery.com
(650) 331-3385 x716


* Re: [RFC/PATCH] RLIMIT_ARG_MAX
  2008-02-29 17:29         ` Linus Torvalds
  2008-02-29 17:42           ` Peter Zijlstra
@ 2008-03-04 20:07           ` Pavel Machek
  1 sibling, 0 replies; 27+ messages in thread
From: Pavel Machek @ 2008-03-04 20:07 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Peter Zijlstra, Michael Kerrisk, aaw, Andrew Morton,
	michael.kerrisk, carlos, Alan Cox, linux-kernel, drepper,
	mtk.manpages

On Fri 2008-02-29 09:29:19, Linus Torvalds wrote:
> 
> 
> On Fri, 29 Feb 2008, Peter Zijlstra wrote:
> >
> > > ... and what's the point? We've never had it before, nobody has ever cared, 
> > > and the whole notion is just stupid. Why would we want to limit it? The 
> > > only thing that the kernel *cares* about is the stack size - any other 
> > > size limits are always going to be arbitrary.
> > 
> > Well, don't think of limiting it, but querying the limit.
> > 
> > Programs like xargs would need to know how much to stuff into argv
> > before starting a new invocation.
> 
> But they already can't really do that. More importantly, isn't it better 
> to just use the whole stack size then (or just return "stack size / 4" or 
> whatever)?

Using whole stack smells like a security problem to me.

...pass so many parameters that passwd dies on stack shortage. Make
sure passwd grabbed some system-wide lock before dying.

						Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html



Thread overview: 27+ messages
2008-02-27 13:37 [RFC/PATCH] RLIMIT_ARG_MAX Peter Zijlstra
2008-02-29 16:05 ` Linus Torvalds
2008-02-29 16:58   ` Michael Kerrisk
2008-02-29 17:12     ` Linus Torvalds
2008-02-29 17:18       ` Peter Zijlstra
2008-02-29 17:29         ` Linus Torvalds
2008-02-29 17:42           ` Peter Zijlstra
2008-02-29 18:12             ` Linus Torvalds
2008-02-29 19:01               ` Ollie Wild
2008-02-29 19:09                 ` Jakub Jelinek
2008-02-29 19:50                   ` Linus Torvalds
2008-02-29 20:03                     ` Ollie Wild
2008-03-04 20:07           ` Pavel Machek
2008-02-29 17:14     ` Peter Zijlstra
2008-02-29 17:35       ` Linus Torvalds
2008-02-29 17:55         ` Peter Zijlstra
2008-02-29 18:14           ` Linus Torvalds
2008-02-29 18:18           ` Michael Kerrisk
2008-02-29 18:39             ` Linus Torvalds
2008-02-29 19:49               ` Michael Kerrisk
2008-02-29 20:07                 ` Linus Torvalds
2008-02-29 20:43                   ` Michael Kerrisk
2008-02-29 21:34                     ` Linus Torvalds
2008-02-29 21:57                   ` Linus Torvalds
2008-03-01 14:21                     ` Carlos O'Donell
2008-03-01  8:42             ` Geoff Clare
2008-02-29 18:40           ` Alan Cox
