linux-kernel.vger.kernel.org archive mirror
* Question regarding MAX_ARG_STRLEN with execve()
@ 2017-06-30  6:29 Anshuman Khandual
  2017-06-30 14:22 ` Michal Hocko
  0 siblings, 1 reply; 5+ messages in thread
From: Anshuman Khandual @ 2017-06-30  6:29 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Ingo Molnar, Alexander Viro

Hello,

The execve() system call should support an argument length of
MAX_ARG_STRLEN (PAGE_SIZE * 32). On systems with a 64K page size, we
cannot pass a 32 * PAGE_SIZE argument to execve(), for the following
reasons.

* struct linux_binprm's vma starts with a size of PAGE_SIZE

	vma->vm_end = STACK_TOP_MAX;
	vma->vm_start = vma->vm_end - PAGE_SIZE;

* The VMA grows with the argument size, so for a 32 * PAGE_SIZE
  argument it becomes 33 * PAGE_SIZE.

* 33 * PAGE_SIZE with 64K pages (2112K) fails the following test in
  get_arg_page(), because it is more than 2MB (8MB / 4).

   if (size > READ_ONCE(rlim[RLIMIT_STACK].rlim_cur) / 4)

* Right now the default RLIMIT_STACK is hard coded to 8MB, which does
  not take PAGE_SIZE into account.
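
The arithmetic in the points above can be checked with a small
userspace sketch. This is not the kernel code; it only mirrors the
comparison quoted from get_arg_page(), and it takes the 33 * PAGE_SIZE
figure and the 8MB default from this message at face value. The macro
and function names are hypothetical.

```c
/* Assumed values taken from the message above, not from kernel headers. */
#define DEFAULT_STACK_LIMIT (8UL * 1024 * 1024) /* default RLIMIT_STACK */
#define ARG_PAGES           32UL                /* MAX_ARG_STRLEN / PAGE_SIZE */

/*
 * Hypothetical helper: returns 1 if a maximum-length argument would be
 * rejected by the "size > limit / 4" test, 0 otherwise.
 */
int max_arg_rejected(unsigned long page_size, unsigned long stack_limit)
{
	/* One initial page plus 32 pages for the argument string. */
	unsigned long size = (ARG_PAGES + 1) * page_size;

	return size > stack_limit / 4;
}
```

With these assumptions, max_arg_rejected(65536, DEFAULT_STACK_LIMIT)
returns 1 (2162688 > 2097152), while the same call with a 4K page
size returns 0.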

What should the solution to this problem be?

* Change the default stack size from 8MB?
* Change the ratio test from 1/4th?

Thoughts?

Regards
Anshuman


* Re: Question regarding MAX_ARG_STRLEN with execve()
  2017-06-30  6:29 Question regarding MAX_ARG_STRLEN with execve() Anshuman Khandual
@ 2017-06-30 14:22 ` Michal Hocko
  2017-07-03  8:28   ` Anshuman Khandual
  0 siblings, 1 reply; 5+ messages in thread
From: Michal Hocko @ 2017-06-30 14:22 UTC (permalink / raw)
  To: Anshuman Khandual; +Cc: Linux Kernel Mailing List, Ingo Molnar, Alexander Viro

On Fri 30-06-17 11:59:37, Anshuman Khandual wrote:
> Hello,
> 
> execve() system call should support argument length of
> MAX_ARG_STRLEN (PAGE_SIZE * 32). On 64K page size systems, we
> are not able to pass 32 * PAGE_SIZE arguments into the execve()
> system call because of the following reasons.
> 
> * struct linux_binprm's vma starts with a size of PAGE_SIZE
> 
> 	vma->vm_end = STACK_TOP_MAX;
> 	vma->vm_start = vma->vm_end - PAGE_SIZE;
> 
> * The VMA expands as much depending upon the argument size. So
>   for 32 * PAGE_SIZE argument, it becomes 33 * PAGE_SIZE.
> 
> * 33 * PAGE_SIZE with 64K pages fails the following test in
>   get_arg_page() function. 33 * PAGE_SIZE is more than 2MB
>   (8 MB /4) with 64K page size.
> 
>    if (size > READ_ONCE(rlim[RLIMIT_STACK].rlim_cur) / 4)
> 
> * Right now RLIMIT_STACK is hard coded 8MB which does not take
>   PAGE_SIZE into account. 
> 
> Wondering what should be the solution for this problem ?
> 
> * Change the default stack size from 8MB ?

Just increase the ulimit if you want to use such large arguments.
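
For a program that wants to do this itself rather than rely on the
shell's ulimit, a minimal sketch using getrlimit()/setrlimit() might
look like the following. The helper name raise_stack_limit() is
hypothetical, and the sketch assumes the hard limit permits the
requested value.

```c
#include <sys/resource.h>

/*
 * Hypothetical helper: raise the soft RLIMIT_STACK to at least
 * `wanted` bytes. Returns 0 on success and -1 on failure.
 */
int raise_stack_limit(rlim_t wanted)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_STACK, &rl) != 0)
		return -1;

	/* Nothing to do if the soft limit is already large enough. */
	if (rl.rlim_cur == RLIM_INFINITY || rl.rlim_cur >= wanted)
		return 0;

	/* An unprivileged process cannot go above the hard limit. */
	if (rl.rlim_max != RLIM_INFINITY && rl.rlim_max < wanted)
		return -1;

	rl.rlim_cur = wanted;
	return setrlimit(RLIMIT_STACK, &rl);
}
```

Resource limits are preserved across execve(), so the caller would
invoke this before execve(). Going by the numbers in the quoted
message, a 64K page system would need at least 4 * 33 * 64K = 8448K.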

> * Change the ratio test from 1/4th ?

-- 
Michal Hocko
SUSE Labs


* Re: Question regarding MAX_ARG_STRLEN with execve()
  2017-06-30 14:22 ` Michal Hocko
@ 2017-07-03  8:28   ` Anshuman Khandual
  2017-07-03  9:21     ` Michal Hocko
  0 siblings, 1 reply; 5+ messages in thread
From: Anshuman Khandual @ 2017-07-03  8:28 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Linux Kernel Mailing List, Ingo Molnar, Alexander Viro

On 06/30/2017 07:52 PM, Michal Hocko wrote:
> On Fri 30-06-17 11:59:37, Anshuman Khandual wrote:
>> Hello,
>>
>> execve() system call should support argument length of
>> MAX_ARG_STRLEN (PAGE_SIZE * 32). On 64K page size systems, we
>> are not able to pass 32 * PAGE_SIZE arguments into the execve()
>> system call because of the following reasons.
>>
>> * struct linux_binprm's vma starts with a size of PAGE_SIZE
>>
>> 	vma->vm_end = STACK_TOP_MAX;
>> 	vma->vm_start = vma->vm_end - PAGE_SIZE;
>>
>> * The VMA expands as much depending upon the argument size. So
>>   for 32 * PAGE_SIZE argument, it becomes 33 * PAGE_SIZE.
>>
>> * 33 * PAGE_SIZE with 64K pages fails the following test in
>>   get_arg_page() function. 33 * PAGE_SIZE is more than 2MB
>>   (8 MB /4) with 64K page size.
>>
>>    if (size > READ_ONCE(rlim[RLIMIT_STACK].rlim_cur) / 4)
>>
>> * Right now RLIMIT_STACK is hard coded 8MB which does not take
>>   PAGE_SIZE into account. 
>>
>> Wondering what should be the solution for this problem ?
>>
>> * Change the default stack size from 8MB ?
> just increase the ulimit if you want to use such a large arguments.
> 

Yeah, that is possible, but it still does not offset the fact that
the calculation is broken with a page size of 64K. I mean, yeah,
it's not practical to have such a large argument. But the point
is whether we want to support the MAX_ARG_STRLEN semantic for the
execve system call or not. At present it's broken for 64K, and I
am asking whether we would be willing to revisit the '1/4th of
the stack' condition.


* Re: Question regarding MAX_ARG_STRLEN with execve()
  2017-07-03  8:28   ` Anshuman Khandual
@ 2017-07-03  9:21     ` Michal Hocko
  2017-07-04 11:06       ` Anshuman Khandual
  0 siblings, 1 reply; 5+ messages in thread
From: Michal Hocko @ 2017-07-03  9:21 UTC (permalink / raw)
  To: Anshuman Khandual; +Cc: Linux Kernel Mailing List, Ingo Molnar, Alexander Viro

On Mon 03-07-17 13:58:59, Anshuman Khandual wrote:
> On 06/30/2017 07:52 PM, Michal Hocko wrote:
> > On Fri 30-06-17 11:59:37, Anshuman Khandual wrote:
> >> Hello,
> >>
> >> execve() system call should support argument length of
> >> MAX_ARG_STRLEN (PAGE_SIZE * 32). On 64K page size systems, we
> >> are not able to pass 32 * PAGE_SIZE arguments into the execve()
> >> system call because of the following reasons.
> >>
> >> * struct linux_binprm's vma starts with a size of PAGE_SIZE
> >>
> >> 	vma->vm_end = STACK_TOP_MAX;
> >> 	vma->vm_start = vma->vm_end - PAGE_SIZE;
> >>
> >> * The VMA expands as much depending upon the argument size. So
> >>   for 32 * PAGE_SIZE argument, it becomes 33 * PAGE_SIZE.
> >>
> >> * 33 * PAGE_SIZE with 64K pages fails the following test in
> >>   get_arg_page() function. 33 * PAGE_SIZE is more than 2MB
> >>   (8 MB /4) with 64K page size.
> >>
> >>    if (size > READ_ONCE(rlim[RLIMIT_STACK].rlim_cur) / 4)
> >>
> >> * Right now RLIMIT_STACK is hard coded 8MB which does not take
> >>   PAGE_SIZE into account. 
> >>
> >> Wondering what should be the solution for this problem ?
> >>
> >> * Change the default stack size from 8MB ?
> > just increase the ulimit if you want to use such a large arguments.
> > 
> 
> Yeah that is possible but it does not still offset the fact that
> the calculation is broken on the page size of 64K. I mean, yeah
> its not practical to have such a large argument. But the point
> is whether we would want to support the MAX_ARG_STRLEN semantic
> for execve system call or not. At present its broken for 64K
> and I am asking whether we will be willing to revisit the
> '1/4th of the stack' condition.

I dunno. We have had this 1/4 of RLIMIT semantic for years and there
don't seem to have been any bug reports. Yes, MAX_ARG_STRLEN being
PAGE_SIZE dependent is unfortunate because it makes an arch independent
default ulimit hard to get right, but I am not sure we actually have to
lose sleep over this.

Or do you have any specific proposal for how to "fix" this limitation
which wouldn't break other userspace?
-- 
Michal Hocko
SUSE Labs


* Re: Question regarding MAX_ARG_STRLEN with execve()
  2017-07-03  9:21     ` Michal Hocko
@ 2017-07-04 11:06       ` Anshuman Khandual
  0 siblings, 0 replies; 5+ messages in thread
From: Anshuman Khandual @ 2017-07-04 11:06 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Linux Kernel Mailing List, Ingo Molnar, Alexander Viro

On 07/03/2017 02:51 PM, Michal Hocko wrote:
> On Mon 03-07-17 13:58:59, Anshuman Khandual wrote:
>> On 06/30/2017 07:52 PM, Michal Hocko wrote:
>>> On Fri 30-06-17 11:59:37, Anshuman Khandual wrote:
>>>> Hello,
>>>>
>>>> execve() system call should support argument length of
>>>> MAX_ARG_STRLEN (PAGE_SIZE * 32). On 64K page size systems, we
>>>> are not able to pass 32 * PAGE_SIZE arguments into the execve()
>>>> system call because of the following reasons.
>>>>
>>>> * struct linux_binprm's vma starts with a size of PAGE_SIZE
>>>>
>>>> 	vma->vm_end = STACK_TOP_MAX;
>>>> 	vma->vm_start = vma->vm_end - PAGE_SIZE;
>>>>
>>>> * The VMA expands as much depending upon the argument size. So
>>>>   for 32 * PAGE_SIZE argument, it becomes 33 * PAGE_SIZE.
>>>>
>>>> * 33 * PAGE_SIZE with 64K pages fails the following test in
>>>>   get_arg_page() function. 33 * PAGE_SIZE is more than 2MB
>>>>   (8 MB /4) with 64K page size.
>>>>
>>>>    if (size > READ_ONCE(rlim[RLIMIT_STACK].rlim_cur) / 4)
>>>>
>>>> * Right now RLIMIT_STACK is hard coded 8MB which does not take
>>>>   PAGE_SIZE into account. 
>>>>
>>>> Wondering what should be the solution for this problem ?
>>>>
>>>> * Change the default stack size from 8MB ?
>>> just increase the ulimit if you want to use such a large arguments.
>>>
>>
>> Yeah that is possible but it does not still offset the fact that
>> the calculation is broken on the page size of 64K. I mean, yeah
>> its not practical to have such a large argument. But the point
>> is whether we would want to support the MAX_ARG_STRLEN semantic
>> for execve system call or not. At present its broken for 64K
>> and I am asking whether we will be willing to revisit the
>> '1/4th of the stack' condition.
> 
> I dunno. We have this 1/4 of RLIMIT semantic for years and it doesn't
> seem there were any bug reports. Yes, MAX_ARG_STRLEN being PAGE_SIZE
> dependent is unfortunate because it makes an arch independent default
> ulimit hard to get right but I am not sure we actually have to lose
> sleep over this.

I understand your point.

> 
> Or do you have any specific proposal how to "fix" this limitation which
> wouldn't break other userspace?

There are three variables here: MAX_ARG_STRLEN, RLIMIT_STACK and the 25%
condition. execve() has supported MAX_ARG_STRLEN for a long time, hence
it cannot be changed now. That leaves us to change either the default
RLIMIT_STACK value or the 25% condition. Both are kernel-internal
implementation details. But I am not sure how changing them might affect
other userspace behavior, hence asking for suggestions. I just wanted
to explore the possibilities of a fix here.
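
To put rough numbers on the two remaining knobs, here is a small
sketch. It is only arithmetic on the figures used in this thread
(33 * PAGE_SIZE for a maximum-length argument), not a proposed patch,
and both function names are hypothetical.

```c
/* Assumed from this thread: one initial page plus 32 argument pages. */
#define MAX_ARG_VMA_PAGES 33UL

/*
 * Smallest RLIMIT_STACK (in bytes) that lets a MAX_ARG_STRLEN
 * argument pass the existing "size > limit / 4" test.
 */
unsigned long min_stack_limit(unsigned long page_size)
{
	return 4UL * MAX_ARG_VMA_PAGES * page_size;
}

/*
 * Largest integer divisor d for which "size > limit / d" does not
 * reject a MAX_ARG_STRLEN argument, given a fixed stack limit.
 * Returns 0 if no divisor works.
 */
unsigned long max_divisor(unsigned long page_size, unsigned long stack_limit)
{
	return stack_limit / (MAX_ARG_VMA_PAGES * page_size);
}
```

For 64K pages this gives a minimum limit of 8650752 bytes (8448K),
and with the 8MB default the divisor would have to drop from 4 to 3.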

