linux-kernel.vger.kernel.org archive mirror
* thread stacks and strict vm overcommit accounting
@ 2007-03-13 16:33 Dan Aloni
  2007-03-15 19:06 ` Andrew Morton
  0 siblings, 1 reply; 11+ messages in thread
From: Dan Aloni @ 2007-03-13 16:33 UTC
  To: Linux Kernel List

Hello,

This question is relevant to 2.6.20.

I noticed that if the RSS for the stack size is say, 8MB, running
a single-threaded process doesn't incur an increase of 8MB to
Committed_AS (/proc/meminfo).

However, on multi-threaded apps linked with pthread (on Debian
Etch with 2.6.20 vanilla x86_64), every thread will incur the
specified maximum stack size RSS (assuming that you use
the default attr). In other words, it appears that vm accounting
works differently in that case.

Is this the intended behaviour?

-- 
Dan Aloni
XIV LTD, http://www.xivstorage.com
da-x (at) monatomic.org, dan (at) xiv.co.il

* Re: thread stacks and strict vm overcommit accounting
  2007-03-13 16:33 thread stacks and strict vm overcommit accounting Dan Aloni
@ 2007-03-15 19:06 ` Andrew Morton
  2007-03-15 20:37   ` Hugh Dickins
  2007-03-16  4:29   ` KAMEZAWA Hiroyuki
  0 siblings, 2 replies; 11+ messages in thread
From: Andrew Morton @ 2007-03-15 19:06 UTC
  To: Dan Aloni; +Cc: linux-kernel

> On Tue, 13 Mar 2007 18:33:20 +0200 Dan Aloni <da-x@monatomic.org> wrote:
> Hello,
> 
> This question is relevant to 2.6.20.
> 
> I noticed that if the RSS for the stack size is say, 8MB, running
> a single-threaded process doesn't incur an increase of 8MB to
> Committed_AS (/proc/meminfo).
> 
> However, on multi-threaded apps linked with pthread (on Debian
> Etch with 2.6.20 vanilla x86_64), every thread will incur the
> specified maximum stack size RSS (assuming that you use
> the default attr). In other words, it appears that vm accounting
> works differently in that case.
> 
> Is this the intended behaviour?

That sounds like a bug to me.

* Re: thread stacks and strict vm overcommit accounting
  2007-03-15 19:06 ` Andrew Morton
@ 2007-03-15 20:37   ` Hugh Dickins
  2007-03-15 20:59     ` Ulrich Drepper
  2007-03-15 23:33     ` Alan Cox
  2007-03-16  4:29   ` KAMEZAWA Hiroyuki
  1 sibling, 2 replies; 11+ messages in thread
From: Hugh Dickins @ 2007-03-15 20:37 UTC
  To: Andrew Morton; +Cc: Dan Aloni, linux-kernel

On Thu, 15 Mar 2007, Andrew Morton wrote:
> > On Tue, 13 Mar 2007 18:33:20 +0200 Dan Aloni <da-x@monatomic.org> wrote:
> > 
> > This question is relevant to 2.6.20.
> > 
> > I noticed that if the RSS for the stack size is say, 8MB, running

I think you meant to say RLIMIT_STACK rather than RSS, didn't you, Dan?

> > a single-threaded process doesn't incur an increase of 8MB to
> > Committed_AS (/proc/meminfo).

Stack RSS should certainly be included in Committed_AS,
but RLIMIT_STACK merely limits how big the stack vma may grow to:
at any moment the stack vma is probably very much smaller,
and only its current size is accounted in Committed_AS.

> > 
> > However, on multi-threaded apps linked with pthread (on Debian
> > Etch with 2.6.20 vanilla x86_64), every thread will incur the
> > specified maximum stack size RSS (assuming that you use
> > the default attr). In other words, it appears that vm accounting
> > works differently in that case.

I'm guessing that the pthread stacks are mmap'ed as greatest extents
(probably because that's the easiest way to keep them apart), rather
than as small MAP_GROWSDOWN areas to be expanded later on fault.
If so, then those would indeed account the maximum in Committed_AS.

> > 
> > Is this the intended behaviour?
> 
> That sounds like a bug to me.

I'm suspecting it's an oddity rather than a bug.

Hugh

* Re: thread stacks and strict vm overcommit accounting
  2007-03-15 20:37   ` Hugh Dickins
@ 2007-03-15 20:59     ` Ulrich Drepper
  2007-03-15 23:33     ` Alan Cox
  1 sibling, 0 replies; 11+ messages in thread
From: Ulrich Drepper @ 2007-03-15 20:59 UTC
  To: Hugh Dickins; +Cc: Andrew Morton, Dan Aloni, linux-kernel

On 3/15/07, Hugh Dickins <hugh@veritas.com> wrote:
> I'm guessing that the pthread stacks are mmap'ed as greatest extents
> (probably because that's the easiest way to keep them apart), rather
> than as small MAP_GROWSDOWN areas to be expanded later on fault.

Please all, forget about MAP_GROWSDOWN.  It's useless.  If thread
stacks are not completely mapped (only address space allocation is
needed, not memory allocation), subsequent unrelated mmaps can fall
into the address space which is meant to be used for the stack, hence
preventing the stack from growing.

libpthread uses an mmap for the complete stack size all the time and
this of course is accounted for in the kernel.
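
As a rough illustration of mapping the complete stack up front, the
sketch below maps a whole thread stack with a single mmap and hands it to
a thread via pthread_attr_setstack. It is not libpthread's actual code;
the names run_thread_on_mapped_stack and thread_body are hypothetical,
and the mapping flags are an assumption about what such a library might
reasonably use.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for MAP_ANONYMOUS under strict -std modes */
#endif
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>

/* Trivial thread body: touches a little of its stack and returns. */
static void *thread_body(void *arg)
{
	volatile char scratch[256];

	scratch[0] = 1;
	return arg;
}

/*
 * Map `size` bytes for a thread stack in one go, run a thread on it,
 * then unmap it again.  Returns 0 on success, -1 on failure.
 *
 * Because the mapping is private and writable, the whole `size` is
 * charged to Committed_AS at mmap time, even though only the pages the
 * thread actually touches become resident.
 */
int run_thread_on_mapped_stack(size_t size)
{
	void *stack = mmap(NULL, size, PROT_READ | PROT_WRITE,
			   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	pthread_attr_t attr;
	pthread_t tid;
	int rc = -1;

	if (stack == MAP_FAILED)
		return -1;
	if (pthread_attr_init(&attr) == 0) {
		if (pthread_attr_setstack(&attr, stack, size) == 0 &&
		    pthread_create(&tid, &attr, thread_body, NULL) == 0 &&
		    pthread_join(tid, NULL) == 0)
			rc = 0;
		pthread_attr_destroy(&attr);
	}
	munmap(stack, size);
	return rc;
}
```

Calling run_thread_on_mapped_stack(8 << 20) while watching Committed_AS
in /proc/meminfo should show the full mapping being charged for the
lifetime of the thread, which is the effect reported at the start of
this thread.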

* Re: thread stacks and strict vm overcommit accounting
  2007-03-15 23:33     ` Alan Cox
@ 2007-03-15 22:36       ` Andrew Morton
  2007-03-15 23:08         ` Hugh Dickins
  2007-03-15 23:42         ` Dan Aloni
  0 siblings, 2 replies; 11+ messages in thread
From: Andrew Morton @ 2007-03-15 22:36 UTC
  To: Alan Cox; +Cc: Hugh Dickins, Dan Aloni, linux-kernel

On Thu, 15 Mar 2007 23:33:43 +0000
Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:

> > Stack RSS should certainly be included in Committed_AS,
> > but RLIMIT_STACK merely limits how big the stack vma may grow to:
> > at any moment the stack vma is probably very much smaller,
> > and only its current size is accounted in Committed_AS.
> 
> With a typical size as a fuzz factor preaccounted in later kernels.

Where's that done?

> > > > Is this the intended behaviour?
> > > 
> > > That sounds like a bug to me.
> > 
> > I'm suspecting it's an oddity rather than a bug.
> 
> It is intended behaviour.

Each instance of

#include <unistd.h>

int main(void)
{
	sleep(100);
	return 0;
}

appears to increase Committed_AS by around 200kb.  But we've committed to
providing it with 8MB for stack.

How come this is correct?


* Re: thread stacks and strict vm overcommit accounting
  2007-03-15 22:36       ` Andrew Morton
@ 2007-03-15 23:08         ` Hugh Dickins
  2007-03-16  1:31           ` Alan Cox
  2007-03-16 14:43           ` Jakub Jelinek
  2007-03-15 23:42         ` Dan Aloni
  1 sibling, 2 replies; 11+ messages in thread
From: Hugh Dickins @ 2007-03-15 23:08 UTC
  To: Andrew Morton; +Cc: Alan Cox, Dan Aloni, linux-kernel

On Thu, 15 Mar 2007, Andrew Morton wrote:
> On Thu, 15 Mar 2007 23:33:43 +0000
> Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
> 
> > > Stack RSS should certainly be included in Committed_AS,
> > > but RLIMIT_STACK merely limits how big the stack vma may grow to:
> > > at any moment the stack vma is probably very much smaller,
> > > and only its current size is accounted in Committed_AS.
> > 
> > With a typical size as a fuzz factor preaccounted in later kernels.
> 
> Where's that done?

I don't know what Alan is referring to there.

> 
> > > > > Is this the intended behaviour?
> > > > 
> > > > That sounds like a bug to me.
> > > 
> > > I'm suspecting it's an oddity rather than a bug.
> > 
> > It is intended behaviour.

Intended in the way the different stacks are implemented,
but odd enough for us to wonder at the difference.

> 
> Each instance of
> 
> #include <unistd.h>
> 
> int main(void)
> {
> 	sleep(100);
> 	return 0;
> }
> 
> appears to increase Committed_AS by around 200kb.  But we've committed to
> providing it with 8MB for stack.
> 
> How come this is correct?

We've no more committed to providing each instance with 8MB of stack,
than we've committed to providing each instance with RLIMIT_AS of
address space.  The rlimits are limits, not commitments, surely?

Hugh

* Re: thread stacks and strict vm overcommit accounting
  2007-03-15 20:37   ` Hugh Dickins
  2007-03-15 20:59     ` Ulrich Drepper
@ 2007-03-15 23:33     ` Alan Cox
  2007-03-15 22:36       ` Andrew Morton
  1 sibling, 1 reply; 11+ messages in thread
From: Alan Cox @ 2007-03-15 23:33 UTC
  To: Hugh Dickins; +Cc: Andrew Morton, Dan Aloni, linux-kernel

> Stack RSS should certainly be included in Committed_AS,
> but RLIMIT_STACK merely limits how big the stack vma may grow to:
> at any moment the stack vma is probably very much smaller,
> and only its current size is accounted in Committed_AS.

With a typical size as a fuzz factor preaccounted in later kernels.

> > > Is this the intended behaviour?
> > 
> > That sounds like a bug to me.
> 
> I'm suspecting it's an oddity rather than a bug.

It is intended behaviour.

* Re: thread stacks and strict vm overcommit accounting
  2007-03-15 22:36       ` Andrew Morton
  2007-03-15 23:08         ` Hugh Dickins
@ 2007-03-15 23:42         ` Dan Aloni
  1 sibling, 0 replies; 11+ messages in thread
From: Dan Aloni @ 2007-03-15 23:42 UTC
  To: Andrew Morton; +Cc: Alan Cox, Hugh Dickins, linux-kernel

On Thu, Mar 15, 2007 at 03:36:13PM -0700, Andrew Morton wrote:
> 
> > > > > Is this the intended behaviour?
> > > > 
> > > > That sounds like a bug to me.
> > > 
> > > I'm suspecting it's an oddity rather than a bug.
> > 
> > It is intended behaviour.
> 
> Each instance of
> 
> #include <unistd.h>
> 
> int main(void)
> {
> 	sleep(100);
> 	return 0;
> }
> 
> appears to increase Committed_AS by around 200kb.  But we've committed to
> providing it with 8MB for stack.
> 
> How come this is correct?

Perhaps it makes a lot of sense if you regard stack growth in
the same sense that you regard heap growth by means of brk().

The mere fact that the stack is limited by default while RLIMIT_DATA
is unlimited doesn't mean that we need to account for the maximum
stack size.

Perhaps for embedded systems where you want to have overcommit_memory=2,
overcommit_ratio=100 and no swap (for design constraints), just to make
sure that allocations *always* fail before OOM gets triggered (and
therefore OOM never gets triggered, thankfully), it would have been
useful to look at Committed_AS to see how close the system is
to its maximum memory utilization potential.

Having learned about this 'oddity' in Committed_AS, I'd guess it would be
better for me not to rely on it for measurements and perhaps to set
smaller values of RLIMIT_STACK for processes on that embedded system.
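
The kind of measurement described above could be sketched as follows: a
small parser for /proc/meminfo-style text that reports how far
Committed_AS is from CommitLimit. The function names meminfo_kb and
commit_headroom_kb are hypothetical, and the sketch assumes both fields
are present and reported in kB.

```c
#include <stdio.h>
#include <string.h>

/*
 * Look up `key` (e.g. "Committed_AS") in /proc/meminfo-style text and
 * return its value in kB, or -1 if no line with that key parses.
 */
long meminfo_kb(const char *text, const char *key)
{
	size_t klen = strlen(key);
	const char *line = text;

	while (line && *line) {
		long value;

		/* A matching line looks like "Key:   12345 kB". */
		if (strncmp(line, key, klen) == 0 && line[klen] == ':' &&
		    sscanf(line + klen + 1, "%ld", &value) == 1)
			return value;
		line = strchr(line, '\n');
		if (line)
			line++;		/* step past the newline */
	}
	return -1;
}

/*
 * Store CommitLimit - Committed_AS (in kB) in *headroom.
 * Returns 0 on success, -1 if either field is missing.
 * The result may be negative when overcommit is not strict.
 */
int commit_headroom_kb(const char *text, long *headroom)
{
	long limit = meminfo_kb(text, "CommitLimit");
	long used = meminfo_kb(text, "Committed_AS");

	if (limit < 0 || used < 0)
		return -1;
	*headroom = limit - used;
	return 0;
}
```

To use it, read /proc/meminfo into a NUL-terminated buffer (for example
with fopen and fread) and pass that buffer to commit_headroom_kb. As the
rest of this thread explains, the figure charges thread stacks at their
full mapped size but the initial stack only at its current size, so it is
an estimate rather than an exact bound.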

-- 
Dan Aloni
XIV LTD, http://www.xivstorage.com
da-x (at) monatomic.org, dan (at) xiv.co.il

* Re: thread stacks and strict vm overcommit accounting
  2007-03-15 23:08         ` Hugh Dickins
@ 2007-03-16  1:31           ` Alan Cox
  2007-03-16 14:43           ` Jakub Jelinek
  1 sibling, 0 replies; 11+ messages in thread
From: Alan Cox @ 2007-03-16  1:31 UTC
  To: Hugh Dickins; +Cc: Andrew Morton, Dan Aloni, linux-kernel

> > > With a typical size as a fuzz factor preaccounted in later kernels.
> > 
> > Where's that done?
> 
> I don't know what Alan is referring to there.

fs/exec.c - we add 20 pages to the stack vma size initially.

> We've no more committed to providing each instance with 8MB of stack,
> than we've committed to providing each instance with RLIMIT_AS of
> address space.  The rlimits are limits, not commitments, surely?

Yes, it's just that the C programming language is utterly and
mindbogglingly broken when it comes to resource exhaustion for the stack.

Alan

* Re: thread stacks and strict vm overcommit accounting
  2007-03-15 19:06 ` Andrew Morton
  2007-03-15 20:37   ` Hugh Dickins
@ 2007-03-16  4:29   ` KAMEZAWA Hiroyuki
  1 sibling, 0 replies; 11+ messages in thread
From: KAMEZAWA Hiroyuki @ 2007-03-16  4:29 UTC
  To: Andrew Morton; +Cc: da-x, linux-kernel

On Thu, 15 Mar 2007 11:06:21 -0800
Andrew Morton <akpm@linux-foundation.org> wrote:

> > On Tue, 13 Mar 2007 18:33:20 +0200 Dan Aloni <da-x@monatomic.org> wrote:
> > Hello,
> > 
> > This question is relevant to 2.6.20.
> > 
> > I noticed that if the RSS for the stack size is say, 8MB, running
> > a single-threaded process doesn't incur an increase of 8MB to
> > Committed_AS (/proc/meminfo).
> > 
> > However, on multi-threaded apps linked with pthread (on Debian
> > Etch with 2.6.20 vanilla x86_64), every thread will incur the
> > specified maximum stack size RSS (assuming that you use
> > the default attr). In other words, it appears that vm accounting
> > works differently in that case.
> > 
> > Is this the intended behaviour?
> 
> That sounds like a bug to me.

AFAIK, the "main" thread's stack is marked as VM_GROWS?? and its size can
change dynamically. The "other" threads' stacks are allocated by mmap (or
maybe malloc) and they never grow. This is the difference between
multi-threaded and single-threaded processes.

So, you should be careful about the stack size when you use multi-threaded
apps and vm_overcommit_ratio at the same time. Because MAP_NORESERVE is
accounted if sysctl_overcommit_memory == OVERCOMMIT_NEVER, a program like
java will sometimes fail to create a new thread.
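
The accounting rule mentioned above can be written down as a small
decision function. This is a simplified model for illustration only, not
kernel code: the function name mapping_is_charged is hypothetical, the
enum mirrors the three values of /proc/sys/vm/overcommit_memory, and only
private anonymous mappings are considered.

```c
#include <stdbool.h>

/* Values of /proc/sys/vm/overcommit_memory (0, 1 and 2). */
enum overcommit_mode {
	OVERCOMMIT_GUESS = 0,
	OVERCOMMIT_ALWAYS = 1,
	OVERCOMMIT_NEVER = 2,
};

/*
 * Simplified model (an assumption, not the kernel's implementation):
 * is a private anonymous mapping charged to Committed_AS when created?
 */
bool mapping_is_charged(enum overcommit_mode mode, bool writable,
			bool noreserve)
{
	if (!writable)
		return false;	/* nothing can be dirtied, nothing to commit */
	if (noreserve && mode != OVERCOMMIT_NEVER)
		return false;	/* MAP_NORESERVE is honoured */
	return true;		/* strict mode ignores MAP_NORESERVE */
}
```

Under this model, a runtime that reserves thread stacks with
MAP_NORESERVE avoids the charge in modes 0 and 1, but in mode 2 every
full-size stack still counts against the commit limit, which is how
thread creation can fail.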

I have no good idea how to fix this difference, sorry.

-Kame


* Re: thread stacks and strict vm overcommit accounting
  2007-03-15 23:08         ` Hugh Dickins
  2007-03-16  1:31           ` Alan Cox
@ 2007-03-16 14:43           ` Jakub Jelinek
  1 sibling, 0 replies; 11+ messages in thread
From: Jakub Jelinek @ 2007-03-16 14:43 UTC
  To: Hugh Dickins; +Cc: Andrew Morton, Alan Cox, Dan Aloni, linux-kernel

On Thu, Mar 15, 2007 at 11:08:40PM +0000, Hugh Dickins wrote:
> > appears to increase Committed_AS by around 200kb.  But we've committed to
> > providing it with 8MB for stack.
> > 
> > How come this is correct?
> 
> We've no more committed to providing each instance with 8MB of stack,
> than we've committed to providing each instance with RLIMIT_AS of
> address space.  The rlimits are limits, not commitments, surely?

RLIMIT_STACK only applies to the initial thread; POSIX threads have just
a stack size attribute, not a maximum thread stack size attribute.
If you set it explicitly with pthread_attr_setstacksize, then libpthread
will honor whatever thread stack size you want; otherwise it just uses
some default thread stack size.  In NPTL this happens to be derived
from the RLIMIT_STACK value, but it could very well be something else.
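
A minimal sketch of querying and overriding the thread stack size through
the attribute interface mentioned above. The helper name
thread_stack_size is hypothetical; the pthread_attr_* calls are the
standard POSIX ones. Reporting the library default from a freshly
initialised attribute object is assumed here, as glibc does.

```c
#include <pthread.h>
#include <stddef.h>

/*
 * Return the stack size a new thread would be given: `requested` bytes
 * if it is nonzero, otherwise the library default.  Returns 0 on error
 * (for example when `requested` is below PTHREAD_STACK_MIN).
 */
size_t thread_stack_size(size_t requested)
{
	pthread_attr_t attr;
	size_t size = 0;

	if (pthread_attr_init(&attr) != 0)
		return 0;
	if (requested != 0 &&
	    pthread_attr_setstacksize(&attr, requested) != 0) {
		pthread_attr_destroy(&attr);
		return 0;
	}
	if (pthread_attr_getstacksize(&attr, &size) != 0)
		size = 0;
	pthread_attr_destroy(&attr);
	return size;
}
```

With strict overcommit accounting, passing a smaller explicit size (say
256 kB instead of a default derived from an 8MB RLIMIT_STACK) reduces
what each thread adds to Committed_AS, provided the threads really can
run within that much stack.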

	Jakub
