linux-kernel.vger.kernel.org archive mirror
* RE: A way to shrink process impact on kernel memory usage?
@ 2003-05-10  1:25 Perez-Gonzalez, Inaky
  2003-05-13 14:45 ` Timothy Miller
  0 siblings, 1 reply; 5+ messages in thread
From: Perez-Gonzalez, Inaky @ 2003-05-10  1:25 UTC (permalink / raw)
  To: 'Timothy Miller', 'Linux Kernel Mailing List'


> -----Original Message-----
> From: Timothy Miller [mailto:miller@techsource.com]
> 
> One of the things that's been worked on to reduce kernel memory usage
> for processes is to shrink the kernel stack from 8k to 4k.  I mean, it's
> not like you could shrink it to 6k, right?  Well, why not?  Why not
> allocate an 8k space and put various process-related data structures at
> the beginning of it?  Sure, a stack overflow could corrupt that data,
> but a stack overflow would be disastrous anyhow.

It is being done already. At least, on i386, alloc_thread_info() 
allocates two pages; at the beginning you have the thread info
structure [context and friends]. 

The call chain is copy_process() -> dup_task_struct() -> alloc_thread_info().
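
A minimal user-space sketch of the layout described above, assuming an
8K region aligned to its own size. The names thread_info_sketch,
alloc_stack_region() and info_from_sp() are hypothetical stand-ins, not
the kernel's own code; the point is only that a structure placed at the
lowest address of the region can be found from any stack address by
masking off the low bits.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define THREAD_SIZE 8192u   /* two 4K pages, as described for i386 */

/* Hypothetical stand-in for the kernel's thread info structure. */
struct thread_info_sketch {
    int pid;
    unsigned long flags;
};

/* Allocate a THREAD_SIZE-aligned region; the info struct occupies
 * its lowest addresses and the rest is available as stack. */
static struct thread_info_sketch *alloc_stack_region(int pid)
{
    struct thread_info_sketch *ti = aligned_alloc(THREAD_SIZE, THREAD_SIZE);
    if (ti) {
        ti->pid = pid;
        ti->flags = 0;
    }
    return ti;
}

/* One past the highest byte of the region; the stack grows down
 * from here toward the info struct. */
static void *initial_stack_pointer(struct thread_info_sketch *ti)
{
    return (char *)ti + THREAD_SIZE;
}

/* Recover the info struct from any address inside the region by
 * clearing the low bits of that address. */
static struct thread_info_sketch *info_from_sp(const void *sp)
{
    uintptr_t mask = ~(uintptr_t)(THREAD_SIZE - 1);
    return (struct thread_info_sketch *)((uintptr_t)sp & mask);
}
```

Any address strictly inside the region maps back to the same structure,
which is why no separate pointer to it needs to be kept around.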

What you say makes sense, although it'd be difficult to calculate
how much space is enough. Also, the only thing you can put in there
is stuff that is specific to each thread (scheduling information,
parents, children, siblings, pid maps, timers?, used_math, comm,
fsinfo, ipc, etc.).

Thus, it'd be interesting to collapse all the common stuff in the
task_structs belonging to the same thread group into a single one,
and move whatever is thread-specific out to a thread-specific
structure [akin to thread_info, although I guess you want to keep
thread_info really small for cache performance].

That should save a lot of task_structs when threading and move all
the info to that place you mention. It is going to be a lot of work,
though; very much 2.7 material.

> Someone complained about a process structure already being too bloated.
>   Unless it's several K in size already, you can bloat it up all you
> please this way.

Not really - the more bloated, the more cache misses you will have.
There are a lot of fields that don't use all the bits and a lot
of Booleans; it'd make sense to collapse all those into a single
word if possible.
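
As a rough illustration of collapsing Booleans into one word, here is a
sketch in plain C. The flag names and both structs are hypothetical and
are not taken from task_struct; they only show the size difference and
the usual set/clear/test helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flags, one bit each. */
enum task_flag_sketch {
    TF_USED_MATH = 1u << 0,
    TF_DID_EXEC  = 1u << 1,
    TF_SWAPPABLE = 1u << 2
};

/* Before: one int per yes/no field. */
struct task_unpacked_sketch {
    int used_math;
    int did_exec;
    int swappable;
};

/* After: all yes/no fields share a single 32-bit word. */
struct task_packed_sketch {
    uint32_t flags;
};

static inline void flag_set(struct task_packed_sketch *t, uint32_t f)
{
    t->flags |= f;
}

static inline void flag_clear(struct task_packed_sketch *t, uint32_t f)
{
    t->flags &= ~f;
}

static inline int flag_test(const struct task_packed_sketch *t, uint32_t f)
{
    return (t->flags & f) != 0;
}
```

The trade-off is that updates become read-modify-write operations on a
shared word, so concurrent writers would need atomic bit operations or
a lock; that may be one of the performance concerns involved.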

> Another advantage is that you could make the data structures growable.
> The stack grows down, and the data grows up.  As long as they don't
> meet, all is well.

To solve that, you put the structures at the top of the area instead
of at the beginning. That way you are sure the stack cannot overflow
into your (very delicate) data structures, and it becomes easier to
add an overflow guard page (as the stack end is at the beginning of a
page).
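
The address arithmetic for that arrangement can be sketched as follows.
Everything here is hypothetical (the struct, the 16-byte stack
alignment, the two-page area size); it only computes where things would
sit and does not map or protect any memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_SKETCH 4096u
#define AREA_SIZE_SKETCH (2u * PAGE_SIZE_SKETCH)

/* Hypothetical per-thread data kept at the top of the area. */
struct per_thread_data_sketch {
    int pid;
    unsigned long flags;
};

struct area_layout_sketch {
    uintptr_t data;       /* start of the per-thread data (top of area) */
    uintptr_t stack_top;  /* initial stack pointer, just below the data */
    uintptr_t stack_end;  /* lowest valid stack address, the area base  */
    uintptr_t guard;      /* page below the area that would be unmapped */
};

/* Compute the layout for a page-aligned area starting at 'base'. */
static struct area_layout_sketch layout_data_on_top(uintptr_t base)
{
    struct area_layout_sketch l;

    l.data = base + AREA_SIZE_SKETCH - sizeof(struct per_thread_data_sketch);
    l.stack_top = l.data & ~(uintptr_t)15;   /* assumed 16-byte alignment */
    l.stack_end = base;
    l.guard = base - PAGE_SIZE_SKETCH;
    return l;
}
```

Because the stack grows down and away from the data, an overflow runs
off the bottom of the area and into the guard page rather than into the
structure.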

Iñaky Pérez-González -- Not speaking for Intel -- all opinions are my own
(and my fault)


* Re: A way to shrink process impact on kernel memory usage?
  2003-05-10  1:25 A way to shrink process impact on kernel memory usage? Perez-Gonzalez, Inaky
@ 2003-05-13 14:45 ` Timothy Miller
  0 siblings, 0 replies; 5+ messages in thread
From: Timothy Miller @ 2003-05-13 14:45 UTC (permalink / raw)
  To: Perez-Gonzalez, Inaky; +Cc: 'Linux Kernel Mailing List'



Perez-Gonzalez, Inaky wrote:
>>-----Original Message-----
>>From: Timothy Miller [mailto:miller@techsource.com]
>>
>>One of the things that's been worked on to reduce kernel memory usage
>>for processes is to shrink the kernel stack from 8k to 4k.  I mean, it's
>>not like you could shrink it to 6k, right?  Well, why not?  Why not
>>allocate an 8k space and put various process-related data structures at
>>the beginning of it?  Sure, a stack overflow could corrupt that data,
>>but a stack overflow would be disastrous anyhow.
> 
> 
> It is being done already. At least, on i386, alloc_thread_info() 
> allocates two pages; at the beginning you have the thread info
> structure [context and friends]. 
> 
> This is called from copy_process(), dup_task_struct(), alloc_thread_info().
> 
> However, what you say makes sense, but it'd be kind of difficult to
> calculate how much is enough ... maybe, who knows. But the only 
> thing you can put in there is stuff that is specific to each thread
> (scheduling information, parent/s, childs, siblings, pid maps,
> timers? used_math, comm, fsinfo, ipc, etc, etc ...).

If you have some data which is common to a group of threads/processes, 
it could be stored in one (or more--redundantly) of the process stacks. 
  If the refcount is not zero and the process stack holding the data is 
to die, the data can be moved to another stack or otherwise stored 
somewhere else.

> 
> Thus, it'd be interesting to collapse all the common stuff in
> the task_struct corresponding to a same thread group into a single
> one, and move whatever is thread specific out to a thread-specific
> structure [alike to thread_info, although I guess you want to keep
> thread_info really small for cache performance].
> 
> That should save a lot of task_structs when threading and move all
> the info to that place you say. It is going to be a lot of work,
> though, very kind of 2.7.
> 

It might, nevertheless, be a good and equitable solution to the problem. 
  Another way to skin a cat, as it were.

> 
>>Someone complained about a process structure already being too bloated.
>>  Unless it's several K in size already, you can bloat it up all you
>>please this way.
> 
> 
> Not really - the more bloated, the more cache misses you will have.
> There are a lot of fields that don't use all the bits and a lot
> of Booleans; it'd make sense to collapse all those into a single
> word if possible.

Most assuredly.  Why are they not already? :)

> 
>>Another advantage is that you could make the data structures growable.
>>The stack grows down, and the data grows up.  As long as they don't
>>meet, all is well.
> 
> 
> To solve that, you put the structures on the top of the area instead
> of at the beginning. That way you are sure the stack cannot overflow
> over your (very delicate) data structures, and makes it easier to add
> an overflow guard page (as the stack end is at the beginning of a
> page).

I believe I mentioned that idea.  Either the stack and data grow in 
opposite directions, with obvious advantages and risks, or the data is 
at the top of the area but therefore not growable.




* RE: A way to shrink process impact on kernel memory usage?
@ 2003-05-13 19:56 Perez-Gonzalez, Inaky
  0 siblings, 0 replies; 5+ messages in thread
From: Perez-Gonzalez, Inaky @ 2003-05-13 19:56 UTC (permalink / raw)
  To: 'Timothy Miller'; +Cc: 'Linux Kernel Mailing List'


From: Timothy Miller [mailto:miller@techsource.com]
> Perez-Gonzalez, Inaky wrote:
> 
> If you have some data which is common to a group of threads/processes,
> it could be stored in one (or more--redundantly) of the process stacks.
>   If the refcount is not zero and the process stack holding the data is
> to die, the data can be moved to another stack or otherwise stored
> somewhere else.

I don't think you need that redundancy - at the end of the day, it
is much simpler to just have the common task struct (could we say,
process?) with the shared stuff - replication is nice, but not in
this area.
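
A small sketch of the "common task struct" idea, under the assumption
that each thread keeps only its private fields plus a pointer to one
reference-counted shared structure. All names are hypothetical, and the
plain int refcount is not thread-safe; a real implementation would need
an atomic counter.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical state shared by every thread in a thread group. */
struct process_shared_sketch {
    int refcount;   /* number of threads still pointing here */
    int tgid;       /* thread group id */
};

/* Hypothetical per-thread state. */
struct thread_private_sketch {
    int tid;
    struct process_shared_sketch *proc;
};

/* Create a thread. Pass proc == NULL to start a new thread group. */
static struct thread_private_sketch *
thread_create_sketch(struct process_shared_sketch *proc, int tid)
{
    struct thread_private_sketch *t = malloc(sizeof *t);
    if (!t)
        return NULL;
    if (!proc) {
        proc = malloc(sizeof *proc);
        if (!proc) {
            free(t);
            return NULL;
        }
        proc->refcount = 0;
        proc->tgid = tid;
    }
    proc->refcount++;
    t->tid = tid;
    t->proc = proc;
    return t;
}

/* Destroy a thread. Returns 1 if this was the last thread and the
 * shared structure was freed as well, 0 otherwise. */
static int thread_exit_sketch(struct thread_private_sketch *t)
{
    struct process_shared_sketch *proc = t->proc;
    int last = (--proc->refcount == 0);

    free(t);
    if (last)
        free(proc);
    return last;
}
```

With this shape the shared data never has to migrate when one thread
dies; it simply outlives every thread but the last.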

> > Not really - the more bloated, the more cache misses you will have.
> > There are a lot of fields that don't use all the bits and a lot
> > of Booleans; it'd make sense to collapse all those into a single
> > word if possible.
> 
> Most assuredly.  Why are they not already? :)

Beats me ... maybe there are performance concerns I am not aware
of, or simply, it has not been tackled. This is something I have
on my list of "would be interesting to work on".

> > To solve that, you put the structures on the top of the area instead
> > of at the beginning. That way you are sure the stack cannot overflow
> > over your (very delicate) data structures, and makes it easier to add
> > an overflow guard page (as the stack end is at the beginning of a
> > page).
> 
> I believe I mentioned that idea.  Either the stack and data grow in
> opposite directions, with obvious advantages and risks, or the data is
> at the top of the area but therefore not growable.

Kill me ... my apologies; sometimes it seems that I don't master 
reading as much as I thought :]


Iñaky Pérez-González -- Not speaking for Intel -- all opinions are my own
(and my fault)


* Re: A way to shrink process impact on kernel memory usage?
  2003-05-09 17:10 Timothy Miller
@ 2003-05-10 20:43 ` David Woodhouse
  0 siblings, 0 replies; 5+ messages in thread
From: David Woodhouse @ 2003-05-10 20:43 UTC (permalink / raw)
  To: Timothy Miller; +Cc: Linux Kernel Mailing List

On Fri, 2003-05-09 at 18:10, Timothy Miller wrote: 
> Why not allocate an 8k space and put various process-related data
> structures at the beginning of it?  Sure, a stack overflow could
> corrupt that data, but a stack overflow would be disastrous anyhow.

No reason why not at all. That's why we've been doing it this way for
years ;)

-- 
dwmw2



* A way to shrink process impact on kernel memory usage?
@ 2003-05-09 17:10 Timothy Miller
  2003-05-10 20:43 ` David Woodhouse
  0 siblings, 1 reply; 5+ messages in thread
From: Timothy Miller @ 2003-05-09 17:10 UTC (permalink / raw)
  To: Linux Kernel Mailing List

One of the things that's been worked on to reduce kernel memory usage 
for processes is to shrink the kernel stack from 8k to 4k.  I mean, it's 
not like you could shrink it to 6k, right?  Well, why not?  Why not 
allocate an 8k space and put various process-related data structures at 
the beginning of it?  Sure, a stack overflow could corrupt that data, 
but a stack overflow would be disastrous anyhow.

I'm sure that, in addition to the memory allocated by kmalloc, some data 
structures are also allocated to track it so that you can know what to 
free when you use kfree, right?  Well, combining a few things this way 
would save a few bytes there too.

Also, if you're really worried about overflow, or you want a guard page 
or whatever, then put the data structures at the end and set the initial 
stack pointer appropriately.

Someone complained about a process structure already being too bloated. 
  Unless it's several K in size already, you can bloat it up all you 
please this way.

Another advantage is that you could make the data structures growable. 
The stack grows down, and the data grows up.  As long as they don't 
meet, all is well.
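
The "as long as they don't meet" condition can be written down as simple
bookkeeping. This is only a sketch with hypothetical names; it tracks
sizes rather than real memory, and it assumes every push is checked,
which ordinary stack use is not.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical accounting for one shared region. */
struct region_sketch {
    size_t size;        /* total bytes in the region               */
    size_t data_used;   /* bytes used by data, growing up from 0   */
    size_t stack_used;  /* bytes used by stack, growing down from size */
};

/* Bytes still free between the top of the data and the stack. */
static size_t region_free(const struct region_sketch *r)
{
    return r->size - r->data_used - r->stack_used;
}

/* Grow the data by n bytes; returns 0 on success, -1 if it would
 * run into the stack. */
static int region_grow_data(struct region_sketch *r, size_t n)
{
    if (n > region_free(r))
        return -1;
    r->data_used += n;
    return 0;
}

/* Push n bytes of stack; returns 0 on success, -1 if it would
 * run into the data. */
static int region_push_stack(struct region_sketch *r, size_t n)
{
    if (n > region_free(r))
        return -1;
    r->stack_used += n;
    return 0;
}
```

The data side can be checked like this at each growth step, but the
stack side normally cannot, so in practice the risk is that the stack
silently crosses into the data.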


