linux-kernel.vger.kernel.org archive mirror
From: William Lee Irwin III <wli@holomorphy.com>
To: Torsten Landschoff <torsten@debian.org>
Cc: Jörn Engel <joern@wohnheim.fh-wedel.de>,
	Linux kernel <linux-kernel@vger.kernel.org>
Subject: Re: top stack (l)users for 2.5.69
Date: Wed, 7 May 2003 09:01:44 -0700
Message-ID: <20030507160144.GS8931@holomorphy.com>
In-Reply-To: <20030507150429.GA7248@stargate.galaxy>

On Wed, May 07, 2003 at 07:47:36AM -0700, William Lee Irwin III wrote:
>> The kernel stack is (in Linux) unswappable memory that persists
>> throughout the lifetime of a thread. It's basically how many threads
>> you want to be able to cram into a system, and it matters a lot for
>> 32-bit.

On Wed, May 07, 2003 at 05:04:29PM +0200, Torsten Landschoff wrote:
> Okay, that makes sense. BTW: Why not go a step further and have just 
> one kernel stack (probably better one per CPU)?

In the task model of programming (commonly used in UNIX implementations
and for most kernels, really), a thread that is scheduled out is generally
stopped in the middle of a function call, and its register state etc. is
saved nowhere but as register spills to its stack. Each userspace thread
has a "mirror image" thread inside the kernel. Scheduling happens when
some thread inside the kernel decides it's time to check whether it
should schedule; when it does, it dumps whatever registers it hasn't
already saved to its kernel stack and then switches stacks. So the stack
is implicitly used to save per-thread state and can't really be shared
on a per-cpu basis in a UNIX-like design. UNIX IIRC dealt with the
resource scalability problem that pinning the memory would cause by
shoving kernel stacks in the u area, which could be swapped out under
sufficient duress. There was also a whole layer of scheduling that
decided when to dump whole processes to swap when too many were
competing for memory, and when to swap them back in.
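
As a rough userspace analogy (not kernel code), the task model can be
sketched with Python generators: each suspended generator keeps its own
frame, which stands in for a per-thread kernel stack holding spilled
state. The names kernel_thread and run_round_robin are made up for
illustration, and the round-robin policy is an assumption, not how any
particular scheduler behaves.

```python
from collections import deque


def kernel_thread(name, steps, log):
    # The locals (name, i) live in this generator's suspended frame,
    # playing the role of state spilled to a per-thread kernel stack.
    for i in range(steps):
        log.append((name, i))
        yield  # "schedule out" in the middle of the function


def run_round_robin(threads):
    # Every suspended thread keeps its own frame alive until it finishes,
    # so memory held grows with the number of threads.
    runqueue = deque(threads)
    while runqueue:
        thread = runqueue.popleft()
        try:
            next(thread)  # "switch stacks" and resume where it stopped
        except StopIteration:
            continue  # thread exited; its frame can be freed
        runqueue.append(thread)


log = []
run_round_robin([kernel_thread("A", 2, log), kernel_thread("B", 3, log)])
print(log)
```

The point of the sketch is only that nothing outside the suspended frame
records where each thread was, which is why one frame (stack) per thread
is needed.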

Pure per-cpu stacks would require the interrupt model of programming,
a design decision deep enough that it's debatable whether converting a
kernel to or from it is feasible at all, never mind desirable.
In that model every entry point into the kernel is treated as an
interrupt: nothing can ever sleep or be scheduled inside the kernel;
code can only register callbacks to be run when the awaited event occurs.
Scheduling happens only as a decision about which userspace task to
resume when returning from the kernel to userspace, though one could
envision a priority queue discipline for processing the registered
callbacks.
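
A minimal userspace sketch of that style, again only an analogy: entry
points never block, they register a callback plus an explicit state
record and return, and the registered callbacks are drained from a
priority queue once their event fires. EventLoop, read_start,
read_finish and the "disk_done" event are hypothetical names, and
"lower number runs first" is an assumed priority convention.

```python
import heapq
import itertools


class EventLoop:
    def __init__(self):
        self._waiting = {}             # event -> [(priority, callback, state)]
        self._ready = []               # heap of runnable callbacks
        self._seq = itertools.count()  # tie-breaker, keeps FIFO order

    def wait_for(self, event, priority, callback, state):
        # Instead of sleeping, record what to run when the event occurs.
        self._waiting.setdefault(event, []).append((priority, callback, state))

    def fire(self, event):
        # The awaited event occurred: move its callbacks to the ready heap.
        for priority, callback, state in self._waiting.pop(event, []):
            heapq.heappush(self._ready,
                           (priority, next(self._seq), callback, state))

    def run(self):
        # All callbacks run to completion on the one shared stack.
        while self._ready:
            _, _, callback, state = heapq.heappop(self._ready)
            callback(self, state)


def read_start(loop, state):
    # Entry point: cannot sleep, so save state explicitly and return.
    state["log"].append(("start", state["name"]))
    loop.wait_for("disk_done", state["priority"], read_finish, state)


def read_finish(loop, state):
    state["log"].append(("finish", state["name"]))


log = []
loop = EventLoop()
read_start(loop, {"name": "low", "priority": 5, "log": log})
read_start(loop, {"name": "high", "priority": 1, "log": log})
loop.fire("disk_done")
loop.run()
print(log)
```

Note how the state that a blocked thread would have kept on its stack
has to be carried by hand in the state dict; that is the extra state
maintenance this style trades for not needing a stack per thread.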

Many of the mechanics used for async io qualify as "partial conversions",
but in truth most of the truly difficult aspects are avoided by limiting
the style to io requests and not using it for memory allocations or
other things. It's basically Not UNIX (TM). AFAIK only a couple of
research kernels from the late '80s, QuickSilver and V (cited by
Vahalia), ever used it, though Vahalia isn't likely to give an
exhaustive list, so there may be others. But it probably makes for
impressive resource scalability numbers wrt. threads, at the cost of
some runtime overhead for more complex state maintenance.


-- wli

