linux-kernel.vger.kernel.org archive mirror
From: "Richard B. Johnson" <root@chaos.analogic.com>
To: Roland Dreier <roland@topspin.com>
Cc: Linux kernel <linux-kernel@vger.kernel.org>
Subject: Re: top stack (l)users for 2.5.69
Date: Wed, 7 May 2003 16:04:19 -0400 (EDT)
Message-ID: <Pine.LNX.4.53.0305071547060.13869@chaos>
In-Reply-To: <52n0hyo85x.fsf@topspin.com>

On Wed, 7 May 2003, Roland Dreier wrote:

>     Roland> Right.  Now think about where the kernel stack for the
>     Roland> process that is sleeping in the kernel is kept.
>
>     Richard> It's the kernel, of course. The scheduler runs in the
>     Richard> kernel under the kernel stack, with the kernel data. It
>     Richard> has nothing to do with the original user once the user
>     Richard> sleeps. The user's context was saved, the kernel was set
>     Richard> up, and the kernel will schedule other tasks until the
>     Richard> sleep time or the sleep_on event is complete.  At that
>     Richard> time, (or thereafter), the kernel will schedule the
>     Richard> previously sleeping task, its context will be restored,
>     Richard> and it continues execution.
>
>     Richard> The context of a task (see entry.S) is completely defined
>     Richard> by its registers, including the hidden part of the
>     Richard> segments (selectors) that define privilege.
>
> I'll try one more time.  Let's say a user process makes a system call
> and enters the kernel.  That system call goes through a few function
> calls in the kernel (which each push something on the kernel stack for
> that process).  Finally, the kernel has to sleep to service
> the system call (let's say it's a blocking read() waiting for some
> data to arrive on a socket).
>
> OK, now the scheduler runs, and another user process starts and makes
> its own system call, which also goes to sleep.
>
> Now say the data the original process was waiting for arrives.  The
> scheduler wakes up that process, which is in the kernel, and it
> finishes servicing the read.  This means it now returns through the
> chain of kernel function calls before returning to user space.  Each
> return in kernel space has to pop some stuff off the stack, and it
> better not get mixed up with the second process's kernel stack.
>
> That's (one reason) why each process needs its own kernel stack.
>


But no! Not at all. The context of a user does not need to be saved
on the stack, and in fact, isn't. It's saved in a task structure
that was created when the original task was born. The pointer to
that task structure is called 'current' in the kernel. It's in
the kernel's data space, and everything necessary to put that
task back together is in that structure.

Context switching is usually not done by pushing all the registers
onto a stack, then later popping them back. That's not the way
it works.

When a caller executes int 0x80, this is a software interrupt,
called a 'trap'. It enters the trap handler on the kernel stack,
with the segment selectors set up as defined for that trap-handler.
It happens because software told hardware what to do ahead of time.
Software doesn't do it during the trap event. In the trap handler,
no context switch normally occurs. This is so that the kernel can
perform privileged tasks on behalf of the caller without the
overhead of a context switch. However, all the user's registers
are saved and the kernel's data selector(s) are set so that they
can access the kernel data and the user's data (the user-mode
pointers for file/IO work, etc.). This happens in the context of
the user, but the privilege of the kernel. If the kernel-mode
function needs to sleep, the user's registers have already been
saved in its "current" structure. The kernel is free to find
some other task to load from the run queue and switch to that
task (see switch_to()).
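
A toy user-space sketch of the save-and-restore step described above,
in C. Everything named toy_* is hypothetical and invented for this
illustration; it is not the kernel's task structure or its switch_to(),
which is architecture-specific and considerably more involved. The three
registers are just a stand-in for a full register set.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: a few i386-style registers standing in for a context. */
struct toy_regs {
    unsigned long eip;   /* where execution resumes */
    unsigned long esp;   /* stack pointer */
    unsigned long eax;   /* one general-purpose register */
};

/* Hypothetical per-task structure; not the kernel's task_struct. */
struct toy_task {
    int pid;
    struct toy_regs saved;   /* context parked here while not running */
};

/* The single simulated CPU and the task it is running ("current"). */
struct toy_regs cpu;
struct toy_task *toy_current;

/*
 * Store the running task's registers in its own structure, then load
 * the next task's registers from its structure and make it current.
 */
void toy_switch_to(struct toy_task *next)
{
    assert(next != NULL);
    if (toy_current != NULL)
        toy_current->saved = cpu;
    cpu = next->saved;
    toy_current = next;
}
```

In this model, switching from task A to task B and back leaves A's
registers exactly as A last had them, because they sat in A's own
structure in between.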

Cheers,
Dick Johnson
Penguin : Linux version 2.4.20 on an i686 machine (797.90 BogoMips).
Why is the government concerned about the lunatic fringe? Think about it.
