From: Bill Huey (Hui) <billh@gnuppy.monkey.org>
To: Ingo Molnar <mingo@elte.hu>
Cc: Andy Isaacson <adi@hexapodia.org>, Larry McVoy <lm@bitmover.com>,
	Peter Wächtler <pwaechtler@mac.com>,
	Bill Davidsen <davidsen@tmr.com>,
	linux-kernel@vger.kernel.org,
	"Bill Huey (Hui)" <billh@gnuppy.monkey.org>
Subject: Re: 1:1 threading vs. scheduler activations (was: Re: [ANNOUNCE] Native POSIX Thread Library 0.1)
Date: Tue, 24 Sep 2002 20:08:39 -0700
Message-ID: <20020925030839.GA4746@gnuppy.monkey.org>
In-Reply-To: <Pine.LNX.4.44.0209240755060.8943-100000@localhost.localdomain>

On Tue, Sep 24, 2002 at 08:32:16AM +0200, Ingo Molnar wrote:
> yes, SA's (and KSA's) are an interesting concept, but i personally think
> they are way too much complexity - and history has shown that complexity
> never leads to anything good, especially not in OS design.

FreeBSD's KSEs. ;)

> Eg. SA's, like every M:N concept, must have a userspace component of the
> scheduler, which gets very funny when you try to implement all the things
> the kernel scheduler has had for years: fairness, SMP balancing, RT
> scheduling (!), preemption and more.

Yeah, I understand. These folks are doing some interesting stuff and
might provide some answers for you:

	http://www.research.ibm.com/K42/

This paper specifically:

	http://www.research.ibm.com/K42/white-papers/Scheduling.pdf

Their stuff isn't much different from FreeBSD's KSE project: different
names for the primitives, different communication mechanisms, etc.

> And then i havent mentioned things like upcall costs - what's the point in
> upcalling userspace which then has to schedule, instead of doing this
> stuff right in the kernel? Scheduler activations concentrate too much on
> the 5% of cases that have more userspace<->userspace context switching
> than some sort of kernel-provoked context switching. Sure, scheduler
> activations can be done, but i cannot see how they can be any better than
> 'just as fast' as a 1:1 implementation - at a much higher complexity and
> robustness cost.

Folks have been experimenting with other means of kernel/userspace
communication: a chunk of shared memory for notification, which the UTS
(the userland thread scheduler) polls whenever it gets entered by a thread
blocking on a mutex or some other operation. Upcalls are what was used in
the original Topaz OS paper that implemented SAs; Mach was the other.
That doesn't mean upcalls are used universally in all implementations.
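
A rough sketch of the shared-memory-plus-polling idea, as opposed to
upcalls. This is a toy model in Python, not code from KSE, K42 or any
real kernel; SharedArea, kernel_mark and UTS.enter are hypothetical names
made up for illustration. The "kernel" side only records state changes in
the shared area, and the UTS reads them the next time it is entered.

```python
from dataclasses import dataclass, field

RUNNING, BLOCKED = "running", "blocked"


@dataclass
class SharedArea:
    """Stand-in for a memory region mapped by both the kernel and the UTS."""
    state: dict = field(default_factory=dict)    # kernel context id -> state
    pending: list = field(default_factory=list)  # notifications not yet read

    def kernel_mark(self, ctx, state):
        # Hypothetical kernel side: record the change, make no upcall.
        self.state[ctx] = state
        self.pending.append((ctx, state))


class UTS:
    """Toy userspace thread scheduler that polls instead of being upcalled."""

    def __init__(self, shared):
        self.shared = shared

    def enter(self):
        # Called at a scheduling point, e.g. when a user thread blocks on a
        # mutex. Drain whatever the kernel side has posted since last time.
        events, self.shared.pending = self.shared.pending, []
        return events
```

With this shape the kernel never has to transfer control into userspace;
the cost is that the UTS only learns about a blocked context at its next
scheduling point.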

> the biggest part of Linux's kernel-space context switching is the cost of
> kernel entry - and the cost of kernel entry gets cheaper with every new
> generation of CPUs. Basing the whole threading design on the avoidance of
> the kernel scheduler is like basing your tent on a glacier, in a hot
> summer day.
> 
> Plus in an M:N model all the development toolchain suddenly has to
> understand the new set of contexts, debuggers, tracers, everything.

That's not an issue. Folks expect that to be so when working with any
new threading system.

> Plus there are other issues like security - it's perfectly reasonable in
> the 1:1 model for a certain set of server threads to drop all privileges
> to do the more dangerous stuff. (while there is no such thing as absolute
> security and separation in a threaded app, dropping privileges can avoid
> certain classes of exploits.)

> generally the whole SA/M:N concept creaks under the huge change that is
> introduced by having multiple userspace contexts of execution per a single
> kernel-space context of execution. Such detaching of concepts, no matter
> which kernel subsystem you look at, causes problems everywhere.

Maybe, it's probably implementation specific. I'm curious as to how K42
performs.

> eg. the VM. There's no way you can get an 'upcall' from the VM that you
> need to wait for free RAM - most of the related kernel code is simply not
> ready and restartable. So VM load can end up blocking kernel contexts
> without giving any chance to user contexts to be 'scheduled' by the
> userspace scheduler. This happens exactly in the worst moment, when load
> increases and stuff starts swapping.

That's solved by refashioning the kernel to send the UTS a blocking
notification for that backing kernel thread. That's expected of an
SA-style system.
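
To illustrate what the UTS does with such a notification, here is a toy
simulation in Python. It is only a sketch under simplifying assumptions:
user threads are generators that yield "run" or "block", worker and
run_uts are hypothetical names, and all blocked threads are assumed to
become runnable again once nothing else is left to run.

```python
from collections import deque


def worker(steps):
    """A user thread: each step is "run" (pure userspace work) or "block"."""
    for step in steps:
        yield step


def run_uts(threads):
    """Run user threads round-robin; threads maps name -> generator.

    A "block" step stands for the kernel notifying the UTS that the backing
    kernel thread blocked, so the UTS picks another runnable user thread.
    """
    runnable = deque(threads.items())
    blocked = deque()
    trace = []
    while runnable or blocked:
        if not runnable:
            # Assumption: the blocking operations have completed by now.
            runnable.extend(blocked)
            blocked.clear()
        name, gen = runnable.popleft()
        try:
            step = next(gen)
        except StopIteration:
            trace.append((name, "done"))
            continue
        trace.append((name, step))
        if step == "block":
            blocked.append((name, gen))   # park it, schedule someone else
        else:
            runnable.append((name, gen))
    return trace
```

For example, run_uts({"a": worker(["block", "run"]), "b": worker(["run"])})
runs "b" while "a" is parked, then resumes "a".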

> and there are some things that i'm not at all sure can be fixed in any
> reasonable way - eg. RT scheduling. [the userspace library would have to
> raise/drop the priority of threads in the userspace scheduler, causing an
> additional kernel entry/exit, eliminating even the theoretical advantage
> it had for pure user<->user context switches.]

KSEs have an RT scheduling category, but I don't clearly understand the
issue of preemption just yet, so I can't comment on it. I was in the process
of trying to understand this stuff at one time, since I was thinking about
working on that project.

> plus basic performance issues. If you have a healthy mix of userspace and
> kernelspace scheduler activity then you've at least doubled your icache
> footprint by having two schedulers - the dcache footprint is higher as
> well. A *single* bad cachemiss on a P4 is already almost as expensive as a
> kernel entry - and it's not like the growing gap between RAM access
> latency and CPU performance will shrink in the future. And we arent even
> using SYSENTER/SYSEXIT in the Linux kernel yet, which will shave off
> another 40% from the syscall entry (and kernel context switching) cost.

It'll be localized to the UTS, while threads that block in the kernel
are mostly going to be IO driven. I don't know about the situation where
you might have a mixture of those activities.

The infrastructure for the upcalls might incur significant overhead.

> so my current take on threading models is: if you *can* do a really fast
> and lightweight kernel based 1:1 threading implementation then you have
> won. Anything else is barely more than workarounds for (fixable)
> architectural problems. Concentrate your execution abstraction into the
> kernel and make it *really* fast and scalable - that will improve
> everything else. OTOH any improvement to the userspace thread scheduler
> only improves threaded applications - which are still the minority. Sure,
> some of the above problems can be helped, but it's not trivial - and some
> problems i dont think can be solved at all.

> But we'll see, the FreeBSD folks i think are working on KSA's so we'll
> know for sure in a couple of years.

There are a lot of ways folks can do this kind of stuff. Who knows? The
current method you folks are using could very well be the best for Linux.

I don't have much more to say about this topic.

bill


