linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: john stultz <johnstul@us.ibm.com>
To: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Andi Kleen <ak@suse.de>, Ingo Molnar <mingo@elte.hu>,
	Thomas Gleixner <tglx@linutronix.de>,
	Con Kolivas <kernel@kolivas.org>,
	Rusty Russell <rusty@rustcorp.com.au>,
	Zachary Amsden <zach@vmware.com>,
	James Morris <jmorris@namei.org>,
	Chris Wright <chrisw@sous-sol.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	cpufreq@lists.linux.org.uk,
	Virtualization Mailing List <virtualization@lists.osdl.org>,
	Daniel Walker <dwalker@mvista.com>
Subject: Re: Stolen and degraded time and schedulers
Date: Tue, 13 Mar 2007 13:12:48 -0700	[thread overview]
Message-ID: <1173816769.22180.14.camel@localhost> (raw)
In-Reply-To: <45F6D1D0.6080905@goop.org>

On Tue, 2007-03-13 at 09:31 -0700, Jeremy Fitzhardinge wrote:
> The current Linux scheduler makes one big assumption: that 1ms of CPU
> time is the same as any other 1ms of CPU time, and that therefore a
> process makes the same amount of progress regardless of which particular
> ms of time it gets.
> 
> This assumption is wrong now, and will become more wrong as
> virtualization gets more widely used.
> 
> It's wrong now, because it fails to take into account of several kinds
> of missing time:
> 
>    1. interrupts - time spent in an ISR is accounted to the current
>       process, even though it gets no direct benefit
>    2. SMM - time is completely lost from the kernel
>    3. slow CPUs - 1ms of 600MHz CPU is less useful than 1ms of 2.4GHz CPU
> 
[snip]
> So how to deal with this?  Basically we need a clock which measures "CPU
> work units", and have the scheduler use this clock.
> 
> A "CPU work unit" clock has these properties:
> 
>     * inherently per-CPU (from the kernel's perspective, so it would be
>       per-VCPU in a virtual machine)
>     * monotonic - you can't do negative work
>     * measured in "work units"
[snip]
> So, how to implement this?
> 
> One quick hack would be to just make a new clocksource entrypoint, which
> returns work units rather than real-time cycles.  That would be fairly
> simple to implement, but it doesn't really take the per-cpu nature of
> the clock into account (since its possible that different cpus on the
> same machine might need their own methods).
> 
> Perhaps a better fit would be an entity which is equivalent to a
> clocksource, but registered per-cpu like (some) clockevents.
> 
> I don't have a particular preference, but I wonder what the clock gurus
> think.

My gut reaction would be to avoid using clocksources for now. While
there is some thought going into how to expand clocksources for other
uses (Daniel is working on this, for example), the design for
clocksources has been very focused on its utility to timekeeping, so I'm
hesitant to try complicate the clocksources in order to multiplex
functionality until what is really needed is well understood.

I suspect the best approach would be see how the sched_clock interface
can be reworked/used for what you want, as it's design goals map closest
to the work-unit properties you list above.

Then we can look to see how clocksources can be best used to implement
the sched_clock interface.

-john










  reply	other threads:[~2007-03-13 20:13 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-03-13 16:31 Stolen and degraded time and schedulers Jeremy Fitzhardinge
2007-03-13 20:12 ` john stultz [this message]
2007-03-13 20:32   ` Jeremy Fitzhardinge
2007-03-13 21:27     ` Daniel Walker
2007-03-13 21:59       ` Jeremy Fitzhardinge
2007-03-14  0:43         ` Dan Hecht
2007-03-14  4:37           ` Jeremy Fitzhardinge
2007-03-14 13:58             ` Lennart Sorensen
2007-03-14 15:08               ` Jeremy Fitzhardinge
2007-03-14 15:12                 ` Lennart Sorensen
2007-03-14 19:02             ` Dan Hecht
2007-03-14 19:34               ` Jeremy Fitzhardinge
2007-03-14 19:45                 ` Rik van Riel
2007-03-14 19:47                   ` Jeremy Fitzhardinge
2007-03-14 20:02                     ` Rik van Riel
2007-03-14 20:26                 ` Dan Hecht
2007-03-14 20:31                   ` Jeremy Fitzhardinge
2007-03-14 20:46                     ` Dan Hecht
2007-03-14 21:18                       ` Jeremy Fitzhardinge
2007-03-15 19:09                         ` Dan Hecht
2007-03-15 19:18                           ` Jeremy Fitzhardinge
2007-03-15 19:48                           ` Rik van Riel
2007-03-15 19:53                           ` Jeremy Fitzhardinge
2007-03-15 20:07                             ` Dan Hecht
2007-03-15 20:14                               ` Rik van Riel
2007-03-15 20:35                                 ` Dan Hecht
2007-03-16  8:59                                   ` Martin Schwidefsky
2007-03-14 20:38                 ` Ingo Molnar
2007-03-14 20:59                   ` Jeremy Fitzhardinge
2007-03-16  8:38                     ` Ingo Molnar
2007-03-16 16:53                       ` Jeremy Fitzhardinge
2007-03-15  5:23                 ` Paul Mackerras
2007-03-15 19:33                   ` Jeremy Fitzhardinge
2007-03-14  2:00         ` Daniel Walker
2007-03-14  6:52           ` Jeremy Fitzhardinge
2007-03-14  8:20             ` Zan Lynx
2007-03-14 16:11             ` Daniel Walker
2007-03-14 16:37               ` Jeremy Fitzhardinge
2007-03-14 16:59                 ` Daniel Walker
2007-03-14 17:08                   ` Jeremy Fitzhardinge
2007-03-14 18:06                     ` Daniel Walker
2007-03-14 18:41                       ` Jeremy Fitzhardinge
2007-03-14 19:00                         ` Daniel Walker
2007-03-14 19:44                           ` Jeremy Fitzhardinge
2007-03-14 20:33                             ` Daniel Walker
2007-03-14 21:16                               ` Jeremy Fitzhardinge
2007-03-14 21:34                                 ` Daniel Walker
2007-03-14 21:42                                   ` Jeremy Fitzhardinge
2007-03-14 21:36 ` Con Kolivas
2007-03-14 21:38   ` Jeremy Fitzhardinge
2007-03-14 21:40   ` Con Kolivas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1173816769.22180.14.camel@localhost \
    --to=johnstul@us.ibm.com \
    --cc=ak@suse.de \
    --cc=chrisw@sous-sol.org \
    --cc=cpufreq@lists.linux.org.uk \
    --cc=dwalker@mvista.com \
    --cc=jeremy@goop.org \
    --cc=jmorris@namei.org \
    --cc=kernel@kolivas.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=rusty@rustcorp.com.au \
    --cc=tglx@linutronix.de \
    --cc=virtualization@lists.osdl.org \
    --cc=zach@vmware.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).