linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Andi Kleen <ak@muc.de>
Cc: Zachary Amsden <zach@vmware.com>,
	Linus Torvalds <torvalds@osdl.org>,
	Rusty Russell <rusty@rustcorp.com.au>,
	Chris Wright <chrisw@sous-sol.org>, Dan Hecht <dhecht@vmware.com>,
	Dan Arai <arai@vmware.com>, Andrew Morton <akpm@osdl.org>,
	Virtualization Mailing List <virtualization@lists.osdl.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Daniel Walker <dwalker@mvista.com>
Subject: Re: [PATCH 2/9] Sched clock paravirt op fix.patch
Date: Tue, 13 Mar 2007 08:26:27 -0700	[thread overview]
Message-ID: <45F6C2A3.4040305@goop.org> (raw)
In-Reply-To: <20070313140129.GB92373@muc.de>

Andi Kleen wrote:
> On Thu, Mar 01, 2007 at 06:54:24PM -0800, Zachary Amsden wrote:
>   
>> The custom_sched_clock hook is broken.  The result from sched_clock needs to be
>> in nanoseconds, not in CPU cycles.  The TSC is insufficient for this purpose,
>> because TSC is poorly defined in a virtual environment, and mostly represents
>> real world time instead of scheduled process time (which can be interrupted
>> without notice when a virtual machine is descheduled).
>>
>> To make the scheduler consistent, we must expose a different nature of time,
>> that is scheduled time.  So deprecate this custom_sched_clock hack and turn it
>> into a paravirt-op, as it should have been all along.  This allows the tsc.c
>> code which converts cycles to nanoseconds to be shared by all paravirt-ops
>> backends.
>>
>> It is unfortunate to add a new paravirt-op, but this is a very distinct
>> abstraction which is clearly different for all virtual machine implementations,
>> and it gets rid of an ugly indirect function which I ashamedly admit I hacked
>> in to try to get this to work earlier, and then even got in the wrong units.
>>
>> Please apply.
>>     
>
> I think it's better to remove this completely and not allow paravirt
> to hook into sched_clock.  After all a hypervisor stealing time is no
> different from interrupts stealing time and we don't try to handle
> that either. 
>   

Well, its a matter of degree.  If your busy vcpu is sharing a real cpu
with 9 other busy vcpus, then you'll only see about 10% of the real cpu
cycles.  If we had legitimate loads in which 90% of the cpu were taken
up by interrupt processing, then I think we definitely would try to
account for that time properly.

SMM interrupts, in which the CPU is stolen from the kernel entirely, is
a better analogy.  But again, they're expected to be rare and
insignificant.  In a virtual machine, on the other hand, you could
expect the majority of cpu time to be stolen by other things running on
the same physical machine.

In other words, regardless of whether this particular pv_op lives or
dies, we're going to need to have to deal with stolen time properly.  I
think this hook is reasonable and useful step towards doing that.

    J

  reply	other threads:[~2007-03-13 15:26 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-03-02  2:54 [PATCH 2/9] Sched clock paravirt op fix.patch Zachary Amsden
2007-03-13 14:01 ` Andi Kleen
2007-03-13 15:26   ` Jeremy Fitzhardinge [this message]
2007-03-13 16:07     ` Chris Wright
2007-03-13 16:16       ` Andi Kleen
2007-03-13 16:37         ` Rik van Riel
2007-03-13 20:56           ` Andi Kleen
2007-03-13 21:05             ` Jeremy Fitzhardinge
2007-03-16 21:29               ` Matt Mackall

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=45F6C2A3.4040305@goop.org \
    --to=jeremy@goop.org \
    --cc=ak@muc.de \
    --cc=akpm@osdl.org \
    --cc=arai@vmware.com \
    --cc=chrisw@sous-sol.org \
    --cc=dhecht@vmware.com \
    --cc=dwalker@mvista.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=rusty@rustcorp.com.au \
    --cc=torvalds@osdl.org \
    --cc=virtualization@lists.osdl.org \
    --cc=zach@vmware.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).