From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Glauber Costa <glommer@redhat.com>
Cc: kvm@vger.kernel.org, avi@redhat.com, zamsden@redhat.com,
	mtosatti@redhat.com, riel@redhat.com, peterz@infradead.org,
	mingo@elte.hu
Subject: Re: [RFC v2 0/7] kvm stael time implementation
Date: Mon, 30 Aug 2010 10:20:42 -0700
Message-ID: <4C7BE86A.5010609@goop.org>
In-Reply-To: <1283184391-7785-1-git-send-email-glommer@redhat.com>

On 08/30/2010 09:06 AM, Glauber Costa wrote:
> Hi,
>
> So, this is basically the same as v1, with three major
> differences:
>  1) I am posting to lkml for a wider audience
>  2) patch 2/7 fixes one problem I mentioned would happen in
>     smp systems, which is that we only update kvmclock when we
>     change pcpu
>  3) the softlockup algorithm is changed. Again, as marcelo pointed
>     out, this is open to discussion, and I am not dropping it
>     so more people can step in.
>
> I have some other patches under local test for a slightly modified
> guest part accounting, and I do somehow support extending
> the interface, and changing to nsecs (maybe not 100 %, but...). But
> I am posting in this state so lkml people can step
> in earlier.
>
> Reminder of the previous cover-letter:
>
> There are two parts of it: the guest and host part.
>
> The proposal for the guest part is to just change the
> common time accounting, and try to identify at that spot
> whether or not we should account any steal time. I considered
> this idea less cumbersome than trying to cook a clockevents
> implementation ourselves, since I see little value in it.
> I am, however, pretty open to suggestions.

What's the relationship between clockevents and stolen time?  Are you
thinking some form of timer that counts unstolen time or something?

> For the host<->guest communications, I am using a shared
> page, in the same way as pvclock. Because of that, I am just
> hijacking pvclock structure anyway. There is a 32-bit field
> floating by, that gives us enough room for 8 years of steal
> time (we use msec resolution).

Please don't.  The pvclock structure is already getting a bit packed
with stuff, and stolen time isn't really part of its job.  In Xen we
have a separate runstate structure which allows the guest to get a
detailed breakdown of the time each vcpu spends in each state (which are
guaranteed to sum to the system time).  We can use that to compute how
much time has been stolen (time spent in RUNNABLE state).  You might
consider a similar ABI for KVM, even if you can't (yet) fill out all the
time values.
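
A minimal sketch of what such a per-vcpu record could look like, loosely
modelled on the Xen runstate idea described above.  The struct, enum and
function names here are hypothetical, chosen for illustration only; they
are not the Xen ABI and not anything KVM provides:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-vcpu states, loosely modelled on the Xen runstate idea. */
enum vcpu_state {
	VCPU_RUNNING,	/* executing on a physical cpu */
	VCPU_RUNNABLE,	/* wants to run, but is waiting for a physical cpu */
	VCPU_BLOCKED,	/* idle, waiting for an event */
	VCPU_OFFLINE,	/* not available to run */
	VCPU_NR_STATES
};

/* Hypothetical shared record: nanoseconds accumulated in each state. */
struct vcpu_runstate {
	uint64_t time_ns[VCPU_NR_STATES];
};

/* Stolen time: the vcpu was runnable but was not given a physical cpu. */
static uint64_t runstate_stolen_ns(const struct vcpu_runstate *rs)
{
	assert(rs != NULL);
	return rs->time_ns[VCPU_RUNNABLE];
}

/* The per-state times are meant to add up to the elapsed system time. */
static uint64_t runstate_total_ns(const struct vcpu_runstate *rs)
{
	uint64_t total = 0;

	assert(rs != NULL);
	for (int i = 0; i < VCPU_NR_STATES; i++)
		total += rs->time_ns[i];
	return total;
}
```

A host that cannot yet fill in every state could leave the unused
entries at zero; the guest-side computation would stay the same.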


> The main idea is to timestamp our exit and entry through
> sched notifiers, and export the value at pvclock updates.
> This obviously has some disadvantages: by doing this we
> are giving up future ideas about only updating
> this structure once, and even right now, it won't work
> on pinned-smp (since we don't update pvclock if we
> haven't changed cpus.)
>
> But again, it is just an RFC, and I'd like to feel the
> reception of the idea as a whole.

The Xen code has always accounted for stolen time, so it appears in top,
vmstat, etc, and gives a user/admin some idea about how much their
domain is suffering from competition.  This doesn't require any kernel
changes aside from some calls to account_steal_ticks(); we do this every
timer interrupt, accumulating whole ticks as we get them.
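
The per-interrupt accumulation can be sketched roughly as follows.  This
is a userspace illustration, not the actual Xen code: the bookkeeping
struct, the stub standing in for account_steal_ticks(), and the tick
length (HZ=100) are all assumptions made for the example:

```c
#include <stdint.h>

#define NS_PER_TICK 10000000ULL	/* assumes HZ=100, purely for illustration */

/* Stand-in for the kernel's account_steal_ticks(); here it only keeps a sum. */
static unsigned long accounted_ticks;

static void account_steal_ticks_stub(unsigned long ticks)
{
	accounted_ticks += ticks;
}

/* Hypothetical per-cpu bookkeeping carried between timer interrupts. */
struct steal_acct {
	uint64_t last_stolen_ns;	/* cumulative stolen time at previous interrupt */
	uint64_t residual_ns;		/* stolen time not yet amounting to a whole tick */
};

/* Called on each timer interrupt with the current cumulative stolen time. */
static void steal_tick(struct steal_acct *sa, uint64_t stolen_now_ns)
{
	uint64_t pending = sa->residual_ns + (stolen_now_ns - sa->last_stolen_ns);
	unsigned long ticks = (unsigned long)(pending / NS_PER_TICK);

	sa->last_stolen_ns = stolen_now_ns;
	sa->residual_ns = pending % NS_PER_TICK;	/* carry the remainder forward */
	if (ticks)
		account_steal_ticks_stub(ticks);
}
```

Carrying the remainder forward means sub-tick amounts of stolen time are
not lost; they are reported once enough has built up to make a whole tick.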

But I've not successfully managed to make the scheduler work well with
stolen time.  I did experiment with making sched_clock return unstolen
time, on the grounds that it would give the scheduler more information
about how long things actually executed for.  But it's meaningless for
measuring sleep/idle times, and it causes the different cpus' timebases
to drift severely, which causes other things in the kernel to get upset.
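
To illustrate the drift: if each cpu's clock only advances while its vcpu
is actually running, two vcpus that lose different amounts of time to the
host end up with different clock readings at the same wall-clock instant.
A small sketch, with hypothetical names and no relation to the real
sched_clock implementation:

```c
#include <stdint.h>

/* Hypothetical per-cpu view: wall-clock time elapsed and time stolen so far. */
struct cpu_time {
	uint64_t elapsed_ns;
	uint64_t stolen_ns;
};

/* An "unstolen" clock only advances while the vcpu is actually running. */
static uint64_t unstolen_clock_ns(const struct cpu_time *t)
{
	return t->elapsed_ns - t->stolen_ns;
}

/* Skew between two cpus' unstolen clocks at the same wall-clock instant. */
static int64_t unstolen_skew_ns(const struct cpu_time *a,
				const struct cpu_time *b)
{
	return (int64_t)(unstolen_clock_ns(a) - unstolen_clock_ns(b));
}
```

The skew grows with the difference in stolen time between the two cpus,
so it is unbounded on a host where the vcpus are loaded unevenly.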

So I think any work in the scheduler area is interesting, and probably
worth posting separately unless it's too entangled with the stolen time
accounting area.

    J

> Have a nice review.
> Glauber Costa (7):
>   change headers preparing for steal time
>   always call kvm_write_guest
>   measure time out of guest
>   change kernel accounting to include steal time
>   kvm steal time implementation
>   touch softlockup watchdog
>   tell guest about steal time feature
>
>  arch/x86/include/asm/kvm_host.h    |    2 +
>  arch/x86/include/asm/kvm_para.h    |    1 +
>  arch/x86/include/asm/pvclock-abi.h |    4 ++-
>  arch/x86/kernel/kvmclock.c         |   40 ++++++++++++++++++++++++++++++++++++
>  arch/x86/kvm/x86.c                 |   26 ++++++++++++++++++----
>  include/linux/sched.h              |    1 +
>  kernel/sched.c                     |   29 ++++++++++++++++++++++++++
>  7 files changed, 97 insertions(+), 6 deletions(-)
>

