From mboxrd@z Thu Jan 1 00:00:00 1970
From: Glauber Costa
Subject: Re: [RFC v2 4/7] change kernel accounting to include steal time
Date: Thu, 2 Sep 2010 15:19:56 -0300
Message-ID: <20100902181956.GE5933@mothafucka.localdomain>
References: <1283184391-7785-8-git-send-email-glommer@redhat.com>
 <4C7BEA9C.1060605@goop.org>
 <4C7BFACD.4030409@redhat.com>
 <4C7C0187.7040401@goop.org>
 <4C7C03CB.1060700@redhat.com>
 <1283196005.1820.1340.camel@laptop>
 <4C7C0A57.2010906@redhat.com>
 <4C7C3709.3040706@goop.org>
 <4C7C38BC.1090907@redhat.com>
 <1283242309.1820.1471.camel@laptop>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Rik van Riel , Jeremy Fitzhardinge , kvm@vger.kernel.org, avi@redhat.com, zamsden@redhat.com, mtosatti@redhat.com, mingo@elte.hu
To: Peter Zijlstra
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:25422 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753264Ab0IBSUQ (ORCPT ); Thu, 2 Sep 2010 14:20:16 -0400
Content-Disposition: inline
In-Reply-To: <1283242309.1820.1471.camel@laptop>
Sender: kvm-owner@vger.kernel.org
List-ID:

On Tue, Aug 31, 2010 at 10:11:49AM +0200, Peter Zijlstra wrote:
> On Mon, 2010-08-30 at 19:03 -0400, Rik van Riel wrote:
> >
> > > I think it basically comes down to adding "sched_clock_unstolen()" which
> > > the scheduler can use to measure time a process spends running, and
> > > sched_clock() for measuring sleep times. In the normal case,
> > > sched_clock_unstolen() would be the same as sched_clock().
> >
> > That requires the host to export (any time the guest is scheduled
> > in), the amount of CPU time the VCPU thread has used, and the time
> > the VCPU was scheduled in.
> >
> > Since the VCPU must be running when it is examining these variables,
> > it can calculate the additional time (since it was last scheduled)
> > to account to the task, and remember the currently calculated time
> > in its own per-vcpu variable, so next time it can get a delta again.
>
> I think it's easier (and sufficient) for the host to tell the guest how
> long it was _not_ running. That can simply be passed in when you start
> the vcpu again and doesn't need a fancy communication channel.
>
> The guest's sched_clock() will measure wall time, the guest's
> sched_clock_stolen() will report the accumulation of these stolen times.
>
> Then you can make sched_clock_unstolen() be sched_clock() -
> sched_clock_stolen(). And like Jeremy said, if you make the sched_fair
> stuff use sched_clock_unstolen() things should more or less work.

So what's the big drawback of just making sched_clock() return
sched_clock_unstolen()? When there is no steal time involved, the two
will be equal anyway. And this way, everything that relies on
sched_clock(), for whatever reason, will probably keep working.
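
To make the arithmetic concrete, here is a minimal user-space sketch of the
relation discussed above: unstolen time is wall time minus the accumulated
stolen time. It is not kernel code. The struct vcpu_clock type, the
vcpu_run() and vcpu_stolen() helpers, and the _sim suffixes are hypothetical
names made up for illustration, and the sketch assumes the host reports each
stolen interval as a plain nanosecond delta when the vcpu is started again.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-vcpu bookkeeping; not an existing kernel structure. */
struct vcpu_clock {
    uint64_t wall_ns;   /* wall time, what sched_clock() would measure */
    uint64_t stolen_ns; /* accumulated time the vcpu was _not_ running */
};

/* The guest actually ran for delta_ns: only wall time advances. */
static inline void vcpu_run(struct vcpu_clock *c, uint64_t delta_ns)
{
    c->wall_ns += delta_ns;
}

/*
 * The host kept the vcpu off the cpu for delta_ns and passes that delta
 * in when it starts the vcpu again (an assumption of this sketch).
 * Wall time still advances, and the same amount is recorded as stolen.
 */
static inline void vcpu_stolen(struct vcpu_clock *c, uint64_t delta_ns)
{
    c->wall_ns += delta_ns;
    c->stolen_ns += delta_ns;
}

/* Wall time, including stolen periods. */
static inline uint64_t sched_clock_sim(const struct vcpu_clock *c)
{
    return c->wall_ns;
}

/* Accumulation of the stolen intervals reported so far. */
static inline uint64_t sched_clock_stolen_sim(const struct vcpu_clock *c)
{
    return c->stolen_ns;
}

/* sched_clock() - sched_clock_stolen(): time the vcpu really ran. */
static inline uint64_t sched_clock_unstolen_sim(const struct vcpu_clock *c)
{
    /* Stolen time is a subset of wall time, so this cannot underflow. */
    assert(c->stolen_ns <= c->wall_ns);
    return c->wall_ns - c->stolen_ns;
}
```

With no stolen intervals the two clocks read the same value, which is the
case the question above relies on. They diverge only once the host reports
time during which the vcpu was descheduled.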