From: Peter Zijlstra <peterz@infradead.org>
To: Glauber Costa <glommer@redhat.com>
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	aliguori@us.ibm.com, Rik van Riel <riel@redhat.com>,
	Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>,
	Avi Kivity <avi@redhat.com>
Subject: Re: [PATCH 16/16] KVM-GST: adjust scheduler cpu power
Date: Wed, 26 Jan 2011 17:46:26 +0100
Message-ID: <1296060386.28776.1312.camel@laptop>
In-Reply-To: <1296056617.3591.26.camel@mothafucka.localdomain>

On Wed, 2011-01-26 at 13:43 -0200, Glauber Costa wrote:

> yes, but once this delta is subtracted from rq->clock_task, this value is not
> used to dictate power, unless I am mistaken.
> 
> power is adjusted according to scale_rt_power(), which does it using the
> values of rq->rt_avg, rq->age_stamp, and rq->clock.
> 
> So whatever I store into rq->clock_task, but not rq->clock (which
> correct me if I'm wrong, is expected to be walltime), will not be used
> to adjust cpu power, which is what I'm trying to achieve.

No, see the patch below: it uses a per-cpu virt_steal_time() clock, which
is expected to return steal time in ns.

All time not accounted to ->clock_task is accumulated in lost, and
passed into sched_rt_avg_update() and thus affects the cpu_power.
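
To make the accumulation concrete, here is a rough userspace model of the
bookkeeping. It is only a sketch: struct rq_model, update_clock_task() and
the rt_avg field are hypothetical stand-ins for struct rq,
update_rq_clock_task() and whatever sched_rt_avg_update() accumulates, and
the irq and steal clock readings are passed in as plain arguments.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the relevant fields of struct rq. */
struct rq_model {
	int64_t clock_task;      /* time credited to tasks */
	int64_t prev_irq_time;   /* irq time already accounted */
	int64_t prev_steal_time; /* steal time already accounted */
	int64_t rt_avg;          /* assumed: sum of everything reported as lost */
};

/*
 * Advance the model by 'delta' ns of rq->clock time, given the current
 * readings of the irq and steal clocks. Returns the time counted as lost.
 */
static int64_t update_clock_task(struct rq_model *rq, int64_t delta,
				 int64_t irq_now, int64_t steal_now)
{
	int64_t lost = 0;
	int64_t lost_delta;

	assert(delta >= 0);

	/* irq time: never account more than the elapsed delta */
	lost_delta = irq_now - rq->prev_irq_time;
	if (lost_delta > delta)
		lost_delta = delta;
	rq->prev_irq_time += lost_delta;
	lost += lost_delta;

	/* steal time: irq + steal together must not exceed delta either */
	lost_delta = steal_now - rq->prev_steal_time;
	if (lost + lost_delta > delta)
		lost_delta = delta - lost;
	rq->prev_steal_time += lost_delta;
	lost += lost_delta;

	/* only the remainder is credited to tasks */
	rq->clock_task += delta - lost;

	/* stands in for sched_rt_avg_update(rq, lost) */
	rq->rt_avg += lost;

	return lost;
}
```

Any steal time that does not fit into the current delta stays pending,
because prev_steal_time only advances by the clamped amount; it gets picked
up on the next update.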

If it finds that 50% of the (recent) time is steal time, its cpu_power
will be 50% of the nominal value.
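
As a back-of-the-envelope illustration of that scaling (not the actual
scale_rt_power() code): assume power is expressed on a fixed-point scale of
1024, in the spirit of SCHED_LOAD_SCALE, and that the lost time is compared
against a single averaging period. POWER_SCALE and scaled_power() are names
made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed fixed-point scale for "full" cpu power. */
#define POWER_SCALE 1024ULL

/*
 * Fraction of 'period_ns' that was not lost to irq/steal time,
 * expressed on the POWER_SCALE scale.
 */
static uint64_t scaled_power(uint64_t period_ns, uint64_t lost_ns)
{
	uint64_t available;

	assert(period_ns > 0);

	/* clamp so the result can never go negative */
	available = lost_ns >= period_ns ? 0 : period_ns - lost_ns;

	return available * POWER_SCALE / period_ns;
}
```

With a 1 ms period of which 0.5 ms was stolen, scaled_power(1000000, 500000)
comes out at 512, i.e. half of POWER_SCALE.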

---
 kernel/sched.c          |   44 ++++++++++++++++++++++++++++----------------
 kernel/sched_features.h |    2 +-
 2 files changed, 29 insertions(+), 17 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index 18d38e4..c71384c 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -523,6 +523,9 @@ struct rq {
 #ifdef CONFIG_IRQ_TIME_ACCOUNTING
 	u64 prev_irq_time;
 #endif
+#ifdef CONFIG_SCHED_PARAVIRT
+	u64 prev_steal_time;
+#endif
 
 	/* calc_load related fields */
 	unsigned long calc_load_update;
@@ -1888,11 +1891,15 @@ void account_system_vtime(struct task_struct *curr)
 }
 EXPORT_SYMBOL_GPL(account_system_vtime);
 
+#endif /* CONFIG_IRQ_TIME_ACCOUNTING */
+
 static void update_rq_clock_task(struct rq *rq, s64 delta)
 {
-	s64 irq_delta;
+	s64 lost_delta __maybe_unused;
+	s64 lost = 0;
 
-	irq_delta = irq_time_read(cpu_of(rq)) - rq->prev_irq_time;
+#ifdef CONFIG_IRQ_TIME_ACCOUNTING
+	lost_delta = irq_time_read(cpu_of(rq)) - rq->prev_irq_time;
 
 	/*
 	 * Since irq_time is only updated on {soft,}irq_exit, we might run into
@@ -1909,26 +1916,31 @@ static void update_rq_clock_task(struct rq *rq, s64 delta)
 	 * the current rq->clock timestamp, except that would require using
 	 * atomic ops.
 	 */
-	if (irq_delta > delta)
-		irq_delta = delta;
+	if (lost_delta > delta)
+		lost_delta = delta;
 
-	rq->prev_irq_time += irq_delta;
-	delta -= irq_delta;
-	rq->clock_task += delta;
+	rq->prev_irq_time += lost_delta;
+	lost += lost_delta;
+#endif
+#ifdef CONFIG_SCHED_PARAVIRT
+	lost_delta = virt_steal_time(cpu_of(rq)) - rq->prev_steal_time;
+
+	/*
+	 * unlikely, unless steal_time accounting is iffy
+	 */
+	if (lost + lost_delta > delta)
+		lost_delta = delta - lost;
 
-	if (irq_delta && sched_feat(NONIRQ_POWER))
-		sched_rt_avg_update(rq, irq_delta);
-}
+	rq->prev_steal_time += lost_delta;
+	lost += lost_delta;
+#endif
 
-#else /* CONFIG_IRQ_TIME_ACCOUNTING */
+	rq->clock_task += delta - lost;
 
-static void update_rq_clock_task(struct rq *rq, s64 delta)
-{
-	rq->clock_task += delta;
+	if (lost && sched_feat(NONTASK_POWER))
+		sched_rt_avg_update(rq, lost);
 }
 
-#endif /* CONFIG_IRQ_TIME_ACCOUNTING */
-
 #include "sched_idletask.c"
 #include "sched_fair.c"
 #include "sched_rt.c"
diff --git a/kernel/sched_features.h b/kernel/sched_features.h
index 68e69ac..b334a2d 100644
--- a/kernel/sched_features.h
+++ b/kernel/sched_features.h
@@ -63,4 +63,4 @@ SCHED_FEAT(OWNER_SPIN, 1)
 /*
  * Decrement CPU power based on irq activity
  */
-SCHED_FEAT(NONIRQ_POWER, 1)
+SCHED_FEAT(NONTASK_POWER, 1)

