kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC v2 0/7] kvm stael time implementation
@ 2010-08-30 16:06 Glauber Costa
  2010-08-30 16:06 ` [RFC v2 1/7] change headers preparing for steal time Glauber Costa
                   ` (2 more replies)
  0 siblings, 3 replies; 38+ messages in thread
From: Glauber Costa @ 2010-08-30 16:06 UTC (permalink / raw)
  To: kvm; +Cc: avi, zamsden, mtosatti, riel, peterz, mingo, jeremy

Hi,

So, this is basically the same as v1, with three major
differences: 
 1) I am posting to lkml for wider audience
 2) patch 2/7 fixes one problem I mentined would happen in
    smp systems, which is, we only update kvmclock when we
    changes pcup
 3) softlockup algorithm is changed. Again, as marcelo pointed
    out, this is open to discussion, and I am not dropping it
    so more people can step in.

I have some other patches under local test for a slightly modified
guest part accounting, and I do somehow support extending
the interface, and changing to nsecs (maybe not 100 %, but...). But
I am posting in this state so we can have lkml people to step
in earlier.

Reminder of the previous cover-letter:

There are two parts of it: the guest and host part.

The proposal for the guest part, is to just change the
common time accounting, and try to identify at that spot,
wether or not we should account any steal time. I considered
this idea less cumbersome that trying to cook a clockevents
implementation ourselves, since I see little value in it.
I am, however, pretty open to suggestions.

For the host<->guest communications, I am using a shared
page, in the same way as pvclock. Because of that, I am just
hijacking pvclock structure anyway. There is a 32-bit field
floating by, that gives us enough room for 8 years of steal
time (we use msec resolution).

The main idea is to timestamp our exit and entry through
sched notifiers, and export the value at pvclock updates.
This obviously have some disadvantages: by doing this we
are giving up futures ideas about only updating
this structure once, and even right now, won't work
on pinned-smp (since we don't update pvclock if we
haven't changed cpus.)

But again, it is just an RFC, and I'd like to feel the
reception of the idea as a whole.

Have a nice review.
Glauber Costa (7):
  change headers preparing for steal time
  always call kvm_write_guest
  measure time out of guest
  change kernel accounting to include steal time
  kvm steal time implementation
  touch softlockup watchdog
  tell guest about steal time feature

 arch/x86/include/asm/kvm_host.h    |    2 +
 arch/x86/include/asm/kvm_para.h    |    1 +
 arch/x86/include/asm/pvclock-abi.h |    4 ++-
 arch/x86/kernel/kvmclock.c         |   40 ++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/x86.c                 |   26 ++++++++++++++++++----
 include/linux/sched.h              |    1 +
 kernel/sched.c                     |   29 ++++++++++++++++++++++++++
 7 files changed, 97 insertions(+), 6 deletions(-)


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [RFC v2 1/7] change headers preparing for steal time
  2010-08-30 16:06 [RFC v2 0/7] kvm stael time implementation Glauber Costa
@ 2010-08-30 16:06 ` Glauber Costa
  2010-08-30 16:06   ` [RFC 1/8] Implement getnsboottime kernel API Glauber Costa
  2010-08-30 16:37 ` [RFC v2 0/7] kvm stael time implementation Peter Zijlstra
  2010-08-30 17:20 ` Jeremy Fitzhardinge
  2 siblings, 1 reply; 38+ messages in thread
From: Glauber Costa @ 2010-08-30 16:06 UTC (permalink / raw)
  To: kvm; +Cc: avi, zamsden, mtosatti, riel, peterz, mingo, jeremy

This guest/host common patch prepares infrastructure for
the steal time implementation. Some constants are added,
and a name change happens in pvclock vcpu structure.

Signed-off-by: Glauber Costa <glommer@redhat.com>
---
 arch/x86/include/asm/kvm_para.h    |    1 +
 arch/x86/include/asm/pvclock-abi.h |    4 +++-
 2 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h
index 05eba5e..1759c81 100644
--- a/arch/x86/include/asm/kvm_para.h
+++ b/arch/x86/include/asm/kvm_para.h
@@ -25,6 +25,7 @@
  * in pvclock structure. If no bits are set, all flags are ignored.
  */
 #define KVM_FEATURE_CLOCKSOURCE_STABLE_BIT	24
+#define KVM_FEATURE_CLOCKSOURCE_STEAL_BIT 	25
 
 #define MSR_KVM_WALL_CLOCK  0x11
 #define MSR_KVM_SYSTEM_TIME 0x12
diff --git a/arch/x86/include/asm/pvclock-abi.h b/arch/x86/include/asm/pvclock-abi.h
index 35f2d19..417061b 100644
--- a/arch/x86/include/asm/pvclock-abi.h
+++ b/arch/x86/include/asm/pvclock-abi.h
@@ -24,7 +24,7 @@
 
 struct pvclock_vcpu_time_info {
 	u32   version;
-	u32   pad0;
+	u32   steal_time;
 	u64   tsc_timestamp;
 	u64   system_time;
 	u32   tsc_to_system_mul;
@@ -40,5 +40,7 @@ struct pvclock_wall_clock {
 } __attribute__((__packed__));
 
 #define PVCLOCK_TSC_STABLE_BIT	(1 << 0)
+#define PVCLOCK_STEAL_BIT   (2 << 0)
+
 #endif /* __ASSEMBLY__ */
 #endif /* _ASM_X86_PVCLOCK_ABI_H */
-- 
1.6.2.2


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [RFC 1/8] Implement getnsboottime kernel API
  2010-08-30 16:06 ` [RFC v2 1/7] change headers preparing for steal time Glauber Costa
@ 2010-08-30 16:06   ` Glauber Costa
  2010-08-30 16:06     ` [RFC v2 2/7] always call kvm_write_guest Glauber Costa
  2010-09-01 23:56     ` [RFC 1/8] Implement getnsboottime kernel API Zachary Amsden
  0 siblings, 2 replies; 38+ messages in thread
From: Glauber Costa @ 2010-08-30 16:06 UTC (permalink / raw)
  To: kvm; +Cc: avi, zamsden, mtosatti, riel, peterz, mingo, jeremy

From: Zachary Amsden <zamsden@redhat.com>

Add a kernel call to get the number of nanoseconds since boot.  This
is generally useful enough to make it a generic call.

Signed-off-by: Zachary Amsden <zamsden@redhat.com>
---
 include/linux/time.h      |    1 +
 kernel/time/timekeeping.c |   27 +++++++++++++++++++++++++++
 2 files changed, 28 insertions(+), 0 deletions(-)

diff --git a/include/linux/time.h b/include/linux/time.h
index ea3559f..5d04108 100644
--- a/include/linux/time.h
+++ b/include/linux/time.h
@@ -145,6 +145,7 @@ extern void getnstimeofday(struct timespec *tv);
 extern void getrawmonotonic(struct timespec *ts);
 extern void getboottime(struct timespec *ts);
 extern void monotonic_to_bootbased(struct timespec *ts);
+extern s64 getnsboottime(void);
 
 extern struct timespec timespec_trunc(struct timespec t, unsigned gran);
 extern int timekeeping_valid_for_hres(void);
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index caf8d4d..d250f0a 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -285,6 +285,33 @@ void ktime_get_ts(struct timespec *ts)
 }
 EXPORT_SYMBOL_GPL(ktime_get_ts);
 
+
+/**
+ * getnsboottime - get the bootbased clock in nsec format
+ *
+ * The function calculates the bootbased clock from the realtime
+ * clock and the wall_to_monotonic offset and stores the result
+ * in normalized timespec format in the variable pointed to by @ts.
+ */
+s64 getnsboottime(void)
+{
+	unsigned int seq;
+	s64 secs, nsecs;
+
+	WARN_ON(timekeeping_suspended);
+
+	do {
+		seq = read_seqbegin(&xtime_lock);
+		secs = xtime.tv_sec + wall_to_monotonic.tv_sec;
+		secs += total_sleep_time.tv_sec;
+		nsecs = xtime.tv_nsec + wall_to_monotonic.tv_nsec;
+		nsecs += total_sleep_time.tv_nsec + timekeeping_get_ns();
+
+	} while (read_seqretry(&xtime_lock, seq));
+	return nsecs + (secs * NSEC_PER_SEC);
+}
+EXPORT_SYMBOL_GPL(getnsboottime);
+
 /**
  * do_gettimeofday - Returns the time of day in a timeval
  * @tv:		pointer to the timeval to be set
-- 
1.6.2.2


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [RFC v2 2/7] always call kvm_write_guest
  2010-08-30 16:06   ` [RFC 1/8] Implement getnsboottime kernel API Glauber Costa
@ 2010-08-30 16:06     ` Glauber Costa
  2010-08-30 16:06       ` [RFC 2/8] change headers preparing for steal time Glauber Costa
  2010-09-01 23:56     ` [RFC 1/8] Implement getnsboottime kernel API Zachary Amsden
  1 sibling, 1 reply; 38+ messages in thread
From: Glauber Costa @ 2010-08-30 16:06 UTC (permalink / raw)
  To: kvm; +Cc: avi, zamsden, mtosatti, riel, peterz, mingo, jeremy

This patch makes sure that kvm_write_guest() is called
at every guest enter. We do not, however, want to update
all the structure at all times, so we add a flag
that basically tells it whether or not to do the whole operation

Signed-off-by: Glauber Costa <glommer@redhat.com>
---
 arch/x86/kvm/x86.c |   10 ++++++++--
 1 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e7e3b50..f4b77ea 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -895,7 +895,7 @@ static void kvm_set_time_scale(uint32_t tsc_khz, struct pvclock_vcpu_time_info *
 
 static DEFINE_PER_CPU(unsigned long, cpu_tsc_khz);
 
-static void kvm_write_guest_time(struct kvm_vcpu *v)
+static void kvm_write_guest_time(struct kvm_vcpu *v, bool update_time)
 {
 	struct timespec ts;
 	unsigned long flags;
@@ -906,6 +906,9 @@ static void kvm_write_guest_time(struct kvm_vcpu *v)
 	if ((!vcpu->time_page))
 		return;
 
+	if (!update_time)
+		return;
+
 	this_tsc_khz = get_cpu_var(cpu_tsc_khz);
 	if (unlikely(vcpu->hv_clock_tsc_khz != this_tsc_khz)) {
 		kvm_set_time_scale(this_tsc_khz, &vcpu->hv_clock);
@@ -4671,6 +4674,7 @@ static void kvm_put_guest_xcr0(struct kvm_vcpu *vcpu)
 static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 {
 	int r;
+	bool update_kvm_time = 0;
 	bool req_int_win = !irqchip_in_kernel(vcpu->kvm) &&
 		vcpu->run->request_interrupt_window;
 
@@ -4680,7 +4684,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 		if (kvm_check_request(KVM_REQ_MIGRATE_TIMER, vcpu))
 			__kvm_migrate_timers(vcpu);
 		if (kvm_check_request(KVM_REQ_KVMCLOCK_UPDATE, vcpu))
-			kvm_write_guest_time(vcpu);
+			update_kvm_time = 1;
 		if (kvm_check_request(KVM_REQ_MMU_SYNC, vcpu))
 			kvm_mmu_sync_roots(vcpu);
 		if (kvm_check_request(KVM_REQ_TLB_FLUSH, vcpu))
@@ -4701,6 +4705,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 		}
 	}
 
+	kvm_write_guest_time(vcpu, update_kvm_time);
+
 	r = kvm_mmu_reload(vcpu);
 	if (unlikely(r))
 		goto out;
-- 
1.6.2.2


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [RFC 2/8] change headers preparing for steal time
  2010-08-30 16:06     ` [RFC v2 2/7] always call kvm_write_guest Glauber Costa
@ 2010-08-30 16:06       ` Glauber Costa
  2010-08-30 16:06         ` [RFC 3/8] always call kvm_write_guest Glauber Costa
  0 siblings, 1 reply; 38+ messages in thread
From: Glauber Costa @ 2010-08-30 16:06 UTC (permalink / raw)
  To: kvm; +Cc: avi, zamsden, mtosatti, riel, peterz, mingo, jeremy

This guest/host common patch prepares infrastructure for
the steal time implementation. Some constants are added,
and a name change happens in pvclock vcpu structure.

Signed-off-by: Glauber Costa <glommer@redhat.com>
---
 arch/x86/include/asm/kvm_para.h    |    1 +
 arch/x86/include/asm/pvclock-abi.h |    4 +++-
 2 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h
index 05eba5e..1759c81 100644
--- a/arch/x86/include/asm/kvm_para.h
+++ b/arch/x86/include/asm/kvm_para.h
@@ -25,6 +25,7 @@
  * in pvclock structure. If no bits are set, all flags are ignored.
  */
 #define KVM_FEATURE_CLOCKSOURCE_STABLE_BIT	24
+#define KVM_FEATURE_CLOCKSOURCE_STEAL_BIT 	25
 
 #define MSR_KVM_WALL_CLOCK  0x11
 #define MSR_KVM_SYSTEM_TIME 0x12
diff --git a/arch/x86/include/asm/pvclock-abi.h b/arch/x86/include/asm/pvclock-abi.h
index 35f2d19..417061b 100644
--- a/arch/x86/include/asm/pvclock-abi.h
+++ b/arch/x86/include/asm/pvclock-abi.h
@@ -24,7 +24,7 @@
 
 struct pvclock_vcpu_time_info {
 	u32   version;
-	u32   pad0;
+	u32   steal_time;
 	u64   tsc_timestamp;
 	u64   system_time;
 	u32   tsc_to_system_mul;
@@ -40,5 +40,7 @@ struct pvclock_wall_clock {
 } __attribute__((__packed__));
 
 #define PVCLOCK_TSC_STABLE_BIT	(1 << 0)
+#define PVCLOCK_STEAL_BIT   (2 << 0)
+
 #endif /* __ASSEMBLY__ */
 #endif /* _ASM_X86_PVCLOCK_ABI_H */
-- 
1.6.2.2


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [RFC 3/8] always call kvm_write_guest
  2010-08-30 16:06       ` [RFC 2/8] change headers preparing for steal time Glauber Costa
@ 2010-08-30 16:06         ` Glauber Costa
  2010-08-30 16:06           ` [RFC v2 3/7] measure time out of guest Glauber Costa
  0 siblings, 1 reply; 38+ messages in thread
From: Glauber Costa @ 2010-08-30 16:06 UTC (permalink / raw)
  To: kvm; +Cc: avi, zamsden, mtosatti, riel, peterz, mingo, jeremy

This patch makes sure that kvm_write_guest() is called
at every guest enter. We do not, however, want to update
all the structure at all times, so we add a flag
that basically tells it whether or not to do the whole operation

Signed-off-by: Glauber Costa <glommer@redhat.com>
---
 arch/x86/kvm/x86.c |   10 ++++++++--
 1 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e7e3b50..f4b77ea 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -895,7 +895,7 @@ static void kvm_set_time_scale(uint32_t tsc_khz, struct pvclock_vcpu_time_info *
 
 static DEFINE_PER_CPU(unsigned long, cpu_tsc_khz);
 
-static void kvm_write_guest_time(struct kvm_vcpu *v)
+static void kvm_write_guest_time(struct kvm_vcpu *v, bool update_time)
 {
 	struct timespec ts;
 	unsigned long flags;
@@ -906,6 +906,9 @@ static void kvm_write_guest_time(struct kvm_vcpu *v)
 	if ((!vcpu->time_page))
 		return;
 
+	if (!update_time)
+		return;
+
 	this_tsc_khz = get_cpu_var(cpu_tsc_khz);
 	if (unlikely(vcpu->hv_clock_tsc_khz != this_tsc_khz)) {
 		kvm_set_time_scale(this_tsc_khz, &vcpu->hv_clock);
@@ -4671,6 +4674,7 @@ static void kvm_put_guest_xcr0(struct kvm_vcpu *vcpu)
 static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 {
 	int r;
+	bool update_kvm_time = 0;
 	bool req_int_win = !irqchip_in_kernel(vcpu->kvm) &&
 		vcpu->run->request_interrupt_window;
 
@@ -4680,7 +4684,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 		if (kvm_check_request(KVM_REQ_MIGRATE_TIMER, vcpu))
 			__kvm_migrate_timers(vcpu);
 		if (kvm_check_request(KVM_REQ_KVMCLOCK_UPDATE, vcpu))
-			kvm_write_guest_time(vcpu);
+			update_kvm_time = 1;
 		if (kvm_check_request(KVM_REQ_MMU_SYNC, vcpu))
 			kvm_mmu_sync_roots(vcpu);
 		if (kvm_check_request(KVM_REQ_TLB_FLUSH, vcpu))
@@ -4701,6 +4705,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 		}
 	}
 
+	kvm_write_guest_time(vcpu, update_kvm_time);
+
 	r = kvm_mmu_reload(vcpu);
 	if (unlikely(r))
 		goto out;
-- 
1.6.2.2


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [RFC v2 3/7] measure time out of guest
  2010-08-30 16:06         ` [RFC 3/8] always call kvm_write_guest Glauber Costa
@ 2010-08-30 16:06           ` Glauber Costa
  2010-08-30 16:06             ` [RFC v2 4/7] change kernel accounting to include steal time Glauber Costa
  0 siblings, 1 reply; 38+ messages in thread
From: Glauber Costa @ 2010-08-30 16:06 UTC (permalink / raw)
  To: kvm; +Cc: avi, zamsden, mtosatti, riel, peterz, mingo, jeremy

By measuring time between a vcpu_put and a vcpu_load, we can
estimate how much time did the guest stay out of the cpu.
This is exported to the guest at every clock update.

Signed-off-by: Glauber Costa <glommer@redhat.com>
---
 arch/x86/include/asm/kvm_host.h |    2 ++
 arch/x86/kvm/x86.c              |   13 +++++++++++--
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 502e53f..bc28aff 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -364,6 +364,8 @@ struct kvm_vcpu_arch {
 	u64 hv_vapic;
 
 	cpumask_var_t wbinvd_dirty_mask;
+	u64 time_out;
+	u64 last_time_out;
 };
 
 struct kvm_arch {
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index f4b77ea..9d08032 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -906,6 +906,7 @@ static void kvm_write_guest_time(struct kvm_vcpu *v, bool update_time)
 	if ((!vcpu->time_page))
 		return;
 
+	vcpu->hv_clock.steal_time = vcpu->time_out / 1000000;
 	if (!update_time)
 		return;
 
@@ -928,8 +929,7 @@ static void kvm_write_guest_time(struct kvm_vcpu *v, bool update_time)
 	vcpu->hv_clock.system_time = ts.tv_nsec +
 				     (NSEC_PER_SEC * (u64)ts.tv_sec) + v->kvm->arch.kvmclock_offset;
 
-	vcpu->hv_clock.flags = 0;
-
+	vcpu->hv_clock.flags = PVCLOCK_STEAL_BIT;
 	/*
 	 * The interface expects us to write an even number signaling that the
 	 * update is finished. Since the guest won't see the intermediate
@@ -1801,6 +1801,8 @@ static bool need_emulate_wbinvd(struct kvm_vcpu *vcpu)
 
 void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 {
+	s64 now;
+
 	/* Address WBINVD may be executed by guest */
 	if (need_emulate_wbinvd(vcpu)) {
 		if (kvm_x86_ops->has_wbinvd_exit())
@@ -1818,12 +1820,19 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 		per_cpu(cpu_tsc_khz, cpu) = khz;
 	}
 	kvm_request_guest_time_update(vcpu);
+
+	now = getnsboottime();
+
+	if (vcpu->arch.last_time_out != 0)
+		vcpu->arch.time_out += now - vcpu->arch.last_time_out;
 }
 
 void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
 {
 	kvm_x86_ops->vcpu_put(vcpu);
 	kvm_put_guest_fpu(vcpu);
+
+	vcpu->arch.last_time_out = getnsboottime();
 }
 
 static int is_efer_nx(void)
-- 
1.6.2.2


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [RFC v2 4/7] change kernel accounting to include steal time
  2010-08-30 16:06           ` [RFC v2 3/7] measure time out of guest Glauber Costa
@ 2010-08-30 16:06             ` Glauber Costa
  2010-08-30 16:06               ` [RFC 4/8] measure time out of guest Glauber Costa
  2010-08-30 17:30               ` [RFC v2 4/7] " Jeremy Fitzhardinge
  0 siblings, 2 replies; 38+ messages in thread
From: Glauber Costa @ 2010-08-30 16:06 UTC (permalink / raw)
  To: kvm; +Cc: avi, zamsden, mtosatti, riel, peterz, mingo, jeremy

This patch proposes a common steal time implementation. When no
steal time is accounted, we just add a branch to the current
accounting code, that shouldn't add much overhead.

When we do want to register steal time, we proceed as following:
- if we would account user or system time in this tick, and there is
  out-of-cpu time registered, we skip it altogether, and account steal
  time only.
- if we would account user or system time in this tick, and we got the
  cpu for the whole slice, we proceed normaly.
- if we are idle in this tick, we flush out-of-cpu time to give it the
  chance to update whatever last-measure internal variable it may have.

This approach is simple, but proved to work well for my test scenarios.
in a UP guest on UP host, with a cpu-hog in both guest and host shows
~ 50 % steal time. steal time is also accounted proportionally, if
nice values are given to the host cpu-hog.

A cpu-hog in the host with no load in the guest, produces 0 % steal time,
with 100 % idle, as one would expect.

Signed-off-by: Glauber Costa <glommer@redhat.com>
---
 include/linux/sched.h |    1 +
 kernel/sched.c        |   29 +++++++++++++++++++++++++++++
 2 files changed, 30 insertions(+), 0 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 0478888..e571ddd 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -312,6 +312,7 @@ long io_schedule_timeout(long timeout);
 extern void cpu_init (void);
 extern void trap_init(void);
 extern void update_process_times(int user);
+extern cputime_t (*hypervisor_steal_time)(void);
 extern void scheduler_tick(void);
 
 extern void sched_show_task(struct task_struct *p);
diff --git a/kernel/sched.c b/kernel/sched.c
index f52a880..9695c92 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -3157,6 +3157,16 @@ unsigned long long thread_group_sched_runtime(struct task_struct *p)
 	return ns;
 }
 
+cputime_t (*hypervisor_steal_time)(void) = NULL;
+
+static inline cputime_t get_steal_time_from_hypervisor(void)
+{
+	if (!hypervisor_steal_time)
+		return 0;
+	return hypervisor_steal_time();
+}
+
+
 /*
  * Account user cpu time to a process.
  * @p: the process that the cpu time gets accounted to
@@ -3169,6 +3179,12 @@ void account_user_time(struct task_struct *p, cputime_t cputime,
 	struct cpu_usage_stat *cpustat = &kstat_this_cpu.cpustat;
 	cputime64_t tmp;
 
+	tmp = get_steal_time_from_hypervisor();
+	if (tmp) {
+		account_steal_time(tmp);
+		return;
+	}
+
 	/* Add user time to process. */
 	p->utime = cputime_add(p->utime, cputime);
 	p->utimescaled = cputime_add(p->utimescaled, cputime_scaled);
@@ -3234,6 +3250,12 @@ void account_system_time(struct task_struct *p, int hardirq_offset,
 		return;
 	}
 
+	tmp = get_steal_time_from_hypervisor();
+	if (tmp) {
+		account_steal_time(tmp);
+		return;
+	}
+
 	/* Add system time to process. */
 	p->stime = cputime_add(p->stime, cputime);
 	p->stimescaled = cputime_add(p->stimescaled, cputime_scaled);
@@ -3276,6 +3298,13 @@ void account_idle_time(cputime_t cputime)
 	cputime64_t cputime64 = cputime_to_cputime64(cputime);
 	struct rq *rq = this_rq();
 
+	/*
+	 * if we're idle, we don't account it as steal time, since we did
+	 * not want to run anyway. We do call the steal function, however, to
+	 * give the guest the chance to flush its internal buffers
+	 */
+	get_steal_time_from_hypervisor();
+
 	if (atomic_read(&rq->nr_iowait) > 0)
 		cpustat->iowait = cputime64_add(cpustat->iowait, cputime64);
 	else
-- 
1.6.2.2


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [RFC 4/8] measure time out of guest
  2010-08-30 16:06             ` [RFC v2 4/7] change kernel accounting to include steal time Glauber Costa
@ 2010-08-30 16:06               ` Glauber Costa
  2010-08-30 16:06                 ` [RFC 5/8] change kernel accounting to include steal time Glauber Costa
  2010-08-30 17:30               ` [RFC v2 4/7] " Jeremy Fitzhardinge
  1 sibling, 1 reply; 38+ messages in thread
From: Glauber Costa @ 2010-08-30 16:06 UTC (permalink / raw)
  To: kvm; +Cc: avi, zamsden, mtosatti, riel, peterz, mingo, jeremy

By measuring time between a vcpu_put and a vcpu_load, we can
estimate how much time did the guest stay out of the cpu.
This is exported to the guest at every clock update.

Signed-off-by: Glauber Costa <glommer@redhat.com>
---
 arch/x86/include/asm/kvm_host.h |    2 ++
 arch/x86/kvm/x86.c              |   13 +++++++++++--
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 502e53f..bc28aff 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -364,6 +364,8 @@ struct kvm_vcpu_arch {
 	u64 hv_vapic;
 
 	cpumask_var_t wbinvd_dirty_mask;
+	u64 time_out;
+	u64 last_time_out;
 };
 
 struct kvm_arch {
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index f4b77ea..9d08032 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -906,6 +906,7 @@ static void kvm_write_guest_time(struct kvm_vcpu *v, bool update_time)
 	if ((!vcpu->time_page))
 		return;
 
+	vcpu->hv_clock.steal_time = vcpu->time_out / 1000000;
 	if (!update_time)
 		return;
 
@@ -928,8 +929,7 @@ static void kvm_write_guest_time(struct kvm_vcpu *v, bool update_time)
 	vcpu->hv_clock.system_time = ts.tv_nsec +
 				     (NSEC_PER_SEC * (u64)ts.tv_sec) + v->kvm->arch.kvmclock_offset;
 
-	vcpu->hv_clock.flags = 0;
-
+	vcpu->hv_clock.flags = PVCLOCK_STEAL_BIT;
 	/*
 	 * The interface expects us to write an even number signaling that the
 	 * update is finished. Since the guest won't see the intermediate
@@ -1801,6 +1801,8 @@ static bool need_emulate_wbinvd(struct kvm_vcpu *vcpu)
 
 void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 {
+	s64 now;
+
 	/* Address WBINVD may be executed by guest */
 	if (need_emulate_wbinvd(vcpu)) {
 		if (kvm_x86_ops->has_wbinvd_exit())
@@ -1818,12 +1820,19 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 		per_cpu(cpu_tsc_khz, cpu) = khz;
 	}
 	kvm_request_guest_time_update(vcpu);
+
+	now = getnsboottime();
+
+	if (vcpu->arch.last_time_out != 0)
+		vcpu->arch.time_out += now - vcpu->arch.last_time_out;
 }
 
 void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
 {
 	kvm_x86_ops->vcpu_put(vcpu);
 	kvm_put_guest_fpu(vcpu);
+
+	vcpu->arch.last_time_out = getnsboottime();
 }
 
 static int is_efer_nx(void)
-- 
1.6.2.2


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [RFC 5/8] change kernel accounting to include steal time
  2010-08-30 16:06               ` [RFC 4/8] measure time out of guest Glauber Costa
@ 2010-08-30 16:06                 ` Glauber Costa
  2010-08-30 16:06                   ` [RFC v2 5/7] kvm steal time implementation Glauber Costa
  2010-08-30 16:46                   ` [RFC 5/8] change kernel accounting to include steal time Peter Zijlstra
  0 siblings, 2 replies; 38+ messages in thread
From: Glauber Costa @ 2010-08-30 16:06 UTC (permalink / raw)
  To: kvm; +Cc: avi, zamsden, mtosatti, riel, peterz, mingo, jeremy

This patch proposes a common steal time implementation. When no
steal time is accounted, we just add a branch to the current
accounting code, that shouldn't add much overhead.

When we do want to register steal time, we proceed as following:
- if we would account user or system time in this tick, and there is
  out-of-cpu time registered, we skip it altogether, and account steal
  time only.
- if we would account user or system time in this tick, and we got the
  cpu for the whole slice, we proceed normaly.
- if we are idle in this tick, we flush out-of-cpu time to give it the
  chance to update whatever last-measure internal variable it may have.

This approach is simple, but proved to work well for my test scenarios.
in a UP guest on UP host, with a cpu-hog in both guest and host shows
~ 50 % steal time. steal time is also accounted proportionally, if
nice values are given to the host cpu-hog.

A cpu-hog in the host with no load in the guest, produces 0 % steal time,
with 100 % idle, as one would expect.

Signed-off-by: Glauber Costa <glommer@redhat.com>
---
 include/linux/sched.h |    1 +
 kernel/sched.c        |   29 +++++++++++++++++++++++++++++
 2 files changed, 30 insertions(+), 0 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 0478888..e571ddd 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -312,6 +312,7 @@ long io_schedule_timeout(long timeout);
 extern void cpu_init (void);
 extern void trap_init(void);
 extern void update_process_times(int user);
+extern cputime_t (*hypervisor_steal_time)(void);
 extern void scheduler_tick(void);
 
 extern void sched_show_task(struct task_struct *p);
diff --git a/kernel/sched.c b/kernel/sched.c
index f52a880..9695c92 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -3157,6 +3157,16 @@ unsigned long long thread_group_sched_runtime(struct task_struct *p)
 	return ns;
 }
 
+cputime_t (*hypervisor_steal_time)(void) = NULL;
+
+static inline cputime_t get_steal_time_from_hypervisor(void)
+{
+	if (!hypervisor_steal_time)
+		return 0;
+	return hypervisor_steal_time();
+}
+
+
 /*
  * Account user cpu time to a process.
  * @p: the process that the cpu time gets accounted to
@@ -3169,6 +3179,12 @@ void account_user_time(struct task_struct *p, cputime_t cputime,
 	struct cpu_usage_stat *cpustat = &kstat_this_cpu.cpustat;
 	cputime64_t tmp;
 
+	tmp = get_steal_time_from_hypervisor();
+	if (tmp) {
+		account_steal_time(tmp);
+		return;
+	}
+
 	/* Add user time to process. */
 	p->utime = cputime_add(p->utime, cputime);
 	p->utimescaled = cputime_add(p->utimescaled, cputime_scaled);
@@ -3234,6 +3250,12 @@ void account_system_time(struct task_struct *p, int hardirq_offset,
 		return;
 	}
 
+	tmp = get_steal_time_from_hypervisor();
+	if (tmp) {
+		account_steal_time(tmp);
+		return;
+	}
+
 	/* Add system time to process. */
 	p->stime = cputime_add(p->stime, cputime);
 	p->stimescaled = cputime_add(p->stimescaled, cputime_scaled);
@@ -3276,6 +3298,13 @@ void account_idle_time(cputime_t cputime)
 	cputime64_t cputime64 = cputime_to_cputime64(cputime);
 	struct rq *rq = this_rq();
 
+	/*
+	 * if we're idle, we don't account it as steal time, since we did
+	 * not want to run anyway. We do call the steal function, however, to
+	 * give the guest the chance to flush its internal buffers
+	 */
+	get_steal_time_from_hypervisor();
+
 	if (atomic_read(&rq->nr_iowait) > 0)
 		cpustat->iowait = cputime64_add(cpustat->iowait, cputime64);
 	else
-- 
1.6.2.2


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [RFC v2 5/7] kvm steal time implementation
  2010-08-30 16:06                 ` [RFC 5/8] change kernel accounting to include steal time Glauber Costa
@ 2010-08-30 16:06                   ` Glauber Costa
  2010-08-30 16:06                     ` [RFC 6/8] " Glauber Costa
  2010-08-30 16:46                   ` [RFC 5/8] change kernel accounting to include steal time Peter Zijlstra
  1 sibling, 1 reply; 38+ messages in thread
From: Glauber Costa @ 2010-08-30 16:06 UTC (permalink / raw)
  To: kvm; +Cc: avi, zamsden, mtosatti, riel, peterz, mingo, jeremy

This is the proposed kvm-side steal time implementation.
It is migration safe, as it checks flags at every read.

Signed-off-by: Glauber Costa <glommer@redhat.com>
---
 arch/x86/kernel/kvmclock.c |   35 +++++++++++++++++++++++++++++++++++
 1 files changed, 35 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index eb9b76c..a1f4852 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -18,6 +18,8 @@
 
 #include <linux/clocksource.h>
 #include <linux/kvm_para.h>
+#include <linux/kernel_stat.h>
+#include <linux/sched.h>
 #include <asm/pvclock.h>
 #include <asm/msr.h>
 #include <asm/apic.h>
@@ -41,6 +43,7 @@ early_param("no-kvmclock", parse_no_kvmclock);
 
 /* The hypervisor will put information about time periodically here */
 static DEFINE_PER_CPU_SHARED_ALIGNED(struct pvclock_vcpu_time_info, hv_clock);
+static DEFINE_PER_CPU(u64, steal_info);
 static struct pvclock_wall_clock wall_clock;
 
 /*
@@ -82,6 +85,32 @@ static cycle_t kvm_clock_read(void)
 	return ret;
 }
 
+static DEFINE_PER_CPU(u64, steal_info);
+
+cputime_t kvm_get_steal_time(void)
+{
+	u64 delta = 0;
+	u64 *last_steal_info, this_steal_info;
+	struct pvclock_vcpu_time_info *src;
+
+	src = &get_cpu_var(hv_clock);
+	if (!(src->flags & PVCLOCK_STEAL_BIT))
+		goto out;
+
+	this_steal_info = src->steal_time;
+	put_cpu_var(hv_clock);
+
+	last_steal_info = &get_cpu_var(steal_info);
+
+	delta = this_steal_info - *last_steal_info;
+
+	*last_steal_info = this_steal_info;
+	put_cpu_var(steal_info);
+
+out:
+	return msecs_to_cputime(delta);
+}
+
 static cycle_t kvm_clock_get_cycles(struct clocksource *cs)
 {
 	return kvm_clock_read();
@@ -134,6 +163,8 @@ static int kvm_register_clock(char *txt)
 	printk(KERN_INFO "kvm-clock: cpu %d, msr %x:%x, %s\n",
 	       cpu, high, low, txt);
 
+	per_cpu(steal_info, cpu) = 0;
+
 	return native_write_msr_safe(msr_kvm_system_time, low, high);
 }
 
@@ -218,4 +249,8 @@ void __init kvmclock_init(void)
 
 	if (kvm_para_has_feature(KVM_FEATURE_CLOCKSOURCE_STABLE_BIT))
 		pvclock_set_flags(PVCLOCK_TSC_STABLE_BIT);
+
+	
+	if (kvm_para_has_feature(KVM_FEATURE_CLOCKSOURCE_STEAL_BIT))
+		hypervisor_steal_time = kvm_get_steal_time;
 }
-- 
1.6.2.2


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [RFC 6/8] kvm steal time implementation
  2010-08-30 16:06                   ` [RFC v2 5/7] kvm steal time implementation Glauber Costa
@ 2010-08-30 16:06                     ` Glauber Costa
  2010-08-30 16:06                       ` [RFC v2 6/7] touch softlockup watchdog Glauber Costa
  0 siblings, 1 reply; 38+ messages in thread
From: Glauber Costa @ 2010-08-30 16:06 UTC (permalink / raw)
  To: kvm; +Cc: avi, zamsden, mtosatti, riel, peterz, mingo, jeremy

This is the proposed kvm-side steal time implementation.
It is migration safe, as it checks flags at every read.

Signed-off-by: Glauber Costa <glommer@redhat.com>
---
 arch/x86/kernel/kvmclock.c |   35 +++++++++++++++++++++++++++++++++++
 1 files changed, 35 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index eb9b76c..a1f4852 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -18,6 +18,8 @@
 
 #include <linux/clocksource.h>
 #include <linux/kvm_para.h>
+#include <linux/kernel_stat.h>
+#include <linux/sched.h>
 #include <asm/pvclock.h>
 #include <asm/msr.h>
 #include <asm/apic.h>
@@ -41,6 +43,7 @@ early_param("no-kvmclock", parse_no_kvmclock);
 
 /* The hypervisor will put information about time periodically here */
 static DEFINE_PER_CPU_SHARED_ALIGNED(struct pvclock_vcpu_time_info, hv_clock);
+static DEFINE_PER_CPU(u64, steal_info);
 static struct pvclock_wall_clock wall_clock;
 
 /*
@@ -82,6 +85,32 @@ static cycle_t kvm_clock_read(void)
 	return ret;
 }
 
+static DEFINE_PER_CPU(u64, steal_info);
+
+cputime_t kvm_get_steal_time(void)
+{
+	u64 delta = 0;
+	u64 *last_steal_info, this_steal_info;
+	struct pvclock_vcpu_time_info *src;
+
+	src = &get_cpu_var(hv_clock);
+	if (!(src->flags & PVCLOCK_STEAL_BIT))
+		goto out;
+
+	this_steal_info = src->steal_time;
+	put_cpu_var(hv_clock);
+
+	last_steal_info = &get_cpu_var(steal_info);
+
+	delta = this_steal_info - *last_steal_info;
+
+	*last_steal_info = this_steal_info;
+	put_cpu_var(steal_info);
+
+out:
+	return msecs_to_cputime(delta);
+}
+
 static cycle_t kvm_clock_get_cycles(struct clocksource *cs)
 {
 	return kvm_clock_read();
@@ -134,6 +163,8 @@ static int kvm_register_clock(char *txt)
 	printk(KERN_INFO "kvm-clock: cpu %d, msr %x:%x, %s\n",
 	       cpu, high, low, txt);
 
+	per_cpu(steal_info, cpu) = 0;
+
 	return native_write_msr_safe(msr_kvm_system_time, low, high);
 }
 
@@ -218,4 +249,8 @@ void __init kvmclock_init(void)
 
 	if (kvm_para_has_feature(KVM_FEATURE_CLOCKSOURCE_STABLE_BIT))
 		pvclock_set_flags(PVCLOCK_TSC_STABLE_BIT);
+
+	
+	if (kvm_para_has_feature(KVM_FEATURE_CLOCKSOURCE_STEAL_BIT))
+		hypervisor_steal_time = kvm_get_steal_time;
 }
-- 
1.6.2.2


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [RFC v2 6/7] touch softlockup watchdog
  2010-08-30 16:06                     ` [RFC 6/8] " Glauber Costa
@ 2010-08-30 16:06                       ` Glauber Costa
  2010-08-30 16:06                         ` [RFC v2 7/7] tell guest about steal time feature Glauber Costa
  2010-08-30 17:33                         ` [RFC v2 6/7] touch softlockup watchdog Jeremy Fitzhardinge
  0 siblings, 2 replies; 38+ messages in thread
From: Glauber Costa @ 2010-08-30 16:06 UTC (permalink / raw)
  To: kvm; +Cc: avi, zamsden, mtosatti, riel, peterz, mingo, jeremy

With a reliable steal time mechanism, we can tell if we're
out of the cpu for very long, differentiating from the case
that we simply got a real softlockup.

In the case we were out of cpu, the watchdog is fed, making
bogus softlockups disappear.

Signed-off-by: Glauber Costa <glommer@redhat.com>
---
 arch/x86/kernel/kvmclock.c |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index a1f4852..d217475 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -91,6 +91,7 @@ cputime_t kvm_get_steal_time(void)
 {
 	u64 delta = 0;
 	u64 *last_steal_info, this_steal_info;
+	int touch_wd;
 	struct pvclock_vcpu_time_info *src;
 
 	src = &get_cpu_var(hv_clock);
@@ -104,6 +105,10 @@ cputime_t kvm_get_steal_time(void)
 
 	delta = this_steal_info - *last_steal_info;
 
+	touch_wd = softlockup_thresh * 1000UL;
+	if ((touch_wd > 0) && (delta > touch_wd))
+		touch_softlockup_watchdog();
+
 	*last_steal_info = this_steal_info;
 	put_cpu_var(steal_info);
 
-- 
1.6.2.2


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [RFC v2 7/7] tell guest about steal time feature
  2010-08-30 16:06                       ` [RFC v2 6/7] touch softlockup watchdog Glauber Costa
@ 2010-08-30 16:06                         ` Glauber Costa
  2010-08-30 16:06                           ` [RFC 7/8] touch softlockup watchdog Glauber Costa
  2010-08-30 17:33                         ` [RFC v2 6/7] touch softlockup watchdog Jeremy Fitzhardinge
  1 sibling, 1 reply; 38+ messages in thread
From: Glauber Costa @ 2010-08-30 16:06 UTC (permalink / raw)
  To: kvm; +Cc: avi, zamsden, mtosatti, riel, peterz, mingo, jeremy

Guest kernel will only activate steal time if the host exports it.
Warn the guest about it.
---
 arch/x86/kvm/x86.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 9d08032..f6a2d74 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2103,7 +2103,8 @@ static void do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 		entry->eax = (1 << KVM_FEATURE_CLOCKSOURCE) |
 			     (1 << KVM_FEATURE_NOP_IO_DELAY) |
 			     (1 << KVM_FEATURE_CLOCKSOURCE2) |
-			     (1 << KVM_FEATURE_CLOCKSOURCE_STABLE_BIT);
+			     (1 << KVM_FEATURE_CLOCKSOURCE_STABLE_BIT) |
+			     (1 << KVM_FEATURE_CLOCKSOURCE_STEAL_BIT);
 		entry->ebx = 0;
 		entry->ecx = 0;
 		entry->edx = 0;
-- 
1.6.2.2


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [RFC 7/8] touch softlockup watchdog
  2010-08-30 16:06                         ` [RFC v2 7/7] tell guest about steal time feature Glauber Costa
@ 2010-08-30 16:06                           ` Glauber Costa
  2010-08-30 16:06                             ` [RFC 8/8] tell guest about steal time feature Glauber Costa
  0 siblings, 1 reply; 38+ messages in thread
From: Glauber Costa @ 2010-08-30 16:06 UTC (permalink / raw)
  To: kvm; +Cc: avi, zamsden, mtosatti, riel, peterz, mingo, jeremy

With a reliable steal time mechanism, we can tell if we're
out of the cpu for very long, differentiating from the case
that we simply got a real softlockup.

In the case we were out of cpu, the watchdog is fed, making
bogus softlockups disappear.

Signed-off-by: Glauber Costa <glommer@redhat.com>
---
 arch/x86/kernel/kvmclock.c |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index a1f4852..d217475 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -91,6 +91,7 @@ cputime_t kvm_get_steal_time(void)
 {
 	u64 delta = 0;
 	u64 *last_steal_info, this_steal_info;
+	int touch_wd;
 	struct pvclock_vcpu_time_info *src;
 
 	src = &get_cpu_var(hv_clock);
@@ -104,6 +105,10 @@ cputime_t kvm_get_steal_time(void)
 
 	delta = this_steal_info - *last_steal_info;
 
+	touch_wd = softlockup_thresh * 1000UL;
+	if ((touch_wd > 0) && (delta > touch_wd))
+		touch_softlockup_watchdog();
+
 	*last_steal_info = this_steal_info;
 	put_cpu_var(steal_info);
 
-- 
1.6.2.2


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [RFC 8/8] tell guest about steal time feature
  2010-08-30 16:06                           ` [RFC 7/8] touch softlockup watchdog Glauber Costa
@ 2010-08-30 16:06                             ` Glauber Costa
  0 siblings, 0 replies; 38+ messages in thread
From: Glauber Costa @ 2010-08-30 16:06 UTC (permalink / raw)
  To: kvm; +Cc: avi, zamsden, mtosatti, riel, peterz, mingo, jeremy

Guest kernel will only activate steal time if the host exports it.
Warn the guest about it.
---
 arch/x86/kvm/x86.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 9d08032..f6a2d74 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2103,7 +2103,8 @@ static void do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 		entry->eax = (1 << KVM_FEATURE_CLOCKSOURCE) |
 			     (1 << KVM_FEATURE_NOP_IO_DELAY) |
 			     (1 << KVM_FEATURE_CLOCKSOURCE2) |
-			     (1 << KVM_FEATURE_CLOCKSOURCE_STABLE_BIT);
+			     (1 << KVM_FEATURE_CLOCKSOURCE_STABLE_BIT) |
+			     (1 << KVM_FEATURE_CLOCKSOURCE_STEAL_BIT);
 		entry->ebx = 0;
 		entry->ecx = 0;
 		entry->edx = 0;
-- 
1.6.2.2


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [RFC v2 0/7] kvm stael time implementation
  2010-08-30 16:06 [RFC v2 0/7] kvm stael time implementation Glauber Costa
  2010-08-30 16:06 ` [RFC v2 1/7] change headers preparing for steal time Glauber Costa
@ 2010-08-30 16:37 ` Peter Zijlstra
  2010-08-30 16:45   ` Jeremy Fitzhardinge
  2010-08-30 17:20 ` Jeremy Fitzhardinge
  2 siblings, 1 reply; 38+ messages in thread
From: Peter Zijlstra @ 2010-08-30 16:37 UTC (permalink / raw)
  To: Glauber Costa; +Cc: kvm, avi, zamsden, mtosatti, riel, mingo, jeremy

On Mon, 2010-08-30 at 12:06 -0400, Glauber Costa wrote:
> 
> Have a nice review.
> Glauber Costa (7):
>   change headers preparing for steal time
>   always call kvm_write_guest
>   measure time out of guest
>   change kernel accounting to include steal time
>   kvm steal time implementation
>   touch softlockup watchdog
>   tell guest about steal time feature 

Something's gone wrong, I've got at least 15 emails in this thread.

Also, please try again but now with --no-chain-reply-to, threads nested
this deep are useless for reviewing.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v2 0/7] kvm stael time implementation
  2010-08-30 16:37 ` [RFC v2 0/7] kvm stael time implementation Peter Zijlstra
@ 2010-08-30 16:45   ` Jeremy Fitzhardinge
  2010-08-30 17:21     ` Glauber Costa
  0 siblings, 1 reply; 38+ messages in thread
From: Jeremy Fitzhardinge @ 2010-08-30 16:45 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Glauber Costa, kvm, avi, zamsden, mtosatti, riel, mingo

 On 08/30/2010 09:37 AM, Peter Zijlstra wrote:
> Something's gone wrong, I've got at least 15 emails in this thread.
>
> Also, please try again but now with --no-chain-reply-to, threads nested
> this deep are useless for reviewing.

Looks like there are two versions of the series mushed together.

    J


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC 5/8] change kernel accounting to include steal time
  2010-08-30 16:06                 ` [RFC 5/8] change kernel accounting to include steal time Glauber Costa
  2010-08-30 16:06                   ` [RFC v2 5/7] kvm steal time implementation Glauber Costa
@ 2010-08-30 16:46                   ` Peter Zijlstra
  2010-08-30 17:26                     ` Glauber Costa
  1 sibling, 1 reply; 38+ messages in thread
From: Peter Zijlstra @ 2010-08-30 16:46 UTC (permalink / raw)
  To: Glauber Costa; +Cc: kvm, avi, zamsden, mtosatti, riel, mingo, jeremy

On Mon, 2010-08-30 at 12:06 -0400, Glauber Costa wrote:
> This patch proposes a common steal time implementation. When no
> steal time is accounted, we just add a branch to the current
> accounting code, that shouldn't add much overhead.

But why not use alternative code and avoid that conditional
all-together?

>   * Account user cpu time to a process.
>   * @p: the process that the cpu time gets accounted to
> @@ -3169,6 +3179,12 @@ void account_user_time(struct task_struct *p, cputime_t cputime,
>  	struct cpu_usage_stat *cpustat = &kstat_this_cpu.cpustat;
>  	cputime64_t tmp;
>  
> +	tmp = get_steal_time_from_hypervisor();
> +	if (tmp) {
> +		account_steal_time(tmp);
> +		return;
> +	}
> +
>  	/* Add user time to process. */
>  	p->utime = cputime_add(p->utime, cputime);
>  	p->utimescaled = cputime_add(p->utimescaled, cputime_scaled);
> @@ -3234,6 +3250,12 @@ void account_system_time(struct task_struct *p, int hardirq_offset,
>  		return;
>  	}
>  
> +	tmp = get_steal_time_from_hypervisor();
> +	if (tmp) {
> +		account_steal_time(tmp);
> +		return;
> +	}
> +
>  	/* Add system time to process. */
>  	p->stime = cputime_add(p->stime, cputime);
>  	p->stimescaled = cputime_add(p->stimescaled, cputime_scaled);

Why replicate that code and not restructure account_process_tick()?



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v2 0/7] kvm stael time implementation
  2010-08-30 16:06 [RFC v2 0/7] kvm stael time implementation Glauber Costa
  2010-08-30 16:06 ` [RFC v2 1/7] change headers preparing for steal time Glauber Costa
  2010-08-30 16:37 ` [RFC v2 0/7] kvm stael time implementation Peter Zijlstra
@ 2010-08-30 17:20 ` Jeremy Fitzhardinge
  2 siblings, 0 replies; 38+ messages in thread
From: Jeremy Fitzhardinge @ 2010-08-30 17:20 UTC (permalink / raw)
  To: Glauber Costa; +Cc: kvm, avi, zamsden, mtosatti, riel, peterz, mingo

 On 08/30/2010 09:06 AM, Glauber Costa wrote:
> Hi,
>
> So, this is basically the same as v1, with three major
> differences: 
>  1) I am posting to lkml for wider audience
>  2) patch 2/7 fixes one problem I mentined would happen in
>     smp systems, which is, we only update kvmclock when we
>     changes pcup
>  3) softlockup algorithm is changed. Again, as marcelo pointed
>     out, this is open to discussion, and I am not dropping it
>     so more people can step in.
>
> I have some other patches under local test for a slightly modified
> guest part accounting, and I do somehow support extending
> the interface, and changing to nsecs (maybe not 100 %, but...). But
> I am posting in this state so we can have lkml people to step
> in earlier.
>
> Reminder of the previous cover-letter:
>
> There are two parts of it: the guest and host part.
>
> The proposal for the guest part, is to just change the
> common time accounting, and try to identify at that spot,
> wether or not we should account any steal time. I considered
> this idea less cumbersome that trying to cook a clockevents
> implementation ourselves, since I see little value in it.
> I am, however, pretty open to suggestions.

What's the relationship between clockevents and stolen time?  Are you
thinking some form of timer that counts unstolen time or something?

> For the host<->guest communications, I am using a shared
> page, in the same way as pvclock. Because of that, I am just
> hijacking pvclock structure anyway. There is a 32-bit field
> floating by, that gives us enough room for 8 years of steal
> time (we use msec resolution).

Please don't.  The pvclock structure is already getting a bit packed
with stuff, and stolen time isn't really part of its job.  In Xen we
have a separate runstate structure which allows the guest to get a
detailed breakdown of the time each vcpu spends in each state (which are
guaranteed to sum to the system time).  We can use that to compute how
much time has been stolen (time spent in RUNNABLE state).  You might
consider a similar ABI for KVM, even if you can't (yet) fill out all the
time values.


> The main idea is to timestamp our exit and entry through
> sched notifiers, and export the value at pvclock updates.
> This obviously have some disadvantages: by doing this we
> are giving up futures ideas about only updating
> this structure once, and even right now, won't work
> on pinned-smp (since we don't update pvclock if we
> haven't changed cpus.)
>
> But again, it is just an RFC, and I'd like to feel the
> reception of the idea as a whole.

The Xen code has always accounted for stolen time, so it appears in top,
vmstat, etc, and gives a user/admin some idea about how much their
domain is suffering from competition.  This doesn't require any kernel
changes aside from some calls to account_steal_ticks(); we do this every
timer interrupt, accumulating whole ticks as we get them.

But I've not successfully managed to make the scheduler work well with
stolen time.  I did experiment with making sched_clock return unstolen
time, on the grounds that it would give the scheduler more information
about how long things actually executed for.  But its meaningless for
measuring sleep/idle times, and it causes the different cpus' timebases
to drift severely, which causes other things in the kernel to get upset.

So I think any work in the scheduler area is interesting, and probably
worth posting separately unless they're too entangled in the stolen time
accounting area.

    J

> Have a nice review.
> Glauber Costa (7):
>   change headers preparing for steal time
>   always call kvm_write_guest
>   measure time out of guest
>   change kernel accounting to include steal time
>   kvm steal time implementation
>   touch softlockup watchdog
>   tell guest about steal time feature
>
>  arch/x86/include/asm/kvm_host.h    |    2 +
>  arch/x86/include/asm/kvm_para.h    |    1 +
>  arch/x86/include/asm/pvclock-abi.h |    4 ++-
>  arch/x86/kernel/kvmclock.c         |   40 ++++++++++++++++++++++++++++++++++++
>  arch/x86/kvm/x86.c                 |   26 ++++++++++++++++++----
>  include/linux/sched.h              |    1 +
>  kernel/sched.c                     |   29 ++++++++++++++++++++++++++
>  7 files changed, 97 insertions(+), 6 deletions(-)
>


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v2 0/7] kvm stael time implementation
  2010-08-30 16:45   ` Jeremy Fitzhardinge
@ 2010-08-30 17:21     ` Glauber Costa
  0 siblings, 0 replies; 38+ messages in thread
From: Glauber Costa @ 2010-08-30 17:21 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Peter Zijlstra, kvm, avi, zamsden, mtosatti, riel, mingo

On Mon, Aug 30, 2010 at 09:45:41AM -0700, Jeremy Fitzhardinge wrote:
>  On 08/30/2010 09:37 AM, Peter Zijlstra wrote:
> > Something's gone wrong, I've got at least 15 emails in this thread.
> >
> > Also, please try again but now with --no-chain-reply-to, threads nested
> > this deep are useless for reviewing.
> 
> Looks like there are two versions of the series mushed together.
yeah, my bad. I did git-format-patch twice by accident, into the
same directory.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC 5/8] change kernel accounting to include steal time
  2010-08-30 16:46                   ` [RFC 5/8] change kernel accounting to include steal time Peter Zijlstra
@ 2010-08-30 17:26                     ` Glauber Costa
  0 siblings, 0 replies; 38+ messages in thread
From: Glauber Costa @ 2010-08-30 17:26 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: kvm, avi, zamsden, mtosatti, riel, mingo, jeremy

On Mon, Aug 30, 2010 at 06:46:25PM +0200, Peter Zijlstra wrote:
> On Mon, 2010-08-30 at 12:06 -0400, Glauber Costa wrote:
> > This patch proposes a common steal time implementation. When no
> > steal time is accounted, we just add a branch to the current
> > accounting code, that shouldn't add much overhead.
> 
> But why not use alternative code and avoid that conditional
> all-together?
> 
> >   * Account user cpu time to a process.
> >   * @p: the process that the cpu time gets accounted to
> > @@ -3169,6 +3179,12 @@ void account_user_time(struct task_struct *p, cputime_t cputime,
> >  	struct cpu_usage_stat *cpustat = &kstat_this_cpu.cpustat;
> >  	cputime64_t tmp;
> >  
> > +	tmp = get_steal_time_from_hypervisor();
> > +	if (tmp) {
> > +		account_steal_time(tmp);
> > +		return;
> > +	}
> > +
> >  	/* Add user time to process. */
> >  	p->utime = cputime_add(p->utime, cputime);
> >  	p->utimescaled = cputime_add(p->utimescaled, cputime_scaled);
> > @@ -3234,6 +3250,12 @@ void account_system_time(struct task_struct *p, int hardirq_offset,
> >  		return;
> >  	}
> >  
> > +	tmp = get_steal_time_from_hypervisor();
> > +	if (tmp) {
> > +		account_steal_time(tmp);
> > +		return;
> > +	}
> > +
> >  	/* Add system time to process. */
> >  	p->stime = cputime_add(p->stime, cputime);
> >  	p->stimescaled = cputime_add(p->stimescaled, cputime_scaled);
> 
> Why replicate that code and not restructure account_process_tick()?
For both: because this way is easier, and this is just an rfc, so I
am mainly interested in getting convergence on the design, or leaving
this behind altogether. I do agree with both your comments, btw


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v2 4/7] change kernel accounting to include steal time
  2010-08-30 16:06             ` [RFC v2 4/7] change kernel accounting to include steal time Glauber Costa
  2010-08-30 16:06               ` [RFC 4/8] measure time out of guest Glauber Costa
@ 2010-08-30 17:30               ` Jeremy Fitzhardinge
  2010-08-30 18:39                 ` Rik van Riel
  1 sibling, 1 reply; 38+ messages in thread
From: Jeremy Fitzhardinge @ 2010-08-30 17:30 UTC (permalink / raw)
  To: Glauber Costa; +Cc: kvm, avi, zamsden, mtosatti, riel, peterz, mingo

 On 08/30/2010 09:06 AM, Glauber Costa wrote:
> This patch proposes a common steal time implementation. When no
> steal time is accounted, we just add a branch to the current
> accounting code, that shouldn't add much overhead.

How is stolen time logically any different from a CPU running slowly due
to HT or power management?  Is it worth trying to handle them in the
same way?  (I'm mostly picking on the "_from_hypervisor" part, since
that seems over-specific.)

Why not have a get_unstolen_time() function which just returns
sched_clock() in the normal case, but can return less?

> When we do want to register steal time, we proceed as following:
> - if we would account user or system time in this tick, and there is
>   out-of-cpu time registered, we skip it altogether, and account steal
>   time only.
> - if we would account user or system time in this tick, and we got the
>   cpu for the whole slice, we proceed normaly.
> - if we are idle in this tick, we flush out-of-cpu time to give it the
>   chance to update whatever last-measure internal variable it may have.
>
> This approach is simple, but proved to work well for my test scenarios.
> in a UP guest on UP host, with a cpu-hog in both guest and host shows
> ~ 50 % steal time. steal time is also accounted proportionally, if
> nice values are given to the host cpu-hog.
>
> A cpu-hog in the host with no load in the guest, produces 0 % steal time,
> with 100 % idle, as one would expect.
>
> Signed-off-by: Glauber Costa <glommer@redhat.com>
> ---
>  include/linux/sched.h |    1 +
>  kernel/sched.c        |   29 +++++++++++++++++++++++++++++
>  2 files changed, 30 insertions(+), 0 deletions(-)
>
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 0478888..e571ddd 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -312,6 +312,7 @@ long io_schedule_timeout(long timeout);
>  extern void cpu_init (void);
>  extern void trap_init(void);
>  extern void update_process_times(int user);
> +extern cputime_t (*hypervisor_steal_time)(void);
>  extern void scheduler_tick(void);
>  
>  extern void sched_show_task(struct task_struct *p);
> diff --git a/kernel/sched.c b/kernel/sched.c
> index f52a880..9695c92 100644
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -3157,6 +3157,16 @@ unsigned long long thread_group_sched_runtime(struct task_struct *p)
>  	return ns;
>  }
>  
> +cputime_t (*hypervisor_steal_time)(void) = NULL;
> +
> +static inline cputime_t get_steal_time_from_hypervisor(void)
> +{
> +	if (!hypervisor_steal_time)
> +		return 0;
> +	return hypervisor_steal_time();
> +}
> +
> +
>  /*
>   * Account user cpu time to a process.
>   * @p: the process that the cpu time gets accounted to
> @@ -3169,6 +3179,12 @@ void account_user_time(struct task_struct *p, cputime_t cputime,
>  	struct cpu_usage_stat *cpustat = &kstat_this_cpu.cpustat;
>  	cputime64_t tmp;
>  
> +	tmp = get_steal_time_from_hypervisor();
> +	if (tmp) {
> +		account_steal_time(tmp);
> +		return;
> +	}

Is that all?  Does the scheduler use account_steal_time() to adjust its
scheduling decisions, or is it just something that gets shown to users? 
I thought just the latter.

But if all you're doing is calling account_steal_time(), why bother with
all this get_steal_time_from_hypervisor() stuff?  The
hypervisor-specific code can just call account_steal_time() directly.

> +
>  	/* Add user time to process. */
>  	p->utime = cputime_add(p->utime, cputime);
>  	p->utimescaled = cputime_add(p->utimescaled, cputime_scaled);
> @@ -3234,6 +3250,12 @@ void account_system_time(struct task_struct *p, int hardirq_offset,
>  		return;
>  	}
>  
> +	tmp = get_steal_time_from_hypervisor();
> +	if (tmp) {
> +		account_steal_time(tmp);
> +		return;
> +	}
> +
>  	/* Add system time to process. */
>  	p->stime = cputime_add(p->stime, cputime);
>  	p->stimescaled = cputime_add(p->stimescaled, cputime_scaled);
> @@ -3276,6 +3298,13 @@ void account_idle_time(cputime_t cputime)
>  	cputime64_t cputime64 = cputime_to_cputime64(cputime);
>  	struct rq *rq = this_rq();
>  
> +	/*
> +	 * if we're idle, we don't account it as steal time, since we did
> +	 * not want to run anyway. We do call the steal function, however, to
> +	 * give the guest the chance to flush its internal buffers
> +	 */
> +	get_steal_time_from_hypervisor();

Eh?  This doesn't make much sense.  What side-effects is
get_steal_time_from_hypervisor() expected to have?  If there's some
hypervisor-specific implementation detail, why not wrap that up in a
specific function rather than overloading this one?

    J

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v2 6/7] touch softlockup watchdog
  2010-08-30 16:06                       ` [RFC v2 6/7] touch softlockup watchdog Glauber Costa
  2010-08-30 16:06                         ` [RFC v2 7/7] tell guest about steal time feature Glauber Costa
@ 2010-08-30 17:33                         ` Jeremy Fitzhardinge
  2010-08-30 18:07                           ` Glauber Costa
  1 sibling, 1 reply; 38+ messages in thread
From: Jeremy Fitzhardinge @ 2010-08-30 17:33 UTC (permalink / raw)
  To: Glauber Costa; +Cc: kvm, avi, zamsden, mtosatti, riel, peterz, mingo

 On 08/30/2010 09:06 AM, Glauber Costa wrote:
> With a reliable steal time mechanism, we can tell if we're
> out of the cpu for very long, differentiating from the case
> that we simply got a real softlockup.
>
> In the case we were out of cpu, the watchdog is fed, making
> bogus softlockups disappear.

Why not change the softlockup to measure unstolen time rather than
explicitly poking it from a hypervisor-specific function?

How is touching it in kvm_get_steal_time() correct anyway?

    J

> Signed-off-by: Glauber Costa <glommer@redhat.com>
> ---
>  arch/x86/kernel/kvmclock.c |    5 +++++
>  1 files changed, 5 insertions(+), 0 deletions(-)
>
> diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
> index a1f4852..d217475 100644
> --- a/arch/x86/kernel/kvmclock.c
> +++ b/arch/x86/kernel/kvmclock.c
> @@ -91,6 +91,7 @@ cputime_t kvm_get_steal_time(void)
>  {
>  	u64 delta = 0;
>  	u64 *last_steal_info, this_steal_info;
> +	int touch_wd;
>  	struct pvclock_vcpu_time_info *src;
>  
>  	src = &get_cpu_var(hv_clock);
> @@ -104,6 +105,10 @@ cputime_t kvm_get_steal_time(void)
>  
>  	delta = this_steal_info - *last_steal_info;
>  
> +	touch_wd = softlockup_thresh * 1000UL;
> +	if ((touch_wd > 0) && (delta > touch_wd))
> +		touch_softlockup_watchdog();
> +
>  	*last_steal_info = this_steal_info;
>  	put_cpu_var(steal_info);
>  


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v2 6/7] touch softlockup watchdog
  2010-08-30 17:33                         ` [RFC v2 6/7] touch softlockup watchdog Jeremy Fitzhardinge
@ 2010-08-30 18:07                           ` Glauber Costa
  0 siblings, 0 replies; 38+ messages in thread
From: Glauber Costa @ 2010-08-30 18:07 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: kvm, avi, zamsden, mtosatti, riel, peterz, mingo

On Mon, Aug 30, 2010 at 10:33:26AM -0700, Jeremy Fitzhardinge wrote:
>  On 08/30/2010 09:06 AM, Glauber Costa wrote:
> > With a reliable steal time mechanism, we can tell if we're
> > out of the cpu for very long, differentiating from the case
> > that we simply got a real softlockup.
> >
> > In the case we were out of cpu, the watchdog is fed, making
> > bogus softlockups disappear.
> 
> Why not change the softlockup to measure unstolen time rather than
> explicitly poking it from a hypervisor-specific function?
> 
Ok, I like your idea much better.

Thanks!


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v2 4/7] change kernel accounting to include steal time
  2010-08-30 17:30               ` [RFC v2 4/7] " Jeremy Fitzhardinge
@ 2010-08-30 18:39                 ` Rik van Riel
  2010-08-30 19:07                   ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 38+ messages in thread
From: Rik van Riel @ 2010-08-30 18:39 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Glauber Costa, kvm, avi, zamsden, mtosatti, peterz, mingo

On 08/30/2010 01:30 PM, Jeremy Fitzhardinge wrote:
>   On 08/30/2010 09:06 AM, Glauber Costa wrote:
>> This patch proposes a common steal time implementation. When no
>> steal time is accounted, we just add a branch to the current
>> accounting code, that shouldn't add much overhead.
>
> How is stolen time logically any different from a CPU running slowly due
> to HT or power management?  Is it worth trying to handle them in the
> same way?  (I'm mostly picking on the "_from_hypervisor" part, since
> that seems over-specific.)
>
> Why not have a get_unstolen_time() function which just returns
> sched_clock() in the normal case, but can return less?

Steal time gets you information you can act on, when
your program is running slowly.

The steal time statistic allows you to see whether the
slowdown was due to the CPU just not being fast enough,
or due to something else contending for the CPU.

This can be useful information. Apparently, it has been
useful enough that it has been implemented on s390, PPC
and Xen (pre pvops Xen).

I suppose HT can have similar results, but that is
also something that system administrators can see.

-- 
All rights reversed

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v2 4/7] change kernel accounting to include steal time
  2010-08-30 18:39                 ` Rik van Riel
@ 2010-08-30 19:07                   ` Jeremy Fitzhardinge
  2010-08-30 19:14                     ` Peter Zijlstra
  2010-08-30 19:17                     ` Rik van Riel
  0 siblings, 2 replies; 38+ messages in thread
From: Jeremy Fitzhardinge @ 2010-08-30 19:07 UTC (permalink / raw)
  To: Rik van Riel; +Cc: Glauber Costa, kvm, avi, zamsden, mtosatti, peterz, mingo

 On 08/30/2010 11:39 AM, Rik van Riel wrote:
> On 08/30/2010 01:30 PM, Jeremy Fitzhardinge wrote:
>>   On 08/30/2010 09:06 AM, Glauber Costa wrote:
>>> This patch proposes a common steal time implementation. When no
>>> steal time is accounted, we just add a branch to the current
>>> accounting code, that shouldn't add much overhead.
>>
>> How is stolen time logically any different from a CPU running slowly due
>> to HT or power management?  Is it worth trying to handle them in the
>> same way?  (I'm mostly picking on the "_from_hypervisor" part, since
>> that seems over-specific.)
>>
>> Why not have a get_unstolen_time() function which just returns
>> sched_clock() in the normal case, but can return less?
>
> Steal time gets you information you can act on, when
> your program is running slowly.
>
> The steal time statistic allows you to see whether the
> slowdown was due to the CPU just not being fast enough,
> or due to something else contending for the CPU.
>
> This can be useful information. Apparently, it has been
> useful enough that it has been implemented on s390, PPC
> and Xen (pre pvops Xen).

Sure - pvops Xen has had stolen time accounting from the start (or a
very long time, at least).  I think Glauber's changes are functionally
identical to Xen's existing stolen time support, but more intrusive.


> I suppose HT can have similar results, but that is
> also something that system administrators can see.

My comments were mostly because I though that this patch was going to
actually make the scheduler change behaviour to take stolen time into
account in scheduling decisions - and those decisions would be equally
meaningful if the slowdown was from HT, PM or stolen time.  But AFAICT
this patch does nothing to that end.

    J

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v2 4/7] change kernel accounting to include steal time
  2010-08-30 19:07                   ` Jeremy Fitzhardinge
@ 2010-08-30 19:14                     ` Peter Zijlstra
  2010-08-30 19:17                     ` Rik van Riel
  1 sibling, 0 replies; 38+ messages in thread
From: Peter Zijlstra @ 2010-08-30 19:14 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Rik van Riel, Glauber Costa, kvm, avi, zamsden, mtosatti, mingo

On Mon, 2010-08-30 at 12:07 -0700, Jeremy Fitzhardinge wrote:
> My comments were mostly because I though that this patch was going to
> actually make the scheduler change behaviour to take stolen time into
> account in scheduling decisions - and those decisions would be equally
> meaningful if the slowdown was from HT, PM or stolen time.  But AFAICT
> this patch does nothing to that end. 

If you want to influence the load-balancer (the only thing that really
cares about HT) the best would probably be to add a term to
kernel/sched_fair.c:update_cpu_power().

Decrease the power relative to the amount of steal-time, just like freq
and time spend on RT tasks.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v2 4/7] change kernel accounting to include steal time
  2010-08-30 19:07                   ` Jeremy Fitzhardinge
  2010-08-30 19:14                     ` Peter Zijlstra
@ 2010-08-30 19:17                     ` Rik van Riel
  2010-08-30 19:20                       ` Peter Zijlstra
  1 sibling, 1 reply; 38+ messages in thread
From: Rik van Riel @ 2010-08-30 19:17 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Glauber Costa, kvm, avi, zamsden, mtosatti, peterz, mingo

On 08/30/2010 03:07 PM, Jeremy Fitzhardinge wrote:
>   On 08/30/2010 11:39 AM, Rik van Riel wrote:
>> On 08/30/2010 01:30 PM, Jeremy Fitzhardinge wrote:
>>>    On 08/30/2010 09:06 AM, Glauber Costa wrote:
>>>> This patch proposes a common steal time implementation. When no
>>>> steal time is accounted, we just add a branch to the current
>>>> accounting code, that shouldn't add much overhead.
>>>
>>> How is stolen time logically any different from a CPU running slowly due
>>> to HT or power management?  Is it worth trying to handle them in the
>>> same way?  (I'm mostly picking on the "_from_hypervisor" part, since
>>> that seems over-specific.)
>>>
>>> Why not have a get_unstolen_time() function which just returns
>>> sched_clock() in the normal case, but can return less?
>>
>> Steal time gets you information you can act on, when
>> your program is running slowly.
>>
>> The steal time statistic allows you to see whether the
>> slowdown was due to the CPU just not being fast enough,
>> or due to something else contending for the CPU.
>>
>> This can be useful information. Apparently, it has been
>> useful enough that it has been implemented on s390, PPC
>> and Xen (pre pvops Xen).
>
> Sure - pvops Xen has had stolen time accounting from the start (or a
> very long time, at least).  I think Glauber's changes are functionally
> identical to Xen's existing stolen time support, but more intrusive.
>
>
>> I suppose HT can have similar results, but that is
>> also something that system administrators can see.
>
> My comments were mostly because I though that this patch was going to
> actually make the scheduler change behaviour to take stolen time into
> account in scheduling decisions

I believe that happens automatically.

When time is accounted as steal time, it is NOT accounted as
to the current process user/system/..., which in turn should
help it in the scheduler.

Am I overlooking something?

-- 
All rights reversed

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v2 4/7] change kernel accounting to include steal time
  2010-08-30 19:17                     ` Rik van Riel
@ 2010-08-30 19:20                       ` Peter Zijlstra
  2010-08-30 19:45                         ` Rik van Riel
  0 siblings, 1 reply; 38+ messages in thread
From: Peter Zijlstra @ 2010-08-30 19:20 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Jeremy Fitzhardinge, Glauber Costa, kvm, avi, zamsden, mtosatti, mingo

On Mon, 2010-08-30 at 15:17 -0400, Rik van Riel wrote:
> 
> When time is accounted as steal time, it is NOT accounted as
> to the current process user/system/..., which in turn should
> help it in the scheduler.
> 
> Am I overlooking something? 

Yeah, the scheduler doesn't care about the user/system time accounting
at all... :-)

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v2 4/7] change kernel accounting to include steal time
  2010-08-30 19:20                       ` Peter Zijlstra
@ 2010-08-30 19:45                         ` Rik van Riel
  2010-08-30 22:56                           ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 38+ messages in thread
From: Rik van Riel @ 2010-08-30 19:45 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jeremy Fitzhardinge, Glauber Costa, kvm, avi, zamsden, mtosatti, mingo

On 08/30/2010 03:20 PM, Peter Zijlstra wrote:
> On Mon, 2010-08-30 at 15:17 -0400, Rik van Riel wrote:
>>
>> When time is accounted as steal time, it is NOT accounted as
>> to the current process user/system/..., which in turn should
>> help it in the scheduler.
>>
>> Am I overlooking something?
>
> Yeah, the scheduler doesn't care about the user/system time accounting
> at all... :-)

Uh oh. This would seem like something we'll want to fix
in an architecture independent way, so s390, etc. also
benefit from it.

I can see this being a real problem when the host and guest
OS have the same time slice - which is quite possible since
they may both be the same version of Linux.

Guest 1, alternating between processes A and B, may end up
with process A getting a lot of actual CPU time, and process
B being scheduled in when the VCPU itself is not running...

-- 
All rights reversed

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v2 4/7] change kernel accounting to include steal time
  2010-08-30 19:45                         ` Rik van Riel
@ 2010-08-30 22:56                           ` Jeremy Fitzhardinge
  2010-08-30 23:03                             ` Rik van Riel
  0 siblings, 1 reply; 38+ messages in thread
From: Jeremy Fitzhardinge @ 2010-08-30 22:56 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Peter Zijlstra, Glauber Costa, kvm, avi, zamsden, mtosatti, mingo

 On 08/30/2010 12:45 PM, Rik van Riel wrote:
> On 08/30/2010 03:20 PM, Peter Zijlstra wrote:
>> On Mon, 2010-08-30 at 15:17 -0400, Rik van Riel wrote:
>>>
>>> When time is accounted as steal time, it is NOT accounted as
>>> to the current process user/system/..., which in turn should
>>> help it in the scheduler.
>>>
>>> Am I overlooking something?
>>
>> Yeah, the scheduler doesn't care about the user/system time accounting
>> at all... :-)
>
> Uh oh. This would seem like something we'll want to fix
> in an architecture independent way, so s390, etc. also
> benefit from it.
>
> I can see this being a real problem when the host and guest
> OS have the same time slice - which is quite possible since
> they may both be the same version of Linux.
>
> Guest 1, alternating between processes A and B, may end up
> with process A getting a lot of actual CPU time, and process
> B being scheduled in when the VCPU itself is not running...

Yep.  I'd been trying to do that with sched_clock games, but that never
worked out.

I think it basically comes down to adding "sched_clock_unstolen()" which
the scheduler can use to measure time a process spends running, and
sched_clock() for measuring sleep times.  In the normal case,
sched_clock_unstolen() would be the same as sched_clock().

    J

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v2 4/7] change kernel accounting to include steal time
  2010-08-30 22:56                           ` Jeremy Fitzhardinge
@ 2010-08-30 23:03                             ` Rik van Riel
  2010-08-31  8:11                               ` Peter Zijlstra
  0 siblings, 1 reply; 38+ messages in thread
From: Rik van Riel @ 2010-08-30 23:03 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Peter Zijlstra, Glauber Costa, kvm, avi, zamsden, mtosatti, mingo

On 08/30/2010 06:56 PM, Jeremy Fitzhardinge wrote:
>   On 08/30/2010 12:45 PM, Rik van Riel wrote:
>> On 08/30/2010 03:20 PM, Peter Zijlstra wrote:
>>> On Mon, 2010-08-30 at 15:17 -0400, Rik van Riel wrote:
>>>>
>>>> When time is accounted as steal time, it is NOT accounted as
>>>> to the current process user/system/..., which in turn should
>>>> help it in the scheduler.
>>>>
>>>> Am I overlooking something?
>>>
>>> Yeah, the scheduler doesn't care about the user/system time accounting
>>> at all... :-)
>>
>> Uh oh. This would seem like something we'll want to fix
>> in an architecture independent way, so s390, etc. also
>> benefit from it.
>>
>> I can see this being a real problem when the host and guest
>> OS have the same time slice - which is quite possible since
>> they may both be the same version of Linux.
>>
>> Guest 1, alternating between processes A and B, may end up
>> with process A getting a lot of actual CPU time, and process
>> B being scheduled in when the VCPU itself is not running...
>
> Yep.  I'd been trying to do that with sched_clock games, but that never
> worked out.
>
> I think it basically comes down to adding "sched_clock_unstolen()" which
> the scheduler can use to measure time a process spends running, and
> sched_clock() for measuring sleep times.  In the normal case,
> sched_clock_unstolen() would be the same as sched_clock().

That requires the host to export (any time the guest is scheduled
in), the amount of CPU time the VCPU thread has used, and the time
the VCPU was scheduled in.

Since the VCPU must be running when it is examining these variables,
it can calculate the additional time (since it was last scheduled)
to account to the task, and remember the currently calculated time
in its own per-vcpu variable, so next time it can get a delta again.

-- 
All rights reversed

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v2 4/7] change kernel accounting to include steal time
  2010-08-30 23:03                             ` Rik van Riel
@ 2010-08-31  8:11                               ` Peter Zijlstra
  2010-09-02 18:19                                 ` Glauber Costa
  0 siblings, 1 reply; 38+ messages in thread
From: Peter Zijlstra @ 2010-08-31  8:11 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Jeremy Fitzhardinge, Glauber Costa, kvm, avi, zamsden, mtosatti, mingo

On Mon, 2010-08-30 at 19:03 -0400, Rik van Riel wrote:
> 
> > I think it basically comes down to adding "sched_clock_unstolen()" which
> > the scheduler can use to measure time a process spends running, and
> > sched_clock() for measuring sleep times.  In the normal case,
> > sched_clock_unstolen() would be the same as sched_clock().
> 
> That requires the host to export (any time the guest is scheduled
> in), the amount of CPU time the VCPU thread has used, and the time
> the VCPU was scheduled in.
> 
> Since the VCPU must be running when it is examining these variables,
> it can calculate the additional time (since it was last scheduled)
> to account to the task, and remember the currently calculated time
> in its own per-vcpu variable, so next time it can get a delta again. 

I think its easier (and sufficient) for the host to tell the guest how
long it was _not_ running. That can simply be passed in when you start
the vcpu again and doesn't need a fancy communication channel.

The guests sched_clock() will measure wall time, the guests
sched_clock_stolen() will report the accumulation of these stolen times.

Then you can make sched_clock_unstolen() be sched_clock() -
sched_clock_stolen(). And like Jeremy said, if you make the sched_fair
stuff use sched_clock_unstolen() things should more or less work.

The problem with all that is that you'll start to schedule on unstolen
time instead of wall-time, which might not give the best results for
things like latencies etc.. but I guess that's one of the prices you pay
for using virt.

Also, like said yesterday, you need some factor in update_cpu_power(), a
quick hack might be to add all stolen time to sched_rt_avg_update().



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC 1/8] Implement getnsboottime kernel API
  2010-08-30 16:06   ` [RFC 1/8] Implement getnsboottime kernel API Glauber Costa
  2010-08-30 16:06     ` [RFC v2 2/7] always call kvm_write_guest Glauber Costa
@ 2010-09-01 23:56     ` Zachary Amsden
  1 sibling, 0 replies; 38+ messages in thread
From: Zachary Amsden @ 2010-09-01 23:56 UTC (permalink / raw)
  To: Glauber Costa; +Cc: kvm, avi, mtosatti, riel, peterz, mingo, jeremy

On 08/30/2010 06:06 AM, Glauber Costa wrote:
> From: Zachary Amsden<zamsden@redhat.com>
>
> Add a kernel call to get the number of nanoseconds since boot.  This
> is generally useful enough to make it a generic call.
>
> Signed-off-by: Zachary Amsden<zamsden@redhat.com>
> ---
>   include/linux/time.h      |    1 +
>   kernel/time/timekeeping.c |   27 +++++++++++++++++++++++++++
>   2 files changed, 28 insertions(+), 0 deletions(-)
>
> diff --git a/include/linux/time.h b/include/linux/time.h
> index ea3559f..5d04108 100644
> --- a/include/linux/time.h
> +++ b/include/linux/time.h
> @@ -145,6 +145,7 @@ extern void getnstimeofday(struct timespec *tv);
>   extern void getrawmonotonic(struct timespec *ts);
>   extern void getboottime(struct timespec *ts);
>   extern void monotonic_to_bootbased(struct timespec *ts);
> +extern s64 getnsboottime(void);
>
>   extern struct timespec timespec_trunc(struct timespec t, unsigned gran);
>   extern int timekeeping_valid_for_hres(void);
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index caf8d4d..d250f0a 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -285,6 +285,33 @@ void ktime_get_ts(struct timespec *ts)
>   }
>   EXPORT_SYMBOL_GPL(ktime_get_ts);
>
> +
> +/**
> + * getnsboottime - get the bootbased clock in nsec format
> + *
> + * The function calculates the bootbased clock from the realtime
> + * clock and the wall_to_monotonic offset and stores the result
> + * in normalized timespec format in the variable pointed to by @ts.
> + */
> +s64 getnsboottime(void)
> +{
> +	unsigned int seq;
> +	s64 secs, nsecs;
> +
> +	WARN_ON(timekeeping_suspended);
> +
> +	do {
> +		seq = read_seqbegin(&xtime_lock);
> +		secs = xtime.tv_sec + wall_to_monotonic.tv_sec;
> +		secs += total_sleep_time.tv_sec;
> +		nsecs = xtime.tv_nsec + wall_to_monotonic.tv_nsec;
> +		nsecs += total_sleep_time.tv_nsec + timekeeping_get_ns();
> +
> +	} while (read_seqretry(&xtime_lock, seq));
> +	return nsecs + (secs * NSEC_PER_SEC);
> +}
> +EXPORT_SYMBOL_GPL(getnsboottime);
> +
>   /**
>    * do_gettimeofday - Returns the time of day in a timeval
>    * @tv:		pointer to the timeval to be set
>    

FWIW, I sent a patch to drop this yesterday, in favor of this approach:

+static inline u64 get_kernel_ns(void)
+{
+    struct timespec ts;
+
+    WARN_ON(preemptible());
+    ktime_get_ts(&ts);
+    monotonic_to_bootbased(&ts);
+    return timespec_to_ns(&ts);
+}
+


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v2 4/7] change kernel accounting to include steal time
  2010-08-31  8:11                               ` Peter Zijlstra
@ 2010-09-02 18:19                                 ` Glauber Costa
  2010-09-03  3:24                                   ` Jeremy Fitzhardinge
  2010-09-03  7:18                                   ` Peter Zijlstra
  0 siblings, 2 replies; 38+ messages in thread
From: Glauber Costa @ 2010-09-02 18:19 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Rik van Riel, Jeremy Fitzhardinge, kvm, avi, zamsden, mtosatti, mingo

On Tue, Aug 31, 2010 at 10:11:49AM +0200, Peter Zijlstra wrote:
> On Mon, 2010-08-30 at 19:03 -0400, Rik van Riel wrote:
> > 
> > > I think it basically comes down to adding "sched_clock_unstolen()" which
> > > the scheduler can use to measure time a process spends running, and
> > > sched_clock() for measuring sleep times.  In the normal case,
> > > sched_clock_unstolen() would be the same as sched_clock().
> > 
> > That requires the host to export (any time the guest is scheduled
> > in), the amount of CPU time the VCPU thread has used, and the time
> > the VCPU was scheduled in.
> > 
> > Since the VCPU must be running when it is examining these variables,
> > it can calculate the additional time (since it was last scheduled)
> > to account to the task, and remember the currently calculated time
> > in its own per-vcpu variable, so next time it can get a delta again. 
> 
> I think its easier (and sufficient) for the host to tell the guest how
> long it was _not_ running. That can simply be passed in when you start
> the vcpu again and doesn't need a fancy communication channel.
> 
> The guests sched_clock() will measure wall time, the guests
> sched_clock_stolen() will report the accumulation of these stolen times.
> 
> Then you can make sched_clock_unstolen() be sched_clock() -
> sched_clock_stolen(). And like Jeremy said, if you make the sched_fair
> stuff use sched_clock_unstolen() things should more or less work.
So what's the big drawback of just making sched_clock return sched_clock_unstolen?
When there is no steal time involved, they will just be equal anyway.
And this way, everybody that relies on sched_clock for whatever reason,
will probably work.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v2 4/7] change kernel accounting to include steal time
  2010-09-02 18:19                                 ` Glauber Costa
@ 2010-09-03  3:24                                   ` Jeremy Fitzhardinge
  2010-09-03  7:18                                   ` Peter Zijlstra
  1 sibling, 0 replies; 38+ messages in thread
From: Jeremy Fitzhardinge @ 2010-09-03  3:24 UTC (permalink / raw)
  To: Glauber Costa
  Cc: Peter Zijlstra, Rik van Riel, kvm, avi, zamsden, mtosatti, mingo

 On 09/02/2010 11:19 AM, Glauber Costa wrote:
> On Tue, Aug 31, 2010 at 10:11:49AM +0200, Peter Zijlstra wrote:
>> I think its easier (and sufficient) for the host to tell the guest how
>> long it was _not_ running. That can simply be passed in when you start
>> the vcpu again and doesn't need a fancy communication channel.
>>
>> The guests sched_clock() will measure wall time, the guests
>> sched_clock_stolen() will report the accumulation of these stolen times.
>>
>> Then you can make sched_clock_unstolen() be sched_clock() -
>> sched_clock_stolen(). And like Jeremy said, if you make the sched_fair
>> stuff use sched_clock_unstolen() things should more or less work.
> So what's the big drawback of just making sched_clock return sched_clock_unstolen?
> When there is no steal time involved, they will just be equal anyway.
> And this way, everybody that relies on sched_clock for whatever reason,
> will probably work.

I can say from experience that it definitely does not work.  That's what
I tried up until recently when I reverted it all.

It doesn't work for two semi-related reasons:

    * Using sched_clock_unstolen() to measure idle/sleep times is
      completely meaningless (nobody cares if the cpu was "stolen" while
      you were sleeping).
    * The notion of unstolen time is inherently per-cpu, so the unstolen
      clocks are completely unsynchronized between cpus, whereas
      sched_clock is expected to be moderately well synchronized.

A corollary to this is that if a task sleeps on one cpu and wakes on
another, then "sleep_time = sleep_end - sleep_start" is completely,
utterly, meaningless if you use per-cpu unstolen time as your timebase
sleep_start/end.

(The surprising thing is that it actually works surprisingly well, in
that the system runs and there's only occasional oddities like "idle
time" being completely misreported, and sometimes processes go to sleep
for really long times.)

    J

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v2 4/7] change kernel accounting to include steal time
  2010-09-02 18:19                                 ` Glauber Costa
  2010-09-03  3:24                                   ` Jeremy Fitzhardinge
@ 2010-09-03  7:18                                   ` Peter Zijlstra
  1 sibling, 0 replies; 38+ messages in thread
From: Peter Zijlstra @ 2010-09-03  7:18 UTC (permalink / raw)
  To: Glauber Costa
  Cc: Rik van Riel, Jeremy Fitzhardinge, kvm, avi, zamsden, mtosatti, mingo

On Thu, 2010-09-02 at 15:19 -0300, Glauber Costa wrote:
> So what's the big drawback of just making sched_clock return sched_clock_unstolen?
> When there is no steal time involved, they will just be equal anyway.
> And this way, everybody that relies on sched_clock for whatever reason,
> will probably work. 

Because there's a number of things wanting sched_clock() and the likes
to be synchronized across cpus, its impossible to make _unstolen()
synchronized because different vcpus can get different service levels.

Also, I think you want to keep measuring scheduling latency and other
such things in actual wall-time.

^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2010-09-03  7:19 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-08-30 16:06 [RFC v2 0/7] kvm stael time implementation Glauber Costa
2010-08-30 16:06 ` [RFC v2 1/7] change headers preparing for steal time Glauber Costa
2010-08-30 16:06   ` [RFC 1/8] Implement getnsboottime kernel API Glauber Costa
2010-08-30 16:06     ` [RFC v2 2/7] always call kvm_write_guest Glauber Costa
2010-08-30 16:06       ` [RFC 2/8] change headers preparing for steal time Glauber Costa
2010-08-30 16:06         ` [RFC 3/8] always call kvm_write_guest Glauber Costa
2010-08-30 16:06           ` [RFC v2 3/7] measure time out of guest Glauber Costa
2010-08-30 16:06             ` [RFC v2 4/7] change kernel accounting to include steal time Glauber Costa
2010-08-30 16:06               ` [RFC 4/8] measure time out of guest Glauber Costa
2010-08-30 16:06                 ` [RFC 5/8] change kernel accounting to include steal time Glauber Costa
2010-08-30 16:06                   ` [RFC v2 5/7] kvm steal time implementation Glauber Costa
2010-08-30 16:06                     ` [RFC 6/8] " Glauber Costa
2010-08-30 16:06                       ` [RFC v2 6/7] touch softlockup watchdog Glauber Costa
2010-08-30 16:06                         ` [RFC v2 7/7] tell guest about steal time feature Glauber Costa
2010-08-30 16:06                           ` [RFC 7/8] touch softlockup watchdog Glauber Costa
2010-08-30 16:06                             ` [RFC 8/8] tell guest about steal time feature Glauber Costa
2010-08-30 17:33                         ` [RFC v2 6/7] touch softlockup watchdog Jeremy Fitzhardinge
2010-08-30 18:07                           ` Glauber Costa
2010-08-30 16:46                   ` [RFC 5/8] change kernel accounting to include steal time Peter Zijlstra
2010-08-30 17:26                     ` Glauber Costa
2010-08-30 17:30               ` [RFC v2 4/7] " Jeremy Fitzhardinge
2010-08-30 18:39                 ` Rik van Riel
2010-08-30 19:07                   ` Jeremy Fitzhardinge
2010-08-30 19:14                     ` Peter Zijlstra
2010-08-30 19:17                     ` Rik van Riel
2010-08-30 19:20                       ` Peter Zijlstra
2010-08-30 19:45                         ` Rik van Riel
2010-08-30 22:56                           ` Jeremy Fitzhardinge
2010-08-30 23:03                             ` Rik van Riel
2010-08-31  8:11                               ` Peter Zijlstra
2010-09-02 18:19                                 ` Glauber Costa
2010-09-03  3:24                                   ` Jeremy Fitzhardinge
2010-09-03  7:18                                   ` Peter Zijlstra
2010-09-01 23:56     ` [RFC 1/8] Implement getnsboottime kernel API Zachary Amsden
2010-08-30 16:37 ` [RFC v2 0/7] kvm stael time implementation Peter Zijlstra
2010-08-30 16:45   ` Jeremy Fitzhardinge
2010-08-30 17:21     ` Glauber Costa
2010-08-30 17:20 ` Jeremy Fitzhardinge

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).