kvm.vger.kernel.org archive mirror
* [RFC PATCH 0/3] Report guest steal time in host
@ 2015-04-22 10:24 Naveen N. Rao
  2015-04-22 10:24 ` [RFC PATCH 1/3] procfs: add guest steal time in /proc/<pid>/stat Naveen N. Rao
                   ` (4 more replies)
  0 siblings, 5 replies; 7+ messages in thread
From: Naveen N. Rao @ 2015-04-22 10:24 UTC (permalink / raw)
  To: kvm, kvm-ppc; +Cc: paulus, mpe, agraf, mingo, ego

Steal time accounts for the time during which a guest vcpu was ready to run
but was not scheduled to run by the hypervisor. This is particularly relevant
in cloud environments, where customers would want to use it as an indicator
that their guests are being throttled. However, as it stands today, guest
steal time information is not visible from the hypervisor.

For cloud service providers, this is problematic since they would want to
overcommit cpu resources to achieve optimum resource utilization while at the
same time ensuring guests are not throttled. It is useful for service providers
to have access to the guest steal time data so that they can base their
overcommit/guest packing decisions on this. Higher guest steal time can be used
as a trigger to change how the guests are scheduled, or even migrate guests out
of a system.

This patchset attempts to make the guest steal times available in the host.
This is achieved by introducing a new field in per-task statistics
(/proc/<pid>/stat and /proc/<pid>/task/<tid>/stat) to accumulate per-vcpu steal
time. Programs (such as pidstat) can then be enhanced to report this
information on a per-thread basis [If there is a better place/way to expose
this, please let me know]. As an example, on ppc64, with mpstat in the guest
and a locally modified pidstat in the host:

Guest steal time information using mpstat:
-----------------------------------------

[root@rhel7-img ~]# mpstat -P ALL 1
Linux 3.19.0nnr (rhel7-img) 	04/15/2015 	_ppc64_	(4 CPU)

03:13:23 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
03:13:24 PM  all   12.25    0.00    1.25    0.00    1.00    2.25   13.75    0.00    0.00   69.50
03:13:24 PM    0   46.53    0.00    0.00    0.00    0.00    4.95   45.54    0.00    0.00    2.97
03:13:24 PM    1    0.00    0.00    0.00    0.00    0.00    4.04    3.03    0.00    0.00   92.93
03:13:24 PM    2    0.00    0.00    0.00    0.00    3.96    0.99    2.97    0.00    0.00   92.08
03:13:24 PM    3    3.00    0.00    4.00    0.00    0.00    0.00    4.00    0.00    0.00   89.00

03:13:24 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
03:13:25 PM  all   12.59    0.00    0.00    0.00    0.00    0.25   12.35    0.00    0.00   74.81
03:13:25 PM    0   50.00    0.00    0.00    0.00    0.00    0.98   49.02    0.00    0.00    0.00
03:13:25 PM    1    0.98    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00   99.02
03:13:25 PM    2    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
03:13:25 PM    3    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00

03:13:25 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
03:13:26 PM  all   12.99    0.00    0.00    0.00    0.25    0.00   12.75    0.00    0.00   74.02
03:13:26 PM    0   51.96    0.00    0.00    0.00    0.00    0.00   48.04    0.00    0.00    0.00
03:13:26 PM    1    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
03:13:26 PM    2    0.00    0.00    0.00    0.00    0.98    0.00    2.94    0.00    0.00   96.08
03:13:26 PM    3    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00

03:13:26 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
03:13:27 PM  all   12.53    0.00    1.00    0.25    0.00    0.25   12.03    0.00    0.00   73.93
03:13:27 PM    0   51.02    0.00    0.00    0.00    0.00    0.00   48.98    0.00    0.00    0.00
03:13:27 PM    1    0.00    0.00    4.04    0.00    0.00    0.00    0.00    0.00    0.00   95.96
03:13:27 PM    2    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
03:13:27 PM    3    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00

Average:     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
Average:     all   12.91    0.00    0.54    0.01    0.04    0.12   12.39    0.00    0.00   74.00
Average:       0   51.36    0.00    0.03    0.00    0.03    0.26   48.27    0.00    0.00    0.05
Average:       1    0.02    0.00    1.54    0.02    0.02    0.15    0.36    0.00    0.00   97.89
Average:       2    0.00    0.00    0.52    0.00    0.09    0.02    0.36    0.00    0.00   99.02
Average:       3    0.05    0.00    0.07    0.00    0.02    0.09    0.34    0.00    0.00   99.43

Steal time information in host using (locally modified) pidstat:
---------------------------------------------------------------

[naveen@xxxxxxxxxx sysstat]$ ./pidstat -C qemu -tIu 1
Linux 3.19.0nnr (xxxxxxxxxx.in.ibm.com) 	04/15/2015 	_ppc64_	(64 CPU)

04:43:20 AM   UID      TGID       TID    %usr %system  %guest    %CPU  %steal   CPU  Command
04:43:22 AM  1008      3001         -    0.00    0.00   54.21    3.39   45.79    12  qemu-system-ppc
04:43:22 AM  1008         -      3005    0.00    0.00   54.21    3.39    0.00    12  |__qemu-system-ppc

04:43:22 AM   UID      TGID       TID    %usr %system  %guest    %CPU  %steal   CPU  Command
04:43:23 AM  1008      3001         -    0.00    0.00   52.00    3.25   46.00    12  qemu-system-ppc
04:43:23 AM  1008         -      3003    0.00    0.00    2.00    0.12   46.00    12  |__qemu-system-ppc
04:43:23 AM  1008         -      3005    0.00    0.00   45.00    2.81    0.00    12  |__qemu-system-ppc
04:43:23 AM  1008         -      3006    0.00    0.00    6.00    0.38    0.00    12  |__qemu-system-ppc

04:43:23 AM   UID      TGID       TID    %usr %system  %guest    %CPU  %steal   CPU  Command
04:43:24 AM  1008      3001         -    0.00    2.00   50.00    3.25   67.00    12  qemu-system-ppc
04:43:24 AM  1008         -      3001    0.00    1.00    0.00    0.06    0.00    12  |__qemu-system-ppc
04:43:24 AM  1008         -      3003    0.00    0.00    8.00    0.50   49.00    12  |__qemu-system-ppc
04:43:24 AM  1008         -      3004    0.00    0.00    2.00    0.12    5.00    12  |__qemu-system-ppc
04:43:24 AM  1008         -      3005    0.00    0.00   38.00    2.38    3.00    12  |__qemu-system-ppc
04:43:24 AM  1008         -      3006    0.00    1.00    0.00    0.06    8.00    12  |__qemu-system-ppc

04:43:24 AM   UID      TGID       TID    %usr %system  %guest    %CPU  %steal   CPU  Command
04:43:25 AM  1008      3001         -    0.00    0.00   51.00    3.19   47.00    12  qemu-system-ppc
04:43:25 AM  1008         -      3003    0.00    0.00   27.00    1.69   47.00    12  |__qemu-system-ppc
04:43:25 AM  1008         -      3004    0.00    1.00    0.00    0.06    0.00    12  |__qemu-system-ppc
04:43:25 AM  1008         -      3005    0.00    1.00   23.00    1.50    0.00    12  |__qemu-system-ppc
04:43:25 AM  1008         -      3006    0.00    0.00    2.00    0.12    0.00    12  |__qemu-system-ppc

04:43:25 AM   UID      TGID       TID    %usr %system  %guest    %CPU  %steal   CPU  Command
04:43:26 AM  1008      3001         -    0.00    0.00   51.00    3.18   53.00    12  qemu-system-ppc
04:43:26 AM  1008         -      3003    0.00    0.00    9.00    0.56   50.00    12  |__qemu-system-ppc
04:43:26 AM  1008         -      3005    0.00    0.00   16.00    1.00    3.00    12  |__qemu-system-ppc
04:43:26 AM  1008         -      3006    0.00    0.00   26.00    1.62    0.00    12  |__qemu-system-ppc

Average:      UID      TGID       TID    %usr %system  %guest    %CPU  %steal   CPU  Command
Average:     1008      3001         -    0.00    0.18   51.54    3.23   50.12     -  qemu-system-ppc
Average:     1008         -      3001    0.02    0.02    0.00    0.00    0.00     -  |__qemu-system-ppc
Average:     1008         -      3003    0.00    0.03   15.89    0.99   48.24     -  |__qemu-system-ppc
Average:     1008         -      3004    0.00    0.05   11.70    0.73    0.56     -  |__qemu-system-ppc
Average:     1008         -      3005    0.00    0.06   20.03    1.26    0.58     -  |__qemu-system-ppc
Average:     1008         -      3006    0.00    0.03    3.93    0.25    0.72     -  |__qemu-system-ppc

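A minimal sketch of how a tool such as pidstat might read the proposed value,
assuming the new field is the last one on the stat line (as in patch 1/3 of
this series) and is reported in clock ticks. The function names are
hypothetical, and on kernels without this series the last field is something
else entirely:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Extract the last space-separated field of a /proc/<pid>/stat line.
 * ASSUMPTION: the proposed guest steal time field is the final field,
 * as added by patch 1/3; mainline kernels do not have it.
 * Returns 0 on success, -1 on a malformed line.
 */
int parse_last_stat_field(const char *line, unsigned long long *out)
{
	/* comm may contain spaces and ')', so start after the last ')' */
	const char *p = strrchr(line, ')');
	const char *last;
	char *end;

	if (!p)
		return -1;
	last = strrchr(p, ' ');
	if (!last)
		return -1;
	*out = strtoull(last + 1, &end, 10);
	if (end == last + 1)
		return -1;	/* no digits after the last space */
	return 0;
}

/* Read the last stat field for one thread (e.g. one vcpu thread). */
int read_gstime_ticks(int pid, int tid, unsigned long long *out)
{
	char path[64], buf[1024];
	FILE *f;

	snprintf(path, sizeof(path), "/proc/%d/task/%d/stat", pid, tid);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return parse_last_stat_field(buf, out);
}
```

Sampling read_gstime_ticks() twice and dividing the difference by
sysconf(_SC_CLK_TCK) would give the seconds of steal time accumulated by that
thread over the interval.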

On x86, we can obtain accurate steal time information since it is just the
scheduler run_delay. However, on powerpc, obtaining accurate steal time
information is challenging. This patchset proposes a technique that allows us
to obtain a reasonable (+/- 5%) approximation. Please let me know if there are
better ways to achieve more accurate steal time accounting in the hypervisor.
I am also interested in general feedback on the overall patchset and my
approach.
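
For comparison, the scheduler run_delay of a thread is already exported as the
second field of /proc/<pid>/task/<tid>/schedstat (in nanoseconds) when
scheduler statistics are enabled. The sketch below parses that field and turns
two samples into a percentage. The function names are hypothetical, and
whether this value matches exactly what patch 2/3 accumulates for a vcpu
thread is an assumption:

```c
#include <stdio.h>

/*
 * Parse one schedstat line: "<on-cpu ns> <run_delay ns> <timeslices>".
 * The second field is the time the task spent runnable but waiting on a
 * runqueue. Returns 0 on success, -1 if three numbers are not found.
 */
int parse_run_delay_ns(const char *line, unsigned long long *run_delay_ns)
{
	unsigned long long on_cpu_ns, timeslices;

	if (sscanf(line, "%llu %llu %llu",
		   &on_cpu_ns, run_delay_ns, &timeslices) != 3)
		return -1;
	return 0;
}

/* Share of a sampling interval spent waiting, from two run_delay samples. */
double run_delay_percent(unsigned long long before_ns,
			 unsigned long long after_ns,
			 unsigned long long interval_ns)
{
	if (interval_ns == 0 || after_ns < before_ns)
		return 0.0;
	return 100.0 * (double)(after_ns - before_ns) / (double)interval_ns;
}
```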


Thanks!
- Naveen


Naveen N. Rao (3):
  procfs: add guest steal time in /proc/<pid>/stat
  kvm/x86: report guest steal time in host
  kvm/powerpc: report guest steal time in host

 arch/powerpc/include/asm/kvm_host.h     | 1 +
 arch/powerpc/kernel/asm-offsets.c       | 1 +
 arch/powerpc/kvm/book3s_hv.c            | 2 ++
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 3 +++
 arch/x86/kvm/x86.c                      | 1 +
 fs/proc/array.c                         | 6 ++++++
 include/linux/sched.h                   | 7 +++++++
 kernel/fork.c                           | 2 +-
 8 files changed, 22 insertions(+), 1 deletion(-)

-- 
2.3.5


* [RFC PATCH 1/3] procfs: add guest steal time in /proc/<pid>/stat
  2015-04-22 10:24 [RFC PATCH 0/3] Report guest steal time in host Naveen N. Rao
@ 2015-04-22 10:24 ` Naveen N. Rao
  2015-04-22 10:24 ` [RFC PATCH 2/3] kvm/x86: report guest steal time in host Naveen N. Rao
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: Naveen N. Rao @ 2015-04-22 10:24 UTC (permalink / raw)
  To: kvm, kvm-ppc; +Cc: paulus, mpe, agraf, mingo, ego

Introduce a field in /proc/<pid>/stat to expose guest steal time.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
 fs/proc/array.c       | 6 ++++++
 include/linux/sched.h | 7 +++++++
 kernel/fork.c         | 2 +-
 3 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/fs/proc/array.c b/fs/proc/array.c
index 1295a00..d86f00e 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -363,6 +363,7 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns,
 	unsigned long rsslim = 0;
 	char tcomm[sizeof(task->comm)];
 	unsigned long flags;
+	cputime_t gstime;
 
 	state = *get_task_state(task);
 	vsize = eip = esp = 0;
@@ -382,6 +383,7 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns,
 	sigemptyset(&sigcatch);
 	cutime = cstime = utime = stime = 0;
 	cgtime = gtime = 0;
+	gstime = 0;
 
 	if (lock_task_sighand(task, &flags)) {
 		struct signal_struct *sig = task->signal;
@@ -410,6 +412,7 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns,
 				min_flt += t->min_flt;
 				maj_flt += t->maj_flt;
 				gtime += task_gtime(t);
+				gstime += task_gstime(t);
 			} while_each_thread(task, t);
 
 			min_flt += sig->min_flt;
@@ -432,6 +435,7 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns,
 		maj_flt = task->maj_flt;
 		task_cputime_adjusted(task, &utime, &stime);
 		gtime = task_gtime(task);
+		gstime = task_gstime(task);
 	}
 
 	/* scale priority and nice values from timeslices to -20..20 */
@@ -505,6 +509,8 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns,
 	else
 		seq_put_decimal_ll(m, ' ', 0);
 
+	seq_put_decimal_ull(m, ' ', cputime_to_clock_t(gstime));
+
 	seq_putc(m, '\n');
 	if (mm)
 		mmput(mm);
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 0eabab9..cb57954 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1429,6 +1429,7 @@ struct task_struct {
 
 	cputime_t utime, stime, utimescaled, stimescaled;
 	cputime_t gtime;
+	cputime_t gstime;
 #ifndef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
 	struct cputime prev_cputime;
 #endif
@@ -1955,6 +1956,12 @@ static inline cputime_t task_gtime(struct task_struct *t)
 	return t->gtime;
 }
 #endif
+
+static inline cputime_t task_gstime(struct task_struct *t)
+{
+	return t->gstime;
+}
+
 extern void task_cputime_adjusted(struct task_struct *p, cputime_t *ut, cputime_t *st);
 extern void thread_group_cputime_adjusted(struct task_struct *p, cputime_t *ut, cputime_t *st);
 
diff --git a/kernel/fork.c b/kernel/fork.c
index cf65139..529ebe5 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1293,7 +1293,7 @@ static struct task_struct *copy_process(unsigned long clone_flags,
 
 	init_sigpending(&p->pending);
 
-	p->utime = p->stime = p->gtime = 0;
+	p->utime = p->stime = p->gtime = p->gstime = 0;
 	p->utimescaled = p->stimescaled = 0;
 #ifndef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
 	p->prev_cputime.utime = p->prev_cputime.stime = 0;
-- 
2.3.5


* [RFC PATCH 2/3] kvm/x86: report guest steal time in host
  2015-04-22 10:24 [RFC PATCH 0/3] Report guest steal time in host Naveen N. Rao
  2015-04-22 10:24 ` [RFC PATCH 1/3] procfs: add guest steal time in /proc/<pid>/stat Naveen N. Rao
@ 2015-04-22 10:24 ` Naveen N. Rao
  2015-04-22 10:24 ` [RFC PATCH 3/3] kvm/powerpc: " Naveen N. Rao
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: Naveen N. Rao @ 2015-04-22 10:24 UTC (permalink / raw)
  To: kvm, kvm-ppc; +Cc: paulus, mpe, agraf, mingo, ego

Report guest steal time in host task statistics. On x86, this is just
the scheduler run_delay.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
 arch/x86/kvm/x86.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 0ee725f..737b0e4 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2094,6 +2094,7 @@ static void record_steal_time(struct kvm_vcpu *vcpu)
 
 	vcpu->arch.st.steal.steal += vcpu->arch.st.accum_steal;
 	vcpu->arch.st.steal.version += 2;
+	current->gstime += vcpu->arch.st.accum_steal;
 	vcpu->arch.st.accum_steal = 0;
 
 	kvm_write_guest_cached(vcpu->kvm, &vcpu->arch.st.stime,
-- 
2.3.5


* [RFC PATCH 3/3] kvm/powerpc: report guest steal time in host
  2015-04-22 10:24 [RFC PATCH 0/3] Report guest steal time in host Naveen N. Rao
  2015-04-22 10:24 ` [RFC PATCH 1/3] procfs: add guest steal time in /proc/<pid>/stat Naveen N. Rao
  2015-04-22 10:24 ` [RFC PATCH 2/3] kvm/x86: report guest steal time in host Naveen N. Rao
@ 2015-04-22 10:24 ` Naveen N. Rao
  2015-04-22 11:05 ` [RFC PATCH 0/3] Report " Christian Borntraeger
  2015-05-06 11:55 ` [PATCH " Naveen N. Rao
  4 siblings, 0 replies; 7+ messages in thread
From: Naveen N. Rao @ 2015-04-22 10:24 UTC (permalink / raw)
  To: kvm, kvm-ppc; +Cc: paulus, mpe, agraf, mingo, ego

On powerpc, kvm tracks both the guest steal time and the time when the
guest was idle, and passes this to the guest through the DTL (dispatch
trace log). The guest accounts these entries as either steal time or idle
time based on the last running task. Since the true guest idle status is
not visible to the host, we can't accurately report the guest steal time
in the host.

However, tracking the guest vcpu cede status can get us a reasonable
(within 5% variation) vcpu steal time, since guest vcpus cede the
processor on entering the idle task. To do this, we introduce a new
field, ceded_st, in the kvm_vcpu_arch structure to accurately track the
guest vcpu cede status (this is needed since the existing ceded field is
modified before we can use it). During DTL entry creation, we check this
flag and account the time as stolen if the guest vcpu had not ceded.

Tests show that the steal time being reported in the host with this
approach is around 5% higher than the steal time shown in the guest.
Please let me know if there are ways to get more accurate steal time
information in the host.
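
The accounting rule described above can be modelled in user space as follows.
This is only an illustrative sketch: struct vcpu_model and account_stolen()
are hypothetical names, the fields mirror the ones touched by this patch, and
the locking done in the kernel is omitted:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model of the per-vcpu state; not kernel code. */
struct vcpu_model {
	uint64_t busy_stolen;	/* stolen time accumulated while the vcpu was busy */
	bool ceded_st;		/* true if the guest vcpu last ceded the processor */
	uint64_t gstime;	/* steal time reported to the host */
};

/*
 * Model of the check made during DTL entry creation: fold in busy_stolen,
 * then charge the total as steal time only if the vcpu had not ceded.
 * Returns the total stolen time for this entry.
 */
uint64_t account_stolen(struct vcpu_model *v, uint64_t stolen)
{
	stolen += v->busy_stolen;
	v->busy_stolen = 0;
	if (!v->ceded_st && stolen)
		v->gstime += stolen;
	return stolen;
}
```

With ceded_st set, the same interval is treated as guest idle time and gstime
is left unchanged.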

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/kvm_host.h     | 1 +
 arch/powerpc/kernel/asm-offsets.c       | 1 +
 arch/powerpc/kvm/book3s_hv.c            | 2 ++
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 3 +++
 4 files changed, 7 insertions(+)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 8ef0512..7db48c4 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -655,6 +655,7 @@ struct kvm_vcpu_arch {
 	u64 busy_preempt;
 
 	u32 emul_inst;
+	u8 ceded_st;
 #endif
 };
 
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 4717859..765c7c4 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -521,6 +521,7 @@ int main(void)
 	DEFINE(VCPU_DEC_EXPIRES, offsetof(struct kvm_vcpu, arch.dec_expires));
 	DEFINE(VCPU_PENDING_EXC, offsetof(struct kvm_vcpu, arch.pending_exceptions));
 	DEFINE(VCPU_CEDED, offsetof(struct kvm_vcpu, arch.ceded));
+	DEFINE(VCPU_CEDED_ST, offsetof(struct kvm_vcpu, arch.ceded_st));
 	DEFINE(VCPU_PRODDED, offsetof(struct kvm_vcpu, arch.prodded));
 	DEFINE(VCPU_MMCR, offsetof(struct kvm_vcpu, arch.mmcr));
 	DEFINE(VCPU_PMC, offsetof(struct kvm_vcpu, arch.pmc));
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index de74756..ad7c0e3 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -545,6 +545,8 @@ static void kvmppc_create_dtl_entry(struct kvm_vcpu *vcpu,
 	spin_lock_irq(&vcpu->arch.tbacct_lock);
 	stolen += vcpu->arch.busy_stolen;
 	vcpu->arch.busy_stolen = 0;
+	if (!vcpu->arch.ceded_st && stolen)
+		(pid_task(vcpu->pid, PIDTYPE_PID))->gstime += stolen;
 	spin_unlock_irq(&vcpu->arch.tbacct_lock);
 	if (!dt || !vpa)
 		return;
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 6cbf163..28f304e 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -873,6 +873,7 @@ deliver_guest_interrupt:
 fast_guest_return:
 	li	r0,0
 	stb	r0,VCPU_CEDED(r4)	/* cancel cede */
+	stb	r0,VCPU_CEDED_ST(r4)	/* cancel cede */
 	mtspr	SPRN_HSRR0,r10
 	mtspr	SPRN_HSRR1,r11
 
@@ -1889,6 +1890,7 @@ _GLOBAL(kvmppc_h_cede)
 	std	r11,VCPU_MSR(r3)
 	li	r0,1
 	stb	r0,VCPU_CEDED(r3)
+	stb	r0,VCPU_CEDED_ST(r3)
 	sync			/* order setting ceded vs. testing prodded */
 	lbz	r5,VCPU_PRODDED(r3)
 	cmpwi	r5,0
@@ -2052,6 +2054,7 @@ kvm_cede_prodded:
 	stb	r0,VCPU_PRODDED(r3)
 	sync			/* order testing prodded vs. clearing ceded */
 	stb	r0,VCPU_CEDED(r3)
+	stb	r0,VCPU_CEDED_ST(r3)
 	li	r3,H_SUCCESS
 	blr
 
-- 
2.3.5


* Re: [RFC PATCH 0/3] Report guest steal time in host
  2015-04-22 10:24 [RFC PATCH 0/3] Report guest steal time in host Naveen N. Rao
                   ` (2 preceding siblings ...)
  2015-04-22 10:24 ` [RFC PATCH 3/3] kvm/powerpc: " Naveen N. Rao
@ 2015-04-22 11:05 ` Christian Borntraeger
  2015-04-22 11:39   ` Naveen N. Rao
  2015-05-06 11:55 ` [PATCH " Naveen N. Rao
  4 siblings, 1 reply; 7+ messages in thread
From: Christian Borntraeger @ 2015-04-22 11:05 UTC (permalink / raw)
  To: Naveen N. Rao, kvm, kvm-ppc; +Cc: paulus, mpe, agraf, mingo, ego, linux-s390

On 22.04.2015 at 12:24, Naveen N. Rao wrote:
> On x86, we can obtain accurate steal time information since it is just the
> scheduler run_delay. However, on powerpc, obtaining accurate steal time
> information is challenging. This patchset proposes a technique that allows us
> to obtain a reasonable (+/- 5%) approximation. Please suggest if there are
> better ways to achieve more accurate steal time accounting in the hypervisor. I
> am also interested in general feedback on the overall patchset and my approach
> for the same.
> 
> 
> Thanks!
> - Naveen

When adding new interfaces regarding steal time etc., can you add the s390 list to CC?
s390 also does steal time calculation with hardware support: we have two types of
timers. One compares against the TOD clock, which steps with the wall clock; the other
steps only if the guest CPU is backed by a real CPU. So, _IF_ this interface is
introduced, it would be good to have some s390 point of view included in that discussion.

Christian


* Re: [RFC PATCH 0/3] Report guest steal time in host
  2015-04-22 11:05 ` [RFC PATCH 0/3] Report " Christian Borntraeger
@ 2015-04-22 11:39   ` Naveen N. Rao
  0 siblings, 0 replies; 7+ messages in thread
From: Naveen N. Rao @ 2015-04-22 11:39 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: kvm, kvm-ppc, paulus, mpe, agraf, mingo, ego, linux-s390

On 2015/04/22 01:05PM, Christian Borntraeger wrote:
> On 22.04.2015 at 12:24, Naveen N. Rao wrote:
> > 
> > 04:43:20 AM   UID      TGID       TID    %usr %system  %guest    %CPU  %steal   CPU  Command
> > 04:43:22 AM  1008      3001         -    0.00    0.00   54.21    3.39   45.79    12  qemu-system-ppc
> > 04:43:22 AM  1008         -      3005    0.00    0.00   54.21    3.39    0.00    12  |__qemu-system-ppc
> > 
> > 04:43:22 AM   UID      TGID       TID    %usr %system  %guest    %CPU  %steal   CPU  Command
> > 04:43:23 AM  1008      3001         -    0.00    0.00   52.00    3.25   46.00    12  qemu-system-ppc
> > 04:43:23 AM  1008         -      3003    0.00    0.00    2.00    0.12   46.00    12  |__qemu-system-ppc
> > 04:43:23 AM  1008         -      3005    0.00    0.00   45.00    2.81    0.00    12  |__qemu-system-ppc
> > 04:43:23 AM  1008         -      3006    0.00    0.00    6.00    0.38    0.00    12  |__qemu-system-ppc
> > 
> > 04:43:23 AM   UID      TGID       TID    %usr %system  %guest    %CPU  %steal   CPU  Command
> > 04:43:24 AM  1008      3001         -    0.00    2.00   50.00    3.25   67.00    12  qemu-system-ppc
> > 04:43:24 AM  1008         -      3001    0.00    1.00    0.00    0.06    0.00    12  |__qemu-system-ppc
> > 04:43:24 AM  1008         -      3003    0.00    0.00    8.00    0.50   49.00    12  |__qemu-system-ppc
> > 04:43:24 AM  1008         -      3004    0.00    0.00    2.00    0.12    5.00    12  |__qemu-system-ppc
> > 04:43:24 AM  1008         -      3005    0.00    0.00   38.00    2.38    3.00    12  |__qemu-system-ppc
> > 04:43:24 AM  1008         -      3006    0.00    1.00    0.00    0.06    8.00    12  |__qemu-system-ppc
> > 
> > 04:43:24 AM   UID      TGID       TID    %usr %system  %guest    %CPU  %steal   CPU  Command
> > 04:43:25 AM  1008      3001         -    0.00    0.00   51.00    3.19   47.00    12  qemu-system-ppc
> > 04:43:25 AM  1008         -      3003    0.00    0.00   27.00    1.69   47.00    12  |__qemu-system-ppc
> > 04:43:25 AM  1008         -      3004    0.00    1.00    0.00    0.06    0.00    12  |__qemu-system-ppc
> > 04:43:25 AM  1008         -      3005    0.00    1.00   23.00    1.50    0.00    12  |__qemu-system-ppc
> > 04:43:25 AM  1008         -      3006    0.00    0.00    2.00    0.12    0.00    12  |__qemu-system-ppc
> > 
> > 04:43:25 AM   UID      TGID       TID    %usr %system  %guest    %CPU  %steal   CPU  Command
> > 04:43:26 AM  1008      3001         -    0.00    0.00   51.00    3.18   53.00    12  qemu-system-ppc
> > 04:43:26 AM  1008         -      3003    0.00    0.00    9.00    0.56   50.00    12  |__qemu-system-ppc
> > 04:43:26 AM  1008         -      3005    0.00    0.00   16.00    1.00    3.00    12  |__qemu-system-ppc
> > 04:43:26 AM  1008         -      3006    0.00    0.00   26.00    1.62    0.00    12  |__qemu-system-ppc
> > 
> > Average:      UID      TGID       TID    %usr %system  %guest    %CPU  %steal   CPU  Command
> > Average:     1008      3001         -    0.00    0.18   51.54    3.23   50.12     -  qemu-system-ppc
> > Average:     1008         -      3001    0.02    0.02    0.00    0.00    0.00     -  |__qemu-system-ppc
> > Average:     1008         -      3003    0.00    0.03   15.89    0.99   48.24     -  |__qemu-system-ppc
> > Average:     1008         -      3004    0.00    0.05   11.70    0.73    0.56     -  |__qemu-system-ppc
> > Average:     1008         -      3005    0.00    0.06   20.03    1.26    0.58     -  |__qemu-system-ppc
> > Average:     1008         -      3006    0.00    0.03    3.93    0.25    0.72     -  |__qemu-system-ppc
> > 
> > 
> > On x86, we can obtain accurate steal time information since it is just the
> > scheduler run_delay. However, on powerpc, obtaining accurate steal time
> > information is challenging. This patchset proposes a technique that allows us
> > to obtain a reasonable (+/- 5%) approximation. Please suggest if there are
> > better ways to achieve more accurate steal time accounting in the hypervisor. I
> > am also interested in general feedback on the overall patchset and my approach
> > for the same.
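
For readers who want to look at the x86 quantity without this patchset: the scheduler run_delay referred to above is already exported per thread as the second field of /proc/<pid>/task/<tid>/schedstat, on kernels built with scheduler info accounting. What follows is only a minimal sketch, assuming the three-field layout (time on cpu, time waiting on a runqueue, number of timeslices) with times in nanoseconds. The function names are illustrative and are not part of the patches, and raw run_delay is not claimed to match the new /proc/<pid>/stat field exactly.

```python
import os


def parse_schedstat(text):
    """Split one schedstat line into (run_ns, run_delay_ns, timeslices)."""
    run_ns, run_delay_ns, timeslices = (int(f) for f in text.split())
    return run_ns, run_delay_ns, timeslices


def thread_run_delays(pid):
    """Map each thread ID of `pid` to its accumulated run_delay."""
    delays = {}
    task_dir = "/proc/%d/task" % pid
    for tid in os.listdir(task_dir):
        try:
            with open(os.path.join(task_dir, tid, "schedstat")) as f:
                delays[int(tid)] = parse_schedstat(f.read())[1]
        except (OSError, ValueError):
            continue  # thread exited, or schedstat is not available
    return delays


def delay_percent(before_ns, after_ns, interval_s):
    """Share of the interval spent runnable but waiting, in percent."""
    return 100.0 * (after_ns - before_ns) / (interval_s * 1e9)
```

Sampling thread_run_delays() twice for a qemu process and passing each thread's two values to delay_percent() gives a rough per-vcpu-thread figure comparable in spirit to the %steal column above.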
> > 
> > 
> > Thanks!
> > - Naveen
> 
> When adding new interfaces regarding steal time etc., can you add the s390 list to CC?
> s390 also does steal time calculation with hardware support: we have two types of
> timers. One compares against the TOD clock, which steps with the wall clock; the other
> steps only if the guest CPU is backed by a real CPU. So, _IF_ this interface is
> introduced it would be good to have some s390 point of view included in that discussion.

Hi Christian,
Sure. I largely focused on powerpc and x86 so far. I will be glad to get 
feedback from s390 as well as other architectures on this.

Thanks,
Naveen


* Re: [PATCH 0/3] Report guest steal time in host
  2015-04-22 10:24 [RFC PATCH 0/3] Report guest steal time in host Naveen N. Rao
                   ` (3 preceding siblings ...)
  2015-04-22 11:05 ` [RFC PATCH 0/3] Report " Christian Borntraeger
@ 2015-05-06 11:55 ` Naveen N. Rao
  4 siblings, 0 replies; 7+ messages in thread
From: Naveen N. Rao @ 2015-05-06 11:55 UTC (permalink / raw)
  To: linux-kernel, linux-arch, kvm, linuxppc-dev, linux-s390
  Cc: paulus, mpe, agraf, mingo, ego

Arrgh! Sorry about the headers. Please ignore this set. Will repost in a 
separate thread.


- Naveen


On 2015/05/06 04:28PM, Naveen N Rao wrote:
> Steal time accounts for the time during which a guest vcpu was ready to
> run, but was not scheduled to run by the hypervisor. This is particularly
> relevant in cloud environments, where customers would want to use this as an
> indicator that their guests are being throttled. However, as it stands today,
> guest steal time information is not visible from the hypervisor.
> 
> For cloud service providers, this is problematic since they would want to
> overcommit cpu resources to achieve optimum resource utilization while at the
> same time ensuring guests are not throttled. It is useful for service providers
> to have access to the guest steal time data so that they can base their
> overcommit/guest packing decisions on this. Higher guest steal time can be used
> as a trigger to change how the guests are scheduled, or even migrate guests out
> of a system.
> 
> This patchset attempts to make the guest steal times available in the host.
> This is achieved by introducing a new field in per-task statistics
> (/proc/<pid>/stat and /proc/<pid>/task/<pid>/stat) to accumulate per-vcpu steal
> time. Programs (such as pidstat) can then be enhanced to report this
> information on a per-thread basis.
> 
> This should also work for nested virtualization: steal time information for the
> guest is readable via /proc/stat, while steal time information for guests
> hosted on this hypervisor is readable via /proc/<pid>/task/*/stat.
> 
> Also, mpstat always shows steal time information for the current (self) guest
> on a per-cpu basis, and pidstat can be enhanced to report the same for the
> hosted guests on a per-vcpu basis.
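
The guest-side (self) number comes from the "cpu" lines of /proc/stat, where steal is the eighth counter after the label. A minimal sketch of computing a percentage from two samples, assuming the field order user, nice, system, idle, iowait, irq, softirq, steal, guest, guest_nice and a kernel that reports at least the first eight. The helper names are made up for illustration, and the rounding will not match mpstat exactly.

```python
def parse_cpu_line(line):
    """Return (label, ticks) for one 'cpu' line of /proc/stat."""
    fields = line.split()
    return fields[0], [int(v) for v in fields[1:]]


def read_cpu_ticks(path="/proc/stat"):
    """Read all 'cpu' and 'cpuN' lines into a {label: ticks} dict."""
    with open(path) as f:
        return dict(parse_cpu_line(l) for l in f if l.startswith("cpu"))


def steal_percent(before, after):
    """Percentage of elapsed ticks accounted as steal between two samples."""
    # guest and guest_nice are already included in user and nice,
    # so only the first eight counters are summed for the total.
    deltas = [b - a for a, b in zip(before[:8], after[:8])]
    total = sum(deltas)
    return 100.0 * deltas[7] / total if total else 0.0
```

Calling read_cpu_ticks() twice, one second apart, and passing the two "cpu0" lists to steal_percent() should give a figure close to the per-cpu %steal column that mpstat prints.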
> 
> As an example:
> 
> Guest (self) steal time information using mpstat:
> ------------------------------------------------
> 
> mpstat is run from within the guest.
> 
> [root@rhel7-img ~]# mpstat -P ALL 1
> Linux 3.19.0nnr (rhel7-img) 	04/15/2015 	_ppc64_	(4 CPU)
> 
> 03:13:23 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
> 03:13:24 PM  all   12.25    0.00    1.25    0.00    1.00    2.25   13.75    0.00    0.00   69.50
> 03:13:24 PM    0   46.53    0.00    0.00    0.00    0.00    4.95   45.54    0.00    0.00    2.97
> 03:13:24 PM    1    0.00    0.00    0.00    0.00    0.00    4.04    3.03    0.00    0.00   92.93
> 03:13:24 PM    2    0.00    0.00    0.00    0.00    3.96    0.99    2.97    0.00    0.00   92.08
> 03:13:24 PM    3    3.00    0.00    4.00    0.00    0.00    0.00    4.00    0.00    0.00   89.00
> 
> 03:13:24 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
> 03:13:25 PM  all   12.59    0.00    0.00    0.00    0.00    0.25   12.35    0.00    0.00   74.81
> 03:13:25 PM    0   50.00    0.00    0.00    0.00    0.00    0.98   49.02    0.00    0.00    0.00
> 03:13:25 PM    1    0.98    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00   99.02
> 03:13:25 PM    2    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
> 03:13:25 PM    3    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
> 
> 03:13:25 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
> 03:13:26 PM  all   12.99    0.00    0.00    0.00    0.25    0.00   12.75    0.00    0.00   74.02
> 03:13:26 PM    0   51.96    0.00    0.00    0.00    0.00    0.00   48.04    0.00    0.00    0.00
> 03:13:26 PM    1    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
> 03:13:26 PM    2    0.00    0.00    0.00    0.00    0.98    0.00    2.94    0.00    0.00   96.08
> 03:13:26 PM    3    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
> 
> 03:13:26 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
> 03:13:27 PM  all   12.53    0.00    1.00    0.25    0.00    0.25   12.03    0.00    0.00   73.93
> 03:13:27 PM    0   51.02    0.00    0.00    0.00    0.00    0.00   48.98    0.00    0.00    0.00
> 03:13:27 PM    1    0.00    0.00    4.04    0.00    0.00    0.00    0.00    0.00    0.00   95.96
> 03:13:27 PM    2    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
> 03:13:27 PM    3    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
> 
> Average:     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
> Average:     all   12.91    0.00    0.54    0.01    0.04    0.12   12.39    0.00    0.00   74.00
> Average:       0   51.36    0.00    0.03    0.00    0.03    0.26   48.27    0.00    0.00    0.05
> Average:       1    0.02    0.00    1.54    0.02    0.02    0.15    0.36    0.00    0.00   97.89
> Average:       2    0.00    0.00    0.52    0.00    0.09    0.02    0.36    0.00    0.00   99.02
> Average:       3    0.05    0.00    0.07    0.00    0.02    0.09    0.34    0.00    0.00   99.43
> 
> Steal time information for hosted guests in host using (locally modified) pidstat:
> ---------------------------------------------------------------------------------
> 
> pidstat is being run in the host.
> 
> [naveen@xxxxxxxxxx sysstat]$ ./pidstat -C qemu -tIu 1
> Linux 3.19.0nnr (xxxxxxxxxx.in.ibm.com) 	04/15/2015 	_ppc64_	(64 CPU)
> 
> 04:43:20 AM   UID      TGID       TID    %usr %system  %guest    %CPU  %steal   CPU  Command
> 04:43:22 AM  1008      3001         -    0.00    0.00   54.21    3.39   45.79    12  qemu-system-ppc
> 04:43:22 AM  1008         -      3005    0.00    0.00   54.21    3.39    0.00    12  |__qemu-system-ppc
> 
> 04:43:22 AM   UID      TGID       TID    %usr %system  %guest    %CPU  %steal   CPU  Command
> 04:43:23 AM  1008      3001         -    0.00    0.00   52.00    3.25   46.00    12  qemu-system-ppc
> 04:43:23 AM  1008         -      3003    0.00    0.00    2.00    0.12   46.00    12  |__qemu-system-ppc
> 04:43:23 AM  1008         -      3005    0.00    0.00   45.00    2.81    0.00    12  |__qemu-system-ppc
> 04:43:23 AM  1008         -      3006    0.00    0.00    6.00    0.38    0.00    12  |__qemu-system-ppc
> 
> 04:43:23 AM   UID      TGID       TID    %usr %system  %guest    %CPU  %steal   CPU  Command
> 04:43:24 AM  1008      3001         -    0.00    2.00   50.00    3.25   67.00    12  qemu-system-ppc
> 04:43:24 AM  1008         -      3001    0.00    1.00    0.00    0.06    0.00    12  |__qemu-system-ppc
> 04:43:24 AM  1008         -      3003    0.00    0.00    8.00    0.50   49.00    12  |__qemu-system-ppc
> 04:43:24 AM  1008         -      3004    0.00    0.00    2.00    0.12    5.00    12  |__qemu-system-ppc
> 04:43:24 AM  1008         -      3005    0.00    0.00   38.00    2.38    3.00    12  |__qemu-system-ppc
> 04:43:24 AM  1008         -      3006    0.00    1.00    0.00    0.06    8.00    12  |__qemu-system-ppc
> 
> 04:43:24 AM   UID      TGID       TID    %usr %system  %guest    %CPU  %steal   CPU  Command
> 04:43:25 AM  1008      3001         -    0.00    0.00   51.00    3.19   47.00    12  qemu-system-ppc
> 04:43:25 AM  1008         -      3003    0.00    0.00   27.00    1.69   47.00    12  |__qemu-system-ppc
> 04:43:25 AM  1008         -      3004    0.00    1.00    0.00    0.06    0.00    12  |__qemu-system-ppc
> 04:43:25 AM  1008         -      3005    0.00    1.00   23.00    1.50    0.00    12  |__qemu-system-ppc
> 04:43:25 AM  1008         -      3006    0.00    0.00    2.00    0.12    0.00    12  |__qemu-system-ppc
> 
> 04:43:25 AM   UID      TGID       TID    %usr %system  %guest    %CPU  %steal   CPU  Command
> 04:43:26 AM  1008      3001         -    0.00    0.00   51.00    3.18   53.00    12  qemu-system-ppc
> 04:43:26 AM  1008         -      3003    0.00    0.00    9.00    0.56   50.00    12  |__qemu-system-ppc
> 04:43:26 AM  1008         -      3005    0.00    0.00   16.00    1.00    3.00    12  |__qemu-system-ppc
> 04:43:26 AM  1008         -      3006    0.00    0.00   26.00    1.62    0.00    12  |__qemu-system-ppc
> 
> Average:      UID      TGID       TID    %usr %system  %guest    %CPU  %steal   CPU  Command
> Average:     1008      3001         -    0.00    0.18   51.54    3.23   50.12     -  qemu-system-ppc
> Average:     1008         -      3001    0.02    0.02    0.00    0.00    0.00     -  |__qemu-system-ppc
> Average:     1008         -      3003    0.00    0.03   15.89    0.99   48.24     -  |__qemu-system-ppc
> Average:     1008         -      3004    0.00    0.05   11.70    0.73    0.56     -  |__qemu-system-ppc
> Average:     1008         -      3005    0.00    0.06   20.03    1.26    0.58     -  |__qemu-system-ppc
> Average:     1008         -      3006    0.00    0.03    3.93    0.25    0.72     -  |__qemu-system-ppc
> 
> 
> Thanks!
> - Naveen
> 
> ------
> Changes since RFC: Updated description to clarify a few aspects that I got
> questions about. No code changes.
> 
> 
> Naveen N. Rao (3):
>   procfs: add guest steal time in /proc/<pid>/stat
>   kvm/x86: report guest steal time in host
>   kvm/powerpc: report guest steal time in host
> 
>  arch/powerpc/include/asm/kvm_host.h     | 1 +
>  arch/powerpc/kernel/asm-offsets.c       | 1 +
>  arch/powerpc/kvm/book3s_hv.c            | 2 ++
>  arch/powerpc/kvm/book3s_hv_rmhandlers.S | 3 +++
>  arch/x86/kvm/x86.c                      | 1 +
>  fs/proc/array.c                         | 6 ++++++
>  include/linux/sched.h                   | 7 +++++++
>  kernel/fork.c                           | 2 +-
>  8 files changed, 22 insertions(+), 1 deletion(-)
> 
> -- 
> 2.3.5
> 


