* [PATCH 0/3] Report guest steal time in host
@ 2015-05-06 11:56 ` Naveen N. Rao
  0 siblings, 0 replies; 20+ messages in thread
From: Naveen N. Rao @ 2015-05-06 11:56 UTC (permalink / raw)
  To: linux-kernel, linux-arch, kvm, linuxppc-dev, linux-s390
  Cc: paulus, mpe, agraf, mingo, ego, warrier

Steal time accounts for the time during which a guest vcpu was ready to
run, but was not scheduled to run by the hypervisor. This is particularly
relevant in cloud environments, where customers want to use it as an
indicator that their guests are being throttled. However, as it stands today,
guest steal time information is not visible from the hypervisor.

For cloud service providers, this is problematic since they would want to
overcommit cpu resources to achieve optimum resource utilization while at the
same time ensuring guests are not throttled. It is useful for service providers
to have access to the guest steal time data so that they can base their
overcommit/guest packing decisions on this. Higher guest steal time can be used
as a trigger to change how the guests are scheduled, or even migrate guests out
of a system.

This patchset attempts to make guest steal time available in the host.
This is achieved by introducing a new field in per-task statistics
(/proc/<pid>/stat and /proc/<pid>/task/<tid>/stat) to accumulate per-vcpu steal
time. Programs (such as pidstat) can then be enhanced to report this
information on a per-thread basis.
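
For illustration, a consumer such as pidstat could pick up the new value
roughly as follows. This is only a sketch: it assumes the steal time is the
last field on the stat line, as in patch 1/3 of this series, and the helper
name stat_last_field() is made up for the example. Scanning from the end of
the line avoids having to parse the comm field, which may contain spaces.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: extract the last space-separated field of a
 * /proc/<pid>/stat line as an unsigned value. With this series applied,
 * that field is assumed to be the accumulated guest steal time in clock
 * ticks. Returns 0 on success, -1 on malformed input.
 */
int stat_last_field(const char *line, unsigned long long *out)
{
	size_t len;
	const char *p;
	char *end;

	assert(line && out);

	/* Ignore the trailing newline and any trailing spaces. */
	len = strlen(line);
	while (len > 0 && (line[len - 1] == '\n' || line[len - 1] == ' '))
		len--;
	if (len == 0)
		return -1;

	/* Walk back to the first character of the last field. */
	p = line + len;
	while (p > line && p[-1] != ' ')
		p--;

	/* Only accept a plain decimal number. */
	if (*p < '0' || *p > '9')
		return -1;

	errno = 0;
	*out = strtoull(p, &end, 10);
	if (errno || end != line + len)
		return -1;
	return 0;
}
```

The value is printed through cputime_to_clock_t(), so user space would
convert it to seconds with sysconf(_SC_CLK_TCK).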

This should also work for nested virtualization: steal time information for the
guest is readable via /proc/stat, while steal time information for guests
hosted on this hypervisor is readable via /proc/<pid>/task/*/stat.
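
As a rough sketch of the /proc/stat side (which already exists today): steal
is the eighth value on each "cpu" line, after user, nice, system, idle,
iowait, irq and softirq. The function name proc_stat_steal() is made up for
this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: pull the steal counter (in clock ticks) out of a
 * "cpu" or "cpuN" line from /proc/stat. Returns 0 on success, -1 if the
 * line is not a cpu line or has fewer than eight counters.
 */
int proc_stat_steal(const char *line, unsigned long long *steal)
{
	unsigned long long user, nice, sys, idle, iowait, irq, softirq;
	const char *p;

	assert(line && steal);

	if (strncmp(line, "cpu", 3) != 0)
		return -1;

	/* Skip the "cpu"/"cpuN" label; the counters follow the first space. */
	p = strchr(line, ' ');
	if (!p)
		return -1;

	if (sscanf(p, "%llu %llu %llu %llu %llu %llu %llu %llu",
		   &user, &nice, &sys, &idle, &iowait, &irq, &softirq,
		   steal) != 8)
		return -1;
	return 0;
}
```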

Also, mpstat always shows steal time information for the current (self) guest
on a per-cpu basis, and pidstat can be enhanced to report the same for the
hosted guests on a per-vcpu basis.
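
A per-interval percentage can be derived from two samples of the tick
counter. The sketch below shows one plausible calculation; it is an
assumption for illustration and not necessarily how the locally modified
pidstat computes its %steal column. The name steal_percent() is hypothetical.

```c
#include <assert.h>

/*
 * Hypothetical helper: percentage of one CPU's time that was stolen between
 * two samples of a tick counter taken interval_sec seconds apart.
 * ticks_per_sec would normally come from sysconf(_SC_CLK_TCK).
 */
double steal_percent(unsigned long long prev_ticks,
		     unsigned long long cur_ticks,
		     double interval_sec, long ticks_per_sec)
{
	assert(interval_sec > 0.0 && ticks_per_sec > 0);

	/* A counter that went backwards (e.g. thread restarted) yields 0. */
	if (cur_ticks < prev_ticks)
		return 0.0;

	return 100.0 * (double)(cur_ticks - prev_ticks) /
	       (interval_sec * (double)ticks_per_sec);
}
```

For example, 46 ticks of steal over a one second interval at 100 ticks per
second works out to 46%.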

As an example:

Guest (self) steal time information using mpstat:
------------------------------------------------

mpstat is run from within the guest.

[root@rhel7-img ~]# mpstat -P ALL 1
Linux 3.19.0nnr (rhel7-img) 	04/15/2015 	_ppc64_	(4 CPU)

03:13:23 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
03:13:24 PM  all   12.25    0.00    1.25    0.00    1.00    2.25   13.75    0.00    0.00   69.50
03:13:24 PM    0   46.53    0.00    0.00    0.00    0.00    4.95   45.54    0.00    0.00    2.97
03:13:24 PM    1    0.00    0.00    0.00    0.00    0.00    4.04    3.03    0.00    0.00   92.93
03:13:24 PM    2    0.00    0.00    0.00    0.00    3.96    0.99    2.97    0.00    0.00   92.08
03:13:24 PM    3    3.00    0.00    4.00    0.00    0.00    0.00    4.00    0.00    0.00   89.00

03:13:24 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
03:13:25 PM  all   12.59    0.00    0.00    0.00    0.00    0.25   12.35    0.00    0.00   74.81
03:13:25 PM    0   50.00    0.00    0.00    0.00    0.00    0.98   49.02    0.00    0.00    0.00
03:13:25 PM    1    0.98    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00   99.02
03:13:25 PM    2    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
03:13:25 PM    3    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00

03:13:25 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
03:13:26 PM  all   12.99    0.00    0.00    0.00    0.25    0.00   12.75    0.00    0.00   74.02
03:13:26 PM    0   51.96    0.00    0.00    0.00    0.00    0.00   48.04    0.00    0.00    0.00
03:13:26 PM    1    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
03:13:26 PM    2    0.00    0.00    0.00    0.00    0.98    0.00    2.94    0.00    0.00   96.08
03:13:26 PM    3    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00

03:13:26 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
03:13:27 PM  all   12.53    0.00    1.00    0.25    0.00    0.25   12.03    0.00    0.00   73.93
03:13:27 PM    0   51.02    0.00    0.00    0.00    0.00    0.00   48.98    0.00    0.00    0.00
03:13:27 PM    1    0.00    0.00    4.04    0.00    0.00    0.00    0.00    0.00    0.00   95.96
03:13:27 PM    2    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
03:13:27 PM    3    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00

Average:     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
Average:     all   12.91    0.00    0.54    0.01    0.04    0.12   12.39    0.00    0.00   74.00
Average:       0   51.36    0.00    0.03    0.00    0.03    0.26   48.27    0.00    0.00    0.05
Average:       1    0.02    0.00    1.54    0.02    0.02    0.15    0.36    0.00    0.00   97.89
Average:       2    0.00    0.00    0.52    0.00    0.09    0.02    0.36    0.00    0.00   99.02
Average:       3    0.05    0.00    0.07    0.00    0.02    0.09    0.34    0.00    0.00   99.43

Steal time information for hosted guests in host using (locally modified) pidstat:
---------------------------------------------------------------------------------

pidstat is being run in the host.

[naveen@xxxxxxxxxx sysstat]$ ./pidstat -C qemu -tIu 1
Linux 3.19.0nnr (xxxxxxxxxx.in.ibm.com) 	04/15/2015 	_ppc64_	(64 CPU)

04:43:20 AM   UID      TGID       TID    %usr %system  %guest    %CPU  %steal   CPU  Command
04:43:22 AM  1008      3001         -    0.00    0.00   54.21    3.39   45.79    12  qemu-system-ppc
04:43:22 AM  1008         -      3005    0.00    0.00   54.21    3.39    0.00    12  |__qemu-system-ppc

04:43:22 AM   UID      TGID       TID    %usr %system  %guest    %CPU  %steal   CPU  Command
04:43:23 AM  1008      3001         -    0.00    0.00   52.00    3.25   46.00    12  qemu-system-ppc
04:43:23 AM  1008         -      3003    0.00    0.00    2.00    0.12   46.00    12  |__qemu-system-ppc
04:43:23 AM  1008         -      3005    0.00    0.00   45.00    2.81    0.00    12  |__qemu-system-ppc
04:43:23 AM  1008         -      3006    0.00    0.00    6.00    0.38    0.00    12  |__qemu-system-ppc

04:43:23 AM   UID      TGID       TID    %usr %system  %guest    %CPU  %steal   CPU  Command
04:43:24 AM  1008      3001         -    0.00    2.00   50.00    3.25   67.00    12  qemu-system-ppc
04:43:24 AM  1008         -      3001    0.00    1.00    0.00    0.06    0.00    12  |__qemu-system-ppc
04:43:24 AM  1008         -      3003    0.00    0.00    8.00    0.50   49.00    12  |__qemu-system-ppc
04:43:24 AM  1008         -      3004    0.00    0.00    2.00    0.12    5.00    12  |__qemu-system-ppc
04:43:24 AM  1008         -      3005    0.00    0.00   38.00    2.38    3.00    12  |__qemu-system-ppc
04:43:24 AM  1008         -      3006    0.00    1.00    0.00    0.06    8.00    12  |__qemu-system-ppc

04:43:24 AM   UID      TGID       TID    %usr %system  %guest    %CPU  %steal   CPU  Command
04:43:25 AM  1008      3001         -    0.00    0.00   51.00    3.19   47.00    12  qemu-system-ppc
04:43:25 AM  1008         -      3003    0.00    0.00   27.00    1.69   47.00    12  |__qemu-system-ppc
04:43:25 AM  1008         -      3004    0.00    1.00    0.00    0.06    0.00    12  |__qemu-system-ppc
04:43:25 AM  1008         -      3005    0.00    1.00   23.00    1.50    0.00    12  |__qemu-system-ppc
04:43:25 AM  1008         -      3006    0.00    0.00    2.00    0.12    0.00    12  |__qemu-system-ppc

04:43:25 AM   UID      TGID       TID    %usr %system  %guest    %CPU  %steal   CPU  Command
04:43:26 AM  1008      3001         -    0.00    0.00   51.00    3.18   53.00    12  qemu-system-ppc
04:43:26 AM  1008         -      3003    0.00    0.00    9.00    0.56   50.00    12  |__qemu-system-ppc
04:43:26 AM  1008         -      3005    0.00    0.00   16.00    1.00    3.00    12  |__qemu-system-ppc
04:43:26 AM  1008         -      3006    0.00    0.00   26.00    1.62    0.00    12  |__qemu-system-ppc

Average:      UID      TGID       TID    %usr %system  %guest    %CPU  %steal   CPU  Command
Average:     1008      3001         -    0.00    0.18   51.54    3.23   50.12     -  qemu-system-ppc
Average:     1008         -      3001    0.02    0.02    0.00    0.00    0.00     -  |__qemu-system-ppc
Average:     1008         -      3003    0.00    0.03   15.89    0.99   48.24     -  |__qemu-system-ppc
Average:     1008         -      3004    0.00    0.05   11.70    0.73    0.56     -  |__qemu-system-ppc
Average:     1008         -      3005    0.00    0.06   20.03    1.26    0.58     -  |__qemu-system-ppc
Average:     1008         -      3006    0.00    0.03    3.93    0.25    0.72     -  |__qemu-system-ppc


- Naveen

------
Changes since RFC: Updated description to clarify a few aspects that I got
questions about. No code changes.


Naveen N. Rao (3):
  procfs: add guest steal time in /proc/<pid>/stat
  kvm/x86: report guest steal time in host
  kvm/powerpc: report guest steal time in host

 arch/powerpc/include/asm/kvm_host.h     | 1 +
 arch/powerpc/kernel/asm-offsets.c       | 1 +
 arch/powerpc/kvm/book3s_hv.c            | 2 ++
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 3 +++
 arch/x86/kvm/x86.c                      | 1 +
 fs/proc/array.c                         | 6 ++++++
 include/linux/sched.h                   | 7 +++++++
 kernel/fork.c                           | 2 +-
 8 files changed, 22 insertions(+), 1 deletion(-)

-- 
2.3.7



* [PATCH 1/3] procfs: add guest steal time in /proc/<pid>/stat
  2015-05-06 11:56 ` Naveen N. Rao
@ 2015-05-06 11:56   ` Naveen N. Rao
  -1 siblings, 0 replies; 20+ messages in thread
From: Naveen N. Rao @ 2015-05-06 11:56 UTC (permalink / raw)
  To: linux-kernel, linux-arch, kvm, linuxppc-dev, linux-s390
  Cc: paulus, mpe, agraf, mingo, ego, warrier

Introduce a field in /proc/<pid>/stat to expose guest steal time.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
 fs/proc/array.c       | 6 ++++++
 include/linux/sched.h | 7 +++++++
 kernel/fork.c         | 2 +-
 3 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/fs/proc/array.c b/fs/proc/array.c
index fd02a9e..ad8e616 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -381,6 +381,7 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns,
 	unsigned long rsslim = 0;
 	char tcomm[sizeof(task->comm)];
 	unsigned long flags;
+	cputime_t gstime;
 
 	state = *get_task_state(task);
 	vsize = eip = esp = 0;
@@ -400,6 +401,7 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns,
 	sigemptyset(&sigcatch);
 	cutime = cstime = utime = stime = 0;
 	cgtime = gtime = 0;
+	gstime = 0;
 
 	if (lock_task_sighand(task, &flags)) {
 		struct signal_struct *sig = task->signal;
@@ -428,6 +430,7 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns,
 				min_flt += t->min_flt;
 				maj_flt += t->maj_flt;
 				gtime += task_gtime(t);
+				gstime += task_gstime(t);
 			} while_each_thread(task, t);
 
 			min_flt += sig->min_flt;
@@ -450,6 +453,7 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns,
 		maj_flt = task->maj_flt;
 		task_cputime_adjusted(task, &utime, &stime);
 		gtime = task_gtime(task);
+		gstime = task_gstime(task);
 	}
 
 	/* scale priority and nice values from timeslices to -20..20 */
@@ -523,6 +527,8 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns,
 	else
 		seq_put_decimal_ll(m, ' ', 0);
 
+	seq_put_decimal_ull(m, ' ', cputime_to_clock_t(gstime));
+
 	seq_putc(m, '\n');
 	if (mm)
 		mmput(mm);
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 26a2e61..e28f869 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1430,6 +1430,7 @@ struct task_struct {
 
 	cputime_t utime, stime, utimescaled, stimescaled;
 	cputime_t gtime;
+	cputime_t gstime;
 #ifndef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
 	struct cputime prev_cputime;
 #endif
@@ -1956,6 +1957,12 @@ static inline cputime_t task_gtime(struct task_struct *t)
 	return t->gtime;
 }
 #endif
+
+static inline cputime_t task_gstime(struct task_struct *t)
+{
+	return t->gstime;
+}
+
 extern void task_cputime_adjusted(struct task_struct *p, cputime_t *ut, cputime_t *st);
 extern void thread_group_cputime_adjusted(struct task_struct *p, cputime_t *ut, cputime_t *st);
 
diff --git a/kernel/fork.c b/kernel/fork.c
index 03c1eaa..edf4ffb 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1339,7 +1339,7 @@ static struct task_struct *copy_process(unsigned long clone_flags,
 
 	init_sigpending(&p->pending);
 
-	p->utime = p->stime = p->gtime = 0;
+	p->utime = p->stime = p->gtime = p->gstime = 0;
 	p->utimescaled = p->stimescaled = 0;
 #ifndef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
 	p->prev_cputime.utime = p->prev_cputime.stime = 0;
-- 
2.3.7



* [PATCH 2/3] kvm/x86: report guest steal time in host
  2015-05-06 11:56 ` Naveen N. Rao
@ 2015-05-06 11:56   ` Naveen N. Rao
  -1 siblings, 0 replies; 20+ messages in thread
From: Naveen N. Rao @ 2015-05-06 11:56 UTC (permalink / raw)
  To: linux-kernel, linux-arch, kvm, linuxppc-dev, linux-s390
  Cc: paulus, mpe, agraf, mingo, ego, warrier

Report guest steal time in host task statistics. On x86, this is just
the scheduler run_delay.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
 arch/x86/kvm/x86.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c73efcd..7107b7d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2128,6 +2128,7 @@ static void record_steal_time(struct kvm_vcpu *vcpu)
 
 	vcpu->arch.st.steal.steal += vcpu->arch.st.accum_steal;
 	vcpu->arch.st.steal.version += 2;
+	current->gstime += vcpu->arch.st.accum_steal;
 	vcpu->arch.st.accum_steal = 0;
 
 	kvm_write_guest_cached(vcpu->kvm, &vcpu->arch.st.stime,
-- 
2.3.7
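
For comparison, the scheduler run_delay that this patch folds into gstime is
already visible from user space on kernels built with scheduler statistics:
/proc/<pid>/schedstat holds three values (time on cpu, time waiting on a
runqueue, number of timeslices). The sketch below parses that line; the
function name schedstat_run_delay() is made up for the example, and the
nanosecond unit is an assumption based on current kernels.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: extract the run delay (second field, assumed to be
 * in nanoseconds) from a /proc/<pid>/schedstat line.
 * Returns 0 on success, -1 if the line does not hold three numbers.
 */
int schedstat_run_delay(const char *line, unsigned long long *run_delay_ns)
{
	unsigned long long on_cpu_ns, timeslices;

	assert(line && run_delay_ns);

	if (sscanf(line, "%llu %llu %llu",
		   &on_cpu_ns, run_delay_ns, &timeslices) != 3)
		return -1;
	return 0;
}
```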



* [PATCH 3/3] kvm/powerpc: report guest steal time in host
  2015-05-06 11:56 ` Naveen N. Rao
@ 2015-05-06 11:56   ` Naveen N. Rao
  -1 siblings, 0 replies; 20+ messages in thread
From: Naveen N. Rao @ 2015-05-06 11:56 UTC (permalink / raw)
  To: linux-kernel, linux-arch, kvm, linuxppc-dev, linux-s390
  Cc: paulus, mpe, agraf, mingo, ego, warrier

On powerpc, kvm tracks both the guest steal time and the time when the
guest was idle, and passes both to the guest through the DTL. The
guest accounts these entries as either steal time or idle time based on
the last running task. Since the true guest idle status is not visible
to the host, we can't accurately expose the guest steal time in the
host.

However, tracking the guest vcpu cede status can get us a reasonable
(within 5% variation) vcpu steal time, since guest vcpus cede the
processor on entering the idle task. To do this, we introduce a new
field ceded_st in the kvm_vcpu_arch structure to accurately track the guest
vcpu cede status (this is needed since the existing ceded field is
modified before we can use it). During DTL entry creation, we check this
flag and account the time as stolen if the guest vcpu had not ceded.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
Tests show that the steal time reported in the host with this approach is
around 5% higher than the steal time shown in the guest. I'd be interested to
know if there are ways to achieve better accounting of the guest steal time in
the host.

Thanks!
- Naveen

 arch/powerpc/include/asm/kvm_host.h     | 1 +
 arch/powerpc/kernel/asm-offsets.c       | 1 +
 arch/powerpc/kvm/book3s_hv.c            | 2 ++
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 3 +++
 4 files changed, 7 insertions(+)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index a193a13..48cafd6 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -661,6 +661,7 @@ struct kvm_vcpu_arch {
 	u64 busy_preempt;
 
 	u32 emul_inst;
+	u8 ceded_st;
 #endif
 
 #ifdef CONFIG_KVM_BOOK3S_HV_EXIT_TIMING
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 0034b6b..7c11c84 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -534,6 +534,7 @@ int main(void)
 	DEFINE(VCPU_DEC_EXPIRES, offsetof(struct kvm_vcpu, arch.dec_expires));
 	DEFINE(VCPU_PENDING_EXC, offsetof(struct kvm_vcpu, arch.pending_exceptions));
 	DEFINE(VCPU_CEDED, offsetof(struct kvm_vcpu, arch.ceded));
+	DEFINE(VCPU_CEDED_ST, offsetof(struct kvm_vcpu, arch.ceded_st));
 	DEFINE(VCPU_PRODDED, offsetof(struct kvm_vcpu, arch.prodded));
 	DEFINE(VCPU_MMCR, offsetof(struct kvm_vcpu, arch.mmcr));
 	DEFINE(VCPU_PMC, offsetof(struct kvm_vcpu, arch.pmc));
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 48d3c5d..7a7e3ab 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -565,6 +565,8 @@ static void kvmppc_create_dtl_entry(struct kvm_vcpu *vcpu,
 	spin_lock_irq(&vcpu->arch.tbacct_lock);
 	stolen += vcpu->arch.busy_stolen;
 	vcpu->arch.busy_stolen = 0;
+	if (!vcpu->arch.ceded_st && stolen)
+		(pid_task(vcpu->pid, PIDTYPE_PID))->gstime += stolen;
 	spin_unlock_irq(&vcpu->arch.tbacct_lock);
 	if (!dt || !vpa)
 		return;
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 4d70df2..80efc31 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -924,6 +924,7 @@ deliver_guest_interrupt:
 fast_guest_return:
 	li	r0,0
 	stb	r0,VCPU_CEDED(r4)	/* cancel cede */
+	stb	r0,VCPU_CEDED_ST(r4)	/* cancel cede */
 	mtspr	SPRN_HSRR0,r10
 	mtspr	SPRN_HSRR1,r11
 
@@ -2059,6 +2060,7 @@ _GLOBAL(kvmppc_h_cede)		/* r3 = vcpu pointer, r11 = msr, r13 = paca */
 	std	r11,VCPU_MSR(r3)
 	li	r0,1
 	stb	r0,VCPU_CEDED(r3)
+	stb	r0,VCPU_CEDED_ST(r3)
 	sync			/* order setting ceded vs. testing prodded */
 	lbz	r5,VCPU_PRODDED(r3)
 	cmpwi	r5,0
@@ -2266,6 +2268,7 @@ kvm_cede_prodded:
 	stb	r0,VCPU_PRODDED(r3)
 	sync			/* order testing prodded vs. clearing ceded */
 	stb	r0,VCPU_CEDED(r3)
+	stb	r0,VCPU_CEDED_ST(r3)
 	li	r3,H_SUCCESS
 	blr
 
-- 
2.3.7


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 3/3] kvm/powerpc: report guest steal time in host
@ 2015-05-06 11:56   ` Naveen N. Rao
  0 siblings, 0 replies; 20+ messages in thread
From: Naveen N. Rao @ 2015-05-06 11:56 UTC (permalink / raw)
  To: linux-kernel, linux-arch, kvm, linuxppc-dev, linux-s390
  Cc: ego, agraf, mingo, paulus, warrier

On powerpc, kvm tracks both the guest steal time as well as the time
when guest was idle and this gets sent in to the guest through DTL. The
guest accounts these entries as either steal time or idle time based on
the last running task. Since the true guest idle status is not visible
to the host, we can't accurately expose the guest steal time in the
host.

However, tracking the guest vcpu cede status can get us a reasonable
(within 5% variation) vcpu steal time since guest vcpus cede the
processor on entering the idle task. To do this, we introduce a new
field ceded_st in kvm_vcpu_arch structure to accurately track the guest
vcpu cede status (this is needed since the existing ceded field is
modified before we can use it). During DTL entry creation, we check this
flag and account the time as stolen if the guest vcpu had not ceded.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
Tests show that the steal time being reported in the host with this approach is
around 5% higher than the steal time shown in guest. I'd be interested to know
if there are ways to achieve better accounting of the guest steal time in host.

Thanks!
- Naveen

 arch/powerpc/include/asm/kvm_host.h     | 1 +
 arch/powerpc/kernel/asm-offsets.c       | 1 +
 arch/powerpc/kvm/book3s_hv.c            | 2 ++
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 3 +++
 4 files changed, 7 insertions(+)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index a193a13..48cafd6 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -661,6 +661,7 @@ struct kvm_vcpu_arch {
 	u64 busy_preempt;
 
 	u32 emul_inst;
+	u8 ceded_st;
 #endif
 
 #ifdef CONFIG_KVM_BOOK3S_HV_EXIT_TIMING
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 0034b6b..7c11c84 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -534,6 +534,7 @@ int main(void)
 	DEFINE(VCPU_DEC_EXPIRES, offsetof(struct kvm_vcpu, arch.dec_expires));
 	DEFINE(VCPU_PENDING_EXC, offsetof(struct kvm_vcpu, arch.pending_exceptions));
 	DEFINE(VCPU_CEDED, offsetof(struct kvm_vcpu, arch.ceded));
+	DEFINE(VCPU_CEDED_ST, offsetof(struct kvm_vcpu, arch.ceded_st));
 	DEFINE(VCPU_PRODDED, offsetof(struct kvm_vcpu, arch.prodded));
 	DEFINE(VCPU_MMCR, offsetof(struct kvm_vcpu, arch.mmcr));
 	DEFINE(VCPU_PMC, offsetof(struct kvm_vcpu, arch.pmc));
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 48d3c5d..7a7e3ab 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -565,6 +565,8 @@ static void kvmppc_create_dtl_entry(struct kvm_vcpu *vcpu,
 	spin_lock_irq(&vcpu->arch.tbacct_lock);
 	stolen += vcpu->arch.busy_stolen;
 	vcpu->arch.busy_stolen = 0;
+	if (!vcpu->arch.ceded_st && stolen)
+		(pid_task(vcpu->pid, PIDTYPE_PID))->gstime += stolen;
 	spin_unlock_irq(&vcpu->arch.tbacct_lock);
 	if (!dt || !vpa)
 		return;
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 4d70df2..80efc31 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -924,6 +924,7 @@ deliver_guest_interrupt:
 fast_guest_return:
 	li	r0,0
 	stb	r0,VCPU_CEDED(r4)	/* cancel cede */
+	stb	r0,VCPU_CEDED_ST(r4)	/* cancel cede */
 	mtspr	SPRN_HSRR0,r10
 	mtspr	SPRN_HSRR1,r11
 
@@ -2059,6 +2060,7 @@ _GLOBAL(kvmppc_h_cede)		/* r3 = vcpu pointer, r11 = msr, r13 = paca */
 	std	r11,VCPU_MSR(r3)
 	li	r0,1
 	stb	r0,VCPU_CEDED(r3)
+	stb	r0,VCPU_CEDED_ST(r3)
 	sync			/* order setting ceded vs. testing prodded */
 	lbz	r5,VCPU_PRODDED(r3)
 	cmpwi	r5,0
@@ -2266,6 +2268,7 @@ kvm_cede_prodded:
 	stb	r0,VCPU_PRODDED(r3)
 	sync			/* order testing prodded vs. clearing ceded */
 	stb	r0,VCPU_CEDED(r3)
+	stb	r0,VCPU_CEDED_ST(r3)
 	li	r3,H_SUCCESS
 	blr
 
-- 
2.3.7

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/3] kvm/powerpc: report guest steal time in host
  2015-05-06 11:56   ` Naveen N. Rao
@ 2015-05-06 12:46     ` Christian Borntraeger
  -1 siblings, 0 replies; 20+ messages in thread
From: Christian Borntraeger @ 2015-05-06 12:46 UTC (permalink / raw)
  To: Naveen N. Rao, linux-kernel, linux-arch, kvm, linuxppc-dev, linux-s390
  Cc: paulus, mpe, agraf, mingo, ego, warrier

Am 06.05.2015 um 13:56 schrieb Naveen N. Rao:
> On powerpc, kvm tracks both the guest steal time as well as the time
> when guest was idle and this gets sent in to the guest through DTL. The
> guest accounts these entries as either steal time or idle time based on
> the last running task. Since the true guest idle status is not visible
> to the host, we can't accurately expose the guest steal time in the
> host.
> 
> However, tracking the guest vcpu cede status can get us a reasonable
> (within 5% variation) vcpu steal time since guest vcpus cede the
> processor on entering the idle task. To do this, we introduce a new
> field ceded_st in kvm_vcpu_arch structure to accurately track the guest
> vcpu cede status (this is needed since the existing ceded field is
> modified before we can use it). During DTL entry creation, we check this
> flag and account the time as stolen if the guest vcpu had not ceded.

I think this is more or less a question of semantics:

What would happen if you used current->sched_info.run_delay on power as
well, like x86 does? How far off are the numbers? My feeling is that the
semantics of "steal time" inside the guest are somewhat different on each
platform.

This brings me to a second question:
Do you need the host view of guest steal time to match the guest view,
or do we want a host view that translates as "this is the time that
the guest was runnable but we were too busy to schedule it"?
For the former, x86 has the best solution, as the host tells the guest its
understanding of steal, so both match. For the latter, we actually try to
give guest steal a meaning in the host context: the overload.
Would /proc/<pid>/schedstat value 2 (time spent waiting on a runqueue)
meet your requirements from the cover-letter?
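
As a rough illustration of reading that value from user space, the sketch
below parses the three schedstat fields for one vcpu thread. The struct and
function names are hypothetical. It assumes the usual three-field layout
(time on cpu, time waiting on a runqueue, number of timeslices); the units
depend on the kernel version and are typically nanoseconds on recent kernels.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical holder for the three fields of a schedstat line. */
struct schedstat {
	unsigned long long run_time;  /* time spent on the cpu */
	unsigned long long run_delay; /* time spent waiting on a runqueue */
	unsigned long long pcount;    /* number of timeslices run */
};

/* Parse one schedstat line; returns 0 on success, -1 on malformed input. */
int parse_schedstat(const char *line, struct schedstat *out)
{
	assert(line != NULL && out != NULL);
	if (sscanf(line, "%llu %llu %llu",
		   &out->run_time, &out->run_delay, &out->pcount) != 3)
		return -1;
	return 0;
}

/* Read /proc/<pid>/task/<tid>/schedstat for a single (vcpu) thread. */
int read_task_schedstat(int pid, int tid, struct schedstat *out)
{
	char path[64], buf[128];
	FILE *f;
	int ret = -1;

	snprintf(path, sizeof(path), "/proc/%d/task/%d/schedstat", pid, tid);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (fgets(buf, sizeof(buf), f))
		ret = parse_schedstat(buf, out);
	fclose(f);
	return ret;
}
```

Sampling run_delay twice and taking the difference gives the runqueue wait
over that interval for the thread in question.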

Christian



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/3] kvm/powerpc: report guest steal time in host
  2015-05-06 12:46     ` Christian Borntraeger
@ 2015-05-06 16:42       ` Naveen N. Rao
  -1 siblings, 0 replies; 20+ messages in thread
From: Naveen N. Rao @ 2015-05-06 16:42 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: linux-kernel, linux-arch, kvm, linuxppc-dev, linux-s390, paulus,
	mpe, agraf, mingo, ego, warrier

On 2015/05/06 02:46PM, Christian Borntraeger wrote:
> Am 06.05.2015 um 13:56 schrieb Naveen N. Rao:
> > On powerpc, kvm tracks both the guest steal time as well as the time
> > when guest was idle and this gets sent in to the guest through DTL. The
> > guest accounts these entries as either steal time or idle time based on
> > the last running task. Since the true guest idle status is not visible
> > to the host, we can't accurately expose the guest steal time in the
> > host.
> > 
> > However, tracking the guest vcpu cede status can get us a reasonable
> > (within 5% variation) vcpu steal time since guest vcpus cede the
> > processor on entering the idle task. To do this, we introduce a new
> > field ceded_st in kvm_vcpu_arch structure to accurately track the guest
> > vcpu cede status (this is needed since the existing ceded field is
> > modified before we can use it). During DTL entry creation, we check this
> > flag and account the time as stolen if the guest vcpu had not ceded.
> 
> I think this is more or less a question about the semantic:
> 
> What would happen if you use  current->sched_info.run_delay like x86 also
> on power? How far are the numbers away?

The numbers were quite off and didn't quite make sense.

> My feeling is, that the semantics
> of "steal time" inside the guest is somewhat different on each platform. 
> 
> This brings me to a 2nd question:
> Do you need to match the host view of guest steal time with the guest view
> or do we want to have a host view that translates as "this is the time that
> the guest was runnable but we were too busy to schedule him"?

Very good point. This is probably good enough for our purpose and I'd 
like to think my current patchset does something similar for powerpc. We 
don't report the exact steal time as seen from within the guest, but a 
close approximation of it. We count all time that a vcpu was not-idle as 
steal. This includes time we were doing something in the host on behalf 
of the vcpu as well as time when we were just doing something else. I 
don't know if we can separate these two or if that would be desirable.  
The scheduler statistics don't seem to accurately reflect this on ppc.

> For the former x86 has the best solution, as the host tells the guest its
> understanding of steal - so both match. For the latter we actually try to
> give guest steal a meaning in the host context  - the overload.
> Would /proc/<pid>/schedstat value 2 (time spent waiting on a runqueue)
> meet your requirements from the cover-letter?

This looks to be the same as sched_info.run_delay, which doesn't seem to 
reflect the wait on the runqueue. I will recheck this on ppc tomorrow.

As an aside, do you happen to know if /proc/<pid>/schedstat accurately 
reports the "overload" on s390?


Thanks!
- Naveen


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/3] kvm/powerpc: report guest steal time in host
  2015-05-06 16:42       ` Naveen N. Rao
@ 2015-05-07 12:04         ` Christian Borntraeger
  -1 siblings, 0 replies; 20+ messages in thread
From: Christian Borntraeger @ 2015-05-07 12:04 UTC (permalink / raw)
  To: Naveen N. Rao
  Cc: linux-kernel, linux-arch, kvm, linuxppc-dev, linux-s390, paulus,
	mpe, agraf, mingo, ego, warrier

Am 06.05.2015 um 18:42 schrieb Naveen N. Rao:
> On 2015/05/06 02:46PM, Christian Borntraeger wrote:
>> Am 06.05.2015 um 13:56 schrieb Naveen N. Rao:
>>> On powerpc, kvm tracks both the guest steal time as well as the time
>>> when guest was idle and this gets sent in to the guest through DTL. The
>>> guest accounts these entries as either steal time or idle time based on
>>> the last running task. Since the true guest idle status is not visible
>>> to the host, we can't accurately expose the guest steal time in the
>>> host.
>>>
>>> However, tracking the guest vcpu cede status can get us a reasonable
>>> (within 5% variation) vcpu steal time since guest vcpus cede the
>>> processor on entering the idle task. To do this, we introduce a new
>>> field ceded_st in kvm_vcpu_arch structure to accurately track the guest
>>> vcpu cede status (this is needed since the existing ceded field is
>>> modified before we can use it). During DTL entry creation, we check this
>>> flag and account the time as stolen if the guest vcpu had not ceded.
>>
>> I think this is more or less a question about the semantic:
>>
>> What would happen if you use  current->sched_info.run_delay like x86 also
>> on power? How far are the numbers away?
> 
> The numbers were quite off and didn't quite make sense.

Strange. I would expect it to at least match the wall clock time between
runnable and running. Maybe it's just a bug?


> 
>> My feeling is, that the semantics
>> of "steal time" inside the guest is somewhat different on each platform. 
>>
>> This brings me to a 2nd question:
>> Do you need to match the host view of guest steal time with the guest view
>> or do we want to have a host view that translates as "this is the time that
>> the guest was runnable but we were too busy to schedule him"?
> 
> Very good point. This is probably good enough for our purpose and I'd 
> like to think my current patchset does something similar for powerpc. We 
> don't report the exact steal time as seen from within the guest, but a 
> close approximation of it. We count all time that a vcpu was not-idle as 
> steal. This includes time we were doing something in the host on behalf 
> of the vcpu as well as time when we were just doing something else. I 
> don't know if we can separate these two or if that would be desirable.  
> The scheduler statistics don't seem to accurately reflect this on ppc.
> 
>> For the former x86 has the best solution, as the host tells the guest its
>> understanding of steal - so both match. For the latter we actually try to
>> give guest steal a meaning in the host context  - the overload.
>> Would /proc/<pid>/schedstat value 2 (time spent waiting on a runqueue)
>> meet your requirements from the cover-letter?
> 
> This looks to be the same as sched_info.run_delay, which doesn't seem to 
> reflect the wait on the runqueue. I will recheck this on ppc tomorrow.
> 
> As an aside, do you happen to know if /proc/<pid>/schedstat accurately 
> reports the "overload" on s390?

Things are usually even more complicated, as we always have the LPAR hypervisor
below the KVM or z/VM hypervisor (KVM or z/VM guests are always nested, so to
speak). Depending on the overcommit at the LPAR level, the wall clock times
might indicate a problem in the "wrong" place.

Now the steal time in a KVM guest is actually precise, as the hardware will
step the guest cpu timer only when both LPAR and KVM have this CPU scheduled.
This will also cause "steal" when KVM emulates an instruction for the guest,
unless we correct the guest view, which we don't do right now.
Linux running in an LPAR also sees the time that LPAR stole from it.
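
One way to read the paragraph above, offered as an assumption rather than a
description of the actual s390 code: if the guest cpu timer only advances
while the CPU is really scheduled, then steal over an interval is the
wall-clock time elapsed minus the cpu-timer time elapsed. The function name
below is hypothetical and both arguments are assumed to be in the same unit.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model: steal over an interval is whatever part of the
 * wall-clock time the cpu timer did not account for. Clamped at zero
 * in case the two clocks were sampled slightly out of step.
 */
uint64_t steal_from_timers(uint64_t wall_elapsed, uint64_t cpu_timer_elapsed)
{
	if (cpu_timer_elapsed >= wall_elapsed)
		return 0;
	return wall_elapsed - cpu_timer_elapsed;
}
```

Under this model, time spent emulating an instruction for the guest shows up
as steal too, because the guest cpu timer does not advance during it.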

I really have not looked closely at run_delay. My assumption is that
it boils down to "wall clock time between runnable and running". If the
admin overcommits in KVM and the LPAR is only slightly overcommitted, this
is probably good enough. If the overcommit happens at the LPAR level, then the
value might be confusing. I would assume that people overcommit at the z/VM or
KVM level and that the LPAR is managed with less overcommit, but that's not a
given.
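
If run_delay really is "wall clock time between runnable and running", a
monitoring tool could turn two samples into an overload figure as sketched
below. The function name and the choice of hundredths of a percent are
hypothetical, and the inputs are assumed to be in nanoseconds.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: share of a sampling interval that a vcpu thread
 * spent runnable but waiting, in hundredths of a percent (2500 == 25%).
 * Assumes intervals short enough that waited * 10000 fits in 64 bits.
 */
unsigned int runqueue_wait_pct_x100(uint64_t delay_prev_ns,
				    uint64_t delay_now_ns,
				    uint64_t interval_ns)
{
	uint64_t waited;

	if (interval_ns == 0 || delay_now_ns < delay_prev_ns)
		return 0;
	waited = delay_now_ns - delay_prev_ns;
	if (waited > interval_ns)
		waited = interval_ns; /* clamp against sampling jitter */
	return (unsigned int)(waited * 10000 / interval_ns);
}
```

As noted above, such a figure only describes overcommit at the level where it
is sampled; overcommit in a hypervisor underneath would not be visible in it.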

Christian


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 3/3] kvm/powerpc: report guest steal time in host
  2015-05-06 10:58 [PATCH 0/3] Report " Naveen N. Rao
@ 2015-05-06 10:58   ` Naveen N. Rao
  2015-05-06 10:58 ` Naveen N. Rao
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 20+ messages in thread
From: Naveen N. Rao @ 2015-05-06 10:58 UTC (permalink / raw)
  To: linux-kernel, linux-arch, kvm, linuxppc-dev, linux-s390
  Cc: paulus, mpe, agraf, mingo, ego

On powerpc, kvm tracks both the guest steal time as well as the time
when guest was idle and this gets sent in to the guest through DTL. The
guest accounts these entries as either steal time or idle time based on
the last running task. Since the true guest idle status is not visible
to the host, we can't accurately expose the guest steal time in the
host.

However, tracking the guest vcpu cede status can get us a reasonable
(within 5% variation) vcpu steal time since guest vcpus cede the
processor on entering the idle task. To do this, we introduce a new
field ceded_st in kvm_vcpu_arch structure to accurately track the guest
vcpu cede status (this is needed since the existing ceded field is
modified before we can use it). During DTL entry creation, we check this
flag and account the time as stolen if the guest vcpu had not ceded.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
Tests show that the steal time being reported in the host with this approach is
around 5% higher than the steal time shown in guest. I'd be interested to know
if there are ways to achieve better accounting of the guest steal time in host.

 arch/powerpc/include/asm/kvm_host.h     | 1 +
 arch/powerpc/kernel/asm-offsets.c       | 1 +
 arch/powerpc/kvm/book3s_hv.c            | 2 ++
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 3 +++
 4 files changed, 7 insertions(+)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 8ef0512..7db48c4 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -655,6 +655,7 @@ struct kvm_vcpu_arch {
 	u64 busy_preempt;
 
 	u32 emul_inst;
+	u8 ceded_st;
 #endif
 };
 
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 4717859..765c7c4 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -521,6 +521,7 @@ int main(void)
 	DEFINE(VCPU_DEC_EXPIRES, offsetof(struct kvm_vcpu, arch.dec_expires));
 	DEFINE(VCPU_PENDING_EXC, offsetof(struct kvm_vcpu, arch.pending_exceptions));
 	DEFINE(VCPU_CEDED, offsetof(struct kvm_vcpu, arch.ceded));
+	DEFINE(VCPU_CEDED_ST, offsetof(struct kvm_vcpu, arch.ceded_st));
 	DEFINE(VCPU_PRODDED, offsetof(struct kvm_vcpu, arch.prodded));
 	DEFINE(VCPU_MMCR, offsetof(struct kvm_vcpu, arch.mmcr));
 	DEFINE(VCPU_PMC, offsetof(struct kvm_vcpu, arch.pmc));
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index de74756..ad7c0e3 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -545,6 +545,8 @@ static void kvmppc_create_dtl_entry(struct kvm_vcpu *vcpu,
 	spin_lock_irq(&vcpu->arch.tbacct_lock);
 	stolen += vcpu->arch.busy_stolen;
 	vcpu->arch.busy_stolen = 0;
+	if (!vcpu->arch.ceded_st && stolen)
+		(pid_task(vcpu->pid, PIDTYPE_PID))->gstime += stolen;
 	spin_unlock_irq(&vcpu->arch.tbacct_lock);
 	if (!dt || !vpa)
 		return;
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 6cbf163..28f304e 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -873,6 +873,7 @@ deliver_guest_interrupt:
 fast_guest_return:
 	li	r0,0
 	stb	r0,VCPU_CEDED(r4)	/* cancel cede */
+	stb	r0,VCPU_CEDED_ST(r4)	/* cancel cede */
 	mtspr	SPRN_HSRR0,r10
 	mtspr	SPRN_HSRR1,r11
 
@@ -1889,6 +1890,7 @@ _GLOBAL(kvmppc_h_cede)
 	std	r11,VCPU_MSR(r3)
 	li	r0,1
 	stb	r0,VCPU_CEDED(r3)
+	stb	r0,VCPU_CEDED_ST(r3)
 	sync			/* order setting ceded vs. testing prodded */
 	lbz	r5,VCPU_PRODDED(r3)
 	cmpwi	r5,0
@@ -2052,6 +2054,7 @@ kvm_cede_prodded:
 	stb	r0,VCPU_PRODDED(r3)
 	sync			/* order testing prodded vs. clearing ceded */
 	stb	r0,VCPU_CEDED(r3)
+	stb	r0,VCPU_CEDED_ST(r3)
 	li	r3,H_SUCCESS
 	blr
 
-- 
2.3.5


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 3/3] kvm/powerpc: report guest steal time in host
  2015-05-06 10:58 [PATCH 0/3] Report " Naveen N. Rao
  2015-05-06 10:58 ` [PATCH 3/3] kvm/powerpc: report " Naveen N. Rao
  2015-05-06 10:58 ` Naveen N. Rao
@ 2015-05-06 10:58 ` Naveen N. Rao
  2015-05-06 10:58 ` Naveen N. Rao
  2015-05-06 10:58   ` Naveen N. Rao
  4 siblings, 0 replies; 20+ messages in thread
From: Naveen N. Rao @ 2015-05-06 10:58 UTC (permalink / raw)
  To: linux-kernel, linux-arch, kvm, linuxppc-dev, linux-s390
  Cc: paulus, mpe, agraf, mingo, ego

On powerpc, kvm tracks both the guest steal time as well as the time
when guest was idle and this gets sent in to the guest through DTL. The
guest accounts these entries as either steal time or idle time based on
the last running task. Since the true guest idle status is not visible
to the host, we can't accurately expose the guest steal time in the
host.

However, tracking the guest vcpu cede status can get us a reasonable
(within 5% variation) vcpu steal time since guest vcpus cede the
processor on entering the idle task. To do this, we introduce a new
field ceded_st in kvm_vcpu_arch structure to accurately track the guest
vcpu cede status (this is needed since the existing ceded field is
modified before we can use it). During DTL entry creation, we check this
flag and account the time as stolen if the guest vcpu had not ceded.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
Tests show that the steal time being reported in the host with this approach is
around 5% higher than the steal time shown in guest. I'd be interested to know
if there are ways to achieve better accounting of the guest steal time in host.

 arch/powerpc/include/asm/kvm_host.h     | 1 +
 arch/powerpc/kernel/asm-offsets.c       | 1 +
 arch/powerpc/kvm/book3s_hv.c            | 2 ++
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 3 +++
 4 files changed, 7 insertions(+)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 8ef0512..7db48c4 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -655,6 +655,7 @@ struct kvm_vcpu_arch {
 	u64 busy_preempt;
 
 	u32 emul_inst;
+	u8 ceded_st;
 #endif
 };
 
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 4717859..765c7c4 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -521,6 +521,7 @@ int main(void)
 	DEFINE(VCPU_DEC_EXPIRES, offsetof(struct kvm_vcpu, arch.dec_expires));
 	DEFINE(VCPU_PENDING_EXC, offsetof(struct kvm_vcpu, arch.pending_exceptions));
 	DEFINE(VCPU_CEDED, offsetof(struct kvm_vcpu, arch.ceded));
+	DEFINE(VCPU_CEDED_ST, offsetof(struct kvm_vcpu, arch.ceded_st));
 	DEFINE(VCPU_PRODDED, offsetof(struct kvm_vcpu, arch.prodded));
 	DEFINE(VCPU_MMCR, offsetof(struct kvm_vcpu, arch.mmcr));
 	DEFINE(VCPU_PMC, offsetof(struct kvm_vcpu, arch.pmc));
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index de74756..ad7c0e3 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -545,6 +545,8 @@ static void kvmppc_create_dtl_entry(struct kvm_vcpu *vcpu,
 	spin_lock_irq(&vcpu->arch.tbacct_lock);
 	stolen += vcpu->arch.busy_stolen;
 	vcpu->arch.busy_stolen = 0;
+	if (!vcpu->arch.ceded_st && stolen)
+		(pid_task(vcpu->pid, PIDTYPE_PID))->gstime += stolen;
 	spin_unlock_irq(&vcpu->arch.tbacct_lock);
 	if (!dt || !vpa)
 		return;
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 6cbf163..28f304e 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -873,6 +873,7 @@ deliver_guest_interrupt:
 fast_guest_return:
 	li	r0,0
 	stb	r0,VCPU_CEDED(r4)	/* cancel cede */
+	stb	r0,VCPU_CEDED_ST(r4)	/* cancel cede */
 	mtspr	SPRN_HSRR0,r10
 	mtspr	SPRN_HSRR1,r11
 
@@ -1889,6 +1890,7 @@ _GLOBAL(kvmppc_h_cede)
 	std	r11,VCPU_MSR(r3)
 	li	r0,1
 	stb	r0,VCPU_CEDED(r3)
+	stb	r0,VCPU_CEDED_ST(r3)
 	sync			/* order setting ceded vs. testing prodded */
 	lbz	r5,VCPU_PRODDED(r3)
 	cmpwi	r5,0
@@ -2052,6 +2054,7 @@ kvm_cede_prodded:
 	stb	r0,VCPU_PRODDED(r3)
 	sync			/* order testing prodded vs. clearing ceded */
 	stb	r0,VCPU_CEDED(r3)
+	stb	r0,VCPU_CEDED_ST(r3)
 	li	r3,H_SUCCESS
 	blr
 
-- 
2.3.5


end of thread, other threads:[~2015-05-07 12:04 UTC | newest]

Thread overview: 20+ messages
2015-05-06 11:56 [PATCH 0/3] Report guest steal time in host Naveen N. Rao
2015-05-06 11:56 ` Naveen N. Rao
2015-05-06 11:56 ` [PATCH 1/3] procfs: add guest steal time in /proc/<pid>/stat Naveen N. Rao
2015-05-06 11:56   ` Naveen N. Rao
2015-05-06 11:56 ` [PATCH 2/3] kvm/x86: report guest steal time in host Naveen N. Rao
2015-05-06 11:56   ` Naveen N. Rao
2015-05-06 11:56 ` [PATCH 3/3] kvm/powerpc: " Naveen N. Rao
2015-05-06 11:56   ` Naveen N. Rao
2015-05-06 12:46   ` Christian Borntraeger
2015-05-06 12:46     ` Christian Borntraeger
2015-05-06 16:42     ` Naveen N. Rao
2015-05-06 16:42       ` Naveen N. Rao
2015-05-07 12:04       ` Christian Borntraeger
2015-05-07 12:04         ` Christian Borntraeger
  -- strict thread matches above, loose matches on Subject: below --
2015-05-06 10:58 [PATCH 0/3] Report " Naveen N. Rao
2015-05-06 10:58 ` [PATCH 3/3] kvm/powerpc: report " Naveen N. Rao
2015-05-06 10:58 ` Naveen N. Rao
2015-05-06 10:58 ` Naveen N. Rao
2015-05-06 10:58 ` Naveen N. Rao
2015-05-06 10:58 ` Naveen N. Rao
2015-05-06 10:58   ` Naveen N. Rao
