* [RFC PATCH V2 0/4] Utilizing VMX preemption for timer virtualization
@ 2016-05-24 22:27 Yunhong Jiang
From: Yunhong Jiang @ 2016-05-24 22:27 UTC
  To: kvm; +Cc: mtosatti, rkrcmar, pbonzini, kernellwp

The VMX-preemption timer is a VMX feature. It counts down, from the
value loaded at VM entry, while the CPU is in VMX non-root operation.
When the timer reaches zero, it stops counting and a VM exit occurs.

This patchset uses the VMX preemption timer for TSC deadline timer
virtualization. The VMX preemption timer is armed before VM entry if the
TSC deadline timer is enabled. A VM exit occurs when the virtual TSC
deadline timer expires.

When the vCPU thread is scheduled out, TSC deadline timer
virtualization switches to the current solution, i.e. the hrtimer-based
emulation. It switches back to the VMX preemption timer when the vCPU
thread is scheduled in.

This solution replaces the OS's complex hrtimer system, and the cost of
host timer interrupt handling, with a preemption timer VM exit. It fits
well for some NFV usage scenarios, where the vCPU is bound to a pCPU and
the pCPU is isolated, and for similar scenarios.

However, it may hurt performance if the vCPU thread is scheduled in and
out very frequently, because each switch moves the timer to or from the
hrtimer emulation. A module parameter is provided to turn it on or off.

Signed-off-by: Yunhong Jiang <yunhong.jiang@intel.com>

Performance Evaluation:
Host:
[nfv@otcnfv02 ~]$ cat /proc/cpuinfo
....
cpu family      : 6
model           : 63
model name      : Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz

Guest:
Two vCPUs, each pinned to an isolated pCPU; idle=poll on the guest kernel.
When the vCPUs are not pinned, the benefit is smaller than in the pinned case.

Test tools:
cyclictest [1] running for 10 minutes with a 1ms interval, i.e. 600000
loops in total.

1. enable_hv_timer=Y.

# Histogram
......
000003 000000
000004 000029
000005 023017
000006 357485
000007 192723
000008 026141
000009 000106
000010 000067
......
# Min Latencies: 00004
# Avg Latencies: 00006

2. enable_hv_timer=N.

# Histogram
......
000004 000000
000005 000074
000006 001943
000007 005820
000008 164729
000009 424401
000010 001964
000011 000252
000012 000190
......
# Min Latencies: 00005
# Avg Latencies: 00010
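
As a rough way to compare the two runs, the sketch below finds the most
frequent latency bucket and counts the samples at or below a threshold.
It uses only the buckets shown above, so the elided rows ("......") are
not included and totals derived from it are incomplete. The latency unit
is assumed to be microseconds (cyclictest's default), and all names in
the code are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct bucket {
    unsigned int latency_us; /* assumed unit: microseconds */
    uint32_t count;          /* number of samples in this bucket */
};

/* Visible buckets only; rows elided in the report are not represented. */
static const struct bucket hv_timer_on[] = {
    {3, 0}, {4, 29}, {5, 23017}, {6, 357485},
    {7, 192723}, {8, 26141}, {9, 106}, {10, 67},
};

static const struct bucket hv_timer_off[] = {
    {4, 0}, {5, 74}, {6, 1943}, {7, 5820}, {8, 164729},
    {9, 424401}, {10, 1964}, {11, 252}, {12, 190},
};

/* Latency value of the bucket holding the most samples. */
static unsigned int mode_latency(const struct bucket *b, size_t n)
{
    assert(b != NULL && n > 0);
    size_t best = 0;
    for (size_t i = 1; i < n; i++)
        if (b[i].count > b[best].count)
            best = i;
    return b[best].latency_us;
}

/* Number of samples whose latency is <= limit_us. */
static uint64_t samples_at_or_below(const struct bucket *b, size_t n,
                                    unsigned int limit_us)
{
    assert(b != NULL);
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++)
        if (b[i].latency_us <= limit_us)
            total += b[i].count;
    return total;
}
```

For example, `mode_latency(hv_timer_on, 8)` and
`mode_latency(hv_timer_off, 9)` give the peak bucket of each run, and
`samples_at_or_below(..., 7)` shows how many visible samples finished
within 7 us in each configuration.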

Changes since v1 [2]:

* Remove vmx_sched_out; kvm_x86_ops is no longer changed for it.
* Remove the two expired-timer checks on each vm-entry.
* Rename the hwemul_timer to hv_timer
* Clear the corresponding vmx_x86_ops members if the preemption timer is not usable.
* Cache cpu_preemption_timer_multi.
* Keep the tracepoint with the function patch.
* Other minor changes based on Paolo's review.

[1] https://rt.wiki.kernel.org/index.php/Cyclictest
[2] http://www.spinics.net/lists/kvm/msg132895.html

Yunhong Jiang (4):
  Add the kvm sched_out hook
  Utilize the vmx preemption timer
  Separate the start_sw_tscdeadline
  Utilize the vmx preemption timer for tsc deadline timer

 arch/arm/include/asm/kvm_host.h     |   1 +
 arch/mips/include/asm/kvm_host.h    |   1 +
 arch/powerpc/include/asm/kvm_host.h |   1 +
 arch/s390/include/asm/kvm_host.h    |   1 +
 arch/x86/include/asm/kvm_host.h     |   4 +
 arch/x86/kvm/lapic.c                | 144 ++++++++++++++++++++++++++++++------
 arch/x86/kvm/lapic.h                |  11 +++
 arch/x86/kvm/trace.h                |  22 ++++++
 arch/x86/kvm/vmx.c                  |  51 ++++++++++++-
 arch/x86/kvm/x86.c                  |   8 ++
 include/linux/kvm_host.h            |   1 +
 virt/kvm/kvm_main.c                 |   1 +
 12 files changed, 221 insertions(+), 25 deletions(-)

TODO:
	Identify the CPUs on which the VMX preemption timer is broken.
-- 
1.8.3.1




Thread overview: 32+ messages
2016-05-24 22:27 [RFC PATCH V2 0/4] Utilizing VMX preemption for timer virtualization Yunhong Jiang
2016-05-24 22:27 ` [RFC PATCH V2 1/4] Add the kvm sched_out hook Yunhong Jiang
2016-05-24 22:27 ` [RFC PATCH V2 2/4] Utilize the vmx preemption timer Yunhong Jiang
2016-06-14 13:23   ` Roman Kagan
2016-06-14 13:41     ` Paolo Bonzini
2016-06-14 16:46       ` yunhong jiang
2016-06-14 21:56         ` Paolo Bonzini
2016-06-15 18:03           ` yunhong jiang
2016-06-14 16:46     ` yunhong jiang
2016-05-24 22:27 ` [RFC PATCH V2 3/4] Separate the start_sw_tscdeadline Yunhong Jiang
2016-05-24 22:27 ` [RFC PATCH V2 4/4] Utilize the vmx preemption timer for tsc deadline timer Yunhong Jiang
2016-05-24 23:11   ` David Matlack
2016-05-24 23:35     ` yunhong jiang
2016-05-25 11:58       ` Paolo Bonzini
2016-05-25 22:53         ` yunhong jiang
2016-05-26  7:20           ` Paolo Bonzini
2016-05-25 10:40     ` Paolo Bonzini
2016-05-25 13:38       ` Radim Krčmář
2016-05-25 11:52   ` Paolo Bonzini
2016-05-25 22:44     ` yunhong jiang
2016-05-26 14:05       ` Alan Jenkins
2016-05-26 15:32         ` Paolo Bonzini
2016-06-04  0:24     ` yunhong jiang
2016-06-06 13:49       ` Paolo Bonzini
2016-06-06 18:21         ` yunhong jiang
2016-05-25 13:27   ` Radim Krčmář
2016-05-25 13:51     ` Paolo Bonzini
2016-05-25 14:31       ` Radim Krčmář
2016-05-25 23:13         ` yunhong jiang
2016-06-14 11:34         ` Paolo Bonzini
2016-05-25 13:45   ` Radim Krčmář
2016-05-25 22:57     ` yunhong jiang
