* [PATCH v5] KVM: halt-polling: poll for the upcoming fire timers
@ 2016-05-25  2:47 Wanpeng Li
  2016-05-26 10:26 ` Paolo Bonzini
  0 siblings, 1 reply; 7+ messages in thread
From: Wanpeng Li @ 2016-05-25  2:47 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: Wanpeng Li, Paolo Bonzini, Radim Krčmář,
	David Matlack, Christian Borntraeger, Yang Zhang

From: Wanpeng Li <wanpeng.li@hotmail.com>

If an emulated lapic timer will fire soon (within 10us, which is the
base of dynamic halt-polling and the lower end of message-passing
workload latency, since TCP_RR's poll time is < 10us), we can treat it
as a short halt and poll while waiting for it to fire. The fire
callback apic_timer_fn() sets KVM_REQ_PENDING_TIMER, and this flag is
checked during the busy poll. This avoids the context-switch overhead
and the latency of waking up the vCPU.
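
In terms of kvm_vcpu_block(), the idea looks roughly like this (a
simplified sketch rather than the literal hunk further below; the
"success" label is only illustrative, while kvm_vcpu_check_block() and
single_task_running() are the existing helpers in virt/kvm/kvm_main.c):

    u64 remaining = kvm_arch_timer_remaining(vcpu); /* -1ULL: no timer armed */

    if (vcpu->halt_poll_ns || remaining < halt_poll_ns_timer) {
        u64 delta = vcpu->halt_poll_ns ? vcpu->halt_poll_ns : remaining;
        ktime_t stop = ktime_add_ns(ktime_get(), delta);

        do {
            /*
             * Returns < 0 once the vCPU became runnable, e.g. after
             * apic_timer_fn() has fired and flagged the pending timer,
             * so the poll ends without scheduling the vCPU out.
             */
            if (kvm_vcpu_check_block(vcpu) < 0)
                goto success;   /* counted as a successful poll */
        } while (single_task_running() && ktime_before(ktime_get(), stop));
    }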

This feature is slightly different from the current advance-expiration
mechanism. Advance expiration relies on the vCPU running (the polling
is done just before vmentry), but in some cases the timer interrupt may
be blocked by another thread (i.e., the IF bit is clear) and the vCPU
cannot be scheduled to run immediately, so even if the timer is
advanced the vCPU may still see the latency. Polling is different: it
ensures the vCPU observes the timer expiration before it is scheduled
out.

echo HRTICK > /sys/kernel/debug/sched_features is used in the dyntick
guests in order to enable the high-resolution preemption tick, so task
switches in the guest are triggered by hrtimer expiry.
halt_poll_ns_timer is set to 10000ns.
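
Since halt_poll_ns_timer is a writable module parameter, it can also be
set at run time; assuming KVM is built as kvm.ko, something like the
following should be enough for the "poll" run below (the sysfs path is
inferred from the module_param() declaration in the patch, so treat it
as an assumption):

    echo 10000 > /sys/module/kvm/parameters/halt_poll_ns_timer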

Context switching - times in microseconds - smaller is better
-------------------------------------------------------------------------
Host                 OS  2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
                         ctxsw  ctxsw  ctxsw ctxsw  ctxsw   ctxsw   ctxsw
--------- ------------- ------ ------ ------ ------ ------ ------- -------
kernel     Linux 4.6.0+ 7.9800   11.0   10.8   14.6 9.4300    13.0    10.2 vanilla
kernel     Linux 4.6.0+   15.3   13.6   10.7   12.5 9.0000    12.8 7.38000 poll

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: David Matlack <dmatlack@google.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Yang Zhang <yang.zhang.wz@gmail.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
---
v4 -> v5:
 * add module parameter halt_poll_ns_timer to enable/disable this feature
v3 -> v4:
 * add module parameter halt_poll_ns_timer
 * rename patch subject since lapic maybe just for x86.
v2 -> v3:
 * add Yang's statement to patch description
v1 -> v2:
 * add return statement to non-x86 archs
 * capture never expire case for x86 (hrtimer is not started)

 arch/arm/include/asm/kvm_host.h     |  4 ++++
 arch/arm64/include/asm/kvm_host.h   |  4 ++++
 arch/mips/include/asm/kvm_host.h    |  4 ++++
 arch/powerpc/include/asm/kvm_host.h |  4 ++++
 arch/s390/include/asm/kvm_host.h    |  4 ++++
 arch/x86/kvm/lapic.c                | 11 +++++++++++
 arch/x86/kvm/lapic.h                |  1 +
 arch/x86/kvm/x86.c                  |  5 +++++
 include/linux/kvm_host.h            |  1 +
 virt/kvm/kvm_main.c                 | 13 ++++++++++---
 10 files changed, 48 insertions(+), 3 deletions(-)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 0df6b1f..fdfbed9 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -292,6 +292,10 @@ static inline void kvm_arch_sync_events(struct kvm *kvm) {}
 static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
 static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
 static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
+static inline u64 kvm_arch_timer_remaining(struct kvm_vcpu *vcpu)
+{
+	return -1ULL;
+}
 
 static inline void kvm_arm_init_debug(void) {}
 static inline void kvm_arm_setup_debug(struct kvm_vcpu *vcpu) {}
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index e63d23b..f510d71 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -371,6 +371,10 @@ static inline void kvm_arch_sync_events(struct kvm *kvm) {}
 static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
 static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
 static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
+static inline u64 kvm_arch_timer_remaining(struct kvm_vcpu *vcpu)
+{
+	return -1ULL;
+}
 
 void kvm_arm_init_debug(void);
 void kvm_arm_setup_debug(struct kvm_vcpu *vcpu);
diff --git a/arch/mips/include/asm/kvm_host.h b/arch/mips/include/asm/kvm_host.h
index 6733ac5..baf9472 100644
--- a/arch/mips/include/asm/kvm_host.h
+++ b/arch/mips/include/asm/kvm_host.h
@@ -814,6 +814,10 @@ static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
 static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
 static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {}
 static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
+static inline u64 kvm_arch_timer_remaining(struct kvm_vcpu *vcpu)
+{
+	return -1ULL;
+}
 static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
 
 #endif /* __MIPS_KVM_HOST_H__ */
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index ec35af3..5986c79 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -729,5 +729,9 @@ static inline void kvm_arch_exit(void) {}
 static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {}
 static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
 static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
+static inline u64 kvm_arch_timer_remaining(struct kvm_vcpu *vcpu)
+{
+	return -1ULL;
+}
 
 #endif /* __POWERPC_KVM_HOST_H__ */
diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index 37b9017..bdb01a1 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -696,6 +696,10 @@ static inline void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
 		struct kvm_memory_slot *slot) {}
 static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {}
 static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
+static inline u64 kvm_arch_timer_remaining(struct kvm_vcpu *vcpu)
+{
+	return -1ULL;
+}
 
 void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu);
 
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index bbb5b28..cfeeac3 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -256,6 +256,17 @@ static inline int apic_lvtt_tscdeadline(struct kvm_lapic *apic)
 	return apic->lapic_timer.timer_mode == APIC_LVT_TIMER_TSCDEADLINE;
 }
 
+u64 apic_get_timer_expire(struct kvm_vcpu *vcpu)
+{
+	struct kvm_lapic *apic = vcpu->arch.apic;
+	struct hrtimer *timer = &apic->lapic_timer.timer;
+
+	if (!hrtimer_active(timer))
+		return -1ULL;
+	else
+		return ktime_to_ns(hrtimer_get_remaining(timer));
+}
+
 static inline int apic_lvt_nmi_mode(u32 lvt_val)
 {
 	return (lvt_val & (APIC_MODE_MASK | APIC_LVT_MASKED)) == APIC_DM_NMI;
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index 891c6da..ee4da6c 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -212,4 +212,5 @@ bool kvm_intr_is_single_vcpu_fast(struct kvm *kvm, struct kvm_lapic_irq *irq,
 			struct kvm_vcpu **dest_vcpu);
 int kvm_vector_to_index(u32 vector, u32 dest_vcpus,
 			const unsigned long *bitmap, u32 bitmap_size);
+u64 apic_get_timer_expire(struct kvm_vcpu *vcpu);
 #endif
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c805cf4..1b89a68 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7623,6 +7623,11 @@ bool kvm_vcpu_compatible(struct kvm_vcpu *vcpu)
 struct static_key kvm_no_apic_vcpu __read_mostly;
 EXPORT_SYMBOL_GPL(kvm_no_apic_vcpu);
 
+u64 kvm_arch_timer_remaining(struct kvm_vcpu *vcpu)
+{
+	return apic_get_timer_expire(vcpu);
+}
+
 int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
 {
 	struct page *page;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index b1fa8f1..14d6c23 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -663,6 +663,7 @@ int kvm_vcpu_yield_to(struct kvm_vcpu *target);
 void kvm_vcpu_on_spin(struct kvm_vcpu *vcpu);
 void kvm_load_guest_fpu(struct kvm_vcpu *vcpu);
 void kvm_put_guest_fpu(struct kvm_vcpu *vcpu);
+u64 kvm_arch_timer_remaining(struct kvm_vcpu *vcpu);
 
 void kvm_flush_remote_tlbs(struct kvm *kvm);
 void kvm_reload_remote_mmus(struct kvm *kvm);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index dd4ac9d..4c1914a 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -78,6 +78,10 @@ module_param(halt_poll_ns_grow, uint, S_IRUGO | S_IWUSR);
 static unsigned int halt_poll_ns_shrink;
 module_param(halt_poll_ns_shrink, uint, S_IRUGO | S_IWUSR);
 
+/* lower-end of message passing workload latency TCP_RR's poll time < 10us */
+static unsigned int halt_poll_ns_timer = 0;
+module_param(halt_poll_ns_timer, uint, S_IRUGO | S_IWUSR);
+
 /*
  * Ordering of locks:
  *
@@ -2014,12 +2018,15 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
 	ktime_t start, cur;
 	DECLARE_SWAITQUEUE(wait);
 	bool waited = false;
-	u64 block_ns;
+	u64 block_ns, delta, remaining;
 
+	remaining = kvm_arch_timer_remaining(vcpu);
 	start = cur = ktime_get();
-	if (vcpu->halt_poll_ns) {
-		ktime_t stop = ktime_add_ns(ktime_get(), vcpu->halt_poll_ns);
+	if (vcpu->halt_poll_ns || remaining < halt_poll_ns_timer) {
+		ktime_t stop;
 
+		delta = vcpu->halt_poll_ns ? vcpu->halt_poll_ns : remaining;
+		stop = ktime_add_ns(ktime_get(), delta);
 		++vcpu->stat.halt_attempted_poll;
 		do {
 			/*
-- 
1.9.1


* Re: [PATCH v5] KVM: halt-polling: poll for the upcoming fire timers
  2016-05-25  2:47 [PATCH v5] KVM: halt-polling: poll for the upcoming fire timers Wanpeng Li
@ 2016-05-26 10:26 ` Paolo Bonzini
  2016-05-26 10:30   ` Paolo Bonzini
  2016-05-26 20:33   ` yunhong jiang
  0 siblings, 2 replies; 7+ messages in thread
From: Paolo Bonzini @ 2016-05-26 10:26 UTC (permalink / raw)
  To: Wanpeng Li, linux-kernel, kvm
  Cc: Wanpeng Li, Radim Krčmář,
	David Matlack, Christian Borntraeger, Yang Zhang



On 25/05/2016 04:47, Wanpeng Li wrote:
> From: Wanpeng Li <wanpeng.li@hotmail.com>
> 
> If an emulated lapic timer will fire soon (within 10us, which is the
> base of dynamic halt-polling and the lower end of message-passing
> workload latency, since TCP_RR's poll time is < 10us), we can treat it
> as a short halt and poll while waiting for it to fire. The fire
> callback apic_timer_fn() sets KVM_REQ_PENDING_TIMER, and this flag is
> checked during the busy poll. This avoids the context-switch overhead
> and the latency of waking up the vCPU.

As discussed on IRC, I would like to understand why the adaptive
adjustment of halt_poll_ns is failing.  It seems like you have so few
halts that you don't get halt_poll_ns>0.  Yet, when the VM halts, it's
very close to the timer tick---often enough for this patch to have an
effect.
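
For reference, the adaptive adjustment in question looks roughly like
this at the end of kvm_vcpu_block() (simplified from the 4.6-era code,
so take the details as approximate):

    /* block_ns is how long the vCPU actually stayed blocked */
    if (halt_poll_ns) {
        if (block_ns <= vcpu->halt_poll_ns)
            ;                          /* window was big enough, keep it */
        else if (vcpu->halt_poll_ns && block_ns > halt_poll_ns)
            shrink_halt_poll_ns(vcpu); /* long halt: shrink (to 0 by default) */
        else if (vcpu->halt_poll_ns < halt_poll_ns &&
                 block_ns < halt_poll_ns)
            grow_halt_poll_ns(vcpu);   /* short halt, window too small: grow */
    } else {
        vcpu->halt_poll_ns = 0;
    }

The window only grows after short halts, so a guest that rarely halts,
or that halts for long stretches, never ends up with halt_poll_ns > 0.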

Please send a trace of halt_poll_ns_grow and halt_poll_ns_shrink
tracepoints, so that we can find out more about this.

Thanks,

Paolo

> This feature is slightly different from the current advance-expiration
> mechanism. Advance expiration relies on the vCPU running (the polling
> is done just before vmentry), but in some cases the timer interrupt may
> be blocked by another thread (i.e., the IF bit is clear) and the vCPU
> cannot be scheduled to run immediately, so even if the timer is
> advanced the vCPU may still see the latency. Polling is different: it
> ensures the vCPU observes the timer expiration before it is scheduled
> out.
> 
> echo HRTICK > /sys/kernel/debug/sched_features is used in the dyntick
> guests in order to enable the high-resolution preemption tick, so task
> switches in the guest are triggered by hrtimer expiry.
> halt_poll_ns_timer is set to 10000ns.
> 
> Context switching - times in microseconds - smaller is better
> -------------------------------------------------------------------------
> Host                 OS  2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
>                          ctxsw  ctxsw  ctxsw ctxsw  ctxsw   ctxsw   ctxsw
> --------- ------------- ------ ------ ------ ------ ------ ------- -------
> kernel     Linux 4.6.0+ 7.9800   11.0   10.8   14.6 9.4300    13.0    10.2 vanilla
> kernel     Linux 4.6.0+   15.3   13.6   10.7   12.5 9.0000    12.8 7.38000 poll
> 
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Cc: David Matlack <dmatlack@google.com>
> Cc: Christian Borntraeger <borntraeger@de.ibm.com>
> Cc: Yang Zhang <yang.zhang.wz@gmail.com>
> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
> ---
> v4 -> v5:
>  * add module parameter halt_poll_ns_timer to enable/disable this feature
> v3 -> v4:
>  * add module parameter halt_poll_ns_timer
>  * rename patch subject since lapic maybe just for x86.
> v2 -> v3:
>  * add Yang's statement to patch description
> v1 -> v2:
>  * add return statement to non-x86 archs
>  * capture never expire case for x86 (hrtimer is not started)
> 
>  arch/arm/include/asm/kvm_host.h     |  4 ++++
>  arch/arm64/include/asm/kvm_host.h   |  4 ++++
>  arch/mips/include/asm/kvm_host.h    |  4 ++++
>  arch/powerpc/include/asm/kvm_host.h |  4 ++++
>  arch/s390/include/asm/kvm_host.h    |  4 ++++
>  arch/x86/kvm/lapic.c                | 11 +++++++++++
>  arch/x86/kvm/lapic.h                |  1 +
>  arch/x86/kvm/x86.c                  |  5 +++++
>  include/linux/kvm_host.h            |  1 +
>  virt/kvm/kvm_main.c                 | 13 ++++++++++---
>  10 files changed, 48 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index 0df6b1f..fdfbed9 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -292,6 +292,10 @@ static inline void kvm_arch_sync_events(struct kvm *kvm) {}
>  static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
>  static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
>  static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
> +static inline u64 kvm_arch_timer_remaining(struct kvm_vcpu *vcpu)
> +{
> +	return -1ULL;
> +}
>  
>  static inline void kvm_arm_init_debug(void) {}
>  static inline void kvm_arm_setup_debug(struct kvm_vcpu *vcpu) {}
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index e63d23b..f510d71 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -371,6 +371,10 @@ static inline void kvm_arch_sync_events(struct kvm *kvm) {}
>  static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
>  static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
>  static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
> +static inline u64 kvm_arch_timer_remaining(struct kvm_vcpu *vcpu)
> +{
> +	return -1ULL;
> +}
>  
>  void kvm_arm_init_debug(void);
>  void kvm_arm_setup_debug(struct kvm_vcpu *vcpu);
> diff --git a/arch/mips/include/asm/kvm_host.h b/arch/mips/include/asm/kvm_host.h
> index 6733ac5..baf9472 100644
> --- a/arch/mips/include/asm/kvm_host.h
> +++ b/arch/mips/include/asm/kvm_host.h
> @@ -814,6 +814,10 @@ static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
>  static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
>  static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {}
>  static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
> +static inline u64 kvm_arch_timer_remaining(struct kvm_vcpu *vcpu)
> +{
> +	return -1ULL;
> +}
>  static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
>  
>  #endif /* __MIPS_KVM_HOST_H__ */
> diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
> index ec35af3..5986c79 100644
> --- a/arch/powerpc/include/asm/kvm_host.h
> +++ b/arch/powerpc/include/asm/kvm_host.h
> @@ -729,5 +729,9 @@ static inline void kvm_arch_exit(void) {}
>  static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {}
>  static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
>  static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
> +static inline u64 kvm_arch_timer_remaining(struct kvm_vcpu *vcpu)
> +{
> +	return -1ULL;
> +}
>  
>  #endif /* __POWERPC_KVM_HOST_H__ */
> diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
> index 37b9017..bdb01a1 100644
> --- a/arch/s390/include/asm/kvm_host.h
> +++ b/arch/s390/include/asm/kvm_host.h
> @@ -696,6 +696,10 @@ static inline void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
>  		struct kvm_memory_slot *slot) {}
>  static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {}
>  static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
> +static inline u64 kvm_arch_timer_remaining(struct kvm_vcpu *vcpu)
> +{
> +	return -1ULL;
> +}
>  
>  void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu);
>  
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index bbb5b28..cfeeac3 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -256,6 +256,17 @@ static inline int apic_lvtt_tscdeadline(struct kvm_lapic *apic)
>  	return apic->lapic_timer.timer_mode == APIC_LVT_TIMER_TSCDEADLINE;
>  }
>  
> +u64 apic_get_timer_expire(struct kvm_vcpu *vcpu)
> +{
> +	struct kvm_lapic *apic = vcpu->arch.apic;
> +	struct hrtimer *timer = &apic->lapic_timer.timer;
> +
> +	if (!hrtimer_active(timer))
> +		return -1ULL;
> +	else
> +		return ktime_to_ns(hrtimer_get_remaining(timer));
> +}
> +
>  static inline int apic_lvt_nmi_mode(u32 lvt_val)
>  {
>  	return (lvt_val & (APIC_MODE_MASK | APIC_LVT_MASKED)) == APIC_DM_NMI;
> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
> index 891c6da..ee4da6c 100644
> --- a/arch/x86/kvm/lapic.h
> +++ b/arch/x86/kvm/lapic.h
> @@ -212,4 +212,5 @@ bool kvm_intr_is_single_vcpu_fast(struct kvm *kvm, struct kvm_lapic_irq *irq,
>  			struct kvm_vcpu **dest_vcpu);
>  int kvm_vector_to_index(u32 vector, u32 dest_vcpus,
>  			const unsigned long *bitmap, u32 bitmap_size);
> +u64 apic_get_timer_expire(struct kvm_vcpu *vcpu);
>  #endif
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index c805cf4..1b89a68 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -7623,6 +7623,11 @@ bool kvm_vcpu_compatible(struct kvm_vcpu *vcpu)
>  struct static_key kvm_no_apic_vcpu __read_mostly;
>  EXPORT_SYMBOL_GPL(kvm_no_apic_vcpu);
>  
> +u64 kvm_arch_timer_remaining(struct kvm_vcpu *vcpu)
> +{
> +	return apic_get_timer_expire(vcpu);
> +}
> +
>  int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
>  {
>  	struct page *page;
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index b1fa8f1..14d6c23 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -663,6 +663,7 @@ int kvm_vcpu_yield_to(struct kvm_vcpu *target);
>  void kvm_vcpu_on_spin(struct kvm_vcpu *vcpu);
>  void kvm_load_guest_fpu(struct kvm_vcpu *vcpu);
>  void kvm_put_guest_fpu(struct kvm_vcpu *vcpu);
> +u64 kvm_arch_timer_remaining(struct kvm_vcpu *vcpu);
>  
>  void kvm_flush_remote_tlbs(struct kvm *kvm);
>  void kvm_reload_remote_mmus(struct kvm *kvm);
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index dd4ac9d..4c1914a 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -78,6 +78,10 @@ module_param(halt_poll_ns_grow, uint, S_IRUGO | S_IWUSR);
>  static unsigned int halt_poll_ns_shrink;
>  module_param(halt_poll_ns_shrink, uint, S_IRUGO | S_IWUSR);
>  
> +/* lower-end of message passing workload latency TCP_RR's poll time < 10us */
> +static unsigned int halt_poll_ns_timer = 0;
> +module_param(halt_poll_ns_timer, uint, S_IRUGO | S_IWUSR);
> +
>  /*
>   * Ordering of locks:
>   *
> @@ -2014,12 +2018,15 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
>  	ktime_t start, cur;
>  	DECLARE_SWAITQUEUE(wait);
>  	bool waited = false;
> -	u64 block_ns;
> +	u64 block_ns, delta, remaining;
>  
> +	remaining = kvm_arch_timer_remaining(vcpu);
>  	start = cur = ktime_get();
> -	if (vcpu->halt_poll_ns) {
> -		ktime_t stop = ktime_add_ns(ktime_get(), vcpu->halt_poll_ns);
> +	if (vcpu->halt_poll_ns || remaining < halt_poll_ns_timer) {
> +		ktime_t stop;
>  
> +		delta = vcpu->halt_poll_ns ? vcpu->halt_poll_ns : remaining;
> +		stop = ktime_add_ns(ktime_get(), delta);
>  		++vcpu->stat.halt_attempted_poll;
>  		do {
>  			/*
> 


* Re: [PATCH v5] KVM: halt-polling: poll for the upcoming fire timers
  2016-05-26 10:26 ` Paolo Bonzini
@ 2016-05-26 10:30   ` Paolo Bonzini
  2016-05-26 11:23     ` Wanpeng Li
  2016-05-26 20:33   ` yunhong jiang
  1 sibling, 1 reply; 7+ messages in thread
From: Paolo Bonzini @ 2016-05-26 10:30 UTC (permalink / raw)
  To: Wanpeng Li, linux-kernel, kvm
  Cc: Wanpeng Li, Radim Krčmář,
	David Matlack, Christian Borntraeger, Yang Zhang



On 26/05/2016 12:26, Paolo Bonzini wrote:
> As discussed on IRC, I would like to understand why the adaptive
> adjustment of halt_poll_ns is failing.  It seems like you have so few
> halts that you don't get halt_poll_ns>0.  Yet, when the VM halts, it's
> very close to the timer tick---often enough for this patch to have an
> effect.
> 
> Please send a trace of halt_poll_ns_grow and halt_poll_ns_shrink
> tracepoints, so that we can find out more about this.

And 30 seconds after I wrote this email, you told me on IRC that the
guest had HZ=1000 and the module parameter was set to 1 ms in order to
_really_ benefit from the patch.  So basically you could obtain the same
effect with idle=poll in the guest.

This explains why your reported results were not so great (as David noted).

Thanks,

Paolo


* Re: [PATCH v5] KVM: halt-polling: poll for the upcoming fire timers
  2016-05-26 10:30   ` Paolo Bonzini
@ 2016-05-26 11:23     ` Wanpeng Li
  0 siblings, 0 replies; 7+ messages in thread
From: Wanpeng Li @ 2016-05-26 11:23 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, Wanpeng Li, Radim Krčmář,
	David Matlack, Christian Borntraeger, Yang Zhang

2016-05-26 18:30 GMT+08:00 Paolo Bonzini <pbonzini@redhat.com>:
>
>
> On 26/05/2016 12:26, Paolo Bonzini wrote:
>> As discussed on IRC, I would like to understand why the adaptive
>> adjustment of halt_poll_ns is failing.  It seems like you have so few
>> halts that you don't get halt_poll_ns>0.  Yet, when the VM halts, it's
>> very close to the timer tick---often enough for this patch to have an
>> effect.
>>
>> Please send a trace of halt_poll_ns_grow and halt_poll_ns_shrink
>> tracepoints, so that we can find out more about this.
>
> And 30 seconds after I wrote this email, you told me on IRC that the
> guest had HZ=1000 and the module parameter was set to 1 ms in order to
> _really_ benefit from the patch.  So basically you could obtain the same
> effect with idle=poll in the guest.
>
> This explains why your reported results were not so great (as David noted).

Yeah, I will drop the patch.

Regards,
Wanpeng Li


* Re: [PATCH v5] KVM: halt-polling: poll for the upcoming fire timers
  2016-05-26 10:26 ` Paolo Bonzini
  2016-05-26 10:30   ` Paolo Bonzini
@ 2016-05-26 20:33   ` yunhong jiang
  2016-05-27  9:49     ` Paolo Bonzini
  1 sibling, 1 reply; 7+ messages in thread
From: yunhong jiang @ 2016-05-26 20:33 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Wanpeng Li, linux-kernel, kvm, Wanpeng Li,
	Radim Krčmář,
	David Matlack, Christian Borntraeger, Yang Zhang

On Thu, 26 May 2016 12:26:27 +0200
Paolo Bonzini <pbonzini@redhat.com> wrote:

> 
> 
> On 25/05/2016 04:47, Wanpeng Li wrote:
> > From: Wanpeng Li <wanpeng.li@hotmail.com>
> > 
> > If an emulated lapic timer will fire soon (within 10us, which is the
> > base of dynamic halt-polling and the lower end of message-passing
> > workload latency, since TCP_RR's poll time is < 10us), we can treat it
> > as a short halt and poll while waiting for it to fire. The fire
> > callback apic_timer_fn() sets KVM_REQ_PENDING_TIMER, and this flag is
> > checked during the busy poll. This avoids the context-switch overhead
> > and the latency of waking up the vCPU.
> 
> As discussed on IRC, I would like to understand why the adaptive

Glad to know the IRC channel. Is #kvm channel on freenode the IRC you are
talking about?

Thanks
--jyh


* Re: [PATCH v5] KVM: halt-polling: poll for the upcoming fire timers
  2016-05-26 20:33   ` yunhong jiang
@ 2016-05-27  9:49     ` Paolo Bonzini
  2016-05-28  0:14       ` yunhong jiang
  0 siblings, 1 reply; 7+ messages in thread
From: Paolo Bonzini @ 2016-05-27  9:49 UTC (permalink / raw)
  To: yunhong jiang
  Cc: Wanpeng Li, linux-kernel, kvm, Wanpeng Li,
	Radim Krčmář,
	David Matlack, Christian Borntraeger, Yang Zhang



On 26/05/2016 22:33, yunhong jiang wrote:
> On Thu, 26 May 2016 12:26:27 +0200
> Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
>>
>>
>> On 25/05/2016 04:47, Wanpeng Li wrote:
>>> From: Wanpeng Li <wanpeng.li@hotmail.com>
>>>
>>> If an emulated lapic timer will fire soon (within 10us, which is the
>>> base of dynamic halt-polling and the lower end of message-passing
>>> workload latency, since TCP_RR's poll time is < 10us), we can treat it
>>> as a short halt and poll while waiting for it to fire. The fire
>>> callback apic_timer_fn() sets KVM_REQ_PENDING_TIMER, and this flag is
>>> checked during the busy poll. This avoids the context-switch overhead
>>> and the latency of waking up the vCPU.
>>
>> As discussed on IRC, I would like to understand why the adaptive
> 
> Glad to know the IRC channel. Is #kvm channel on freenode the IRC you are
> talking about?

No, it's #qemu on irc.oftc.net.

Thanks,

Paolo


* Re: [PATCH v5] KVM: halt-polling: poll for the upcoming fire timers
  2016-05-27  9:49     ` Paolo Bonzini
@ 2016-05-28  0:14       ` yunhong jiang
  0 siblings, 0 replies; 7+ messages in thread
From: yunhong jiang @ 2016-05-28  0:14 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Wanpeng Li, linux-kernel, kvm, Wanpeng Li,
	Radim Krčmář,
	David Matlack, Christian Borntraeger, Yang Zhang

On Fri, 27 May 2016 11:49:26 +0200
Paolo Bonzini <pbonzini@redhat.com> wrote:

> 
> 
> On 26/05/2016 22:33, yunhong jiang wrote:
> > On Thu, 26 May 2016 12:26:27 +0200
> > Paolo Bonzini <pbonzini@redhat.com> wrote:
> > 
> >>
> >>
> >> On 25/05/2016 04:47, Wanpeng Li wrote:
> >>> From: Wanpeng Li <wanpeng.li@hotmail.com>
> >>>
> >>> If an emulated lapic timer will fire soon (within 10us, which is the
> >>> base of dynamic halt-polling and the lower end of message-passing
> >>> workload latency, since TCP_RR's poll time is < 10us), we can treat
> >>> it as a short halt and poll while waiting for it to fire. The fire
> >>> callback apic_timer_fn() sets KVM_REQ_PENDING_TIMER, and this flag
> >>> is checked during the busy poll. This avoids the context-switch
> >>> overhead and the latency of waking up the vCPU.
> >>
> >> As discussed on IRC, I would like to understand why the adaptive
> > 
> > Glad to know the IRC channel. Is #kvm channel on freenode the IRC
> > you are talking about?
> 
> No, it's #qemu on irc.oftc.net

Thanks for the information.

--jyh

> 
> Thanks,
> 
> Paolo
