linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] KVM: selftests: Make hyperv_clock selftest more stable
@ 2022-06-01 14:43 Vitaly Kuznetsov
  2022-06-01 16:06 ` Sean Christopherson
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Vitaly Kuznetsov @ 2022-06-01 14:43 UTC (permalink / raw)
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Maxim Levitsky, Wanpeng Li, Jim Mattson,
	linux-kernel

hyperv_clock doesn't always give a stable test result, especially with
AMD CPUs. The test compares Hyper-V MSR clocksource (acquired either
with rdmsr() from within the guest or KVM_GET_MSRS from the host)
against rdtsc(). To increase the accuracy, increase the measured delay
(done with nop loop) by two orders of magnitude and take the mean rdtsc()
value before and after rdmsr()/KVM_GET_MSRS.

Reported-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 tools/testing/selftests/kvm/x86_64/hyperv_clock.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86_64/hyperv_clock.c b/tools/testing/selftests/kvm/x86_64/hyperv_clock.c
index e0b2bb1339b1..3330fb183c68 100644
--- a/tools/testing/selftests/kvm/x86_64/hyperv_clock.c
+++ b/tools/testing/selftests/kvm/x86_64/hyperv_clock.c
@@ -44,7 +44,7 @@ static inline void nop_loop(void)
 {
 	int i;
 
-	for (i = 0; i < 1000000; i++)
+	for (i = 0; i < 100000000; i++)
 		asm volatile("nop");
 }
 
@@ -56,12 +56,14 @@ static inline void check_tsc_msr_rdtsc(void)
 	tsc_freq = rdmsr(HV_X64_MSR_TSC_FREQUENCY);
 	GUEST_ASSERT(tsc_freq > 0);
 
-	/* First, check MSR-based clocksource */
+	/* For increased accuracy, take mean rdtsc() before and afrer rdmsr() */
 	r1 = rdtsc();
 	t1 = rdmsr(HV_X64_MSR_TIME_REF_COUNT);
+	r1 = (r1 + rdtsc()) / 2;
 	nop_loop();
 	r2 = rdtsc();
 	t2 = rdmsr(HV_X64_MSR_TIME_REF_COUNT);
+	r2 = (r2 + rdtsc()) / 2;
 
 	GUEST_ASSERT(r2 > r1 && t2 > t1);
 
@@ -181,12 +183,14 @@ static void host_check_tsc_msr_rdtsc(struct kvm_vm *vm)
 	tsc_freq = vcpu_get_msr(vm, VCPU_ID, HV_X64_MSR_TSC_FREQUENCY);
 	TEST_ASSERT(tsc_freq > 0, "TSC frequency must be nonzero");
 
-	/* First, check MSR-based clocksource */
+	/* For increased accuracy, take mean rdtsc() before and afrer ioctl */
 	r1 = rdtsc();
 	t1 = vcpu_get_msr(vm, VCPU_ID, HV_X64_MSR_TIME_REF_COUNT);
+	r1 = (r1 + rdtsc()) / 2;
 	nop_loop();
 	r2 = rdtsc();
 	t2 = vcpu_get_msr(vm, VCPU_ID, HV_X64_MSR_TIME_REF_COUNT);
+	r2 = (r2 + rdtsc()) / 2;
 
 	TEST_ASSERT(t2 > t1, "Time reference MSR is not monotonic (%ld <= %ld)", t1, t2);
 
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] KVM: selftests: Make hyperv_clock selftest more stable
  2022-06-01 14:43 [PATCH] KVM: selftests: Make hyperv_clock selftest more stable Vitaly Kuznetsov
@ 2022-06-01 16:06 ` Sean Christopherson
  2022-06-02 13:34   ` Vitaly Kuznetsov
  2022-06-07  8:07   ` Maxim Levitsky
  2022-06-07  8:26 ` Maxim Levitsky
  2022-06-07 15:22 ` Paolo Bonzini
  2 siblings, 2 replies; 6+ messages in thread
From: Sean Christopherson @ 2022-06-01 16:06 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Maxim Levitsky, Wanpeng Li, Jim Mattson,
	linux-kernel

On Wed, Jun 01, 2022, Vitaly Kuznetsov wrote:
> hyperv_clock doesn't always give a stable test result, especially with
> AMD CPUs. The test compares Hyper-V MSR clocksource (acquired either
> with rdmsr() from within the guest or KVM_GET_MSRS from the host)
> against rdtsc(). To increase the accuracy, increase the measured delay
> (done with nop loop) by two orders of magnitude and take the mean rdtsc()
> value before and after rdmsr()/KVM_GET_MSRS.

Rather than "fixing" the test by reducing the impact of noise, can we first try
to reduce the noise itself?  E.g. pin the test to a single CPU, redo the measurement
if the test is interrupted (/proc/interrupts?), etc...  Bonus points if that can
be implemented as a helper or pair of helpers so that other tests that want to
measure latency/time don't need to reinvent the wheel.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] KVM: selftests: Make hyperv_clock selftest more stable
  2022-06-01 16:06 ` Sean Christopherson
@ 2022-06-02 13:34   ` Vitaly Kuznetsov
  2022-06-07  8:07   ` Maxim Levitsky
  1 sibling, 0 replies; 6+ messages in thread
From: Vitaly Kuznetsov @ 2022-06-02 13:34 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, Paolo Bonzini, Maxim Levitsky, Wanpeng Li, Jim Mattson,
	linux-kernel

Sean Christopherson <seanjc@google.com> writes:

> On Wed, Jun 01, 2022, Vitaly Kuznetsov wrote:
>> hyperv_clock doesn't always give a stable test result, especially with
>> AMD CPUs. The test compares Hyper-V MSR clocksource (acquired either
>> with rdmsr() from within the guest or KVM_GET_MSRS from the host)
>> against rdtsc(). To increase the accuracy, increase the measured delay
>> (done with nop loop) by two orders of magnitude and take the mean rdtsc()
>> value before and after rdmsr()/KVM_GET_MSRS.
>
> Rather than "fixing" the test by reducing the impact of noise, can we first try
> to reduce the noise itself?  E.g. pin the test to a single CPU, redo the measurement
> if the test is interrupted (/proc/interrupts?), etc...  Bonus points if that can
> be implemented as a helper or pair of helpers so that other tests that want to
> measure latency/time don't need to reinvent the wheel.

While I'm not certain task migration to another CPU was always the
problem here (maybe the measured interval is too short anyway), I agree
these are good ideas, I'll look into them, thanks!

-- 
Vitaly


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] KVM: selftests: Make hyperv_clock selftest more stable
  2022-06-01 16:06 ` Sean Christopherson
  2022-06-02 13:34   ` Vitaly Kuznetsov
@ 2022-06-07  8:07   ` Maxim Levitsky
  1 sibling, 0 replies; 6+ messages in thread
From: Maxim Levitsky @ 2022-06-07  8:07 UTC (permalink / raw)
  To: Sean Christopherson, Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson, linux-kernel

On Wed, 2022-06-01 at 16:06 +0000, Sean Christopherson wrote:
> On Wed, Jun 01, 2022, Vitaly Kuznetsov wrote:
> > hyperv_clock doesn't always give a stable test result, especially
> > with
> > AMD CPUs. The test compares Hyper-V MSR clocksource (acquired
> > either
> > with rdmsr() from within the guest or KVM_GET_MSRS from the host)
> > against rdtsc(). To increase the accuracy, increase the measured
> > delay
> > (done with nop loop) by two orders of magnitude and take the mean
> > rdtsc()
> > value before and after rdmsr()/KVM_GET_MSRS.
> 
> Rather than "fixing" the test by reducing the impact of noise, can we
> first try
> to reduce the noise itself?  E.g. pin the test to a single CPU, redo
> the measurement

Pinning is a good idea overall, however IMHO should not be done in 
all KVM selftests, as vCPU migration itself can be source of bugs.


> if the test is interrupted (/proc/interrupts?), etc...  Bonus points
This is not feasable IMHO - timer interrupt alone can fire at rate of
1000 interrupts/s. Just while reading /proc/interurpts you probably get
few of interrupts.


> if that can
> be implemented as a helper or pair of helpers so that other tests
> that want to
> measure latency/time don't need to reinvent the wheel.
> 

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] KVM: selftests: Make hyperv_clock selftest more stable
  2022-06-01 14:43 [PATCH] KVM: selftests: Make hyperv_clock selftest more stable Vitaly Kuznetsov
  2022-06-01 16:06 ` Sean Christopherson
@ 2022-06-07  8:26 ` Maxim Levitsky
  2022-06-07 15:22 ` Paolo Bonzini
  2 siblings, 0 replies; 6+ messages in thread
From: Maxim Levitsky @ 2022-06-07  8:26 UTC (permalink / raw)
  To: Vitaly Kuznetsov, kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, linux-kernel

On Wed, 2022-06-01 at 16:43 +0200, Vitaly Kuznetsov wrote:
> hyperv_clock doesn't always give a stable test result, especially
> with
> AMD CPUs. The test compares Hyper-V MSR clocksource (acquired either
> with rdmsr() from within the guest or KVM_GET_MSRS from the host)
> against rdtsc(). To increase the accuracy, increase the measured
> delay
> (done with nop loop) by two orders of magnitude and take the mean
> rdtsc()
> value before and after rdmsr()/KVM_GET_MSRS.
Tiny nitpick: any reason why you think that AMD is more prone
to the failure? This test was failing on my Intel's laptop as well.

> 
> Reported-by: Maxim Levitsky <mlevitsk@redhat.com>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  tools/testing/selftests/kvm/x86_64/hyperv_clock.c | 10 +++++++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/testing/selftests/kvm/x86_64/hyperv_clock.c
> b/tools/testing/selftests/kvm/x86_64/hyperv_clock.c
> index e0b2bb1339b1..3330fb183c68 100644
> --- a/tools/testing/selftests/kvm/x86_64/hyperv_clock.c
> +++ b/tools/testing/selftests/kvm/x86_64/hyperv_clock.c
> @@ -44,7 +44,7 @@ static inline void nop_loop(void)
>  {
>         int i;
>  
> -       for (i = 0; i < 1000000; i++)
> +       for (i = 0; i < 100000000; i++)
>                 asm volatile("nop");
>  }
>  
> @@ -56,12 +56,14 @@ static inline void check_tsc_msr_rdtsc(void)
>         tsc_freq = rdmsr(HV_X64_MSR_TSC_FREQUENCY);
>         GUEST_ASSERT(tsc_freq > 0);
>  
> -       /* First, check MSR-based clocksource */
> +       /* For increased accuracy, take mean rdtsc() before and afrer
> rdmsr() */
>         r1 = rdtsc();
>         t1 = rdmsr(HV_X64_MSR_TIME_REF_COUNT);
> +       r1 = (r1 + rdtsc()) / 2;
>         nop_loop();
>         r2 = rdtsc();
>         t2 = rdmsr(HV_X64_MSR_TIME_REF_COUNT);
> +       r2 = (r2 + rdtsc()) / 2;
>  
>         GUEST_ASSERT(r2 > r1 && t2 > t1);
>  
> @@ -181,12 +183,14 @@ static void host_check_tsc_msr_rdtsc(struct
> kvm_vm *vm)
>         tsc_freq = vcpu_get_msr(vm, VCPU_ID,
> HV_X64_MSR_TSC_FREQUENCY);
>         TEST_ASSERT(tsc_freq > 0, "TSC frequency must be nonzero");
>  
> -       /* First, check MSR-based clocksource */
> +       /* For increased accuracy, take mean rdtsc() before and afrer
> ioctl */
>         r1 = rdtsc();
>         t1 = vcpu_get_msr(vm, VCPU_ID, HV_X64_MSR_TIME_REF_COUNT);
> +       r1 = (r1 + rdtsc()) / 2;
>         nop_loop();
>         r2 = rdtsc();
>         t2 = vcpu_get_msr(vm, VCPU_ID, HV_X64_MSR_TIME_REF_COUNT);
> +       r2 = (r2 + rdtsc()) / 2;
>  
>         TEST_ASSERT(t2 > t1, "Time reference MSR is not monotonic
> (%ld <= %ld)", t1, t2);
>  

Both changes make sense, and the test doesn't fail anymore on my AMD
laptop.
Soon I'll test on my Intel laptop as well.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Tested-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] KVM: selftests: Make hyperv_clock selftest more stable
  2022-06-01 14:43 [PATCH] KVM: selftests: Make hyperv_clock selftest more stable Vitaly Kuznetsov
  2022-06-01 16:06 ` Sean Christopherson
  2022-06-07  8:26 ` Maxim Levitsky
@ 2022-06-07 15:22 ` Paolo Bonzini
  2 siblings, 0 replies; 6+ messages in thread
From: Paolo Bonzini @ 2022-06-07 15:22 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, Sean Christopherson, Maxim Levitsky, Wanpeng Li,
	Jim Mattson, linux-kernel

Queued, thanks.

Paolo



^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2022-06-07 15:22 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-01 14:43 [PATCH] KVM: selftests: Make hyperv_clock selftest more stable Vitaly Kuznetsov
2022-06-01 16:06 ` Sean Christopherson
2022-06-02 13:34   ` Vitaly Kuznetsov
2022-06-07  8:07   ` Maxim Levitsky
2022-06-07  8:26 ` Maxim Levitsky
2022-06-07 15:22 ` Paolo Bonzini

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).