All of lore.kernel.org
 help / color / mirror / Atom feed
From: Roman Kagan <rkagan@virtuozzo.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: <kvm@vger.kernel.org>, Andrey Smetanin <asmetanin@virtuozzo.com>,
	"Denis V. Lunev" <den@openvz.org>,
	Marcelo Tosatti <mtosatti@redhat.com>
Subject: Re: [PATCH kvm-unit-tests] KVM: x86: add hyperv clock test case
Date: Thu, 21 Apr 2016 20:01:58 +0300	[thread overview]
Message-ID: <20160421170157.GA16360@rkaganb.sw.ru> (raw)
In-Reply-To: <56B22CB4.9090404@redhat.com>

On Wed, Feb 03, 2016 at 05:37:08PM +0100, Paolo Bonzini wrote:
> On 28/01/2016 17:22, Roman Kagan wrote:
> > On Thu, Jan 28, 2016 at 03:04:58PM +0100, Paolo Bonzini wrote:
> >> The test checks the relative precision of the reference TSC page
> >> and the time reference counter.
> >>
> >> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> >> ---
> >> 	Andrey, the test has a relative error of approximately 3 parts
> >> 	per million on my machine.  In other words it drifts by 3
> >> 	microseconds every second, which I don't think is acceptable.
> >> 	The problem is that virtual_tsc_khz is 2593993 while the actual
> >> 	frequency is more like 2594001 kHz.
> > 
> > Hmm, how come?  I thought virtual_tsc_khz could only diverge from
> > tsc_khz on migration.
> > 
> > [Or maybe I just misunderstand what you mean by "the actual frequency".]
> 
> Talking to Marcelo helped me understanding it. :)
> 
> The issue is that the TSC kHz is not the correct value to use for the  
> the TSC page.  Instead, we want to pass to the guest a scale value 
> corresponding to the kernel's timekeeping multiplier.  This is what 
> both kvmclock and the time reference counter use.
> 
> The bit that I was missing last week was how the guest could perceive
> updates to the TSC page parameters as atomic.  We actually _can_
> emulate seqlock-like behavior in Hyper-V by writing 0 to seq during
> an update.  Instead of looping like seqlocks do, the guest will simply
> use the time reference counter and take a hypervisor exit.  The result
> however is still valid, because we want the time reference counter to
> be perfectly synchronized with the Hyper-V clock and lets you handle
> any kvmclock update scenario safely.
> 
> Therefore the algorithm would be like patch 2/2 I sent, but with
> important differences.  You'd still
> 
> 	if (hv->hv_tsc_page & HV_X64_MSR_TSC_REFERENCE_ENABLE)
> 		kvm_make_request(KVM_REQ_MASTERCLOCK_UPDATE, vcpu);
> 
> on MSR writes, but then hook into kvm_guest_time_update rather than
> kvm_gen_update_masterclock, like:
> 
> 	if (v == kvm_get_vcpu(v->kvm, 0))
> 	        kvm_write_hyperv_tsc_page(v->kvm, &vcpu->hv_clock);
> 
> and in kvm_write_hyperv_tsc_page the calculation would be based on
> the kvmclock parameters:
> 
>         {
>                 write 0 to seq;
>                 if (!(hv_clock->flags & PVCLOCK_TSC_STABLE_BIT))
>                         return;
> 
>                 compute scale and offset from hv_clock mul and shift;
>                 write scale and offset;
> 		write sequence
>         }
> 
> all of scale, offset and sequence can be computed from kvmclock parameters.
> 
> For sequence we have to convert "even values between 0 and 0xFFFFFFFE"
> into "values between "1 and 0xFFFFFFFE".  For example:
> 
> 	hyper-v sequence = (kvmclock sequence >> 1) + 1
> 
> will do it.  Computing the scale and offset is something like:
> 
>         // KVM formula:
> 	//    nsec = (ticks - tsc_timestamp) * tsc_to_system_mul << tsc_shift + system_time
> 	//
> 	// hyper-v formula:
> 	//    nsec/100 = ticks * scale / 2^32 + offset
> 	//
> 	// when tsc_timestamp, offset is zero so remove them both:
>         //    ticks * tsc_to_system_mul << tsc_shift / 100 = ticks * scale / 2^32
> 	//
> 	// multiply both sides by 2^32 / ticks and you get scale:
> 	//    scale = tsc_to_system_mul << (32 + tsc_shift) / 100
> 	//
> 	// check if it would overflow, and then use the time ref counter
> 	//    tsc_to_system_mul << (32 + tsc_shift) / 100 >= 2^32
> 	//    tsc_to_system_mul << 32 >= 100 * 2^32 << -tsc_shift
> 	//    tsc_to_system_mul >= 100 << -tsc_shift
> 
> 	if (shift < 0)
> 		rhs = 100 << -shift;
> 	else
> 		rhs = 100 >> shift;
> 
> 	if (tsc_to_system_mul >= rhs)
> 		return;
> 
>         scale = mul_u64_u32_div(1ULL << (32 + tsc_shift), tsc_to_system_mul, 100);
> 
>         // now expand the kvmclock formula:
>         //    nsec = ticks * tsc_to_system_mul << tsc_shift - (tsc_timestamp * tsc_to_system_mul << tsc_shift) + system_time 
> 	// divide by 100:
> 	//    nsec/100 = ticks * tsc_to_system_mul << tsc_shift /100 - (tsc_timestamp * tsc_to_system_mul << tsc_shift) /100 + system_time /100
> 	// replace tsc_to_system_mul << tsc_shift /100 by scale / 2^32:
>         //    nsec/100 = ticks * scale / 2^32 - (tsc_timestamp * scale / 2^32) + system_time / 100
> 	// comparing with the Hyper-V formula:
> 	//    offset = system_time / 100 - (tsc_timestamp * scale / 2^32)
> 
> 	offset = system_time;
> 	do_div(offset, 100);
> 
> 	timestamp_offset = tsc_timestamp;
> 	do_div32_shl32(timestamp_offset, scale);
>         offset = offset - timestamp_offset;
> 
> The remaining part is _further_ adjusting the offset to
> Would you like to finish implementing this and test it with the unit test?
> kvm/queue already has the patch to introduce do_div32_shl32.

At last I've got some time to tackle this.  I cooked up a patch to do
approximately this (fixing the math and the logic in a few places), and
adjusted the test to build, and now I'm about to submit both of them.

As to the drift problem, I reliably reproduce it, too, but it's a bug
somewhere in kvm_clock, and the new hyperv ref tsc page now demonstrates
it by virtue of relying on kvm_clock's values.  On my machine I observe
as much as +14 ppm and I'll try to chase it down, but this is
independent of the hyperv ref tsc page patchset.

Roman.

  parent reply	other threads:[~2016-04-21 17:17 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-01-28 14:04 [PATCH kvm-unit-tests] KVM: x86: add hyperv clock test case Paolo Bonzini
2016-01-28 14:04 ` Paolo Bonzini
2016-01-28 14:25 ` Andrey Smetanin
2016-01-28 14:50   ` Paolo Bonzini
2016-01-28 15:53     ` Paolo Bonzini
2016-01-28 18:45       ` Roman Kagan
2016-01-28 18:53     ` Roman Kagan
2016-01-28 21:28       ` Paolo Bonzini
2016-01-28 16:22 ` Roman Kagan
2016-02-03 16:37   ` Paolo Bonzini
2016-02-04  9:33     ` Roman Kagan
2016-02-04 10:13       ` Paolo Bonzini
2016-02-04 11:12         ` Roman Kagan
2016-04-21 17:01     ` Roman Kagan [this message]
2016-04-22 13:32       ` Roman Kagan
2016-04-22 18:08         ` Paolo Bonzini
2016-04-25  8:47           ` Roman Kagan
2016-04-26 10:34             ` Roman Kagan
2016-05-25 18:33               ` Roman Kagan
2016-05-26 14:47                 ` Roman Kagan
2016-05-29 22:34                 ` Marcelo Tosatti
2016-02-08 15:18 [PATCH 0/4] kvmclock: improve accuracy Paolo Bonzini
2016-02-08 15:18 ` [PATCH 1/4] KVM: x86: rename argument to kvm_set_tsc_khz Paolo Bonzini
2016-02-11 15:01   ` Marcelo Tosatti
2016-02-08 15:18 ` [PATCH 2/4] KVM: x86: rewrite handling of scaled TSC for kvmclock Paolo Bonzini
2016-02-11 15:23   ` Marcelo Tosatti
2016-02-08 15:18 ` [PATCH 3/4] KVM: x86: pass kvm_get_time_scale arguments in hertz Paolo Bonzini
2016-02-08 15:18 ` [PATCH 4/4] KVM: x86: track actual TSC frequency from the timekeeper struct Paolo Bonzini
2016-02-09 18:41   ` Owen Hofmann
2016-02-10 13:57     ` Paolo Bonzini
2016-02-16 13:48   ` Marcelo Tosatti
2016-02-16 14:25     ` Marcelo Tosatti
2016-02-16 16:59       ` Paolo Bonzini
2016-02-19 14:12         ` Marcelo Tosatti
2016-02-19 15:53           ` Paolo Bonzini
2016-02-16 14:00 ` [PATCH 0/4] kvmclock: improve accuracy Marcelo Tosatti
2016-08-31 14:13 [PATCH kvm-unit-tests] KVM: x86: add hyperv clock test case Roman Kagan
2016-09-01 16:07 ` Paolo Bonzini
2016-09-01 16:07 Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160421170157.GA16360@rkaganb.sw.ru \
    --to=rkagan@virtuozzo.com \
    --cc=asmetanin@virtuozzo.com \
    --cc=den@openvz.org \
    --cc=kvm@vger.kernel.org \
    --cc=mtosatti@redhat.com \
    --cc=pbonzini@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.