All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] KVM: x86: fix KVM_SET_CLOCK relative to setting correct clock value
@ 2017-05-02 21:36 Marcelo Tosatti
  2017-05-03 13:08 ` Paolo Bonzini
  0 siblings, 1 reply; 15+ messages in thread
From: Marcelo Tosatti @ 2017-05-02 21:36 UTC (permalink / raw)
  To: kvm-devel; +Cc: Paolo Bonzini, Radim Krčmář


In the masterclock enabled case, kvmclock_offset must be adjusted so
that user_ns.clock = master_kernel_ns + kvmclock_offset (that is, the
value set from KVM_SET_CLOCK is the one visible at system_timestamp).

This way the guest clock:

	1. Starts counting when KVM_SET_CLOCK executes.
	2. With the value provided by userspace.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

---
 arch/x86/kvm/x86.c |   23 ++++++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

Index: kvm/arch/x86/kvm/x86.c
===================================================================
--- kvm.orig/arch/x86/kvm/x86.c	2017-04-27 17:37:48.131348255 -0300
+++ kvm/arch/x86/kvm/x86.c	2017-04-27 17:56:58.397530444 -0300
@@ -4172,8 +4172,9 @@
 		break;
 	}
 	case KVM_SET_CLOCK: {
-		struct kvm_clock_data user_ns;
 		u64 now_ns;
+		struct kvm_clock_data user_ns;
+		struct kvm_arch *ka = &kvm->arch;
 
 		r = -EFAULT;
 		if (copy_from_user(&user_ns, argp, sizeof(user_ns)))
@@ -4184,9 +4185,25 @@
 			goto out;
 
 		r = 0;
-		now_ns = get_kvmclock_ns(kvm);
-		kvm->arch.kvmclock_offset += user_ns.clock - now_ns;
+
 		kvm_gen_update_masterclock(kvm);
+		if (ka->use_master_clock) {
+			/*
+			 * In the masterclock enabled case,
+			 * kvmclock_offset must be adjusted so that
+			 * user_ns.clock = master_kernel_ns + kvmclock_offset
+			 * (that is, the value set from KVM_SET_CLOCK is the
+			 * one visible at system_timestamp).
+			 */
+			kvm->arch.kvmclock_offset = user_ns.clock -
+							ka->master_kernel_ns;
+
+			kvm_for_each_vcpu(i, vcpu, kvm)
+				kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
+		} else {
+			now_ns = get_kvmclock_ns(kvm);
+			kvm->arch.kvmclock_offset += user_ns.clock - now_ns;
+		}
 		break;
 	}
 	case KVM_GET_CLOCK: {

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] KVM: x86: fix KVM_SET_CLOCK relative to setting correct clock value
  2017-05-02 21:36 [PATCH] KVM: x86: fix KVM_SET_CLOCK relative to setting correct clock value Marcelo Tosatti
@ 2017-05-03 13:08 ` Paolo Bonzini
  2017-05-03 13:40   ` Marcelo Tosatti
  2017-05-03 13:43   ` [PATCH v2] " Marcelo Tosatti
  0 siblings, 2 replies; 15+ messages in thread
From: Paolo Bonzini @ 2017-05-03 13:08 UTC (permalink / raw)
  To: Marcelo Tosatti, kvm-devel; +Cc: Radim Krčmář



On 02/05/2017 23:36, Marcelo Tosatti wrote:
> In the masterclock enabled case, kvmclock_offset must be adjusted so
> that user_ns.clock = master_kernel_ns + kvmclock_offset (that is, the
> value set from KVM_SET_CLOCK is the one visible at system_timestamp).
> 
> This way the guest clock:
> 
> 	1. Starts counting when KVM_SET_CLOCK executes.
> 	2. With the value provided by userspace.

So this fixes rounding errors?

> -		now_ns = get_kvmclock_ns(kvm);
> -		kvm->arch.kvmclock_offset += user_ns.clock - now_ns;
> +
>  		kvm_gen_update_masterclock(kvm);
> +		if (ka->use_master_clock) {
> +			/*
> +			 * In the masterclock enabled case,
> +			 * kvmclock_offset must be adjusted so that
> +			 * user_ns.clock = master_kernel_ns + kvmclock_offset
> +			 * (that is, the value set from KVM_SET_CLOCK is the
> +			 * one visible at system_timestamp).
> +			 */
> +			kvm->arch.kvmclock_offset = user_ns.clock -
> +							ka->master_kernel_ns;
> +


This needs to hold ka->pvclock_gtod_sync_lock, I think.

> +			kvm_for_each_vcpu(i, vcpu, kvm)
> +				kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);

kvm_gen_update_masterclock already does that, why did you move that to
before the assignment?

Paolo

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] KVM: x86: fix KVM_SET_CLOCK relative to setting correct clock value
  2017-05-03 13:08 ` Paolo Bonzini
@ 2017-05-03 13:40   ` Marcelo Tosatti
  2017-05-03 13:55     ` Paolo Bonzini
  2017-05-03 13:43   ` [PATCH v2] " Marcelo Tosatti
  1 sibling, 1 reply; 15+ messages in thread
From: Marcelo Tosatti @ 2017-05-03 13:40 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm-devel, Radim Krčmář

On Wed, May 03, 2017 at 03:08:46PM +0200, Paolo Bonzini wrote:
> 
> 
> On 02/05/2017 23:36, Marcelo Tosatti wrote:
> > In the masterclock enabled case, kvmclock_offset must be adjusted so
> > that user_ns.clock = master_kernel_ns + kvmclock_offset (that is, the
> > value set from KVM_SET_CLOCK is the one visible at system_timestamp).
> > 
> > This way the guest clock:
> > 
> > 	1. Starts counting when KVM_SET_CLOCK executes.
> > 	2. With the value provided by userspace.
> 
> So this fixes rounding errors?

No. It just does the correct 

"user_ns.clock = master_kernel_ns + kvmclock_offset = return value of
get_kvmclock_ns(kvm) at moment of KVM_SET_CLOCK" 

(that is, when guest_rdtsc() == tsc_timestamp, so guest_rdtsc() - tsc_timestamp = 0, 
you want get_kvmclock_ns() to return user_ns.clock).

Its not a rounding error.

> > -		now_ns = get_kvmclock_ns(kvm);
> > -		kvm->arch.kvmclock_offset += user_ns.clock - now_ns;
> > +
> >  		kvm_gen_update_masterclock(kvm);
> > +		if (ka->use_master_clock) {
> > +			/*
> > +			 * In the masterclock enabled case,
> > +			 * kvmclock_offset must be adjusted so that
> > +			 * user_ns.clock = master_kernel_ns + kvmclock_offset
> > +			 * (that is, the value set from KVM_SET_CLOCK is the
> > +			 * one visible at system_timestamp).
> > +			 */
> > +			kvm->arch.kvmclock_offset = user_ns.clock -
> > +							ka->master_kernel_ns;
> > +
> 
> 
> This needs to hold ka->pvclock_gtod_sync_lock, I think.
> 
> > +			kvm_for_each_vcpu(i, vcpu, kvm)
> > +				kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
> 
> kvm_gen_update_masterclock already does that, why did you move that to
> before the assignment?

Its after the assignment of kvmclock_offset because
KVM_REQ_CLOCK_UPDATES processing by vcpus make use of kvmclock_offset.

So if you change that value, you have to request the vcpus to
recalculate their kvmclock areas using the new kvmclock_offset.

Resending v2, thanks.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v2] KVM: x86: fix KVM_SET_CLOCK relative to setting correct clock value
  2017-05-03 13:08 ` Paolo Bonzini
  2017-05-03 13:40   ` Marcelo Tosatti
@ 2017-05-03 13:43   ` Marcelo Tosatti
  2017-05-10 18:04     ` Radim Krčmář
  1 sibling, 1 reply; 15+ messages in thread
From: Marcelo Tosatti @ 2017-05-03 13:43 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm-devel, Radim Krčmář


In the masterclock enabled case, kvmclock_offset must be adjusted so
that user_ns.clock = master_kernel_ns + kvmclock_offset (that is, the
value set from KVM_SET_CLOCK is the one visible at system_timestamp).

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

---
 arch/x86/kvm/x86.c |   29 ++++++++++++++++++++++++++---
 1 file changed, 26 insertions(+), 3 deletions(-)

v2: grab pvclock_gtod_sync_lock spinlock (Paolo)

Index: kvm/arch/x86/kvm/x86.c
===================================================================
--- kvm.orig/arch/x86/kvm/x86.c	2017-04-27 17:37:48.131348255 -0300
+++ kvm/arch/x86/kvm/x86.c	2017-05-03 10:35:19.298447766 -0300
@@ -4172,8 +4172,9 @@
 		break;
 	}
 	case KVM_SET_CLOCK: {
-		struct kvm_clock_data user_ns;
 		u64 now_ns;
+		struct kvm_clock_data user_ns;
+		struct kvm_arch *ka = &kvm->arch;
 
 		r = -EFAULT;
 		if (copy_from_user(&user_ns, argp, sizeof(user_ns)))
@@ -4184,9 +4185,31 @@
 			goto out;
 
 		r = 0;
-		now_ns = get_kvmclock_ns(kvm);
-		kvm->arch.kvmclock_offset += user_ns.clock - now_ns;
+
 		kvm_gen_update_masterclock(kvm);
+		spin_lock(&ka->pvclock_gtod_sync_lock);
+		if (ka->use_master_clock) {
+			int i;
+			struct kvm_vcpu *vcpu;
+
+			/*
+			 * In the masterclock enabled case,
+			 * kvmclock_offset must be adjusted so that
+			 * user_ns.clock = master_kernel_ns + kvmclock_offset
+			 * (that is, the value set from KVM_SET_CLOCK is the
+			 * one visible at system_timestamp).
+			 */
+			kvm->arch.kvmclock_offset = user_ns.clock -
+							ka->master_kernel_ns;
+			spin_unlock(&ka->pvclock_gtod_sync_lock);
+
+			kvm_for_each_vcpu(i, vcpu, kvm)
+				kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
+		} else {
+			spin_unlock(&ka->pvclock_gtod_sync_lock);
+			now_ns = get_kvmclock_ns(kvm);
+			kvm->arch.kvmclock_offset += user_ns.clock - now_ns;
+		}
 		break;
 	}
 	case KVM_GET_CLOCK: {

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] KVM: x86: fix KVM_SET_CLOCK relative to setting correct clock value
  2017-05-03 13:40   ` Marcelo Tosatti
@ 2017-05-03 13:55     ` Paolo Bonzini
  2017-05-03 18:24       ` Marcelo Tosatti
  0 siblings, 1 reply; 15+ messages in thread
From: Paolo Bonzini @ 2017-05-03 13:55 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: kvm-devel, Radim Krčmář



On 03/05/2017 15:40, Marcelo Tosatti wrote:
>>> In the masterclock enabled case, kvmclock_offset must be adjusted so
>>> that user_ns.clock = master_kernel_ns + kvmclock_offset (that is, the
>>> value set from KVM_SET_CLOCK is the one visible at system_timestamp).
>>>
>>> This way the guest clock:
>>>
>>> 	1. Starts counting when KVM_SET_CLOCK executes.
>>> 	2. With the value provided by userspace.
>>
>> So this fixes rounding errors?
> 
> No. It just does the correct 
> 
> "user_ns.clock = master_kernel_ns + kvmclock_offset = return value of
> get_kvmclock_ns(kvm) at moment of KVM_SET_CLOCK" 
> 
> (that is, when guest_rdtsc() == tsc_timestamp, so guest_rdtsc() - tsc_timestamp = 0, 
> you want get_kvmclock_ns() to return user_ns.clock).

Yep, and it currently doesn't match because the current code introduces
rounding errors?

>>> +			kvm_for_each_vcpu(i, vcpu, kvm)
>>> +				kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
>> kvm_gen_update_masterclock already does that, why did you move that to
>> before the assignment?
> Its after the assignment of kvmclock_offset because
> KVM_REQ_CLOCK_UPDATES processing by vcpus make use of kvmclock_offset.

Sure, but why did you move kvm_gen_update_masterclock before?

If you left kvm_gen_update_masterclock after the update of
kvm->arch.kvmclock_offset, the duplicate KVM_REQ_CLOCK_UPDATE would not
be necessary.

Paolo

> So if you change that value, you have to request the vcpus to
> recalculate their kvmclock areas using the new kvmclock_offset.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] KVM: x86: fix KVM_SET_CLOCK relative to setting correct clock value
  2017-05-03 13:55     ` Paolo Bonzini
@ 2017-05-03 18:24       ` Marcelo Tosatti
  0 siblings, 0 replies; 15+ messages in thread
From: Marcelo Tosatti @ 2017-05-03 18:24 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm-devel, Radim Krčmář

On Wed, May 03, 2017 at 03:55:36PM +0200, Paolo Bonzini wrote:
> 
> 
> On 03/05/2017 15:40, Marcelo Tosatti wrote:
> >>> In the masterclock enabled case, kvmclock_offset must be adjusted so
> >>> that user_ns.clock = master_kernel_ns + kvmclock_offset (that is, the
> >>> value set from KVM_SET_CLOCK is the one visible at system_timestamp).
> >>>
> >>> This way the guest clock:
> >>>
> >>> 	1. Starts counting when KVM_SET_CLOCK executes.
> >>> 	2. With the value provided by userspace.
> >>
> >> So this fixes rounding errors?
> > 
> > No. It just does the correct 
> > 
> > "user_ns.clock = master_kernel_ns + kvmclock_offset = return value of
> > get_kvmclock_ns(kvm) at moment of KVM_SET_CLOCK" 
> > 
> > (that is, when guest_rdtsc() == tsc_timestamp, so guest_rdtsc() - tsc_timestamp = 0, 
> > you want get_kvmclock_ns() to return user_ns.clock).
> 
> Yep, and it currently doesn't match because the current code introduces
> rounding errors?

No. Consider the current code. When get_kvmclock_ns() is called,
ka->masterclock = false.

kvm_gen_update_masterclock() sets it to true (because at the time
KVM_SET_CLOCK is called, the conditions necessary to do so are
fulfilled).

> >>> +			kvm_for_each_vcpu(i, vcpu, kvm)
> >>> +				kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
> >> kvm_gen_update_masterclock already does that, why did you move that to
> >> before the assignment?
> > Its after the assignment of kvmclock_offset because
> > KVM_REQ_CLOCK_UPDATES processing by vcpus make use of kvmclock_offset.
> 
> Sure, but why did you move kvm_gen_update_masterclock before?

Explained above.

> If you left kvm_gen_update_masterclock after the update of
> kvm->arch.kvmclock_offset, the duplicate KVM_REQ_CLOCK_UPDATE would not
> be necessary.

No can do.

Feel free to reorganize/rewrite the patch if you feel like it, as long
as the same effects are achieved.

Thanks.

> 
> Paolo
> 
> > So if you change that value, you have to request the vcpus to
> > recalculate their kvmclock areas using the new kvmclock_offset.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] KVM: x86: fix KVM_SET_CLOCK relative to setting correct clock value
  2017-05-03 13:43   ` [PATCH v2] " Marcelo Tosatti
@ 2017-05-10 18:04     ` Radim Krčmář
  2017-05-11 15:39       ` Marcelo Tosatti
  0 siblings, 1 reply; 15+ messages in thread
From: Radim Krčmář @ 2017-05-10 18:04 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Paolo Bonzini, kvm-devel

2017-05-03 10:43-0300, Marcelo Tosatti:
> In the masterclock enabled case, kvmclock_offset must be adjusted so
> that user_ns.clock = master_kernel_ns + kvmclock_offset (that is, the
> value set from KVM_SET_CLOCK is the one visible at system_timestamp).

IIUC, we want to achieve

  user_ns.clock == get_kvmclock_ns(kvm)

and the important fix for kvm master clock is the move of
kvm_gen_update_masterclock() before we read the time.

The rest is just a minor optimization that also ignores time since
master_kernel_ns() and therefore pins user_ns.clock to a slightly
earlier time.

But all attention was given to the "minor optimization" -- have I missed
something about the direct use of ka->master_kernel_ns?

(If not, I'd rather have just the one-line move.)

---
Detailed reasoning.

kvm_gen_update_masterclock() shifts the master clock (changes its
percieved frequency, because it is reset from kernel boot clock again,
even though they are diverging), so we must not compute the
kvmclock_offset with an old master clock and apply it to a shifted one.

Using offset from ka->master_kernel_ns or get_kvmclock_ns()
("ka->master_kernel_ns + small delta") doesn't make much difference
then, because the user_ns is already an indeterminable point in the
past.  (We just assume that user_ns.clock is now, whenever that is.)

---
And a possible improvement.

Using kvm_gen_update_masterclock() seems superfluous.  There are three
major cases depending on state of the master clock:

 enabled)
   We can just match the kvmclock_offset so the following holds
     user_ns.clock == get_kvmclock_ns(kvm)
   No need to update the master clock.

 disabled)
   The kvmclock_offset is set from the kernel boot clock and that is
   already correct.

 disabled, but call to kvm_gen_update_masterclock() would enable it)
   The master clock will be updates with the kernel boot clock when a
   VCPU runs.  This means that master clock and kernel boot clock will
   begin diverging at a later point, but the initial offset is the same.
   Userspace can only tell the difference by calling KVM_GET_CLOCK
   afterwards and seeing the stable bit unset.
   Guest is mostly unaffected -- the inaccuracy from calling
   KVM_SET_CLOCK is much bigger than accumulation of a slightly
   different frequency will in few jiffies.

Do you see a reason to use kvm_gen_update_masterclock()?

I think that the code could be just:

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 464da936c53d..f024216a858d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4177,7 +4177,7 @@ long kvm_arch_vm_ioctl(struct file *filp,
 		r = 0;
 		now_ns = get_kvmclock_ns(kvm);
 		kvm->arch.kvmclock_offset += user_ns.clock - now_ns;
-		kvm_gen_update_masterclock(kvm);
+		kvm_make_all_cpus_request(kvm, KVM_REQ_CLOCK_UPDATE);
 		break;
 	}
 	case KVM_GET_CLOCK: {

Thanks.

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] KVM: x86: fix KVM_SET_CLOCK relative to setting correct clock value
  2017-05-10 18:04     ` Radim Krčmář
@ 2017-05-11 15:39       ` Marcelo Tosatti
  2017-05-12 14:13         ` Radim Krčmář
  0 siblings, 1 reply; 15+ messages in thread
From: Marcelo Tosatti @ 2017-05-11 15:39 UTC (permalink / raw)
  To: Radim Krčmář; +Cc: Paolo Bonzini, kvm-devel

On Wed, May 10, 2017 at 08:04:31PM +0200, Radim Krčmář wrote:
> 2017-05-03 10:43-0300, Marcelo Tosatti:
> > In the masterclock enabled case, kvmclock_offset must be adjusted so
> > that user_ns.clock = master_kernel_ns + kvmclock_offset (that is, the
> > value set from KVM_SET_CLOCK is the one visible at system_timestamp).
> 
> IIUC, we want to achieve
> 
>   user_ns.clock == get_kvmclock_ns(kvm)

Yes, or equivalently we want 

user_ns.clock = master_kernel_ns + kvmclock_offset, 

when 

guest_rdtsc() == ka->master_cycle_now.

> and the important fix for kvm master clock is the move of
> kvm_gen_update_masterclock() before we read the time.

Yes.

> The rest is just a minor optimization that also ignores time since
> master_kernel_ns() and therefore pins user_ns.clock to a slightly
> earlier time.
> 
> But all attention was given to the "minor optimization" -- have I missed
> something about the direct use of ka->master_kernel_ns?

I haven't attempted to optimize anything. Not sure what you mean.

> 
> (If not, I'd rather have just the one-line move.)
> 
> ---
> Detailed reasoning.
> 
> kvm_gen_update_masterclock() shifts the master clock (changes its
> percieved frequency, because it is reset from kernel boot clock again,
> even though they are diverging), so we must not compute the
> kvmclock_offset with an old master clock and apply it to a shifted one.
> 
> Using offset from ka->master_kernel_ns or get_kvmclock_ns()
> ("ka->master_kernel_ns + small delta") doesn't make much difference
> then, because the user_ns is already an indeterminable point in the
> past.  (We just assume that user_ns.clock is now, whenever that is.)
> 
> ---
> And a possible improvement.
> 
> Using kvm_gen_update_masterclock() seems superfluous.  There are three
> major cases depending on state of the master clock:
> 
>  enabled)
>    We can just match the kvmclock_offset so the following holds
>      user_ns.clock == get_kvmclock_ns(kvm)
>    No need to update the master clock.
> 
>  disabled)
>    The kvmclock_offset is set from the kernel boot clock and that is
>    already correct.
> 
>  disabled, but call to kvm_gen_update_masterclock() would enable it)
>    The master clock will be updates with the kernel boot clock when a
>    VCPU runs.  This means that master clock and kernel boot clock will
>    begin diverging at a later point, but the initial offset is the same.
>    Userspace can only tell the difference by calling KVM_GET_CLOCK
>    afterwards and seeing the stable bit unset.
>    Guest is mostly unaffected -- the inaccuracy from calling
>    KVM_SET_CLOCK is much bigger than accumulation of a slightly
>    different frequency will in few jiffies.
> 
> Do you see a reason to use kvm_gen_update_masterclock()?
> 
> I think that the code could be just:
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 464da936c53d..f024216a858d 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4177,7 +4177,7 @@ long kvm_arch_vm_ioctl(struct file *filp,
>  		r = 0;
>  		now_ns = get_kvmclock_ns(kvm);
>  		kvm->arch.kvmclock_offset += user_ns.clock - now_ns;
> -		kvm_gen_update_masterclock(kvm);
> +		kvm_make_all_cpus_request(kvm, KVM_REQ_CLOCK_UPDATE);

No, this is wrong. kvm_gen_update_masterclock() must happen before
the assignment of kvm->arch.kvmclock_offset and get_kvmclock_ns().

(so that kvm_gen_update_masterclock sets ka->use_master_clock to true 
and get_kvmclock_ns() enters this case:

        hv_clock.tsc_timestamp = ka->master_cycle_now;
        hv_clock.system_time = ka->master_kernel_ns +
ka->kvmclock_offset;
        spin_unlock(&ka->pvclock_gtod_sync_lock);

        kvm_get_time_scale(NSEC_PER_SEC, __this_cpu_read(cpu_tsc_khz) *
1000LL,
                           &hv_clock.tsc_shift,
                           &hv_clock.tsc_to_system_mul);
        return __pvclock_read_cycles(&hv_clock, rdtsc());

Makes sense?

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] KVM: x86: fix KVM_SET_CLOCK relative to setting correct clock value
  2017-05-11 15:39       ` Marcelo Tosatti
@ 2017-05-12 14:13         ` Radim Krčmář
  2017-05-12 14:36           ` Paolo Bonzini
  2017-05-12 15:31           ` Marcelo Tosatti
  0 siblings, 2 replies; 15+ messages in thread
From: Radim Krčmář @ 2017-05-12 14:13 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Paolo Bonzini, kvm-devel

2017-05-11 12:39-0300, Marcelo Tosatti:
> On Wed, May 10, 2017 at 08:04:31PM +0200, Radim Krčmář wrote:
> > 2017-05-03 10:43-0300, Marcelo Tosatti:
> > and the important fix for kvm master clock is the move of
> > kvm_gen_update_masterclock() before we read the time.

> > The rest is just a minor optimization that also ignores time since
> > master_kernel_ns() and therefore pins user_ns.clock to a slightly
> > earlier time.
> > 
> > But all attention was given to the "minor optimization" -- have I missed
> > something about the direct use of ka->master_kernel_ns?
> 
> I haven't attempted to optimize anything. Not sure what you mean.

I mean, why doesn't the patch look like this?

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 464da936c53d..8db1d09e59d7 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4175,9 +4175,10 @@ long kvm_arch_vm_ioctl(struct file *filp,
 			goto out;
 
 		r = 0;
+		kvm_gen_update_masterclock(kvm);
 		now_ns = get_kvmclock_ns(kvm);
 		kvm->arch.kvmclock_offset += user_ns.clock - now_ns;
-		kvm_gen_update_masterclock(kvm);
+		kvm_make_all_cpus_request(kvm, KVM_REQ_CLOCK_UPDATE);
 		break;
 	}
 	case KVM_GET_CLOCK: {

> >  disabled, but call to kvm_gen_update_masterclock() would enable it)
> >    The master clock will be updates with the kernel boot clock when a
> >    VCPU runs.  This means that master clock and kernel boot clock will
> >    begin diverging at a later point, but the initial offset is the same.
> >    Userspace can only tell the difference by calling KVM_GET_CLOCK
> >    afterwards and seeing the stable bit unset.
> >    Guest is mostly unaffected -- the inaccuracy from calling
> >    KVM_SET_CLOCK is much bigger than accumulation of a slightly
> >    different frequency will in few jiffies.
> > 
> > Do you see a reason to use kvm_gen_update_masterclock()?
> > 
> > I think that the code could be just:
> > 
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 464da936c53d..f024216a858d 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -4177,7 +4177,7 @@ long kvm_arch_vm_ioctl(struct file *filp,
> >  		r = 0;
> >  		now_ns = get_kvmclock_ns(kvm);
> >  		kvm->arch.kvmclock_offset += user_ns.clock - now_ns;
> > -		kvm_gen_update_masterclock(kvm);
> > +		kvm_make_all_cpus_request(kvm, KVM_REQ_CLOCK_UPDATE);
> 
> No, this is wrong. kvm_gen_update_masterclock() must happen before
> the assignment of kvm->arch.kvmclock_offset and get_kvmclock_ns().
> 
> (so that kvm_gen_update_masterclock sets ka->use_master_clock to true 
> and get_kvmclock_ns() enters this case:
> 
>         hv_clock.tsc_timestamp = ka->master_cycle_now;
>         hv_clock.system_time = ka->master_kernel_ns +
> ka->kvmclock_offset;
>         spin_unlock(&ka->pvclock_gtod_sync_lock);
> 
>         kvm_get_time_scale(NSEC_PER_SEC, __this_cpu_read(cpu_tsc_khz) *
> 1000LL,
>                            &hv_clock.tsc_shift,
>                            &hv_clock.tsc_to_system_mul);
>         return __pvclock_read_cycles(&hv_clock, rdtsc());
> 
> Makes sense?

That doesn't explain the need for kvm_gen_update_masterclock().
kvm_gen_update_masterclock() moves the master clock base to the current
kernel boot clock; the non-master clock case uses the same kernel boot
clock to return the current kvmclock -- the computed offset is the same
in both cases.

Updating the master clock simplifies the case by resetting possible
divergence (due to different frequencies) between master clock and
kernel boot clock, but I'd still like to know (and document) if that is
the reason.

In what test does updating the master clock make a difference?

Thanks.

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] KVM: x86: fix KVM_SET_CLOCK relative to setting correct clock value
  2017-05-12 14:13         ` Radim Krčmář
@ 2017-05-12 14:36           ` Paolo Bonzini
  2017-05-12 15:31           ` Marcelo Tosatti
  1 sibling, 0 replies; 15+ messages in thread
From: Paolo Bonzini @ 2017-05-12 14:36 UTC (permalink / raw)
  To: Radim Krčmář, Marcelo Tosatti; +Cc: kvm-devel



On 12/05/2017 16:13, Radim Krčmář wrote:
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 464da936c53d..8db1d09e59d7 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4175,9 +4175,10 @@ long kvm_arch_vm_ioctl(struct file *filp,
>  			goto out;
>  
>  		r = 0;
> +		kvm_gen_update_masterclock(kvm);
>  		now_ns = get_kvmclock_ns(kvm);
>  		kvm->arch.kvmclock_offset += user_ns.clock - now_ns;
> -		kvm_gen_update_masterclock(kvm);
> +		kvm_make_all_cpus_request(kvm, KVM_REQ_CLOCK_UPDATE);
>  		break;
>  	}
>  	case KVM_GET_CLOCK: {

And then I think the first kvm_gen_update_masterclock should be modified
to skip sending the KVM_REQ_CLOCK_UPDATE request.

Paolo

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] KVM: x86: fix KVM_SET_CLOCK relative to setting correct clock value
  2017-05-12 14:13         ` Radim Krčmář
  2017-05-12 14:36           ` Paolo Bonzini
@ 2017-05-12 15:31           ` Marcelo Tosatti
  2017-05-12 17:37             ` Radim Krčmář
  1 sibling, 1 reply; 15+ messages in thread
From: Marcelo Tosatti @ 2017-05-12 15:31 UTC (permalink / raw)
  To: Radim Krčmář; +Cc: Paolo Bonzini, kvm-devel

On Fri, May 12, 2017 at 04:13:23PM +0200, Radim Krčmář wrote:
> 2017-05-11 12:39-0300, Marcelo Tosatti:
> > On Wed, May 10, 2017 at 08:04:31PM +0200, Radim Krčmář wrote:
> > > 2017-05-03 10:43-0300, Marcelo Tosatti:
> > > and the important fix for kvm master clock is the move of
> > > kvm_gen_update_masterclock() before we read the time.
> 
> > > The rest is just a minor optimization that also ignores time since
> > > master_kernel_ns() and therefore pins user_ns.clock to a slightly
> > > earlier time.
> > > 
> > > But all attention was given to the "minor optimization" -- have I missed
> > > something about the direct use of ka->master_kernel_ns?
> > 
> > I haven't attempted to optimize anything. Not sure what you mean.
> 
> I mean, why doesn't the patch look like this?
d 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 464da936c53d..8db1d09e59d7 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4175,9 +4175,10 @@ long kvm_arch_vm_ioctl(struct file *filp,
>  			goto out;
>  
>  		r = 0;
> +		kvm_gen_update_masterclock(kvm);
>  		now_ns = get_kvmclock_ns(kvm);
>  		kvm->arch.kvmclock_offset += user_ns.clock - now_ns;
> -		kvm_gen_update_masterclock(kvm);
> +		kvm_make_all_cpus_request(kvm, KVM_REQ_CLOCK_UPDATE);
>  		break;

        now_ns = ka->master_kernel_ns + kvmclock_offset_prev + (grdtsc() - ka->master_cycle_now)
        kvmclock_offset += user_ns.clock - (ka->master_kernel_ns + kvmclock_offset_prev + (grdtsc() - ka->master_cycle_now)
        kvmclock_offset = kvmclock_offset_prev + user_ns.clock - (ka->master_kernel_ns + kvmclock_offset_prev + (grdtsc() - ka->master_cycle_now)

In case of VM was just initialized before migration, kvmclock_offset_prev is -ktime_get_boot_ns()

        kvmclock_offset = -ktime_get_boot_ns() + user_ns.clock - (ka->master_kernel_ns -ktime_get_boot_ns() + grdtsc() - ka->master_cycle_now))

But master_kernel_ns = ktime_get_boot_ns()     +  delta-between-vm-init-and-KVM_SET_CLOCK (AKA delta)
                    (the same one from VM init) 

        kvmclock_offset = -ktime_get_boot_ns() + user_ns.clock - (ktime_get_boot_ns() + delta + -ktime_get_boot_ns() + grdtsc() - ka->master_cycle_now))

        kvmclock_offset = -ktime_get_boot_ns() + user_ns.clock - delta - grdtsc() + ka->master_cycle_now

But we don't want grdtsc() - ka->master_cycle_now in there.

Note: grdtsc() == guest read tsc.

Now with

+			kvm->arch.kvmclock_offset = user_ns.clock -
+							ka->master_kernel_ns;

What happens is that guest clock starts counting, via kernel timekeeper,
at the moment kvm_get_time_and_clockread() runs. If you add grdtsc() -
ka->master_cycle_now in there, you are mindfully counting clock twice
(first: kernel timekeeper, second: the TSC between the (grdtsc() -
ka->master_cycle_now) in question.

+			kvm->arch.kvmclock_offset =  -ktime_get_boot_ns() +user_ns.clock -delta

Note that (grdtsc() - ka->master_cycle_now) is susceptible to scheduling
etc.

Makes sense?

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] KVM: x86: fix KVM_SET_CLOCK relative to setting correct clock value
  2017-05-12 15:31           ` Marcelo Tosatti
@ 2017-05-12 17:37             ` Radim Krčmář
  2017-05-13  3:46               ` Marcelo Tosatti
  0 siblings, 1 reply; 15+ messages in thread
From: Radim Krčmář @ 2017-05-12 17:37 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Paolo Bonzini, kvm-devel

2017-05-12 12:31-0300, Marcelo Tosatti:
> On Fri, May 12, 2017 at 04:13:23PM +0200, Radim Krčmář wrote:
> > 2017-05-11 12:39-0300, Marcelo Tosatti:
> > > On Wed, May 10, 2017 at 08:04:31PM +0200, Radim Krčmář wrote:
> > > > 2017-05-03 10:43-0300, Marcelo Tosatti:
> > > > and the important fix for kvm master clock is the move of
> > > > kvm_gen_update_masterclock() before we read the time.
> > 
> > > > The rest is just a minor optimization that also ignores time since
> > > > master_kernel_ns() and therefore pins user_ns.clock to a slightly
> > > > earlier time.
> > > > 
> > > > But all attention was given to the "minor optimization" -- have I missed
> > > > something about the direct use of ka->master_kernel_ns?
> > > 
> > > I haven't attempted to optimize anything. Not sure what you mean.
> > 
> > I mean, why doesn't the patch look like this?
> d 
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 464da936c53d..8db1d09e59d7 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -4175,9 +4175,10 @@ long kvm_arch_vm_ioctl(struct file *filp,
> >  			goto out;
> >  
> >  		r = 0;
> > +		kvm_gen_update_masterclock(kvm);
> >  		now_ns = get_kvmclock_ns(kvm);
> >  		kvm->arch.kvmclock_offset += user_ns.clock - now_ns;
> > -		kvm_gen_update_masterclock(kvm);
> > +		kvm_make_all_cpus_request(kvm, KVM_REQ_CLOCK_UPDATE);
> >  		break;
> 
>         now_ns = ka->master_kernel_ns + kvmclock_offset_prev + (grdtsc() - ka->master_cycle_now)
>         kvmclock_offset += user_ns.clock - (ka->master_kernel_ns + kvmclock_offset_prev + (grdtsc() - ka->master_cycle_now)
>         kvmclock_offset = kvmclock_offset_prev + user_ns.clock - (ka->master_kernel_ns + kvmclock_offset_prev + (grdtsc() - ka->master_cycle_now)
> 
> In case of VM was just initialized before migration, kvmclock_offset_prev is -ktime_get_boot_ns()
> 
>         kvmclock_offset = -ktime_get_boot_ns() + user_ns.clock - (ka->master_kernel_ns -ktime_get_boot_ns() + grdtsc() - ka->master_cycle_now))
> 
> But master_kernel_ns = ktime_get_boot_ns()     +  delta-between-vm-init-and-KVM_SET_CLOCK (AKA delta)
>                     (the same one from VM init) 
> 
>         kvmclock_offset = -ktime_get_boot_ns() + user_ns.clock - (ktime_get_boot_ns() + delta + -ktime_get_boot_ns() + grdtsc() - ka->master_cycle_now))
> 
>         kvmclock_offset = -ktime_get_boot_ns() + user_ns.clock - delta - grdtsc() + ka->master_cycle_now
> 
> But we don't want grdtsc() - ka->master_cycle_now in there.
> 
> Note: grdtsc() == guest read tsc.
> 
> Now with
> 
> +			kvm->arch.kvmclock_offset = user_ns.clock -
> +							ka->master_kernel_ns;
> 
> What happens is that guest clock starts counting, via kernel timekeeper,
> at the moment kvm_get_time_and_clockread() runs. If you add grdtsc() -
> ka->master_cycle_now in there, you are mindfully counting clock twice
> (first: kernel timekeeper, second: the TSC between the (grdtsc() -
> ka->master_cycle_now) in question.
> 
> +			kvm->arch.kvmclock_offset =  -ktime_get_boot_ns() +user_ns.clock -delta
> 
> Note that (grdtsc() - ka->master_cycle_now) is susceptible to scheduling
> etc.
> 
> Makes sense?

Yes.  The simpler code starts the kvmclock a bit later, but both are
correct -- anything within KVM_SET_CLOCK runtime is.

If we care about accuracy, then we should let userspace provide a
(kernel timestamp, kvm timestamp) pair, so the value of kvmclock can
really be controlled.

Adding ugly optimizations to work around shortcomings of the API is
going the wrong way ...

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] KVM: x86: fix KVM_SET_CLOCK relative to setting correct clock value
  2017-05-12 17:37             ` Radim Krčmář
@ 2017-05-13  3:46               ` Marcelo Tosatti
  2017-05-15 16:19                 ` Radim Krčmář
  0 siblings, 1 reply; 15+ messages in thread
From: Marcelo Tosatti @ 2017-05-13  3:46 UTC (permalink / raw)
  To: Radim Krčmář; +Cc: Paolo Bonzini, kvm-devel

On Fri, May 12, 2017 at 07:37:12PM +0200, Radim Krčmář wrote:
> 2017-05-12 12:31-0300, Marcelo Tosatti:
> > On Fri, May 12, 2017 at 04:13:23PM +0200, Radim Krčmář wrote:
> > > 2017-05-11 12:39-0300, Marcelo Tosatti:
> > > > On Wed, May 10, 2017 at 08:04:31PM +0200, Radim Krčmář wrote:
> > > > > 2017-05-03 10:43-0300, Marcelo Tosatti:
> > > > > and the important fix for kvm master clock is the move of
> > > > > kvm_gen_update_masterclock() before we read the time.
> > > 
> > > > > The rest is just a minor optimization that also ignores time since
> > > > > master_kernel_ns() and therefore pins user_ns.clock to a slightly
> > > > > earlier time.
> > > > > 
> > > > > But all attention was given to the "minor optimization" -- have I missed
> > > > > something about the direct use of ka->master_kernel_ns?
> > > > 
> > > > I haven't attempted to optimize anything. Not sure what you mean.
> > > 
> > > I mean, why doesn't the patch look like this?
> > d 
> > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > index 464da936c53d..8db1d09e59d7 100644
> > > --- a/arch/x86/kvm/x86.c
> > > +++ b/arch/x86/kvm/x86.c
> > > @@ -4175,9 +4175,10 @@ long kvm_arch_vm_ioctl(struct file *filp,
> > >  			goto out;
> > >  
> > >  		r = 0;
> > > +		kvm_gen_update_masterclock(kvm);
> > >  		now_ns = get_kvmclock_ns(kvm);
> > >  		kvm->arch.kvmclock_offset += user_ns.clock - now_ns;
> > > -		kvm_gen_update_masterclock(kvm);
> > > +		kvm_make_all_cpus_request(kvm, KVM_REQ_CLOCK_UPDATE);
> > >  		break;
> > 
> >         now_ns = ka->master_kernel_ns + kvmclock_offset_prev + (grdtsc() - ka->master_cycle_now)
> >         kvmclock_offset += user_ns.clock - (ka->master_kernel_ns + kvmclock_offset_prev + (grdtsc() - ka->master_cycle_now)
> >         kvmclock_offset = kvmclock_offset_prev + user_ns.clock - (ka->master_kernel_ns + kvmclock_offset_prev + (grdtsc() - ka->master_cycle_now)
> > 
> > In case of VM was just initialized before migration, kvmclock_offset_prev is -ktime_get_boot_ns()
> > 
> >         kvmclock_offset = -ktime_get_boot_ns() + user_ns.clock - (ka->master_kernel_ns -ktime_get_boot_ns() + grdtsc() - ka->master_cycle_now))
> > 
> > But master_kernel_ns = ktime_get_boot_ns()     +  delta-between-vm-init-and-KVM_SET_CLOCK (AKA delta)
> >                     (the same one from VM init) 
> > 
> >         kvmclock_offset = -ktime_get_boot_ns() + user_ns.clock - (ktime_get_boot_ns() + delta + -ktime_get_boot_ns() + grdtsc() - ka->master_cycle_now))
> > 
> >         kvmclock_offset = -ktime_get_boot_ns() + user_ns.clock - delta - grdtsc() + ka->master_cycle_now
> > 
> > But we don't want grdtsc() - ka->master_cycle_now in there.
> > 
> > Note: grdtsc() == guest read tsc.
> > 
> > Now with
> > 
> > +			kvm->arch.kvmclock_offset = user_ns.clock -
> > +							ka->master_kernel_ns;
> > 
> > What happens is that guest clock starts counting, via kernel timekeeper,
> > at the moment kvm_get_time_and_clockread() runs. If you add grdtsc() -
> > ka->master_cycle_now in there, you are mindfully counting clock twice
> > (first: kernel timekeeper, second: the TSC between the (grdtsc() -
> > ka->master_cycle_now) in question.
> > 
> > +			kvm->arch.kvmclock_offset =  -ktime_get_boot_ns() +user_ns.clock -delta
> > 
> > Note that (grdtsc() - ka->master_cycle_now) is susceptible to scheduling
> > etc.
> > 
> > Makes sense?
> 
> Yes.  The simpler code starts the kvmclock a bit later, but both are
> correct -- anything within KVM_SET_CLOCK runtime is.

No the simpler code is not correct. You count time with two clocks for a
small period of time.

KVM_SET_CLOCK means: set the guest clock to the passed value and start
counting it from there.

With the simple fix, KVM_SET_CLOCK does: set the guest clock to the
passed value, but also add the delta between kvm_get_time_and_clockread() 
and get_kvmclock_ns(). 
Which is variable, due to scheduling.
So it is just wrong.

> If we care about accuracy, then we should let userspace provide a
> (kernel timestamp, kvm timestamp) pair, so the value of kvmclock can
> really be controlled.

I suppose something else has to be done: the control of the clock, 
from whatever userspace is using to measure passage of time, 
to TSC, has to be done in kernel.

But lets see if that is really necessary when the QEMU
PTP/CLOCK_MONOTONIC delta stuff is done (working on it).

In the meantime, do you have anything against this patch? I depend on 
it for the work above.

> Adding ugly optimizations to work around shortcomings of the API is
> going the wrong way ...

What optimization you refer to?

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] KVM: x86: fix KVM_SET_CLOCK relative to setting correct clock value
  2017-05-13  3:46               ` Marcelo Tosatti
@ 2017-05-15 16:19                 ` Radim Krčmář
  2017-05-15 21:06                   ` Marcelo Tosatti
  0 siblings, 1 reply; 15+ messages in thread
From: Radim Krčmář @ 2017-05-15 16:19 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Paolo Bonzini, kvm-devel

2017-05-13 00:46-0300, Marcelo Tosatti:
> On Fri, May 12, 2017 at 07:37:12PM +0200, Radim Krčmář wrote:
> > 2017-05-12 12:31-0300, Marcelo Tosatti:
> > > Now with
> > > 
> > > +			kvm->arch.kvmclock_offset = user_ns.clock -
> > > +							ka->master_kernel_ns;
> > > 
> > > What happens is that guest clock starts counting, via kernel timekeeper,
> > > at the moment kvm_get_time_and_clockread() runs. If you add grdtsc() -
> > > ka->master_cycle_now in there, you are mindfully counting clock twice
> > > (first: kernel timekeeper, second: the TSC between the (grdtsc() -
> > > ka->master_cycle_now) in question.
> > > 
> > > +			kvm->arch.kvmclock_offset =  -ktime_get_boot_ns() +user_ns.clock -delta
> > > 
> > > Note that (grdtsc() - ka->master_cycle_now) is susceptible to scheduling
> > > etc.
> > > 
> > > Makes sense?
> > 
> > Yes.  The simpler code starts the kvmclock a bit later, but both are
> > correct -- anything within KVM_SET_CLOCK runtime is.
> 
> No the simpler code is not correct. You count time with two clocks for a
> small period of time.

A clock that counts kernel-nanoseconds is instantly replaced by a clock
that counts masterclock-nanoseconds, not incorrect by itself.

The simpler code will get the same kvmclock_offset as your code where
kvm_get_time_and_clockread() is called a bit later.
The distribution of resulting kvm_offsets will differ, but they must
both be correct or both incorrect, because they are already off-mark.

> KVM_SET_CLOCK means: set the guest clock to the passed value and start
> counting it from there.

Which is exactly what both versions do.

> With the simple fix, KVM_SET_CLOCK does: set the guest clock to the
> passed value, but also add the delta between kvm_get_time_and_clockread() 
> and get_kvmclock_ns(). 
> 
> Which is variable, due to scheduling.

Yes.

> So it is just wrong.

It makes the matter slightly worse by adding some execution time, but
the whole interface is "just wrong" even without that: we already have
the variability of the time between userspace's decision on
user_ns.clock value and kvm_get_time_and_clockread().

>> If we care about accuracy, then we should let userspace provide a
>> (kernel timestamp, kvm timestamp) pair, so the value of kvmclock can
>> really be controlled.
> 
> I suppose something else has to be done: the control of the clock, 
> from whatever userspace is using to measure passage of time, 
> to TSC, has to be done in kernel.

We agree, just worded it differently.

> But lets see if that is really necessary when the QEMU
> PTP/CLOCK_MONOTONIC delta stuff is done (working on it).

Right.

> In the meantime, do you have anything against this patch? I depend on 
> it for the work above.

The reasoning provided with the patch does not explain why

     * kvmclock_offset must be adjusted so that
     * user_ns.clock = master_kernel_ns + kvmclock_offset

Please explain why it "must".  I assert that it does not have to be.

If we agree that it is not necessary, then it is an optimization and I'd
like numbers to show that we are getting something that balances the
obfuscation; and why do we want it if we don't care about the better
solution (discussed above).

>                                                           I depend on 
> it for the work above.

Describing how other code couldn't work without this is great reason,
but again, please be specific -- what difference it make?

>> Adding ugly optimizations to work around shortcomings of the API is
>> going the wrong way ...
> 
> What optimization you refer to?

I refer to everything on top of the second hunk I posted.

Thanks.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v2] KVM: x86: fix KVM_SET_CLOCK relative to setting correct clock value
  2017-05-15 16:19                 ` Radim Krčmář
@ 2017-05-15 21:06                   ` Marcelo Tosatti
  0 siblings, 0 replies; 15+ messages in thread
From: Marcelo Tosatti @ 2017-05-15 21:06 UTC (permalink / raw)
  To: Radim Krčmář; +Cc: Paolo Bonzini, kvm-devel

On Mon, May 15, 2017 at 06:19:57PM +0200, Radim Krčmář wrote:
> 2017-05-13 00:46-0300, Marcelo Tosatti:
> > On Fri, May 12, 2017 at 07:37:12PM +0200, Radim Krčmář wrote:
> > > 2017-05-12 12:31-0300, Marcelo Tosatti:
> > > > Now with
> > > > 
> > > > +			kvm->arch.kvmclock_offset = user_ns.clock -
> > > > +							ka->master_kernel_ns;
> > > > 
> > > > What happens is that guest clock starts counting, via kernel timekeeper,
> > > > at the moment kvm_get_time_and_clockread() runs. If you add grdtsc() -
> > > > ka->master_cycle_now in there, you are mindfully counting clock twice
> > > > (first: kernel timekeeper, second: the TSC between the (grdtsc() -
> > > > ka->master_cycle_now) in question.
> > > > 
> > > > +			kvm->arch.kvmclock_offset =  -ktime_get_boot_ns() +user_ns.clock -delta
> > > > 
> > > > Note that (grdtsc() - ka->master_cycle_now) is susceptible to scheduling
> > > > etc.
> > > > 
> > > > Makes sense?
> > > 
> > > Yes.  The simpler code starts the kvmclock a bit later, but both are
> > > correct -- anything within KVM_SET_CLOCK runtime is.
> > 
> > No the simpler code is not correct. You count time with two clocks for a
> > small period of time.
> 
> A clock that counts kernel-nanoseconds is instantly replaced by a clock
> that counts masterclock-nanoseconds, not incorrect by itself.
> 
> The simpler code will get the same kvmclock_offset as your code where
> kvm_get_time_and_clockread() is called a bit later.
> The distribution of resulting kvm_offsets will differ, but they must
> both be correct or both incorrect, because they are already off-mark.
> 
> > KVM_SET_CLOCK means: set the guest clock to the passed value and start
> > counting it from there.
> 
> Which is exactly what both versions do.
> 
> > With the simple fix, KVM_SET_CLOCK does: set the guest clock to the
> > passed value, but also add the delta between kvm_get_time_and_clockread() 
> > and get_kvmclock_ns(). 
> > 
> > Which is variable, due to scheduling.
> 
> Yes.
> 
> > So it is just wrong.
> 
> It makes the matter slightly worse by adding some execution time, but
> the whole interface is "just wrong" even without that: we already have
> the variability of the time between userspace's decision on
> user_ns.clock value and kvm_get_time_and_clockread().
> 
> >> If we care about accuracy, then we should let userspace provide a
> >> (kernel timestamp, kvm timestamp) pair, so the value of kvmclock can
> >> really be controlled.
> > 
> > I suppose something else has to be done: the control of the clock, 
> > from whatever userspace is using to measure passage of time, 
> > to TSC, has to be done in kernel.
> 
> We agree, just worded it differently.
> 
> > But lets see if that is really necessary when the QEMU
> > PTP/CLOCK_MONOTONIC delta stuff is done (working on it).
> 
> Right.
> 
> > In the meantime, do you have anything against this patch? I depend on 
> > it for the work above.
> 
> The reasoning provided with the patch does not explain why
> 
>      * kvmclock_offset must be adjusted so that
>      * user_ns.clock = master_kernel_ns + kvmclock_offset
> 
> Please explain why it "must".  I assert that it does not have to be.
> 
> If we agree that it is not necessary, then it is an optimization and I'd
> like numbers to show that we are getting something that balances the
> obfuscation; and why do we want it if we don't care about the better
> solution (discussed above).
> 
> >                                                           I depend on 
> > it for the work above.
> 
> Describing how other code couldn't work without this is great reason,
> but again, please be specific -- what difference it make?
> 
> >> Adding ugly optimizations to work around shortcomings of the API is
> >> going the wrong way ...
> > 
> > What optimization you refer to?
> 
> I refer to everything on top of the second hunk I posted.
> 
> Thanks.

Actually you are right, your patch is fine because the 
length of time between kvm_get_time_and_clockread()
and get_kvmclock_ns(kvm) is compensated by

- grdtsc() + ka->master_cycle_now = 
- ( +grdtsc() - ka->master_cycle_now)

Which is the length of time between kvm_get_time_and_clockread() 
and get_kvmclock_ns(kvm).

Its much cleaner indeed. Can you please apply it? Thanks.

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2017-05-15 21:06 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-02 21:36 [PATCH] KVM: x86: fix KVM_SET_CLOCK relative to setting correct clock value Marcelo Tosatti
2017-05-03 13:08 ` Paolo Bonzini
2017-05-03 13:40   ` Marcelo Tosatti
2017-05-03 13:55     ` Paolo Bonzini
2017-05-03 18:24       ` Marcelo Tosatti
2017-05-03 13:43   ` [PATCH v2] " Marcelo Tosatti
2017-05-10 18:04     ` Radim Krčmář
2017-05-11 15:39       ` Marcelo Tosatti
2017-05-12 14:13         ` Radim Krčmář
2017-05-12 14:36           ` Paolo Bonzini
2017-05-12 15:31           ` Marcelo Tosatti
2017-05-12 17:37             ` Radim Krčmář
2017-05-13  3:46               ` Marcelo Tosatti
2017-05-15 16:19                 ` Radim Krčmář
2017-05-15 21:06                   ` Marcelo Tosatti

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.