linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [patch 0/4] KVM virtual PTP driver (v5)
@ 2017-01-24 17:09 Marcelo Tosatti
  2017-01-24 17:09 ` [patch 1/4] KVM: x86: provide realtime host clock via vsyscall notifiers Marcelo Tosatti
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Marcelo Tosatti @ 2017-01-24 17:09 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: Paolo Bonzini, Radim Krcmar, Richard Cochran, Miroslav Lichvar


This patchset implements a virtual PTP driver which allows guest to sync
its clock to the host clock with high precision
(error is < 1us on an idle guest).

Changelog from v4

Drop PTP_SYS_OFFSET emulation via ->crosstimestamp callback (Richard).
Emulate ->gettime directly without TSC offset (Radim).

Changelog from v3

Patch3:

v2: export kvmclock clocksource structure directly (Paolo)

Patch4:

v2: drop timekeeper spinlock, move back to drivers/ptp/ptp_chardev.c (Paolo)
    ptp_clock_gettime: support drivers with crosstimestamp but not
    gettime64 callbacks (Paolo)

Patch5:

v4: remove gettime64 callback (Paolo)

Changelog from v2

Cross timestamping, emulation of PTP_SYS_OFFSET with cross timestamping
(Paolo, Miroslav, Radim).

Changelog from v1

Patch1:

v2: unify nsec_base (Radim)

Patch2:
v2: improve documentation (Radim)
    change hypercall name to KVM_HC_CLOCK_PAIRING (Radim)
    increase padding size

Patch3:
v2: check for kvmclock (Radim)
    initialize global variables before device registration (Radim)


30 second output of a idle guest:


210 Number of sources = 1
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
#* PHC0                          0   3   377     9     -1ns[   -1ns] +/-    2ns
Tue Jan 24 11:46:36 EST 2017
210 Number of sources = 1
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
#* PHC0                          0   3   377     7     -0ns[   -1ns] +/-    2ns
Tue Jan 24 11:47:06 EST 2017
210 Number of sources = 1
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
#* PHC0                          0   3   377     5     -1ns[   +0ns] +/-    2ns
Tue Jan 24 11:47:36 EST 2017
210 Number of sources = 1
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
#* PHC0                          0   3   377    11     +0ns[   -2ns] +/-    2ns
Tue Jan 24 11:48:06 EST 2017
210 Number of sources = 1
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
#* PHC0                          0   3   377    10     -0ns[   -0ns] +/-    2ns
Tue Jan 24 11:48:36 EST 2017
210 Number of sources = 1
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
#* PHC0                          0   3   377     7     -0ns[   -0ns] +/-    2ns
Tue Jan 24 11:49:06 EST 2017
210 Number of sources = 1
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
#* PHC0                          0   3   377     5     -3ns[   -4ns] +/-    2ns
Tue Jan 24 11:49:36 EST 2017
210 Number of sources = 1
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
#* PHC0                          0   3   377    11     +2ns[   +2ns] +/-    2ns
Tue Jan 24 11:50:06 EST 2017
210 Number of sources = 1
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
#* PHC0                          0   3   377     9     -1ns[   +1ns] +/-    2ns
Tue Jan 24 11:50:36 EST 2017
210 Number of sources = 1
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
#* PHC0                          0   3   377     6     +0ns[   +0ns] +/-    2ns
Tue Jan 24 11:51:06 EST 2017
210 Number of sources = 1
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
#* PHC0                          0   3   377     5     +1ns[  +10ns] +/-    2ns
Tue Jan 24 11:51:36 EST 2017
210 Number of sources = 1
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
#* PHC0                          0   3   377    11     -5ns[   -6ns] +/-    2ns
Tue Jan 24 11:52:06 EST 2017
210 Number of sources = 1
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
#* PHC0                          0   3   377     9     -0ns[   -1ns] +/-    2ns
Tue Jan 24 11:52:36 EST 2017
210 Number of sources = 1
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
#* PHC0                          0   3   377     6     +0ns[   -5ns] +/-    2ns
Tue Jan 24 11:53:06 EST 2017
210 Number of sources = 1
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
#* PHC0                          0   3   377     4     -0ns[   -0ns] +/-    2ns
Tue Jan 24 11:53:36 EST 2017
210 Number of sources = 1
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
#* PHC0                          0   3   377    10     -0ns[   -0ns] +/-    2ns
Tue Jan 24 11:54:06 EST 2017
210 Number of sources = 1
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
#* PHC0                          0   3   377     9     -1ns[   -1ns] +/-    2ns
Tue Jan 24 11:54:36 EST 2017
210 Number of sources = 1
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
#* PHC0                          0   3   377     6     -1ns[   -2ns] +/-    2ns
Tue Jan 24 11:55:06 EST 2017
210 Number of sources = 1
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
#* PHC0                          0   3   377     4     -0ns[   +3ns] +/-    2ns
Tue Jan 24 11:55:36 EST 2017
210 Number of sources = 1
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
#* PHC0                          0   3   377    10     -3ns[   -6ns] +/-    2ns
Tue Jan 24 11:56:06 EST 2017
210 Number of sources = 1
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
#* PHC0                          0   3   377     8     +2ns[   +3ns] +/-    2ns
Tue Jan 24 11:56:36 EST 2017
210 Number of sources = 1
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
#* PHC0                          0   3   377     6     +0ns[   -2ns] +/-    2ns
Tue Jan 24 11:57:06 EST 2017
210 Number of sources = 1
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
#* PHC0                          0   3   377     4     -0ns[   +1ns] +/-    2ns
Tue Jan 24 11:57:36 EST 2017
210 Number of sources = 1
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
#* PHC0                          0   3   377    10     -1ns[   +1ns] +/-    2ns

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [patch 1/4] KVM: x86: provide realtime host clock via vsyscall notifiers
  2017-01-24 17:09 [patch 0/4] KVM virtual PTP driver (v5) Marcelo Tosatti
@ 2017-01-24 17:09 ` Marcelo Tosatti
  2017-01-25 12:30   ` Paolo Bonzini
  2017-01-24 17:09 ` [patch 2/4] KVM: x86: add KVM_HC_CLOCK_OFFSET hypercall Marcelo Tosatti
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 14+ messages in thread
From: Marcelo Tosatti @ 2017-01-24 17:09 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: Paolo Bonzini, Radim Krcmar, Richard Cochran, Miroslav Lichvar,
	Marcelo Tosatti

[-- Attachment #1: kvm-do-realtime --]
[-- Type: text/plain, Size: 2010 bytes --]

Expose the realtime host clock and save the TSC value
used for the clock calculation.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

---
 arch/x86/kvm/x86.c |   36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

v2: unify nsec_base (Radim)

Index: kvm-ptpdriver/arch/x86/kvm/x86.c
===================================================================
--- kvm-ptpdriver.orig/arch/x86/kvm/x86.c	2017-01-13 16:43:15.013244682 -0200
+++ kvm-ptpdriver/arch/x86/kvm/x86.c	2017-01-20 09:56:41.404534904 -0200
@@ -1139,6 +1139,7 @@
 
 	u64		boot_ns;
 	u64		nsec_base;
+	u64		wall_time_sec;
 };
 
 static struct pvclock_gtod_data pvclock_gtod_data;
@@ -1162,6 +1163,8 @@
 	vdata->boot_ns			= boot_ns;
 	vdata->nsec_base		= tk->tkr_mono.xtime_nsec;
 
+	vdata->wall_time_sec            = tk->xtime_sec;
+
 	write_seqcount_end(&vdata->seq);
 }
 #endif
@@ -1623,6 +1626,28 @@
 	return mode;
 }
 
+static int do_realtime(struct timespec *ts, cycle_t *cycle_now)
+{
+	struct pvclock_gtod_data *gtod = &pvclock_gtod_data;
+	unsigned long seq;
+	int mode;
+	u64 ns;
+
+	do {
+		seq = read_seqcount_begin(&gtod->seq);
+		mode = gtod->clock.vclock_mode;
+		ts->tv_sec = gtod->wall_time_sec;
+		ns = gtod->nsec_base;
+		ns += vgettsc(cycle_now);
+		ns >>= gtod->clock.shift;
+	} while (unlikely(read_seqcount_retry(&gtod->seq, seq)));
+
+	ts->tv_sec += __iter_div_u64_rem(ns, NSEC_PER_SEC, &ns);
+	ts->tv_nsec = ns;
+
+	return mode;
+}
+
 /* returns true if host is using tsc clocksource */
 static bool kvm_get_time_and_clockread(s64 *kernel_ns, cycle_t *cycle_now)
 {
@@ -1632,6 +1657,17 @@
 
 	return do_monotonic_boot(kernel_ns, cycle_now) == VCLOCK_TSC;
 }
+
+/* returns true if host is using tsc clocksource */
+static bool kvm_get_walltime_and_clockread(struct timespec *ts,
+					   cycle_t *cycle_now)
+{
+	/* checked again under seqlock below */
+	if (pvclock_gtod_data.clock.vclock_mode != VCLOCK_TSC)
+		return false;
+
+	return do_realtime(ts, cycle_now) == VCLOCK_TSC;
+}
 #endif
 
 /*

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [patch 2/4] KVM: x86: add KVM_HC_CLOCK_OFFSET hypercall
  2017-01-24 17:09 [patch 0/4] KVM virtual PTP driver (v5) Marcelo Tosatti
  2017-01-24 17:09 ` [patch 1/4] KVM: x86: provide realtime host clock via vsyscall notifiers Marcelo Tosatti
@ 2017-01-24 17:09 ` Marcelo Tosatti
  2017-01-25 12:30   ` Paolo Bonzini
  2017-01-24 17:09 ` [patch 3/4] kvmclock: export kvmclock clocksource pointer Marcelo Tosatti
  2017-01-24 17:09 ` [patch 4/4] PTP: add kvm PTP driver Marcelo Tosatti
  3 siblings, 1 reply; 14+ messages in thread
From: Marcelo Tosatti @ 2017-01-24 17:09 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: Paolo Bonzini, Radim Krcmar, Richard Cochran, Miroslav Lichvar,
	Marcelo Tosatti

[-- Attachment #1: kvm-hv-clock-offset --]
[-- Type: text/plain, Size: 5057 bytes --]

Add a hypercall to retrieve the host realtime clock
and the TSC value used to calculate that clock read.

Used to implement clock synchronization between
host and guest.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

---
 Documentation/virtual/kvm/hypercalls.txt |   33 ++++++++++++++++++++++++
 arch/x86/include/uapi/asm/kvm_para.h     |    9 ++++++
 arch/x86/kvm/x86.c                       |   41 +++++++++++++++++++++++++++++++
 include/uapi/linux/kvm_para.h            |    3 ++
 4 files changed, 86 insertions(+)

v2: improve documentation (Radim)
    change hypercall name to KVM_HC_CLOCK_PAIRING (Radim)
    increase padding size

Index: kvm-ptpdriver/arch/x86/include/uapi/asm/kvm_para.h
===================================================================
--- kvm-ptpdriver.orig/arch/x86/include/uapi/asm/kvm_para.h	2017-01-13 16:43:11.947240575 -0200
+++ kvm-ptpdriver/arch/x86/include/uapi/asm/kvm_para.h	2017-01-13 16:43:25.038258187 -0200
@@ -50,6 +50,15 @@
 	__u32 pad[11];
 };
 
+#define KVM_CLOCK_PAIRING_WALLCLOCK 0
+struct kvm_clock_offset {
+	__s64 sec;
+	__s64 nsec;
+	__u64 tsc;
+	__u32 flags;
+	__u32 pad[9];
+};
+
 #define KVM_STEAL_ALIGNMENT_BITS 5
 #define KVM_STEAL_VALID_BITS ((-1ULL << (KVM_STEAL_ALIGNMENT_BITS + 1)))
 #define KVM_STEAL_RESERVED_MASK (((1 << KVM_STEAL_ALIGNMENT_BITS) - 1 ) << 1)
Index: kvm-ptpdriver/Documentation/virtual/kvm/hypercalls.txt
===================================================================
--- kvm-ptpdriver.orig/Documentation/virtual/kvm/hypercalls.txt	2017-01-13 16:43:11.947240575 -0200
+++ kvm-ptpdriver/Documentation/virtual/kvm/hypercalls.txt	2017-01-13 16:43:25.038258187 -0200
@@ -81,3 +81,36 @@
 same guest can wakeup the sleeping vcpu by issuing KVM_HC_KICK_CPU hypercall,
 specifying APIC ID (a1) of the vcpu to be woken up. An additional argument (a0)
 is used in the hypercall for future use.
+
+
+6. KVM_HC_CLOCK_PAIRING
+------------------------
+Architecture: x86
+Status: active
+Purpose: Hypercall used to synchronize host and guest clocks.
+Usage:
+
+a0: guest physical address where host copies
+"struct kvm_clock_offset" structure.
+
+a1: clock_type, ATM only KVM_CLOCK_PAIRING_WALLCLOCK (0)
+is supported (hosts CLOCK_REALTIME clock).
+
+		struct kvm_clock_offset {
+			__s64 sec;
+			__s64 nsec;
+			__u64 tsc;
+			__u32 flags;
+			__u32 pad[9];
+		};
+
+       Where:
+               * sec: seconds from clock_type clock.
+               * nsec: nanoseconds from clock_type clock.
+               * tsc: TSC value used to calculate sec/nsec pair
+                 (this hypercall only works when host uses TSC clocksource).
+               * flags: flags, unused (0) at the moment.
+
+Returns KVM_EOPNOTSUPP if the host does not use TSC clocksource,
+or if clock type is different than KVM_CLOCK_PAIRING_WALLCLOCK.
+
Index: kvm-ptpdriver/arch/x86/kvm/x86.c
===================================================================
--- kvm-ptpdriver.orig/arch/x86/kvm/x86.c	2017-01-13 16:43:20.898252596 -0200
+++ kvm-ptpdriver/arch/x86/kvm/x86.c	2017-01-13 16:43:25.041258191 -0200
@@ -6119,6 +6119,44 @@
 }
 EXPORT_SYMBOL_GPL(kvm_emulate_halt);
 
+static int kvm_pv_clock_pairing(struct kvm_vcpu *vcpu, gpa_t paddr,
+			       unsigned long clock_type)
+{
+	struct kvm_clock_offset *clock_offset;
+	struct timespec ts;
+	cycle_t cycle;
+	int ret;
+
+	if (clock_type != KVM_CLOCK_PAIRING_WALLCLOCK)
+		return -KVM_EOPNOTSUPP;
+
+	if (pvclock_gtod_data.clock.vclock_mode != VCLOCK_TSC)
+		return -KVM_EOPNOTSUPP;
+
+	clock_offset = kzalloc(sizeof(struct kvm_clock_offset), GFP_KERNEL);
+	if (clock_offset == NULL)
+		return -KVM_ENOMEM;
+
+	if (kvm_get_walltime_and_clockread(&ts, &cycle) == false) {
+		kfree(clock_offset);
+		return -KVM_EOPNOTSUPP;
+	}
+
+	clock_offset->sec = ts.tv_sec;
+	clock_offset->nsec = ts.tv_nsec;
+	clock_offset->tsc = kvm_read_l1_tsc(vcpu, cycle);
+	clock_offset->flags = 0;
+
+	ret = 0;
+	if (kvm_write_guest(vcpu->kvm, paddr, clock_offset,
+			      sizeof(struct kvm_clock_offset)))
+		ret = -KVM_EFAULT;
+
+	kfree(clock_offset);
+
+	return ret;
+}
+
 /*
  * kvm_pv_kick_cpu_op:  Kick a vcpu.
  *
@@ -6183,6 +6221,9 @@
 		kvm_pv_kick_cpu_op(vcpu->kvm, a0, a1);
 		ret = 0;
 		break;
+	case KVM_HC_CLOCK_PAIRING:
+		ret = kvm_pv_clock_pairing(vcpu, a0, a1);
+		break;
 	default:
 		ret = -KVM_ENOSYS;
 		break;
Index: kvm-ptpdriver/include/uapi/linux/kvm_para.h
===================================================================
--- kvm-ptpdriver.orig/include/uapi/linux/kvm_para.h	2017-01-13 16:43:11.947240575 -0200
+++ kvm-ptpdriver/include/uapi/linux/kvm_para.h	2017-01-13 16:43:25.042258192 -0200
@@ -14,6 +14,8 @@
 #define KVM_EFAULT		EFAULT
 #define KVM_E2BIG		E2BIG
 #define KVM_EPERM		EPERM
+#define KVM_EOPNOTSUPP		EOPNOTSUPP
+#define KVM_ENOMEM		ENOMEM
 
 #define KVM_HC_VAPIC_POLL_IRQ		1
 #define KVM_HC_MMU_OP			2
@@ -23,6 +25,7 @@
 #define KVM_HC_MIPS_GET_CLOCK_FREQ	6
 #define KVM_HC_MIPS_EXIT_VM		7
 #define KVM_HC_MIPS_CONSOLE_OUTPUT	8
+#define KVM_HC_CLOCK_PAIRING		9
 
 /*
  * hypercalls use architecture specific

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [patch 3/4] kvmclock: export kvmclock clocksource pointer
  2017-01-24 17:09 [patch 0/4] KVM virtual PTP driver (v5) Marcelo Tosatti
  2017-01-24 17:09 ` [patch 1/4] KVM: x86: provide realtime host clock via vsyscall notifiers Marcelo Tosatti
  2017-01-24 17:09 ` [patch 2/4] KVM: x86: add KVM_HC_CLOCK_OFFSET hypercall Marcelo Tosatti
@ 2017-01-24 17:09 ` Marcelo Tosatti
  2017-01-24 17:09 ` [patch 4/4] PTP: add kvm PTP driver Marcelo Tosatti
  3 siblings, 0 replies; 14+ messages in thread
From: Marcelo Tosatti @ 2017-01-24 17:09 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: Paolo Bonzini, Radim Krcmar, Richard Cochran, Miroslav Lichvar,
	Marcelo Tosatti

[-- Attachment #1: kvmclock-export-clocksource --]
[-- Type: text/plain, Size: 1367 bytes --]

To be used by KVM PTP driver.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

---
 arch/x86/include/asm/kvmclock.h |    6 ++++++
 arch/x86/kernel/kvmclock.c      |    3 ++-
 2 files changed, 8 insertions(+), 1 deletion(-)

v2: export kvmclock clocksource structure directly (Paolo)

Index: kvm-ptpdriver/arch/x86/include/asm/kvmclock.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ kvm-ptpdriver/arch/x86/include/asm/kvmclock.h	2017-01-20 11:05:28.720038512 -0200
@@ -0,0 +1,6 @@
+#ifndef _ASM_X86_KVM_CLOCK_H
+#define _ASM_X86_KVM_CLOCK_H
+
+extern struct clocksource kvm_clock;
+
+#endif /* _ASM_X86_KVM_CLOCK_H */
Index: kvm-ptpdriver/arch/x86/kernel/kvmclock.c
===================================================================
--- kvm-ptpdriver.orig/arch/x86/kernel/kvmclock.c	2017-01-20 10:03:44.750489955 -0200
+++ kvm-ptpdriver/arch/x86/kernel/kvmclock.c	2017-01-20 11:05:03.137971637 -0200
@@ -28,6 +28,7 @@
 
 #include <asm/x86_init.h>
 #include <asm/reboot.h>
+#include <asm/kvmclock.h>
 
 static int kvmclock __ro_after_init = 1;
 static int msr_kvm_system_time = MSR_KVM_SYSTEM_TIME;
@@ -174,7 +175,7 @@
 	return ret;
 }
 
-static struct clocksource kvm_clock = {
+struct clocksource kvm_clock = {
 	.name = "kvm-clock",
 	.read = kvm_clock_get_cycles,
 	.rating = 400,

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [patch 4/4] PTP: add kvm PTP driver
  2017-01-24 17:09 [patch 0/4] KVM virtual PTP driver (v5) Marcelo Tosatti
                   ` (2 preceding siblings ...)
  2017-01-24 17:09 ` [patch 3/4] kvmclock: export kvmclock clocksource pointer Marcelo Tosatti
@ 2017-01-24 17:09 ` Marcelo Tosatti
  2017-01-24 18:11   ` Richard Cochran
  2017-01-25 12:56   ` Paolo Bonzini
  3 siblings, 2 replies; 14+ messages in thread
From: Marcelo Tosatti @ 2017-01-24 17:09 UTC (permalink / raw)
  To: kvm, linux-kernel
  Cc: Paolo Bonzini, Radim Krcmar, Richard Cochran, Miroslav Lichvar,
	Marcelo Tosatti

[-- Attachment #1: kvm-ptpdriver --]
[-- Type: text/plain, Size: 7426 bytes --]

Add a driver with gettime method returning hosts realtime clock.
This allows Chrony to synchronize host and guest clocks with 
high precision (see results below).

chronyc> sources
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
#* PHC0                          0   3   377     4   +162ns[ -683ns] +/-   11ns

To configure Chronyd to use PHC refclock, add the 
following line to its configuration file:

refclock PHC /dev/ptpX poll 3 dpoll -2 offset 0

Where /dev/ptpX is the kvmclock PTP clock.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

---
 drivers/ptp/Kconfig   |   12 ++
 drivers/ptp/Makefile  |    1 
 drivers/ptp/ptp_kvm.c |  202 ++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 215 insertions(+)

v2: check for kvmclock (Radim)
    initialize global variables before device registration (Radim)
v3: use cross timestamps callback (Paolo, Miroslav, Radim)
v4: remove gettime64 callback (Paolo)
v5: drop useless nsec assignment (Radim)

Index: kvm-ptpdriver/drivers/ptp/Kconfig
===================================================================
--- kvm-ptpdriver.orig/drivers/ptp/Kconfig	2017-01-24 09:03:47.604586098 -0200
+++ kvm-ptpdriver/drivers/ptp/Kconfig	2017-01-24 09:05:06.321704124 -0200
@@ -90,4 +90,16 @@
 	  To compile this driver as a module, choose M here: the module
 	  will be called ptp_pch.
 
+config PTP_1588_CLOCK_KVM
+	tristate "KVM virtual PTP clock"
+	depends on PTP_1588_CLOCK
+	depends on KVM_GUEST
+	default y
+	help
+	  This driver adds support for using kvm infrastructure as a PTP
+	  clock. This clock is only useful if you are using KVM guests.
+
+	  To compile this driver as a module, choose M here: the module
+	  will be called ptp_kvm.
+
 endmenu
Index: kvm-ptpdriver/drivers/ptp/Makefile
===================================================================
--- kvm-ptpdriver.orig/drivers/ptp/Makefile	2017-01-24 09:03:47.604586098 -0200
+++ kvm-ptpdriver/drivers/ptp/Makefile	2017-01-24 09:05:06.322704125 -0200
@@ -6,3 +6,4 @@
 obj-$(CONFIG_PTP_1588_CLOCK)		+= ptp.o
 obj-$(CONFIG_PTP_1588_CLOCK_IXP46X)	+= ptp_ixp46x.o
 obj-$(CONFIG_PTP_1588_CLOCK_PCH)	+= ptp_pch.o
+obj-$(CONFIG_PTP_1588_CLOCK_KVM)	+= ptp_kvm.o
Index: kvm-ptpdriver/drivers/ptp/ptp_kvm.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ kvm-ptpdriver/drivers/ptp/ptp_kvm.c	2017-01-24 09:05:06.322704125 -0200
@@ -0,0 +1,202 @@
+/*
+ * Virtual PTP 1588 clock for use with KVM guests
+ *
+ * Copyright (C) 2017 Red Hat Inc.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ */
+#include <linux/device.h>
+#include <linux/err.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <uapi/linux/kvm_para.h>
+#include <asm/kvm_para.h>
+#include <asm/pvclock.h>
+#include <asm/kvmclock.h>
+#include <uapi/asm/kvm_para.h>
+
+#include <linux/ptp_clock_kernel.h>
+
+struct kvm_ptp_clock {
+	struct ptp_clock *ptp_clock;
+	struct ptp_clock_info caps;
+};
+
+DEFINE_SPINLOCK(kvm_ptp_lock);
+
+static struct pvclock_vsyscall_time_info *hv_clock;
+
+static struct kvm_clock_offset clock_off;
+static phys_addr_t clock_off_gpa;
+
+static int ptp_kvm_get_time_fn(ktime_t *device_time,
+			       struct system_counterval_t *system_counter,
+			       void *ctx)
+{
+	unsigned long ret;
+	struct timespec64 tspec;
+	unsigned version;
+	u8 flags;
+	int cpu;
+	struct pvclock_vcpu_time_info *src;
+
+	preempt_disable_notrace();
+	cpu = smp_processor_id();
+	src = &hv_clock[cpu].pvti;
+
+	spin_lock(&kvm_ptp_lock);
+
+	do {
+		/*
+		 * We are using a TSC value read in the hosts
+		 * kvm_hc_clock_pairing handling.
+		 * So any changes to tsc_to_system_mul
+		 * and tsc_shift or any other pvclock
+		 * data invalidate that measurement.
+		 */
+		version = pvclock_read_begin(src);
+
+		ret = kvm_hypercall2(KVM_HC_CLOCK_PAIRING,
+				     clock_off_gpa,
+				     KVM_CLOCK_PAIRING_WALLCLOCK);
+		if (ret != 0) {
+			pr_err("clock offset hypercall ret %lu\n", ret);
+			spin_unlock(&kvm_ptp_lock);
+			preempt_enable_notrace();
+			return -EOPNOTSUPP;
+		}
+
+		tspec.tv_sec = clock_off.sec;
+		tspec.tv_nsec = clock_off.nsec;
+		ret = __pvclock_read_cycles(src, clock_off.tsc);
+		flags = src->flags;
+	} while (pvclock_read_retry(src, version));
+
+	preempt_enable_notrace();
+
+	system_counter->cycles = ret;
+	system_counter->cs = &kvm_clock;
+
+	*device_time = timespec_to_ktime(tspec);
+
+	spin_unlock(&kvm_ptp_lock);
+
+	return 0;
+}
+
+static int ptp_kvm_getcrosststamp(struct ptp_clock_info *ptp,
+				  struct system_device_crosststamp *xtstamp)
+{
+	return get_device_system_crosststamp(ptp_kvm_get_time_fn, NULL,
+					     NULL, xtstamp);
+}
+
+/*
+ * PTP clock operations
+ */
+
+static int ptp_kvm_adjfreq(struct ptp_clock_info *ptp, s32 ppb)
+{
+	return -EOPNOTSUPP;
+}
+
+static int ptp_kvm_adjtime(struct ptp_clock_info *ptp, s64 delta)
+{
+	return -EOPNOTSUPP;
+}
+
+static int ptp_kvm_settime(struct ptp_clock_info *ptp,
+			   const struct timespec64 *ts)
+{
+	return -EOPNOTSUPP;
+}
+
+static int ptp_kvm_gettime(struct ptp_clock_info *ptp, struct timespec64 *ts)
+{
+	unsigned long ret;
+	struct timespec64 tspec;
+
+	spin_lock(&kvm_ptp_lock);
+
+	ret = kvm_hypercall2(KVM_HC_CLOCK_PAIRING,
+			     clock_off_gpa,
+			     KVM_CLOCK_PAIRING_WALLCLOCK);
+	if (ret != 0) {
+		pr_err("clock offset hypercall ret %lu\n", ret);
+		spin_unlock(&kvm_ptp_lock);
+		return -EOPNOTSUPP;
+	}
+
+	tspec.tv_sec = clock_off.sec;
+	tspec.tv_nsec = clock_off.nsec;
+	spin_unlock(&kvm_ptp_lock);
+
+	memcpy(ts, &tspec, sizeof(struct timespec64));
+
+	return 0;
+}
+
+static int ptp_kvm_enable(struct ptp_clock_info *ptp,
+			  struct ptp_clock_request *rq, int on)
+{
+	return -EOPNOTSUPP;
+}
+
+static struct ptp_clock_info ptp_kvm_caps = {
+	.owner		= THIS_MODULE,
+	.name		= "KVM virtual PTP",
+	.max_adj	= 0,
+	.n_ext_ts	= 0,
+	.n_pins		= 0,
+	.pps		= 0,
+	.adjfreq	= ptp_kvm_adjfreq,
+	.adjtime	= ptp_kvm_adjtime,
+	.gettime64	= ptp_kvm_gettime,
+	.settime64	= ptp_kvm_settime,
+	.enable		= ptp_kvm_enable,
+	.getcrosststamp = ptp_kvm_getcrosststamp,
+};
+
+/* module operations */
+
+static struct kvm_ptp_clock kvm_ptp_clock;
+
+static void __exit ptp_kvm_exit(void)
+{
+	ptp_clock_unregister(kvm_ptp_clock.ptp_clock);
+}
+
+static int __init ptp_kvm_init(void)
+{
+	clock_off_gpa = slow_virt_to_phys(&clock_off);
+	hv_clock = pvclock_pvti_cpu0_va();
+
+	if (!hv_clock)
+		return -ENODEV;
+
+	kvm_ptp_clock.caps = ptp_kvm_caps;
+
+	kvm_ptp_clock.ptp_clock = ptp_clock_register(&kvm_ptp_clock.caps, NULL);
+
+	if (IS_ERR(kvm_ptp_clock.ptp_clock))
+		return PTR_ERR(kvm_ptp_clock.ptp_clock);
+
+	return 0;
+}
+
+module_init(ptp_kvm_init);
+module_exit(ptp_kvm_exit);
+
+MODULE_AUTHOR("Marcelo Tosatti <mtosatti@redhat.com>");
+MODULE_DESCRIPTION("PTP clock using KVMCLOCK");
+MODULE_LICENSE("GPL");

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [patch 4/4] PTP: add kvm PTP driver
  2017-01-24 17:09 ` [patch 4/4] PTP: add kvm PTP driver Marcelo Tosatti
@ 2017-01-24 18:11   ` Richard Cochran
  2017-01-25 12:56   ` Paolo Bonzini
  1 sibling, 0 replies; 14+ messages in thread
From: Richard Cochran @ 2017-01-24 18:11 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: kvm, linux-kernel, Paolo Bonzini, Radim Krcmar, Miroslav Lichvar

On Tue, Jan 24, 2017 at 03:09:42PM -0200, Marcelo Tosatti wrote:
> Add a driver with gettime method returning hosts realtime clock.
> This allows Chrony to synchronize host and guest clocks with 
> high precision (see results below).
> 
> chronyc> sources
> MS Name/IP address         Stratum Poll Reach LastRx Last sample
> ===============================================================================
> #* PHC0                          0   3   377     4   +162ns[ -683ns] +/-   11ns
> 
> To configure Chronyd to use PHC refclock, add the 
> following line to its configuration file:
> 
> refclock PHC /dev/ptpX poll 3 dpoll -2 offset 0
> 
> Where /dev/ptpX is the kvmclock PTP clock.
> 
> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

Acked-by: Richard Cochran <richardcochran@gmail.com>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [patch 1/4] KVM: x86: provide realtime host clock via vsyscall notifiers
  2017-01-24 17:09 ` [patch 1/4] KVM: x86: provide realtime host clock via vsyscall notifiers Marcelo Tosatti
@ 2017-01-25 12:30   ` Paolo Bonzini
  0 siblings, 0 replies; 14+ messages in thread
From: Paolo Bonzini @ 2017-01-25 12:30 UTC (permalink / raw)
  To: Marcelo Tosatti, kvm, linux-kernel
  Cc: Radim Krcmar, Richard Cochran, Miroslav Lichvar



On 24/01/2017 18:09, Marcelo Tosatti wrote:
> Expose the realtime host clock and save the TSC value
> used for the clock calculation.
> 
> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
> 
> ---
>  arch/x86/kvm/x86.c |   36 ++++++++++++++++++++++++++++++++++++
>  1 file changed, 36 insertions(+)
> 
> v2: unify nsec_base (Radim)
> 
> Index: kvm-ptpdriver/arch/x86/kvm/x86.c
> ===================================================================
> --- kvm-ptpdriver.orig/arch/x86/kvm/x86.c	2017-01-13 16:43:15.013244682 -0200
> +++ kvm-ptpdriver/arch/x86/kvm/x86.c	2017-01-20 09:56:41.404534904 -0200
> @@ -1139,6 +1139,7 @@
>  
>  	u64		boot_ns;
>  	u64		nsec_base;
> +	u64		wall_time_sec;
>  };
>  
>  static struct pvclock_gtod_data pvclock_gtod_data;
> @@ -1162,6 +1163,8 @@
>  	vdata->boot_ns			= boot_ns;
>  	vdata->nsec_base		= tk->tkr_mono.xtime_nsec;
>  
> +	vdata->wall_time_sec            = tk->xtime_sec;
> +
>  	write_seqcount_end(&vdata->seq);
>  }
>  #endif
> @@ -1623,6 +1626,28 @@
>  	return mode;
>  }
>  
> +static int do_realtime(struct timespec *ts, cycle_t *cycle_now)
> +{
> +	struct pvclock_gtod_data *gtod = &pvclock_gtod_data;
> +	unsigned long seq;
> +	int mode;
> +	u64 ns;
> +
> +	do {
> +		seq = read_seqcount_begin(&gtod->seq);
> +		mode = gtod->clock.vclock_mode;
> +		ts->tv_sec = gtod->wall_time_sec;
> +		ns = gtod->nsec_base;
> +		ns += vgettsc(cycle_now);
> +		ns >>= gtod->clock.shift;
> +	} while (unlikely(read_seqcount_retry(&gtod->seq, seq)));
> +
> +	ts->tv_sec += __iter_div_u64_rem(ns, NSEC_PER_SEC, &ns);
> +	ts->tv_nsec = ns;
> +
> +	return mode;
> +}
> +
>  /* returns true if host is using tsc clocksource */
>  static bool kvm_get_time_and_clockread(s64 *kernel_ns, cycle_t *cycle_now)
>  {
> @@ -1632,6 +1657,17 @@
>  
>  	return do_monotonic_boot(kernel_ns, cycle_now) == VCLOCK_TSC;
>  }
> +
> +/* returns true if host is using tsc clocksource */
> +static bool kvm_get_walltime_and_clockread(struct timespec *ts,
> +					   cycle_t *cycle_now)
> +{
> +	/* checked again under seqlock below */
> +	if (pvclock_gtod_data.clock.vclock_mode != VCLOCK_TSC)
> +		return false;
> +
> +	return do_realtime(ts, cycle_now) == VCLOCK_TSC;
> +}
>  #endif
>  
>  /*
> 
> 
> 

Only used in the next patch, will squash.  Also, cycle_t is gone now so
use u64 instead.

Paolo

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [patch 2/4] KVM: x86: add KVM_HC_CLOCK_OFFSET hypercall
  2017-01-24 17:09 ` [patch 2/4] KVM: x86: add KVM_HC_CLOCK_OFFSET hypercall Marcelo Tosatti
@ 2017-01-25 12:30   ` Paolo Bonzini
  2017-01-25 12:55     ` Marcelo Tosatti
  0 siblings, 1 reply; 14+ messages in thread
From: Paolo Bonzini @ 2017-01-25 12:30 UTC (permalink / raw)
  To: Marcelo Tosatti, kvm, linux-kernel
  Cc: Radim Krcmar, Richard Cochran, Miroslav Lichvar



On 24/01/2017 18:09, Marcelo Tosatti wrote:
> Add a hypercall to retrieve the host realtime clock
> and the TSC value used to calculate that clock read.
> 
> Used to implement clock synchronization between
> host and guest.
> 
> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
> 
> ---
>  Documentation/virtual/kvm/hypercalls.txt |   33 ++++++++++++++++++++++++
>  arch/x86/include/uapi/asm/kvm_para.h     |    9 ++++++
>  arch/x86/kvm/x86.c                       |   41 +++++++++++++++++++++++++++++++
>  include/uapi/linux/kvm_para.h            |    3 ++
>  4 files changed, 86 insertions(+)
> 
> v2: improve documentation (Radim)
>     change hypercall name to KVM_HC_CLOCK_PAIRING (Radim)
>     increase padding size
> 
> Index: kvm-ptpdriver/arch/x86/include/uapi/asm/kvm_para.h
> ===================================================================
> --- kvm-ptpdriver.orig/arch/x86/include/uapi/asm/kvm_para.h	2017-01-13 16:43:11.947240575 -0200
> +++ kvm-ptpdriver/arch/x86/include/uapi/asm/kvm_para.h	2017-01-13 16:43:25.038258187 -0200
> @@ -50,6 +50,15 @@
>  	__u32 pad[11];
>  };
>  
> +#define KVM_CLOCK_PAIRING_WALLCLOCK 0
> +struct kvm_clock_offset {

The struct still has the old name.

> +	__s64 sec;
> +	__s64 nsec;
> +	__u64 tsc;
> +	__u32 flags;
> +	__u32 pad[9];
> +};
> +
>  #define KVM_STEAL_ALIGNMENT_BITS 5
>  #define KVM_STEAL_VALID_BITS ((-1ULL << (KVM_STEAL_ALIGNMENT_BITS + 1)))
>  #define KVM_STEAL_RESERVED_MASK (((1 << KVM_STEAL_ALIGNMENT_BITS) - 1 ) << 1)
> Index: kvm-ptpdriver/Documentation/virtual/kvm/hypercalls.txt
> ===================================================================
> --- kvm-ptpdriver.orig/Documentation/virtual/kvm/hypercalls.txt	2017-01-13 16:43:11.947240575 -0200
> +++ kvm-ptpdriver/Documentation/virtual/kvm/hypercalls.txt	2017-01-13 16:43:25.038258187 -0200
> @@ -81,3 +81,36 @@
>  same guest can wakeup the sleeping vcpu by issuing KVM_HC_KICK_CPU hypercall,
>  specifying APIC ID (a1) of the vcpu to be woken up. An additional argument (a0)
>  is used in the hypercall for future use.
> +
> +
> +6. KVM_HC_CLOCK_PAIRING
> +------------------------
> +Architecture: x86
> +Status: active
> +Purpose: Hypercall used to synchronize host and guest clocks.
> +Usage:
> +
> +a0: guest physical address where host copies
> +"struct kvm_clock_offset" structure.
> +
> +a1: clock_type, ATM only KVM_CLOCK_PAIRING_WALLCLOCK (0)
> +is supported (hosts CLOCK_REALTIME clock).
> +
> +		struct kvm_clock_offset {

Name of the struct not adjusted.
> +			__s64 sec;
> +			__s64 nsec;
> +			__u64 tsc;
> +			__u32 flags;
> +			__u32 pad[9];

I'd just leave one element of padding and put the struct on the stack, 
since we have no reason to think we'll need a lot of padding.

> +		};
> +
> +       Where:
> +               * sec: seconds from clock_type clock.
> +               * nsec: nanoseconds from clock_type clock.
> +               * tsc: TSC value used to calculate sec/nsec pair
> +                 (this hypercall only works when host uses TSC clocksource).

This is already mentioned below, and it's not clear that tsc is a guest 
TSC value.

> +               * flags: flags, unused (0) at the moment.
> +
> +Returns KVM_EOPNOTSUPP if the host does not use TSC clocksource,
> +or if clock type is different than KVM_CLOCK_PAIRING_WALLCLOCK.
> +
> Index: kvm-ptpdriver/arch/x86/kvm/x86.c
> ===================================================================
> --- kvm-ptpdriver.orig/arch/x86/kvm/x86.c	2017-01-13 16:43:20.898252596 -0200
> +++ kvm-ptpdriver/arch/x86/kvm/x86.c	2017-01-13 16:43:25.041258191 -0200
> @@ -6119,6 +6119,44 @@
>  }
>  EXPORT_SYMBOL_GPL(kvm_emulate_halt);
>  
> +static int kvm_pv_clock_pairing(struct kvm_vcpu *vcpu, gpa_t paddr,
> +			       unsigned long clock_type)
> +{
> +	struct kvm_clock_offset *clock_offset;
> +	struct timespec ts;
> +	cycle_t cycle;
> +	int ret;
> +
> +	if (clock_type != KVM_CLOCK_PAIRING_WALLCLOCK)
> +		return -KVM_EOPNOTSUPP;
> +
> +	if (pvclock_gtod_data.clock.vclock_mode != VCLOCK_TSC)
> +		return -KVM_EOPNOTSUPP;

Already checked in kvm_get_walltime_and_clockread.

> +	clock_offset = kzalloc(sizeof(struct kvm_clock_offset), GFP_KERNEL);
> +	if (clock_offset == NULL)
> +		return -KVM_ENOMEM;
> +
> +	if (kvm_get_walltime_and_clockread(&ts, &cycle) == false) {
> +		kfree(clock_offset);
> +		return -KVM_EOPNOTSUPP;
> +	}
> +	clock_offset->sec = ts.tv_sec;
> +	clock_offset->nsec = ts.tv_nsec;
> +	clock_offset->tsc = kvm_read_l1_tsc(vcpu, cycle);
> +	clock_offset->flags = 0;
> +
> +	ret = 0;
> +	if (kvm_write_guest(vcpu->kvm, paddr, clock_offset,
> +			      sizeof(struct kvm_clock_offset)))
> +		ret = -KVM_EFAULT;
> +
> +	kfree(clock_offset);
> +
> +	return ret;
> +}
> +
>  /*
>   * kvm_pv_kick_cpu_op:  Kick a vcpu.
>   *
> @@ -6183,6 +6221,9 @@
>  		kvm_pv_kick_cpu_op(vcpu->kvm, a0, a1);
>  		ret = 0;
>  		break;
> +	case KVM_HC_CLOCK_PAIRING:
> +		ret = kvm_pv_clock_pairing(vcpu, a0, a1);
> +		break;
>  	default:
>  		ret = -KVM_ENOSYS;
>  		break;
> Index: kvm-ptpdriver/include/uapi/linux/kvm_para.h
> ===================================================================
> --- kvm-ptpdriver.orig/include/uapi/linux/kvm_para.h	2017-01-13 16:43:11.947240575 -0200
> +++ kvm-ptpdriver/include/uapi/linux/kvm_para.h	2017-01-13 16:43:25.042258192 -0200
> @@ -14,6 +14,8 @@
>  #define KVM_EFAULT		EFAULT
>  #define KVM_E2BIG		E2BIG
>  #define KVM_EPERM		EPERM
> +#define KVM_EOPNOTSUPP		EOPNOTSUPP
> +#define KVM_ENOMEM		ENOMEM

This file could potentially be used across multiple architectures (it's 
not in arch/x86/include/uapi/asm), but EOPNOTSUPP is not the same on 
all systems.  Better use "95" which is the value it has on x86.

All this gives:

diff --git a/Documentation/virtual/kvm/hypercalls.txt b/Documentation/virtual/kvm/hypercalls.txt
index 55eb3e1ca494..cd35de84999f 100644
--- a/Documentation/virtual/kvm/hypercalls.txt
+++ b/Documentation/virtual/kvm/hypercalls.txt
@@ -94,9 +94,9 @@ a0: guest physical address where host copies
 "struct kvm_clock_offset" structure.
 
 a1: clock_type, ATM only KVM_CLOCK_PAIRING_WALLCLOCK (0)
-is supported (hosts CLOCK_REALTIME clock).
+is supported (corresponding to the host's CLOCK_REALTIME clock).
 
-		struct kvm_clock_offset {
+		struct kvm_clock_pairing {
 			__s64 sec;
 			__s64 nsec;
 			__u64 tsc;
@@ -107,10 +107,12 @@ is supported (hosts CLOCK_REALTIME clock).
        Where:
                * sec: seconds from clock_type clock.
                * nsec: nanoseconds from clock_type clock.
-               * tsc: TSC value used to calculate sec/nsec pair
-                 (this hypercall only works when host uses TSC clocksource).
+               * tsc: guest TSC value used to calculate sec/nsec pair
                * flags: flags, unused (0) at the moment.
 
+The hypercall lets a guest compute a precise timestamp across
+host and guest.  The guest can use the returned TSC value to
+compute the CLOCK_REALTIME for its clock, at the same instant.
+
 Returns KVM_EOPNOTSUPP if the host does not use TSC clocksource,
 or if clock type is different than KVM_CLOCK_PAIRING_WALLCLOCK.
-
diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index ce7b411e28bc..3641d8b7dba9 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -51,12 +51,12 @@ struct kvm_steal_time {
 };
 
 #define KVM_CLOCK_PAIRING_WALLCLOCK 0
-struct kvm_clock_offset {
+struct kvm_clock_pairing {
 	__s64 sec;
 	__s64 nsec;
 	__u64 tsc;
 	__u32 flags;
-	__u32 pad[9];
+	__u32 pad[1];
 };
 
 #define KVM_STEAL_ALIGNMENT_BITS 5
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 4ecc3145ea4b..09e5d31dac98 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6149,9 +6149,9 @@ int kvm_emulate_halt(struct kvm_vcpu *vcpu)
 EXPORT_SYMBOL_GPL(kvm_emulate_halt);
 
 static int kvm_pv_clock_pairing(struct kvm_vcpu *vcpu, gpa_t paddr,
-			       unsigned long clock_type)
+			        unsigned long clock_type)
 {
-	struct kvm_clock_offset *clock_offset;
+	struct kvm_clock_pairing clock_pairing;
 	struct timespec ts;
 	cycle_t cycle;
 	int ret;
@@ -6159,30 +6159,19 @@ static int kvm_pv_clock_pairing(struct kvm_vcpu *vcpu, gpa_t paddr,
 	if (clock_type != KVM_CLOCK_PAIRING_WALLCLOCK)
 		return -KVM_EOPNOTSUPP;
 
-	if (pvclock_gtod_data.clock.vclock_mode != VCLOCK_TSC)
-		return -KVM_EOPNOTSUPP;
-
-	clock_offset = kzalloc(sizeof(struct kvm_clock_offset), GFP_KERNEL);
-	if (clock_offset == NULL)
-		return -KVM_ENOMEM;
-
-	if (kvm_get_walltime_and_clockread(&ts, &cycle) == false) {
-		kfree(clock_offset);
+	if (kvm_get_walltime_and_clockread(&ts, &cycle) == false)
 		return -KVM_EOPNOTSUPP;
-	}
 
-	clock_offset->sec = ts.tv_sec;
-	clock_offset->nsec = ts.tv_nsec;
-	clock_offset->tsc = kvm_read_l1_tsc(vcpu, cycle);
-	clock_offset->flags = 0;
+	clock_pairing.sec = ts.tv_sec;
+	clock_pairing.nsec = ts.tv_nsec;
+	clock_pairing.tsc = kvm_read_l1_tsc(vcpu, cycle);
+	clock_pairing.flags = 0;
 
 	ret = 0;
-	if (kvm_write_guest(vcpu->kvm, paddr, clock_offset,
-			      sizeof(struct kvm_clock_offset)))
+	if (kvm_write_guest(vcpu->kvm, paddr, &clock_pairing,
+			    sizeof(struct kvm_clock_pairing)))
 		ret = -KVM_EFAULT;
 
-	kfree(clock_offset);
-
 	return ret;
 }
 
diff --git a/include/uapi/linux/kvm_para.h b/include/uapi/linux/kvm_para.h
index 68da60e7f519..c72128527377 100644
--- a/include/uapi/linux/kvm_para.h
+++ b/include/uapi/linux/kvm_para.h
@@ -14,7 +14,7 @@
 #define KVM_EFAULT		EFAULT
 #define KVM_E2BIG		E2BIG
 #define KVM_EPERM		EPERM
-#define KVM_EOPNOTSUPP		EOPNOTSUPP
+#define KVM_EOPNOTSUPP		95
 #define KVM_ENOMEM		ENOMEM
 
 #define KVM_HC_VAPIC_POLL_IRQ		1

Okay?

Paolo

>  #define KVM_HC_VAPIC_POLL_IRQ		1
>  #define KVM_HC_MMU_OP			2
> @@ -23,6 +25,7 @@
>  #define KVM_HC_MIPS_GET_CLOCK_FREQ	6
>  #define KVM_HC_MIPS_EXIT_VM		7
>  #define KVM_HC_MIPS_CONSOLE_OUTPUT	8
> +#define KVM_HC_CLOCK_PAIRING		9
>  
>  /*
>   * hypercalls use architecture specific
> 
> 
> 

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [patch 2/4] KVM: x86: add KVM_HC_CLOCK_OFFSET hypercall
  2017-01-25 12:30   ` Paolo Bonzini
@ 2017-01-25 12:55     ` Marcelo Tosatti
  2017-01-25 12:59       ` Paolo Bonzini
  0 siblings, 1 reply; 14+ messages in thread
From: Marcelo Tosatti @ 2017-01-25 12:55 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: kvm, linux-kernel, Radim Krcmar, Richard Cochran, Miroslav Lichvar

On Wed, Jan 25, 2017 at 01:30:49PM +0100, Paolo Bonzini wrote:
> 
> 
> On 24/01/2017 18:09, Marcelo Tosatti wrote:
> > Add a hypercall to retrieve the host realtime clock
> > and the TSC value used to calculate that clock read.
> > 
> > Used to implement clock synchronization between
> > host and guest.
> > 
> > Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
> > 
> > ---
> >  Documentation/virtual/kvm/hypercalls.txt |   33 ++++++++++++++++++++++++
> >  arch/x86/include/uapi/asm/kvm_para.h     |    9 ++++++
> >  arch/x86/kvm/x86.c                       |   41 +++++++++++++++++++++++++++++++
> >  include/uapi/linux/kvm_para.h            |    3 ++
> >  4 files changed, 86 insertions(+)
> > 
> > v2: improve documentation (Radim)
> >     change hypercall name to KVM_HC_CLOCK_PAIRING (Radim)
> >     increase padding size
> > 
> > Index: kvm-ptpdriver/arch/x86/include/uapi/asm/kvm_para.h
> > ===================================================================
> > --- kvm-ptpdriver.orig/arch/x86/include/uapi/asm/kvm_para.h	2017-01-13 16:43:11.947240575 -0200
> > +++ kvm-ptpdriver/arch/x86/include/uapi/asm/kvm_para.h	2017-01-13 16:43:25.038258187 -0200
> > @@ -50,6 +50,15 @@
> >  	__u32 pad[11];
> >  };
> >  
> > +#define KVM_CLOCK_PAIRING_WALLCLOCK 0
> > +struct kvm_clock_offset {
> 
> The struct still has the old name.
> 
> > +	__s64 sec;
> > +	__s64 nsec;
> > +	__u64 tsc;
> > +	__u32 flags;
> > +	__u32 pad[9];
> > +};

That was on purpose: "to pair" is the operation. 

"struct kvm_clock_pairing" makes no sense to me 
(because pairing is an "action").

(if you'd prefer that name, let me know).

> >  #define KVM_STEAL_ALIGNMENT_BITS 5
> >  #define KVM_STEAL_VALID_BITS ((-1ULL << (KVM_STEAL_ALIGNMENT_BITS + 1)))
> >  #define KVM_STEAL_RESERVED_MASK (((1 << KVM_STEAL_ALIGNMENT_BITS) - 1 ) << 1)
> > Index: kvm-ptpdriver/Documentation/virtual/kvm/hypercalls.txt
> > ===================================================================
> > --- kvm-ptpdriver.orig/Documentation/virtual/kvm/hypercalls.txt	2017-01-13 16:43:11.947240575 -0200
> > +++ kvm-ptpdriver/Documentation/virtual/kvm/hypercalls.txt	2017-01-13 16:43:25.038258187 -0200
> > @@ -81,3 +81,36 @@
> >  same guest can wakeup the sleeping vcpu by issuing KVM_HC_KICK_CPU hypercall,
> >  specifying APIC ID (a1) of the vcpu to be woken up. An additional argument (a0)
> >  is used in the hypercall for future use.
> > +
> > +
> > +6. KVM_HC_CLOCK_PAIRING
> > +------------------------
> > +Architecture: x86
> > +Status: active
> > +Purpose: Hypercall used to synchronize host and guest clocks.
> > +Usage:
> > +
> > +a0: guest physical address where host copies
> > +"struct kvm_clock_offset" structure.
> > +
> > +a1: clock_type, ATM only KVM_CLOCK_PAIRING_WALLCLOCK (0)
> > +is supported (hosts CLOCK_REALTIME clock).
> > +
> > +		struct kvm_clock_offset {
> 
> Name of the struct not adjusted.
> > +			__s64 sec;
> > +			__s64 nsec;
> > +			__u64 tsc;
> > +			__u32 flags;
> > +			__u32 pad[9];
> 
> I'd just leave one element of padding and put the struct on the stack, 
> since we have no reason to think we'll need a lot of padding.

Well you don't know: we could implement it in case of HPET clocksource
hosts.

> > +		};
> > +
> > +       Where:
> > +               * sec: seconds from clock_type clock.
> > +               * nsec: nanoseconds from clock_type clock.
> > +               * tsc: TSC value used to calculate sec/nsec pair
> > +                 (this hypercall only works when host uses TSC clocksource).
> 
> This is already mentioned below, and it's not clear that tsc is a guest 
> TSC value.

Ok fixed.

> > +               * flags: flags, unused (0) at the moment.
> > +
> > +Returns KVM_EOPNOTSUPP if the host does not use TSC clocksource,
> > +or if clock type is different than KVM_CLOCK_PAIRING_WALLCLOCK.
> > +
> > Index: kvm-ptpdriver/arch/x86/kvm/x86.c
> > ===================================================================
> > --- kvm-ptpdriver.orig/arch/x86/kvm/x86.c	2017-01-13 16:43:20.898252596 -0200
> > +++ kvm-ptpdriver/arch/x86/kvm/x86.c	2017-01-13 16:43:25.041258191 -0200
> > @@ -6119,6 +6119,44 @@
> >  }
> >  EXPORT_SYMBOL_GPL(kvm_emulate_halt);
> >  
> > +static int kvm_pv_clock_pairing(struct kvm_vcpu *vcpu, gpa_t paddr,
> > +			       unsigned long clock_type)
> > +{
> > +	struct kvm_clock_offset *clock_offset;
> > +	struct timespec ts;
> > +	cycle_t cycle;
> > +	int ret;
> > +
> > +	if (clock_type != KVM_CLOCK_PAIRING_WALLCLOCK)
> > +		return -KVM_EOPNOTSUPP;
> > +
> > +	if (pvclock_gtod_data.clock.vclock_mode != VCLOCK_TSC)
> > +		return -KVM_EOPNOTSUPP;
> 
> Already checked in kvm_get_walltime_and_clockread.

Fixed.

> > +	clock_offset = kzalloc(sizeof(struct kvm_clock_offset), GFP_KERNEL);
> > +	if (clock_offset == NULL)
> > +		return -KVM_ENOMEM;
> > +
> > +	if (kvm_get_walltime_and_clockread(&ts, &cycle) == false) {
> > +		kfree(clock_offset);
> > +		return -KVM_EOPNOTSUPP;
> > +	}
> > +	clock_offset->sec = ts.tv_sec;
> > +	clock_offset->nsec = ts.tv_nsec;
> > +	clock_offset->tsc = kvm_read_l1_tsc(vcpu, cycle);
> > +	clock_offset->flags = 0;
> > +
> > +	ret = 0;
> > +	if (kvm_write_guest(vcpu->kvm, paddr, clock_offset,
> > +			      sizeof(struct kvm_clock_offset)))
> > +		ret = -KVM_EFAULT;
> > +
> > +	kfree(clock_offset);
> > +
> > +	return ret;
> > +}
> > +
> >  /*
> >   * kvm_pv_kick_cpu_op:  Kick a vcpu.
> >   *
> > @@ -6183,6 +6221,9 @@
> >  		kvm_pv_kick_cpu_op(vcpu->kvm, a0, a1);
> >  		ret = 0;
> >  		break;
> > +	case KVM_HC_CLOCK_PAIRING:
> > +		ret = kvm_pv_clock_pairing(vcpu, a0, a1);
> > +		break;
> >  	default:
> >  		ret = -KVM_ENOSYS;
> >  		break;
> > Index: kvm-ptpdriver/include/uapi/linux/kvm_para.h
> > ===================================================================
> > --- kvm-ptpdriver.orig/include/uapi/linux/kvm_para.h	2017-01-13 16:43:11.947240575 -0200
> > +++ kvm-ptpdriver/include/uapi/linux/kvm_para.h	2017-01-13 16:43:25.042258192 -0200
> > @@ -14,6 +14,8 @@
> >  #define KVM_EFAULT		EFAULT
> >  #define KVM_E2BIG		E2BIG
> >  #define KVM_EPERM		EPERM
> > +#define KVM_EOPNOTSUPP		EOPNOTSUPP
> > +#define KVM_ENOMEM		ENOMEM
> 
> This file could potentially be used across multiple architectures (it's 
> not in arch/x86/include/uapi/asm), but EOPNOTSUPP is not the same on 
> all systems.  Better use "95" which is the value it has on x86.

Fixed.

> All this gives:
> 
> diff --git a/Documentation/virtual/kvm/hypercalls.txt b/Documentation/virtual/kvm/hypercalls.txt
> index 55eb3e1ca494..cd35de84999f 100644
> --- a/Documentation/virtual/kvm/hypercalls.txt
> +++ b/Documentation/virtual/kvm/hypercalls.txt
> @@ -94,9 +94,9 @@ a0: guest physical address where host copies
>  "struct kvm_clock_offset" structure.
>  
>  a1: clock_type, ATM only KVM_CLOCK_PAIRING_WALLCLOCK (0)
> -is supported (hosts CLOCK_REALTIME clock).
> +is supported (corresponding to the host's CLOCK_REALTIME clock).
>  
> -		struct kvm_clock_offset {
> +		struct kvm_clock_pairing {
>  			__s64 sec;
>  			__s64 nsec;
>  			__u64 tsc;
> @@ -107,10 +107,12 @@ is supported (hosts CLOCK_REALTIME clock).
>         Where:
>                 * sec: seconds from clock_type clock.
>                 * nsec: nanoseconds from clock_type clock.
> -               * tsc: TSC value used to calculate sec/nsec pair
> -                 (this hypercall only works when host uses TSC clocksource).
> +               * tsc: guest TSC value used to calculate sec/nsec pair
>                 * flags: flags, unused (0) at the moment.
>  
> +The hypercall lets a guest compute a precise timestamp across
> +host and guest.  The guest can use the returned TSC value to
> +compute the CLOCK_REALTIME for its clock, at the same instant.
> +
>  Returns KVM_EOPNOTSUPP if the host does not use TSC clocksource,
>  or if clock type is different than KVM_CLOCK_PAIRING_WALLCLOCK.
> -
> diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
> index ce7b411e28bc..3641d8b7dba9 100644
> --- a/arch/x86/include/uapi/asm/kvm_para.h
> +++ b/arch/x86/include/uapi/asm/kvm_para.h
> @@ -51,12 +51,12 @@ struct kvm_steal_time {
>  };
>  
>  #define KVM_CLOCK_PAIRING_WALLCLOCK 0
> -struct kvm_clock_offset {
> +struct kvm_clock_pairing {
>  	__s64 sec;
>  	__s64 nsec;
>  	__u64 tsc;
>  	__u32 flags;
> -	__u32 pad[9];
> +	__u32 pad[1];
>  };
>  
>  #define KVM_STEAL_ALIGNMENT_BITS 5
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 4ecc3145ea4b..09e5d31dac98 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -6149,9 +6149,9 @@ int kvm_emulate_halt(struct kvm_vcpu *vcpu)
>  EXPORT_SYMBOL_GPL(kvm_emulate_halt);
>  
>  static int kvm_pv_clock_pairing(struct kvm_vcpu *vcpu, gpa_t paddr,
> -			       unsigned long clock_type)
> +			        unsigned long clock_type)
>  {
> -	struct kvm_clock_offset *clock_offset;
> +	struct kvm_clock_pairing clock_pairing;
>  	struct timespec ts;
>  	cycle_t cycle;
>  	int ret;
> @@ -6159,30 +6159,19 @@ static int kvm_pv_clock_pairing(struct kvm_vcpu *vcpu, gpa_t paddr,
>  	if (clock_type != KVM_CLOCK_PAIRING_WALLCLOCK)
>  		return -KVM_EOPNOTSUPP;
>  
> -	if (pvclock_gtod_data.clock.vclock_mode != VCLOCK_TSC)
> -		return -KVM_EOPNOTSUPP;
> -
> -	clock_offset = kzalloc(sizeof(struct kvm_clock_offset), GFP_KERNEL);
> -	if (clock_offset == NULL)
> -		return -KVM_ENOMEM;
> -
> -	if (kvm_get_walltime_and_clockread(&ts, &cycle) == false) {
> -		kfree(clock_offset);
> +	if (kvm_get_walltime_and_clockread(&ts, &cycle) == false)
>  		return -KVM_EOPNOTSUPP;
> -	}
>  
> -	clock_offset->sec = ts.tv_sec;
> -	clock_offset->nsec = ts.tv_nsec;
> -	clock_offset->tsc = kvm_read_l1_tsc(vcpu, cycle);
> -	clock_offset->flags = 0;
> +	clock_pairing.sec = ts.tv_sec;
> +	clock_pairing.nsec = ts.tv_nsec;
> +	clock_pairing.tsc = kvm_read_l1_tsc(vcpu, cycle);
> +	clock_pairing.flags = 0;
>  
>  	ret = 0;
> -	if (kvm_write_guest(vcpu->kvm, paddr, clock_offset,
> -			      sizeof(struct kvm_clock_offset)))
> +	if (kvm_write_guest(vcpu->kvm, paddr, &clock_pairing,
> +			    sizeof(struct kvm_clock_pairing)))
>  		ret = -KVM_EFAULT;
>  
> -	kfree(clock_offset);
> -
>  	return ret;
>  }
>  
> diff --git a/include/uapi/linux/kvm_para.h b/include/uapi/linux/kvm_para.h
> index 68da60e7f519..c72128527377 100644
> --- a/include/uapi/linux/kvm_para.h
> +++ b/include/uapi/linux/kvm_para.h
> @@ -14,7 +14,7 @@
>  #define KVM_EFAULT		EFAULT
>  #define KVM_E2BIG		E2BIG
>  #define KVM_EPERM		EPERM
> -#define KVM_EOPNOTSUPP		EOPNOTSUPP
> +#define KVM_EOPNOTSUPP		95
>  #define KVM_ENOMEM		ENOMEM
>  
>  #define KVM_HC_VAPIC_POLL_IRQ		1
> 
> Okay?

Just not the padding please, the rest is fine.

> 
> Paolo
> 
> >  #define KVM_HC_VAPIC_POLL_IRQ		1
> >  #define KVM_HC_MMU_OP			2
> > @@ -23,6 +25,7 @@
> >  #define KVM_HC_MIPS_GET_CLOCK_FREQ	6
> >  #define KVM_HC_MIPS_EXIT_VM		7
> >  #define KVM_HC_MIPS_CONSOLE_OUTPUT	8
> > +#define KVM_HC_CLOCK_PAIRING		9
> >  
> >  /*
> >   * hypercalls use architecture specific
> > 
> > 
> > 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [patch 4/4] PTP: add kvm PTP driver
  2017-01-24 17:09 ` [patch 4/4] PTP: add kvm PTP driver Marcelo Tosatti
  2017-01-24 18:11   ` Richard Cochran
@ 2017-01-25 12:56   ` Paolo Bonzini
  2017-01-25 13:03     ` Marcelo Tosatti
  1 sibling, 1 reply; 14+ messages in thread
From: Paolo Bonzini @ 2017-01-25 12:56 UTC (permalink / raw)
  To: Marcelo Tosatti, kvm, linux-kernel
  Cc: Radim Krcmar, Richard Cochran, Miroslav Lichvar



On 24/01/2017 18:09, Marcelo Tosatti wrote:
> Add a driver with gettime method returning hosts realtime clock.
> This allows Chrony to synchronize host and guest clocks with 
> high precision (see results below).
> 
> chronyc> sources
> MS Name/IP address         Stratum Poll Reach LastRx Last sample
> ===============================================================================
> #* PHC0                          0   3   377     4   +162ns[ -683ns] +/-   11ns
> 
> To configure Chronyd to use PHC refclock, add the 
> following line to its configuration file:
> 
> refclock PHC /dev/ptpX poll 3 dpoll -2 offset 0
> 
> Where /dev/ptpX is the kvmclock PTP clock.
> 
> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
> 
> ---
>  drivers/ptp/Kconfig   |   12 ++
>  drivers/ptp/Makefile  |    1 
>  drivers/ptp/ptp_kvm.c |  202 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 215 insertions(+)
> 
> v2: check for kvmclock (Radim)
>     initialize global variables before device registration (Radim)
> v3: use cross timestamps callback (Paolo, Miroslav, Radim)
> v4: remove gettime64 callback (Paolo)
> v5: drop useless nsec assignment (Radim)
> 
> Index: kvm-ptpdriver/drivers/ptp/Kconfig
> ===================================================================
> --- kvm-ptpdriver.orig/drivers/ptp/Kconfig	2017-01-24 09:03:47.604586098 -0200
> +++ kvm-ptpdriver/drivers/ptp/Kconfig	2017-01-24 09:05:06.321704124 -0200
> @@ -90,4 +90,16 @@
>  	  To compile this driver as a module, choose M here: the module
>  	  will be called ptp_pch.
>  
> +config PTP_1588_CLOCK_KVM
> +	tristate "KVM virtual PTP clock"
> +	depends on PTP_1588_CLOCK
> +	depends on KVM_GUEST
> +	default y
> +	help
> +	  This driver adds support for using kvm infrastructure as a PTP
> +	  clock. This clock is only useful if you are using KVM guests.
> +
> +	  To compile this driver as a module, choose M here: the module
> +	  will be called ptp_kvm.
> +
>  endmenu
> Index: kvm-ptpdriver/drivers/ptp/Makefile
> ===================================================================
> --- kvm-ptpdriver.orig/drivers/ptp/Makefile	2017-01-24 09:03:47.604586098 -0200
> +++ kvm-ptpdriver/drivers/ptp/Makefile	2017-01-24 09:05:06.322704125 -0200
> @@ -6,3 +6,4 @@
>  obj-$(CONFIG_PTP_1588_CLOCK)		+= ptp.o
>  obj-$(CONFIG_PTP_1588_CLOCK_IXP46X)	+= ptp_ixp46x.o
>  obj-$(CONFIG_PTP_1588_CLOCK_PCH)	+= ptp_pch.o
> +obj-$(CONFIG_PTP_1588_CLOCK_KVM)	+= ptp_kvm.o
> Index: kvm-ptpdriver/drivers/ptp/ptp_kvm.c
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ kvm-ptpdriver/drivers/ptp/ptp_kvm.c	2017-01-24 09:05:06.322704125 -0200
> @@ -0,0 +1,202 @@
> +/*
> + * Virtual PTP 1588 clock for use with KVM guests
> + *
> + * Copyright (C) 2017 Red Hat Inc.
> + *
> + *  This program is free software; you can redistribute it and/or modify
> + *  it under the terms of the GNU General Public License as published by
> + *  the Free Software Foundation; either version 2 of the License, or
> + *  (at your option) any later version.
> + *
> + *  This program is distributed in the hope that it will be useful,
> + *  but WITHOUT ANY WARRANTY; without even the implied warranty of
> + *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + *  GNU General Public License for more details.
> + *
> + */
> +#include <linux/device.h>
> +#include <linux/err.h>
> +#include <linux/init.h>
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <uapi/linux/kvm_para.h>
> +#include <asm/kvm_para.h>
> +#include <asm/pvclock.h>
> +#include <asm/kvmclock.h>
> +#include <uapi/asm/kvm_para.h>
> +
> +#include <linux/ptp_clock_kernel.h>
> +
> +struct kvm_ptp_clock {
> +	struct ptp_clock *ptp_clock;
> +	struct ptp_clock_info caps;
> +};
> +
> +DEFINE_SPINLOCK(kvm_ptp_lock);
> +
> +static struct pvclock_vsyscall_time_info *hv_clock;
> +
> +static struct kvm_clock_offset clock_off;
> +static phys_addr_t clock_off_gpa;
> +
> +static int ptp_kvm_get_time_fn(ktime_t *device_time,
> +			       struct system_counterval_t *system_counter,
> +			       void *ctx)
> +{
> +	unsigned long ret;
> +	struct timespec64 tspec;
> +	unsigned version;
> +	u8 flags;
> +	int cpu;
> +	struct pvclock_vcpu_time_info *src;
> +
> +	preempt_disable_notrace();
> +	cpu = smp_processor_id();
> +	src = &hv_clock[cpu].pvti;
> +
> +	spin_lock(&kvm_ptp_lock);

Any reason to take the spinlock in the non-preemptable region?
This would be worse for RT kernels than taking it outside.

> +	do {
> +		/*
> +		 * We are using a TSC value read in the hosts
> +		 * kvm_hc_clock_pairing handling.
> +		 * So any changes to tsc_to_system_mul
> +		 * and tsc_shift or any other pvclock
> +		 * data invalidate that measurement.
> +		 */
> +		version = pvclock_read_begin(src);
> +
> +		ret = kvm_hypercall2(KVM_HC_CLOCK_PAIRING,
> +				     clock_off_gpa,
> +				     KVM_CLOCK_PAIRING_WALLCLOCK);
> +		if (ret != 0) {
> +			pr_err("clock offset hypercall ret %lu\n", ret);

Need ratelimiting in case something goes wrong and the host stops using
the TSC clocksource?

> +			spin_unlock(&kvm_ptp_lock);
> +			preempt_enable_notrace();
> +			return -EOPNOTSUPP;
> +		}
> +
> +		tspec.tv_sec = clock_off.sec;
> +		tspec.tv_nsec = clock_off.nsec;
> +		ret = __pvclock_read_cycles(src, clock_off.tsc);
> +		flags = src->flags;

Flags written and not read.

Also, I think I disagree with Radim, gettime64 should use the TSC
correction.  It's the right thing to do for the POSIX clock interface.
It's a bit arbitrary for PTP_SYS_OFFSET (though you cannot really argue
with better precision), but chrony is now using PTP_SYS_OFFSET_PRECISE
(and again you cannot really argue with better precision anyway).

I'll test this a bit and then push it to kvm/queue.  Please take a
look!

Paolo

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [patch 2/4] KVM: x86: add KVM_HC_CLOCK_OFFSET hypercall
  2017-01-25 12:55     ` Marcelo Tosatti
@ 2017-01-25 12:59       ` Paolo Bonzini
  2017-01-25 13:06         ` Marcelo Tosatti
  0 siblings, 1 reply; 14+ messages in thread
From: Paolo Bonzini @ 2017-01-25 12:59 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: kvm, linux-kernel, Radim Krcmar, Richard Cochran, Miroslav Lichvar



On 25/01/2017 13:55, Marcelo Tosatti wrote:
> On Wed, Jan 25, 2017 at 01:30:49PM +0100, Paolo Bonzini wrote:
>>
>>
>> On 24/01/2017 18:09, Marcelo Tosatti wrote:
>>> Add a hypercall to retrieve the host realtime clock
>>> and the TSC value used to calculate that clock read.
>>>
>>> Used to implement clock synchronization between
>>> host and guest.
>>>
>>> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
>>>
>>> ---
>>>  Documentation/virtual/kvm/hypercalls.txt |   33 ++++++++++++++++++++++++
>>>  arch/x86/include/uapi/asm/kvm_para.h     |    9 ++++++
>>>  arch/x86/kvm/x86.c                       |   41 +++++++++++++++++++++++++++++++
>>>  include/uapi/linux/kvm_para.h            |    3 ++
>>>  4 files changed, 86 insertions(+)
>>>
>>> v2: improve documentation (Radim)
>>>     change hypercall name to KVM_HC_CLOCK_PAIRING (Radim)
>>>     increase padding size
>>>
>>> Index: kvm-ptpdriver/arch/x86/include/uapi/asm/kvm_para.h
>>> ===================================================================
>>> --- kvm-ptpdriver.orig/arch/x86/include/uapi/asm/kvm_para.h	2017-01-13 16:43:11.947240575 -0200
>>> +++ kvm-ptpdriver/arch/x86/include/uapi/asm/kvm_para.h	2017-01-13 16:43:25.038258187 -0200
>>> @@ -50,6 +50,15 @@
>>>  	__u32 pad[11];
>>>  };
>>>  
>>> +#define KVM_CLOCK_PAIRING_WALLCLOCK 0
>>> +struct kvm_clock_offset {
>>
>> The struct still has the old name.
>>
>>> +	__s64 sec;
>>> +	__s64 nsec;
>>> +	__u64 tsc;
>>> +	__u32 flags;
>>> +	__u32 pad[9];
>>> +};
> 
> That was on purpose: "to pair" is the operation. 
> 
> "struct kvm_clock_pairing" makes no sense to me 
> (because pairing is an "action").
> 
> (if you'd prefer that name, let me know).

A "pairing" is a union of two things, in this case a sec/nsec value and
a TSC value.  (I had to look it up in the dictionary though :)).

>>>  #define KVM_STEAL_ALIGNMENT_BITS 5
>>>  #define KVM_STEAL_VALID_BITS ((-1ULL << (KVM_STEAL_ALIGNMENT_BITS + 1)))
>>>  #define KVM_STEAL_RESERVED_MASK (((1 << KVM_STEAL_ALIGNMENT_BITS) - 1 ) << 1)
>>> Index: kvm-ptpdriver/Documentation/virtual/kvm/hypercalls.txt
>>> ===================================================================
>>> --- kvm-ptpdriver.orig/Documentation/virtual/kvm/hypercalls.txt	2017-01-13 16:43:11.947240575 -0200
>>> +++ kvm-ptpdriver/Documentation/virtual/kvm/hypercalls.txt	2017-01-13 16:43:25.038258187 -0200
>>> @@ -81,3 +81,36 @@
>>>  same guest can wakeup the sleeping vcpu by issuing KVM_HC_KICK_CPU hypercall,
>>>  specifying APIC ID (a1) of the vcpu to be woken up. An additional argument (a0)
>>>  is used in the hypercall for future use.
>>> +
>>> +
>>> +6. KVM_HC_CLOCK_PAIRING
>>> +------------------------
>>> +Architecture: x86
>>> +Status: active
>>> +Purpose: Hypercall used to synchronize host and guest clocks.
>>> +Usage:
>>> +
>>> +a0: guest physical address where host copies
>>> +"struct kvm_clock_offset" structure.
>>> +
>>> +a1: clock_type, ATM only KVM_CLOCK_PAIRING_WALLCLOCK (0)
>>> +is supported (hosts CLOCK_REALTIME clock).
>>> +
>>> +		struct kvm_clock_offset {
>>
>> Name of the struct not adjusted.
>>> +			__s64 sec;
>>> +			__s64 nsec;
>>> +			__u64 tsc;
>>> +			__u32 flags;
>>> +			__u32 pad[9];
>>
>> I'd just leave one element of padding and put the struct on the stack, 
>> since we have no reason to think we'll need a lot of padding.
> 
> Well you don't know: we could implement it in case of HPET clocksource
> hosts.

But then the guest wouldn't have a TSC value to use, would it?

Also, how would it use more padding?

Paolo

>>> +		};
>>> +
>>> +       Where:
>>> +               * sec: seconds from clock_type clock.
>>> +               * nsec: nanoseconds from clock_type clock.
>>> +               * tsc: TSC value used to calculate sec/nsec pair
>>> +                 (this hypercall only works when host uses TSC clocksource).
>>
>> This is already mentioned below, and it's not clear that tsc is a guest 
>> TSC value.
> 
> Ok fixed.
> 
>>> +               * flags: flags, unused (0) at the moment.
>>> +
>>> +Returns KVM_EOPNOTSUPP if the host does not use TSC clocksource,
>>> +or if clock type is different than KVM_CLOCK_PAIRING_WALLCLOCK.
>>> +
>>> Index: kvm-ptpdriver/arch/x86/kvm/x86.c
>>> ===================================================================
>>> --- kvm-ptpdriver.orig/arch/x86/kvm/x86.c	2017-01-13 16:43:20.898252596 -0200
>>> +++ kvm-ptpdriver/arch/x86/kvm/x86.c	2017-01-13 16:43:25.041258191 -0200
>>> @@ -6119,6 +6119,44 @@
>>>  }
>>>  EXPORT_SYMBOL_GPL(kvm_emulate_halt);
>>>  
>>> +static int kvm_pv_clock_pairing(struct kvm_vcpu *vcpu, gpa_t paddr,
>>> +			       unsigned long clock_type)
>>> +{
>>> +	struct kvm_clock_offset *clock_offset;
>>> +	struct timespec ts;
>>> +	cycle_t cycle;
>>> +	int ret;
>>> +
>>> +	if (clock_type != KVM_CLOCK_PAIRING_WALLCLOCK)
>>> +		return -KVM_EOPNOTSUPP;
>>> +
>>> +	if (pvclock_gtod_data.clock.vclock_mode != VCLOCK_TSC)
>>> +		return -KVM_EOPNOTSUPP;
>>
>> Already checked in kvm_get_walltime_and_clockread.
> 
> Fixed.
> 
>>> +	clock_offset = kzalloc(sizeof(struct kvm_clock_offset), GFP_KERNEL);
>>> +	if (clock_offset == NULL)
>>> +		return -KVM_ENOMEM;
>>> +
>>> +	if (kvm_get_walltime_and_clockread(&ts, &cycle) == false) {
>>> +		kfree(clock_offset);
>>> +		return -KVM_EOPNOTSUPP;
>>> +	}
>>> +	clock_offset->sec = ts.tv_sec;
>>> +	clock_offset->nsec = ts.tv_nsec;
>>> +	clock_offset->tsc = kvm_read_l1_tsc(vcpu, cycle);
>>> +	clock_offset->flags = 0;
>>> +
>>> +	ret = 0;
>>> +	if (kvm_write_guest(vcpu->kvm, paddr, clock_offset,
>>> +			      sizeof(struct kvm_clock_offset)))
>>> +		ret = -KVM_EFAULT;
>>> +
>>> +	kfree(clock_offset);
>>> +
>>> +	return ret;
>>> +}
>>> +
>>>  /*
>>>   * kvm_pv_kick_cpu_op:  Kick a vcpu.
>>>   *
>>> @@ -6183,6 +6221,9 @@
>>>  		kvm_pv_kick_cpu_op(vcpu->kvm, a0, a1);
>>>  		ret = 0;
>>>  		break;
>>> +	case KVM_HC_CLOCK_PAIRING:
>>> +		ret = kvm_pv_clock_pairing(vcpu, a0, a1);
>>> +		break;
>>>  	default:
>>>  		ret = -KVM_ENOSYS;
>>>  		break;
>>> Index: kvm-ptpdriver/include/uapi/linux/kvm_para.h
>>> ===================================================================
>>> --- kvm-ptpdriver.orig/include/uapi/linux/kvm_para.h	2017-01-13 16:43:11.947240575 -0200
>>> +++ kvm-ptpdriver/include/uapi/linux/kvm_para.h	2017-01-13 16:43:25.042258192 -0200
>>> @@ -14,6 +14,8 @@
>>>  #define KVM_EFAULT		EFAULT
>>>  #define KVM_E2BIG		E2BIG
>>>  #define KVM_EPERM		EPERM
>>> +#define KVM_EOPNOTSUPP		EOPNOTSUPP
>>> +#define KVM_ENOMEM		ENOMEM
>>
>> This file could potentially be used across multiple architectures (it's 
>> not in arch/x86/include/uapi/asm), but EOPNOTSUPP is not the same on 
>> all systems.  Better use "95" which is the value it has on x86.
> 
> Fixed.
> 
>> All this gives:
>>
>> diff --git a/Documentation/virtual/kvm/hypercalls.txt b/Documentation/virtual/kvm/hypercalls.txt
>> index 55eb3e1ca494..cd35de84999f 100644
>> --- a/Documentation/virtual/kvm/hypercalls.txt
>> +++ b/Documentation/virtual/kvm/hypercalls.txt
>> @@ -94,9 +94,9 @@ a0: guest physical address where host copies
>>  "struct kvm_clock_offset" structure.
>>  
>>  a1: clock_type, ATM only KVM_CLOCK_PAIRING_WALLCLOCK (0)
>> -is supported (hosts CLOCK_REALTIME clock).
>> +is supported (corresponding to the host's CLOCK_REALTIME clock).
>>  
>> -		struct kvm_clock_offset {
>> +		struct kvm_clock_pairing {
>>  			__s64 sec;
>>  			__s64 nsec;
>>  			__u64 tsc;
>> @@ -107,10 +107,12 @@ is supported (hosts CLOCK_REALTIME clock).
>>         Where:
>>                 * sec: seconds from clock_type clock.
>>                 * nsec: nanoseconds from clock_type clock.
>> -               * tsc: TSC value used to calculate sec/nsec pair
>> -                 (this hypercall only works when host uses TSC clocksource).
>> +               * tsc: guest TSC value used to calculate sec/nsec pair
>>                 * flags: flags, unused (0) at the moment.
>>  
>> +The hypercall lets a guest compute a precise timestamp across
>> +host and guest.  The guest can use the returned TSC value to
>> +compute the CLOCK_REALTIME for its clock, at the same instant.
>> +
>>  Returns KVM_EOPNOTSUPP if the host does not use TSC clocksource,
>>  or if clock type is different than KVM_CLOCK_PAIRING_WALLCLOCK.
>> -
>> diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
>> index ce7b411e28bc..3641d8b7dba9 100644
>> --- a/arch/x86/include/uapi/asm/kvm_para.h
>> +++ b/arch/x86/include/uapi/asm/kvm_para.h
>> @@ -51,12 +51,12 @@ struct kvm_steal_time {
>>  };
>>  
>>  #define KVM_CLOCK_PAIRING_WALLCLOCK 0
>> -struct kvm_clock_offset {
>> +struct kvm_clock_pairing {
>>  	__s64 sec;
>>  	__s64 nsec;
>>  	__u64 tsc;
>>  	__u32 flags;
>> -	__u32 pad[9];
>> +	__u32 pad[1];
>>  };
>>  
>>  #define KVM_STEAL_ALIGNMENT_BITS 5
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 4ecc3145ea4b..09e5d31dac98 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -6149,9 +6149,9 @@ int kvm_emulate_halt(struct kvm_vcpu *vcpu)
>>  EXPORT_SYMBOL_GPL(kvm_emulate_halt);
>>  
>>  static int kvm_pv_clock_pairing(struct kvm_vcpu *vcpu, gpa_t paddr,
>> -			       unsigned long clock_type)
>> +			        unsigned long clock_type)
>>  {
>> -	struct kvm_clock_offset *clock_offset;
>> +	struct kvm_clock_pairing clock_pairing;
>>  	struct timespec ts;
>>  	cycle_t cycle;
>>  	int ret;
>> @@ -6159,30 +6159,19 @@ static int kvm_pv_clock_pairing(struct kvm_vcpu *vcpu, gpa_t paddr,
>>  	if (clock_type != KVM_CLOCK_PAIRING_WALLCLOCK)
>>  		return -KVM_EOPNOTSUPP;
>>  
>> -	if (pvclock_gtod_data.clock.vclock_mode != VCLOCK_TSC)
>> -		return -KVM_EOPNOTSUPP;
>> -
>> -	clock_offset = kzalloc(sizeof(struct kvm_clock_offset), GFP_KERNEL);
>> -	if (clock_offset == NULL)
>> -		return -KVM_ENOMEM;
>> -
>> -	if (kvm_get_walltime_and_clockread(&ts, &cycle) == false) {
>> -		kfree(clock_offset);
>> +	if (kvm_get_walltime_and_clockread(&ts, &cycle) == false)
>>  		return -KVM_EOPNOTSUPP;
>> -	}
>>  
>> -	clock_offset->sec = ts.tv_sec;
>> -	clock_offset->nsec = ts.tv_nsec;
>> -	clock_offset->tsc = kvm_read_l1_tsc(vcpu, cycle);
>> -	clock_offset->flags = 0;
>> +	clock_pairing.sec = ts.tv_sec;
>> +	clock_pairing.nsec = ts.tv_nsec;
>> +	clock_pairing.tsc = kvm_read_l1_tsc(vcpu, cycle);
>> +	clock_pairing.flags = 0;
>>  
>>  	ret = 0;
>> -	if (kvm_write_guest(vcpu->kvm, paddr, clock_offset,
>> -			      sizeof(struct kvm_clock_offset)))
>> +	if (kvm_write_guest(vcpu->kvm, paddr, &clock_pairing,
>> +			    sizeof(struct kvm_clock_pairing)))
>>  		ret = -KVM_EFAULT;
>>  
>> -	kfree(clock_offset);
>> -
>>  	return ret;
>>  }
>>  
>> diff --git a/include/uapi/linux/kvm_para.h b/include/uapi/linux/kvm_para.h
>> index 68da60e7f519..c72128527377 100644
>> --- a/include/uapi/linux/kvm_para.h
>> +++ b/include/uapi/linux/kvm_para.h
>> @@ -14,7 +14,7 @@
>>  #define KVM_EFAULT		EFAULT
>>  #define KVM_E2BIG		E2BIG
>>  #define KVM_EPERM		EPERM
>> -#define KVM_EOPNOTSUPP		EOPNOTSUPP
>> +#define KVM_EOPNOTSUPP		95
>>  #define KVM_ENOMEM		ENOMEM
>>  
>>  #define KVM_HC_VAPIC_POLL_IRQ		1
>>
>> Okay?
> 
> Just not the padding please, the rest is fine.
> 
>>
>> Paolo
>>
>>>  #define KVM_HC_VAPIC_POLL_IRQ		1
>>>  #define KVM_HC_MMU_OP			2
>>> @@ -23,6 +25,7 @@
>>>  #define KVM_HC_MIPS_GET_CLOCK_FREQ	6
>>>  #define KVM_HC_MIPS_EXIT_VM		7
>>>  #define KVM_HC_MIPS_CONSOLE_OUTPUT	8
>>> +#define KVM_HC_CLOCK_PAIRING		9
>>>  
>>>  /*
>>>   * hypercalls use architecture specific
>>>
>>>
>>>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [patch 4/4] PTP: add kvm PTP driver
  2017-01-25 12:56   ` Paolo Bonzini
@ 2017-01-25 13:03     ` Marcelo Tosatti
  0 siblings, 0 replies; 14+ messages in thread
From: Marcelo Tosatti @ 2017-01-25 13:03 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: kvm, linux-kernel, Radim Krcmar, Richard Cochran, Miroslav Lichvar

On Wed, Jan 25, 2017 at 01:56:45PM +0100, Paolo Bonzini wrote:
> 
> 
> On 24/01/2017 18:09, Marcelo Tosatti wrote:
> > Add a driver with gettime method returning hosts realtime clock.
> > This allows Chrony to synchronize host and guest clocks with 
> > high precision (see results below).
> > 
> > chronyc> sources
> > MS Name/IP address         Stratum Poll Reach LastRx Last sample
> > ===============================================================================
> > #* PHC0                          0   3   377     4   +162ns[ -683ns] +/-   11ns
> > 
> > To configure Chronyd to use PHC refclock, add the 
> > following line to its configuration file:
> > 
> > refclock PHC /dev/ptpX poll 3 dpoll -2 offset 0
> > 
> > Where /dev/ptpX is the kvmclock PTP clock.
> > 
> > Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
> > 
> > ---
> >  drivers/ptp/Kconfig   |   12 ++
> >  drivers/ptp/Makefile  |    1 
> >  drivers/ptp/ptp_kvm.c |  202 ++++++++++++++++++++++++++++++++++++++++++++++++++
> >  3 files changed, 215 insertions(+)
> > 
> > v2: check for kvmclock (Radim)
> >     initialize global variables before device registration (Radim)
> > v3: use cross timestamps callback (Paolo, Miroslav, Radim)
> > v4: remove gettime64 callback (Paolo)
> > v5: drop useless nsec assignment (Radim)
> > 
> > Index: kvm-ptpdriver/drivers/ptp/Kconfig
> > ===================================================================
> > --- kvm-ptpdriver.orig/drivers/ptp/Kconfig	2017-01-24 09:03:47.604586098 -0200
> > +++ kvm-ptpdriver/drivers/ptp/Kconfig	2017-01-24 09:05:06.321704124 -0200
> > @@ -90,4 +90,16 @@
> >  	  To compile this driver as a module, choose M here: the module
> >  	  will be called ptp_pch.
> >  
> > +config PTP_1588_CLOCK_KVM
> > +	tristate "KVM virtual PTP clock"
> > +	depends on PTP_1588_CLOCK
> > +	depends on KVM_GUEST
> > +	default y
> > +	help
> > +	  This driver adds support for using kvm infrastructure as a PTP
> > +	  clock. This clock is only useful if you are using KVM guests.
> > +
> > +	  To compile this driver as a module, choose M here: the module
> > +	  will be called ptp_kvm.
> > +
> >  endmenu
> > Index: kvm-ptpdriver/drivers/ptp/Makefile
> > ===================================================================
> > --- kvm-ptpdriver.orig/drivers/ptp/Makefile	2017-01-24 09:03:47.604586098 -0200
> > +++ kvm-ptpdriver/drivers/ptp/Makefile	2017-01-24 09:05:06.322704125 -0200
> > @@ -6,3 +6,4 @@
> >  obj-$(CONFIG_PTP_1588_CLOCK)		+= ptp.o
> >  obj-$(CONFIG_PTP_1588_CLOCK_IXP46X)	+= ptp_ixp46x.o
> >  obj-$(CONFIG_PTP_1588_CLOCK_PCH)	+= ptp_pch.o
> > +obj-$(CONFIG_PTP_1588_CLOCK_KVM)	+= ptp_kvm.o
> > Index: kvm-ptpdriver/drivers/ptp/ptp_kvm.c
> > ===================================================================
> > --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> > +++ kvm-ptpdriver/drivers/ptp/ptp_kvm.c	2017-01-24 09:05:06.322704125 -0200
> > @@ -0,0 +1,202 @@
> > +/*
> > + * Virtual PTP 1588 clock for use with KVM guests
> > + *
> > + * Copyright (C) 2017 Red Hat Inc.
> > + *
> > + *  This program is free software; you can redistribute it and/or modify
> > + *  it under the terms of the GNU General Public License as published by
> > + *  the Free Software Foundation; either version 2 of the License, or
> > + *  (at your option) any later version.
> > + *
> > + *  This program is distributed in the hope that it will be useful,
> > + *  but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > + *  GNU General Public License for more details.
> > + *
> > + */
> > +#include <linux/device.h>
> > +#include <linux/err.h>
> > +#include <linux/init.h>
> > +#include <linux/kernel.h>
> > +#include <linux/module.h>
> > +#include <uapi/linux/kvm_para.h>
> > +#include <asm/kvm_para.h>
> > +#include <asm/pvclock.h>
> > +#include <asm/kvmclock.h>
> > +#include <uapi/asm/kvm_para.h>
> > +
> > +#include <linux/ptp_clock_kernel.h>
> > +
> > +struct kvm_ptp_clock {
> > +	struct ptp_clock *ptp_clock;
> > +	struct ptp_clock_info caps;
> > +};
> > +
> > +DEFINE_SPINLOCK(kvm_ptp_lock);
> > +
> > +static struct pvclock_vsyscall_time_info *hv_clock;
> > +
> > +static struct kvm_clock_offset clock_off;
> > +static phys_addr_t clock_off_gpa;
> > +
> > +static int ptp_kvm_get_time_fn(ktime_t *device_time,
> > +			       struct system_counterval_t *system_counter,
> > +			       void *ctx)
> > +{
> > +	unsigned long ret;
> > +	struct timespec64 tspec;
> > +	unsigned version;
> > +	u8 flags;
> > +	int cpu;
> > +	struct pvclock_vcpu_time_info *src;
> > +
> > +	preempt_disable_notrace();
> > +	cpu = smp_processor_id();
> > +	src = &hv_clock[cpu].pvti;
> > +
> > +	spin_lock(&kvm_ptp_lock);
> 
> Any reason to take the spinlock in the non-preemptable region?
> This would be worse for RT kernels than taking it outside.

Not really, can move it before.

> 
> > +	do {
> > +		/*
> > +		 * We are using a TSC value read in the hosts
> > +		 * kvm_hc_clock_pairing handling.
> > +		 * So any changes to tsc_to_system_mul
> > +		 * and tsc_shift or any other pvclock
> > +		 * data invalidate that measurement.
> > +		 */
> > +		version = pvclock_read_begin(src);
> > +
> > +		ret = kvm_hypercall2(KVM_HC_CLOCK_PAIRING,
> > +				     clock_off_gpa,
> > +				     KVM_CLOCK_PAIRING_WALLCLOCK);
> > +		if (ret != 0) {
> > +			pr_err("clock offset hypercall ret %lu\n", ret);
> 
> Need ratelimiting in case something goes wrong and the host stops using
> the TSC clocksource?

Yes wise.


> > +			spin_unlock(&kvm_ptp_lock);
> > +			preempt_enable_notrace();
> > +			return -EOPNOTSUPP;
> > +		}
> > +
> > +		tspec.tv_sec = clock_off.sec;
> > +		tspec.tv_nsec = clock_off.nsec;
> > +		ret = __pvclock_read_cycles(src, clock_off.tsc);
> > +		flags = src->flags;
> 
> Flags written and not read.
> 
> Also, I think I disagree with Radim, gettime64 should use the TSC
> correction.  It's the right thing to do for the POSIX clock interface.
> It's a bit arbitrary for PTP_SYS_OFFSET (though you cannot really argue
> with better precision), but chrony is now using PTP_SYS_OFFSET_PRECISE
> (and again you cannot really argue with better precision anyway).
> 
> I'll test this a bit and then push it to kvm/queue.  Please take a
> look!
> 
> Paolo

Ok cool thanks Paolo, will review and ping if necessary.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [patch 2/4] KVM: x86: add KVM_HC_CLOCK_OFFSET hypercall
  2017-01-25 12:59       ` Paolo Bonzini
@ 2017-01-25 13:06         ` Marcelo Tosatti
  2017-01-25 13:15           ` Paolo Bonzini
  0 siblings, 1 reply; 14+ messages in thread
From: Marcelo Tosatti @ 2017-01-25 13:06 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: kvm, linux-kernel, Radim Krcmar, Richard Cochran, Miroslav Lichvar

On Wed, Jan 25, 2017 at 01:59:34PM +0100, Paolo Bonzini wrote:
> 
> 
> On 25/01/2017 13:55, Marcelo Tosatti wrote:
> > On Wed, Jan 25, 2017 at 01:30:49PM +0100, Paolo Bonzini wrote:
> >>
> >>
> >> On 24/01/2017 18:09, Marcelo Tosatti wrote:
> >>> Add a hypercall to retrieve the host realtime clock
> >>> and the TSC value used to calculate that clock read.
> >>>
> >>> Used to implement clock synchronization between
> >>> host and guest.
> >>>
> >>> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
> >>>
> >>> ---
> >>>  Documentation/virtual/kvm/hypercalls.txt |   33 ++++++++++++++++++++++++
> >>>  arch/x86/include/uapi/asm/kvm_para.h     |    9 ++++++
> >>>  arch/x86/kvm/x86.c                       |   41 +++++++++++++++++++++++++++++++
> >>>  include/uapi/linux/kvm_para.h            |    3 ++
> >>>  4 files changed, 86 insertions(+)
> >>>
> >>> v2: improve documentation (Radim)
> >>>     change hypercall name to KVM_HC_CLOCK_PAIRING (Radim)
> >>>     increase padding size
> >>>
> >>> Index: kvm-ptpdriver/arch/x86/include/uapi/asm/kvm_para.h
> >>> ===================================================================
> >>> --- kvm-ptpdriver.orig/arch/x86/include/uapi/asm/kvm_para.h	2017-01-13 16:43:11.947240575 -0200
> >>> +++ kvm-ptpdriver/arch/x86/include/uapi/asm/kvm_para.h	2017-01-13 16:43:25.038258187 -0200
> >>> @@ -50,6 +50,15 @@
> >>>  	__u32 pad[11];
> >>>  };
> >>>  
> >>> +#define KVM_CLOCK_PAIRING_WALLCLOCK 0
> >>> +struct kvm_clock_offset {
> >>
> >> The struct still has the old name.
> >>
> >>> +	__s64 sec;
> >>> +	__s64 nsec;
> >>> +	__u64 tsc;
> >>> +	__u32 flags;
> >>> +	__u32 pad[9];
> >>> +};
> > 
> > That was on purpose: "to pair" is the operation. 
> > 
> > "struct kvm_clock_pairing" makes no sense to me 
> > (because pairing is an "action").
> > 
> > (if you'd prefer that name, let me know).
> 
> A "pairing" is a union of two things, in this case a sec/nsec value and
> a TSC value.  (I had to look it up in the dictionary though :)).
> 
> >>>  #define KVM_STEAL_ALIGNMENT_BITS 5
> >>>  #define KVM_STEAL_VALID_BITS ((-1ULL << (KVM_STEAL_ALIGNMENT_BITS + 1)))
> >>>  #define KVM_STEAL_RESERVED_MASK (((1 << KVM_STEAL_ALIGNMENT_BITS) - 1 ) << 1)
> >>> Index: kvm-ptpdriver/Documentation/virtual/kvm/hypercalls.txt
> >>> ===================================================================
> >>> --- kvm-ptpdriver.orig/Documentation/virtual/kvm/hypercalls.txt	2017-01-13 16:43:11.947240575 -0200
> >>> +++ kvm-ptpdriver/Documentation/virtual/kvm/hypercalls.txt	2017-01-13 16:43:25.038258187 -0200
> >>> @@ -81,3 +81,36 @@
> >>>  same guest can wakeup the sleeping vcpu by issuing KVM_HC_KICK_CPU hypercall,
> >>>  specifying APIC ID (a1) of the vcpu to be woken up. An additional argument (a0)
> >>>  is used in the hypercall for future use.
> >>> +
> >>> +
> >>> +6. KVM_HC_CLOCK_PAIRING
> >>> +------------------------
> >>> +Architecture: x86
> >>> +Status: active
> >>> +Purpose: Hypercall used to synchronize host and guest clocks.
> >>> +Usage:
> >>> +
> >>> +a0: guest physical address where host copies
> >>> +"struct kvm_clock_offset" structure.
> >>> +
> >>> +a1: clock_type, ATM only KVM_CLOCK_PAIRING_WALLCLOCK (0)
> >>> +is supported (hosts CLOCK_REALTIME clock).
> >>> +
> >>> +		struct kvm_clock_offset {
> >>
> >> Name of the struct not adjusted.
> >>> +			__s64 sec;
> >>> +			__s64 nsec;
> >>> +			__u64 tsc;
> >>> +			__u32 flags;
> >>> +			__u32 pad[9];
> >>
> >> I'd just leave one element of padding and put the struct on the stack, 
> >> since we have no reason to think we'll need a lot of padding.
> > 
> > Well you don't know: we could implement it in case of HPET clocksource
> > hosts.
> 
> But then the guest wouldn't have a TSC value to use, would it?
> 
> Also, how would it use more padding?

I don't know, maybe pass the TSC and host HPET read in the structure? 
(after exposing the HPET to the guest).

Or maybe you want to sync CLOCK_MONOTONIC and that requires other data.

Not sure.

Just some space if new features are necessary.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [patch 2/4] KVM: x86: add KVM_HC_CLOCK_OFFSET hypercall
  2017-01-25 13:06         ` Marcelo Tosatti
@ 2017-01-25 13:15           ` Paolo Bonzini
  0 siblings, 0 replies; 14+ messages in thread
From: Paolo Bonzini @ 2017-01-25 13:15 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: kvm, linux-kernel, Radim Krcmar, Richard Cochran, Miroslav Lichvar



On 25/01/2017 14:06, Marcelo Tosatti wrote:
>>> Well you don't know: we could implement it in case of HPET clocksource
>>> hosts.
>> But then the guest wouldn't have a TSC value to use, would it?
>>
>> Also, how would it use more padding?
> I don't know, maybe pass the TSC and host HPET read in the structure? 
> (after exposing the HPET to the guest).
> 
> Or maybe you want to sync CLOCK_MONOTONIC and that requires other data.
> 
> Not sure.
> 
> Just some space if new features are necessary.

Fair enough.

Paolo

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2017-01-25 13:15 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-24 17:09 [patch 0/4] KVM virtual PTP driver (v5) Marcelo Tosatti
2017-01-24 17:09 ` [patch 1/4] KVM: x86: provide realtime host clock via vsyscall notifiers Marcelo Tosatti
2017-01-25 12:30   ` Paolo Bonzini
2017-01-24 17:09 ` [patch 2/4] KVM: x86: add KVM_HC_CLOCK_OFFSET hypercall Marcelo Tosatti
2017-01-25 12:30   ` Paolo Bonzini
2017-01-25 12:55     ` Marcelo Tosatti
2017-01-25 12:59       ` Paolo Bonzini
2017-01-25 13:06         ` Marcelo Tosatti
2017-01-25 13:15           ` Paolo Bonzini
2017-01-24 17:09 ` [patch 3/4] kvmclock: export kvmclock clocksource pointer Marcelo Tosatti
2017-01-24 17:09 ` [patch 4/4] PTP: add kvm PTP driver Marcelo Tosatti
2017-01-24 18:11   ` Richard Cochran
2017-01-25 12:56   ` Paolo Bonzini
2017-01-25 13:03     ` Marcelo Tosatti

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).