* [PATCH v2 0/4] x86: KVM vdso and clock improvements
@ 2015-12-20 11:05 Andy Lutomirski
  2015-12-20 11:05 ` [PATCH v2 1/4] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader Andy Lutomirski
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Andy Lutomirski @ 2015-12-20 11:05 UTC (permalink / raw)
  To: x86, Marcelo Tosatti, Radim Krcmar, Paolo Bonzini
  Cc: linux-kernel, kvm, Alexander Graf, Andy Lutomirski

x86: KVM vdso and clock improvements

NB: patch 1 doesn't really belong here, but it makes this a lot
easier for me to test.  Patch 1, if it's okay at all, should go
through the kvm tree.  The rest should probably go through
tip:x86/vdso once they're reviewed.

I'll do a follow-up to enable vdso pvclock on 32-bit guests.
I'm not currently set up to test it.  (The KVM people could also
do it very easily on top of these patches.)

Changes from v1:
 - Dropped patch 1
 - Added Paolo's review and acks
 - Fixed a build issue on some configs

Andy Lutomirski (4):
  x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader
  x86/vdso: Get pvclock data from the vvar VMA instead of the fixmap
  x86/vdso: Remove pvclock fixmap machinery
  x86/vdso: Enable vdso pvclock access on all vdso variants

 arch/x86/entry/vdso/vclock_gettime.c  | 151 ++++++++++++++++------------------
 arch/x86/entry/vdso/vdso-layout.lds.S |   3 +-
 arch/x86/entry/vdso/vdso2c.c          |   3 +
 arch/x86/entry/vdso/vma.c             |  14 ++++
 arch/x86/include/asm/fixmap.h         |   5 --
 arch/x86/include/asm/pvclock.h        |  14 ++--
 arch/x86/include/asm/vdso.h           |   1 +
 arch/x86/kernel/kvmclock.c            |  11 ++-
 arch/x86/kernel/pvclock.c             |  24 ------
 9 files changed, 107 insertions(+), 119 deletions(-)

-- 
2.5.0



* [PATCH v2 1/4] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader
  2015-12-20 11:05 [PATCH v2 0/4] x86: KVM vdso and clock improvements Andy Lutomirski
@ 2015-12-20 11:05 ` Andy Lutomirski
  2016-01-04 20:26   ` Marcelo Tosatti
  2015-12-20 11:05 ` [PATCH v2 2/4] x86/vdso: Get pvclock data from the vvar VMA instead of the fixmap Andy Lutomirski
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 16+ messages in thread
From: Andy Lutomirski @ 2015-12-20 11:05 UTC (permalink / raw)
  To: x86, Marcelo Tosatti, Radim Krcmar, Paolo Bonzini
  Cc: linux-kernel, kvm, Alexander Graf, Andy Lutomirski

From: Andy Lutomirski <luto@amacapital.net>

The pvclock vdso code was too abstracted to understand easily and
excessively paranoid.  Simplify it for a huge speedup.

This opens the door for additional simplifications, as the vdso no
longer accesses the pvti for any vcpu other than vcpu 0.

Before, vclock_gettime using kvm-clock took about 45ns on my machine.
With this change, it takes 29ns, which is almost as fast as the pure TSC
implementation.
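
For anyone who wants to reproduce the comparison, a minimal harness
along these lines should work.  This is a sketch, not the harness
behind the numbers above; it assumes glibc dispatches CLOCK_MONOTONIC
through the vdso (build with cc -O2; add -lrt on older glibc), and
results vary by machine:

	#include <stdio.h>
	#include <time.h>

	int main(void)
	{
		struct timespec ts;
		const long iters = 10000000;
		long long t0, t1;

		clock_gettime(CLOCK_MONOTONIC, &ts);
		t0 = ts.tv_sec * 1000000000LL + ts.tv_nsec;

		/* each iteration hits the vdso fast path */
		for (long i = 0; i < iters; i++)
			clock_gettime(CLOCK_MONOTONIC, &ts);

		t1 = ts.tv_sec * 1000000000LL + ts.tv_nsec;
		printf("%.1f ns/call\n", (double)(t1 - t0) / iters);
		return 0;
	}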

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
---
 arch/x86/entry/vdso/vclock_gettime.c | 81 ++++++++++++++++++++----------------
 1 file changed, 46 insertions(+), 35 deletions(-)

diff --git a/arch/x86/entry/vdso/vclock_gettime.c b/arch/x86/entry/vdso/vclock_gettime.c
index ca94fa649251..c325ba1bdddf 100644
--- a/arch/x86/entry/vdso/vclock_gettime.c
+++ b/arch/x86/entry/vdso/vclock_gettime.c
@@ -78,47 +78,58 @@ static notrace const struct pvclock_vsyscall_time_info *get_pvti(int cpu)
 
 static notrace cycle_t vread_pvclock(int *mode)
 {
-	const struct pvclock_vsyscall_time_info *pvti;
+	const struct pvclock_vcpu_time_info *pvti = &get_pvti(0)->pvti;
 	cycle_t ret;
-	u64 last;
-	u32 version;
-	u8 flags;
-	unsigned cpu, cpu1;
-
+	u64 tsc, pvti_tsc;
+	u64 last, delta, pvti_system_time;
+	u32 version, pvti_tsc_to_system_mul, pvti_tsc_shift;
 
 	/*
-	 * Note: hypervisor must guarantee that:
-	 * 1. cpu ID number maps 1:1 to per-CPU pvclock time info.
-	 * 2. that per-CPU pvclock time info is updated if the
-	 *    underlying CPU changes.
-	 * 3. that version is increased whenever underlying CPU
-	 *    changes.
+	 * Note: The kernel and hypervisor must guarantee that cpu ID
+	 * number maps 1:1 to per-CPU pvclock time info.
+	 *
+	 * Because the hypervisor is entirely unaware of guest userspace
+	 * preemption, it cannot guarantee that per-CPU pvclock time
+	 * info is updated if the underlying CPU changes or that that
+	 * version is increased whenever underlying CPU changes.
 	 *
+	 * On KVM, we are guaranteed that pvti updates for any vCPU are
+	 * atomic as seen by *all* vCPUs.  This is an even stronger
+	 * guarantee than we get with a normal seqlock.
+	 *
+	 * On Xen, we don't appear to have that guarantee, but Xen still
+	 * supplies a valid seqlock using the version field.
+
+	 * We only do pvclock vdso timing at all if
+	 * PVCLOCK_TSC_STABLE_BIT is set, and we interpret that bit to
+	 * mean that all vCPUs have matching pvti and that the TSC is
+	 * synced, so we can just look at vCPU 0's pvti.
 	 */
-	do {
-		cpu = __getcpu() & VGETCPU_CPU_MASK;
-		/* TODO: We can put vcpu id into higher bits of pvti.version.
-		 * This will save a couple of cycles by getting rid of
-		 * __getcpu() calls (Gleb).
-		 */
-
-		pvti = get_pvti(cpu);
-
-		version = __pvclock_read_cycles(&pvti->pvti, &ret, &flags);
-
-		/*
-		 * Test we're still on the cpu as well as the version.
-		 * We could have been migrated just after the first
-		 * vgetcpu but before fetching the version, so we
-		 * wouldn't notice a version change.
-		 */
-		cpu1 = __getcpu() & VGETCPU_CPU_MASK;
-	} while (unlikely(cpu != cpu1 ||
-			  (pvti->pvti.version & 1) ||
-			  pvti->pvti.version != version));
-
-	if (unlikely(!(flags & PVCLOCK_TSC_STABLE_BIT)))
+
+	if (unlikely(!(pvti->flags & PVCLOCK_TSC_STABLE_BIT))) {
 		*mode = VCLOCK_NONE;
+		return 0;
+	}
+
+	do {
+		version = pvti->version;
+
+		/* This is also a read barrier, so we'll read version first. */
+		tsc = rdtsc_ordered();
+
+		pvti_tsc_to_system_mul = pvti->tsc_to_system_mul;
+		pvti_tsc_shift = pvti->tsc_shift;
+		pvti_system_time = pvti->system_time;
+		pvti_tsc = pvti->tsc_timestamp;
+
+		/* Make sure that the version double-check is last. */
+		smp_rmb();
+	} while (unlikely((version & 1) || version != pvti->version));
+
+	delta = tsc - pvti_tsc;
+	ret = pvti_system_time +
+		pvclock_scale_delta(delta, pvti_tsc_to_system_mul,
+				    pvti_tsc_shift);
 
 	/* refer to tsc.c read_tsc() comment for rationale */
 	last = gtod->cycle_last;
-- 
2.5.0
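
The one helper the new reader still leans on, pvclock_scale_delta(),
turns a raw TSC delta into nanoseconds using the hypervisor-supplied
32.32 fixed-point multiplier and power-of-two shift from the pvti.
A simplified sketch of its semantics (the in-kernel implementation
computes the same 96-bit product without the gcc/clang __int128
extension):

	#include <stdint.h>

	static inline uint64_t scale_delta(uint64_t delta, uint32_t mul_frac,
					   int8_t shift)
	{
		if (shift < 0)
			delta >>= -shift;
		else
			delta <<= shift;

		/* high 64 bits of (delta * mul_frac) >> 32, i.e. multiply
		 * the shifted delta by mul_frac / 2^32 */
		return (uint64_t)(((unsigned __int128)delta * mul_frac) >> 32);
	}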



* [PATCH v2 2/4] x86/vdso: Get pvclock data from the vvar VMA instead of the fixmap
  2015-12-20 11:05 [PATCH v2 0/4] x86: KVM vdso and clock improvements Andy Lutomirski
  2015-12-20 11:05 ` [PATCH v2 1/4] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader Andy Lutomirski
@ 2015-12-20 11:05 ` Andy Lutomirski
  2015-12-20 11:05 ` [PATCH v2 3/4] x86/vdso: Remove pvclock fixmap machinery Andy Lutomirski
  2015-12-20 11:05 ` [PATCH v2 4/4] x86/vdso: Enable vdso pvclock access on all vdso variants Andy Lutomirski
  3 siblings, 0 replies; 16+ messages in thread
From: Andy Lutomirski @ 2015-12-20 11:05 UTC (permalink / raw)
  To: x86, Marcelo Tosatti, Radim Krcmar, Paolo Bonzini
  Cc: linux-kernel, kvm, Alexander Graf, Andy Lutomirski

Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/entry/vdso/vclock_gettime.c  | 20 ++++++++------------
 arch/x86/entry/vdso/vdso-layout.lds.S |  3 ++-
 arch/x86/entry/vdso/vdso2c.c          |  3 +++
 arch/x86/entry/vdso/vma.c             | 13 +++++++++++++
 arch/x86/include/asm/pvclock.h        |  9 +++++++++
 arch/x86/include/asm/vdso.h           |  1 +
 arch/x86/kernel/kvmclock.c            |  5 +++++
 7 files changed, 41 insertions(+), 13 deletions(-)

diff --git a/arch/x86/entry/vdso/vclock_gettime.c b/arch/x86/entry/vdso/vclock_gettime.c
index c325ba1bdddf..5dd363d54348 100644
--- a/arch/x86/entry/vdso/vclock_gettime.c
+++ b/arch/x86/entry/vdso/vclock_gettime.c
@@ -36,6 +36,11 @@ static notrace cycle_t vread_hpet(void)
 }
 #endif
 
+#ifdef CONFIG_PARAVIRT_CLOCK
+extern u8 pvclock_page
+	__attribute__((visibility("hidden")));
+#endif
+
 #ifndef BUILD_VDSO32
 
 #include <linux/kernel.h>
@@ -62,23 +67,14 @@ notrace static long vdso_fallback_gtod(struct timeval *tv, struct timezone *tz)
 
 #ifdef CONFIG_PARAVIRT_CLOCK
 
-static notrace const struct pvclock_vsyscall_time_info *get_pvti(int cpu)
+static notrace const struct pvclock_vsyscall_time_info *get_pvti0(void)
 {
-	const struct pvclock_vsyscall_time_info *pvti_base;
-	int idx = cpu / (PAGE_SIZE/PVTI_SIZE);
-	int offset = cpu % (PAGE_SIZE/PVTI_SIZE);
-
-	BUG_ON(PVCLOCK_FIXMAP_BEGIN + idx > PVCLOCK_FIXMAP_END);
-
-	pvti_base = (struct pvclock_vsyscall_time_info *)
-		    __fix_to_virt(PVCLOCK_FIXMAP_BEGIN+idx);
-
-	return &pvti_base[offset];
+	return (const struct pvclock_vsyscall_time_info *)&pvclock_page;
 }
 
 static notrace cycle_t vread_pvclock(int *mode)
 {
-	const struct pvclock_vcpu_time_info *pvti = &get_pvti(0)->pvti;
+	const struct pvclock_vcpu_time_info *pvti = &get_pvti0()->pvti;
 	cycle_t ret;
 	u64 tsc, pvti_tsc;
 	u64 last, delta, pvti_system_time;
diff --git a/arch/x86/entry/vdso/vdso-layout.lds.S b/arch/x86/entry/vdso/vdso-layout.lds.S
index de2c921025f5..4158acc17df0 100644
--- a/arch/x86/entry/vdso/vdso-layout.lds.S
+++ b/arch/x86/entry/vdso/vdso-layout.lds.S
@@ -25,7 +25,7 @@ SECTIONS
 	 * segment.
 	 */
 
-	vvar_start = . - 2 * PAGE_SIZE;
+	vvar_start = . - 3 * PAGE_SIZE;
 	vvar_page = vvar_start;
 
 	/* Place all vvars at the offsets in asm/vvar.h. */
@@ -36,6 +36,7 @@ SECTIONS
 #undef EMIT_VVAR
 
 	hpet_page = vvar_start + PAGE_SIZE;
+	pvclock_page = vvar_start + 2 * PAGE_SIZE;
 
 	. = SIZEOF_HEADERS;
 
diff --git a/arch/x86/entry/vdso/vdso2c.c b/arch/x86/entry/vdso/vdso2c.c
index 785d9922b106..491020b2826d 100644
--- a/arch/x86/entry/vdso/vdso2c.c
+++ b/arch/x86/entry/vdso/vdso2c.c
@@ -73,6 +73,7 @@ enum {
 	sym_vvar_start,
 	sym_vvar_page,
 	sym_hpet_page,
+	sym_pvclock_page,
 	sym_VDSO_FAKE_SECTION_TABLE_START,
 	sym_VDSO_FAKE_SECTION_TABLE_END,
 };
@@ -80,6 +81,7 @@ enum {
 const int special_pages[] = {
 	sym_vvar_page,
 	sym_hpet_page,
+	sym_pvclock_page,
 };
 
 struct vdso_sym {
@@ -91,6 +93,7 @@ struct vdso_sym required_syms[] = {
 	[sym_vvar_start] = {"vvar_start", true},
 	[sym_vvar_page] = {"vvar_page", true},
 	[sym_hpet_page] = {"hpet_page", true},
+	[sym_pvclock_page] = {"pvclock_page", true},
 	[sym_VDSO_FAKE_SECTION_TABLE_START] = {
 		"VDSO_FAKE_SECTION_TABLE_START", false
 	},
diff --git a/arch/x86/entry/vdso/vma.c b/arch/x86/entry/vdso/vma.c
index 64df47148160..aa828191c654 100644
--- a/arch/x86/entry/vdso/vma.c
+++ b/arch/x86/entry/vdso/vma.c
@@ -100,6 +100,7 @@ static int map_vdso(const struct vdso_image *image, bool calculate_addr)
 		.name = "[vvar]",
 		.pages = no_pages,
 	};
+	struct pvclock_vsyscall_time_info *pvti;
 
 	if (calculate_addr) {
 		addr = vdso_addr(current->mm->start_stack,
@@ -169,6 +170,18 @@ static int map_vdso(const struct vdso_image *image, bool calculate_addr)
 	}
 #endif
 
+	pvti = pvclock_pvti_cpu0_va();
+	if (pvti && image->sym_pvclock_page) {
+		ret = remap_pfn_range(vma,
+				      text_start + image->sym_pvclock_page,
+				      __pa(pvti) >> PAGE_SHIFT,
+				      PAGE_SIZE,
+				      PAGE_READONLY);
+
+		if (ret)
+			goto up_fail;
+	}
+
 up_fail:
 	if (ret)
 		current->mm->context.vdso = NULL;
diff --git a/arch/x86/include/asm/pvclock.h b/arch/x86/include/asm/pvclock.h
index 7a6bed5c08bc..571dad355bbc 100644
--- a/arch/x86/include/asm/pvclock.h
+++ b/arch/x86/include/asm/pvclock.h
@@ -4,6 +4,15 @@
 #include <linux/clocksource.h>
 #include <asm/pvclock-abi.h>
 
+#ifdef CONFIG_KVM_GUEST
+extern struct pvclock_vsyscall_time_info *pvclock_pvti_cpu0_va(void);
+#else
+static inline struct pvclock_vsyscall_time_info *pvclock_pvti_cpu0_va(void)
+{
+	return NULL;
+}
+#endif
+
 /* some helper functions for xen and kvm pv clock sources */
 cycle_t pvclock_clocksource_read(struct pvclock_vcpu_time_info *src);
 u8 pvclock_read_flags(struct pvclock_vcpu_time_info *src);
diff --git a/arch/x86/include/asm/vdso.h b/arch/x86/include/asm/vdso.h
index 756de9190aec..deabaf9759b6 100644
--- a/arch/x86/include/asm/vdso.h
+++ b/arch/x86/include/asm/vdso.h
@@ -22,6 +22,7 @@ struct vdso_image {
 
 	long sym_vvar_page;
 	long sym_hpet_page;
+	long sym_pvclock_page;
 	long sym_VDSO32_NOTE_MASK;
 	long sym___kernel_sigreturn;
 	long sym___kernel_rt_sigreturn;
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index 2bd81e302427..ec1b06dc82d2 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -45,6 +45,11 @@ early_param("no-kvmclock", parse_no_kvmclock);
 static struct pvclock_vsyscall_time_info *hv_clock;
 static struct pvclock_wall_clock wall_clock;
 
+struct pvclock_vsyscall_time_info *pvclock_pvti_cpu0_va(void)
+{
+	return hv_clock;
+}
+
 /*
  * The wallclock is the time of day when we booted. Since then, some time may
  * have elapsed since the hypervisor wrote the data. So we try to account for
-- 
2.5.0
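
Taken together, the linker-script, vdso2c, and vma.c changes above pin
the pvclock data one page below the vdso text, which is what lets
vclock_gettime.c reference it through the hidden pvclock_page symbol
with no relocations.  The resulting fixed layout (a sketch; offsets
are relative to the start of the vdso text):

	/*
	 * vdso text - 3*PAGE_SIZE : vvar page (vsyscall_gtod_data)
	 * vdso text - 2*PAGE_SIZE : HPET page
	 * vdso text - 1*PAGE_SIZE : pvclock page (vcpu 0's pvti, mapped
	 *                           read-only by the remap_pfn_range()
	 *                           call above)
	 * vdso text + 0           : vdso code itself
	 */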



* [PATCH v2 3/4] x86/vdso: Remove pvclock fixmap machinery
  2015-12-20 11:05 [PATCH v2 0/4] x86: KVM vdso and clock improvements Andy Lutomirski
  2015-12-20 11:05 ` [PATCH v2 1/4] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader Andy Lutomirski
  2015-12-20 11:05 ` [PATCH v2 2/4] x86/vdso: Get pvclock data from the vvar VMA instead of the fixmap Andy Lutomirski
@ 2015-12-20 11:05 ` Andy Lutomirski
  2015-12-20 11:05 ` [PATCH v2 4/4] x86/vdso: Enable vdso pvclock access on all vdso variants Andy Lutomirski
  3 siblings, 0 replies; 16+ messages in thread
From: Andy Lutomirski @ 2015-12-20 11:05 UTC (permalink / raw)
  To: x86, Marcelo Tosatti, Radim Krcmar, Paolo Bonzini
  Cc: linux-kernel, kvm, Alexander Graf, Andy Lutomirski

Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/entry/vdso/vclock_gettime.c |  1 -
 arch/x86/entry/vdso/vma.c            |  1 +
 arch/x86/include/asm/fixmap.h        |  5 -----
 arch/x86/include/asm/pvclock.h       |  5 -----
 arch/x86/kernel/kvmclock.c           |  6 ------
 arch/x86/kernel/pvclock.c            | 24 ------------------------
 6 files changed, 1 insertion(+), 41 deletions(-)

diff --git a/arch/x86/entry/vdso/vclock_gettime.c b/arch/x86/entry/vdso/vclock_gettime.c
index 5dd363d54348..59a98c25bde7 100644
--- a/arch/x86/entry/vdso/vclock_gettime.c
+++ b/arch/x86/entry/vdso/vclock_gettime.c
@@ -45,7 +45,6 @@ extern u8 pvclock_page
 
 #include <linux/kernel.h>
 #include <asm/vsyscall.h>
-#include <asm/fixmap.h>
 #include <asm/pvclock.h>
 
 notrace static long vdso_fallback_gettime(long clock, struct timespec *ts)
diff --git a/arch/x86/entry/vdso/vma.c b/arch/x86/entry/vdso/vma.c
index aa828191c654..b8f69e264ac4 100644
--- a/arch/x86/entry/vdso/vma.c
+++ b/arch/x86/entry/vdso/vma.c
@@ -12,6 +12,7 @@
 #include <linux/random.h>
 #include <linux/elf.h>
 #include <linux/cpu.h>
+#include <asm/pvclock.h>
 #include <asm/vgtod.h>
 #include <asm/proto.h>
 #include <asm/vdso.h>
diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h
index f80d70009ff8..6d7d0e52ed5a 100644
--- a/arch/x86/include/asm/fixmap.h
+++ b/arch/x86/include/asm/fixmap.h
@@ -19,7 +19,6 @@
 #include <asm/acpi.h>
 #include <asm/apicdef.h>
 #include <asm/page.h>
-#include <asm/pvclock.h>
 #ifdef CONFIG_X86_32
 #include <linux/threads.h>
 #include <asm/kmap_types.h>
@@ -72,10 +71,6 @@ enum fixed_addresses {
 #ifdef CONFIG_X86_VSYSCALL_EMULATION
 	VSYSCALL_PAGE = (FIXADDR_TOP - VSYSCALL_ADDR) >> PAGE_SHIFT,
 #endif
-#ifdef CONFIG_PARAVIRT_CLOCK
-	PVCLOCK_FIXMAP_BEGIN,
-	PVCLOCK_FIXMAP_END = PVCLOCK_FIXMAP_BEGIN+PVCLOCK_VSYSCALL_NR_PAGES-1,
-#endif
 #endif
 	FIX_DBGP_BASE,
 	FIX_EARLYCON_MEM_BASE,
diff --git a/arch/x86/include/asm/pvclock.h b/arch/x86/include/asm/pvclock.h
index 571dad355bbc..fdcc04020636 100644
--- a/arch/x86/include/asm/pvclock.h
+++ b/arch/x86/include/asm/pvclock.h
@@ -100,10 +100,5 @@ struct pvclock_vsyscall_time_info {
 } __attribute__((__aligned__(SMP_CACHE_BYTES)));
 
 #define PVTI_SIZE sizeof(struct pvclock_vsyscall_time_info)
-#define PVCLOCK_VSYSCALL_NR_PAGES (((NR_CPUS-1)/(PAGE_SIZE/PVTI_SIZE))+1)
-
-int __init pvclock_init_vsyscall(struct pvclock_vsyscall_time_info *i,
-				 int size);
-struct pvclock_vcpu_time_info *pvclock_get_vsyscall_time_info(int cpu);
 
 #endif /* _ASM_X86_PVCLOCK_H */
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index ec1b06dc82d2..72cef58693c7 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -310,7 +310,6 @@ int __init kvm_setup_vsyscall_timeinfo(void)
 {
 #ifdef CONFIG_X86_64
 	int cpu;
-	int ret;
 	u8 flags;
 	struct pvclock_vcpu_time_info *vcpu_time;
 	unsigned int size;
@@ -330,11 +329,6 @@ int __init kvm_setup_vsyscall_timeinfo(void)
 		return 1;
 	}
 
-	if ((ret = pvclock_init_vsyscall(hv_clock, size))) {
-		put_cpu();
-		return ret;
-	}
-
 	put_cpu();
 
 	kvm_clock.archdata.vclock_mode = VCLOCK_PVCLOCK;
diff --git a/arch/x86/kernel/pvclock.c b/arch/x86/kernel/pvclock.c
index 2f355d229a58..99bfc025111d 100644
--- a/arch/x86/kernel/pvclock.c
+++ b/arch/x86/kernel/pvclock.c
@@ -140,27 +140,3 @@ void pvclock_read_wallclock(struct pvclock_wall_clock *wall_clock,
 
 	set_normalized_timespec(ts, now.tv_sec, now.tv_nsec);
 }
-
-#ifdef CONFIG_X86_64
-/*
- * Initialize the generic pvclock vsyscall state.  This will allocate
- * a/some page(s) for the per-vcpu pvclock information, set up a
- * fixmap mapping for the page(s)
- */
-
-int __init pvclock_init_vsyscall(struct pvclock_vsyscall_time_info *i,
-				 int size)
-{
-	int idx;
-
-	WARN_ON (size != PVCLOCK_VSYSCALL_NR_PAGES*PAGE_SIZE);
-
-	for (idx = 0; idx <= (PVCLOCK_FIXMAP_END-PVCLOCK_FIXMAP_BEGIN); idx++) {
-		__set_fixmap(PVCLOCK_FIXMAP_BEGIN + idx,
-			     __pa(i) + (idx*PAGE_SIZE),
-			     PAGE_KERNEL_VVAR);
-	}
-
-	return 0;
-}
-#endif
-- 
2.5.0



* [PATCH v2 4/4] x86/vdso: Enable vdso pvclock access on all vdso variants
  2015-12-20 11:05 [PATCH v2 0/4] x86: KVM vdso and clock improvements Andy Lutomirski
                   ` (2 preceding siblings ...)
  2015-12-20 11:05 ` [PATCH v2 3/4] x86/vdso: Remove pvclock fixmap machinery Andy Lutomirski
@ 2015-12-20 11:05 ` Andy Lutomirski
  3 siblings, 0 replies; 16+ messages in thread
From: Andy Lutomirski @ 2015-12-20 11:05 UTC (permalink / raw)
  To: x86, Marcelo Tosatti, Radim Krcmar, Paolo Bonzini
  Cc: linux-kernel, kvm, Alexander Graf, Andy Lutomirski

Now that pvclock doesn't require access to the fixmap, all vdso
variants can use it.

The kernel side isn't wired up for 32-bit kernels yet, but this
covers 32-bit and x32 userspace on 64-bit kernels.
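
The 32-bit fallback paths added below shuffle %ebx by hand: when
building PIC, 32-bit gcc has historically reserved %ebx for the GOT
pointer and rejected it in asm clobber lists, yet the legacy syscall
ABI reached through __kernel_vsyscall passes the first argument in
%ebx.  A sketch of the pattern with explanatory comments added
(mirroring the clock_gettime fallback in the diff below):

	asm(
		"mov %%ebx, %%edx \n"	/* stash the GOT pointer; %edx is
					   declared clobbered below */
		"mov %2, %%ebx \n"	/* first syscall argument */
		"call __kernel_vsyscall \n"
		"mov %%edx, %%ebx \n"	/* restore the GOT pointer */
		: "=a" (ret)
		: "0" (__NR_clock_gettime), "g" (clock), "c" (ts)
		: "memory", "edx");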

Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/entry/vdso/vclock_gettime.c | 91 ++++++++++++++++--------------------
 1 file changed, 40 insertions(+), 51 deletions(-)

diff --git a/arch/x86/entry/vdso/vclock_gettime.c b/arch/x86/entry/vdso/vclock_gettime.c
index 59a98c25bde7..8602f06c759f 100644
--- a/arch/x86/entry/vdso/vclock_gettime.c
+++ b/arch/x86/entry/vdso/vclock_gettime.c
@@ -17,8 +17,10 @@
 #include <asm/vvar.h>
 #include <asm/unistd.h>
 #include <asm/msr.h>
+#include <asm/pvclock.h>
 #include <linux/math64.h>
 #include <linux/time.h>
+#include <linux/kernel.h>
 
 #define gtod (&VVAR(vsyscall_gtod_data))
 
@@ -43,10 +45,6 @@ extern u8 pvclock_page
 
 #ifndef BUILD_VDSO32
 
-#include <linux/kernel.h>
-#include <asm/vsyscall.h>
-#include <asm/pvclock.h>
-
 notrace static long vdso_fallback_gettime(long clock, struct timespec *ts)
 {
 	long ret;
@@ -64,8 +62,42 @@ notrace static long vdso_fallback_gtod(struct timeval *tv, struct timezone *tz)
 	return ret;
 }
 
-#ifdef CONFIG_PARAVIRT_CLOCK
 
+#else
+
+notrace static long vdso_fallback_gettime(long clock, struct timespec *ts)
+{
+	long ret;
+
+	asm(
+		"mov %%ebx, %%edx \n"
+		"mov %2, %%ebx \n"
+		"call __kernel_vsyscall \n"
+		"mov %%edx, %%ebx \n"
+		: "=a" (ret)
+		: "0" (__NR_clock_gettime), "g" (clock), "c" (ts)
+		: "memory", "edx");
+	return ret;
+}
+
+notrace static long vdso_fallback_gtod(struct timeval *tv, struct timezone *tz)
+{
+	long ret;
+
+	asm(
+		"mov %%ebx, %%edx \n"
+		"mov %2, %%ebx \n"
+		"call __kernel_vsyscall \n"
+		"mov %%edx, %%ebx \n"
+		: "=a" (ret)
+		: "0" (__NR_gettimeofday), "g" (tv), "c" (tz)
+		: "memory", "edx");
+	return ret;
+}
+
+#endif
+
+#ifdef CONFIG_PARAVIRT_CLOCK
 static notrace const struct pvclock_vsyscall_time_info *get_pvti0(void)
 {
 	return (const struct pvclock_vsyscall_time_info *)&pvclock_page;
@@ -109,9 +141,9 @@ static notrace cycle_t vread_pvclock(int *mode)
 	do {
 		version = pvti->version;
 
-		/* This is also a read barrier, so we'll read version first. */
-		tsc = rdtsc_ordered();
+		smp_rmb();
 
+		tsc = rdtsc_ordered();
 		pvti_tsc_to_system_mul = pvti->tsc_to_system_mul;
 		pvti_tsc_shift = pvti->tsc_shift;
 		pvti_system_time = pvti->system_time;
@@ -126,7 +158,7 @@ static notrace cycle_t vread_pvclock(int *mode)
 		pvclock_scale_delta(delta, pvti_tsc_to_system_mul,
 				    pvti_tsc_shift);
 
-	/* refer to tsc.c read_tsc() comment for rationale */
+	/* refer to vread_tsc() comment for rationale */
 	last = gtod->cycle_last;
 
 	if (likely(ret >= last))
@@ -136,49 +168,6 @@ static notrace cycle_t vread_pvclock(int *mode)
 }
 #endif
 
-#else
-
-notrace static long vdso_fallback_gettime(long clock, struct timespec *ts)
-{
-	long ret;
-
-	asm(
-		"mov %%ebx, %%edx \n"
-		"mov %2, %%ebx \n"
-		"call __kernel_vsyscall \n"
-		"mov %%edx, %%ebx \n"
-		: "=a" (ret)
-		: "0" (__NR_clock_gettime), "g" (clock), "c" (ts)
-		: "memory", "edx");
-	return ret;
-}
-
-notrace static long vdso_fallback_gtod(struct timeval *tv, struct timezone *tz)
-{
-	long ret;
-
-	asm(
-		"mov %%ebx, %%edx \n"
-		"mov %2, %%ebx \n"
-		"call __kernel_vsyscall \n"
-		"mov %%edx, %%ebx \n"
-		: "=a" (ret)
-		: "0" (__NR_gettimeofday), "g" (tv), "c" (tz)
-		: "memory", "edx");
-	return ret;
-}
-
-#ifdef CONFIG_PARAVIRT_CLOCK
-
-static notrace cycle_t vread_pvclock(int *mode)
-{
-	*mode = VCLOCK_NONE;
-	return 0;
-}
-#endif
-
-#endif
-
 notrace static cycle_t vread_tsc(void)
 {
 	cycle_t ret = (cycle_t)rdtsc_ordered();
-- 
2.5.0



* Re: [PATCH v2 1/4] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader
  2015-12-20 11:05 ` [PATCH v2 1/4] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader Andy Lutomirski
@ 2016-01-04 20:26   ` Marcelo Tosatti
  2016-01-04 22:33     ` Andy Lutomirski
  0 siblings, 1 reply; 16+ messages in thread
From: Marcelo Tosatti @ 2016-01-04 20:26 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: x86, Radim Krcmar, Paolo Bonzini, linux-kernel, kvm,
	Alexander Graf, Andy Lutomirski

On Sun, Dec 20, 2015 at 03:05:41AM -0800, Andy Lutomirski wrote:
> From: Andy Lutomirski <luto@amacapital.net>
> 
> The pvclock vdso code was too abstracted to understand easily and
> excessively paranoid.  Simplify it for a huge speedup.
> 
> This opens the door for additional simplifications, as the vdso no
> longer accesses the pvti for any vcpu other than vcpu 0.
> 
> Before, vclock_gettime using kvm-clock took about 45ns on my machine.
> With this change, it takes 29ns, which is almost as fast as the pure TSC
> implementation.
> 
> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Andy Lutomirski <luto@amacapital.net>
> ---
>  arch/x86/entry/vdso/vclock_gettime.c | 81 ++++++++++++++++++++----------------
>  1 file changed, 46 insertions(+), 35 deletions(-)
> 
> diff --git a/arch/x86/entry/vdso/vclock_gettime.c b/arch/x86/entry/vdso/vclock_gettime.c
> index ca94fa649251..c325ba1bdddf 100644
> --- a/arch/x86/entry/vdso/vclock_gettime.c
> +++ b/arch/x86/entry/vdso/vclock_gettime.c
> @@ -78,47 +78,58 @@ static notrace const struct pvclock_vsyscall_time_info *get_pvti(int cpu)
>  
>  static notrace cycle_t vread_pvclock(int *mode)
>  {
> -	const struct pvclock_vsyscall_time_info *pvti;
> +	const struct pvclock_vcpu_time_info *pvti = &get_pvti(0)->pvti;
>  	cycle_t ret;
> -	u64 last;
> -	u32 version;
> -	u8 flags;
> -	unsigned cpu, cpu1;
> -
> +	u64 tsc, pvti_tsc;
> +	u64 last, delta, pvti_system_time;
> +	u32 version, pvti_tsc_to_system_mul, pvti_tsc_shift;
>  
>  	/*
> -	 * Note: hypervisor must guarantee that:
> -	 * 1. cpu ID number maps 1:1 to per-CPU pvclock time info.
> -	 * 2. that per-CPU pvclock time info is updated if the
> -	 *    underlying CPU changes.
> -	 * 3. that version is increased whenever underlying CPU
> -	 *    changes.
> +	 * Note: The kernel and hypervisor must guarantee that cpu ID
> +	 * number maps 1:1 to per-CPU pvclock time info.
> +	 *
> +	 * Because the hypervisor is entirely unaware of guest userspace
> +	 * preemption, it cannot guarantee that per-CPU pvclock time
> +	 * info is updated if the underlying CPU changes or that that
> +	 * version is increased whenever underlying CPU changes.
>  	 *
> +	 * On KVM, we are guaranteed that pvti updates for any vCPU are
> +	 * atomic as seen by *all* vCPUs.  This is an even stronger
> +	 * guarantee than we get with a normal seqlock.
> +	 *
> +	 * On Xen, we don't appear to have that guarantee, but Xen still
> +	 * supplies a valid seqlock using the version field.
> +
> +	 * We only do pvclock vdso timing at all if
> +	 * PVCLOCK_TSC_STABLE_BIT is set, and we interpret that bit to
> +	 * mean that all vCPUs have matching pvti and that the TSC is
> +	 * synced, so we can just look at vCPU 0's pvti.
>  	 */
> -	do {
> -		cpu = __getcpu() & VGETCPU_CPU_MASK;
> -		/* TODO: We can put vcpu id into higher bits of pvti.version.
> -		 * This will save a couple of cycles by getting rid of
> -		 * __getcpu() calls (Gleb).
> -		 */
> -
> -		pvti = get_pvti(cpu);
> -
> -		version = __pvclock_read_cycles(&pvti->pvti, &ret, &flags);
> -
> -		/*
> -		 * Test we're still on the cpu as well as the version.
> -		 * We could have been migrated just after the first
> -		 * vgetcpu but before fetching the version, so we
> -		 * wouldn't notice a version change.
> -		 */
> -		cpu1 = __getcpu() & VGETCPU_CPU_MASK;
> -	} while (unlikely(cpu != cpu1 ||
> -			  (pvti->pvti.version & 1) ||
> -			  pvti->pvti.version != version));
> -
> -	if (unlikely(!(flags & PVCLOCK_TSC_STABLE_BIT)))
> +
> +	if (unlikely(!(pvti->flags & PVCLOCK_TSC_STABLE_BIT))) {
>  		*mode = VCLOCK_NONE;
> +		return 0;
> +	}
> +
> +	do {
> +		version = pvti->version;
> +
> +		/* This is also a read barrier, so we'll read version first. */
> +		tsc = rdtsc_ordered();
> +
> +		pvti_tsc_to_system_mul = pvti->tsc_to_system_mul;
> +		pvti_tsc_shift = pvti->tsc_shift;
> +		pvti_system_time = pvti->system_time;
> +		pvti_tsc = pvti->tsc_timestamp;
> +
> +		/* Make sure that the version double-check is last. */
> +		smp_rmb();
> +	} while (unlikely((version & 1) || version != pvti->version));

Andy,

What happens if PVCLOCK_TSC_STABLE_BIT is disabled here?

> +
> +	delta = tsc - pvti_tsc;
> +	ret = pvti_system_time +
> +		pvclock_scale_delta(delta, pvti_tsc_to_system_mul,
> +				    pvti_tsc_shift);
>  
>  	/* refer to tsc.c read_tsc() comment for rationale */
>  	last = gtod->cycle_last;
> -- 
> 2.5.0
> 
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [PATCH v2 1/4] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader
  2016-01-04 20:26   ` Marcelo Tosatti
@ 2016-01-04 22:33     ` Andy Lutomirski
  2016-01-04 22:59       ` Marcelo Tosatti
  0 siblings, 1 reply; 16+ messages in thread
From: Andy Lutomirski @ 2016-01-04 22:33 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: Andy Lutomirski, X86 ML, Radim Krcmar, Paolo Bonzini,
	linux-kernel, kvm list, Alexander Graf

On Mon, Jan 4, 2016 at 12:26 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> On Sun, Dec 20, 2015 at 03:05:41AM -0800, Andy Lutomirski wrote:
>> From: Andy Lutomirski <luto@amacapital.net>
>>
>> The pvclock vdso code was too abstracted to understand easily and
>> excessively paranoid.  Simplify it for a huge speedup.
>>
>> This opens the door for additional simplifications, as the vdso no
>> longer accesses the pvti for any vcpu other than vcpu 0.
>>
>> Before, vclock_gettime using kvm-clock took about 45ns on my machine.
>> With this change, it takes 29ns, which is almost as fast as the pure TSC
>> implementation.
>>
>> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
>> Signed-off-by: Andy Lutomirski <luto@amacapital.net>
>> ---
>>  arch/x86/entry/vdso/vclock_gettime.c | 81 ++++++++++++++++++++----------------
>>  1 file changed, 46 insertions(+), 35 deletions(-)
>>
>> diff --git a/arch/x86/entry/vdso/vclock_gettime.c b/arch/x86/entry/vdso/vclock_gettime.c
>> index ca94fa649251..c325ba1bdddf 100644
>> --- a/arch/x86/entry/vdso/vclock_gettime.c
>> +++ b/arch/x86/entry/vdso/vclock_gettime.c
>> @@ -78,47 +78,58 @@ static notrace const struct pvclock_vsyscall_time_info *get_pvti(int cpu)
>>
>>  static notrace cycle_t vread_pvclock(int *mode)
>>  {
>> -     const struct pvclock_vsyscall_time_info *pvti;
>> +     const struct pvclock_vcpu_time_info *pvti = &get_pvti(0)->pvti;
>>       cycle_t ret;
>> -     u64 last;
>> -     u32 version;
>> -     u8 flags;
>> -     unsigned cpu, cpu1;
>> -
>> +     u64 tsc, pvti_tsc;
>> +     u64 last, delta, pvti_system_time;
>> +     u32 version, pvti_tsc_to_system_mul, pvti_tsc_shift;
>>
>>       /*
>> -      * Note: hypervisor must guarantee that:
>> -      * 1. cpu ID number maps 1:1 to per-CPU pvclock time info.
>> -      * 2. that per-CPU pvclock time info is updated if the
>> -      *    underlying CPU changes.
>> -      * 3. that version is increased whenever underlying CPU
>> -      *    changes.
>> +      * Note: The kernel and hypervisor must guarantee that cpu ID
>> +      * number maps 1:1 to per-CPU pvclock time info.
>> +      *
>> +      * Because the hypervisor is entirely unaware of guest userspace
>> +      * preemption, it cannot guarantee that per-CPU pvclock time
>> +      * info is updated if the underlying CPU changes or that that
>> +      * version is increased whenever underlying CPU changes.
>>        *
>> +      * On KVM, we are guaranteed that pvti updates for any vCPU are
>> +      * atomic as seen by *all* vCPUs.  This is an even stronger
>> +      * guarantee than we get with a normal seqlock.
>> +      *
>> +      * On Xen, we don't appear to have that guarantee, but Xen still
>> +      * supplies a valid seqlock using the version field.
>> +
>> +      * We only do pvclock vdso timing at all if
>> +      * PVCLOCK_TSC_STABLE_BIT is set, and we interpret that bit to
>> +      * mean that all vCPUs have matching pvti and that the TSC is
>> +      * synced, so we can just look at vCPU 0's pvti.
>>        */
>> -     do {
>> -             cpu = __getcpu() & VGETCPU_CPU_MASK;
>> -             /* TODO: We can put vcpu id into higher bits of pvti.version.
>> -              * This will save a couple of cycles by getting rid of
>> -              * __getcpu() calls (Gleb).
>> -              */
>> -
>> -             pvti = get_pvti(cpu);
>> -
>> -             version = __pvclock_read_cycles(&pvti->pvti, &ret, &flags);
>> -
>> -             /*
>> -              * Test we're still on the cpu as well as the version.
>> -              * We could have been migrated just after the first
>> -              * vgetcpu but before fetching the version, so we
>> -              * wouldn't notice a version change.
>> -              */
>> -             cpu1 = __getcpu() & VGETCPU_CPU_MASK;
>> -     } while (unlikely(cpu != cpu1 ||
>> -                       (pvti->pvti.version & 1) ||
>> -                       pvti->pvti.version != version));
>> -
>> -     if (unlikely(!(flags & PVCLOCK_TSC_STABLE_BIT)))
>> +
>> +     if (unlikely(!(pvti->flags & PVCLOCK_TSC_STABLE_BIT))) {
>>               *mode = VCLOCK_NONE;
>> +             return 0;
>> +     }
>> +
>> +     do {
>> +             version = pvti->version;
>> +
>> +             /* This is also a read barrier, so we'll read version first. */
>> +             tsc = rdtsc_ordered();
>> +
>> +             pvti_tsc_to_system_mul = pvti->tsc_to_system_mul;
>> +             pvti_tsc_shift = pvti->tsc_shift;
>> +             pvti_system_time = pvti->system_time;
>> +             pvti_tsc = pvti->tsc_timestamp;
>> +
>> +             /* Make sure that the version double-check is last. */
>> +             smp_rmb();
>> +     } while (unlikely((version & 1) || version != pvti->version));
>
> Andy,
>
> What happens if PVCLOCK_TSC_STABLE_BIT is disabled here?

Do you mean what happens if it's disabled in the loop part after the
first check?  If that's actually possible, I'll do a follow-up to bail
if that happens by moving the check into the loop.

--Andy


* Re: [PATCH v2 1/4] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader
  2016-01-04 22:33     ` Andy Lutomirski
@ 2016-01-04 22:59       ` Marcelo Tosatti
  2016-01-04 23:14         ` [PATCH] x86/vdso/pvclock: Protect STABLE check with the seqcount Andy Lutomirski
  0 siblings, 1 reply; 16+ messages in thread
From: Marcelo Tosatti @ 2016-01-04 22:59 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Andy Lutomirski, X86 ML, Radim Krcmar, Paolo Bonzini,
	linux-kernel, kvm list, Alexander Graf

On Mon, Jan 04, 2016 at 02:33:12PM -0800, Andy Lutomirski wrote:
> On Mon, Jan 4, 2016 at 12:26 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> > On Sun, Dec 20, 2015 at 03:05:41AM -0800, Andy Lutomirski wrote:
> >> From: Andy Lutomirski <luto@amacapital.net>
> >>
> >> The pvclock vdso code was too abstracted to understand easily and
> >> excessively paranoid.  Simplify it for a huge speedup.
> >>
> >> This opens the door for additional simplifications, as the vdso no
> >> longer accesses the pvti for any vcpu other than vcpu 0.
> >>
> >> Before, vclock_gettime using kvm-clock took about 45ns on my machine.
> >> With this change, it takes 29ns, which is almost as fast as the pure TSC
> >> implementation.
> >>
> >> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
> >> Signed-off-by: Andy Lutomirski <luto@amacapital.net>
> >> ---
> >>  arch/x86/entry/vdso/vclock_gettime.c | 81 ++++++++++++++++++++----------------
> >>  1 file changed, 46 insertions(+), 35 deletions(-)
> >>
> >> diff --git a/arch/x86/entry/vdso/vclock_gettime.c b/arch/x86/entry/vdso/vclock_gettime.c
> >> index ca94fa649251..c325ba1bdddf 100644
> >> --- a/arch/x86/entry/vdso/vclock_gettime.c
> >> +++ b/arch/x86/entry/vdso/vclock_gettime.c
> >> @@ -78,47 +78,58 @@ static notrace const struct pvclock_vsyscall_time_info *get_pvti(int cpu)
> >>
> >>  static notrace cycle_t vread_pvclock(int *mode)
> >>  {
> >> -     const struct pvclock_vsyscall_time_info *pvti;
> >> +     const struct pvclock_vcpu_time_info *pvti = &get_pvti(0)->pvti;
> >>       cycle_t ret;
> >> -     u64 last;
> >> -     u32 version;
> >> -     u8 flags;
> >> -     unsigned cpu, cpu1;
> >> -
> >> +     u64 tsc, pvti_tsc;
> >> +     u64 last, delta, pvti_system_time;
> >> +     u32 version, pvti_tsc_to_system_mul, pvti_tsc_shift;
> >>
> >>       /*
> >> -      * Note: hypervisor must guarantee that:
> >> -      * 1. cpu ID number maps 1:1 to per-CPU pvclock time info.
> >> -      * 2. that per-CPU pvclock time info is updated if the
> >> -      *    underlying CPU changes.
> >> -      * 3. that version is increased whenever underlying CPU
> >> -      *    changes.
> >> +      * Note: The kernel and hypervisor must guarantee that cpu ID
> >> +      * number maps 1:1 to per-CPU pvclock time info.
> >> +      *
> >> +      * Because the hypervisor is entirely unaware of guest userspace
> >> +      * preemption, it cannot guarantee that per-CPU pvclock time
> >> +      * info is updated if the underlying CPU changes or that that
> >> +      * version is increased whenever underlying CPU changes.
> >>        *
> >> +      * On KVM, we are guaranteed that pvti updates for any vCPU are
> >> +      * atomic as seen by *all* vCPUs.  This is an even stronger
> >> +      * guarantee than we get with a normal seqlock.
> >> +      *
> >> +      * On Xen, we don't appear to have that guarantee, but Xen still
> >> +      * supplies a valid seqlock using the version field.
> >> +
> >> +      * We only do pvclock vdso timing at all if
> >> +      * PVCLOCK_TSC_STABLE_BIT is set, and we interpret that bit to
> >> +      * mean that all vCPUs have matching pvti and that the TSC is
> >> +      * synced, so we can just look at vCPU 0's pvti.
> >>        */
> >> -     do {
> >> -             cpu = __getcpu() & VGETCPU_CPU_MASK;
> >> -             /* TODO: We can put vcpu id into higher bits of pvti.version.
> >> -              * This will save a couple of cycles by getting rid of
> >> -              * __getcpu() calls (Gleb).
> >> -              */
> >> -
> >> -             pvti = get_pvti(cpu);
> >> -
> >> -             version = __pvclock_read_cycles(&pvti->pvti, &ret, &flags);
> >> -
> >> -             /*
> >> -              * Test we're still on the cpu as well as the version.
> >> -              * We could have been migrated just after the first
> >> -              * vgetcpu but before fetching the version, so we
> >> -              * wouldn't notice a version change.
> >> -              */
> >> -             cpu1 = __getcpu() & VGETCPU_CPU_MASK;
> >> -     } while (unlikely(cpu != cpu1 ||
> >> -                       (pvti->pvti.version & 1) ||
> >> -                       pvti->pvti.version != version));
> >> -
> >> -     if (unlikely(!(flags & PVCLOCK_TSC_STABLE_BIT)))
> >> +
> >> +     if (unlikely(!(pvti->flags & PVCLOCK_TSC_STABLE_BIT))) {
> >>               *mode = VCLOCK_NONE;
> >> +             return 0;
> >> +     }
> >> +
> >> +     do {
> >> +             version = pvti->version;
> >> +
> >> +             /* This is also a read barrier, so we'll read version first. */
> >> +             tsc = rdtsc_ordered();
> >> +
> >> +             pvti_tsc_to_system_mul = pvti->tsc_to_system_mul;
> >> +             pvti_tsc_shift = pvti->tsc_shift;
> >> +             pvti_system_time = pvti->system_time;
> >> +             pvti_tsc = pvti->tsc_timestamp;
> >> +
> >> +             /* Make sure that the version double-check is last. */
> >> +             smp_rmb();
> >> +     } while (unlikely((version & 1) || version != pvti->version));
> >
> > Andy,
> >
> > What happens if PVCLOCK_TSC_STABLE_BIT is disabled here?
> 
> Do you mean what happens if it's disabled in the loop part after the
> first check?  If that's actually possible, I'll do a follow-up to bail
> if that happens by moving the check into the loop.
> 
> --Andy

It is possible.



* [PATCH] x86/vdso/pvclock: Protect STABLE check with the seqcount
  2016-01-04 22:59       ` Marcelo Tosatti
@ 2016-01-04 23:14         ` Andy Lutomirski
  2016-01-07 21:02           ` Marcelo Tosatti
  2016-01-14  9:07           ` [tip:x86/urgent] " tip-bot for Andy Lutomirski
  0 siblings, 2 replies; 16+ messages in thread
From: Andy Lutomirski @ 2016-01-04 23:14 UTC (permalink / raw)
  To: X86 ML, Radim Krcmar, Paolo Bonzini, linux-kernel, kvm, Alexander Graf
  Cc: Andy Lutomirski

If the clock becomes unstable while we're reading it, we need to
bail.  We can do this by simply moving the check into the seqcount
loop.
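
Concretely, the interleaving that the old pre-loop check permitted
looks like this (a sketch of the race, not code from the patch):

	/*
	 *   guest userspace (vdso)        host updating pvti
	 *   ----------------------        ------------------
	 *   reads flags: STABLE set
	 *                                 version++  (now odd)
	 *                                 clear PVCLOCK_TSC_STABLE_BIT,
	 *                                 rewrite the pvti fields
	 *                                 version++  (even again)
	 *   version = pvti->version   (sees the new even value)
	 *   reads the new pvti fields (internally consistent)
	 *   version re-check passes, so the reader scales a TSC it can
	 *   no longer trust instead of falling back to the syscall.
	 *
	 * With the flags check inside the loop, the read is covered by
	 * the version seqcount: any concurrent transition either forces
	 * a retry or bails out to VCLOCK_NONE.
	 */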

Reported-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---

Marcelo, how's this?

arch/x86/entry/vdso/vclock_gettime.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/x86/entry/vdso/vclock_gettime.c b/arch/x86/entry/vdso/vclock_gettime.c
index 8602f06c759f..1a50e09c945b 100644
--- a/arch/x86/entry/vdso/vclock_gettime.c
+++ b/arch/x86/entry/vdso/vclock_gettime.c
@@ -126,23 +126,23 @@ static notrace cycle_t vread_pvclock(int *mode)
 	 *
 	 * On Xen, we don't appear to have that guarantee, but Xen still
 	 * supplies a valid seqlock using the version field.
-
+	 *
 	 * We only do pvclock vdso timing at all if
 	 * PVCLOCK_TSC_STABLE_BIT is set, and we interpret that bit to
 	 * mean that all vCPUs have matching pvti and that the TSC is
 	 * synced, so we can just look at vCPU 0's pvti.
 	 */
 
-	if (unlikely(!(pvti->flags & PVCLOCK_TSC_STABLE_BIT))) {
-		*mode = VCLOCK_NONE;
-		return 0;
-	}
-
 	do {
 		version = pvti->version;
 
 		smp_rmb();
 
+		if (unlikely(!(pvti->flags & PVCLOCK_TSC_STABLE_BIT))) {
+			*mode = VCLOCK_NONE;
+			return 0;
+		}
+
 		tsc = rdtsc_ordered();
 		pvti_tsc_to_system_mul = pvti->tsc_to_system_mul;
 		pvti_tsc_shift = pvti->tsc_shift;
-- 
2.4.3



* Re: [PATCH] x86/vdso/pvclock: Protect STABLE check with the seqcount
  2016-01-04 23:14         ` [PATCH] x86/vdso/pvclock: Protect STABLE check with the seqcount Andy Lutomirski
@ 2016-01-07 21:02           ` Marcelo Tosatti
  2016-01-07 21:13             ` Andy Lutomirski
  2016-01-14  9:07           ` [tip:x86/urgent] " tip-bot for Andy Lutomirski
  1 sibling, 1 reply; 16+ messages in thread
From: Marcelo Tosatti @ 2016-01-07 21:02 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: X86 ML, Radim Krcmar, Paolo Bonzini, linux-kernel, kvm, Alexander Graf

On Mon, Jan 04, 2016 at 03:14:28PM -0800, Andy Lutomirski wrote:
> If the clock becomes unstable while we're reading it, we need to
> bail.  We can do this by simply moving the check into the seqcount
> loop.
> 
> Reported-by: Marcelo Tosatti <mtosatti@redhat.com>
> Signed-off-by: Andy Lutomirski <luto@kernel.org>
> ---
> 
> Marcelo, how's this?
> 
> arch/x86/entry/vdso/vclock_gettime.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/entry/vdso/vclock_gettime.c b/arch/x86/entry/vdso/vclock_gettime.c
> index 8602f06c759f..1a50e09c945b 100644
> --- a/arch/x86/entry/vdso/vclock_gettime.c
> +++ b/arch/x86/entry/vdso/vclock_gettime.c
> @@ -126,23 +126,23 @@ static notrace cycle_t vread_pvclock(int *mode)
>  	 *
>  	 * On Xen, we don't appear to have that guarantee, but Xen still
>  	 * supplies a valid seqlock using the version field.
> -
> +	 *
>  	 * We only do pvclock vdso timing at all if
>  	 * PVCLOCK_TSC_STABLE_BIT is set, and we interpret that bit to
>  	 * mean that all vCPUs have matching pvti and that the TSC is
>  	 * synced, so we can just look at vCPU 0's pvti.
>  	 */
>  
> -	if (unlikely(!(pvti->flags & PVCLOCK_TSC_STABLE_BIT))) {
> -		*mode = VCLOCK_NONE;
> -		return 0;
> -	}
> -
>  	do {
>  		version = pvti->version;
>  
>  		smp_rmb();
>  
> +		if (unlikely(!(pvti->flags & PVCLOCK_TSC_STABLE_BIT))) {
> +			*mode = VCLOCK_NONE;
> +			return 0;
> +		}
> +
>  		tsc = rdtsc_ordered();
>  		pvti_tsc_to_system_mul = pvti->tsc_to_system_mul;
>  		pvti_tsc_shift = pvti->tsc_shift;
> -- 
> 2.4.3

Check it before returning the value (once cleared, it can't be set back 
to 1), similarly to what was in place before.


* Re: [PATCH] x86/vdso/pvclock: Protect STABLE check with the seqcount
  2016-01-07 21:02           ` Marcelo Tosatti
@ 2016-01-07 21:13             ` Andy Lutomirski
  2016-01-07 21:47               ` Paolo Bonzini
  2016-01-08 14:04               ` Marcelo Tosatti
  0 siblings, 2 replies; 16+ messages in thread
From: Andy Lutomirski @ 2016-01-07 21:13 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: Andy Lutomirski, X86 ML, Radim Krcmar, Paolo Bonzini,
	linux-kernel, kvm list, Alexander Graf

On Thu, Jan 7, 2016 at 1:02 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> On Mon, Jan 04, 2016 at 03:14:28PM -0800, Andy Lutomirski wrote:
>> If the clock becomes unstable while we're reading it, we need to
>> bail.  We can do this by simply moving the check into the seqcount
>> loop.
>>
>> Reported-by: Marcelo Tosatti <mtosatti@redhat.com>
>> Signed-off-by: Andy Lutomirski <luto@kernel.org>
>> ---
>>
>> Marcelo, how's this?
>>
>> arch/x86/entry/vdso/vclock_gettime.c | 12 ++++++------
>>  1 file changed, 6 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/x86/entry/vdso/vclock_gettime.c b/arch/x86/entry/vdso/vclock_gettime.c
>> index 8602f06c759f..1a50e09c945b 100644
>> --- a/arch/x86/entry/vdso/vclock_gettime.c
>> +++ b/arch/x86/entry/vdso/vclock_gettime.c
>> @@ -126,23 +126,23 @@ static notrace cycle_t vread_pvclock(int *mode)
>>        *
>>        * On Xen, we don't appear to have that guarantee, but Xen still
>>        * supplies a valid seqlock using the version field.
>> -
>> +      *
>>        * We only do pvclock vdso timing at all if
>>        * PVCLOCK_TSC_STABLE_BIT is set, and we interpret that bit to
>>        * mean that all vCPUs have matching pvti and that the TSC is
>>        * synced, so we can just look at vCPU 0's pvti.
>>        */
>>
>> -     if (unlikely(!(pvti->flags & PVCLOCK_TSC_STABLE_BIT))) {
>> -             *mode = VCLOCK_NONE;
>> -             return 0;
>> -     }
>> -
>>       do {
>>               version = pvti->version;
>>
>>               smp_rmb();
>>
>> +             if (unlikely(!(pvti->flags & PVCLOCK_TSC_STABLE_BIT))) {
>> +                     *mode = VCLOCK_NONE;
>> +                     return 0;
>> +             }
>> +
>>               tsc = rdtsc_ordered();
>>               pvti_tsc_to_system_mul = pvti->tsc_to_system_mul;
>>               pvti_tsc_shift = pvti->tsc_shift;
>> --
>> 2.4.3
>
> Check it before returning the value (once cleared, it can't be set back
> to 1), similarly to what was in place before.
>
>

I don't understand what you mean.

In the old code (4.3 and 4.4), the vdso checks STABLE_BIT at the end,
which is correct as long as STABLE_BIT can never change from 0 to 1.

In the -tip code, it's clearly wrong.

In the code in this patch, it should be correct regardless of how
STABLE_BIT changes as long as the seqcount works.  Given that the
performance cost of doing that is zero, I'd rather keep it that way.
If we're really paranoid, we could move it after the rest of the pvti
reads and add a barrier, but is there really any host on which that
matters?

--Andy

-- 
Andy Lutomirski
AMA Capital Management, LLC


* Re: [PATCH] x86/vdso/pvclock: Protect STABLE check with the seqcount
  2016-01-07 21:13             ` Andy Lutomirski
@ 2016-01-07 21:47               ` Paolo Bonzini
  2016-01-08 14:04               ` Marcelo Tosatti
  1 sibling, 0 replies; 16+ messages in thread
From: Paolo Bonzini @ 2016-01-07 21:47 UTC (permalink / raw)
  To: Andy Lutomirski, Marcelo Tosatti
  Cc: Andy Lutomirski, X86 ML, Radim Krcmar, linux-kernel, kvm list,
	Alexander Graf



On 07/01/2016 22:13, Andy Lutomirski wrote:
> I don't understand what you mean.
> 
> In the old code (4.3 and 4.4), the vdso checks STABLE_BIT at the end,
> which is correct as long as STABLE_BIT can never change from 0 to 1.
> 
> In the -tip code, it's clearly wrong.
> 
> In the code in this patch, it should be correct regardless of how
> STABLE_BIT changes as long as the seqcount works.  Given that the
> performance cost of doing that is zero, I'd rather keep it that way.
> If we're really paranoid, we could move it after the rest of the pvti
> reads and add a barrier, but is there really any host on which that
> matters?

I agree that your patch is fine.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>

Paolo


* Re: [PATCH] x86/vdso/pvclock: Protect STABLE check with the seqcount
  2016-01-07 21:13             ` Andy Lutomirski
  2016-01-07 21:47               ` Paolo Bonzini
@ 2016-01-08 14:04               ` Marcelo Tosatti
  2016-01-12 19:48                 ` Andy Lutomirski
  1 sibling, 1 reply; 16+ messages in thread
From: Marcelo Tosatti @ 2016-01-08 14:04 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Andy Lutomirski, X86 ML, Radim Krcmar, Paolo Bonzini,
	linux-kernel, kvm list, Alexander Graf

On Thu, Jan 07, 2016 at 01:13:41PM -0800, Andy Lutomirski wrote:
> On Thu, Jan 7, 2016 at 1:02 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> > On Mon, Jan 04, 2016 at 03:14:28PM -0800, Andy Lutomirski wrote:
> >> If the clock becomes unstable while we're reading it, we need to
> >> bail.  We can do this by simply moving the check into the seqcount
> >> loop.
> >>
> >> Reported-by: Marcelo Tosatti <mtosatti@redhat.com>
> >> Signed-off-by: Andy Lutomirski <luto@kernel.org>
> >> ---
> >>
> >> Marcelo, how's this?
> >>
> >> arch/x86/entry/vdso/vclock_gettime.c | 12 ++++++------
> >>  1 file changed, 6 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/arch/x86/entry/vdso/vclock_gettime.c b/arch/x86/entry/vdso/vclock_gettime.c
> >> index 8602f06c759f..1a50e09c945b 100644
> >> --- a/arch/x86/entry/vdso/vclock_gettime.c
> >> +++ b/arch/x86/entry/vdso/vclock_gettime.c
> >> @@ -126,23 +126,23 @@ static notrace cycle_t vread_pvclock(int *mode)
> >>        *
> >>        * On Xen, we don't appear to have that guarantee, but Xen still
> >>        * supplies a valid seqlock using the version field.
> >> -
> >> +      *
> >>        * We only do pvclock vdso timing at all if
> >>        * PVCLOCK_TSC_STABLE_BIT is set, and we interpret that bit to
> >>        * mean that all vCPUs have matching pvti and that the TSC is
> >>        * synced, so we can just look at vCPU 0's pvti.
> >>        */
> >>
> >> -     if (unlikely(!(pvti->flags & PVCLOCK_TSC_STABLE_BIT))) {
> >> -             *mode = VCLOCK_NONE;
> >> -             return 0;
> >> -     }
> >> -
> >>       do {
> >>               version = pvti->version;
> >>
> >>               smp_rmb();
> >>
> >> +             if (unlikely(!(pvti->flags & PVCLOCK_TSC_STABLE_BIT))) {
> >> +                     *mode = VCLOCK_NONE;
> >> +                     return 0;
> >> +             }
> >> +
> >>               tsc = rdtsc_ordered();
> >>               pvti_tsc_to_system_mul = pvti->tsc_to_system_mul;
> >>               pvti_tsc_shift = pvti->tsc_shift;
> >> --
> >> 2.4.3
> >
> > Check it before returning the value (once cleared, it can't be set back
> > to 1), similarly to what was in place before.
> >
> >
> 
> I don't understand what you mean.
> 
> In the old code (4.3 and 4.4), the vdso checks STABLE_BIT at the end,
> which is correct as long as STABLE_BIT can never change from 0 to 1.
> 
> In the -tip code, it's clearly wrong.
> 
> In the code in this patch, it should be correct regardless of how
> STABLE_BIT changes as long as the seqcount works.  Given that the
> performance cost of doing that is zero, I'd rather keep it that way.
> If we're really paranoid, we could move it after the rest of the pvti
> reads and add a barrier, but is there really any host on which that
> matters?
> 
> --Andy
> 
> -- 
> Andy Lutomirski
> AMA Capital Management, LLC

Right, it's OK due to the version check, thanks.


* Re: [PATCH] x86/vdso/pvclock: Protect STABLE check with the seqcount
  2016-01-08 14:04               ` Marcelo Tosatti
@ 2016-01-12 19:48                 ` Andy Lutomirski
  2016-01-13 10:46                   ` Ingo Molnar
  0 siblings, 1 reply; 16+ messages in thread
From: Andy Lutomirski @ 2016-01-12 19:48 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Andy Lutomirski, X86 ML, Radim Krcmar, Paolo Bonzini,
	linux-kernel, kvm list, Alexander Graf, Marcelo Tosatti

Hi Ingo-

Can you apply this before the tip:x86/asm pull request goes out?  It
fixes a regression in tip:x86/asm.

--Andy

On Fri, Jan 8, 2016 at 6:04 AM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> On Thu, Jan 07, 2016 at 01:13:41PM -0800, Andy Lutomirski wrote:
>> On Thu, Jan 7, 2016 at 1:02 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
>> > On Mon, Jan 04, 2016 at 03:14:28PM -0800, Andy Lutomirski wrote:
>> >> If the clock becomes unstable while we're reading it, we need to
>> >> bail.  We can do this by simply moving the check into the seqcount
>> >> loop.
>> >>
>> >> Reported-by: Marcelo Tosatti <mtosatti@redhat.com>
>> >> Signed-off-by: Andy Lutomirski <luto@kernel.org>
>> >> ---
>> >>
>> >> Marcelo, how's this?
>> >>
>> >> arch/x86/entry/vdso/vclock_gettime.c | 12 ++++++------
>> >>  1 file changed, 6 insertions(+), 6 deletions(-)
>> >>
>> >> diff --git a/arch/x86/entry/vdso/vclock_gettime.c b/arch/x86/entry/vdso/vclock_gettime.c
>> >> index 8602f06c759f..1a50e09c945b 100644
>> >> --- a/arch/x86/entry/vdso/vclock_gettime.c
>> >> +++ b/arch/x86/entry/vdso/vclock_gettime.c
>> >> @@ -126,23 +126,23 @@ static notrace cycle_t vread_pvclock(int *mode)
>> >>        *
>> >>        * On Xen, we don't appear to have that guarantee, but Xen still
>> >>        * supplies a valid seqlock using the version field.
>> >> -
>> >> +      *
>> >>        * We only do pvclock vdso timing at all if
>> >>        * PVCLOCK_TSC_STABLE_BIT is set, and we interpret that bit to
>> >>        * mean that all vCPUs have matching pvti and that the TSC is
>> >>        * synced, so we can just look at vCPU 0's pvti.
>> >>        */
>> >>
>> >> -     if (unlikely(!(pvti->flags & PVCLOCK_TSC_STABLE_BIT))) {
>> >> -             *mode = VCLOCK_NONE;
>> >> -             return 0;
>> >> -     }
>> >> -
>> >>       do {
>> >>               version = pvti->version;
>> >>
>> >>               smp_rmb();
>> >>
>> >> +             if (unlikely(!(pvti->flags & PVCLOCK_TSC_STABLE_BIT))) {
>> >> +                     *mode = VCLOCK_NONE;
>> >> +                     return 0;
>> >> +             }
>> >> +
>> >>               tsc = rdtsc_ordered();
>> >>               pvti_tsc_to_system_mul = pvti->tsc_to_system_mul;
>> >>               pvti_tsc_shift = pvti->tsc_shift;
>> >> --
>> >> 2.4.3
>> >
>> > Check it before returning the value (once cleared, it can't be set back
>> > to 1), similarly to what was in place before.
>> >
>> >
>>
>> I don't understand what you mean.
>>
>> In the old code (4.3 and 4.4), the vdso checks STABLE_BIT at the end,
>> which is correct as long as STABLE_BIT can never change from 0 to 1.
>>
>> In the -tip code, it's clearly wrong.
>>
>> In the code in this patch, it should be correct regardless of how
>> STABLE_BIT changes as long as the seqcount works.  Given that the
>> performance cost of doing that is zero, I'd rather keep it that way.
>> If we're really paranoid, we could move it after the rest of the pvti
>> reads and add a barrier, but is there really any host on which that
>> matters?
>>
>> --Andy
>>
>> --
>> Andy Lutomirski
>> AMA Capital Management, LLC
>
> Right, its OK due to version check, thanks.
>



-- 
Andy Lutomirski
AMA Capital Management, LLC


* Re: [PATCH] x86/vdso/pvclock: Protect STABLE check with the seqcount
  2016-01-12 19:48                 ` Andy Lutomirski
@ 2016-01-13 10:46                   ` Ingo Molnar
  0 siblings, 0 replies; 16+ messages in thread
From: Ingo Molnar @ 2016-01-13 10:46 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Andy Lutomirski, X86 ML, Radim Krcmar, Paolo Bonzini,
	linux-kernel, kvm list, Alexander Graf, Marcelo Tosatti,
	Linus Torvalds


* Andy Lutomirski <luto@amacapital.net> wrote:

> Hi Ingo-
> 
> Can you apply this before the tip:x86/asm pull request goes out?  It
> fixes a regression in tip:x86/asm.

Ooops, saw this mail too late - I'll merge this up into x86/urgent right now and 
send all pending fixes to Linus.

Thanks,

	Ingo


* [tip:x86/urgent] x86/vdso/pvclock: Protect STABLE check with the seqcount
  2016-01-04 23:14         ` [PATCH] x86/vdso/pvclock: Protect STABLE check with the seqcount Andy Lutomirski
  2016-01-07 21:02           ` Marcelo Tosatti
@ 2016-01-14  9:07           ` tip-bot for Andy Lutomirski
  1 sibling, 0 replies; 16+ messages in thread
From: tip-bot for Andy Lutomirski @ 2016-01-14  9:07 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, agraf, pbonzini, mtosatti, mingo, peterz, bp,
	luto, rkrcmar, luto, torvalds, tglx, dvlasenk, brgerst

Commit-ID:  78fd8c7288e0a4bba3ad1d69caf9396a6b69cb00
Gitweb:     http://git.kernel.org/tip/78fd8c7288e0a4bba3ad1d69caf9396a6b69cb00
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Mon, 4 Jan 2016 15:14:28 -0800
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 13 Jan 2016 11:46:29 +0100

x86/vdso/pvclock: Protect STABLE check with the seqcount

If the clock becomes unstable while we're reading it, we need to
bail.  We can do this by simply moving the check into the
seqcount loop.

Reported-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Alexander Graf <agraf@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Radim Krcmar <rkrcmar@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/755dcedb17269e1d7ce12a9a713dea303835137e.1451949191.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/entry/vdso/vclock_gettime.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/x86/entry/vdso/vclock_gettime.c b/arch/x86/entry/vdso/vclock_gettime.c
index 8602f06..1a50e09 100644
--- a/arch/x86/entry/vdso/vclock_gettime.c
+++ b/arch/x86/entry/vdso/vclock_gettime.c
@@ -126,23 +126,23 @@ static notrace cycle_t vread_pvclock(int *mode)
 	 *
 	 * On Xen, we don't appear to have that guarantee, but Xen still
 	 * supplies a valid seqlock using the version field.
-
+	 *
 	 * We only do pvclock vdso timing at all if
 	 * PVCLOCK_TSC_STABLE_BIT is set, and we interpret that bit to
 	 * mean that all vCPUs have matching pvti and that the TSC is
 	 * synced, so we can just look at vCPU 0's pvti.
 	 */
 
-	if (unlikely(!(pvti->flags & PVCLOCK_TSC_STABLE_BIT))) {
-		*mode = VCLOCK_NONE;
-		return 0;
-	}
-
 	do {
 		version = pvti->version;
 
 		smp_rmb();
 
+		if (unlikely(!(pvti->flags & PVCLOCK_TSC_STABLE_BIT))) {
+			*mode = VCLOCK_NONE;
+			return 0;
+		}
+
 		tsc = rdtsc_ordered();
 		pvti_tsc_to_system_mul = pvti->tsc_to_system_mul;
 		pvti_tsc_shift = pvti->tsc_shift;

