* [PATCH v3 00/18] x86/tsc: Clean up rdtsc helpers
@ 2015-06-17  0:35 Andy Lutomirski
  2015-06-17  0:35 ` [PATCH v3 01/18] x86/tsc: Inline native_read_tsc and remove __native_read_tsc Andy Lutomirski
                   ` (18 more replies)
  0 siblings, 19 replies; 57+ messages in thread
From: Andy Lutomirski @ 2015-06-17  0:35 UTC (permalink / raw)
  To: x86
  Cc: Borislav Petkov, Peter Zijlstra, John Stultz, linux-kernel,
	Len Brown, Huang Rui, Denys Vlasenko, kvm, Ralf Baechle,
	Andy Lutomirski

My sincere apologies for the spam.  I sent an unholy mixture of the
real patch set and an old poorly split-up patch set, and the result
was incomprehensible.  Here's what I meant to send.

After some recent threads about rdtsc barriers, I remembered
that our RDTSC wrappers are a big mess.  Let's clean them up.

Currently we have rdtscl, rdtscll, native_read_tsc,
paravirt_read_tsc, and rdtsc_barrier.  For people who haven't
noticed rdtsc_barrier and who haven't carefully read the docs,
there's no indication that all of the other accessors have a giant
ordering gotcha.  The macro forms are ugly, and the paravirt
implementation is completely pointless.

rdtscl is particularly awful.  It reads the low bits.  There are no
performance critical users of just the low bits anywhere in the
kernel.

Clean it up.  After this patch set, there are exactly two
functions.  rdtsc() is a function that does a raw RDTSC and returns
a 64-bit number.  rdtsc_ordered() is a function that does a
properly ordered RDTSC for general-purpose use.  rdtsc_barrier() is
gone entirely.
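
To make the raw versus ordered distinction concrete, here is a
minimal userspace sketch.  It is not the kernel implementation: the
*_sketch names are hypothetical, and hardcoding LFENCE is an
assumption (it needs an SSE2-capable CPU on which LFENCE orders
RDTSC).  On non-x86 builds a fake counter stands in so that the
sketch still compiles.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
/* Raw read: the CPU may execute RDTSC before earlier instructions finish. */
static inline uint64_t rdtsc_sketch(void)
{
	uint32_t low, high;

	__asm__ __volatile__("rdtsc" : "=a"(low), "=d"(high));
	return ((uint64_t)high << 32) | low;
}

/* Ordered read: a fence in front keeps RDTSC from being hoisted early. */
static inline uint64_t rdtsc_ordered_sketch(void)
{
	uint32_t low, high;

	/* Assumption: LFENCE is a sufficient barrier on this CPU. */
	__asm__ __volatile__("lfence\n\trdtsc"
			     : "=a"(low), "=d"(high) : : "memory");
	return ((uint64_t)high << 32) | low;
}
#else
/* Fallback so the sketch compiles elsewhere: a fake increasing counter. */
static uint64_t fake_tsc_sketch;
static inline uint64_t rdtsc_sketch(void) { return ++fake_tsc_sketch; }
static inline uint64_t rdtsc_ordered_sketch(void) { return ++fake_tsc_sketch; }
#endif

/* Time a region with ordered reads and return the elapsed cycles. */
static uint64_t rdtsc_order_demo(void)
{
	uint64_t start = rdtsc_ordered_sketch();
	/* ... the code being timed would go here ... */
	uint64_t end = rdtsc_ordered_sketch();

	/* A backwards jump would show up as a huge unsigned delta. */
	assert(end - start < (1ULL << 62));
	return end - start;
}
```

The raw form fits callers that only want a rough timestamp or some
entropy; anything that compares two readings against surrounding
code probably wants the ordered form.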

Changes from v2:
 - Rename rdtsc_unordered to just rdtsc
 - Get rid of rdtsc_barrier entirely instead of renaming it
 - The KVM patch is new (see above)
 - Added some acks

Changes from v1:
 - None, except that I screwed up the v1 emails.

Andy Lutomirski (18):
  x86/tsc: Inline native_read_tsc and remove __native_read_tsc
  x86/msr/kvm: Remove vget_cycles()
  x86/tsc/paravirt: Remove the read_tsc and read_tscp paravirt hooks
  x86/tsc: Replace rdtscll with native_read_tsc
  x86/tsc: Remove the rdtscp and rdtscpll macros
  x86/tsc: Use the full 64-bit tsc in tsc_delay
  x86/cpu/amd: Use the full 64-bit TSC to detect the 2.6.2 bug
  baycom_epp: Replace rdtscl() with native_read_tsc()
  staging/lirc_serial: Remove TSC-based timing
  input/joystick/analog: Switch from rdtscl() to native_read_tsc()
  drivers/input/gameport: Replace rdtscl() with native_read_tsc()
  x86/tsc: Remove rdtscl()
  x86/tsc: Rename native_read_tsc() to rdtsc()
  x86: Add rdtsc_ordered() and use it in trivial call sites
  x86/tsc: Use rdtsc_ordered() in check_tsc_warp() and drop extra
    barriers
  x86/tsc: In read_tsc, use rdtsc_ordered() instead of get_cycles()
  x86/kvm/tsc: Drop extra barrier and use rdtsc_ordered in kvmclock
  x86/tsc: Remove rdtsc_barrier()

 arch/x86/boot/compressed/aslr.c                    |  2 +-
 arch/x86/entry/vdso/vclock_gettime.c               | 16 +-----
 arch/x86/include/asm/barrier.h                     | 11 ----
 arch/x86/include/asm/msr.h                         | 54 ++++++++++++-------
 arch/x86/include/asm/paravirt.h                    | 34 ------------
 arch/x86/include/asm/paravirt_types.h              |  2 -
 arch/x86/include/asm/pvclock.h                     | 21 ++++----
 arch/x86/include/asm/stackprotector.h              |  2 +-
 arch/x86/include/asm/tsc.h                         | 18 +------
 arch/x86/kernel/apb_timer.c                        |  8 +--
 arch/x86/kernel/apic/apic.c                        |  8 +--
 arch/x86/kernel/cpu/amd.c                          |  6 +--
 arch/x86/kernel/cpu/mcheck/mce.c                   |  4 +-
 arch/x86/kernel/espfix_64.c                        |  2 +-
 arch/x86/kernel/hpet.c                             |  4 +-
 arch/x86/kernel/paravirt.c                         |  2 -
 arch/x86/kernel/paravirt_patch_32.c                |  2 -
 arch/x86/kernel/trace_clock.c                      |  7 +--
 arch/x86/kernel/tsc.c                              | 12 ++---
 arch/x86/kernel/tsc_sync.c                         | 14 +++--
 arch/x86/kvm/lapic.c                               |  4 +-
 arch/x86/kvm/svm.c                                 |  4 +-
 arch/x86/kvm/vmx.c                                 |  4 +-
 arch/x86/kvm/x86.c                                 | 26 +++------
 arch/x86/lib/delay.c                               | 13 ++---
 arch/x86/um/asm/barrier.h                          | 13 -----
 arch/x86/xen/enlighten.c                           |  3 --
 drivers/input/gameport/gameport.c                  |  4 +-
 drivers/input/joystick/analog.c                    |  4 +-
 drivers/net/hamradio/baycom_epp.c                  |  2 +-
 drivers/staging/media/lirc/lirc_serial.c           | 63 ++--------------------
 drivers/thermal/intel_powerclamp.c                 |  4 +-
 .../power/cpupower/debug/kernel/cpufreq-test_tsc.c |  4 +-
 33 files changed, 110 insertions(+), 267 deletions(-)

-- 
2.4.2


* [PATCH v3 01/18] x86/tsc: Inline native_read_tsc and remove __native_read_tsc
  2015-06-17  0:35 [PATCH v3 00/18] x86/tsc: Clean up rdtsc helpers Andy Lutomirski
@ 2015-06-17  0:35 ` Andy Lutomirski
  2015-06-17  9:26   ` Borislav Petkov
  2015-07-06 15:39   ` [tip:x86/asm] x86/asm/tsc: Inline native_read_tsc() and remove __native_read_tsc() tip-bot for Andy Lutomirski
  2015-06-17  0:35 ` [PATCH v3 02/18] x86/msr/kvm: Remove vget_cycles() Andy Lutomirski
                   ` (17 subsequent siblings)
  18 siblings, 2 replies; 57+ messages in thread
From: Andy Lutomirski @ 2015-06-17  0:35 UTC (permalink / raw)
  To: x86
  Cc: Borislav Petkov, Peter Zijlstra, John Stultz, linux-kernel,
	Len Brown, Huang Rui, Denys Vlasenko, kvm, Ralf Baechle,
	Andy Lutomirski

In cdc7957d1954 ("x86: move native_read_tsc() offline"),
native_read_tsc was moved out of line, presumably for some
now-obsolete vDSO-related reason.  Undo it.

The entire rdtsc, shl, or sequence is only 11 bytes, and calls via
rdtscl and similar helpers were already inlined.
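
For readers who have not looked at the sequence: RDTSC returns the
counter as two 32-bit halves, and the shl/or pair glues them into
one 64-bit value.  The sketch below shows that arithmetic in plain
C, along with the truncation that rdtscl performs.  The function
names are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Combine the halves the way the "shl, or" part of the sequence does. */
static inline uint64_t tsc_from_halves(uint32_t low, uint32_t high)
{
	return ((uint64_t)high << 32) | low;
}

/* What rdtscl() keeps: only the low 32 bits of the counter. */
static inline uint32_t tsc_low_bits(uint64_t tsc)
{
	return (uint32_t)tsc;
}

static void tsc_halves_demo(void)
{
	uint64_t tsc = tsc_from_halves(0x89abcdefu, 0x01234567u);

	assert(tsc == 0x0123456789abcdefULL);
	assert(tsc_low_bits(tsc) == 0x89abcdefu);
}
```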

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/entry/vdso/vclock_gettime.c  | 2 +-
 arch/x86/include/asm/msr.h            | 8 +++-----
 arch/x86/include/asm/pvclock.h        | 2 +-
 arch/x86/include/asm/stackprotector.h | 2 +-
 arch/x86/include/asm/tsc.h            | 2 +-
 arch/x86/kernel/apb_timer.c           | 4 ++--
 arch/x86/kernel/tsc.c                 | 6 ------
 7 files changed, 9 insertions(+), 17 deletions(-)

diff --git a/arch/x86/entry/vdso/vclock_gettime.c b/arch/x86/entry/vdso/vclock_gettime.c
index 9793322751e0..972b488ac16a 100644
--- a/arch/x86/entry/vdso/vclock_gettime.c
+++ b/arch/x86/entry/vdso/vclock_gettime.c
@@ -186,7 +186,7 @@ notrace static cycle_t vread_tsc(void)
 	 * but no one has ever seen it happen.
 	 */
 	rdtsc_barrier();
-	ret = (cycle_t)__native_read_tsc();
+	ret = (cycle_t)native_read_tsc();
 
 	last = gtod->cycle_last;
 
diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
index e6a707eb5081..88711470af7f 100644
--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -106,12 +106,10 @@ notrace static inline int native_write_msr_safe(unsigned int msr,
 	return err;
 }
 
-extern unsigned long long native_read_tsc(void);
-
 extern int rdmsr_safe_regs(u32 regs[8]);
 extern int wrmsr_safe_regs(u32 regs[8]);
 
-static __always_inline unsigned long long __native_read_tsc(void)
+static __always_inline unsigned long long native_read_tsc(void)
 {
 	DECLARE_ARGS(val, low, high);
 
@@ -181,10 +179,10 @@ static inline int rdmsrl_safe(unsigned msr, unsigned long long *p)
 }
 
 #define rdtscl(low)						\
-	((low) = (u32)__native_read_tsc())
+	((low) = (u32)native_read_tsc())
 
 #define rdtscll(val)						\
-	((val) = __native_read_tsc())
+	((val) = native_read_tsc())
 
 #define rdpmc(counter, low, high)			\
 do {							\
diff --git a/arch/x86/include/asm/pvclock.h b/arch/x86/include/asm/pvclock.h
index d6b078e9fa28..71bd485c2986 100644
--- a/arch/x86/include/asm/pvclock.h
+++ b/arch/x86/include/asm/pvclock.h
@@ -62,7 +62,7 @@ static inline u64 pvclock_scale_delta(u64 delta, u32 mul_frac, int shift)
 static __always_inline
 u64 pvclock_get_nsec_offset(const struct pvclock_vcpu_time_info *src)
 {
-	u64 delta = __native_read_tsc() - src->tsc_timestamp;
+	u64 delta = native_read_tsc() - src->tsc_timestamp;
 	return pvclock_scale_delta(delta, src->tsc_to_system_mul,
 				   src->tsc_shift);
 }
diff --git a/arch/x86/include/asm/stackprotector.h b/arch/x86/include/asm/stackprotector.h
index c2e00bb2a136..bc5fa2af112e 100644
--- a/arch/x86/include/asm/stackprotector.h
+++ b/arch/x86/include/asm/stackprotector.h
@@ -72,7 +72,7 @@ static __always_inline void boot_init_stack_canary(void)
 	 * on during the bootup the random pool has true entropy too.
 	 */
 	get_random_bytes(&canary, sizeof(canary));
-	tsc = __native_read_tsc();
+	tsc = native_read_tsc();
 	canary += tsc + (tsc << 32UL);
 
 	current->stack_canary = canary;
diff --git a/arch/x86/include/asm/tsc.h b/arch/x86/include/asm/tsc.h
index 94605c0e9cee..fd11128faf25 100644
--- a/arch/x86/include/asm/tsc.h
+++ b/arch/x86/include/asm/tsc.h
@@ -42,7 +42,7 @@ static __always_inline cycles_t vget_cycles(void)
 	if (!cpu_has_tsc)
 		return 0;
 #endif
-	return (cycles_t)__native_read_tsc();
+	return (cycles_t)native_read_tsc();
 }
 
 extern void tsc_init(void);
diff --git a/arch/x86/kernel/apb_timer.c b/arch/x86/kernel/apb_timer.c
index ede92c3364d3..9fe111cc50f8 100644
--- a/arch/x86/kernel/apb_timer.c
+++ b/arch/x86/kernel/apb_timer.c
@@ -390,13 +390,13 @@ unsigned long apbt_quick_calibrate(void)
 	old = dw_apb_clocksource_read(clocksource_apbt);
 	old += loop;
 
-	t1 = __native_read_tsc();
+	t1 = native_read_tsc();
 
 	do {
 		new = dw_apb_clocksource_read(clocksource_apbt);
 	} while (new < old);
 
-	t2 = __native_read_tsc();
+	t2 = native_read_tsc();
 
 	shift = 5;
 	if (unlikely(loop >> shift == 0)) {
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 505449700e0c..e7710cd7ba00 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -308,12 +308,6 @@ unsigned long long
 sched_clock(void) __attribute__((alias("native_sched_clock")));
 #endif
 
-unsigned long long native_read_tsc(void)
-{
-	return __native_read_tsc();
-}
-EXPORT_SYMBOL(native_read_tsc);
-
 int check_tsc_unstable(void)
 {
 	return tsc_unstable;
-- 
2.4.2


* [PATCH v3 02/18] x86/msr/kvm: Remove vget_cycles()
  2015-06-17  0:35 [PATCH v3 00/18] x86/tsc: Clean up rdtsc helpers Andy Lutomirski
  2015-06-17  0:35 ` [PATCH v3 01/18] x86/tsc: Inline native_read_tsc and remove __native_read_tsc Andy Lutomirski
@ 2015-06-17  0:35 ` Andy Lutomirski
  2015-06-17  9:42   ` Borislav Petkov
                     ` (2 more replies)
  2015-06-17  0:35 ` [PATCH v3 03/18] x86/tsc/paravirt: Remove the read_tsc and read_tscp paravirt hooks Andy Lutomirski
                   ` (16 subsequent siblings)
  18 siblings, 3 replies; 57+ messages in thread
From: Andy Lutomirski @ 2015-06-17  0:35 UTC (permalink / raw)
  To: x86
  Cc: Borislav Petkov, Peter Zijlstra, John Stultz, linux-kernel,
	Len Brown, Huang Rui, Denys Vlasenko, kvm, Ralf Baechle,
	Andy Lutomirski

The only caller was kvm's read_tsc.  The only difference between
vget_cycles and native_read_tsc was that vget_cycles returned zero
instead of crashing on TSC-less systems.  KVM already checks
vclock_mode before calling that function, so the extra check is
unnecessary.

(Off-topic, but the whole KVM clock host implementation is gross.
 IMO it should be rewritten.)

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/include/asm/tsc.h | 13 -------------
 arch/x86/kvm/x86.c         |  2 +-
 2 files changed, 1 insertion(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/tsc.h b/arch/x86/include/asm/tsc.h
index fd11128faf25..3da1cc1218ac 100644
--- a/arch/x86/include/asm/tsc.h
+++ b/arch/x86/include/asm/tsc.h
@@ -32,19 +32,6 @@ static inline cycles_t get_cycles(void)
 	return ret;
 }
 
-static __always_inline cycles_t vget_cycles(void)
-{
-	/*
-	 * We only do VDSOs on TSC capable CPUs, so this shouldn't
-	 * access boot_cpu_data (which is not VDSO-safe):
-	 */
-#ifndef CONFIG_X86_TSC
-	if (!cpu_has_tsc)
-		return 0;
-#endif
-	return (cycles_t)native_read_tsc();
-}
-
 extern void tsc_init(void);
 extern void mark_tsc_unstable(char *reason);
 extern int unsynchronized_tsc(void);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 26eaeb522cab..c26faf408bce 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1430,7 +1430,7 @@ static cycle_t read_tsc(void)
 	 * but no one has ever seen it happen.
 	 */
 	rdtsc_barrier();
-	ret = (cycle_t)vget_cycles();
+	ret = (cycle_t)native_read_tsc();
 
 	last = pvclock_gtod_data.clock.cycle_last;
 
-- 
2.4.2


* [PATCH v3 03/18] x86/tsc/paravirt: Remove the read_tsc and read_tscp paravirt hooks
  2015-06-17  0:35 [PATCH v3 00/18] x86/tsc: Clean up rdtsc helpers Andy Lutomirski
  2015-06-17  0:35 ` [PATCH v3 01/18] x86/tsc: Inline native_read_tsc and remove __native_read_tsc Andy Lutomirski
  2015-06-17  0:35 ` [PATCH v3 02/18] x86/msr/kvm: Remove vget_cycles() Andy Lutomirski
@ 2015-06-17  0:35 ` Andy Lutomirski
  2015-06-17  9:56   ` Borislav Petkov
                     ` (2 more replies)
  2015-06-17  0:35 ` [PATCH v3 04/18] x86/tsc: Replace rdtscll with native_read_tsc Andy Lutomirski
                   ` (15 subsequent siblings)
  18 siblings, 3 replies; 57+ messages in thread
From: Andy Lutomirski @ 2015-06-17  0:35 UTC (permalink / raw)
  To: x86
  Cc: Borislav Petkov, Peter Zijlstra, John Stultz, linux-kernel,
	Len Brown, Huang Rui, Denys Vlasenko, kvm, Ralf Baechle,
	Andy Lutomirski

We've had read_tsc and read_tscp paravirt hooks since the very
beginning of paravirt, i.e., d3561b7fa0fb ("[PATCH] paravirt: header
and stubs for paravirtualisation").  AFAICT the only paravirt guest
implementation that ever replaced these calls was vmware, and it's
gone.  Arguably even vmware shouldn't have hooked rdtsc -- we fully
support systems that don't have a TSC at all, so there's no point
for a paravirt implementation to pretend that we have a TSC but to
replace it.

I also doubt that these hooks actually worked.  Calls to rdtscl and
rdtscll, which respected the hooks, were used seemingly
interchangeably with native_read_tsc, which did not.
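
The problem with mixing the two styles can be shown with a small
model.  This is a plain C sketch, not the paravirt code: the ops
table, the macro and both counter functions are hypothetical
stand-ins, and the constants are arbitrary.

```c
#include <assert.h>
#include <stdint.h>

/* Stands in for the real RDTSC-based reader. */
static uint64_t native_counter_sketch(void) { return 1000; }

/* Stands in for a guest-provided replacement. */
static uint64_t hooked_counter_sketch(void) { return 42; }

struct cpu_ops_sketch {
	uint64_t (*read_tsc)(void);
};

static struct cpu_ops_sketch ops_sketch = {
	.read_tsc = native_counter_sketch,
};

/* Like the old paravirt rdtscll(): dispatches through the hook. */
#define rdtscll_sketch(val) ((val) = ops_sketch.read_tsc())

static void hook_bypass_demo(void)
{
	uint64_t via_hook, direct;

	/* A guest installs its own reader ... */
	ops_sketch.read_tsc = hooked_counter_sketch;

	/* ... which macro users see, but direct callers never do. */
	rdtscll_sketch(via_hook);
	direct = native_counter_sketch();

	assert(via_hook == 42);
	assert(direct == 1000);
}
```

Any guest that relied on the hook would get a mix of both values,
depending on which helper a given call site happened to use.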

Just remove them.  If anyone ever needs them again, they can try
to make a case for why they need them.

Before, on a paravirt config:
   text	   data	    bss	    dec	    hex	filename
13426505	1827056	14508032	29761593	1c62039	vmlinux

After:
   text	   data	    bss	    dec	    hex	filename
13426617	1827056	14508032	29761705	1c620a9	vmlinux

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/include/asm/msr.h            | 16 ++++++++--------
 arch/x86/include/asm/paravirt.h       | 34 ----------------------------------
 arch/x86/include/asm/paravirt_types.h |  2 --
 arch/x86/kernel/paravirt.c            |  2 --
 arch/x86/kernel/paravirt_patch_32.c   |  2 --
 arch/x86/xen/enlighten.c              |  3 ---
 6 files changed, 8 insertions(+), 51 deletions(-)

diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
index 88711470af7f..d1afac7df484 100644
--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -178,12 +178,6 @@ static inline int rdmsrl_safe(unsigned msr, unsigned long long *p)
 	return err;
 }
 
-#define rdtscl(low)						\
-	((low) = (u32)native_read_tsc())
-
-#define rdtscll(val)						\
-	((val) = native_read_tsc())
-
 #define rdpmc(counter, low, high)			\
 do {							\
 	u64 _l = native_read_pmc((counter));		\
@@ -193,6 +187,14 @@ do {							\
 
 #define rdpmcl(counter, val) ((val) = native_read_pmc(counter))
 
+#endif	/* !CONFIG_PARAVIRT */
+
+#define rdtscl(low)						\
+	((low) = (u32)native_read_tsc())
+
+#define rdtscll(val)						\
+	((val) = native_read_tsc())
+
 #define rdtscp(low, high, aux)					\
 do {                                                            \
 	unsigned long long _val = native_read_tscp(&(aux));     \
@@ -202,8 +204,6 @@ do {                                                            \
 
 #define rdtscpll(val, aux) (val) = native_read_tscp(&(aux))
 
-#endif	/* !CONFIG_PARAVIRT */
-
 /*
  * 64-bit version of wrmsr_safe():
  */
diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index d143bfad45d7..c2be0375bcad 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -174,19 +174,6 @@ static inline int rdmsrl_safe(unsigned msr, unsigned long long *p)
 	return err;
 }
 
-static inline u64 paravirt_read_tsc(void)
-{
-	return PVOP_CALL0(u64, pv_cpu_ops.read_tsc);
-}
-
-#define rdtscl(low)				\
-do {						\
-	u64 _l = paravirt_read_tsc();		\
-	low = (int)_l;				\
-} while (0)
-
-#define rdtscll(val) (val = paravirt_read_tsc())
-
 static inline unsigned long long paravirt_sched_clock(void)
 {
 	return PVOP_CALL0(unsigned long long, pv_time_ops.sched_clock);
@@ -215,27 +202,6 @@ do {						\
 
 #define rdpmcl(counter, val) ((val) = paravirt_read_pmc(counter))
 
-static inline unsigned long long paravirt_rdtscp(unsigned int *aux)
-{
-	return PVOP_CALL1(u64, pv_cpu_ops.read_tscp, aux);
-}
-
-#define rdtscp(low, high, aux)				\
-do {							\
-	int __aux;					\
-	unsigned long __val = paravirt_rdtscp(&__aux);	\
-	(low) = (u32)__val;				\
-	(high) = (u32)(__val >> 32);			\
-	(aux) = __aux;					\
-} while (0)
-
-#define rdtscpll(val, aux)				\
-do {							\
-	unsigned long __aux; 				\
-	val = paravirt_rdtscp(&__aux);			\
-	(aux) = __aux;					\
-} while (0)
-
 static inline void paravirt_alloc_ldt(struct desc_struct *ldt, unsigned entries)
 {
 	PVOP_VCALL2(pv_cpu_ops.alloc_ldt, ldt, entries);
diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index a6b8f9fadb06..ce029e4fa7c6 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -156,9 +156,7 @@ struct pv_cpu_ops {
 	u64 (*read_msr)(unsigned int msr, int *err);
 	int (*write_msr)(unsigned int msr, unsigned low, unsigned high);
 
-	u64 (*read_tsc)(void);
 	u64 (*read_pmc)(int counter);
-	unsigned long long (*read_tscp)(unsigned int *aux);
 
 #ifdef CONFIG_X86_32
 	/*
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index 58bcfb67c01f..f68e48f5f6c2 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -351,9 +351,7 @@ __visible struct pv_cpu_ops pv_cpu_ops = {
 	.wbinvd = native_wbinvd,
 	.read_msr = native_read_msr_safe,
 	.write_msr = native_write_msr_safe,
-	.read_tsc = native_read_tsc,
 	.read_pmc = native_read_pmc,
-	.read_tscp = native_read_tscp,
 	.load_tr_desc = native_load_tr_desc,
 	.set_ldt = native_set_ldt,
 	.load_gdt = native_load_gdt,
diff --git a/arch/x86/kernel/paravirt_patch_32.c b/arch/x86/kernel/paravirt_patch_32.c
index e1b013696dde..c89f50a76e97 100644
--- a/arch/x86/kernel/paravirt_patch_32.c
+++ b/arch/x86/kernel/paravirt_patch_32.c
@@ -10,7 +10,6 @@ DEF_NATIVE(pv_mmu_ops, read_cr2, "mov %cr2, %eax");
 DEF_NATIVE(pv_mmu_ops, write_cr3, "mov %eax, %cr3");
 DEF_NATIVE(pv_mmu_ops, read_cr3, "mov %cr3, %eax");
 DEF_NATIVE(pv_cpu_ops, clts, "clts");
-DEF_NATIVE(pv_cpu_ops, read_tsc, "rdtsc");
 
 #if defined(CONFIG_PARAVIRT_SPINLOCKS) && defined(CONFIG_QUEUED_SPINLOCKS)
 DEF_NATIVE(pv_lock_ops, queued_spin_unlock, "movb $0, (%eax)");
@@ -52,7 +51,6 @@ unsigned native_patch(u8 type, u16 clobbers, void *ibuf,
 		PATCH_SITE(pv_mmu_ops, read_cr3);
 		PATCH_SITE(pv_mmu_ops, write_cr3);
 		PATCH_SITE(pv_cpu_ops, clts);
-		PATCH_SITE(pv_cpu_ops, read_tsc);
 #if defined(CONFIG_PARAVIRT_SPINLOCKS) && defined(CONFIG_QUEUED_SPINLOCKS)
 		case PARAVIRT_PATCH(pv_lock_ops.queued_spin_unlock):
 			if (pv_is_native_spin_unlock()) {
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 0b95c9b8283f..32136bfca43f 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -1175,11 +1175,8 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = {
 	.read_msr = xen_read_msr_safe,
 	.write_msr = xen_write_msr_safe,
 
-	.read_tsc = native_read_tsc,
 	.read_pmc = native_read_pmc,
 
-	.read_tscp = native_read_tscp,
-
 	.iret = xen_iret,
 #ifdef CONFIG_X86_64
 	.usergs_sysret32 = xen_sysret32,
-- 
2.4.2


* [PATCH v3 04/18] x86/tsc: Replace rdtscll with native_read_tsc
  2015-06-17  0:35 [PATCH v3 00/18] x86/tsc: Clean up rdtsc helpers Andy Lutomirski
                   ` (2 preceding siblings ...)
  2015-06-17  0:35 ` [PATCH v3 03/18] x86/tsc/paravirt: Remove the read_tsc and read_tscp paravirt hooks Andy Lutomirski
@ 2015-06-17  0:35 ` Andy Lutomirski
  2015-06-17 10:03   ` Borislav Petkov
  2015-07-06 15:40   ` [tip:x86/asm] x86/asm/tsc: Replace rdtscll() with native_read_tsc () tip-bot for Andy Lutomirski
  2015-06-17  0:35 ` [PATCH v3 05/18] x86/tsc: Remove the rdtscp and rdtscpll macros Andy Lutomirski
                   ` (14 subsequent siblings)
  18 siblings, 2 replies; 57+ messages in thread
From: Andy Lutomirski @ 2015-06-17  0:35 UTC (permalink / raw)
  To: x86
  Cc: Borislav Petkov, Peter Zijlstra, John Stultz, linux-kernel,
	Len Brown, Huang Rui, Denys Vlasenko, kvm, Ralf Baechle,
	Andy Lutomirski

Now that the read_tsc paravirt hook is gone, rdtscll() is just a
wrapper around native_read_tsc().  Unwrap it.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/boot/compressed/aslr.c                      | 2 +-
 arch/x86/include/asm/msr.h                           | 3 ---
 arch/x86/include/asm/tsc.h                           | 5 +----
 arch/x86/kernel/apb_timer.c                          | 4 ++--
 arch/x86/kernel/apic/apic.c                          | 8 ++++----
 arch/x86/kernel/cpu/mcheck/mce.c                     | 4 ++--
 arch/x86/kernel/espfix_64.c                          | 2 +-
 arch/x86/kernel/hpet.c                               | 4 ++--
 arch/x86/kernel/trace_clock.c                        | 2 +-
 arch/x86/kernel/tsc.c                                | 4 ++--
 arch/x86/kvm/vmx.c                                   | 2 +-
 arch/x86/lib/delay.c                                 | 2 +-
 drivers/thermal/intel_powerclamp.c                   | 4 ++--
 tools/power/cpupower/debug/kernel/cpufreq-test_tsc.c | 4 ++--
 14 files changed, 22 insertions(+), 28 deletions(-)

diff --git a/arch/x86/boot/compressed/aslr.c b/arch/x86/boot/compressed/aslr.c
index d7b1f655b3ef..ea33236190b1 100644
--- a/arch/x86/boot/compressed/aslr.c
+++ b/arch/x86/boot/compressed/aslr.c
@@ -82,7 +82,7 @@ static unsigned long get_random_long(void)
 
 	if (has_cpuflag(X86_FEATURE_TSC)) {
 		debug_putstr(" RDTSC");
-		rdtscll(raw);
+		raw = native_read_tsc();
 
 		random ^= raw;
 		use_i8254 = false;
diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
index d1afac7df484..7273b74e0f99 100644
--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -192,9 +192,6 @@ do {							\
 #define rdtscl(low)						\
 	((low) = (u32)native_read_tsc())
 
-#define rdtscll(val)						\
-	((val) = native_read_tsc())
-
 #define rdtscp(low, high, aux)					\
 do {                                                            \
 	unsigned long long _val = native_read_tscp(&(aux));     \
diff --git a/arch/x86/include/asm/tsc.h b/arch/x86/include/asm/tsc.h
index 3da1cc1218ac..b4883902948b 100644
--- a/arch/x86/include/asm/tsc.h
+++ b/arch/x86/include/asm/tsc.h
@@ -21,15 +21,12 @@ extern void disable_TSC(void);
 
 static inline cycles_t get_cycles(void)
 {
-	unsigned long long ret = 0;
-
 #ifndef CONFIG_X86_TSC
 	if (!cpu_has_tsc)
 		return 0;
 #endif
-	rdtscll(ret);
 
-	return ret;
+	return native_read_tsc();
 }
 
 extern void tsc_init(void);
diff --git a/arch/x86/kernel/apb_timer.c b/arch/x86/kernel/apb_timer.c
index 9fe111cc50f8..25efa534c4e4 100644
--- a/arch/x86/kernel/apb_timer.c
+++ b/arch/x86/kernel/apb_timer.c
@@ -263,7 +263,7 @@ static int apbt_clocksource_register(void)
 
 	/* Verify whether apbt counter works */
 	t1 = dw_apb_clocksource_read(clocksource_apbt);
-	rdtscll(start);
+	start = native_read_tsc();
 
 	/*
 	 * We don't know the TSC frequency yet, but waiting for
@@ -273,7 +273,7 @@ static int apbt_clocksource_register(void)
 	 */
 	do {
 		rep_nop();
-		rdtscll(now);
+		now = native_read_tsc();
 	} while ((now - start) < 200000UL);
 
 	/* APBT is the only always on clocksource, it has to work! */
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index dcb52850a28f..51af1ed1ae2e 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -457,7 +457,7 @@ static int lapic_next_deadline(unsigned long delta,
 {
 	u64 tsc;
 
-	rdtscll(tsc);
+	tsc = native_read_tsc();
 	wrmsrl(MSR_IA32_TSC_DEADLINE, tsc + (((u64) delta) * TSC_DIVISOR));
 	return 0;
 }
@@ -592,7 +592,7 @@ static void __init lapic_cal_handler(struct clock_event_device *dev)
 	unsigned long pm = acpi_pm_read_early();
 
 	if (cpu_has_tsc)
-		rdtscll(tsc);
+		tsc = native_read_tsc();
 
 	switch (lapic_cal_loops++) {
 	case 0:
@@ -1209,7 +1209,7 @@ void setup_local_APIC(void)
 	long long max_loops = cpu_khz ? cpu_khz : 1000000;
 
 	if (cpu_has_tsc)
-		rdtscll(tsc);
+		tsc = native_read_tsc();
 
 	if (disable_apic) {
 		disable_ioapic_support();
@@ -1293,7 +1293,7 @@ void setup_local_APIC(void)
 		}
 		if (queued) {
 			if (cpu_has_tsc && cpu_khz) {
-				rdtscll(ntsc);
+				ntsc = native_read_tsc();
 				max_loops = (cpu_khz << 10) - (ntsc - tsc);
 			} else
 				max_loops--;
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index df919ff103c3..a5283d2d0094 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -125,7 +125,7 @@ void mce_setup(struct mce *m)
 {
 	memset(m, 0, sizeof(struct mce));
 	m->cpu = m->extcpu = smp_processor_id();
-	rdtscll(m->tsc);
+	m->tsc = native_read_tsc();
 	/* We hope get_seconds stays lockless */
 	m->time = get_seconds();
 	m->cpuvendor = boot_cpu_data.x86_vendor;
@@ -1784,7 +1784,7 @@ static void collect_tscs(void *data)
 {
 	unsigned long *cpu_tsc = (unsigned long *)data;
 
-	rdtscll(cpu_tsc[smp_processor_id()]);
+	cpu_tsc[smp_processor_id()] = native_read_tsc();
 }
 
 static int mce_apei_read_done;
diff --git a/arch/x86/kernel/espfix_64.c b/arch/x86/kernel/espfix_64.c
index f5d0730e7b08..334a2a9c034d 100644
--- a/arch/x86/kernel/espfix_64.c
+++ b/arch/x86/kernel/espfix_64.c
@@ -110,7 +110,7 @@ static void init_espfix_random(void)
 	 */
 	if (!arch_get_random_long(&rand)) {
 		/* The constant is an arbitrary large prime */
-		rdtscll(rand);
+		rand = native_read_tsc();
 		rand *= 0xc345c6b72fd16123UL;
 	}
 
diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
index e2449cf38b06..ccf677cd9adc 100644
--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -734,7 +734,7 @@ static int hpet_clocksource_register(void)
 
 	/* Verify whether hpet counter works */
 	t1 = hpet_readl(HPET_COUNTER);
-	rdtscll(start);
+	start = native_read_tsc();
 
 	/*
 	 * We don't know the TSC frequency yet, but waiting for
@@ -744,7 +744,7 @@ static int hpet_clocksource_register(void)
 	 */
 	do {
 		rep_nop();
-		rdtscll(now);
+		now = native_read_tsc();
 	} while ((now - start) < 200000UL);
 
 	if (t1 == hpet_readl(HPET_COUNTER)) {
diff --git a/arch/x86/kernel/trace_clock.c b/arch/x86/kernel/trace_clock.c
index 25b993729f9b..bd8f4d41bd56 100644
--- a/arch/x86/kernel/trace_clock.c
+++ b/arch/x86/kernel/trace_clock.c
@@ -15,7 +15,7 @@ u64 notrace trace_clock_x86_tsc(void)
 	u64 ret;
 
 	rdtsc_barrier();
-	rdtscll(ret);
+	ret = native_read_tsc();
 
 	return ret;
 }
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index e7710cd7ba00..e66f5dcaeb63 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -248,7 +248,7 @@ static void set_cyc2ns_scale(unsigned long cpu_khz, int cpu)
 
 	data = cyc2ns_write_begin(cpu);
 
-	rdtscll(tsc_now);
+	tsc_now = native_read_tsc();
 	ns_now = cycles_2_ns(tsc_now);
 
 	/*
@@ -290,7 +290,7 @@ u64 native_sched_clock(void)
 	}
 
 	/* read the Time Stamp Counter: */
-	rdtscll(tsc_now);
+	tsc_now = native_read_tsc();
 
 	/* return the value in ns */
 	return cycles_2_ns(tsc_now);
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index e11dd59398f1..fcff42100948 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2236,7 +2236,7 @@ static u64 guest_read_tsc(void)
 {
 	u64 host_tsc, tsc_offset;
 
-	rdtscll(host_tsc);
+	host_tsc = native_read_tsc();
 	tsc_offset = vmcs_read64(TSC_OFFSET);
 	return host_tsc + tsc_offset;
 }
diff --git a/arch/x86/lib/delay.c b/arch/x86/lib/delay.c
index 39d6a3db0b96..9a52ad0c0758 100644
--- a/arch/x86/lib/delay.c
+++ b/arch/x86/lib/delay.c
@@ -100,7 +100,7 @@ void use_tsc_delay(void)
 int read_current_timer(unsigned long *timer_val)
 {
 	if (delay_fn == delay_tsc) {
-		rdtscll(*timer_val);
+		*timer_val = native_read_tsc();
 		return 0;
 	}
 	return -1;
diff --git a/drivers/thermal/intel_powerclamp.c b/drivers/thermal/intel_powerclamp.c
index 725718e97a0b..933c5e599d1d 100644
--- a/drivers/thermal/intel_powerclamp.c
+++ b/drivers/thermal/intel_powerclamp.c
@@ -340,7 +340,7 @@ static bool powerclamp_adjust_controls(unsigned int target_ratio,
 
 	/* check result for the last window */
 	msr_now = pkg_state_counter();
-	rdtscll(tsc_now);
+	tsc_now = native_read_tsc();
 
 	/* calculate pkg cstate vs tsc ratio */
 	if (!msr_last || !tsc_last)
@@ -482,7 +482,7 @@ static void poll_pkg_cstate(struct work_struct *dummy)
 	u64 val64;
 
 	msr_now = pkg_state_counter();
-	rdtscll(tsc_now);
+	tsc_now = native_read_tsc();
 	jiffies_now = jiffies;
 
 	/* calculate pkg cstate vs tsc ratio */
diff --git a/tools/power/cpupower/debug/kernel/cpufreq-test_tsc.c b/tools/power/cpupower/debug/kernel/cpufreq-test_tsc.c
index 5224ee5b392d..f02b0c0bff9b 100644
--- a/tools/power/cpupower/debug/kernel/cpufreq-test_tsc.c
+++ b/tools/power/cpupower/debug/kernel/cpufreq-test_tsc.c
@@ -81,11 +81,11 @@ static int __init cpufreq_test_tsc(void)
 
 	printk(KERN_DEBUG "start--> \n");
 	then = read_pmtmr();
-        rdtscll(then_tsc);
+	then_tsc = native_read_tsc();
 	for (i=0;i<20;i++) {
 		mdelay(100);
 		now = read_pmtmr();
-		rdtscll(now_tsc);
+		now_tsc = native_read_tsc();
 		diff = (now - then) & 0xFFFFFF;
 		diff_tsc = now_tsc - then_tsc;
 		printk(KERN_DEBUG "t1: %08u t2: %08u diff_pmtmr: %08u diff_tsc: %016llu\n", then, now, diff, diff_tsc);
-- 
2.4.2


* [PATCH v3 05/18] x86/tsc: Remove the rdtscp and rdtscpll macros
  2015-06-17  0:35 [PATCH v3 00/18] x86/tsc: Clean up rdtsc helpers Andy Lutomirski
                   ` (3 preceding siblings ...)
  2015-06-17  0:35 ` [PATCH v3 04/18] x86/tsc: Replace rdtscll with native_read_tsc Andy Lutomirski
@ 2015-06-17  0:35 ` Andy Lutomirski
  2015-07-06 15:41   ` [tip:x86/asm] x86/asm/tsc: Remove the rdtscp() and rdtscpll() macros tip-bot for Andy Lutomirski
  2015-06-17  0:35 ` [PATCH v3 06/18] x86/tsc: Use the full 64-bit tsc in tsc_delay Andy Lutomirski
                   ` (13 subsequent siblings)
  18 siblings, 1 reply; 57+ messages in thread
From: Andy Lutomirski @ 2015-06-17  0:35 UTC (permalink / raw)
  To: x86
  Cc: Borislav Petkov, Peter Zijlstra, John Stultz, linux-kernel,
	Len Brown, Huang Rui, Denys Vlasenko, kvm, Ralf Baechle,
	Andy Lutomirski

They have no users.  Leave native_read_tscp, which seems potentially
useful despite also having no callers.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/include/asm/msr.h | 9 ---------
 1 file changed, 9 deletions(-)

diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
index 7273b74e0f99..626f78199665 100644
--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -192,15 +192,6 @@ do {							\
 #define rdtscl(low)						\
 	((low) = (u32)native_read_tsc())
 
-#define rdtscp(low, high, aux)					\
-do {                                                            \
-	unsigned long long _val = native_read_tscp(&(aux));     \
-	(low) = (u32)_val;                                      \
-	(high) = (u32)(_val >> 32);                             \
-} while (0)
-
-#define rdtscpll(val, aux) (val) = native_read_tscp(&(aux))
-
 /*
  * 64-bit version of wrmsr_safe():
  */
-- 
2.4.2


* [PATCH v3 06/18] x86/tsc: Use the full 64-bit tsc in tsc_delay
  2015-06-17  0:35 [PATCH v3 00/18] x86/tsc: Clean up rdtsc helpers Andy Lutomirski
                   ` (4 preceding siblings ...)
  2015-06-17  0:35 ` [PATCH v3 05/18] x86/tsc: Remove the rdtscp and rdtscpll macros Andy Lutomirski
@ 2015-06-17  0:35 ` Andy Lutomirski
  2015-07-06 15:41   ` [tip:x86/asm] x86/asm/tsc: Use the full 64-bit TSC in delay_tsc() tip-bot for Andy Lutomirski
  2015-06-17  0:35 ` [PATCH v3 07/18] x86/cpu/amd: Use the full 64-bit TSC to detect the 2.6.2 bug Andy Lutomirski
                   ` (12 subsequent siblings)
  18 siblings, 1 reply; 57+ messages in thread
From: Andy Lutomirski @ 2015-06-17  0:35 UTC (permalink / raw)
  To: x86
  Cc: Borislav Petkov, Peter Zijlstra, John Stultz, linux-kernel,
	Len Brown, Huang Rui, Denys Vlasenko, kvm, Ralf Baechle,
	Andy Lutomirski

As a very minor optimization, delay_tsc() was only using the low 32
bits of the TSC.  It's a delay function, so the saving is negligible;
just use the whole thing.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
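
For illustration only, here is a user-space sketch of the arithmetic
involved.  Unsigned subtraction gives the right elapsed count across a
counter wrap at either width, but a 32-bit delta can only represent
intervals shorter than 2^32 cycles.  The helper names (elapsed32,
elapsed64, delay_done) are hypothetical and are not kernel functions.

```c
#include <stdint.h>

/*
 * Elapsed cycles using only the low 32 bits of the counter.
 * Modular arithmetic keeps this correct across a wrap, but only
 * while fewer than 2^32 cycles have actually passed.
 */
static inline uint32_t elapsed32(uint32_t start, uint32_t now)
{
	return now - start;
}

/* Elapsed cycles using the full 64-bit counter value. */
static inline uint64_t elapsed64(uint64_t start, uint64_t now)
{
	return now - start;
}

/* The exit test of a delay loop, written against 64-bit values. */
static inline int delay_done(uint64_t start, uint64_t now, uint64_t loops)
{
	return (now - start) >= loops;
}
```

With start = 0x10 and now = 0x100000030, elapsed64() returns
0x100000020, while elapsed32() on the truncated values returns 0x20 and
silently drops 2^32 cycles.
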
 arch/x86/lib/delay.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/lib/delay.c b/arch/x86/lib/delay.c
index 9a52ad0c0758..35115f3786a9 100644
--- a/arch/x86/lib/delay.c
+++ b/arch/x86/lib/delay.c
@@ -49,16 +49,16 @@ static void delay_loop(unsigned long loops)
 /* TSC based delay: */
 static void delay_tsc(unsigned long __loops)
 {
-	u32 bclock, now, loops = __loops;
+	u64 bclock, now, loops = __loops;
 	int cpu;
 
 	preempt_disable();
 	cpu = smp_processor_id();
 	rdtsc_barrier();
-	rdtscl(bclock);
+	bclock = native_read_tsc();
 	for (;;) {
 		rdtsc_barrier();
-		rdtscl(now);
+		now = native_read_tsc();
 		if ((now - bclock) >= loops)
 			break;
 
@@ -80,7 +80,7 @@ static void delay_tsc(unsigned long __loops)
 			loops -= (now - bclock);
 			cpu = smp_processor_id();
 			rdtsc_barrier();
-			rdtscl(bclock);
+			bclock = native_read_tsc();
 		}
 	}
 	preempt_enable();
-- 
2.4.2



* [PATCH v3 07/18] x86/cpu/amd: Use the full 64-bit TSC to detect the 2.6.2 bug
  2015-06-17  0:35 [PATCH v3 00/18] x86/tsc: Clean up rdtsc helpers Andy Lutomirski
                   ` (5 preceding siblings ...)
  2015-06-17  0:35 ` [PATCH v3 06/18] x86/tsc: Use the full 64-bit tsc in tsc_delay Andy Lutomirski
@ 2015-06-17  0:35 ` Andy Lutomirski
  2015-07-06 15:41   ` [tip:x86/asm] x86/asm/tsc, " tip-bot for Andy Lutomirski
  2015-06-17  0:35 ` [PATCH v3 08/18] baycom_epp: Replace rdtscl() with native_read_tsc() Andy Lutomirski
                   ` (11 subsequent siblings)
  18 siblings, 1 reply; 57+ messages in thread
From: Andy Lutomirski @ 2015-06-17  0:35 UTC (permalink / raw)
  To: x86
  Cc: Borislav Petkov, Peter Zijlstra, John Stultz, linux-kernel,
	Len Brown, Huang Rui, Denys Vlasenko, kvm, Ralf Baechle,
	Andy Lutomirski

This code is timing one million indirect calls (K6_BUG_LOOP), so the
added overhead of counting the number of cycles elapsed as a 64-bit
number should be insignificant.  Drop the optimization of using a
32-bit count.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
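
As a rough sketch of the measurement pattern (not the kernel code): the
counter is read once before and once after the loop, so the cost of
handling a 64-bit value is paid twice in total and amortized over every
call.  The names time_indirect_calls, counter_fn, fake_counter,
fake_read and fake_work are hypothetical; the fake counter stands in
for the TSC so the arithmetic can be checked in user space.

```c
#include <stdint.h>

typedef uint64_t (*counter_fn)(void);

/* Time n calls through a function pointer; return elapsed counter ticks. */
static uint64_t time_indirect_calls(void (*fn)(void), int n,
				    counter_fn read_counter)
{
	uint64_t start, end;

	start = read_counter();
	while (n--)
		fn();
	end = read_counter();

	return end - start;
}

/* Hypothetical stand-ins: a software counter that advances 7 ticks per call. */
static uint64_t fake_counter;

static uint64_t fake_read(void)
{
	return fake_counter;
}

static void fake_work(void)
{
	fake_counter += 7;
}
```

Calling time_indirect_calls(fake_work, 1000, fake_read) returns 7000,
that is, 7 ticks for each of the 1000 calls.
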
 arch/x86/kernel/cpu/amd.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 5bd3a99dc20b..c5ceec532799 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -107,7 +107,7 @@ static void init_amd_k6(struct cpuinfo_x86 *c)
 		const int K6_BUG_LOOP = 1000000;
 		int n;
 		void (*f_vide)(void);
-		unsigned long d, d2;
+		u64 d, d2;
 
 		printk(KERN_INFO "AMD K6 stepping B detected - ");
 
@@ -118,10 +118,10 @@ static void init_amd_k6(struct cpuinfo_x86 *c)
 
 		n = K6_BUG_LOOP;
 		f_vide = vide;
-		rdtscl(d);
+		d = native_read_tsc();
 		while (n--)
 			f_vide();
-		rdtscl(d2);
+		d2 = native_read_tsc();
 		d = d2-d;
 
 		if (d > 20*K6_BUG_LOOP)
-- 
2.4.2


* [PATCH v3 08/18] baycom_epp: Replace rdtscl() with native_read_tsc()
  2015-06-17  0:35 [PATCH v3 00/18] x86/tsc: Clean up rdtsc helpers Andy Lutomirski
                   ` (6 preceding siblings ...)
  2015-06-17  0:35 ` [PATCH v3 07/18] x86/cpu/amd: Use the full 64-bit TSC to detect the 2.6.2 bug Andy Lutomirski
@ 2015-06-17  0:35 ` Andy Lutomirski
  2015-06-17  0:49   ` Thomas Sailer
                     ` (2 more replies)
  2015-06-17  0:35 ` [PATCH v3 09/18] staging/lirc_serial: Remove TSC-based timing Andy Lutomirski
                   ` (10 subsequent siblings)
  18 siblings, 3 replies; 57+ messages in thread
From: Andy Lutomirski @ 2015-06-17  0:35 UTC (permalink / raw)
  To: x86
  Cc: Borislav Petkov, Peter Zijlstra, John Stultz, linux-kernel,
	Len Brown, Huang Rui, Denys Vlasenko, kvm, Ralf Baechle,
	Andy Lutomirski, walter harms, Thomas Sailer, linux-hams

This is only used if BAYCOM_DEBUG is defined.

Cc: walter harms <wharms@bfs.de>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Thomas Sailer <t.sailer@alumni.ethz.ch>
Cc: linux-hams@vger.kernel.org
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---

I'm hoping for an ack for this to go through -tip.

 drivers/net/hamradio/baycom_epp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/hamradio/baycom_epp.c b/drivers/net/hamradio/baycom_epp.c
index 83c7cce0d172..44e5c3b5e0af 100644
--- a/drivers/net/hamradio/baycom_epp.c
+++ b/drivers/net/hamradio/baycom_epp.c
@@ -638,7 +638,7 @@ static int receive(struct net_device *dev, int cnt)
 #define GETTICK(x)                                                \
 ({                                                                \
 	if (cpu_has_tsc)                                          \
-		rdtscl(x);                                        \
+		x = (unsigned int)native_read_tsc();		  \
 })
 #else /* __i386__ */
 #define GETTICK(x)
-- 
2.4.2



* [PATCH v3 09/18] staging/lirc_serial: Remove TSC-based timing
  2015-06-17  0:35 [PATCH v3 00/18] x86/tsc: Clean up rdtsc helpers Andy Lutomirski
                   ` (7 preceding siblings ...)
  2015-06-17  0:35 ` [PATCH v3 08/18] baycom_epp: Replace rdtscl() with native_read_tsc() Andy Lutomirski
@ 2015-06-17  0:35 ` Andy Lutomirski
  2015-07-06 15:42   ` [tip:x86/asm] x86/asm/tsc, " tip-bot for Andy Lutomirski
  2015-06-17  0:35 ` [PATCH v3 10/18] input/joystick/analog: Switch from rdtscl() to native_read_tsc() Andy Lutomirski
                   ` (9 subsequent siblings)
  18 siblings, 1 reply; 57+ messages in thread
From: Andy Lutomirski @ 2015-06-17  0:35 UTC (permalink / raw)
  To: x86
  Cc: devel, Denys Vlasenko, kvm, Peter Zijlstra, Greg Kroah-Hartman,
	Jarod Wilson, linux-kernel, Ralf Baechle, John Stultz,
	Andy Lutomirski, Borislav Petkov, Len Brown

The TSC-based timing code (USE_RDTSC) wasn't compiled in by default.
I suspect that the driver was and still is broken, though -- it's
calling udelay with a parameter that's derived from loops_per_jiffy.

Cc: Jarod Wilson <jarod@wilsonet.com>
Cc: devel@driverdev.osuosl.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
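
For reference, the "4295 = 2^32 / 1e6" trick that fed the removed
conv_us_to_clocks variable can be sketched in user space as below.  It
approximates a division by one million with a multiply and a shift.
The function name clocks_per_us is hypothetical, and the 32-bit
parameter type is an assumption made for the sketch.

```c
#include <stdint.h>

/*
 * Approximate loops_per_sec / 1000000 without a 64-bit divide:
 * multiply by 4295 (2^32 / 1e6, rounded up) and keep the high
 * 32 bits of the 64-bit product.
 */
static inline uint32_t clocks_per_us(uint32_t loops_per_sec)
{
	uint64_t work = loops_per_sec;

	work *= 4295;	/* 2^32 / 1e6 is about 4294.967 */
	return (uint32_t)(work >> 32);
}
```

Because 4295 is slightly larger than 2^32 / 1e6, the result runs high
by under 10 parts per million before truncation, which is invisible at
these magnitudes: clocks_per_us(1000000000) yields 1000.
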
 drivers/staging/media/lirc/lirc_serial.c | 63 ++------------------------------
 1 file changed, 4 insertions(+), 59 deletions(-)

diff --git a/drivers/staging/media/lirc/lirc_serial.c b/drivers/staging/media/lirc/lirc_serial.c
index dc7984455c3a..465796a686c4 100644
--- a/drivers/staging/media/lirc/lirc_serial.c
+++ b/drivers/staging/media/lirc/lirc_serial.c
@@ -327,9 +327,6 @@ static void safe_udelay(unsigned long usecs)
  * time
  */
 
-/* So send_pulse can quickly convert microseconds to clocks */
-static unsigned long conv_us_to_clocks;
-
 static int init_timing_params(unsigned int new_duty_cycle,
 		unsigned int new_freq)
 {
@@ -344,7 +341,6 @@ static int init_timing_params(unsigned int new_duty_cycle,
 	/* How many clocks in a microsecond?, avoiding long long divide */
 	work = loops_per_sec;
 	work *= 4295;  /* 4295 = 2^32 / 1e6 */
-	conv_us_to_clocks = work >> 32;
 
 	/*
 	 * Carrier period in clocks, approach good up to 32GHz clock,
@@ -357,10 +353,9 @@ static int init_timing_params(unsigned int new_duty_cycle,
 	pulse_width = period * duty_cycle / 100;
 	space_width = period - pulse_width;
 	dprintk("in init_timing_params, freq=%d, duty_cycle=%d, "
-		"clk/jiffy=%ld, pulse=%ld, space=%ld, "
-		"conv_us_to_clocks=%ld\n",
+		"clk/jiffy=%ld, pulse=%ld, space=%ld\n",
 		freq, duty_cycle, __this_cpu_read(cpu_info.loops_per_jiffy),
-		pulse_width, space_width, conv_us_to_clocks);
+		pulse_width, space_width);
 	return 0;
 }
 #else /* ! USE_RDTSC */
@@ -431,63 +426,14 @@ static long send_pulse_irdeo(unsigned long length)
 	return ret;
 }
 
-#ifdef USE_RDTSC
-/* Version that uses Pentium rdtsc instruction to measure clocks */
-
-/*
- * This version does sub-microsecond timing using rdtsc instruction,
- * and does away with the fudged LIRC_SERIAL_TRANSMITTER_LATENCY
- * Implicitly i586 architecture...  - Steve
- */
-
-static long send_pulse_homebrew_softcarrier(unsigned long length)
-{
-	int flag;
-	unsigned long target, start, now;
-
-	/* Get going quick as we can */
-	rdtscl(start);
-	on();
-	/* Convert length from microseconds to clocks */
-	length *= conv_us_to_clocks;
-	/* And loop till time is up - flipping at right intervals */
-	now = start;
-	target = pulse_width;
-	flag = 1;
-	/*
-	 * FIXME: This looks like a hard busy wait, without even an occasional,
-	 * polite, cpu_relax() call.  There's got to be a better way?
-	 *
-	 * The i2c code has the result of a lot of bit-banging work, I wonder if
-	 * there's something there which could be helpful here.
-	 */
-	while ((now - start) < length) {
-		/* Delay till flip time */
-		do {
-			rdtscl(now);
-		} while ((now - start) < target);
-
-		/* flip */
-		if (flag) {
-			rdtscl(now);
-			off();
-			target += space_width;
-		} else {
-			rdtscl(now); on();
-			target += pulse_width;
-		}
-		flag = !flag;
-	}
-	rdtscl(now);
-	return ((now - start) - length) / conv_us_to_clocks;
-}
-#else /* ! USE_RDTSC */
 /* Version using udelay() */
 
 /*
  * here we use fixed point arithmetic, with 8
  * fractional bits.  that gets us within 0.1% or so of the right average
  * frequency, albeit with some jitter in pulse length - Steve
+ *
+ * This should use ndelay instead.
  */
 
 /* To match 8 fractional bits used for pulse/space length */
@@ -520,7 +466,6 @@ static long send_pulse_homebrew_softcarrier(unsigned long length)
 	}
 	return (actual-length) >> 8;
 }
-#endif /* USE_RDTSC */
 
 static long send_pulse_homebrew(unsigned long length)
 {
-- 
2.4.2


* [PATCH v3 10/18] input/joystick/analog: Switch from rdtscl() to native_read_tsc()
  2015-06-17  0:35 [PATCH v3 00/18] x86/tsc: Clean up rdtsc helpers Andy Lutomirski
                   ` (8 preceding siblings ...)
  2015-06-17  0:35 ` [PATCH v3 09/18] staging/lirc_serial: Remove TSC-based timing Andy Lutomirski
@ 2015-06-17  0:35 ` Andy Lutomirski
  2015-07-06 15:42   ` [tip:x86/asm] x86/asm/tsc, " tip-bot for Andy Lutomirski
  2015-06-17  0:35 ` [PATCH v3 11/18] drivers/input/gameport: Replace rdtscl() with native_read_tsc() Andy Lutomirski
                   ` (8 subsequent siblings)
  18 siblings, 1 reply; 57+ messages in thread
From: Andy Lutomirski @ 2015-06-17  0:35 UTC (permalink / raw)
  To: x86
  Cc: Borislav Petkov, Peter Zijlstra, John Stultz, linux-kernel,
	Len Brown, Huang Rui, Denys Vlasenko, kvm, Ralf Baechle,
	Andy Lutomirski, linux-input

This timing code is hideous, and this patch doesn't improve it.  It
does get rid of one of the last users of rdtscl, though.

Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: linux-input@vger.kernel.org
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 drivers/input/joystick/analog.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/input/joystick/analog.c b/drivers/input/joystick/analog.c
index 4284080e481d..f871b4f00056 100644
--- a/drivers/input/joystick/analog.c
+++ b/drivers/input/joystick/analog.c
@@ -143,7 +143,7 @@ struct analog_port {
 
 #include <linux/i8253.h>
 
-#define GET_TIME(x)	do { if (cpu_has_tsc) rdtscl(x); else x = get_time_pit(); } while (0)
+#define GET_TIME(x)	do { if (cpu_has_tsc) x = (unsigned int)native_read_tsc(); else x = get_time_pit(); } while (0)
 #define DELTA(x,y)	(cpu_has_tsc ? ((y) - (x)) : ((x) - (y) + ((x) < (y) ? PIT_TICK_RATE / HZ : 0)))
 #define TIME_NAME	(cpu_has_tsc?"TSC":"PIT")
 static unsigned int get_time_pit(void)
@@ -160,7 +160,7 @@ static unsigned int get_time_pit(void)
         return count;
 }
 #elif defined(__x86_64__)
-#define GET_TIME(x)	rdtscl(x)
+#define GET_TIME(x)	do { x = (unsigned int)native_read_tsc(); } while (0)
 #define DELTA(x,y)	((y)-(x))
 #define TIME_NAME	"TSC"
 #elif defined(__alpha__) || defined(CONFIG_MN10300) || defined(CONFIG_ARM) || defined(CONFIG_ARM64) || defined(CONFIG_TILE)
-- 
2.4.2


* [PATCH v3 11/18] drivers/input/gameport: Replace rdtscl() with native_read_tsc()
  2015-06-17  0:35 [PATCH v3 00/18] x86/tsc: Clean up rdtsc helpers Andy Lutomirski
                   ` (9 preceding siblings ...)
  2015-06-17  0:35 ` [PATCH v3 10/18] input/joystick/analog: Switch from rdtscl() to native_read_tsc() Andy Lutomirski
@ 2015-06-17  0:35 ` Andy Lutomirski
  2015-07-06 15:43   ` [tip:x86/asm] x86/asm/tsc, drivers/input/gameport: Replace rdtscl () " tip-bot for Andy Lutomirski
  2015-06-17  0:36 ` [PATCH v3 12/18] x86/tsc: Remove rdtscl() Andy Lutomirski
                   ` (7 subsequent siblings)
  18 siblings, 1 reply; 57+ messages in thread
From: Andy Lutomirski @ 2015-06-17  0:35 UTC (permalink / raw)
  To: x86
  Cc: Borislav Petkov, Peter Zijlstra, John Stultz, linux-kernel,
	Len Brown, Huang Rui, Denys Vlasenko, kvm, Ralf Baechle,
	Andy Lutomirski, linux-input

It's unclear to me why this code exists in the first place.

Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: linux-input@vger.kernel.org
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 drivers/input/gameport/gameport.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/input/gameport/gameport.c b/drivers/input/gameport/gameport.c
index e853a2134680..abc0cb22e750 100644
--- a/drivers/input/gameport/gameport.c
+++ b/drivers/input/gameport/gameport.c
@@ -149,9 +149,9 @@ static int old_gameport_measure_speed(struct gameport *gameport)
 
 	for(i = 0; i < 50; i++) {
 		local_irq_save(flags);
-		rdtscl(t1);
+		t1 = native_read_tsc();
 		for (t = 0; t < 50; t++) gameport_read(gameport);
-		rdtscl(t2);
+		t2 = native_read_tsc();
 		local_irq_restore(flags);
 		udelay(i * 10);
 		if (t2 - t1 < tx) tx = t2 - t1;
-- 
2.4.2


* [PATCH v3 12/18] x86/tsc: Remove rdtscl()
  2015-06-17  0:35 [PATCH v3 00/18] x86/tsc: Clean up rdtsc helpers Andy Lutomirski
                   ` (10 preceding siblings ...)
  2015-06-17  0:35 ` [PATCH v3 11/18] drivers/input/gameport: Replace rdtscl() with native_read_tsc() Andy Lutomirski
@ 2015-06-17  0:36 ` Andy Lutomirski
  2015-07-06 15:43   ` [tip:x86/asm] x86/asm/tsc: " tip-bot for Andy Lutomirski
  2015-06-17  0:36 ` [PATCH v3 13/18] x86/tsc: Rename native_read_tsc() to rdtsc() Andy Lutomirski
                   ` (6 subsequent siblings)
  18 siblings, 1 reply; 57+ messages in thread
From: Andy Lutomirski @ 2015-06-17  0:36 UTC (permalink / raw)
  To: x86
  Cc: Borislav Petkov, Peter Zijlstra, John Stultz, linux-kernel,
	Len Brown, Huang Rui, Denys Vlasenko, kvm, Ralf Baechle,
	Andy Lutomirski

It has no more callers, and it was never a very sensible interface
to begin with.  Users of the TSC should either read all 64 bits or
explicitly throw out the high bits.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
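
A caller that really wants only the low bits can say so with a visible
cast, as the converted drivers earlier in this series do.  A minimal
user-space sketch, with the hypothetical helper name tsc_low32:

```c
#include <stdint.h>

/*
 * Keep only the low 32 bits of a 64-bit counter value.  The cast
 * makes the truncation explicit at the call site instead of hiding
 * it inside an accessor macro.
 */
static inline uint32_t tsc_low32(uint64_t tsc)
{
	return (uint32_t)tsc;	/* high 32 bits deliberately discarded */
}
```

For example, tsc_low32(0x1122334455667788) is 0x55667788.
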
 arch/x86/include/asm/msr.h | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
index 626f78199665..c89ed6ceed02 100644
--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -189,9 +189,6 @@ do {							\
 
 #endif	/* !CONFIG_PARAVIRT */
 
-#define rdtscl(low)						\
-	((low) = (u32)native_read_tsc())
-
 /*
  * 64-bit version of wrmsr_safe():
  */
-- 
2.4.2



* [PATCH v3 13/18] x86/tsc: Rename native_read_tsc() to rdtsc()
  2015-06-17  0:35 [PATCH v3 00/18] x86/tsc: Clean up rdtsc helpers Andy Lutomirski
                   ` (11 preceding siblings ...)
  2015-06-17  0:36 ` [PATCH v3 12/18] x86/tsc: Remove rdtscl() Andy Lutomirski
@ 2015-06-17  0:36 ` Andy Lutomirski
  2015-06-24 21:38   ` Borislav Petkov
  2015-07-06 15:43   ` [tip:x86/asm] x86/asm/tsc: " tip-bot for Andy Lutomirski
  2015-06-17  0:36 ` [PATCH v3 14/18] x86: Add rdtsc_ordered() and use it in trivial call sites Andy Lutomirski
                   ` (5 subsequent siblings)
  18 siblings, 2 replies; 57+ messages in thread
From: Andy Lutomirski @ 2015-06-17  0:36 UTC (permalink / raw)
  To: x86
  Cc: Borislav Petkov, Peter Zijlstra, John Stultz, linux-kernel,
	Len Brown, Huang Rui, Denys Vlasenko, kvm, Ralf Baechle,
	Andy Lutomirski

Now that there is no paravirt TSC, the "native" is inappropriate.
The function does RDTSC, so give it the obvious name: rdtsc().

Suggested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
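
As background, the RDTSC instruction returns the counter in two 32-bit
halves (low half in EAX, high half in EDX), and the accessor combines
them into one 64-bit value.  The sketch below shows only that
combination step in portable C.  The name tsc_from_halves is
hypothetical, and nothing here executes RDTSC or says anything about
ordering.

```c
#include <stdint.h>

/*
 * Combine the two 32-bit halves of a counter reading into one
 * 64-bit value.  The high half is widened before the shift so that
 * no bits are lost.
 */
static inline uint64_t tsc_from_halves(uint32_t low, uint32_t high)
{
	return ((uint64_t)high << 32) | low;
}
```

For example, low = 0x55667788 and high = 0x11223344 combine to
0x1122334455667788.
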
 arch/x86/boot/compressed/aslr.c                      |  2 +-
 arch/x86/entry/vdso/vclock_gettime.c                 |  2 +-
 arch/x86/include/asm/msr.h                           | 11 ++++++++++-
 arch/x86/include/asm/pvclock.h                       |  2 +-
 arch/x86/include/asm/stackprotector.h                |  2 +-
 arch/x86/include/asm/tsc.h                           |  2 +-
 arch/x86/kernel/apb_timer.c                          |  8 ++++----
 arch/x86/kernel/apic/apic.c                          |  8 ++++----
 arch/x86/kernel/cpu/amd.c                            |  4 ++--
 arch/x86/kernel/cpu/mcheck/mce.c                     |  4 ++--
 arch/x86/kernel/espfix_64.c                          |  2 +-
 arch/x86/kernel/hpet.c                               |  4 ++--
 arch/x86/kernel/trace_clock.c                        |  2 +-
 arch/x86/kernel/tsc.c                                |  4 ++--
 arch/x86/kvm/lapic.c                                 |  4 ++--
 arch/x86/kvm/svm.c                                   |  4 ++--
 arch/x86/kvm/vmx.c                                   |  4 ++--
 arch/x86/kvm/x86.c                                   | 12 ++++++------
 arch/x86/lib/delay.c                                 |  8 ++++----
 drivers/input/gameport/gameport.c                    |  4 ++--
 drivers/input/joystick/analog.c                      |  4 ++--
 drivers/net/hamradio/baycom_epp.c                    |  2 +-
 drivers/thermal/intel_powerclamp.c                   |  4 ++--
 tools/power/cpupower/debug/kernel/cpufreq-test_tsc.c |  4 ++--
 24 files changed, 58 insertions(+), 49 deletions(-)

diff --git a/arch/x86/boot/compressed/aslr.c b/arch/x86/boot/compressed/aslr.c
index ea33236190b1..6a9b96b4624d 100644
--- a/arch/x86/boot/compressed/aslr.c
+++ b/arch/x86/boot/compressed/aslr.c
@@ -82,7 +82,7 @@ static unsigned long get_random_long(void)
 
 	if (has_cpuflag(X86_FEATURE_TSC)) {
 		debug_putstr(" RDTSC");
-		raw = native_read_tsc();
+		raw = rdtsc();
 
 		random ^= raw;
 		use_i8254 = false;
diff --git a/arch/x86/entry/vdso/vclock_gettime.c b/arch/x86/entry/vdso/vclock_gettime.c
index 972b488ac16a..0340d93c18ca 100644
--- a/arch/x86/entry/vdso/vclock_gettime.c
+++ b/arch/x86/entry/vdso/vclock_gettime.c
@@ -186,7 +186,7 @@ notrace static cycle_t vread_tsc(void)
 	 * but no one has ever seen it happen.
 	 */
 	rdtsc_barrier();
-	ret = (cycle_t)native_read_tsc();
+	ret = (cycle_t)rdtsc();
 
 	last = gtod->cycle_last;
 
diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
index c89ed6ceed02..ff0c120dafe5 100644
--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -109,7 +109,16 @@ notrace static inline int native_write_msr_safe(unsigned int msr,
 extern int rdmsr_safe_regs(u32 regs[8]);
 extern int wrmsr_safe_regs(u32 regs[8]);
 
-static __always_inline unsigned long long native_read_tsc(void)
+/**
+ * rdtsc() - returns the current TSC without ordering constraints
+ *
+ * rdtsc() returns the result of RDTSC as a 64-bit integer.  The
+ * only ordering constraint it supplies is the ordering implied by
+ * "asm volatile": it will put the RDTSC in the place you expect.  The
+ * CPU can and will speculatively execute that RDTSC, though, so the
+ * results can be non-monotonic if compared on different CPUs.
+ */
+static __always_inline unsigned long long rdtsc(void)
 {
 	DECLARE_ARGS(val, low, high);
 
diff --git a/arch/x86/include/asm/pvclock.h b/arch/x86/include/asm/pvclock.h
index 71bd485c2986..6084bce345fc 100644
--- a/arch/x86/include/asm/pvclock.h
+++ b/arch/x86/include/asm/pvclock.h
@@ -62,7 +62,7 @@ static inline u64 pvclock_scale_delta(u64 delta, u32 mul_frac, int shift)
 static __always_inline
 u64 pvclock_get_nsec_offset(const struct pvclock_vcpu_time_info *src)
 {
-	u64 delta = native_read_tsc() - src->tsc_timestamp;
+	u64 delta = rdtsc() - src->tsc_timestamp;
 	return pvclock_scale_delta(delta, src->tsc_to_system_mul,
 				   src->tsc_shift);
 }
diff --git a/arch/x86/include/asm/stackprotector.h b/arch/x86/include/asm/stackprotector.h
index bc5fa2af112e..58505f01962f 100644
--- a/arch/x86/include/asm/stackprotector.h
+++ b/arch/x86/include/asm/stackprotector.h
@@ -72,7 +72,7 @@ static __always_inline void boot_init_stack_canary(void)
 	 * on during the bootup the random pool has true entropy too.
 	 */
 	get_random_bytes(&canary, sizeof(canary));
-	tsc = native_read_tsc();
+	tsc = rdtsc();
 	canary += tsc + (tsc << 32UL);
 
 	current->stack_canary = canary;
diff --git a/arch/x86/include/asm/tsc.h b/arch/x86/include/asm/tsc.h
index b4883902948b..3df7675debcf 100644
--- a/arch/x86/include/asm/tsc.h
+++ b/arch/x86/include/asm/tsc.h
@@ -26,7 +26,7 @@ static inline cycles_t get_cycles(void)
 		return 0;
 #endif
 
-	return native_read_tsc();
+	return rdtsc();
 }
 
 extern void tsc_init(void);
diff --git a/arch/x86/kernel/apb_timer.c b/arch/x86/kernel/apb_timer.c
index 25efa534c4e4..222a57076039 100644
--- a/arch/x86/kernel/apb_timer.c
+++ b/arch/x86/kernel/apb_timer.c
@@ -263,7 +263,7 @@ static int apbt_clocksource_register(void)
 
 	/* Verify whether apbt counter works */
 	t1 = dw_apb_clocksource_read(clocksource_apbt);
-	start = native_read_tsc();
+	start = rdtsc();
 
 	/*
 	 * We don't know the TSC frequency yet, but waiting for
@@ -273,7 +273,7 @@ static int apbt_clocksource_register(void)
 	 */
 	do {
 		rep_nop();
-		now = native_read_tsc();
+		now = rdtsc();
 	} while ((now - start) < 200000UL);
 
 	/* APBT is the only always on clocksource, it has to work! */
@@ -390,13 +390,13 @@ unsigned long apbt_quick_calibrate(void)
 	old = dw_apb_clocksource_read(clocksource_apbt);
 	old += loop;
 
-	t1 = native_read_tsc();
+	t1 = rdtsc();
 
 	do {
 		new = dw_apb_clocksource_read(clocksource_apbt);
 	} while (new < old);
 
-	t2 = native_read_tsc();
+	t2 = rdtsc();
 
 	shift = 5;
 	if (unlikely(loop >> shift == 0)) {
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 51af1ed1ae2e..0d71cd9b4a50 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -457,7 +457,7 @@ static int lapic_next_deadline(unsigned long delta,
 {
 	u64 tsc;
 
-	tsc = native_read_tsc();
+	tsc = rdtsc();
 	wrmsrl(MSR_IA32_TSC_DEADLINE, tsc + (((u64) delta) * TSC_DIVISOR));
 	return 0;
 }
@@ -592,7 +592,7 @@ static void __init lapic_cal_handler(struct clock_event_device *dev)
 	unsigned long pm = acpi_pm_read_early();
 
 	if (cpu_has_tsc)
-		tsc = native_read_tsc();
+		tsc = rdtsc();
 
 	switch (lapic_cal_loops++) {
 	case 0:
@@ -1209,7 +1209,7 @@ void setup_local_APIC(void)
 	long long max_loops = cpu_khz ? cpu_khz : 1000000;
 
 	if (cpu_has_tsc)
-		tsc = native_read_tsc();
+		tsc = rdtsc();
 
 	if (disable_apic) {
 		disable_ioapic_support();
@@ -1293,7 +1293,7 @@ void setup_local_APIC(void)
 		}
 		if (queued) {
 			if (cpu_has_tsc && cpu_khz) {
-				ntsc = native_read_tsc();
+				ntsc = rdtsc();
 				max_loops = (cpu_khz << 10) - (ntsc - tsc);
 			} else
 				max_loops--;
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index c5ceec532799..a5fe13d43c47 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -118,10 +118,10 @@ static void init_amd_k6(struct cpuinfo_x86 *c)
 
 		n = K6_BUG_LOOP;
 		f_vide = vide;
-		d = native_read_tsc();
+		d = rdtsc();
 		while (n--)
 			f_vide();
-		d2 = native_read_tsc();
+		d2 = rdtsc();
 		d = d2-d;
 
 		if (d > 20*K6_BUG_LOOP)
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index a5283d2d0094..96cceccd11b4 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -125,7 +125,7 @@ void mce_setup(struct mce *m)
 {
 	memset(m, 0, sizeof(struct mce));
 	m->cpu = m->extcpu = smp_processor_id();
-	m->tsc = native_read_tsc();
+	m->tsc = rdtsc();
 	/* We hope get_seconds stays lockless */
 	m->time = get_seconds();
 	m->cpuvendor = boot_cpu_data.x86_vendor;
@@ -1784,7 +1784,7 @@ static void collect_tscs(void *data)
 {
 	unsigned long *cpu_tsc = (unsigned long *)data;
 
-	cpu_tsc[smp_processor_id()] = native_read_tsc();
+	cpu_tsc[smp_processor_id()] = rdtsc();
 }
 
 static int mce_apei_read_done;
diff --git a/arch/x86/kernel/espfix_64.c b/arch/x86/kernel/espfix_64.c
index 334a2a9c034d..67315cd0132c 100644
--- a/arch/x86/kernel/espfix_64.c
+++ b/arch/x86/kernel/espfix_64.c
@@ -110,7 +110,7 @@ static void init_espfix_random(void)
 	 */
 	if (!arch_get_random_long(&rand)) {
 		/* The constant is an arbitrary large prime */
-		rand = native_read_tsc();
+		rand = rdtsc();
 		rand *= 0xc345c6b72fd16123UL;
 	}
 
diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
index ccf677cd9adc..b090753f40fb 100644
--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -734,7 +734,7 @@ static int hpet_clocksource_register(void)
 
 	/* Verify whether hpet counter works */
 	t1 = hpet_readl(HPET_COUNTER);
-	start = native_read_tsc();
+	start = rdtsc();
 
 	/*
 	 * We don't know the TSC frequency yet, but waiting for
@@ -744,7 +744,7 @@ static int hpet_clocksource_register(void)
 	 */
 	do {
 		rep_nop();
-		now = native_read_tsc();
+		now = rdtsc();
 	} while ((now - start) < 200000UL);
 
 	if (t1 == hpet_readl(HPET_COUNTER)) {
diff --git a/arch/x86/kernel/trace_clock.c b/arch/x86/kernel/trace_clock.c
index bd8f4d41bd56..67efb8c96fc4 100644
--- a/arch/x86/kernel/trace_clock.c
+++ b/arch/x86/kernel/trace_clock.c
@@ -15,7 +15,7 @@ u64 notrace trace_clock_x86_tsc(void)
 	u64 ret;
 
 	rdtsc_barrier();
-	ret = native_read_tsc();
+	ret = rdtsc();
 
 	return ret;
 }
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index e66f5dcaeb63..21d6e04e3e82 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -248,7 +248,7 @@ static void set_cyc2ns_scale(unsigned long cpu_khz, int cpu)
 
 	data = cyc2ns_write_begin(cpu);
 
-	tsc_now = native_read_tsc();
+	tsc_now = rdtsc();
 	ns_now = cycles_2_ns(tsc_now);
 
 	/*
@@ -290,7 +290,7 @@ u64 native_sched_clock(void)
 	}
 
 	/* read the Time Stamp Counter: */
-	tsc_now = native_read_tsc();
+	tsc_now = rdtsc();
 
 	/* return the value in ns */
 	return cycles_2_ns(tsc_now);
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 629af0f1c5c4..db155150f5bf 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1148,7 +1148,7 @@ void wait_lapic_expire(struct kvm_vcpu *vcpu)
 
 	tsc_deadline = apic->lapic_timer.expired_tscdeadline;
 	apic->lapic_timer.expired_tscdeadline = 0;
-	guest_tsc = kvm_x86_ops->read_l1_tsc(vcpu, native_read_tsc());
+	guest_tsc = kvm_x86_ops->read_l1_tsc(vcpu, rdtsc());
 	trace_kvm_wait_lapic_expire(vcpu->vcpu_id, guest_tsc - tsc_deadline);
 
 	/* __delay is delay_tsc whenever the hardware has TSC, thus always.  */
@@ -1216,7 +1216,7 @@ static void start_apic_timer(struct kvm_lapic *apic)
 		local_irq_save(flags);
 
 		now = apic->lapic_timer.timer.base->get_time();
-		guest_tsc = kvm_x86_ops->read_l1_tsc(vcpu, native_read_tsc());
+		guest_tsc = kvm_x86_ops->read_l1_tsc(vcpu, rdtsc());
 		if (likely(tscdeadline > guest_tsc)) {
 			ns = (tscdeadline - guest_tsc) * 1000000ULL;
 			do_div(ns, this_tsc_khz);
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 9afa233b5482..964909a18f4a 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1077,7 +1077,7 @@ static u64 svm_compute_tsc_offset(struct kvm_vcpu *vcpu, u64 target_tsc)
 {
 	u64 tsc;
 
-	tsc = svm_scale_tsc(vcpu, native_read_tsc());
+	tsc = svm_scale_tsc(vcpu, rdtsc());
 
 	return target_tsc - tsc;
 }
@@ -3074,7 +3074,7 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, unsigned ecx, u64 *data)
 	switch (ecx) {
 	case MSR_IA32_TSC: {
 		*data = svm->vmcb->control.tsc_offset +
-			svm_scale_tsc(vcpu, native_read_tsc());
+			svm_scale_tsc(vcpu, rdtsc());
 
 		break;
 	}
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index fcff42100948..03d27899edb7 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2236,7 +2236,7 @@ static u64 guest_read_tsc(void)
 {
 	u64 host_tsc, tsc_offset;
 
-	host_tsc = native_read_tsc();
+	host_tsc = rdtsc();
 	tsc_offset = vmcs_read64(TSC_OFFSET);
 	return host_tsc + tsc_offset;
 }
@@ -2317,7 +2317,7 @@ static void vmx_adjust_tsc_offset(struct kvm_vcpu *vcpu, s64 adjustment, bool ho
 
 static u64 vmx_compute_tsc_offset(struct kvm_vcpu *vcpu, u64 target_tsc)
 {
-	return target_tsc - native_read_tsc();
+	return target_tsc - rdtsc();
 }
 
 static bool guest_cpuid_has_vmx(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c26faf408bce..b0afdc74c28a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1430,7 +1430,7 @@ static cycle_t read_tsc(void)
 	 * but no one has ever seen it happen.
 	 */
 	rdtsc_barrier();
-	ret = (cycle_t)native_read_tsc();
+	ret = (cycle_t)rdtsc();
 
 	last = pvclock_gtod_data.clock.cycle_last;
 
@@ -1621,7 +1621,7 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
 		return 1;
 	}
 	if (!use_master_clock) {
-		host_tsc = native_read_tsc();
+		host_tsc = rdtsc();
 		kernel_ns = get_kernel_ns();
 	}
 
@@ -2945,7 +2945,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 
 	if (unlikely(vcpu->cpu != cpu) || check_tsc_unstable()) {
 		s64 tsc_delta = !vcpu->arch.last_host_tsc ? 0 :
-				native_read_tsc() - vcpu->arch.last_host_tsc;
+				rdtsc() - vcpu->arch.last_host_tsc;
 		if (tsc_delta < 0)
 			mark_tsc_unstable("KVM discovered backwards TSC");
 		if (check_tsc_unstable()) {
@@ -2973,7 +2973,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
 {
 	kvm_x86_ops->vcpu_put(vcpu);
 	kvm_put_guest_fpu(vcpu);
-	vcpu->arch.last_host_tsc = native_read_tsc();
+	vcpu->arch.last_host_tsc = rdtsc();
 }
 
 static int kvm_vcpu_ioctl_get_lapic(struct kvm_vcpu *vcpu,
@@ -6388,7 +6388,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 		hw_breakpoint_restore();
 
 	vcpu->arch.last_guest_tsc = kvm_x86_ops->read_l1_tsc(vcpu,
-							   native_read_tsc());
+							   rdtsc());
 
 	vcpu->mode = OUTSIDE_GUEST_MODE;
 	smp_wmb();
@@ -7186,7 +7186,7 @@ int kvm_arch_hardware_enable(void)
 	if (ret != 0)
 		return ret;
 
-	local_tsc = native_read_tsc();
+	local_tsc = rdtsc();
 	stable = !check_tsc_unstable();
 	list_for_each_entry(kvm, &vm_list, vm_list) {
 		kvm_for_each_vcpu(i, vcpu, kvm) {
diff --git a/arch/x86/lib/delay.c b/arch/x86/lib/delay.c
index 35115f3786a9..f24bc59ab0a0 100644
--- a/arch/x86/lib/delay.c
+++ b/arch/x86/lib/delay.c
@@ -55,10 +55,10 @@ static void delay_tsc(unsigned long __loops)
 	preempt_disable();
 	cpu = smp_processor_id();
 	rdtsc_barrier();
-	bclock = native_read_tsc();
+	bclock = rdtsc();
 	for (;;) {
 		rdtsc_barrier();
-		now = native_read_tsc();
+		now = rdtsc();
 		if ((now - bclock) >= loops)
 			break;
 
@@ -80,7 +80,7 @@ static void delay_tsc(unsigned long __loops)
 			loops -= (now - bclock);
 			cpu = smp_processor_id();
 			rdtsc_barrier();
-			bclock = native_read_tsc();
+			bclock = rdtsc();
 		}
 	}
 	preempt_enable();
@@ -100,7 +100,7 @@ void use_tsc_delay(void)
 int read_current_timer(unsigned long *timer_val)
 {
 	if (delay_fn == delay_tsc) {
-		*timer_val = native_read_tsc();
+		*timer_val = rdtsc();
 		return 0;
 	}
 	return -1;
diff --git a/drivers/input/gameport/gameport.c b/drivers/input/gameport/gameport.c
index abc0cb22e750..4a2a9e370be7 100644
--- a/drivers/input/gameport/gameport.c
+++ b/drivers/input/gameport/gameport.c
@@ -149,9 +149,9 @@ static int old_gameport_measure_speed(struct gameport *gameport)
 
 	for(i = 0; i < 50; i++) {
 		local_irq_save(flags);
-		t1 = native_read_tsc();
+		t1 = rdtsc();
 		for (t = 0; t < 50; t++) gameport_read(gameport);
-		t2 = native_read_tsc();
+		t2 = rdtsc();
 		local_irq_restore(flags);
 		udelay(i * 10);
 		if (t2 - t1 < tx) tx = t2 - t1;
diff --git a/drivers/input/joystick/analog.c b/drivers/input/joystick/analog.c
index f871b4f00056..6f8b084e13d0 100644
--- a/drivers/input/joystick/analog.c
+++ b/drivers/input/joystick/analog.c
@@ -143,7 +143,7 @@ struct analog_port {
 
 #include <linux/i8253.h>
 
-#define GET_TIME(x)	do { if (cpu_has_tsc) x = (unsigned int)native_read_tsc(); else x = get_time_pit(); } while (0)
+#define GET_TIME(x)	do { if (cpu_has_tsc) x = (unsigned int)rdtsc(); else x = get_time_pit(); } while (0)
 #define DELTA(x,y)	(cpu_has_tsc ? ((y) - (x)) : ((x) - (y) + ((x) < (y) ? PIT_TICK_RATE / HZ : 0)))
 #define TIME_NAME	(cpu_has_tsc?"TSC":"PIT")
 static unsigned int get_time_pit(void)
@@ -160,7 +160,7 @@ static unsigned int get_time_pit(void)
         return count;
 }
 #elif defined(__x86_64__)
-#define GET_TIME(x)	do { x = (unsigned int)native_read_tsc(); } while (0)
+#define GET_TIME(x)	do { x = (unsigned int)rdtsc(); } while (0)
 #define DELTA(x,y)	((y)-(x))
 #define TIME_NAME	"TSC"
 #elif defined(__alpha__) || defined(CONFIG_MN10300) || defined(CONFIG_ARM) || defined(CONFIG_ARM64) || defined(CONFIG_TILE)
diff --git a/drivers/net/hamradio/baycom_epp.c b/drivers/net/hamradio/baycom_epp.c
index 44e5c3b5e0af..72c9f1f352b4 100644
--- a/drivers/net/hamradio/baycom_epp.c
+++ b/drivers/net/hamradio/baycom_epp.c
@@ -638,7 +638,7 @@ static int receive(struct net_device *dev, int cnt)
 #define GETTICK(x)                                                \
 ({                                                                \
 	if (cpu_has_tsc)                                          \
-		x = (unsigned int)native_read_tsc();		  \
+		x = (unsigned int)rdtsc();		  \
 })
 #else /* __i386__ */
 #define GETTICK(x)
diff --git a/drivers/thermal/intel_powerclamp.c b/drivers/thermal/intel_powerclamp.c
index 933c5e599d1d..7476cc3dc5c0 100644
--- a/drivers/thermal/intel_powerclamp.c
+++ b/drivers/thermal/intel_powerclamp.c
@@ -340,7 +340,7 @@ static bool powerclamp_adjust_controls(unsigned int target_ratio,
 
 	/* check result for the last window */
 	msr_now = pkg_state_counter();
-	tsc_now = native_read_tsc();
+	tsc_now = rdtsc();
 
 	/* calculate pkg cstate vs tsc ratio */
 	if (!msr_last || !tsc_last)
@@ -482,7 +482,7 @@ static void poll_pkg_cstate(struct work_struct *dummy)
 	u64 val64;
 
 	msr_now = pkg_state_counter();
-	tsc_now = native_read_tsc();
+	tsc_now = rdtsc();
 	jiffies_now = jiffies;
 
 	/* calculate pkg cstate vs tsc ratio */
diff --git a/tools/power/cpupower/debug/kernel/cpufreq-test_tsc.c b/tools/power/cpupower/debug/kernel/cpufreq-test_tsc.c
index f02b0c0bff9b..6ff8383f2941 100644
--- a/tools/power/cpupower/debug/kernel/cpufreq-test_tsc.c
+++ b/tools/power/cpupower/debug/kernel/cpufreq-test_tsc.c
@@ -81,11 +81,11 @@ static int __init cpufreq_test_tsc(void)
 
 	printk(KERN_DEBUG "start--> \n");
 	then = read_pmtmr();
-	then_tsc = native_read_tsc();
+	then_tsc = rdtsc();
 	for (i=0;i<20;i++) {
 		mdelay(100);
 		now = read_pmtmr();
-		now_tsc = native_read_tsc();
+		now_tsc = rdtsc();
 		diff = (now - then) & 0xFFFFFF;
 		diff_tsc = now_tsc - then_tsc;
 		printk(KERN_DEBUG "t1: %08u t2: %08u diff_pmtmr: %08u diff_tsc: %016llu\n", then, now, diff, diff_tsc);
-- 
2.4.2
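
Reader's note on the truncating call sites above (GET_TIME() in analog.c, GETTICK() in baycom_epp.c): they keep only the low 32 bits of rdtsc(). A minimal sketch of why short deltas still come out right, assuming intervals shorter than 2^32 ticks. The helper names are hypothetical and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: keep only the low 32 bits, as the casts above do. */
uint32_t truncate_tsc(uint64_t tsc)
{
	return (uint32_t)tsc;
}

/*
 * Hypothetical helper: ticks elapsed between two truncated samples.
 * Unsigned subtraction is modulo 2^32, so a single wrap of the low
 * word between the two samples does not corrupt the result.
 */
uint32_t tick_delta32(uint32_t start, uint32_t end)
{
	return end - start;
}

void tick_delta32_demo(void)
{
	uint64_t t1 = 0x00000001fffffff0ULL;	/* low word about to wrap */
	uint64_t t2 = 0x0000000200000010ULL;	/* 0x20 ticks later */

	assert(tick_delta32(truncate_tsc(t1), truncate_tsc(t2)) == 0x20);
}
```

The sketch says nothing about intervals of 2^32 ticks or more; those alias, which is presumably acceptable for the short measurements these drivers make.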

* [PATCH v3 14/18] x86: Add rdtsc_ordered() and use it in trivial call sites
  2015-06-17  0:35 [PATCH v3 00/18] x86/tsc: Clean up rdtsc helpers Andy Lutomirski
                   ` (12 preceding siblings ...)
  2015-06-17  0:36 ` [PATCH v3 13/18] x86/tsc: Rename native_read_tsc() to rdtsc() Andy Lutomirski
@ 2015-06-17  0:36 ` Andy Lutomirski
  2015-07-06 15:44   ` [tip:x86/asm] x86/asm/tsc: " tip-bot for Andy Lutomirski
  2015-08-21  7:45   ` [tip:x86/asm] x86/asm/tsc: Add rdtscll() merge helper tip-bot for Ingo Molnar
  2015-06-17  0:36 ` [PATCH v3 15/18] x86/tsc: Use rdtsc_ordered() in check_tsc_warp() and drop extra barriers Andy Lutomirski
                   ` (4 subsequent siblings)
  18 siblings, 2 replies; 57+ messages in thread
From: Andy Lutomirski @ 2015-06-17  0:36 UTC (permalink / raw)
  To: x86
  Cc: Borislav Petkov, Peter Zijlstra, John Stultz, linux-kernel,
	Len Brown, Huang Rui, Denys Vlasenko, kvm, Ralf Baechle,
	Andy Lutomirski

rdtsc_barrier(); rdtsc() is an unnecessary mouthful and requires
more thought than should be necessary.  Add an rdtsc_ordered()
helper and replace the trivial call sites with it.

This should not change generated code.  The duplication of the fence
asm is temporary.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/entry/vdso/vclock_gettime.c | 16 ++--------------
 arch/x86/include/asm/msr.h           | 26 ++++++++++++++++++++++++++
 arch/x86/kernel/trace_clock.c        |  7 +------
 arch/x86/kvm/x86.c                   | 16 ++--------------
 arch/x86/lib/delay.c                 |  9 +++------
 5 files changed, 34 insertions(+), 40 deletions(-)

diff --git a/arch/x86/entry/vdso/vclock_gettime.c b/arch/x86/entry/vdso/vclock_gettime.c
index 0340d93c18ca..ca94fa649251 100644
--- a/arch/x86/entry/vdso/vclock_gettime.c
+++ b/arch/x86/entry/vdso/vclock_gettime.c
@@ -175,20 +175,8 @@ static notrace cycle_t vread_pvclock(int *mode)
 
 notrace static cycle_t vread_tsc(void)
 {
-	cycle_t ret;
-	u64 last;
-
-	/*
-	 * Empirically, a fence (of type that depends on the CPU)
-	 * before rdtsc is enough to ensure that rdtsc is ordered
-	 * with respect to loads.  The various CPU manuals are unclear
-	 * as to whether rdtsc can be reordered with later loads,
-	 * but no one has ever seen it happen.
-	 */
-	rdtsc_barrier();
-	ret = (cycle_t)rdtsc();
-
-	last = gtod->cycle_last;
+	cycle_t ret = (cycle_t)rdtsc_ordered();
+	u64 last = gtod->cycle_last;
 
 	if (likely(ret >= last))
 		return ret;
diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
index ff0c120dafe5..02bdd6c65017 100644
--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -127,6 +127,32 @@ static __always_inline unsigned long long rdtsc(void)
 	return EAX_EDX_VAL(val, low, high);
 }
 
+/**
+ * rdtsc_ordered() - read the current TSC in program order
+ *
+ * rdtsc_ordered() returns the result of RDTSC as a 64-bit integer.
+ * It is ordered like a load to a global in-memory counter.  It should
+ * be impossible to observe non-monotonic rdtsc_ordered() behavior
+ * across multiple CPUs as long as the TSC is synced.
+ */
+static __always_inline unsigned long long rdtsc_ordered(void)
+{
+	/*
+	 * The RDTSC instruction is not ordered relative to memory
+	 * access.  The Intel SDM and the AMD APM are both vague on this
+	 * point, but empirically an RDTSC instruction can be
+	 * speculatively executed before prior loads.  An RDTSC
+	 * immediately after an appropriate barrier appears to be
+	 * ordered as a normal load, that is, it provides the same
+	 * ordering guarantees as reading from a global memory location
+	 * that some other imaginary CPU is updating continuously with a
+	 * time stamp.
+	 */
+	alternative_2("", "mfence", X86_FEATURE_MFENCE_RDTSC,
+			  "lfence", X86_FEATURE_LFENCE_RDTSC);
+	return rdtsc();
+}
+
 static inline unsigned long long native_read_pmc(int counter)
 {
 	DECLARE_ARGS(val, low, high);
diff --git a/arch/x86/kernel/trace_clock.c b/arch/x86/kernel/trace_clock.c
index 67efb8c96fc4..80bb24d9b880 100644
--- a/arch/x86/kernel/trace_clock.c
+++ b/arch/x86/kernel/trace_clock.c
@@ -12,10 +12,5 @@
  */
 u64 notrace trace_clock_x86_tsc(void)
 {
-	u64 ret;
-
-	rdtsc_barrier();
-	ret = rdtsc();
-
-	return ret;
+	return rdtsc_ordered();
 }
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b0afdc74c28a..dfccaf2f2e00 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1419,20 +1419,8 @@ EXPORT_SYMBOL_GPL(kvm_write_tsc);
 
 static cycle_t read_tsc(void)
 {
-	cycle_t ret;
-	u64 last;
-
-	/*
-	 * Empirically, a fence (of type that depends on the CPU)
-	 * before rdtsc is enough to ensure that rdtsc is ordered
-	 * with respect to loads.  The various CPU manuals are unclear
-	 * as to whether rdtsc can be reordered with later loads,
-	 * but no one has ever seen it happen.
-	 */
-	rdtsc_barrier();
-	ret = (cycle_t)rdtsc();
-
-	last = pvclock_gtod_data.clock.cycle_last;
+	cycle_t ret = (cycle_t)rdtsc_ordered();
+	u64 last = pvclock_gtod_data.clock.cycle_last;
 
 	if (likely(ret >= last))
 		return ret;
diff --git a/arch/x86/lib/delay.c b/arch/x86/lib/delay.c
index f24bc59ab0a0..4453d52a143d 100644
--- a/arch/x86/lib/delay.c
+++ b/arch/x86/lib/delay.c
@@ -54,11 +54,9 @@ static void delay_tsc(unsigned long __loops)
 
 	preempt_disable();
 	cpu = smp_processor_id();
-	rdtsc_barrier();
-	bclock = rdtsc();
+	bclock = rdtsc_ordered();
 	for (;;) {
-		rdtsc_barrier();
-		now = rdtsc();
+		now = rdtsc_ordered();
 		if ((now - bclock) >= loops)
 			break;
 
@@ -79,8 +77,7 @@ static void delay_tsc(unsigned long __loops)
 		if (unlikely(cpu != smp_processor_id())) {
 			loops -= (now - bclock);
 			cpu = smp_processor_id();
-			rdtsc_barrier();
-			bclock = rdtsc();
+			bclock = rdtsc_ordered();
 		}
 	}
 	preempt_enable();
-- 
2.4.2
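
Reader's note: a minimal userspace sketch of the "fence, then RDTSC" pattern that rdtsc_ordered() wraps, plus the clamp against cycle_last seen in vread_tsc() and KVM's read_tsc(). It assumes GCC or Clang on x86-64 and assumes LFENCE is the appropriate fence on the machine running it; the kernel instead patches in MFENCE or LFENCE at boot based on X86_FEATURE_MFENCE_RDTSC / X86_FEATURE_LFENCE_RDTSC. All function names here are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#if !defined(__x86_64__)
#error "this sketch assumes an x86-64 target"
#endif

/* Raw RDTSC: no ordering against surrounding loads. */
uint64_t user_rdtsc(void)
{
	uint32_t lo, hi;

	__asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
	return ((uint64_t)hi << 32) | lo;
}

/*
 * Fence first, then read the counter.  LFENCE is an assumption here;
 * the "memory" clobber also stops the compiler from moving memory
 * accesses across the fence.
 */
uint64_t user_rdtsc_ordered(void)
{
	__asm__ __volatile__("lfence" ::: "memory");
	return user_rdtsc();
}

/*
 * Clamp as in vread_tsc(): never report a value older than the last
 * one the timekeeping code recorded.
 */
uint64_t clamp_to_last(uint64_t now, uint64_t last)
{
	return now >= last ? now : last;
}
```

A caller would write something like clamp_to_last(user_rdtsc_ordered(), last), where last stands in for gtod->cycle_last.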


* [PATCH v3 15/18] x86/tsc: Use rdtsc_ordered() in check_tsc_warp() and drop extra barriers
  2015-06-17  0:35 [PATCH v3 00/18] x86/tsc: Clean up rdtsc helpers Andy Lutomirski
                   ` (13 preceding siblings ...)
  2015-06-17  0:36 ` [PATCH v3 14/18] x86: Add rdtsc_ordered() and use it in trivial call sites Andy Lutomirski
@ 2015-06-17  0:36 ` Andy Lutomirski
  2015-07-06 15:44   ` [tip:x86/asm] x86/asm/tsc/sync: " tip-bot for Andy Lutomirski
  2015-06-17  0:36 ` [PATCH v3 16/18] x86/tsc: In read_tsc, use rdtsc_ordered() instead of get_cycles() Andy Lutomirski
                   ` (3 subsequent siblings)
  18 siblings, 1 reply; 57+ messages in thread
From: Andy Lutomirski @ 2015-06-17  0:36 UTC (permalink / raw)
  To: x86
  Cc: Borislav Petkov, Peter Zijlstra, John Stultz, linux-kernel,
	Len Brown, Huang Rui, Denys Vlasenko, kvm, Ralf Baechle,
	Andy Lutomirski

Using get_cycles was unnecessary: check_tsc_warp() is not called on
TSC-less systems.  Replace rdtsc_barrier(); get_cycles() with
rdtsc_ordered().

While we're at it, make the somewhat more dangerous change of
removing the rdtsc_barrier() after RDTSC in the TSC warp check
code.  This should be okay, though -- the vDSO TSC code doesn't have
that barrier, so, if removing the barrier from the warp check would
cause us to detect a warp that we otherwise wouldn't detect, then we
have a genuine bug.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/kernel/tsc_sync.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/tsc_sync.c b/arch/x86/kernel/tsc_sync.c
index dd8d0791dfb5..78083bf23ed1 100644
--- a/arch/x86/kernel/tsc_sync.c
+++ b/arch/x86/kernel/tsc_sync.c
@@ -39,16 +39,15 @@ static cycles_t max_warp;
 static int nr_warps;
 
 /*
- * TSC-warp measurement loop running on both CPUs:
+ * TSC-warp measurement loop running on both CPUs.  This is not called
+ * if there is no TSC.
  */
 static void check_tsc_warp(unsigned int timeout)
 {
 	cycles_t start, now, prev, end;
 	int i;
 
-	rdtsc_barrier();
-	start = get_cycles();
-	rdtsc_barrier();
+	start = rdtsc_ordered();
 	/*
 	 * The measurement runs for 'timeout' msecs:
 	 */
@@ -63,9 +62,7 @@ static void check_tsc_warp(unsigned int timeout)
 		 */
 		arch_spin_lock(&sync_lock);
 		prev = last_tsc;
-		rdtsc_barrier();
-		now = get_cycles();
-		rdtsc_barrier();
+		now = rdtsc_ordered();
 		last_tsc = now;
 		arch_spin_unlock(&sync_lock);
 
@@ -126,7 +123,7 @@ void check_tsc_sync_source(int cpu)
 
 	/*
 	 * No need to check if we already know that the TSC is not
-	 * synchronized:
+	 * synchronized or if we have no TSC.
 	 */
 	if (unsynchronized_tsc())
 		return;
@@ -190,6 +187,7 @@ void check_tsc_sync_target(void)
 {
 	int cpus = 2;
 
+	/* Also aborts if there is no TSC. */
 	if (unsynchronized_tsc() || tsc_clocksource_reliable)
 		return;
 
-- 
2.4.2
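
Reader's note: the hunks above show only the sampling half of check_tsc_warp() (prev = last_tsc; now = rdtsc_ordered(); last_tsc = now). A minimal single-threaded sketch of the bookkeeping, assuming a warp is counted whenever the previously published value is larger than the new sample. The struct and function names are hypothetical, timestamps are passed in rather than read from hardware, and the locking that the kernel does with sync_lock across two CPUs is omitted.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the file-scope last_tsc/max_warp/nr_warps. */
struct warp_state {
	uint64_t last_tsc;
	uint64_t max_warp;
	unsigned int nr_warps;
};

/* Publish one sample and record a warp if time appeared to go backwards. */
void warp_observe(struct warp_state *s, uint64_t now)
{
	uint64_t prev = s->last_tsc;

	s->last_tsc = now;
	if (prev > now) {
		uint64_t warp = prev - now;

		if (warp > s->max_warp)
			s->max_warp = warp;
		s->nr_warps++;
	}
}

void warp_observe_demo(void)
{
	struct warp_state s = { 0, 0, 0 };
	uint64_t samples[] = { 100, 105, 103, 110 };	/* 105 -> 103 warps by 2 */
	unsigned int i;

	for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
		warp_observe(&s, samples[i]);

	assert(s.nr_warps == 1);
	assert(s.max_warp == 2);
}
```

If rdtsc_ordered() behaves like a load from a shared counter, as its comment claims, samples taken under the lock should never trip this check on a machine with synced TSCs.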


* [PATCH v3 16/18] x86/tsc: In read_tsc, use rdtsc_ordered() instead of get_cycles()
  2015-06-17  0:35 [PATCH v3 00/18] x86/tsc: Clean up rdtsc helpers Andy Lutomirski
                   ` (14 preceding siblings ...)
  2015-06-17  0:36 ` [PATCH v3 15/18] x86/tsc: Use rdtsc_ordered() in check_tsc_warp() and drop extra barriers Andy Lutomirski
@ 2015-06-17  0:36 ` Andy Lutomirski
  2015-07-06 15:44   ` [tip:x86/asm] x86/asm/tsc: Use rdtsc_ordered() in read_tsc() " tip-bot for Andy Lutomirski
  2015-06-17  0:36 ` [PATCH v3 17/18] x86/kvm/tsc: Drop extra barrier and use rdtsc_ordered in kvmclock Andy Lutomirski
                   ` (2 subsequent siblings)
  18 siblings, 1 reply; 57+ messages in thread
From: Andy Lutomirski @ 2015-06-17  0:36 UTC (permalink / raw)
  To: x86
  Cc: Borislav Petkov, Peter Zijlstra, John Stultz, linux-kernel,
	Len Brown, Huang Rui, Denys Vlasenko, kvm, Ralf Baechle,
	Andy Lutomirski

There are two logical changes here.  First, this removes a check for
cpu_has_tsc.  That check is unnecessary, as we don't register the
TSC as a clocksource on systems that have no TSC.  Second, it adds a
barrier, thus preventing observable non-monotonicity.

I suspect that the missing barrier was never a problem in practice
because system calls themselves were heavy enough barriers to
prevent user code from observing time warps due to speculation.
(Without the corresponding barrier in the vDSO, however,
non-monotonicity is easy to detect.)

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/kernel/tsc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 21d6e04e3e82..451bade0d320 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -961,7 +961,7 @@ static struct clocksource clocksource_tsc;
  */
 static cycle_t read_tsc(struct clocksource *cs)
 {
-	return (cycle_t)get_cycles();
+	return (cycle_t)rdtsc_ordered();
 }
 
 /*
-- 
2.4.2


* [PATCH v3 17/18] x86/kvm/tsc: Drop extra barrier and use rdtsc_ordered in kvmclock
  2015-06-17  0:35 [PATCH v3 00/18] x86/tsc: Clean up rdtsc helpers Andy Lutomirski
                   ` (15 preceding siblings ...)
  2015-06-17  0:36 ` [PATCH v3 16/18] x86/tsc: In read_tsc, use rdtsc_ordered() instead of get_cycles() Andy Lutomirski
@ 2015-06-17  0:36 ` Andy Lutomirski
  2015-06-17  7:47   ` Paolo Bonzini
  2015-07-06 15:45   ` [tip:x86/asm] x86/asm/tsc, x86/kvm: Drop open-coded barrier and use rdtsc_ordered() " tip-bot for Andy Lutomirski
  2015-06-17  0:36 ` [PATCH v3 18/18] x86/tsc: Remove rdtsc_barrier() Andy Lutomirski
  2015-06-17 11:11 ` [PATCH v3 00/18] x86/tsc: Clean up rdtsc helpers Borislav Petkov
  18 siblings, 2 replies; 57+ messages in thread
From: Andy Lutomirski @ 2015-06-17  0:36 UTC (permalink / raw)
  To: x86
  Cc: Borislav Petkov, Peter Zijlstra, John Stultz, linux-kernel,
	Len Brown, Huang Rui, Denys Vlasenko, kvm, Ralf Baechle,
	Andy Lutomirski, Paolo Bonzini, Radim Krcmar, Marcelo Tosatti

__pvclock_read_cycles had an unnecessary barrier.  Get rid of that
barrier and clean up the code by just using rdtsc_ordered().

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krcmar <rkrcmar@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: kvm@vger.kernel.org
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---

I'm hoping to get an ack for this to go in through -tip.  (Arguably
I'm the maintainer of this code given how it's used, but I should
still ask for an ack.)

 arch/x86/include/asm/pvclock.h | 21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/arch/x86/include/asm/pvclock.h b/arch/x86/include/asm/pvclock.h
index 6084bce345fc..cf2329ca4812 100644
--- a/arch/x86/include/asm/pvclock.h
+++ b/arch/x86/include/asm/pvclock.h
@@ -62,7 +62,18 @@ static inline u64 pvclock_scale_delta(u64 delta, u32 mul_frac, int shift)
 static __always_inline
 u64 pvclock_get_nsec_offset(const struct pvclock_vcpu_time_info *src)
 {
-	u64 delta = rdtsc() - src->tsc_timestamp;
+	/*
+	 * Note: emulated platforms which do not advertise SSE2 support
+	 * break rdtsc_ordered, resulting in kvmclock not using the
+	 * necessary RDTSC barriers.  Without barriers, it is possible
+	 * that RDTSC instruction is executed before prior loads,
+	 * resulting in violation of monotonicity.
+	 *
+	 * On an SMP guest without SSE2, it's unclear how anything is
+	 * supposed to work correctly, though -- memory fences
+	 * (e.g. smp_mb) are important for more than just timing.
+	 */
+	u64 delta = rdtsc_ordered() - src->tsc_timestamp;
 	return pvclock_scale_delta(delta, src->tsc_to_system_mul,
 				   src->tsc_shift);
 }
@@ -76,17 +87,9 @@ unsigned __pvclock_read_cycles(const struct pvclock_vcpu_time_info *src,
 	u8 ret_flags;
 
 	version = src->version;
-	/* Note: emulated platforms which do not advertise SSE2 support
-	 * result in kvmclock not using the necessary RDTSC barriers.
-	 * Without barriers, it is possible that RDTSC instruction reads from
-	 * the time stamp counter outside rdtsc_barrier protected section
-	 * below, resulting in violation of monotonicity.
-	 */
-	rdtsc_barrier();
 	offset = pvclock_get_nsec_offset(src);
 	ret = src->system_time + offset;
 	ret_flags = src->flags;
-	rdtsc_barrier();
 
 	*cycles = ret;
 	*flags = ret_flags;
-- 
2.4.2
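
Reader's note: pvclock_get_nsec_offset() above subtracts tsc_timestamp from the current TSC and hands the delta to pvclock_scale_delta(), whose body is not shown in this thread. A minimal sketch of that conversion, assuming mul_frac is a 0.32 fixed-point multiplier applied after a left or right shift. It relies on the unsigned __int128 extension of GCC/Clang on 64-bit targets, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of the scaling step: shift the tick delta, then
 * multiply by a 32-bit fraction (mul_frac / 2^32).
 */
uint64_t scale_delta(uint64_t delta, uint32_t mul_frac, int shift)
{
	if (shift < 0)
		delta >>= -shift;
	else
		delta <<= shift;

	/* 64x32 -> 96-bit product; keep the bits above the fraction. */
	return (uint64_t)(((unsigned __int128)delta * mul_frac) >> 32);
}

/* Nanoseconds elapsed since the hypervisor last updated the time info. */
uint64_t nsec_offset(uint64_t tsc_now, uint64_t tsc_timestamp,
		     uint32_t mul_frac, int shift)
{
	return scale_delta(tsc_now - tsc_timestamp, mul_frac, shift);
}

void scale_delta_demo(void)
{
	/* 0x80000000 / 2^32 = 0.5 ns per tick, i.e. a 2 GHz counter. */
	assert(nsec_offset(5000, 4000, 0x80000000u, 0) == 500);
}
```

The point of the patch is only that tsc_now should come from an ordered read; the arithmetic itself is unchanged.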


* [PATCH v3 18/18] x86/tsc: Remove rdtsc_barrier()
  2015-06-17  0:35 [PATCH v3 00/18] x86/tsc: Clean up rdtsc helpers Andy Lutomirski
                   ` (16 preceding siblings ...)
  2015-06-17  0:36 ` [PATCH v3 17/18] x86/kvm/tsc: Drop extra barrier and use rdtsc_ordered in kvmclock Andy Lutomirski
@ 2015-06-17  0:36 ` Andy Lutomirski
  2015-07-06 15:45   ` [tip:x86/asm] x86/asm/tsc: " tip-bot for Andy Lutomirski
  2015-06-17 11:11 ` [PATCH v3 00/18] x86/tsc: Clean up rdtsc helpers Borislav Petkov
  18 siblings, 1 reply; 57+ messages in thread
From: Andy Lutomirski @ 2015-06-17  0:36 UTC (permalink / raw)
  To: x86
  Cc: Borislav Petkov, Peter Zijlstra, John Stultz, linux-kernel,
	Len Brown, Huang Rui, Denys Vlasenko, kvm, Ralf Baechle,
	Andy Lutomirski

All callers have been converted to rdtsc_ordered().

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/include/asm/barrier.h | 11 -----------
 arch/x86/um/asm/barrier.h      | 13 -------------
 2 files changed, 24 deletions(-)

diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
index e51a8f803f55..818cb8788225 100644
--- a/arch/x86/include/asm/barrier.h
+++ b/arch/x86/include/asm/barrier.h
@@ -91,15 +91,4 @@ do {									\
 #define smp_mb__before_atomic()	barrier()
 #define smp_mb__after_atomic()	barrier()
 
-/*
- * Stop RDTSC speculation. This is needed when you need to use RDTSC
- * (or get_cycles or vread that possibly accesses the TSC) in a defined
- * code region.
- */
-static __always_inline void rdtsc_barrier(void)
-{
-	alternative_2("", "mfence", X86_FEATURE_MFENCE_RDTSC,
-			  "lfence", X86_FEATURE_LFENCE_RDTSC);
-}
-
 #endif /* _ASM_X86_BARRIER_H */
diff --git a/arch/x86/um/asm/barrier.h b/arch/x86/um/asm/barrier.h
index b9531d343134..755481f14d90 100644
--- a/arch/x86/um/asm/barrier.h
+++ b/arch/x86/um/asm/barrier.h
@@ -45,17 +45,4 @@
 #define read_barrier_depends()		do { } while (0)
 #define smp_read_barrier_depends()	do { } while (0)
 
-/*
- * Stop RDTSC speculation. This is needed when you need to use RDTSC
- * (or get_cycles or vread that possibly accesses the TSC) in a defined
- * code region.
- *
- * (Could use an alternative three way for this if there was one.)
- */
-static inline void rdtsc_barrier(void)
-{
-	alternative_2("", "mfence", X86_FEATURE_MFENCE_RDTSC,
-			  "lfence", X86_FEATURE_LFENCE_RDTSC);
-}
-
 #endif
-- 
2.4.2

* Re: [PATCH v3 08/18] baycom_epp: Replace rdtscl() with native_read_tsc()
  2015-06-17  0:35 ` [PATCH v3 08/18] baycom_epp: Replace rdtscl() with native_read_tsc() Andy Lutomirski
@ 2015-06-17  0:49   ` Thomas Sailer
  2015-06-20 13:54   ` walter harms
  2015-07-06 15:42   ` [tip:x86/asm] x86/asm/tsc, drivers/net/hamradio/baycom_epp: " tip-bot for Andy Lutomirski
  2 siblings, 0 replies; 57+ messages in thread
From: Thomas Sailer @ 2015-06-17  0:49 UTC (permalink / raw)
  To: Andy Lutomirski, x86
  Cc: Borislav Petkov, Peter Zijlstra, John Stultz, linux-kernel,
	Len Brown, Huang Rui, Denys Vlasenko, kvm, Ralf Baechle,
	walter harms, linux-hams

Acked-by: Thomas Sailer <t.sailer@alumni.ethz.ch>

On 06/17/2015 02:35 AM, Andy Lutomirski wrote:
> This is only used if BAYCOM_DEBUG is defined.
>
> Cc: walter harms <wharms@bfs.de>
> Cc: Ralf Baechle <ralf@linux-mips.org>
> Cc: Thomas Sailer <t.sailer@alumni.ethz.ch>
> Cc: linux-hams@vger.kernel.org
> Signed-off-by: Andy Lutomirski <luto@kernel.org>
> ---
>
> I'm hoping for an ack for this to go through -tip.
>
>   drivers/net/hamradio/baycom_epp.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/hamradio/baycom_epp.c b/drivers/net/hamradio/baycom_epp.c
> index 83c7cce0d172..44e5c3b5e0af 100644
> --- a/drivers/net/hamradio/baycom_epp.c
> +++ b/drivers/net/hamradio/baycom_epp.c
> @@ -638,7 +638,7 @@ static int receive(struct net_device *dev, int cnt)
>   #define GETTICK(x)                                                \
>   ({                                                                \
>   	if (cpu_has_tsc)                                          \
> -		rdtscl(x);                                        \
> +		x = (unsigned int)native_read_tsc();		  \
>   })
>   #else /* __i386__ */
>   #define GETTICK(x)


* Re: [PATCH v3 17/18] x86/kvm/tsc: Drop extra barrier and use rdtsc_ordered in kvmclock
  2015-06-17  0:36 ` [PATCH v3 17/18] x86/kvm/tsc: Drop extra barrier and use rdtsc_ordered in kvmclock Andy Lutomirski
@ 2015-06-17  7:47   ` Paolo Bonzini
  2015-06-17 13:31     ` Paolo Bonzini
  2015-06-20 21:50     ` Borislav Petkov
  2015-07-06 15:45   ` [tip:x86/asm] x86/asm/tsc, x86/kvm: Drop open-coded barrier and use rdtsc_ordered() " tip-bot for Andy Lutomirski
  1 sibling, 2 replies; 57+ messages in thread
From: Paolo Bonzini @ 2015-06-17  7:47 UTC (permalink / raw)
  To: Andy Lutomirski, x86
  Cc: Borislav Petkov, Peter Zijlstra, John Stultz, linux-kernel,
	Len Brown, Huang Rui, Denys Vlasenko, kvm, Ralf Baechle,
	Radim Krcmar, Marcelo Tosatti



On 17/06/2015 02:36, Andy Lutomirski wrote:
> __pvclock_read_cycles had an unnecessary barrier.  Get rid of that
> barrier and clean up the code by just using rdtsc_ordered().
> 
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Radim Krcmar <rkrcmar@redhat.com>
> Cc: Marcelo Tosatti <mtosatti@redhat.com>
> Cc: kvm@vger.kernel.org
> Signed-off-by: Andy Lutomirski <luto@kernel.org>
> ---
> 
> I'm hoping to get an ack for this to go in through -tip.  (Arguably
> I'm the maintainer of this code given how it's used, but I should
> still ask for an ack.)
> 
> arch/x86/include/asm/pvclock.h | 21 ++++++++++++---------
>  1 file changed, 12 insertions(+), 9 deletions(-)

Can you send a URL to the rest of the series?  I've never even seen v1
or v2 so I have no idea of what this is about.

> diff --git a/arch/x86/include/asm/pvclock.h b/arch/x86/include/asm/pvclock.h
> index 6084bce345fc..cf2329ca4812 100644
> --- a/arch/x86/include/asm/pvclock.h
> +++ b/arch/x86/include/asm/pvclock.h
> @@ -62,7 +62,18 @@ static inline u64 pvclock_scale_delta(u64 delta, u32 mul_frac, int shift)
>  static __always_inline
>  u64 pvclock_get_nsec_offset(const struct pvclock_vcpu_time_info *src)
>  {
> -	u64 delta = rdtsc() - src->tsc_timestamp;
> +	/*
> +	 * Note: emulated platforms which do not advertise SSE2 support
> +	 * break rdtsc_ordered, resulting in kvmclock not using the
> +	 * necessary RDTSC barriers.  Without barriers, it is possible
> +	 * that RDTSC instruction is executed before prior loads,
> +	 * resulting in violation of monotonicity.
> +	 *
> +	 * On an SMP guest without SSE2, it's unclear how anything is
> +	 * supposed to work correctly, though -- memory fences
> +	 * (e.g. smp_mb) are important for more than just timing.
> +	 */

On an SMP guest without SSE2, memory fences are obtained with e.g. "lock
addb $0, (%esp)".

> +	u64 delta = rdtsc_ordered() - src->tsc_timestamp;
>  	return pvclock_scale_delta(delta, src->tsc_to_system_mul,
>  				   src->tsc_shift);
>  }
> @@ -76,17 +87,9 @@ unsigned __pvclock_read_cycles(const struct pvclock_vcpu_time_info *src,
>  	u8 ret_flags;
>  
>  	version = src->version;
> -	/* Note: emulated platforms which do not advertise SSE2 support
> -	 * result in kvmclock not using the necessary RDTSC barriers.
> -	 * Without barriers, it is possible that RDTSC instruction reads from
> -	 * the time stamp counter outside rdtsc_barrier protected section
> -	 * below, resulting in violation of monotonicity.
> -	 */
> -	rdtsc_barrier();
>  	offset = pvclock_get_nsec_offset(src);
>  	ret = src->system_time + offset;
>  	ret_flags = src->flags;
> -	rdtsc_barrier();
>  
>  	*cycles = ret;
>  	*flags = ret_flags;
> 
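
Reader's note on the "lock addb $0, (%esp)" remark above: a minimal sketch of a full memory fence built from a locked read-modify-write on the top of the stack, for CPUs that lack MFENCE/LFENCE. It assumes GCC or Clang inline asm on an x86 target; the function name is hypothetical. The thread does not say whether such a locked operation also orders RDTSC, and this sketch makes no claim about that.

```c
#include <assert.h>

#if defined(__x86_64__)
#define STACK_TOP "(%%rsp)"
#elif defined(__i386__)
#define STACK_TOP "(%%esp)"
#else
#error "this sketch assumes an x86 target"
#endif

/*
 * Adding zero leaves the byte at the stack pointer unchanged; the LOCK
 * prefix is what provides the fence.  The "memory" clobber keeps the
 * compiler from reordering accesses across it, and "cc" is listed
 * because ADD writes the flags.
 */
void locked_add_fence(void)
{
	__asm__ __volatile__("lock; addb $0, " STACK_TOP ::: "memory", "cc");
}

void locked_add_fence_demo(void)
{
	volatile int data = 0;

	data = 42;
	locked_add_fence();	/* store above is globally visible before later loads */
	assert(data == 42);
}
```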

* Re: [PATCH v3 01/18] x86/tsc: Inline native_read_tsc and remove __native_read_tsc
  2015-06-17  0:35 ` [PATCH v3 01/18] x86/tsc: Inline native_read_tsc and remove __native_read_tsc Andy Lutomirski
@ 2015-06-17  9:26   ` Borislav Petkov
  2015-07-06 15:39   ` [tip:x86/asm] x86/asm/tsc: Inline native_read_tsc() and remove __native_read_tsc() tip-bot for Andy Lutomirski
  1 sibling, 0 replies; 57+ messages in thread
From: Borislav Petkov @ 2015-06-17  9:26 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: x86, Peter Zijlstra, John Stultz, linux-kernel, Len Brown,
	Huang Rui, Denys Vlasenko, kvm, Ralf Baechle

On Tue, Jun 16, 2015 at 05:35:49PM -0700, Andy Lutomirski wrote:
> In cdc7957d1954 ("x86: move native_read_tsc() offline"),
> native_read_tsc was moved out of line, presumably for some
> now-obsolete vDSO-related reason.  Undo it.
> 
> The entire rdtsc, shl, or sequence is only 11 bytes, and calls via
> rdtscl and similar helpers were already inlined.
> 
> Signed-off-by: Andy Lutomirski <luto@kernel.org>
> ---
>  arch/x86/entry/vdso/vclock_gettime.c  | 2 +-
>  arch/x86/include/asm/msr.h            | 8 +++-----
>  arch/x86/include/asm/pvclock.h        | 2 +-
>  arch/x86/include/asm/stackprotector.h | 2 +-
>  arch/x86/include/asm/tsc.h            | 2 +-
>  arch/x86/kernel/apb_timer.c           | 4 ++--
>  arch/x86/kernel/tsc.c                 | 6 ------
>  7 files changed, 9 insertions(+), 17 deletions(-)

Acked-by: Borislav Petkov <bp@suse.de>

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

* Re: [PATCH v3 02/18] x86/msr/kvm: Remove vget_cycles()
  2015-06-17  0:35 ` [PATCH v3 02/18] x86/msr/kvm: Remove vget_cycles() Andy Lutomirski
@ 2015-06-17  9:42   ` Borislav Petkov
  2015-06-17 13:34   ` Paolo Bonzini
  2015-07-06 15:40   ` [tip:x86/asm] x86/asm/tsc, kvm: " tip-bot for Andy Lutomirski
  2 siblings, 0 replies; 57+ messages in thread
From: Borislav Petkov @ 2015-06-17  9:42 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: x86, Peter Zijlstra, John Stultz, linux-kernel, Len Brown,
	Huang Rui, Denys Vlasenko, kvm, Ralf Baechle

On Tue, Jun 16, 2015 at 05:35:50PM -0700, Andy Lutomirski wrote:
> The only caller was kvm's read_tsc.  The only difference between
> vget_cycles and native_read_tsc was that vget_cycles returned zero
> instead of crashing on TSC-less systems.  KVM already checks
> vclock_mode before calling that function, so the extra check is
> unnecessary.
> 
> (Off-topic, but the whole KVM clock host implementation is gross.
>  IMO it should be rewritten.)
> 
> Signed-off-by: Andy Lutomirski <luto@kernel.org>
> ---
>  arch/x86/include/asm/tsc.h | 13 -------------
>  arch/x86/kvm/x86.c         |  2 +-
>  2 files changed, 1 insertion(+), 14 deletions(-)

Acked-by: Borislav Petkov <bp@suse.de>

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

* Re: [PATCH v3 03/18] x86/tsc/paravirt: Remove the read_tsc and read_tscp paravirt hooks
  2015-06-17  0:35 ` [PATCH v3 03/18] x86/tsc/paravirt: Remove the read_tsc and read_tscp paravirt hooks Andy Lutomirski
@ 2015-06-17  9:56   ` Borislav Petkov
  2015-06-19 15:32   ` Borislav Petkov
  2015-07-06 15:40   ` [tip:x86/asm] x86/asm/tsc, x86/paravirt: Remove read_tsc() and read_tscp() " tip-bot for Andy Lutomirski
  2 siblings, 0 replies; 57+ messages in thread
From: Borislav Petkov @ 2015-06-17  9:56 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Denys Vlasenko, kvm, Peter Zijlstra, x86, linux-kernel,
	Ralf Baechle, virtualization, Huang Rui, John Stultz, Len Brown

+ paravirt list.

On Tue, Jun 16, 2015 at 05:35:51PM -0700, Andy Lutomirski wrote:
> We've had read_tsc and read_tscp paravirt hooks since the very
> beginning of paravirt, i.e., d3561b7fa0fb ("[PATCH] paravirt: header
> and stubs for paravirtualisation").  AFAICT the only paravirt guest
> implementation that ever replaced these calls was vmware, and it's
> gone.  Arguably even vmware shouldn't have hooked rdtsc -- we fully
> support systems that don't have a TSC at all, so there's no point
> for a paravirt implementation to pretend that we have a TSC but to
> replace it.
> 
> I also doubt that these hooks actually worked.  Calls to rdtscl and
> rdtscll, which respected the hooks, were used seemingly
> interchangeably with native_read_tsc, which did not.
> 
> Just remove them.  If anyone ever needs them again, they can try
> to make a case for why they need them.
> 
> Before, on a paravirt config:
>    text	   data	    bss	    dec	    hex	filename
> 13426505	1827056	14508032	29761593	1c62039	vmlinux
> 
> After:
>    text	   data	    bss	    dec	    hex	filename
> 13426617	1827056	14508032	29761705	1c620a9	vmlinux
> 
> Signed-off-by: Andy Lutomirski <luto@kernel.org>
> ---
>  arch/x86/include/asm/msr.h            | 16 ++++++++--------
>  arch/x86/include/asm/paravirt.h       | 34 ----------------------------------
>  arch/x86/include/asm/paravirt_types.h |  2 --
>  arch/x86/kernel/paravirt.c            |  2 --
>  arch/x86/kernel/paravirt_patch_32.c   |  2 --
>  arch/x86/xen/enlighten.c              |  3 ---
>  6 files changed, 8 insertions(+), 51 deletions(-)

Nice diffstat.

Acked-by: Borislav Petkov <bp@suse.de>

(leaving in the rest for reference)

> diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
> index 88711470af7f..d1afac7df484 100644
> --- a/arch/x86/include/asm/msr.h
> +++ b/arch/x86/include/asm/msr.h
> @@ -178,12 +178,6 @@ static inline int rdmsrl_safe(unsigned msr, unsigned long long *p)
>  	return err;
>  }
>  
> -#define rdtscl(low)						\
> -	((low) = (u32)native_read_tsc())
> -
> -#define rdtscll(val)						\
> -	((val) = native_read_tsc())
> -
>  #define rdpmc(counter, low, high)			\
>  do {							\
>  	u64 _l = native_read_pmc((counter));		\
> @@ -193,6 +187,14 @@ do {							\
>  
>  #define rdpmcl(counter, val) ((val) = native_read_pmc(counter))
>  
> +#endif	/* !CONFIG_PARAVIRT */
> +
> +#define rdtscl(low)						\
> +	((low) = (u32)native_read_tsc())
> +
> +#define rdtscll(val)						\
> +	((val) = native_read_tsc())
> +
>  #define rdtscp(low, high, aux)					\
>  do {                                                            \
>  	unsigned long long _val = native_read_tscp(&(aux));     \
> @@ -202,8 +204,6 @@ do {                                                            \
>  
>  #define rdtscpll(val, aux) (val) = native_read_tscp(&(aux))
>  
> -#endif	/* !CONFIG_PARAVIRT */
> -
>  /*
>   * 64-bit version of wrmsr_safe():
>   */
> diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
> index d143bfad45d7..c2be0375bcad 100644
> --- a/arch/x86/include/asm/paravirt.h
> +++ b/arch/x86/include/asm/paravirt.h
> @@ -174,19 +174,6 @@ static inline int rdmsrl_safe(unsigned msr, unsigned long long *p)
>  	return err;
>  }
>  
> -static inline u64 paravirt_read_tsc(void)
> -{
> -	return PVOP_CALL0(u64, pv_cpu_ops.read_tsc);
> -}
> -
> -#define rdtscl(low)				\
> -do {						\
> -	u64 _l = paravirt_read_tsc();		\
> -	low = (int)_l;				\
> -} while (0)
> -
> -#define rdtscll(val) (val = paravirt_read_tsc())
> -
>  static inline unsigned long long paravirt_sched_clock(void)
>  {
>  	return PVOP_CALL0(unsigned long long, pv_time_ops.sched_clock);
> @@ -215,27 +202,6 @@ do {						\
>  
>  #define rdpmcl(counter, val) ((val) = paravirt_read_pmc(counter))
>  
> -static inline unsigned long long paravirt_rdtscp(unsigned int *aux)
> -{
> -	return PVOP_CALL1(u64, pv_cpu_ops.read_tscp, aux);
> -}
> -
> -#define rdtscp(low, high, aux)				\
> -do {							\
> -	int __aux;					\
> -	unsigned long __val = paravirt_rdtscp(&__aux);	\
> -	(low) = (u32)__val;				\
> -	(high) = (u32)(__val >> 32);			\
> -	(aux) = __aux;					\
> -} while (0)
> -
> -#define rdtscpll(val, aux)				\
> -do {							\
> -	unsigned long __aux; 				\
> -	val = paravirt_rdtscp(&__aux);			\
> -	(aux) = __aux;					\
> -} while (0)
> -
>  static inline void paravirt_alloc_ldt(struct desc_struct *ldt, unsigned entries)
>  {
>  	PVOP_VCALL2(pv_cpu_ops.alloc_ldt, ldt, entries);
> diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
> index a6b8f9fadb06..ce029e4fa7c6 100644
> --- a/arch/x86/include/asm/paravirt_types.h
> +++ b/arch/x86/include/asm/paravirt_types.h
> @@ -156,9 +156,7 @@ struct pv_cpu_ops {
>  	u64 (*read_msr)(unsigned int msr, int *err);
>  	int (*write_msr)(unsigned int msr, unsigned low, unsigned high);
>  
> -	u64 (*read_tsc)(void);
>  	u64 (*read_pmc)(int counter);
> -	unsigned long long (*read_tscp)(unsigned int *aux);
>  
>  #ifdef CONFIG_X86_32
>  	/*
> diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
> index 58bcfb67c01f..f68e48f5f6c2 100644
> --- a/arch/x86/kernel/paravirt.c
> +++ b/arch/x86/kernel/paravirt.c
> @@ -351,9 +351,7 @@ __visible struct pv_cpu_ops pv_cpu_ops = {
>  	.wbinvd = native_wbinvd,
>  	.read_msr = native_read_msr_safe,
>  	.write_msr = native_write_msr_safe,
> -	.read_tsc = native_read_tsc,
>  	.read_pmc = native_read_pmc,
> -	.read_tscp = native_read_tscp,
>  	.load_tr_desc = native_load_tr_desc,
>  	.set_ldt = native_set_ldt,
>  	.load_gdt = native_load_gdt,
> diff --git a/arch/x86/kernel/paravirt_patch_32.c b/arch/x86/kernel/paravirt_patch_32.c
> index e1b013696dde..c89f50a76e97 100644
> --- a/arch/x86/kernel/paravirt_patch_32.c
> +++ b/arch/x86/kernel/paravirt_patch_32.c
> @@ -10,7 +10,6 @@ DEF_NATIVE(pv_mmu_ops, read_cr2, "mov %cr2, %eax");
>  DEF_NATIVE(pv_mmu_ops, write_cr3, "mov %eax, %cr3");
>  DEF_NATIVE(pv_mmu_ops, read_cr3, "mov %cr3, %eax");
>  DEF_NATIVE(pv_cpu_ops, clts, "clts");
> -DEF_NATIVE(pv_cpu_ops, read_tsc, "rdtsc");
>  
>  #if defined(CONFIG_PARAVIRT_SPINLOCKS) && defined(CONFIG_QUEUED_SPINLOCKS)
>  DEF_NATIVE(pv_lock_ops, queued_spin_unlock, "movb $0, (%eax)");
> @@ -52,7 +51,6 @@ unsigned native_patch(u8 type, u16 clobbers, void *ibuf,
>  		PATCH_SITE(pv_mmu_ops, read_cr3);
>  		PATCH_SITE(pv_mmu_ops, write_cr3);
>  		PATCH_SITE(pv_cpu_ops, clts);
> -		PATCH_SITE(pv_cpu_ops, read_tsc);
>  #if defined(CONFIG_PARAVIRT_SPINLOCKS) && defined(CONFIG_QUEUED_SPINLOCKS)
>  		case PARAVIRT_PATCH(pv_lock_ops.queued_spin_unlock):
>  			if (pv_is_native_spin_unlock()) {
> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index 0b95c9b8283f..32136bfca43f 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -1175,11 +1175,8 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = {
>  	.read_msr = xen_read_msr_safe,
>  	.write_msr = xen_write_msr_safe,
>  
> -	.read_tsc = native_read_tsc,
>  	.read_pmc = native_read_pmc,
>  
> -	.read_tscp = native_read_tscp,
> -
>  	.iret = xen_iret,
>  #ifdef CONFIG_X86_64
>  	.usergs_sysret32 = xen_sysret32,
> -- 
> 2.4.2
> 

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 04/18] x86/tsc: Replace rdtscll with native_read_tsc
  2015-06-17  0:35 ` [PATCH v3 04/18] x86/tsc: Replace rdtscll with native_read_tsc Andy Lutomirski
@ 2015-06-17 10:03   ` Borislav Petkov
  2015-07-06 15:40   ` [tip:x86/asm] x86/asm/tsc: Replace rdtscll() with native_read_tsc() tip-bot for Andy Lutomirski
  1 sibling, 0 replies; 57+ messages in thread
From: Borislav Petkov @ 2015-06-17 10:03 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: x86, Peter Zijlstra, John Stultz, linux-kernel, Len Brown,
	Huang Rui, Denys Vlasenko, kvm, Ralf Baechle

On Tue, Jun 16, 2015 at 05:35:52PM -0700, Andy Lutomirski wrote:
> Now that the read_tsc paravirt hook is gone, rdtscll() is just a
> wrapper around native_read_tsc().  Unwrap it.
> 
> Signed-off-by: Andy Lutomirski <luto@kernel.org>
> ---
>  arch/x86/boot/compressed/aslr.c                      | 2 +-
>  arch/x86/include/asm/msr.h                           | 3 ---
>  arch/x86/include/asm/tsc.h                           | 5 +----
>  arch/x86/kernel/apb_timer.c                          | 4 ++--
>  arch/x86/kernel/apic/apic.c                          | 8 ++++----
>  arch/x86/kernel/cpu/mcheck/mce.c                     | 4 ++--
>  arch/x86/kernel/espfix_64.c                          | 2 +-
>  arch/x86/kernel/hpet.c                               | 4 ++--
>  arch/x86/kernel/trace_clock.c                        | 2 +-
>  arch/x86/kernel/tsc.c                                | 4 ++--
>  arch/x86/kvm/vmx.c                                   | 2 +-
>  arch/x86/lib/delay.c                                 | 2 +-
>  drivers/thermal/intel_powerclamp.c                   | 4 ++--
>  tools/power/cpupower/debug/kernel/cpufreq-test_tsc.c | 4 ++--
>  14 files changed, 22 insertions(+), 28 deletions(-)

Acked-by: Borislav Petkov <bp@suse.de>

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 00/18] x86/tsc: Clean up rdtsc helpers
  2015-06-17  0:35 [PATCH v3 00/18] x86/tsc: Clean up rdtsc helpers Andy Lutomirski
                   ` (17 preceding siblings ...)
  2015-06-17  0:36 ` [PATCH v3 18/18] x86/tsc: Remove rdtsc_barrier() Andy Lutomirski
@ 2015-06-17 11:11 ` Borislav Petkov
  2015-06-17 13:37   ` Paolo Bonzini
  18 siblings, 1 reply; 57+ messages in thread
From: Borislav Petkov @ 2015-06-17 11:11 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: x86, Peter Zijlstra, John Stultz, linux-kernel, Len Brown,
	Huang Rui, Denys Vlasenko, kvm, Ralf Baechle

On Tue, Jun 16, 2015 at 05:35:48PM -0700, Andy Lutomirski wrote:
> My sincere apologies for the spam.  I sent an unholy mixture of the
> real patch set and an old poorly split-up patch set, and the result
> is incomprehensible.  Here's what I meant to send.
> 
> After some recent threads about rdtsc barriers, I remembered
> that our RDTSC wrappers are a big mess.  Let's clean it up.
> 
> Currently we have rdtscl, rdtscll, native_read_tsc,
> paravirt_read_tsc, and rdtsc_barrier.  For people who haven't
> noticed rdtsc_barrier and who haven't carefully read the docs,
> there's no indication that all of the other accessors have a giant
> ordering gotcha.  The macro forms are ugly, and the paravirt
> implementation is completely pointless.
> 
> rdtscl is particularly awful.  It reads the low bits.  There are no
> performance critical users of just the low bits anywhere in the
> kernel.
> 
> Clean it up.  After this patch set, there are exactly three
> functions.  rdtsc_unordered() is a function that does a raw RDTSC
> and returns a 64-bit number.  rdtsc_ordered() is a function that
> does a properly ordered RDTSC for general-purpose use.
> barrier_before_rdtsc() is exactly what it sounds like.
> 
> Changes from v2:
>  - Rename rdtsc_unordered to just rdtsc
>  - Get rid of rdtsc_barrier entirely instead of renaming it
>  - The KVM patch is new (see above)
>  - Added some acks

peterz reminded me that I'm lazy actually and don't reply to each patch :)

So, I like it, looks good, nice cleanup. It boots on my guest here - I
haven't done any baremetal testing though. Let's give people some more
time to look at it...

Thanks.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 17/18] x86/kvm/tsc: Drop extra barrier and use rdtsc_ordered in kvmclock
  2015-06-17  7:47   ` Paolo Bonzini
@ 2015-06-17 13:31     ` Paolo Bonzini
  2015-06-20 21:50     ` Borislav Petkov
  1 sibling, 0 replies; 57+ messages in thread
From: Paolo Bonzini @ 2015-06-17 13:31 UTC (permalink / raw)
  To: Andy Lutomirski, x86
  Cc: Borislav Petkov, Peter Zijlstra, John Stultz, linux-kernel,
	Len Brown, Huang Rui, Denys Vlasenko, kvm, Ralf Baechle,
	Radim Krcmar, Marcelo Tosatti



On 17/06/2015 09:47, Paolo Bonzini wrote:
> 
> 
> On 17/06/2015 02:36, Andy Lutomirski wrote:
>> __pvclock_read_cycles had an unnecessary barrier.  Get rid of that
>> barrier and clean up the code by just using rdtsc_ordered().
>>
>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>> Cc: Radim Krcmar <rkrcmar@redhat.com>
>> Cc: Marcelo Tosatti <mtosatti@redhat.com>
>> Cc: kvm@vger.kernel.org
>> Signed-off-by: Andy Lutomirski <luto@kernel.org>
>> ---
>>
>> I'm hoping to get an ack for this to go in through -tip.  (Arguably
>> I'm the maintainer of this code given how it's used, but I should
>> still ask for an ack.)
>>
>> arch/x86/include/asm/pvclock.h | 21 ++++++++++++---------
>>  1 file changed, 12 insertions(+), 9 deletions(-)
> 
> Can you send a URL to the rest of the series?  I've never even seen v1
> or v2 so I have no idea of what this is about.

Ah, it was sent to the KVM list, just not CCed to me. :)

Sorry, that's what you get when your unread message count does not fit
in three digits anymore.

Paolo

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 02/18] x86/msr/kvm: Remove vget_cycles()
  2015-06-17  0:35 ` [PATCH v3 02/18] x86/msr/kvm: Remove vget_cycles() Andy Lutomirski
  2015-06-17  9:42   ` Borislav Petkov
@ 2015-06-17 13:34   ` Paolo Bonzini
  2015-07-06 15:40   ` [tip:x86/asm] x86/asm/tsc, kvm: " tip-bot for Andy Lutomirski
  2 siblings, 0 replies; 57+ messages in thread
From: Paolo Bonzini @ 2015-06-17 13:34 UTC (permalink / raw)
  To: Andy Lutomirski, x86
  Cc: Borislav Petkov, Peter Zijlstra, John Stultz, linux-kernel,
	Len Brown, Huang Rui, Denys Vlasenko, kvm, Ralf Baechle



On 17/06/2015 02:35, Andy Lutomirski wrote:
> The only caller was kvm's read_tsc.  The only difference between
> vget_cycles and native_read_tsc was that vget_cycles returned zero
> instead of crashing on TSC-less systems.  KVM already checks
> vclock_mode before calling that function, so the extra check is
> unnecessary.

Or more simply, KVM (host-side) requires the TSC to exist.

Acked-by: Paolo Bonzini <pbonzini@redhat.com>

> (Off-topic, but the whole KVM clock host implementation is gross.
>  IMO it should be rewritten.)
> 
> Signed-off-by: Andy Lutomirski <luto@kernel.org>
> ---
>  arch/x86/include/asm/tsc.h | 13 -------------
>  arch/x86/kvm/x86.c         |  2 +-
>  2 files changed, 1 insertion(+), 14 deletions(-)
> 
> diff --git a/arch/x86/include/asm/tsc.h b/arch/x86/include/asm/tsc.h
> index fd11128faf25..3da1cc1218ac 100644
> --- a/arch/x86/include/asm/tsc.h
> +++ b/arch/x86/include/asm/tsc.h
> @@ -32,19 +32,6 @@ static inline cycles_t get_cycles(void)
>  	return ret;
>  }
>  
> -static __always_inline cycles_t vget_cycles(void)
> -{
> -	/*
> -	 * We only do VDSOs on TSC capable CPUs, so this shouldn't
> -	 * access boot_cpu_data (which is not VDSO-safe):
> -	 */
> -#ifndef CONFIG_X86_TSC
> -	if (!cpu_has_tsc)
> -		return 0;
> -#endif
> -	return (cycles_t)native_read_tsc();
> -}
> -
>  extern void tsc_init(void);
>  extern void mark_tsc_unstable(char *reason);
>  extern int unsynchronized_tsc(void);
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 26eaeb522cab..c26faf408bce 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1430,7 +1430,7 @@ static cycle_t read_tsc(void)
>  	 * but no one has ever seen it happen.
>  	 */
>  	rdtsc_barrier();
> -	ret = (cycle_t)vget_cycles();
> +	ret = (cycle_t)native_read_tsc();
>  
>  	last = pvclock_gtod_data.clock.cycle_last;
>  
> 

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 00/18] x86/tsc: Clean up rdtsc helpers
  2015-06-17 11:11 ` [PATCH v3 00/18] x86/tsc: Clean up rdtsc helpers Borislav Petkov
@ 2015-06-17 13:37   ` Paolo Bonzini
  0 siblings, 0 replies; 57+ messages in thread
From: Paolo Bonzini @ 2015-06-17 13:37 UTC (permalink / raw)
  To: Borislav Petkov, Andy Lutomirski
  Cc: x86, Peter Zijlstra, John Stultz, linux-kernel, Len Brown,
	Huang Rui, Denys Vlasenko, kvm, Ralf Baechle



On 17/06/2015 13:11, Borislav Petkov wrote:
> peterz reminded me that I'm lazy actually and don't reply to each patch :)
> 
> So, I like it, looks good, nice cleanup. It boots on my guest here - I
> haven't done any baremetal testing though. Let's give people some more
> time to look at it...

Same here.  I just remarked on some commit messages and comments.

Paolo

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 03/18] x86/tsc/paravirt: Remove the read_tsc and read_tscp paravirt hooks
  2015-06-17  0:35 ` [PATCH v3 03/18] x86/tsc/paravirt: Remove the read_tsc and read_tscp paravirt hooks Andy Lutomirski
  2015-06-17  9:56   ` Borislav Petkov
@ 2015-06-19 15:32   ` Borislav Petkov
  2015-06-19 16:14     ` Andy Lutomirski
  2015-07-06 15:40   ` [tip:x86/asm] x86/asm/tsc, x86/paravirt: Remove read_tsc() and read_tscp() " tip-bot for Andy Lutomirski
  2 siblings, 1 reply; 57+ messages in thread
From: Borislav Petkov @ 2015-06-19 15:32 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: x86, Peter Zijlstra, John Stultz, linux-kernel, Len Brown,
	Huang Rui, Denys Vlasenko, kvm, Ralf Baechle

On Tue, Jun 16, 2015 at 05:35:51PM -0700, Andy Lutomirski wrote:
> We've had read_tsc and read_tscp paravirt hooks since the very
> beginning of paravirt, i.e., d3561b7fa0fb ("[PATCH] paravirt: header
> and stubs for paravirtualisation").  AFAICT the only paravirt guest
> implementation that ever replaced these calls was vmware, and it's
> gone.  Arguably even vmware shouldn't have hooked rdtsc -- we fully
> support systems that don't have a TSC at all, so there's no point
> for a paravirt implementation to pretend that we have a TSC but to
> replace it.
> 
> I also doubt that these hooks actually worked.  Calls to rdtscl and
> rdtscll, which respected the hooks, were used seemingly
> interchangeably with native_read_tsc, which did not.
> 
> Just remove them.  If anyone ever needs them again, they can try
> to make a case for why they need them.
> 
> Before, on a paravirt config:
>    text	   data	    bss	    dec	    hex	filename
> 13426505	1827056	14508032	29761593	1c62039	vmlinux
> 
> After:
>    text	   data	    bss	    dec	    hex	filename
> 13426617	1827056	14508032	29761705	1c620a9	vmlinux

Those look swapped. I mean, you're removing a bunch of stuff and text
grew?!

> Signed-off-by: Andy Lutomirski <luto@kernel.org>
> ---
>  arch/x86/include/asm/msr.h            | 16 ++++++++--------
>  arch/x86/include/asm/paravirt.h       | 34 ----------------------------------
>  arch/x86/include/asm/paravirt_types.h |  2 --
>  arch/x86/kernel/paravirt.c            |  2 --
>  arch/x86/kernel/paravirt_patch_32.c   |  2 --
>  arch/x86/xen/enlighten.c              |  3 ---
>  6 files changed, 8 insertions(+), 51 deletions(-)

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 03/18] x86/tsc/paravirt: Remove the read_tsc and read_tscp paravirt hooks
  2015-06-19 15:32   ` Borislav Petkov
@ 2015-06-19 16:14     ` Andy Lutomirski
  2015-06-19 17:13       ` Borislav Petkov
  0 siblings, 1 reply; 57+ messages in thread
From: Andy Lutomirski @ 2015-06-19 16:14 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Andy Lutomirski, X86 ML, Peter Zijlstra, John Stultz,
	linux-kernel, Len Brown, Huang Rui, Denys Vlasenko, kvm list,
	Ralf Baechle

On Fri, Jun 19, 2015 at 8:32 AM, Borislav Petkov <bp@suse.de> wrote:
> On Tue, Jun 16, 2015 at 05:35:51PM -0700, Andy Lutomirski wrote:
>> We've had read_tsc and read_tscp paravirt hooks since the very
>> beginning of paravirt, i.e., d3561b7fa0fb ("[PATCH] paravirt: header
>> and stubs for paravirtualisation").  AFAICT the only paravirt guest
>> implementation that ever replaced these calls was vmware, and it's
>> gone.  Arguably even vmware shouldn't have hooked rdtsc -- we fully
>> support systems that don't have a TSC at all, so there's no point
>> for a paravirt implementation to pretend that we have a TSC but to
>> replace it.
>>
>> I also doubt that these hooks actually worked.  Calls to rdtscl and
>> rdtscll, which respected the hooks, were used seemingly
>> interchangeably with native_read_tsc, which did not.
>>
>> Just remove them.  If anyone ever needs them again, they can try
>> to make a case for why they need them.
>>
>> Before, on a paravirt config:
>>     text     data       bss       dec      hex filename
>> 13426505  1827056  14508032  29761593  1c62039 vmlinux
>>
>> After:
>>     text     data       bss       dec      hex filename
>> 13426617  1827056  14508032  29761705  1c620a9 vmlinux
>
> Those look swapped. I mean, you're removing a bunch of stuff and text
> grew?!

Trying again.  The config is different, so the numbers won't match.

Before:
    text     data      bss       dec     hex filename
12618257  1816384  1093632  15528273  ecf151 vmlinux

After:
    text     data      bss       dec     hex filename
12617207  1816384  1093632  15527223  eced37 vmlinux

So, yes, I have it backwards.  There's some penalty here because this
patch also causes rdtsc to be inlined on paravirt kernels (it was
patched as a call instead of an inline instruction sequence), so each
call site pays for the shift, the or, and the pointless zero
extension.
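
For reference, the per-call-site cost described here comes from gluing
RDTSC's two 32-bit halves (EDX:EAX) into a single 64-bit value.  A minimal
userspace sketch, assuming the hypothetical helper names tsc_combine(),
tsc_split() and tsc_low() (none of them are kernel APIs):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel API).  RDTSC returns the counter
 * split across EDX (high half) and EAX (low half); an inlined reader
 * zero-extends the low half, shifts the high half and ORs the two.
 */
static inline uint64_t tsc_combine(uint32_t lo, uint32_t hi)
{
	return ((uint64_t)hi << 32) | (uint64_t)lo;
}

/*
 * Hypothetical mirror of the split done by the old rdtscp(low, high, aux)
 * macro: break a 64-bit value into halves and check the round trip.
 */
static inline void tsc_split(uint64_t tsc, uint32_t *lo, uint32_t *hi)
{
	*lo = (uint32_t)tsc;
	*hi = (uint32_t)(tsc >> 32);
	assert(tsc_combine(*lo, *hi) == tsc);
}

/* Hypothetical equivalent of the old rdtscl(): keep only the low 32 bits. */
static inline uint32_t tsc_low(uint64_t tsc)
{
	return (uint32_t)tsc;
}
```

Note that a tsc_low()-style value wraps every 2^32 cycles, which is
roughly 1.4 seconds at 3 GHz, so it is only usable for short intervals.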

With the "unsigned long" patch, it's:

    text     data      bss       dec     hex filename
12617002  1816384  1093632  15527018  ecec6a vmlinux

Want to fix up the commit message?  It seems silly to re-send the
whole series for this.

--Andy

>
>> Signed-off-by: Andy Lutomirski <luto@kernel.org>
>> ---
>>  arch/x86/include/asm/msr.h            | 16 ++++++++--------
>>  arch/x86/include/asm/paravirt.h       | 34 ----------------------------------
>>  arch/x86/include/asm/paravirt_types.h |  2 --
>>  arch/x86/kernel/paravirt.c            |  2 --
>>  arch/x86/kernel/paravirt_patch_32.c   |  2 --
>>  arch/x86/xen/enlighten.c              |  3 ---
>>  6 files changed, 8 insertions(+), 51 deletions(-)
>
> --
> Regards/Gruss,
>     Boris.
>
> ECO tip #101: Trim your mails when you reply.
> --



-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 03/18] x86/tsc/paravirt: Remove the read_tsc and read_tscp paravirt hooks
  2015-06-19 16:14     ` Andy Lutomirski
@ 2015-06-19 17:13       ` Borislav Petkov
  0 siblings, 0 replies; 57+ messages in thread
From: Borislav Petkov @ 2015-06-19 17:13 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Andy Lutomirski, X86 ML, Peter Zijlstra, John Stultz,
	linux-kernel, Len Brown, Huang Rui, Denys Vlasenko, kvm list,
	Ralf Baechle

On Fri, Jun 19, 2015 at 09:14:14AM -0700, Andy Lutomirski wrote:
> Want to fix up the commit message?  It seems silly to re-send the
> whole series for this.

Of course, done.

Thanks.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 08/18] baycom_epp: Replace rdtscl() with native_read_tsc()
  2015-06-17  0:35 ` [PATCH v3 08/18] baycom_epp: Replace rdtscl() with native_read_tsc() Andy Lutomirski
  2015-06-17  0:49   ` Thomas Sailer
@ 2015-06-20 13:54   ` walter harms
  2015-06-20 14:14     ` Thomas Gleixner
  2015-07-06 15:42   ` [tip:x86/asm] x86/asm/tsc, drivers/net/hamradio/baycom_epp: " tip-bot for Andy Lutomirski
  2 siblings, 1 reply; 57+ messages in thread
From: walter harms @ 2015-06-20 13:54 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: x86, Borislav Petkov, Peter Zijlstra, John Stultz, linux-kernel,
	Len Brown, Huang Rui, Denys Vlasenko, kvm, Ralf Baechle,
	Thomas Sailer, linux-hams

Acked-by: walter harms <wharms@bfs.de>

Am 17.06.2015 02:35, schrieb Andy Lutomirski:
> This is only used if BAYCOM_DEBUG is defined.
> 
> Cc: walter harms <wharms@bfs.de>
> Cc: Ralf Baechle <ralf@linux-mips.org>
> Cc: Thomas Sailer <t.sailer@alumni.ethz.ch>
> Cc: linux-hams@vger.kernel.org
> Signed-off-by: Andy Lutomirski <luto@kernel.org>
> ---
> 
> I'm hoping for an ack for this to go through -tip.
> 
>  drivers/net/hamradio/baycom_epp.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/hamradio/baycom_epp.c b/drivers/net/hamradio/baycom_epp.c
> index 83c7cce0d172..44e5c3b5e0af 100644
> --- a/drivers/net/hamradio/baycom_epp.c
> +++ b/drivers/net/hamradio/baycom_epp.c
> @@ -638,7 +638,7 @@ static int receive(struct net_device *dev, int cnt)
>  #define GETTICK(x)                                                \
>  ({                                                                \
>  	if (cpu_has_tsc)                                          \
> -		rdtscl(x);                                        \
> +		x = (unsigned int)native_read_tsc();		  \
>  })
>  #else /* __i386__ */
>  #define GETTICK(x)

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 08/18] baycom_epp: Replace rdtscl() with native_read_tsc()
  2015-06-20 13:54   ` walter harms
@ 2015-06-20 14:14     ` Thomas Gleixner
  2015-06-20 14:26       ` Andy Lutomirski
  0 siblings, 1 reply; 57+ messages in thread
From: Thomas Gleixner @ 2015-06-20 14:14 UTC (permalink / raw)
  To: walter harms
  Cc: Andy Lutomirski, x86, Borislav Petkov, Peter Zijlstra,
	John Stultz, linux-kernel, Len Brown, Huang Rui, Denys Vlasenko,
	kvm, Ralf Baechle, Thomas Sailer, linux-hams

On Sat, 20 Jun 2015, walter harms wrote:

> Acked-by: walter harms <wharms@bfs.de>
> 
> Am 17.06.2015 02:35, schrieb Andy Lutomirski:
> > This is only used if BAYCOM_DEBUG is defined.

So why don't we just replace that by ktime_get() and get rid of the
x86'ism in that driver.
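
What an architecture-neutral tick source looks like can be sketched in
userspace.  ktime_get() itself is kernel-only, so this sketch assumes
POSIX clock_gettime(CLOCK_MONOTONIC) as a stand-in, and get_tick_ns() is
a hypothetical name, not code from the driver:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

/*
 * Hypothetical userspace stand-in for a ktime_get()-based GETTICK():
 * returns monotonic time in nanoseconds on any architecture, with no
 * dependence on the x86 TSC.
 */
static uint64_t get_tick_ns(void)
{
	struct timespec ts;
	int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

	assert(rc == 0);	/* CLOCK_MONOTONIC is assumed to be available */
	(void)rc;		/* keeps NDEBUG builds free of unused warnings */
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}
```

Two successive get_tick_ns() calls never go backwards, which is the
property a debug timestamp like GETTICK() needs.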
 
Thanks,

	tglx

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 08/18] baycom_epp: Replace rdtscl() with native_read_tsc()
  2015-06-20 14:14     ` Thomas Gleixner
@ 2015-06-20 14:26       ` Andy Lutomirski
  2015-06-20 16:30         ` Thomas Gleixner
  0 siblings, 1 reply; 57+ messages in thread
From: Andy Lutomirski @ 2015-06-20 14:26 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: walter harms, Andy Lutomirski, X86 ML, Borislav Petkov,
	Peter Zijlstra, John Stultz, linux-kernel, Len Brown, Huang Rui,
	Denys Vlasenko, kvm list, Ralf Baechle, Thomas Sailer,
	linux-hams

On Sat, Jun 20, 2015 at 7:14 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Sat, 20 Jun 2015, walter harms wrote:
>
>> Acked-by: walter harms <wharms@bfs.de>
>>
>> Am 17.06.2015 02:35, schrieb Andy Lutomirski:
>> > This is only used if BAYCOM_DEBUG is defined.
>
> So why don't we just replace that by ktime_get() and get rid of the
> x86'ism in that driver.
>

I don't have the hardware, and I don't see any good reason to make an
rdtsc cleanup depend on a more complicated driver change.  On the
other hand, if the maintainers want to clean it up, I think it would
be a great idea.

This really seems to be debugging code, though.  A normal kernel won't
even compile it.

--Andy

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 08/18] baycom_epp: Replace rdtscl() with native_read_tsc()
  2015-06-20 14:26       ` Andy Lutomirski
@ 2015-06-20 16:30         ` Thomas Gleixner
  0 siblings, 0 replies; 57+ messages in thread
From: Thomas Gleixner @ 2015-06-20 16:30 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: walter harms, Andy Lutomirski, X86 ML, Borislav Petkov,
	Peter Zijlstra, John Stultz, linux-kernel, Len Brown, Huang Rui,
	Denys Vlasenko, kvm list, Ralf Baechle, Thomas Sailer,
	linux-hams

On Sat, 20 Jun 2015, Andy Lutomirski wrote:
> On Sat, Jun 20, 2015 at 7:14 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> > On Sat, 20 Jun 2015, walter harms wrote:
> >
> >> Acked-by: walter harms <wharms@bfs.de>
> >>
> >> Am 17.06.2015 02:35, schrieb Andy Lutomirski:
> >> > This is only used if BAYCOM_DEBUG is defined.
> >
> > So why don't we just replace that by ktime_get() and get rid of the
> > x86'ism in that driver.
> >
> 
> I don't have the hardware, and I don't see any good reason to make an
> rdtsc cleanup depend on a more complicated driver change.  On the
> other hand, if the maintainers want to clean it up, I think it would
> be a great idea.
> 
> This really seems to be debugging code, though.  A normal kernel won't
> even compile it.

Right, but there is no reason that we have rdtsc outside of arch/x86
at all.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 17/18] x86/kvm/tsc: Drop extra barrier and use rdtsc_ordered in kvmclock
  2015-06-17  7:47   ` Paolo Bonzini
  2015-06-17 13:31     ` Paolo Bonzini
@ 2015-06-20 21:50     ` Borislav Petkov
  1 sibling, 0 replies; 57+ messages in thread
From: Borislav Petkov @ 2015-06-20 21:50 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Andy Lutomirski, x86, Peter Zijlstra, John Stultz, linux-kernel,
	Len Brown, Huang Rui, Denys Vlasenko, kvm, Ralf Baechle,
	Radim Krcmar, Marcelo Tosatti

On Wed, Jun 17, 2015 at 09:47:08AM +0200, Paolo Bonzini wrote:
> On 17/06/2015 02:36, Andy Lutomirski wrote:
> > __pvclock_read_cycles had an unnecessary barrier.  Get rid of that
> > barrier and clean up the code by just using rdtsc_ordered().
> > 
> > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > Cc: Radim Krcmar <rkrcmar@redhat.com>
> > Cc: Marcelo Tosatti <mtosatti@redhat.com>
> > Cc: kvm@vger.kernel.org
> > Signed-off-by: Andy Lutomirski <luto@kernel.org>
> > ---
> > 
> > I'm hoping to get an ack for this to go in through -tip.  (Arguably
> > I'm the maintainer of this code given how it's used, but I should
> > still ask for an ack.)
> > 
> > arch/x86/include/asm/pvclock.h | 21 ++++++++++++---------
> >  1 file changed, 12 insertions(+), 9 deletions(-)
> 
> Can you send a URL to the rest of the series?  I've never even seen v1
> or v2 so I have no idea of what this is about.
> 
> > diff --git a/arch/x86/include/asm/pvclock.h b/arch/x86/include/asm/pvclock.h
> > index 6084bce345fc..cf2329ca4812 100644
> > --- a/arch/x86/include/asm/pvclock.h
> > +++ b/arch/x86/include/asm/pvclock.h
> > @@ -62,7 +62,18 @@ static inline u64 pvclock_scale_delta(u64 delta, u32 mul_frac, int shift)
> >  static __always_inline
> >  u64 pvclock_get_nsec_offset(const struct pvclock_vcpu_time_info *src)
> >  {
> > -	u64 delta = rdtsc() - src->tsc_timestamp;
> > +	/*
> > +	 * Note: emulated platforms which do not advertise SSE2 support
> > +	 * break rdtsc_ordered, resulting in kvmclock not using the
> > +	 * necessary RDTSC barriers.  Without barriers, it is possible
> > +	 * that RDTSC instruction is executed before prior loads,
> > +	 * resulting in violation of monotonicity.
> > +	 *
> > +	 * On an SMP guest without SSE2, it's unclear how anything is
> > +	 * supposed to work correctly, though -- memory fences
> > +	 * (e.g. smp_mb) are important for more than just timing.
> > +	 */
> 
> On an SMP guest without SSE2, memory fences are obtained with e.g. "lock
> addb $0, (%esp)".

Yeah, I killed that comment when applying.
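
For readers following the barrier discussion, the difference between a
raw and an ordered TSC read can be sketched in userspace C.  This is an
illustration only: tsc_read_unordered() and tsc_read_ordered() are
hypothetical names, the x86-64 path assumes the compiler intrinsics
__rdtsc() and _mm_lfence(), and which fence is actually sufficient
depends on the CPU vendor.  The non-x86 branch exists only so that the
sketch builds everywhere.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__x86_64__)
#include <emmintrin.h>	/* _mm_lfence() */
#include <x86intrin.h>	/* __rdtsc() */

/* Raw read: the CPU may execute RDTSC ahead of earlier loads. */
static inline uint64_t tsc_read_unordered(void)
{
	return __rdtsc();
}

/* Ordered read: fence first so that earlier loads are done before RDTSC. */
static inline uint64_t tsc_read_ordered(void)
{
	_mm_lfence();
	return __rdtsc();
}
#else
#include <stdatomic.h>
#include <time.h>

/* Non-x86 fallback so the sketch still builds: a wall-clock reading. */
static inline uint64_t tsc_read_unordered(void)
{
	struct timespec ts;
	int base = timespec_get(&ts, TIME_UTC);

	assert(base == TIME_UTC);
	(void)base;	/* keeps NDEBUG builds free of unused warnings */
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Portable full fence, then the same reading. */
static inline uint64_t tsc_read_ordered(void)
{
	atomic_thread_fence(memory_order_seq_cst);
	return tsc_read_unordered();
}
#endif
```

The ordered variant is the one to reach for when the timestamp is
compared against data loaded just before it, as in the pvclock code
quoted above.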

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v3 13/18] x86/tsc: Rename native_read_tsc() to rdtsc()
  2015-06-17  0:36 ` [PATCH v3 13/18] x86/tsc: Rename native_read_tsc() to rdtsc() Andy Lutomirski
@ 2015-06-24 21:38   ` Borislav Petkov
  2015-07-06 15:43   ` [tip:x86/asm] x86/asm/tsc: " tip-bot for Andy Lutomirski
  1 sibling, 0 replies; 57+ messages in thread
From: Borislav Petkov @ 2015-06-24 21:38 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: x86, Peter Zijlstra, John Stultz, linux-kernel, Len Brown,
	Huang Rui, Denys Vlasenko, kvm, Ralf Baechle

On Tue, Jun 16, 2015 at 05:36:01PM -0700, Andy Lutomirski wrote:
> Now that there is no paravirt TSC, the "native" is inappropriate.
> The function does RDTSC, so give it the obvious name: rdtsc()
> 
> Suggested-by: Borislav Petkov <bp@suse.de>
> Signed-off-by: Andy Lutomirski <luto@kernel.org>
> ---
>  arch/x86/boot/compressed/aslr.c                      |  2 +-
>  arch/x86/entry/vdso/vclock_gettime.c                 |  2 +-
>  arch/x86/include/asm/msr.h                           | 11 ++++++++++-
>  arch/x86/include/asm/pvclock.h                       |  2 +-
>  arch/x86/include/asm/stackprotector.h                |  2 +-
>  arch/x86/include/asm/tsc.h                           |  2 +-
>  arch/x86/kernel/apb_timer.c                          |  8 ++++----
>  arch/x86/kernel/apic/apic.c                          |  8 ++++----
>  arch/x86/kernel/cpu/amd.c                            |  4 ++--
>  arch/x86/kernel/cpu/mcheck/mce.c                     |  4 ++--
>  arch/x86/kernel/espfix_64.c                          |  2 +-
>  arch/x86/kernel/hpet.c                               |  4 ++--
>  arch/x86/kernel/trace_clock.c                        |  2 +-
>  arch/x86/kernel/tsc.c                                |  4 ++--
>  arch/x86/kvm/lapic.c                                 |  4 ++--
>  arch/x86/kvm/svm.c                                   |  4 ++--
>  arch/x86/kvm/vmx.c                                   |  4 ++--
>  arch/x86/kvm/x86.c                                   | 12 ++++++------
>  arch/x86/lib/delay.c                                 |  8 ++++----
>  drivers/input/gameport/gameport.c                    |  4 ++--
>  drivers/input/joystick/analog.c                      |  4 ++--
>  drivers/net/hamradio/baycom_epp.c                    |  2 +-
>  drivers/thermal/intel_powerclamp.c                   |  4 ++--
>  tools/power/cpupower/debug/kernel/cpufreq-test_tsc.c |  4 ++--
>  24 files changed, 58 insertions(+), 49 deletions(-)

Btw, just added the hunk below to that one because

4055fad34086 ("intel_pstate: Add tsc collection and keep previous target pstate")

which came in through the pm+acpi pull request added another
native_read_tsc() call:

---
diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index 15ada47bb720..7c56d7eaa671 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -765,7 +765,7 @@ static inline void intel_pstate_sample(struct cpudata *cpu)
 	local_irq_save(flags);
 	rdmsrl(MSR_IA32_APERF, aperf);
 	rdmsrl(MSR_IA32_MPERF, mperf);
-	tsc = native_read_tsc();
+	tsc = rdtsc();
 	local_irq_restore(flags);
 
 	cpu->last_sample_time = cpu->sample.time;
---

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [tip:x86/asm] x86/asm/tsc: Inline native_read_tsc() and remove __native_read_tsc()
  2015-06-17  0:35 ` [PATCH v3 01/18] x86/tsc: Inline native_read_tsc and remove __native_read_tsc Andy Lutomirski
  2015-06-17  9:26   ` Borislav Petkov
@ 2015-07-06 15:39   ` tip-bot for Andy Lutomirski
  1 sibling, 0 replies; 57+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-07-06 15:39 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, luto, luto, ray.huang, bp, john.stultz, brgerst, peterz,
	ralf, hpa, bp, kvm, lenb, mingo, torvalds, linux-kernel,
	dvlasenk

Commit-ID:  c6e5ca35c4685cd920b1d5279dbc9f4483d7dfd4
Gitweb:     http://git.kernel.org/tip/c6e5ca35c4685cd920b1d5279dbc9f4483d7dfd4
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Thu, 25 Jun 2015 18:43:55 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 6 Jul 2015 15:23:25 +0200

x86/asm/tsc: Inline native_read_tsc() and remove __native_read_tsc()

In the following commit:

  cdc7957d1954 ("x86: move native_read_tsc() offline")

... native_read_tsc() was moved out of line, presumably for some
now-obsolete vDSO-related reason. Undo it.

The entire "rdtsc; shl; or" instruction sequence is only 11 bytes, and
calls via rdtscl() and similar helpers were already inlined.
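
For readers who want to see the sequence in question, here is a rough
user-space sketch (not the kernel's code) of what an inlined TSC read
boils down to: RDTSC leaves the low half in EAX and the high half in
EDX, and a shift plus an OR joins them. The names tsc_combine() and
read_tsc_sketch() are made up for illustration, and the plain register
constraints stand in for the kernel's DECLARE_ARGS() helper.

```c
#include <stdint.h>

/* Join the two 32-bit halves that RDTSC leaves in EDX:EAX. */
static inline uint64_t tsc_combine(uint32_t low, uint32_t high)
{
	return ((uint64_t)high << 32) | low;
}

#if defined(__x86_64__) || defined(__i386__)
/*
 * Hypothetical user-space equivalent of an inlined TSC read.
 * No ordering barrier is issued here, so the CPU may execute
 * the read speculatively relative to surrounding loads.
 */
static inline uint64_t read_tsc_sketch(void)
{
	uint32_t low, high;

	__asm__ volatile("rdtsc" : "=a"(low), "=d"(high));
	return tsc_combine(low, high);
}
#endif
```

Since the whole body is a handful of instructions, a function call
around it would likely cost about as much as the work itself, which
is presumably why inlining it is attractive.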

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm ML <kvm@vger.kernel.org>
Link: http://lkml.kernel.org/r/d05ffe2aaf8468ca475ebc00efad7b2fa174af19.1434501121.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/entry/vdso/vclock_gettime.c  | 2 +-
 arch/x86/include/asm/msr.h            | 8 +++-----
 arch/x86/include/asm/pvclock.h        | 2 +-
 arch/x86/include/asm/stackprotector.h | 2 +-
 arch/x86/include/asm/tsc.h            | 2 +-
 arch/x86/kernel/apb_timer.c           | 4 ++--
 arch/x86/kernel/tsc.c                 | 6 ------
 7 files changed, 9 insertions(+), 17 deletions(-)

diff --git a/arch/x86/entry/vdso/vclock_gettime.c b/arch/x86/entry/vdso/vclock_gettime.c
index 9793322..972b488 100644
--- a/arch/x86/entry/vdso/vclock_gettime.c
+++ b/arch/x86/entry/vdso/vclock_gettime.c
@@ -186,7 +186,7 @@ notrace static cycle_t vread_tsc(void)
 	 * but no one has ever seen it happen.
 	 */
 	rdtsc_barrier();
-	ret = (cycle_t)__native_read_tsc();
+	ret = (cycle_t)native_read_tsc();
 
 	last = gtod->cycle_last;
 
diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
index e6a707e..8871147 100644
--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -106,12 +106,10 @@ notrace static inline int native_write_msr_safe(unsigned int msr,
 	return err;
 }
 
-extern unsigned long long native_read_tsc(void);
-
 extern int rdmsr_safe_regs(u32 regs[8]);
 extern int wrmsr_safe_regs(u32 regs[8]);
 
-static __always_inline unsigned long long __native_read_tsc(void)
+static __always_inline unsigned long long native_read_tsc(void)
 {
 	DECLARE_ARGS(val, low, high);
 
@@ -181,10 +179,10 @@ static inline int rdmsrl_safe(unsigned msr, unsigned long long *p)
 }
 
 #define rdtscl(low)						\
-	((low) = (u32)__native_read_tsc())
+	((low) = (u32)native_read_tsc())
 
 #define rdtscll(val)						\
-	((val) = __native_read_tsc())
+	((val) = native_read_tsc())
 
 #define rdpmc(counter, low, high)			\
 do {							\
diff --git a/arch/x86/include/asm/pvclock.h b/arch/x86/include/asm/pvclock.h
index 628954c..2bd69d6 100644
--- a/arch/x86/include/asm/pvclock.h
+++ b/arch/x86/include/asm/pvclock.h
@@ -62,7 +62,7 @@ static inline u64 pvclock_scale_delta(u64 delta, u32 mul_frac, int shift)
 static __always_inline
 u64 pvclock_get_nsec_offset(const struct pvclock_vcpu_time_info *src)
 {
-	u64 delta = __native_read_tsc() - src->tsc_timestamp;
+	u64 delta = native_read_tsc() - src->tsc_timestamp;
 	return pvclock_scale_delta(delta, src->tsc_to_system_mul,
 				   src->tsc_shift);
 }
diff --git a/arch/x86/include/asm/stackprotector.h b/arch/x86/include/asm/stackprotector.h
index c2e00bb..bc5fa2a 100644
--- a/arch/x86/include/asm/stackprotector.h
+++ b/arch/x86/include/asm/stackprotector.h
@@ -72,7 +72,7 @@ static __always_inline void boot_init_stack_canary(void)
 	 * on during the bootup the random pool has true entropy too.
 	 */
 	get_random_bytes(&canary, sizeof(canary));
-	tsc = __native_read_tsc();
+	tsc = native_read_tsc();
 	canary += tsc + (tsc << 32UL);
 
 	current->stack_canary = canary;
diff --git a/arch/x86/include/asm/tsc.h b/arch/x86/include/asm/tsc.h
index 94605c0..fd11128 100644
--- a/arch/x86/include/asm/tsc.h
+++ b/arch/x86/include/asm/tsc.h
@@ -42,7 +42,7 @@ static __always_inline cycles_t vget_cycles(void)
 	if (!cpu_has_tsc)
 		return 0;
 #endif
-	return (cycles_t)__native_read_tsc();
+	return (cycles_t)native_read_tsc();
 }
 
 extern void tsc_init(void);
diff --git a/arch/x86/kernel/apb_timer.c b/arch/x86/kernel/apb_timer.c
index ede92c3..9fe111c 100644
--- a/arch/x86/kernel/apb_timer.c
+++ b/arch/x86/kernel/apb_timer.c
@@ -390,13 +390,13 @@ unsigned long apbt_quick_calibrate(void)
 	old = dw_apb_clocksource_read(clocksource_apbt);
 	old += loop;
 
-	t1 = __native_read_tsc();
+	t1 = native_read_tsc();
 
 	do {
 		new = dw_apb_clocksource_read(clocksource_apbt);
 	} while (new < old);
 
-	t2 = __native_read_tsc();
+	t2 = native_read_tsc();
 
 	shift = 5;
 	if (unlikely(loop >> shift == 0)) {
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 5054497..e7710cd 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -308,12 +308,6 @@ unsigned long long
 sched_clock(void) __attribute__((alias("native_sched_clock")));
 #endif
 
-unsigned long long native_read_tsc(void)
-{
-	return __native_read_tsc();
-}
-EXPORT_SYMBOL(native_read_tsc);
-
 int check_tsc_unstable(void)
 {
 	return tsc_unstable;

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [tip:x86/asm] x86/asm/tsc, kvm: Remove vget_cycles()
  2015-06-17  0:35 ` [PATCH v3 02/18] x86/msr/kvm: Remove vget_cycles() Andy Lutomirski
  2015-06-17  9:42   ` Borislav Petkov
  2015-06-17 13:34   ` Paolo Bonzini
@ 2015-07-06 15:40   ` tip-bot for Andy Lutomirski
  2 siblings, 0 replies; 57+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-07-06 15:40 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: luto, ralf, tglx, linux-kernel, john.stultz, lenb, bp, kvm,
	dvlasenk, torvalds, brgerst, peterz, hpa, ray.huang, mingo, bp,
	luto, pbonzini

Commit-ID:  881d7bf843d7139c6dfbffdec4903b3354423c49
Gitweb:     http://git.kernel.org/tip/881d7bf843d7139c6dfbffdec4903b3354423c49
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Thu, 25 Jun 2015 18:43:56 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 6 Jul 2015 15:23:25 +0200

x86/asm/tsc, kvm: Remove vget_cycles()

The only caller was KVM's read_tsc(). The only difference
between vget_cycles() and native_read_tsc() was that
vget_cycles() returned zero instead of crashing on TSC-less
systems. KVM already checks vclock_mode() before calling that
function, so the extra check is unnecessary. Also, KVM
(host-side) requires the TSC to exist.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm ML <kvm@vger.kernel.org>
Link: http://lkml.kernel.org/r/20615df14ae2eb713ea7a5f5123c1dc4c7ca993d.1434501121.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/tsc.h | 13 -------------
 arch/x86/kvm/x86.c         |  2 +-
 2 files changed, 1 insertion(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/tsc.h b/arch/x86/include/asm/tsc.h
index fd11128..3da1cc1 100644
--- a/arch/x86/include/asm/tsc.h
+++ b/arch/x86/include/asm/tsc.h
@@ -32,19 +32,6 @@ static inline cycles_t get_cycles(void)
 	return ret;
 }
 
-static __always_inline cycles_t vget_cycles(void)
-{
-	/*
-	 * We only do VDSOs on TSC capable CPUs, so this shouldn't
-	 * access boot_cpu_data (which is not VDSO-safe):
-	 */
-#ifndef CONFIG_X86_TSC
-	if (!cpu_has_tsc)
-		return 0;
-#endif
-	return (cycles_t)native_read_tsc();
-}
-
 extern void tsc_init(void);
 extern void mark_tsc_unstable(char *reason);
 extern int unsynchronized_tsc(void);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index bbaf44e..f771058 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1455,7 +1455,7 @@ static cycle_t read_tsc(void)
 	 * but no one has ever seen it happen.
 	 */
 	rdtsc_barrier();
-	ret = (cycle_t)vget_cycles();
+	ret = (cycle_t)native_read_tsc();
 
 	last = pvclock_gtod_data.clock.cycle_last;
 

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [tip:x86/asm] x86/asm/tsc, x86/paravirt: Remove read_tsc() and read_tscp() paravirt hooks
  2015-06-17  0:35 ` [PATCH v3 03/18] x86/tsc/paravirt: Remove the read_tsc and read_tscp paravirt hooks Andy Lutomirski
  2015-06-17  9:56   ` Borislav Petkov
  2015-06-19 15:32   ` Borislav Petkov
@ 2015-07-06 15:40   ` tip-bot for Andy Lutomirski
  2 siblings, 0 replies; 57+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-07-06 15:40 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mingo, luto, bp, kvm, torvalds, linux-kernel, bp, lenb, tglx,
	ralf, john.stultz, brgerst, luto, dvlasenk, peterz, ray.huang,
	hpa

Commit-ID:  9261e050b686c9fe229cd9918d997b3caaf20e34
Gitweb:     http://git.kernel.org/tip/9261e050b686c9fe229cd9918d997b3caaf20e34
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Thu, 25 Jun 2015 18:43:57 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 6 Jul 2015 15:23:26 +0200

x86/asm/tsc, x86/paravirt: Remove read_tsc() and read_tscp() paravirt hooks

We've had ->read_tsc() and ->read_tscp() paravirt hooks since
the very beginning of paravirt, i.e.,

  d3561b7fa0fb ("[PATCH] paravirt: header and stubs for paravirtualisation").

AFAICT, the only paravirt guest implementation that ever
replaced these calls was vmware, and it's gone. Arguably even
vmware shouldn't have hooked RDTSC -- we fully support systems
that don't have a TSC at all, so there's no point in a paravirt
implementation pretending that we have a TSC and then replacing it.

I also doubt that these hooks actually worked. Calls to rdtscl()
and rdtscll(), which respected the hooks, were used seemingly
interchangeably with native_read_tsc(), which did not.

Just remove them. If anyone ever needs them again, they can try
to make a case for why they need them.
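
To make the "did these hooks actually work" doubt concrete, here is a
toy model, assuming nothing about the real paravirt machinery beyond
"a function pointer that some callers use and others skip". Every name
below (toy_pv_ops, toy_hooked_read() and so on) is hypothetical, and
the fixed return values merely stand in for real counter reads.

```c
#include <stdint.h>

/* Hypothetical stand-in for executing RDTSC; returns a fixed value. */
static uint64_t toy_native_read_tsc(void)
{
	return 1000;
}

/* Hypothetical replacement that a guest implementation might install. */
static uint64_t toy_guest_read_tsc(void)
{
	return 42;
}

/* Toy ops table, loosely modelled on the removed ->read_tsc() slot. */
struct toy_pv_ops {
	uint64_t (*read_tsc)(void);
};

static struct toy_pv_ops toy_ops = { .read_tsc = toy_native_read_tsc };

/* Caller in the style of rdtscll(): goes through the hook. */
static uint64_t toy_hooked_read(void)
{
	return toy_ops.read_tsc();
}

/* Caller in the style of a direct native_read_tsc() call: skips the hook. */
static uint64_t toy_direct_read(void)
{
	return toy_native_read_tsc();
}
```

Once toy_ops.read_tsc is pointed at toy_guest_read_tsc(), the two
callers disagree (42 versus 1000), so code mixing both styles would
see two different clocks.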

Before, on a paravirt config:
      text     data      bss       dec     hex  filename
  12618257  1816384  1093632  15528273  ecf151  vmlinux

After:
      text     data      bss       dec     hex  filename
  12617207  1816384  1093632  15527223  eced37  vmlinux

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: virtualization@lists.linux-foundation.org
Link: http://lkml.kernel.org/r/d08a2600fb298af163681e5efd8e599d889a5b97.1434501121.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/msr.h            | 16 ++++++++--------
 arch/x86/include/asm/paravirt.h       | 34 ----------------------------------
 arch/x86/include/asm/paravirt_types.h |  2 --
 arch/x86/kernel/paravirt.c            |  2 --
 arch/x86/kernel/paravirt_patch_32.c   |  2 --
 arch/x86/xen/enlighten.c              |  3 ---
 6 files changed, 8 insertions(+), 51 deletions(-)

diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
index 8871147..d1afac7 100644
--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -178,12 +178,6 @@ static inline int rdmsrl_safe(unsigned msr, unsigned long long *p)
 	return err;
 }
 
-#define rdtscl(low)						\
-	((low) = (u32)native_read_tsc())
-
-#define rdtscll(val)						\
-	((val) = native_read_tsc())
-
 #define rdpmc(counter, low, high)			\
 do {							\
 	u64 _l = native_read_pmc((counter));		\
@@ -193,6 +187,14 @@ do {							\
 
 #define rdpmcl(counter, val) ((val) = native_read_pmc(counter))
 
+#endif	/* !CONFIG_PARAVIRT */
+
+#define rdtscl(low)						\
+	((low) = (u32)native_read_tsc())
+
+#define rdtscll(val)						\
+	((val) = native_read_tsc())
+
 #define rdtscp(low, high, aux)					\
 do {                                                            \
 	unsigned long long _val = native_read_tscp(&(aux));     \
@@ -202,8 +204,6 @@ do {                                                            \
 
 #define rdtscpll(val, aux) (val) = native_read_tscp(&(aux))
 
-#endif	/* !CONFIG_PARAVIRT */
-
 /*
  * 64-bit version of wrmsr_safe():
  */
diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index d143bfa..c2be037 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -174,19 +174,6 @@ static inline int rdmsrl_safe(unsigned msr, unsigned long long *p)
 	return err;
 }
 
-static inline u64 paravirt_read_tsc(void)
-{
-	return PVOP_CALL0(u64, pv_cpu_ops.read_tsc);
-}
-
-#define rdtscl(low)				\
-do {						\
-	u64 _l = paravirt_read_tsc();		\
-	low = (int)_l;				\
-} while (0)
-
-#define rdtscll(val) (val = paravirt_read_tsc())
-
 static inline unsigned long long paravirt_sched_clock(void)
 {
 	return PVOP_CALL0(unsigned long long, pv_time_ops.sched_clock);
@@ -215,27 +202,6 @@ do {						\
 
 #define rdpmcl(counter, val) ((val) = paravirt_read_pmc(counter))
 
-static inline unsigned long long paravirt_rdtscp(unsigned int *aux)
-{
-	return PVOP_CALL1(u64, pv_cpu_ops.read_tscp, aux);
-}
-
-#define rdtscp(low, high, aux)				\
-do {							\
-	int __aux;					\
-	unsigned long __val = paravirt_rdtscp(&__aux);	\
-	(low) = (u32)__val;				\
-	(high) = (u32)(__val >> 32);			\
-	(aux) = __aux;					\
-} while (0)
-
-#define rdtscpll(val, aux)				\
-do {							\
-	unsigned long __aux; 				\
-	val = paravirt_rdtscp(&__aux);			\
-	(aux) = __aux;					\
-} while (0)
-
 static inline void paravirt_alloc_ldt(struct desc_struct *ldt, unsigned entries)
 {
 	PVOP_VCALL2(pv_cpu_ops.alloc_ldt, ldt, entries);
diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index a6b8f9f..ce029e4 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -156,9 +156,7 @@ struct pv_cpu_ops {
 	u64 (*read_msr)(unsigned int msr, int *err);
 	int (*write_msr)(unsigned int msr, unsigned low, unsigned high);
 
-	u64 (*read_tsc)(void);
 	u64 (*read_pmc)(int counter);
-	unsigned long long (*read_tscp)(unsigned int *aux);
 
 #ifdef CONFIG_X86_32
 	/*
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index 58bcfb6..f68e48f 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -351,9 +351,7 @@ __visible struct pv_cpu_ops pv_cpu_ops = {
 	.wbinvd = native_wbinvd,
 	.read_msr = native_read_msr_safe,
 	.write_msr = native_write_msr_safe,
-	.read_tsc = native_read_tsc,
 	.read_pmc = native_read_pmc,
-	.read_tscp = native_read_tscp,
 	.load_tr_desc = native_load_tr_desc,
 	.set_ldt = native_set_ldt,
 	.load_gdt = native_load_gdt,
diff --git a/arch/x86/kernel/paravirt_patch_32.c b/arch/x86/kernel/paravirt_patch_32.c
index e1b0136..c89f50a 100644
--- a/arch/x86/kernel/paravirt_patch_32.c
+++ b/arch/x86/kernel/paravirt_patch_32.c
@@ -10,7 +10,6 @@ DEF_NATIVE(pv_mmu_ops, read_cr2, "mov %cr2, %eax");
 DEF_NATIVE(pv_mmu_ops, write_cr3, "mov %eax, %cr3");
 DEF_NATIVE(pv_mmu_ops, read_cr3, "mov %cr3, %eax");
 DEF_NATIVE(pv_cpu_ops, clts, "clts");
-DEF_NATIVE(pv_cpu_ops, read_tsc, "rdtsc");
 
 #if defined(CONFIG_PARAVIRT_SPINLOCKS) && defined(CONFIG_QUEUED_SPINLOCKS)
 DEF_NATIVE(pv_lock_ops, queued_spin_unlock, "movb $0, (%eax)");
@@ -52,7 +51,6 @@ unsigned native_patch(u8 type, u16 clobbers, void *ibuf,
 		PATCH_SITE(pv_mmu_ops, read_cr3);
 		PATCH_SITE(pv_mmu_ops, write_cr3);
 		PATCH_SITE(pv_cpu_ops, clts);
-		PATCH_SITE(pv_cpu_ops, read_tsc);
 #if defined(CONFIG_PARAVIRT_SPINLOCKS) && defined(CONFIG_QUEUED_SPINLOCKS)
 		case PARAVIRT_PATCH(pv_lock_ops.queued_spin_unlock):
 			if (pv_is_native_spin_unlock()) {
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 0b95c9b..32136bf 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -1175,11 +1175,8 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = {
 	.read_msr = xen_read_msr_safe,
 	.write_msr = xen_write_msr_safe,
 
-	.read_tsc = native_read_tsc,
 	.read_pmc = native_read_pmc,
 
-	.read_tscp = native_read_tscp,
-
 	.iret = xen_iret,
 #ifdef CONFIG_X86_64
 	.usergs_sysret32 = xen_sysret32,

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [tip:x86/asm] x86/asm/tsc: Replace rdtscll() with native_read_tsc()
  2015-06-17  0:35 ` [PATCH v3 04/18] x86/tsc: Replace rdtscll with native_read_tsc Andy Lutomirski
  2015-06-17 10:03   ` Borislav Petkov
@ 2015-07-06 15:40   ` tip-bot for Andy Lutomirski
  1 sibling, 0 replies; 57+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-07-06 15:40 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: dvlasenk, lenb, torvalds, luto, ray.huang, hpa, tglx, brgerst,
	peterz, kvm, linux-kernel, mingo, bp, luto, bp, ralf,
	john.stultz

Commit-ID:  87be28aaf1458445d5f648688c2eec0f13b8f3b9
Gitweb:     http://git.kernel.org/tip/87be28aaf1458445d5f648688c2eec0f13b8f3b9
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Thu, 25 Jun 2015 18:43:58 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 6 Jul 2015 15:23:26 +0200

x86/asm/tsc: Replace rdtscll() with native_read_tsc()

Now that the ->read_tsc() paravirt hook is gone, rdtscll() is
just a wrapper around native_read_tsc(). Unwrap it.
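
As an illustration of the conversion pattern only, the sketch below
contrasts the macro style, which assigns to its argument as a side
effect, with the plain function-call style used after the change.
toy_read_tsc() is a hypothetical stand-in for native_read_tsc() that
just bumps a counter, so the example stays deterministic.

```c
#include <stdint.h>

static uint64_t toy_counter;

/* Hypothetical stand-in for native_read_tsc(): advances by one per read. */
static uint64_t toy_read_tsc(void)
{
	return ++toy_counter;
}

/* Old style: the macro writes into its argument. */
#define toy_rdtscll(val) ((val) = toy_read_tsc())

static uint64_t old_style(void)
{
	uint64_t t;

	toy_rdtscll(t);		/* assignment hidden inside the macro */
	return t;
}

/* New style: an ordinary function call with a visible assignment. */
static uint64_t new_style(void)
{
	uint64_t t = toy_read_tsc();

	return t;
}
```

Both forms perform the same single read; the second simply makes the
data flow visible at the call site and works in expressions such as a
return statement.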

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm ML <kvm@vger.kernel.org>
Link: http://lkml.kernel.org/r/d2449ae62c1b1fb90195bcfb19ef4a35883a04dc.1434501121.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/boot/compressed/aslr.c                      | 2 +-
 arch/x86/include/asm/msr.h                           | 3 ---
 arch/x86/include/asm/tsc.h                           | 5 +----
 arch/x86/kernel/apb_timer.c                          | 4 ++--
 arch/x86/kernel/apic/apic.c                          | 8 ++++----
 arch/x86/kernel/cpu/mcheck/mce.c                     | 4 ++--
 arch/x86/kernel/espfix_64.c                          | 2 +-
 arch/x86/kernel/hpet.c                               | 4 ++--
 arch/x86/kernel/trace_clock.c                        | 2 +-
 arch/x86/kernel/tsc.c                                | 4 ++--
 arch/x86/kvm/vmx.c                                   | 2 +-
 arch/x86/lib/delay.c                                 | 2 +-
 drivers/thermal/intel_powerclamp.c                   | 4 ++--
 tools/power/cpupower/debug/kernel/cpufreq-test_tsc.c | 4 ++--
 14 files changed, 22 insertions(+), 28 deletions(-)

diff --git a/arch/x86/boot/compressed/aslr.c b/arch/x86/boot/compressed/aslr.c
index d7b1f65..ea33236 100644
--- a/arch/x86/boot/compressed/aslr.c
+++ b/arch/x86/boot/compressed/aslr.c
@@ -82,7 +82,7 @@ static unsigned long get_random_long(void)
 
 	if (has_cpuflag(X86_FEATURE_TSC)) {
 		debug_putstr(" RDTSC");
-		rdtscll(raw);
+		raw = native_read_tsc();
 
 		random ^= raw;
 		use_i8254 = false;
diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
index d1afac7..7273b74 100644
--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -192,9 +192,6 @@ do {							\
 #define rdtscl(low)						\
 	((low) = (u32)native_read_tsc())
 
-#define rdtscll(val)						\
-	((val) = native_read_tsc())
-
 #define rdtscp(low, high, aux)					\
 do {                                                            \
 	unsigned long long _val = native_read_tscp(&(aux));     \
diff --git a/arch/x86/include/asm/tsc.h b/arch/x86/include/asm/tsc.h
index 3da1cc1..b488390 100644
--- a/arch/x86/include/asm/tsc.h
+++ b/arch/x86/include/asm/tsc.h
@@ -21,15 +21,12 @@ extern void disable_TSC(void);
 
 static inline cycles_t get_cycles(void)
 {
-	unsigned long long ret = 0;
-
 #ifndef CONFIG_X86_TSC
 	if (!cpu_has_tsc)
 		return 0;
 #endif
-	rdtscll(ret);
 
-	return ret;
+	return native_read_tsc();
 }
 
 extern void tsc_init(void);
diff --git a/arch/x86/kernel/apb_timer.c b/arch/x86/kernel/apb_timer.c
index 9fe111c..25efa53 100644
--- a/arch/x86/kernel/apb_timer.c
+++ b/arch/x86/kernel/apb_timer.c
@@ -263,7 +263,7 @@ static int apbt_clocksource_register(void)
 
 	/* Verify whether apbt counter works */
 	t1 = dw_apb_clocksource_read(clocksource_apbt);
-	rdtscll(start);
+	start = native_read_tsc();
 
 	/*
 	 * We don't know the TSC frequency yet, but waiting for
@@ -273,7 +273,7 @@ static int apbt_clocksource_register(void)
 	 */
 	do {
 		rep_nop();
-		rdtscll(now);
+		now = native_read_tsc();
 	} while ((now - start) < 200000UL);
 
 	/* APBT is the only always on clocksource, it has to work! */
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index dcb5285..51af1ed 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -457,7 +457,7 @@ static int lapic_next_deadline(unsigned long delta,
 {
 	u64 tsc;
 
-	rdtscll(tsc);
+	tsc = native_read_tsc();
 	wrmsrl(MSR_IA32_TSC_DEADLINE, tsc + (((u64) delta) * TSC_DIVISOR));
 	return 0;
 }
@@ -592,7 +592,7 @@ static void __init lapic_cal_handler(struct clock_event_device *dev)
 	unsigned long pm = acpi_pm_read_early();
 
 	if (cpu_has_tsc)
-		rdtscll(tsc);
+		tsc = native_read_tsc();
 
 	switch (lapic_cal_loops++) {
 	case 0:
@@ -1209,7 +1209,7 @@ void setup_local_APIC(void)
 	long long max_loops = cpu_khz ? cpu_khz : 1000000;
 
 	if (cpu_has_tsc)
-		rdtscll(tsc);
+		tsc = native_read_tsc();
 
 	if (disable_apic) {
 		disable_ioapic_support();
@@ -1293,7 +1293,7 @@ void setup_local_APIC(void)
 		}
 		if (queued) {
 			if (cpu_has_tsc && cpu_khz) {
-				rdtscll(ntsc);
+				ntsc = native_read_tsc();
 				max_loops = (cpu_khz << 10) - (ntsc - tsc);
 			} else
 				max_loops--;
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index df919ff..a5283d2 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -125,7 +125,7 @@ void mce_setup(struct mce *m)
 {
 	memset(m, 0, sizeof(struct mce));
 	m->cpu = m->extcpu = smp_processor_id();
-	rdtscll(m->tsc);
+	m->tsc = native_read_tsc();
 	/* We hope get_seconds stays lockless */
 	m->time = get_seconds();
 	m->cpuvendor = boot_cpu_data.x86_vendor;
@@ -1784,7 +1784,7 @@ static void collect_tscs(void *data)
 {
 	unsigned long *cpu_tsc = (unsigned long *)data;
 
-	rdtscll(cpu_tsc[smp_processor_id()]);
+	cpu_tsc[smp_processor_id()] = native_read_tsc();
 }
 
 static int mce_apei_read_done;
diff --git a/arch/x86/kernel/espfix_64.c b/arch/x86/kernel/espfix_64.c
index f5d0730..334a2a9 100644
--- a/arch/x86/kernel/espfix_64.c
+++ b/arch/x86/kernel/espfix_64.c
@@ -110,7 +110,7 @@ static void init_espfix_random(void)
 	 */
 	if (!arch_get_random_long(&rand)) {
 		/* The constant is an arbitrary large prime */
-		rdtscll(rand);
+		rand = native_read_tsc();
 		rand *= 0xc345c6b72fd16123UL;
 	}
 
diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
index 10757d0..cc390fe 100644
--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -735,7 +735,7 @@ static int hpet_clocksource_register(void)
 
 	/* Verify whether hpet counter works */
 	t1 = hpet_readl(HPET_COUNTER);
-	rdtscll(start);
+	start = native_read_tsc();
 
 	/*
 	 * We don't know the TSC frequency yet, but waiting for
@@ -745,7 +745,7 @@ static int hpet_clocksource_register(void)
 	 */
 	do {
 		rep_nop();
-		rdtscll(now);
+		now = native_read_tsc();
 	} while ((now - start) < 200000UL);
 
 	if (t1 == hpet_readl(HPET_COUNTER)) {
diff --git a/arch/x86/kernel/trace_clock.c b/arch/x86/kernel/trace_clock.c
index 25b9937..bd8f4d4 100644
--- a/arch/x86/kernel/trace_clock.c
+++ b/arch/x86/kernel/trace_clock.c
@@ -15,7 +15,7 @@ u64 notrace trace_clock_x86_tsc(void)
 	u64 ret;
 
 	rdtsc_barrier();
-	rdtscll(ret);
+	ret = native_read_tsc();
 
 	return ret;
 }
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index e7710cd..e66f5dc 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -248,7 +248,7 @@ static void set_cyc2ns_scale(unsigned long cpu_khz, int cpu)
 
 	data = cyc2ns_write_begin(cpu);
 
-	rdtscll(tsc_now);
+	tsc_now = native_read_tsc();
 	ns_now = cycles_2_ns(tsc_now);
 
 	/*
@@ -290,7 +290,7 @@ u64 native_sched_clock(void)
 	}
 
 	/* read the Time Stamp Counter: */
-	rdtscll(tsc_now);
+	tsc_now = native_read_tsc();
 
 	/* return the value in ns */
 	return cycles_2_ns(tsc_now);
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index e856dd5..4fa1cca 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2236,7 +2236,7 @@ static u64 guest_read_tsc(void)
 {
 	u64 host_tsc, tsc_offset;
 
-	rdtscll(host_tsc);
+	host_tsc = native_read_tsc();
 	tsc_offset = vmcs_read64(TSC_OFFSET);
 	return host_tsc + tsc_offset;
 }
diff --git a/arch/x86/lib/delay.c b/arch/x86/lib/delay.c
index 39d6a3d..9a52ad0 100644
--- a/arch/x86/lib/delay.c
+++ b/arch/x86/lib/delay.c
@@ -100,7 +100,7 @@ void use_tsc_delay(void)
 int read_current_timer(unsigned long *timer_val)
 {
 	if (delay_fn == delay_tsc) {
-		rdtscll(*timer_val);
+		*timer_val = native_read_tsc();
 		return 0;
 	}
 	return -1;
diff --git a/drivers/thermal/intel_powerclamp.c b/drivers/thermal/intel_powerclamp.c
index 5820e85..ab13448 100644
--- a/drivers/thermal/intel_powerclamp.c
+++ b/drivers/thermal/intel_powerclamp.c
@@ -340,7 +340,7 @@ static bool powerclamp_adjust_controls(unsigned int target_ratio,
 
 	/* check result for the last window */
 	msr_now = pkg_state_counter();
-	rdtscll(tsc_now);
+	tsc_now = native_read_tsc();
 
 	/* calculate pkg cstate vs tsc ratio */
 	if (!msr_last || !tsc_last)
@@ -482,7 +482,7 @@ static void poll_pkg_cstate(struct work_struct *dummy)
 	u64 val64;
 
 	msr_now = pkg_state_counter();
-	rdtscll(tsc_now);
+	tsc_now = native_read_tsc();
 	jiffies_now = jiffies;
 
 	/* calculate pkg cstate vs tsc ratio */
diff --git a/tools/power/cpupower/debug/kernel/cpufreq-test_tsc.c b/tools/power/cpupower/debug/kernel/cpufreq-test_tsc.c
index 5224ee5..f02b0c0 100644
--- a/tools/power/cpupower/debug/kernel/cpufreq-test_tsc.c
+++ b/tools/power/cpupower/debug/kernel/cpufreq-test_tsc.c
@@ -81,11 +81,11 @@ static int __init cpufreq_test_tsc(void)
 
 	printk(KERN_DEBUG "start--> \n");
 	then = read_pmtmr();
-        rdtscll(then_tsc);
+	then_tsc = native_read_tsc();
 	for (i=0;i<20;i++) {
 		mdelay(100);
 		now = read_pmtmr();
-		rdtscll(now_tsc);
+		now_tsc = native_read_tsc();
 		diff = (now - then) & 0xFFFFFF;
 		diff_tsc = now_tsc - then_tsc;
 		printk(KERN_DEBUG "t1: %08u t2: %08u diff_pmtmr: %08u diff_tsc: %016llu\n", then, now, diff, diff_tsc);

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [tip:x86/asm] x86/asm/tsc: Remove the rdtscp() and rdtscpll() macros
  2015-06-17  0:35 ` [PATCH v3 05/18] x86/tsc: Remove the rdtscp and rdtscpll macros Andy Lutomirski
@ 2015-07-06 15:41   ` tip-bot for Andy Lutomirski
  0 siblings, 0 replies; 57+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-07-06 15:41 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: luto, bp, lenb, linux-kernel, ray.huang, brgerst, torvalds, tglx,
	bp, kvm, ralf, dvlasenk, mingo, john.stultz, peterz, luto, hpa

Commit-ID:  ec69de52c648b1d9416a810943e68dbe9fe519f4
Gitweb:     http://git.kernel.org/tip/ec69de52c648b1d9416a810943e68dbe9fe519f4
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Thu, 25 Jun 2015 18:43:59 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 6 Jul 2015 15:23:26 +0200

x86/asm/tsc: Remove the rdtscp() and rdtscpll() macros

They have no users. Leave native_read_tscp(), which seems
potentially useful despite also having no callers.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm ML <kvm@vger.kernel.org>
Link: http://lkml.kernel.org/r/6abfa3ef80534b5d73898a48c4d25e069303cbe5.1434501121.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/msr.h | 9 ---------
 1 file changed, 9 deletions(-)

diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
index 7273b74..626f781 100644
--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -192,15 +192,6 @@ do {							\
 #define rdtscl(low)						\
 	((low) = (u32)native_read_tsc())
 
-#define rdtscp(low, high, aux)					\
-do {                                                            \
-	unsigned long long _val = native_read_tscp(&(aux));     \
-	(low) = (u32)_val;                                      \
-	(high) = (u32)(_val >> 32);                             \
-} while (0)
-
-#define rdtscpll(val, aux) (val) = native_read_tscp(&(aux))
-
 /*
  * 64-bit version of wrmsr_safe():
  */

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [tip:x86/asm] x86/asm/tsc: Use the full 64-bit TSC in delay_tsc()
  2015-06-17  0:35 ` [PATCH v3 06/18] x86/tsc: Use the full 64-bit tsc in tsc_delay Andy Lutomirski
@ 2015-07-06 15:41   ` tip-bot for Andy Lutomirski
  0 siblings, 0 replies; 57+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-07-06 15:41 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mingo, luto, bp, hpa, torvalds, lenb, ralf, luto, dvlasenk, kvm,
	bp, ray.huang, brgerst, john.stultz, linux-kernel, peterz, tglx

Commit-ID:  9cfa1a0279e22063a727fd204a75cf3672860d83
Gitweb:     http://git.kernel.org/tip/9cfa1a0279e22063a727fd204a75cf3672860d83
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Thu, 25 Jun 2015 18:44:00 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 6 Jul 2015 15:23:27 +0200

x86/asm/tsc: Use the full 64-bit TSC in delay_tsc()

As a very minor optimization, delay_tsc() was only using the low
32 bits of the TSC. It's a delay function, so just use the whole
thing.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm ML <kvm@vger.kernel.org>
Link: http://lkml.kernel.org/r/bd1a277c71321b67c4794970cb5ace05efe21ab6.1434501121.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/lib/delay.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/lib/delay.c b/arch/x86/lib/delay.c
index 9a52ad0..35115f3 100644
--- a/arch/x86/lib/delay.c
+++ b/arch/x86/lib/delay.c
@@ -49,16 +49,16 @@ static void delay_loop(unsigned long loops)
 /* TSC based delay: */
 static void delay_tsc(unsigned long __loops)
 {
-	u32 bclock, now, loops = __loops;
+	u64 bclock, now, loops = __loops;
 	int cpu;
 
 	preempt_disable();
 	cpu = smp_processor_id();
 	rdtsc_barrier();
-	rdtscl(bclock);
+	bclock = native_read_tsc();
 	for (;;) {
 		rdtsc_barrier();
-		rdtscl(now);
+		now = native_read_tsc();
 		if ((now - bclock) >= loops)
 			break;
 
@@ -80,7 +80,7 @@ static void delay_tsc(unsigned long __loops)
 			loops -= (now - bclock);
 			cpu = smp_processor_id();
 			rdtsc_barrier();
-			rdtscl(bclock);
+			bclock = native_read_tsc();
 		}
 	}
 	preempt_enable();
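
For readers following along, here is a minimal userspace sketch of the difference between the two delta computations. It is not kernel code; the helper names elapsed_low32() and elapsed_full64() are made up for illustration, and the counter values are passed in rather than read with RDTSC.

```c
#include <stdint.h>

/*
 * Elapsed cycles as seen through only the low 32 bits of the counter,
 * roughly what an rdtscl()-style read gives you.  The result is only
 * meaningful modulo 2^32.
 */
static uint32_t elapsed_low32(uint64_t start, uint64_t now)
{
	return (uint32_t)now - (uint32_t)start;
}

/* Elapsed cycles using the full 64-bit counter values. */
static uint64_t elapsed_full64(uint64_t start, uint64_t now)
{
	return now - start;
}
```

Both helpers agree while fewer than 2^32 cycles have elapsed, even when the low half wraps in between. Once the interval reaches 2^32 cycles, the 32-bit delta silently drops the upper bits and the 64-bit one does not.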

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [tip:x86/asm] x86/asm/tsc, x86/cpu/amd: Use the full 64-bit TSC to detect the 2.6.2 bug
  2015-06-17  0:35 ` [PATCH v3 07/18] x86/cpu/amd: Use the full 64-bit TSC to detect the 2.6.2 bug Andy Lutomirski
@ 2015-07-06 15:41   ` tip-bot for Andy Lutomirski
  0 siblings, 0 replies; 57+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-07-06 15:41 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hpa, ralf, luto, mingo, brgerst, linux-kernel, luto, tglx, bp,
	kvm, ray.huang, lenb, dvlasenk, torvalds, peterz, bp,
	john.stultz

Commit-ID:  3796366614598e48edf0561b86f18c230a7debc8
Gitweb:     http://git.kernel.org/tip/3796366614598e48edf0561b86f18c230a7debc8
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Thu, 25 Jun 2015 18:44:01 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 6 Jul 2015 15:23:27 +0200

x86/asm/tsc, x86/cpu/amd: Use the full 64-bit TSC to detect the 2.6.2 bug

This code is timing 100k indirect calls, so the added overhead
of counting the number of cycles elapsed as a 64-bit number
should be insignificant.  Drop the optimization of using a
32-bit count.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm ML <kvm@vger.kernel.org>
Link: http://lkml.kernel.org/r/d58f339a9c0dd8352b50d2f7a216f67ec2844f20.1434501121.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/cpu/amd.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index dd3a4ba..a69710d 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -114,7 +114,7 @@ static void init_amd_k6(struct cpuinfo_x86 *c)
 		const int K6_BUG_LOOP = 1000000;
 		int n;
 		void (*f_vide)(void);
-		unsigned long d, d2;
+		u64 d, d2;
 
 		printk(KERN_INFO "AMD K6 stepping B detected - ");
 
@@ -125,10 +125,10 @@ static void init_amd_k6(struct cpuinfo_x86 *c)
 
 		n = K6_BUG_LOOP;
 		f_vide = vide;
-		rdtscl(d);
+		d = native_read_tsc();
 		while (n--)
 			f_vide();
-		rdtscl(d2);
+		d2 = native_read_tsc();
 		d = d2-d;
 
 		if (d > 20*K6_BUG_LOOP)
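
As an illustration only, the threshold test in the hunk above can be sketched in userspace C like this. The helper name loop_too_slow() and its parameters are hypothetical, and the TSC samples are passed in rather than read from the CPU.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true if a timed loop averaged more than
 * max_cycles_per_call cycles per iteration.  Working in 64 bits means
 * neither the delta nor the product needs any wraparound reasoning
 * for realistic loop counts.
 */
static bool loop_too_slow(uint64_t tsc_start, uint64_t tsc_end,
			  uint64_t loops, uint64_t max_cycles_per_call)
{
	uint64_t elapsed = tsc_end - tsc_start;

	return elapsed > max_cycles_per_call * loops;
}
```

With loops set to 1000000 and max_cycles_per_call set to 20, this mirrors the `d > 20*K6_BUG_LOOP` comparison in the diff.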

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [tip:x86/asm] x86/asm/tsc, drivers/net/hamradio/baycom_epp: Replace rdtscl() with native_read_tsc()
  2015-06-17  0:35 ` [PATCH v3 08/18] baycom_epp: Replace rdtscl() with native_read_tsc() Andy Lutomirski
  2015-06-17  0:49   ` Thomas Sailer
  2015-06-20 13:54   ` walter harms
@ 2015-07-06 15:42   ` tip-bot for Andy Lutomirski
  2 siblings, 0 replies; 57+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-07-06 15:42 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: lenb, tglx, ralf, bp, mingo, ray.huang, peterz, wharms, dvlasenk,
	hpa, bp, linux-kernel, luto, brgerst, john.stultz, luto,
	torvalds, kvm

Commit-ID:  e18d1f8df176527332761ac29ee3097f8584c478
Gitweb:     http://git.kernel.org/tip/e18d1f8df176527332761ac29ee3097f8584c478
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Thu, 25 Jun 2015 18:44:02 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 6 Jul 2015 15:23:27 +0200

x86/asm/tsc, drivers/net/hamradio/baycom_epp: Replace rdtscl() with native_read_tsc()

This is only used if BAYCOM_DEBUG is defined.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Thomas Sailer <t.sailer@alumni.ethz.ch>
Acked-by: Walter Harms <wharms@bfs.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: linux-hams@vger.kernel.org
Link: http://lkml.kernel.org/r/1195ce0c7f34169ff3006341b77806184a46b9bf.1434501121.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 drivers/net/hamradio/baycom_epp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/hamradio/baycom_epp.c b/drivers/net/hamradio/baycom_epp.c
index 83c7cce..44e5c3b 100644
--- a/drivers/net/hamradio/baycom_epp.c
+++ b/drivers/net/hamradio/baycom_epp.c
@@ -638,7 +638,7 @@ static int receive(struct net_device *dev, int cnt)
 #define GETTICK(x)                                                \
 ({                                                                \
 	if (cpu_has_tsc)                                          \
-		rdtscl(x);                                        \
+		x = (unsigned int)native_read_tsc();		  \
 })
 #else /* __i386__ */
 #define GETTICK(x)

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [tip:x86/asm] x86/asm/tsc, staging/lirc_serial: Remove TSC-based timing
  2015-06-17  0:35 ` [PATCH v3 09/18] staging/lirc_serial: Remove TSC-based timing Andy Lutomirski
@ 2015-07-06 15:42   ` tip-bot for Andy Lutomirski
  0 siblings, 0 replies; 57+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-07-06 15:42 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: bp, bp, tglx, hpa, lenb, kvm, torvalds, jarod, linux-kernel,
	peterz, luto, luto, gregkh, brgerst, dvlasenk, john.stultz, ralf,
	ray.huang, mingo

Commit-ID:  3a2c16c8489d967de10b3b7f5cc0f7cab4337770
Gitweb:     http://git.kernel.org/tip/3a2c16c8489d967de10b3b7f5cc0f7cab4337770
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Thu, 25 Jun 2015 18:44:03 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 6 Jul 2015 15:23:27 +0200

x86/asm/tsc, staging/lirc_serial: Remove TSC-based timing

It wasn't compiled in by default. I suspect that the driver was
and still is broken, though -- it's calling udelay with a
parameter that's derived from loops_per_jiffy.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Jarod Wilson <jarod@wilsonet.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: devel@driverdev.osuosl.org
Cc: kvm ML <kvm@vger.kernel.org>
Link: http://lkml.kernel.org/r/c95df47c5405b494d19d20b2852a9378c9f661f3.1434501121.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 drivers/staging/media/lirc/lirc_serial.c | 63 ++------------------------------
 1 file changed, 4 insertions(+), 59 deletions(-)

diff --git a/drivers/staging/media/lirc/lirc_serial.c b/drivers/staging/media/lirc/lirc_serial.c
index dc79844..465796a 100644
--- a/drivers/staging/media/lirc/lirc_serial.c
+++ b/drivers/staging/media/lirc/lirc_serial.c
@@ -327,9 +327,6 @@ static void safe_udelay(unsigned long usecs)
  * time
  */
 
-/* So send_pulse can quickly convert microseconds to clocks */
-static unsigned long conv_us_to_clocks;
-
 static int init_timing_params(unsigned int new_duty_cycle,
 		unsigned int new_freq)
 {
@@ -344,7 +341,6 @@ static int init_timing_params(unsigned int new_duty_cycle,
 	/* How many clocks in a microsecond?, avoiding long long divide */
 	work = loops_per_sec;
 	work *= 4295;  /* 4295 = 2^32 / 1e6 */
-	conv_us_to_clocks = work >> 32;
 
 	/*
 	 * Carrier period in clocks, approach good up to 32GHz clock,
@@ -357,10 +353,9 @@ static int init_timing_params(unsigned int new_duty_cycle,
 	pulse_width = period * duty_cycle / 100;
 	space_width = period - pulse_width;
 	dprintk("in init_timing_params, freq=%d, duty_cycle=%d, "
-		"clk/jiffy=%ld, pulse=%ld, space=%ld, "
-		"conv_us_to_clocks=%ld\n",
+		"clk/jiffy=%ld, pulse=%ld, space=%ld\n",
 		freq, duty_cycle, __this_cpu_read(cpu_info.loops_per_jiffy),
-		pulse_width, space_width, conv_us_to_clocks);
+		pulse_width, space_width);
 	return 0;
 }
 #else /* ! USE_RDTSC */
@@ -431,63 +426,14 @@ static long send_pulse_irdeo(unsigned long length)
 	return ret;
 }
 
-#ifdef USE_RDTSC
-/* Version that uses Pentium rdtsc instruction to measure clocks */
-
-/*
- * This version does sub-microsecond timing using rdtsc instruction,
- * and does away with the fudged LIRC_SERIAL_TRANSMITTER_LATENCY
- * Implicitly i586 architecture...  - Steve
- */
-
-static long send_pulse_homebrew_softcarrier(unsigned long length)
-{
-	int flag;
-	unsigned long target, start, now;
-
-	/* Get going quick as we can */
-	rdtscl(start);
-	on();
-	/* Convert length from microseconds to clocks */
-	length *= conv_us_to_clocks;
-	/* And loop till time is up - flipping at right intervals */
-	now = start;
-	target = pulse_width;
-	flag = 1;
-	/*
-	 * FIXME: This looks like a hard busy wait, without even an occasional,
-	 * polite, cpu_relax() call.  There's got to be a better way?
-	 *
-	 * The i2c code has the result of a lot of bit-banging work, I wonder if
-	 * there's something there which could be helpful here.
-	 */
-	while ((now - start) < length) {
-		/* Delay till flip time */
-		do {
-			rdtscl(now);
-		} while ((now - start) < target);
-
-		/* flip */
-		if (flag) {
-			rdtscl(now);
-			off();
-			target += space_width;
-		} else {
-			rdtscl(now); on();
-			target += pulse_width;
-		}
-		flag = !flag;
-	}
-	rdtscl(now);
-	return ((now - start) - length) / conv_us_to_clocks;
-}
-#else /* ! USE_RDTSC */
 /* Version using udelay() */
 
 /*
  * here we use fixed point arithmetic, with 8
  * fractional bits.  that gets us within 0.1% or so of the right average
  * frequency, albeit with some jitter in pulse length - Steve
+ *
+ * This should use ndelay instead.
  */
 
 /* To match 8 fractional bits used for pulse/space length */
@@ -520,7 +466,6 @@ static long send_pulse_homebrew_softcarrier(unsigned long length)
 	}
 	return (actual-length) >> 8;
 }
-#endif /* USE_RDTSC */
 
 static long send_pulse_homebrew(unsigned long length)
 {
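
The "4295 = 2^32 / 1e6" context line above is a multiply-and-shift trick for dividing by one million without a 64-bit divide. A minimal userspace sketch, assuming a 32-bit clock rate as input (the function name clocks_per_usec() is made up here and does not exist in the driver):

```c
#include <stdint.h>

/*
 * Approximate clocks_per_sec / 1000000 without a 64-bit division.
 * 4295 is 2^32 / 1e6 rounded up, so multiplying by it and shifting
 * right by 32 divides by roughly one million (very slightly high).
 */
static uint32_t clocks_per_usec(uint32_t clocks_per_sec)
{
	uint64_t work = clocks_per_sec;

	work *= 4295;
	return (uint32_t)(work >> 32);
}
```

Because the constant is rounded up, the result can overshoot an exact division by a few parts per million, which is harmless for a delay-loop calibration but worth knowing about.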

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [tip:x86/asm] x86/asm/tsc, input/joystick/analog: Switch from rdtscl() to native_read_tsc()
  2015-06-17  0:35 ` [PATCH v3 10/18] input/joystick/analog: Switch from rdtscl() to native_read_tsc() Andy Lutomirski
@ 2015-07-06 15:42   ` tip-bot for Andy Lutomirski
  0 siblings, 0 replies; 57+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-07-06 15:42 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: john.stultz, hpa, dmitry.torokhov, bp, ray.huang, luto, bp, luto,
	linux-kernel, lenb, torvalds, tglx, kvm, mingo, brgerst, peterz,
	ralf, dvlasenk

Commit-ID:  016bfc449a88c833e949414a41748b359843dbb1
Gitweb:     http://git.kernel.org/tip/016bfc449a88c833e949414a41748b359843dbb1
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Thu, 25 Jun 2015 18:44:04 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 6 Jul 2015 15:23:28 +0200

x86/asm/tsc, input/joystick/analog: Switch from rdtscl() to native_read_tsc()

This timing code is hideous, and this doesn't help.  It gets rid
of one of the last users of rdtscl(), though.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: linux-input@vger.kernel.org
Link: http://lkml.kernel.org/r/90d19b3cea0e05ca6f333d1598daa38afb993260.1434501121.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 drivers/input/joystick/analog.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/input/joystick/analog.c b/drivers/input/joystick/analog.c
index 4284080..f871b4f 100644
--- a/drivers/input/joystick/analog.c
+++ b/drivers/input/joystick/analog.c
@@ -143,7 +143,7 @@ struct analog_port {
 
 #include <linux/i8253.h>
 
-#define GET_TIME(x)	do { if (cpu_has_tsc) rdtscl(x); else x = get_time_pit(); } while (0)
+#define GET_TIME(x)	do { if (cpu_has_tsc) x = (unsigned int)native_read_tsc(); else x = get_time_pit(); } while (0)
 #define DELTA(x,y)	(cpu_has_tsc ? ((y) - (x)) : ((x) - (y) + ((x) < (y) ? PIT_TICK_RATE / HZ : 0)))
 #define TIME_NAME	(cpu_has_tsc?"TSC":"PIT")
 static unsigned int get_time_pit(void)
@@ -160,7 +160,7 @@ static unsigned int get_time_pit(void)
         return count;
 }
 #elif defined(__x86_64__)
-#define GET_TIME(x)	rdtscl(x)
+#define GET_TIME(x)	do { x = (unsigned int)native_read_tsc(); } while (0)
 #define DELTA(x,y)	((y)-(x))
 #define TIME_NAME	"TSC"
 #elif defined(__alpha__) || defined(CONFIG_MN10300) || defined(CONFIG_ARM) || defined(CONFIG_ARM64) || defined(CONFIG_TILE)

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [tip:x86/asm] x86/asm/tsc, drivers/input/gameport: Replace rdtscl() with native_read_tsc()
  2015-06-17  0:35 ` [PATCH v3 11/18] drivers/input/gameport: Replace rdtscl() with native_read_tsc() Andy Lutomirski
@ 2015-07-06 15:43   ` tip-bot for Andy Lutomirski
  0 siblings, 0 replies; 57+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-07-06 15:43 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: dmitry.torokhov, john.stultz, linux-kernel, ralf, hpa, torvalds,
	lenb, ray.huang, peterz, kvm, luto, dvlasenk, luto, brgerst, bp,
	tglx, bp, mingo

Commit-ID:  732f374ba50b64150bf954c2d4e9f6fae583cccf
Gitweb:     http://git.kernel.org/tip/732f374ba50b64150bf954c2d4e9f6fae583cccf
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Thu, 25 Jun 2015 18:44:05 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 6 Jul 2015 15:23:28 +0200

x86/asm/tsc, drivers/input/gameport: Replace rdtscl() with native_read_tsc()

It's unclear to me why this code exists in the first place.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: linux-input@vger.kernel.org
Link: http://lkml.kernel.org/r/9e058e72f4cf1f13c6483c1360b39c3d188a2c2a.1434501121.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 drivers/input/gameport/gameport.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/input/gameport/gameport.c b/drivers/input/gameport/gameport.c
index e853a21..abc0cb2 100644
--- a/drivers/input/gameport/gameport.c
+++ b/drivers/input/gameport/gameport.c
@@ -149,9 +149,9 @@ static int old_gameport_measure_speed(struct gameport *gameport)
 
 	for(i = 0; i < 50; i++) {
 		local_irq_save(flags);
-		rdtscl(t1);
+		t1 = native_read_tsc();
 		for (t = 0; t < 50; t++) gameport_read(gameport);
-		rdtscl(t2);
+		t2 = native_read_tsc();
 		local_irq_restore(flags);
 		udelay(i * 10);
 		if (t2 - t1 < tx) tx = t2 - t1;

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [tip:x86/asm] x86/asm/tsc: Remove rdtscl()
  2015-06-17  0:36 ` [PATCH v3 12/18] x86/tsc: Remove rdtscl() Andy Lutomirski
@ 2015-07-06 15:43   ` tip-bot for Andy Lutomirski
  0 siblings, 0 replies; 57+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-07-06 15:43 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hpa, peterz, brgerst, luto, bp, john.stultz, mingo, ralf, bp,
	torvalds, dvlasenk, linux-kernel, lenb, kvm, tglx, ray.huang,
	luto

Commit-ID:  fe47ae6e1a5005b2e82f7eab57b5c3820453293a
Gitweb:     http://git.kernel.org/tip/fe47ae6e1a5005b2e82f7eab57b5c3820453293a
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Thu, 25 Jun 2015 18:44:06 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 6 Jul 2015 15:23:28 +0200

x86/asm/tsc: Remove rdtscl()

It has no more callers, and it was never a very sensible
interface to begin with. Users of the TSC should either read all
64 bits or explicitly throw out the high bits.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm ML <kvm@vger.kernel.org>
Link: http://lkml.kernel.org/r/250105f7cee519be9d7fc4464b5784caafc8f4fe.1434501121.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/msr.h | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
index 626f781..c89ed6c 100644
--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -189,9 +189,6 @@ do {							\
 
 #endif	/* !CONFIG_PARAVIRT */
 
-#define rdtscl(low)						\
-	((low) = (u32)native_read_tsc())
-
 /*
  * 64-bit version of wrmsr_safe():
  */
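
To make "explicitly throw out the high bits" concrete, here is a small userspace sketch. The helpers tsc_low32() and tsc_high32() are hypothetical names, not kernel API; callers in this series simply write the cast inline, as in the baycom_epp and analog joystick patches.

```c
#include <stdint.h>

/* Keep only the low 32 bits of a 64-bit TSC sample, on purpose. */
static uint32_t tsc_low32(uint64_t tsc)
{
	return (uint32_t)tsc;
}

/* The high 32 bits, for callers that want both halves. */
static uint32_t tsc_high32(uint64_t tsc)
{
	return (uint32_t)(tsc >> 32);
}
```

The point of spelling out the cast is that the truncation is visible at the call site instead of being hidden inside a macro name.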

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [tip:x86/asm] x86/asm/tsc: Rename native_read_tsc() to rdtsc()
  2015-06-17  0:36 ` [PATCH v3 13/18] x86/tsc: Rename native_read_tsc() to rdtsc() Andy Lutomirski
  2015-06-24 21:38   ` Borislav Petkov
@ 2015-07-06 15:43   ` tip-bot for Andy Lutomirski
  1 sibling, 0 replies; 57+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-07-06 15:43 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: luto, lenb, bp, tglx, brgerst, ralf, peterz, john.stultz, luto,
	linux-kernel, dvlasenk, kvm, mingo, hpa, torvalds, bp, ray.huang

Commit-ID:  4ea1636b04dbd66536fa387bae2eea463efc705b
Gitweb:     http://git.kernel.org/tip/4ea1636b04dbd66536fa387bae2eea463efc705b
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Thu, 25 Jun 2015 18:44:07 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 6 Jul 2015 15:23:28 +0200

x86/asm/tsc: Rename native_read_tsc() to rdtsc()

Now that there is no paravirt TSC, the "native" is
inappropriate. The function does RDTSC, so give it the obvious
name: rdtsc().

Suggested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm ML <kvm@vger.kernel.org>
Link: http://lkml.kernel.org/r/fd43e16281991f096c1e4d21574d9e1402c62d39.1434501121.git.luto@kernel.org
[ Ported it to v4.2-rc1. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/boot/compressed/aslr.c                      |  2 +-
 arch/x86/entry/vdso/vclock_gettime.c                 |  2 +-
 arch/x86/include/asm/msr.h                           | 11 ++++++++++-
 arch/x86/include/asm/pvclock.h                       |  2 +-
 arch/x86/include/asm/stackprotector.h                |  2 +-
 arch/x86/include/asm/tsc.h                           |  2 +-
 arch/x86/kernel/apb_timer.c                          |  8 ++++----
 arch/x86/kernel/apic/apic.c                          |  8 ++++----
 arch/x86/kernel/cpu/amd.c                            |  4 ++--
 arch/x86/kernel/cpu/mcheck/mce.c                     |  4 ++--
 arch/x86/kernel/espfix_64.c                          |  2 +-
 arch/x86/kernel/hpet.c                               |  4 ++--
 arch/x86/kernel/trace_clock.c                        |  2 +-
 arch/x86/kernel/tsc.c                                |  4 ++--
 arch/x86/kvm/lapic.c                                 |  4 ++--
 arch/x86/kvm/svm.c                                   |  4 ++--
 arch/x86/kvm/vmx.c                                   |  4 ++--
 arch/x86/kvm/x86.c                                   | 12 ++++++------
 arch/x86/lib/delay.c                                 |  8 ++++----
 drivers/cpufreq/intel_pstate.c                       |  2 +-
 drivers/input/gameport/gameport.c                    |  4 ++--
 drivers/input/joystick/analog.c                      |  4 ++--
 drivers/net/hamradio/baycom_epp.c                    |  2 +-
 drivers/thermal/intel_powerclamp.c                   |  4 ++--
 tools/power/cpupower/debug/kernel/cpufreq-test_tsc.c |  4 ++--
 25 files changed, 59 insertions(+), 50 deletions(-)

diff --git a/arch/x86/boot/compressed/aslr.c b/arch/x86/boot/compressed/aslr.c
index ea33236..6a9b96b 100644
--- a/arch/x86/boot/compressed/aslr.c
+++ b/arch/x86/boot/compressed/aslr.c
@@ -82,7 +82,7 @@ static unsigned long get_random_long(void)
 
 	if (has_cpuflag(X86_FEATURE_TSC)) {
 		debug_putstr(" RDTSC");
-		raw = native_read_tsc();
+		raw = rdtsc();
 
 		random ^= raw;
 		use_i8254 = false;
diff --git a/arch/x86/entry/vdso/vclock_gettime.c b/arch/x86/entry/vdso/vclock_gettime.c
index 972b488..0340d93 100644
--- a/arch/x86/entry/vdso/vclock_gettime.c
+++ b/arch/x86/entry/vdso/vclock_gettime.c
@@ -186,7 +186,7 @@ notrace static cycle_t vread_tsc(void)
 	 * but no one has ever seen it happen.
 	 */
 	rdtsc_barrier();
-	ret = (cycle_t)native_read_tsc();
+	ret = (cycle_t)rdtsc();
 
 	last = gtod->cycle_last;
 
diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
index c89ed6c..ff0c120 100644
--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -109,7 +109,16 @@ notrace static inline int native_write_msr_safe(unsigned int msr,
 extern int rdmsr_safe_regs(u32 regs[8]);
 extern int wrmsr_safe_regs(u32 regs[8]);
 
-static __always_inline unsigned long long native_read_tsc(void)
+/**
+ * rdtsc() - returns the current TSC without ordering constraints
+ *
+ * rdtsc() returns the result of RDTSC as a 64-bit integer.  The
+ * only ordering constraint it supplies is the ordering implied by
+ * "asm volatile": it will put the RDTSC in the place you expect.  The
+ * CPU can and will speculatively execute that RDTSC, though, so the
+ * results can be non-monotonic if compared on different CPUs.
+ */
+static __always_inline unsigned long long rdtsc(void)
 {
 	DECLARE_ARGS(val, low, high);
 
diff --git a/arch/x86/include/asm/pvclock.h b/arch/x86/include/asm/pvclock.h
index 2bd69d6..5c490db 100644
--- a/arch/x86/include/asm/pvclock.h
+++ b/arch/x86/include/asm/pvclock.h
@@ -62,7 +62,7 @@ static inline u64 pvclock_scale_delta(u64 delta, u32 mul_frac, int shift)
 static __always_inline
 u64 pvclock_get_nsec_offset(const struct pvclock_vcpu_time_info *src)
 {
-	u64 delta = native_read_tsc() - src->tsc_timestamp;
+	u64 delta = rdtsc() - src->tsc_timestamp;
 	return pvclock_scale_delta(delta, src->tsc_to_system_mul,
 				   src->tsc_shift);
 }
diff --git a/arch/x86/include/asm/stackprotector.h b/arch/x86/include/asm/stackprotector.h
index bc5fa2a..58505f0 100644
--- a/arch/x86/include/asm/stackprotector.h
+++ b/arch/x86/include/asm/stackprotector.h
@@ -72,7 +72,7 @@ static __always_inline void boot_init_stack_canary(void)
 	 * on during the bootup the random pool has true entropy too.
 	 */
 	get_random_bytes(&canary, sizeof(canary));
-	tsc = native_read_tsc();
+	tsc = rdtsc();
 	canary += tsc + (tsc << 32UL);
 
 	current->stack_canary = canary;
diff --git a/arch/x86/include/asm/tsc.h b/arch/x86/include/asm/tsc.h
index b488390..3df7675 100644
--- a/arch/x86/include/asm/tsc.h
+++ b/arch/x86/include/asm/tsc.h
@@ -26,7 +26,7 @@ static inline cycles_t get_cycles(void)
 		return 0;
 #endif
 
-	return native_read_tsc();
+	return rdtsc();
 }
 
 extern void tsc_init(void);
diff --git a/arch/x86/kernel/apb_timer.c b/arch/x86/kernel/apb_timer.c
index 25efa53..222a570 100644
--- a/arch/x86/kernel/apb_timer.c
+++ b/arch/x86/kernel/apb_timer.c
@@ -263,7 +263,7 @@ static int apbt_clocksource_register(void)
 
 	/* Verify whether apbt counter works */
 	t1 = dw_apb_clocksource_read(clocksource_apbt);
-	start = native_read_tsc();
+	start = rdtsc();
 
 	/*
 	 * We don't know the TSC frequency yet, but waiting for
@@ -273,7 +273,7 @@ static int apbt_clocksource_register(void)
 	 */
 	do {
 		rep_nop();
-		now = native_read_tsc();
+		now = rdtsc();
 	} while ((now - start) < 200000UL);
 
 	/* APBT is the only always on clocksource, it has to work! */
@@ -390,13 +390,13 @@ unsigned long apbt_quick_calibrate(void)
 	old = dw_apb_clocksource_read(clocksource_apbt);
 	old += loop;
 
-	t1 = native_read_tsc();
+	t1 = rdtsc();
 
 	do {
 		new = dw_apb_clocksource_read(clocksource_apbt);
 	} while (new < old);
 
-	t2 = native_read_tsc();
+	t2 = rdtsc();
 
 	shift = 5;
 	if (unlikely(loop >> shift == 0)) {
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 51af1ed..0d71cd9 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -457,7 +457,7 @@ static int lapic_next_deadline(unsigned long delta,
 {
 	u64 tsc;
 
-	tsc = native_read_tsc();
+	tsc = rdtsc();
 	wrmsrl(MSR_IA32_TSC_DEADLINE, tsc + (((u64) delta) * TSC_DIVISOR));
 	return 0;
 }
@@ -592,7 +592,7 @@ static void __init lapic_cal_handler(struct clock_event_device *dev)
 	unsigned long pm = acpi_pm_read_early();
 
 	if (cpu_has_tsc)
-		tsc = native_read_tsc();
+		tsc = rdtsc();
 
 	switch (lapic_cal_loops++) {
 	case 0:
@@ -1209,7 +1209,7 @@ void setup_local_APIC(void)
 	long long max_loops = cpu_khz ? cpu_khz : 1000000;
 
 	if (cpu_has_tsc)
-		tsc = native_read_tsc();
+		tsc = rdtsc();
 
 	if (disable_apic) {
 		disable_ioapic_support();
@@ -1293,7 +1293,7 @@ void setup_local_APIC(void)
 		}
 		if (queued) {
 			if (cpu_has_tsc && cpu_khz) {
-				ntsc = native_read_tsc();
+				ntsc = rdtsc();
 				max_loops = (cpu_khz << 10) - (ntsc - tsc);
 			} else
 				max_loops--;
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index a69710d..51ad2af 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -125,10 +125,10 @@ static void init_amd_k6(struct cpuinfo_x86 *c)
 
 		n = K6_BUG_LOOP;
 		f_vide = vide;
-		d = native_read_tsc();
+		d = rdtsc();
 		while (n--)
 			f_vide();
-		d2 = native_read_tsc();
+		d2 = rdtsc();
 		d = d2-d;
 
 		if (d > 20*K6_BUG_LOOP)
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index a5283d2..96ccecc 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -125,7 +125,7 @@ void mce_setup(struct mce *m)
 {
 	memset(m, 0, sizeof(struct mce));
 	m->cpu = m->extcpu = smp_processor_id();
-	m->tsc = native_read_tsc();
+	m->tsc = rdtsc();
 	/* We hope get_seconds stays lockless */
 	m->time = get_seconds();
 	m->cpuvendor = boot_cpu_data.x86_vendor;
@@ -1784,7 +1784,7 @@ static void collect_tscs(void *data)
 {
 	unsigned long *cpu_tsc = (unsigned long *)data;
 
-	cpu_tsc[smp_processor_id()] = native_read_tsc();
+	cpu_tsc[smp_processor_id()] = rdtsc();
 }
 
 static int mce_apei_read_done;
diff --git a/arch/x86/kernel/espfix_64.c b/arch/x86/kernel/espfix_64.c
index 334a2a9..67315cd 100644
--- a/arch/x86/kernel/espfix_64.c
+++ b/arch/x86/kernel/espfix_64.c
@@ -110,7 +110,7 @@ static void init_espfix_random(void)
 	 */
 	if (!arch_get_random_long(&rand)) {
 		/* The constant is an arbitrary large prime */
-		rand = native_read_tsc();
+		rand = rdtsc();
 		rand *= 0xc345c6b72fd16123UL;
 	}
 
diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
index cc390fe..f75c590 100644
--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -735,7 +735,7 @@ static int hpet_clocksource_register(void)
 
 	/* Verify whether hpet counter works */
 	t1 = hpet_readl(HPET_COUNTER);
-	start = native_read_tsc();
+	start = rdtsc();
 
 	/*
 	 * We don't know the TSC frequency yet, but waiting for
@@ -745,7 +745,7 @@ static int hpet_clocksource_register(void)
 	 */
 	do {
 		rep_nop();
-		now = native_read_tsc();
+		now = rdtsc();
 	} while ((now - start) < 200000UL);
 
 	if (t1 == hpet_readl(HPET_COUNTER)) {
diff --git a/arch/x86/kernel/trace_clock.c b/arch/x86/kernel/trace_clock.c
index bd8f4d4..67efb8c 100644
--- a/arch/x86/kernel/trace_clock.c
+++ b/arch/x86/kernel/trace_clock.c
@@ -15,7 +15,7 @@ u64 notrace trace_clock_x86_tsc(void)
 	u64 ret;
 
 	rdtsc_barrier();
-	ret = native_read_tsc();
+	ret = rdtsc();
 
 	return ret;
 }
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index e66f5dc..21d6e04 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -248,7 +248,7 @@ static void set_cyc2ns_scale(unsigned long cpu_khz, int cpu)
 
 	data = cyc2ns_write_begin(cpu);
 
-	tsc_now = native_read_tsc();
+	tsc_now = rdtsc();
 	ns_now = cycles_2_ns(tsc_now);
 
 	/*
@@ -290,7 +290,7 @@ u64 native_sched_clock(void)
 	}
 
 	/* read the Time Stamp Counter: */
-	tsc_now = native_read_tsc();
+	tsc_now = rdtsc();
 
 	/* return the value in ns */
 	return cycles_2_ns(tsc_now);
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 954e98a..2f0ade4 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1172,7 +1172,7 @@ void wait_lapic_expire(struct kvm_vcpu *vcpu)
 
 	tsc_deadline = apic->lapic_timer.expired_tscdeadline;
 	apic->lapic_timer.expired_tscdeadline = 0;
-	guest_tsc = kvm_x86_ops->read_l1_tsc(vcpu, native_read_tsc());
+	guest_tsc = kvm_x86_ops->read_l1_tsc(vcpu, rdtsc());
 	trace_kvm_wait_lapic_expire(vcpu->vcpu_id, guest_tsc - tsc_deadline);
 
 	/* __delay is delay_tsc whenever the hardware has TSC, thus always.  */
@@ -1240,7 +1240,7 @@ static void start_apic_timer(struct kvm_lapic *apic)
 		local_irq_save(flags);
 
 		now = apic->lapic_timer.timer.base->get_time();
-		guest_tsc = kvm_x86_ops->read_l1_tsc(vcpu, native_read_tsc());
+		guest_tsc = kvm_x86_ops->read_l1_tsc(vcpu, rdtsc());
 		if (likely(tscdeadline > guest_tsc)) {
 			ns = (tscdeadline - guest_tsc) * 1000000ULL;
 			do_div(ns, this_tsc_khz);
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 602b974..8dfbad7 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1080,7 +1080,7 @@ static u64 svm_compute_tsc_offset(struct kvm_vcpu *vcpu, u64 target_tsc)
 {
 	u64 tsc;
 
-	tsc = svm_scale_tsc(vcpu, native_read_tsc());
+	tsc = svm_scale_tsc(vcpu, rdtsc());
 
 	return target_tsc - tsc;
 }
@@ -3079,7 +3079,7 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	switch (msr_info->index) {
 	case MSR_IA32_TSC: {
 		msr_info->data = svm->vmcb->control.tsc_offset +
-			svm_scale_tsc(vcpu, native_read_tsc());
+			svm_scale_tsc(vcpu, rdtsc());
 
 		break;
 	}
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 4fa1cca..10d69a6 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2236,7 +2236,7 @@ static u64 guest_read_tsc(void)
 {
 	u64 host_tsc, tsc_offset;
 
-	host_tsc = native_read_tsc();
+	host_tsc = rdtsc();
 	tsc_offset = vmcs_read64(TSC_OFFSET);
 	return host_tsc + tsc_offset;
 }
@@ -2317,7 +2317,7 @@ static void vmx_adjust_tsc_offset(struct kvm_vcpu *vcpu, s64 adjustment, bool ho
 
 static u64 vmx_compute_tsc_offset(struct kvm_vcpu *vcpu, u64 target_tsc)
 {
-	return target_tsc - native_read_tsc();
+	return target_tsc - rdtsc();
 }
 
 static bool guest_cpuid_has_vmx(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index f771058..dfa9713 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1455,7 +1455,7 @@ static cycle_t read_tsc(void)
 	 * but no one has ever seen it happen.
 	 */
 	rdtsc_barrier();
-	ret = (cycle_t)native_read_tsc();
+	ret = (cycle_t)rdtsc();
 
 	last = pvclock_gtod_data.clock.cycle_last;
 
@@ -1646,7 +1646,7 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
 		return 1;
 	}
 	if (!use_master_clock) {
-		host_tsc = native_read_tsc();
+		host_tsc = rdtsc();
 		kernel_ns = get_kernel_ns();
 	}
 
@@ -2810,7 +2810,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 
 	if (unlikely(vcpu->cpu != cpu) || check_tsc_unstable()) {
 		s64 tsc_delta = !vcpu->arch.last_host_tsc ? 0 :
-				native_read_tsc() - vcpu->arch.last_host_tsc;
+				rdtsc() - vcpu->arch.last_host_tsc;
 		if (tsc_delta < 0)
 			mark_tsc_unstable("KVM discovered backwards TSC");
 		if (check_tsc_unstable()) {
@@ -2838,7 +2838,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
 {
 	kvm_x86_ops->vcpu_put(vcpu);
 	kvm_put_guest_fpu(vcpu);
-	vcpu->arch.last_host_tsc = native_read_tsc();
+	vcpu->arch.last_host_tsc = rdtsc();
 }
 
 static int kvm_vcpu_ioctl_get_lapic(struct kvm_vcpu *vcpu,
@@ -6623,7 +6623,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 		hw_breakpoint_restore();
 
 	vcpu->arch.last_guest_tsc = kvm_x86_ops->read_l1_tsc(vcpu,
-							   native_read_tsc());
+							   rdtsc());
 
 	vcpu->mode = OUTSIDE_GUEST_MODE;
 	smp_wmb();
@@ -7437,7 +7437,7 @@ int kvm_arch_hardware_enable(void)
 	if (ret != 0)
 		return ret;
 
-	local_tsc = native_read_tsc();
+	local_tsc = rdtsc();
 	stable = !check_tsc_unstable();
 	list_for_each_entry(kvm, &vm_list, vm_list) {
 		kvm_for_each_vcpu(i, vcpu, kvm) {
diff --git a/arch/x86/lib/delay.c b/arch/x86/lib/delay.c
index 35115f3..f24bc59 100644
--- a/arch/x86/lib/delay.c
+++ b/arch/x86/lib/delay.c
@@ -55,10 +55,10 @@ static void delay_tsc(unsigned long __loops)
 	preempt_disable();
 	cpu = smp_processor_id();
 	rdtsc_barrier();
-	bclock = native_read_tsc();
+	bclock = rdtsc();
 	for (;;) {
 		rdtsc_barrier();
-		now = native_read_tsc();
+		now = rdtsc();
 		if ((now - bclock) >= loops)
 			break;
 
@@ -80,7 +80,7 @@ static void delay_tsc(unsigned long __loops)
 			loops -= (now - bclock);
 			cpu = smp_processor_id();
 			rdtsc_barrier();
-			bclock = native_read_tsc();
+			bclock = rdtsc();
 		}
 	}
 	preempt_enable();
@@ -100,7 +100,7 @@ void use_tsc_delay(void)
 int read_current_timer(unsigned long *timer_val)
 {
 	if (delay_fn == delay_tsc) {
-		*timer_val = native_read_tsc();
+		*timer_val = rdtsc();
 		return 0;
 	}
 	return -1;
diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index 15ada47..7c56d7e 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -765,7 +765,7 @@ static inline void intel_pstate_sample(struct cpudata *cpu)
 	local_irq_save(flags);
 	rdmsrl(MSR_IA32_APERF, aperf);
 	rdmsrl(MSR_IA32_MPERF, mperf);
-	tsc = native_read_tsc();
+	tsc = rdtsc();
 	local_irq_restore(flags);
 
 	cpu->last_sample_time = cpu->sample.time;
diff --git a/drivers/input/gameport/gameport.c b/drivers/input/gameport/gameport.c
index abc0cb2..4a2a9e3 100644
--- a/drivers/input/gameport/gameport.c
+++ b/drivers/input/gameport/gameport.c
@@ -149,9 +149,9 @@ static int old_gameport_measure_speed(struct gameport *gameport)
 
 	for(i = 0; i < 50; i++) {
 		local_irq_save(flags);
-		t1 = native_read_tsc();
+		t1 = rdtsc();
 		for (t = 0; t < 50; t++) gameport_read(gameport);
-		t2 = native_read_tsc();
+		t2 = rdtsc();
 		local_irq_restore(flags);
 		udelay(i * 10);
 		if (t2 - t1 < tx) tx = t2 - t1;
diff --git a/drivers/input/joystick/analog.c b/drivers/input/joystick/analog.c
index f871b4f..6f8b084 100644
--- a/drivers/input/joystick/analog.c
+++ b/drivers/input/joystick/analog.c
@@ -143,7 +143,7 @@ struct analog_port {
 
 #include <linux/i8253.h>
 
-#define GET_TIME(x)	do { if (cpu_has_tsc) x = (unsigned int)native_read_tsc(); else x = get_time_pit(); } while (0)
+#define GET_TIME(x)	do { if (cpu_has_tsc) x = (unsigned int)rdtsc(); else x = get_time_pit(); } while (0)
 #define DELTA(x,y)	(cpu_has_tsc ? ((y) - (x)) : ((x) - (y) + ((x) < (y) ? PIT_TICK_RATE / HZ : 0)))
 #define TIME_NAME	(cpu_has_tsc?"TSC":"PIT")
 static unsigned int get_time_pit(void)
@@ -160,7 +160,7 @@ static unsigned int get_time_pit(void)
         return count;
 }
 #elif defined(__x86_64__)
-#define GET_TIME(x)	do { x = (unsigned int)native_read_tsc(); } while (0)
+#define GET_TIME(x)	do { x = (unsigned int)rdtsc(); } while (0)
 #define DELTA(x,y)	((y)-(x))
 #define TIME_NAME	"TSC"
 #elif defined(__alpha__) || defined(CONFIG_MN10300) || defined(CONFIG_ARM) || defined(CONFIG_ARM64) || defined(CONFIG_TILE)
diff --git a/drivers/net/hamradio/baycom_epp.c b/drivers/net/hamradio/baycom_epp.c
index 44e5c3b..72c9f1f 100644
--- a/drivers/net/hamradio/baycom_epp.c
+++ b/drivers/net/hamradio/baycom_epp.c
@@ -638,7 +638,7 @@ static int receive(struct net_device *dev, int cnt)
 #define GETTICK(x)                                                \
 ({                                                                \
 	if (cpu_has_tsc)                                          \
-		x = (unsigned int)native_read_tsc();		  \
+		x = (unsigned int)rdtsc();		  \
 })
 #else /* __i386__ */
 #define GETTICK(x)
diff --git a/drivers/thermal/intel_powerclamp.c b/drivers/thermal/intel_powerclamp.c
index ab13448..2ac0c70 100644
--- a/drivers/thermal/intel_powerclamp.c
+++ b/drivers/thermal/intel_powerclamp.c
@@ -340,7 +340,7 @@ static bool powerclamp_adjust_controls(unsigned int target_ratio,
 
 	/* check result for the last window */
 	msr_now = pkg_state_counter();
-	tsc_now = native_read_tsc();
+	tsc_now = rdtsc();
 
 	/* calculate pkg cstate vs tsc ratio */
 	if (!msr_last || !tsc_last)
@@ -482,7 +482,7 @@ static void poll_pkg_cstate(struct work_struct *dummy)
 	u64 val64;
 
 	msr_now = pkg_state_counter();
-	tsc_now = native_read_tsc();
+	tsc_now = rdtsc();
 	jiffies_now = jiffies;
 
 	/* calculate pkg cstate vs tsc ratio */
diff --git a/tools/power/cpupower/debug/kernel/cpufreq-test_tsc.c b/tools/power/cpupower/debug/kernel/cpufreq-test_tsc.c
index f02b0c0..6ff8383 100644
--- a/tools/power/cpupower/debug/kernel/cpufreq-test_tsc.c
+++ b/tools/power/cpupower/debug/kernel/cpufreq-test_tsc.c
@@ -81,11 +81,11 @@ static int __init cpufreq_test_tsc(void)
 
 	printk(KERN_DEBUG "start--> \n");
 	then = read_pmtmr();
-	then_tsc = native_read_tsc();
+	then_tsc = rdtsc();
 	for (i=0;i<20;i++) {
 		mdelay(100);
 		now = read_pmtmr();
-		now_tsc = native_read_tsc();
+		now_tsc = rdtsc();
 		diff = (now - then) & 0xFFFFFF;
 		diff_tsc = now_tsc - then_tsc;
 		printk(KERN_DEBUG "t1: %08u t2: %08u diff_pmtmr: %08u diff_tsc: %016llu\n", then, now, diff, diff_tsc);


* [tip:x86/asm] x86/asm/tsc: Add rdtsc_ordered() and use it in trivial call sites
  2015-06-17  0:36 ` [PATCH v3 14/18] x86: Add rdtsc_ordered() and use it in trivial call sites Andy Lutomirski
@ 2015-07-06 15:44   ` tip-bot for Andy Lutomirski
  2015-08-21  7:45   ` [tip:x86/asm] x86/asm/tsc: Add rdtscll() merge helper tip-bot for Ingo Molnar
  1 sibling, 0 replies; 57+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-07-06 15:44 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: john.stultz, lenb, tglx, ralf, brgerst, dvlasenk, luto, luto,
	peterz, ray.huang, linux-kernel, hpa, kvm, bp, bp, torvalds,
	mingo

Commit-ID:  03b9730b769fc4d87e40f6104f4c5b2e43889f19
Gitweb:     http://git.kernel.org/tip/03b9730b769fc4d87e40f6104f4c5b2e43889f19
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Thu, 25 Jun 2015 18:44:08 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 6 Jul 2015 15:23:29 +0200

x86/asm/tsc: Add rdtsc_ordered() and use it in trivial call sites

rdtsc_barrier(); rdtsc() is an unnecessary mouthful and requires
more thought than should be necessary. Add an rdtsc_ordered()
helper and replace the trivial call sites with it.

This should not change generated code. The duplication of the
fence asm is temporary.
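
For readers outside the kernel, the call-site pattern can be
approximated in user space. This is a rough sketch, not the kernel
implementation. It assumes an x86-64 compiler that provides __rdtsc()
and _mm_lfence() in <x86intrin.h>, it always uses LFENCE instead of
picking a fence through alternatives, and it falls back to
clock_gettime() on other architectures. The tsc_read_* names are
invented for this example.

```c
#include <stdint.h>

#if defined(__x86_64__)
#include <x86intrin.h>

/* Raw read: the CPU may execute RDTSC ahead of earlier loads. */
static inline uint64_t tsc_read_raw(void)
{
	return __rdtsc();
}

/* Fence first, then read; same shape as "rdtsc_barrier(); rdtsc()". */
static inline uint64_t tsc_read_ordered(void)
{
	_mm_lfence();
	return __rdtsc();
}
#else
#include <time.h>

/* Non-x86 fallback so the sketch still builds; this is not a TSC. */
static inline uint64_t tsc_read_raw(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

static inline uint64_t tsc_read_ordered(void)
{
	return tsc_read_raw();
}
#endif
```

A caller that compares timestamps against values loaded from memory
would use tsc_read_ordered(); tsc_read_raw() is only suitable where
ordering against other loads does not matter.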

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm ML <kvm@vger.kernel.org>
Link: http://lkml.kernel.org/r/dddbf98a2af53312e9aa73a5a2b1622fe5d6f52b.1434501121.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/entry/vdso/vclock_gettime.c | 16 ++--------------
 arch/x86/include/asm/msr.h           | 26 ++++++++++++++++++++++++++
 arch/x86/kernel/trace_clock.c        |  7 +------
 arch/x86/kvm/x86.c                   | 16 ++--------------
 arch/x86/lib/delay.c                 |  9 +++------
 5 files changed, 34 insertions(+), 40 deletions(-)

diff --git a/arch/x86/entry/vdso/vclock_gettime.c b/arch/x86/entry/vdso/vclock_gettime.c
index 0340d93..ca94fa6 100644
--- a/arch/x86/entry/vdso/vclock_gettime.c
+++ b/arch/x86/entry/vdso/vclock_gettime.c
@@ -175,20 +175,8 @@ static notrace cycle_t vread_pvclock(int *mode)
 
 notrace static cycle_t vread_tsc(void)
 {
-	cycle_t ret;
-	u64 last;
-
-	/*
-	 * Empirically, a fence (of type that depends on the CPU)
-	 * before rdtsc is enough to ensure that rdtsc is ordered
-	 * with respect to loads.  The various CPU manuals are unclear
-	 * as to whether rdtsc can be reordered with later loads,
-	 * but no one has ever seen it happen.
-	 */
-	rdtsc_barrier();
-	ret = (cycle_t)rdtsc();
-
-	last = gtod->cycle_last;
+	cycle_t ret = (cycle_t)rdtsc_ordered();
+	u64 last = gtod->cycle_last;
 
 	if (likely(ret >= last))
 		return ret;
diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
index ff0c120..02bdd6c 100644
--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -127,6 +127,32 @@ static __always_inline unsigned long long rdtsc(void)
 	return EAX_EDX_VAL(val, low, high);
 }
 
+/**
+ * rdtsc_ordered() - read the current TSC in program order
+ *
+ * rdtsc_ordered() returns the result of RDTSC as a 64-bit integer.
+ * It is ordered like a load to a global in-memory counter.  It should
+ * be impossible to observe non-monotonic rdtsc_ordered() behavior
+ * across multiple CPUs as long as the TSC is synced.
+ */
+static __always_inline unsigned long long rdtsc_ordered(void)
+{
+	/*
+	 * The RDTSC instruction is not ordered relative to memory
+	 * access.  The Intel SDM and the AMD APM are both vague on this
+	 * point, but empirically an RDTSC instruction can be
+	 * speculatively executed before prior loads.  An RDTSC
+	 * immediately after an appropriate barrier appears to be
+	 * ordered as a normal load, that is, it provides the same
+	 * ordering guarantees as reading from a global memory location
+	 * that some other imaginary CPU is updating continuously with a
+	 * time stamp.
+	 */
+	alternative_2("", "mfence", X86_FEATURE_MFENCE_RDTSC,
+			  "lfence", X86_FEATURE_LFENCE_RDTSC);
+	return rdtsc();
+}
+
 static inline unsigned long long native_read_pmc(int counter)
 {
 	DECLARE_ARGS(val, low, high);
diff --git a/arch/x86/kernel/trace_clock.c b/arch/x86/kernel/trace_clock.c
index 67efb8c..80bb24d 100644
--- a/arch/x86/kernel/trace_clock.c
+++ b/arch/x86/kernel/trace_clock.c
@@ -12,10 +12,5 @@
  */
 u64 notrace trace_clock_x86_tsc(void)
 {
-	u64 ret;
-
-	rdtsc_barrier();
-	ret = rdtsc();
-
-	return ret;
+	return rdtsc_ordered();
 }
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index dfa9713..8d73ec8 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1444,20 +1444,8 @@ EXPORT_SYMBOL_GPL(kvm_write_tsc);
 
 static cycle_t read_tsc(void)
 {
-	cycle_t ret;
-	u64 last;
-
-	/*
-	 * Empirically, a fence (of type that depends on the CPU)
-	 * before rdtsc is enough to ensure that rdtsc is ordered
-	 * with respect to loads.  The various CPU manuals are unclear
-	 * as to whether rdtsc can be reordered with later loads,
-	 * but no one has ever seen it happen.
-	 */
-	rdtsc_barrier();
-	ret = (cycle_t)rdtsc();
-
-	last = pvclock_gtod_data.clock.cycle_last;
+	cycle_t ret = (cycle_t)rdtsc_ordered();
+	u64 last = pvclock_gtod_data.clock.cycle_last;
 
 	if (likely(ret >= last))
 		return ret;
diff --git a/arch/x86/lib/delay.c b/arch/x86/lib/delay.c
index f24bc59..4453d52 100644
--- a/arch/x86/lib/delay.c
+++ b/arch/x86/lib/delay.c
@@ -54,11 +54,9 @@ static void delay_tsc(unsigned long __loops)
 
 	preempt_disable();
 	cpu = smp_processor_id();
-	rdtsc_barrier();
-	bclock = rdtsc();
+	bclock = rdtsc_ordered();
 	for (;;) {
-		rdtsc_barrier();
-		now = rdtsc();
+		now = rdtsc_ordered();
 		if ((now - bclock) >= loops)
 			break;
 
@@ -79,8 +77,7 @@ static void delay_tsc(unsigned long __loops)
 		if (unlikely(cpu != smp_processor_id())) {
 			loops -= (now - bclock);
 			cpu = smp_processor_id();
-			rdtsc_barrier();
-			bclock = rdtsc();
+			bclock = rdtsc_ordered();
 		}
 	}
 	preempt_enable();


* [tip:x86/asm] x86/asm/tsc/sync: Use rdtsc_ordered() in check_tsc_warp() and drop extra barriers
  2015-06-17  0:36 ` [PATCH v3 15/18] x86/tsc: Use rdtsc_ordered() in check_tsc_warp() and drop extra barriers Andy Lutomirski
@ 2015-07-06 15:44   ` tip-bot for Andy Lutomirski
  0 siblings, 0 replies; 57+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-07-06 15:44 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: john.stultz, luto, mingo, kvm, lenb, hpa, brgerst, dvlasenk,
	ray.huang, bp, torvalds, ralf, tglx, luto, bp, peterz,
	linux-kernel

Commit-ID:  eee6946e44510b61c35cf754f5505537c7a8eb77
Gitweb:     http://git.kernel.org/tip/eee6946e44510b61c35cf754f5505537c7a8eb77
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Thu, 25 Jun 2015 18:44:09 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 6 Jul 2015 15:23:29 +0200

x86/asm/tsc/sync: Use rdtsc_ordered() in check_tsc_warp() and drop extra barriers

Using get_cycles was unnecessary: check_tsc_warp() is not called
on TSC-less systems. Replace rdtsc_barrier(); get_cycles() with
rdtsc_ordered().

While we're at it, make the somewhat more dangerous change of
removing the rdtsc_barrier() after RDTSC in the TSC warp check
code. This should be okay, though -- the vDSO TSC code doesn't
have that barrier, so, if removing the barrier from the warp
check would cause us to detect a warp that we otherwise wouldn't
detect, then we have a genuine bug.
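
The warp check itself can be modelled without any hardware. The
sketch below is a single-threaded simplification under stated
assumptions: struct warp_state and warp_observe() are hypothetical
names, the sync_lock taken around the last_tsc exchange is left out,
and the "prev > now" comparison is assumed from the part of
check_tsc_warp() that the hunk does not show.

```c
#include <stdint.h>

/* Hypothetical stand-in for the globals last_tsc, max_warp and nr_warps. */
struct warp_state {
	uint64_t last_tsc;
	uint64_t max_warp;
	int nr_warps;
};

/*
 * Record one timestamp.  The real loop holds sync_lock while it swaps
 * the new value into last_tsc; this model has no concurrency.
 */
static void warp_observe(struct warp_state *s, uint64_t now)
{
	uint64_t prev = s->last_tsc;

	s->last_tsc = now;
	if (prev > now) {
		/* An earlier sample was larger: time appeared to go backwards. */
		uint64_t warp = prev - now;

		if (warp > s->max_warp)
			s->max_warp = warp;
		s->nr_warps++;
	}
}
```

Feeding it samples from two CPUs in the order they were taken shows
why a missing barrier matters: a read that is speculated early yields
a smaller "now" than a value another CPU already published, and that
is counted as a warp.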

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm ML <kvm@vger.kernel.org>
Link: http://lkml.kernel.org/r/387c4c3a75f875bcde6cd68cee013273a744f364.1434501121.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/tsc_sync.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/tsc_sync.c b/arch/x86/kernel/tsc_sync.c
index dd8d079..78083bf 100644
--- a/arch/x86/kernel/tsc_sync.c
+++ b/arch/x86/kernel/tsc_sync.c
@@ -39,16 +39,15 @@ static cycles_t max_warp;
 static int nr_warps;
 
 /*
- * TSC-warp measurement loop running on both CPUs:
+ * TSC-warp measurement loop running on both CPUs.  This is not called
+ * if there is no TSC.
  */
 static void check_tsc_warp(unsigned int timeout)
 {
 	cycles_t start, now, prev, end;
 	int i;
 
-	rdtsc_barrier();
-	start = get_cycles();
-	rdtsc_barrier();
+	start = rdtsc_ordered();
 	/*
 	 * The measurement runs for 'timeout' msecs:
 	 */
@@ -63,9 +62,7 @@ static void check_tsc_warp(unsigned int timeout)
 		 */
 		arch_spin_lock(&sync_lock);
 		prev = last_tsc;
-		rdtsc_barrier();
-		now = get_cycles();
-		rdtsc_barrier();
+		now = rdtsc_ordered();
 		last_tsc = now;
 		arch_spin_unlock(&sync_lock);
 
@@ -126,7 +123,7 @@ void check_tsc_sync_source(int cpu)
 
 	/*
 	 * No need to check if we already know that the TSC is not
-	 * synchronized:
+	 * synchronized or if we have no TSC.
 	 */
 	if (unsynchronized_tsc())
 		return;
@@ -190,6 +187,7 @@ void check_tsc_sync_target(void)
 {
 	int cpus = 2;
 
+	/* Also aborts if there is no TSC. */
 	if (unsynchronized_tsc() || tsc_clocksource_reliable)
 		return;
 


* [tip:x86/asm] x86/asm/tsc: Use rdtsc_ordered() in read_tsc() instead of get_cycles()
  2015-06-17  0:36 ` [PATCH v3 16/18] x86/tsc: In read_tsc, use rdtsc_ordered() instead of get_cycles() Andy Lutomirski
@ 2015-07-06 15:44   ` tip-bot for Andy Lutomirski
  0 siblings, 0 replies; 57+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-07-06 15:44 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: lenb, peterz, linux-kernel, john.stultz, dvlasenk, ralf, luto,
	torvalds, ray.huang, bp, mingo, hpa, bp, luto, tglx, kvm,
	brgerst

Commit-ID:  27c634054a3155e1d9a02f0e362e4f4ff8d28ee7
Gitweb:     http://git.kernel.org/tip/27c634054a3155e1d9a02f0e362e4f4ff8d28ee7
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Thu, 25 Jun 2015 18:44:10 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 6 Jul 2015 15:23:29 +0200

x86/asm/tsc: Use rdtsc_ordered() in read_tsc() instead of get_cycles()

There are two logical changes here.  First, this removes a check
for cpu_has_tsc.  That check is unnecessary, as we don't
register the TSC as a clocksource on systems that have no TSC.

Second, it adds a barrier, thus preventing observable
non-monotonicity.

I suspect that the missing barrier was never a problem in
practice because system calls themselves were heavy enough
barriers to prevent user code from observing time warps due to
speculation. (Without the corresponding barrier in the vDSO,
however, non-monotonicity is easy to detect.)

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm ML <kvm@vger.kernel.org>
Link: http://lkml.kernel.org/r/c6ff621a053127a65b70f175443578db7a0711be.1434501121.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/tsc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 21d6e04..451bade0 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -961,7 +961,7 @@ static struct clocksource clocksource_tsc;
  */
 static cycle_t read_tsc(struct clocksource *cs)
 {
-	return (cycle_t)get_cycles();
+	return (cycle_t)rdtsc_ordered();
 }
 
 /*


* [tip:x86/asm] x86/asm/tsc, x86/kvm: Drop open-coded barrier and use rdtsc_ordered() in kvmclock
  2015-06-17  0:36 ` [PATCH v3 17/18] x86/kvm/tsc: Drop extra barrier and use rdtsc_ordered in kvmclock Andy Lutomirski
  2015-06-17  7:47   ` Paolo Bonzini
@ 2015-07-06 15:45   ` tip-bot for Andy Lutomirski
  1 sibling, 0 replies; 57+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-07-06 15:45 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: brgerst, peterz, lenb, pbonzini, rkrcmar, tglx, dvlasenk,
	john.stultz, ralf, kvm, bp, mingo, luto, mtosatti, luto,
	ray.huang, linux-kernel, bp, torvalds, hpa

Commit-ID:  502dfeff239e8313bfbe906ca0a1a6827ac8481b
Gitweb:     http://git.kernel.org/tip/502dfeff239e8313bfbe906ca0a1a6827ac8481b
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Thu, 25 Jun 2015 18:44:11 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 6 Jul 2015 15:23:30 +0200

x86/asm/tsc, x86/kvm: Drop open-coded barrier and use rdtsc_ordered() in kvmclock

__pvclock_read_cycles() used to have two barriers. One of them was
unnecessary and was removed after an initial version of this patch
was sent.

But the barrier is still open-coded unnecessarily - get rid of
that barrier and clean up the code by just using rdtsc_ordered().
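
To make the data flow concrete, here is a portable model of what
pvclock_get_nsec_offset() computes once it has an ordered TSC value.
It is a sketch under assumptions: the body of pvclock_scale_delta() is
not part of this patch, so the shift-then-multiply arithmetic below
reflects the usual 32.32 fixed-point reading of tsc_to_system_mul and
tsc_shift, and scale_delta() and nsec_offset() are illustrative names
rather than kernel functions.

```c
#include <stdint.h>

/* Scale a TSC delta to nanoseconds using a 32.32 fixed-point multiplier. */
static uint64_t scale_delta(uint64_t delta, uint32_t mul_frac, int shift)
{
	uint64_t hi, lo;

	if (shift < 0)
		delta >>= -shift;
	else
		delta <<= shift;

	/* (delta * mul_frac) >> 32, split so no 128-bit type is needed. */
	hi = (delta >> 32) * mul_frac;
	lo = ((delta & 0xffffffffu) * mul_frac) >> 32;
	return hi + lo;
}

/* Nanoseconds elapsed since the snapshot, given an already-ordered TSC read. */
static uint64_t nsec_offset(uint64_t tsc_now, uint64_t tsc_timestamp,
			    uint32_t mul_frac, int shift)
{
	return scale_delta(tsc_now - tsc_timestamp, mul_frac, shift);
}
```

With a multiplier of 0x80000000 (one half) and a shift of 0, a delta
of 1000 cycles scales to 500 ns. The ordering of the TSC read matters
because tsc_now must be sampled between the two reads of src->version.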

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Radim Krcmar <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm ML <kvm@vger.kernel.org>
Link: http://lkml.kernel.org/r/678981cc4761fb38a793c217c9cac42503cf3719.1434501121.git.luto@kernel.org
[ Ported it to v4.2-rc1. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/pvclock.h | 10 ++--------
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/pvclock.h b/arch/x86/include/asm/pvclock.h
index 5c490db..7a6bed5 100644
--- a/arch/x86/include/asm/pvclock.h
+++ b/arch/x86/include/asm/pvclock.h
@@ -62,7 +62,7 @@ static inline u64 pvclock_scale_delta(u64 delta, u32 mul_frac, int shift)
 static __always_inline
 u64 pvclock_get_nsec_offset(const struct pvclock_vcpu_time_info *src)
 {
-	u64 delta = rdtsc() - src->tsc_timestamp;
+	u64 delta = rdtsc_ordered() - src->tsc_timestamp;
 	return pvclock_scale_delta(delta, src->tsc_to_system_mul,
 				   src->tsc_shift);
 }
@@ -76,13 +76,7 @@ unsigned __pvclock_read_cycles(const struct pvclock_vcpu_time_info *src,
 	u8 ret_flags;
 
 	version = src->version;
-	/* Note: emulated platforms which do not advertise SSE2 support
-	 * result in kvmclock not using the necessary RDTSC barriers.
-	 * Without barriers, it is possible that RDTSC instruction reads from
-	 * the time stamp counter outside rdtsc_barrier protected section
-	 * below, resulting in violation of monotonicity.
-	 */
-	rdtsc_barrier();
+
 	offset = pvclock_get_nsec_offset(src);
 	ret = src->system_time + offset;
 	ret_flags = src->flags;


* [tip:x86/asm] x86/asm/tsc: Remove rdtsc_barrier()
  2015-06-17  0:36 ` [PATCH v3 18/18] x86/tsc: Remove rdtsc_barrier() Andy Lutomirski
@ 2015-07-06 15:45   ` tip-bot for Andy Lutomirski
  0 siblings, 0 replies; 57+ messages in thread
From: tip-bot for Andy Lutomirski @ 2015-07-06 15:45 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: luto, richard, hpa, mingo, bp, kvm, linux-kernel, lenb, dvlasenk,
	torvalds, bp, tglx, peterz, ray.huang, luto, brgerst,
	john.stultz, ralf

Commit-ID:  bb8dd96032fc63babfc8b378a37dd7681eeec326
Gitweb:     http://git.kernel.org/tip/bb8dd96032fc63babfc8b378a37dd7681eeec326
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Thu, 25 Jun 2015 18:44:12 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 6 Jul 2015 15:23:30 +0200

x86/asm/tsc: Remove rdtsc_barrier()

All callers have been converted to rdtsc_ordered().

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm ML <kvm@vger.kernel.org>
Link: http://lkml.kernel.org/r/9baa4ae9a1e7c7c282f9cb2f15bb6bf5c2004032.1434501121.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/barrier.h | 11 -----------
 arch/x86/um/asm/barrier.h      | 13 -------------
 2 files changed, 24 deletions(-)

diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
index e51a8f8..818cb87 100644
--- a/arch/x86/include/asm/barrier.h
+++ b/arch/x86/include/asm/barrier.h
@@ -91,15 +91,4 @@ do {									\
 #define smp_mb__before_atomic()	barrier()
 #define smp_mb__after_atomic()	barrier()
 
-/*
- * Stop RDTSC speculation. This is needed when you need to use RDTSC
- * (or get_cycles or vread that possibly accesses the TSC) in a defined
- * code region.
- */
-static __always_inline void rdtsc_barrier(void)
-{
-	alternative_2("", "mfence", X86_FEATURE_MFENCE_RDTSC,
-			  "lfence", X86_FEATURE_LFENCE_RDTSC);
-}
-
 #endif /* _ASM_X86_BARRIER_H */
diff --git a/arch/x86/um/asm/barrier.h b/arch/x86/um/asm/barrier.h
index b9531d3..755481f 100644
--- a/arch/x86/um/asm/barrier.h
+++ b/arch/x86/um/asm/barrier.h
@@ -45,17 +45,4 @@
 #define read_barrier_depends()		do { } while (0)
 #define smp_read_barrier_depends()	do { } while (0)
 
-/*
- * Stop RDTSC speculation. This is needed when you need to use RDTSC
- * (or get_cycles or vread that possibly accesses the TSC) in a defined
- * code region.
- *
- * (Could use an alternative three way for this if there was one.)
- */
-static inline void rdtsc_barrier(void)
-{
-	alternative_2("", "mfence", X86_FEATURE_MFENCE_RDTSC,
-			  "lfence", X86_FEATURE_LFENCE_RDTSC);
-}
-
 #endif


* [tip:x86/asm] x86/asm/tsc: Add rdtscll() merge helper
  2015-06-17  0:36 ` [PATCH v3 14/18] x86: Add rdtsc_ordered() and use it in trivial call sites Andy Lutomirski
  2015-07-06 15:44   ` [tip:x86/asm] x86/asm/tsc: " tip-bot for Andy Lutomirski
@ 2015-08-21  7:45   ` tip-bot for Ingo Molnar
  1 sibling, 0 replies; 57+ messages in thread
From: tip-bot for Ingo Molnar @ 2015-08-21  7:45 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: ray.huang, john.stultz, luto, ralf, bp, kvm, lenb, dvlasenk,
	luto, tglx, peterz, bp, linux-kernel, hpa, torvalds, mingo,
	brgerst

Commit-ID:  99770737ca7e3ebc14e66460a69b7032de9421e1
Gitweb:     http://git.kernel.org/tip/99770737ca7e3ebc14e66460a69b7032de9421e1
Author:     Ingo Molnar <mingo@kernel.org>
AuthorDate: Fri, 21 Aug 2015 08:33:53 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 21 Aug 2015 08:35:42 +0200

x86/asm/tsc: Add rdtscll() merge helper

Some in-flight code makes use of the old rdtscll() (now removed). Provide a wrapper
for one kernel cycle to smooth the transition to rdtsc().

( We use the safest variant, rdtsc_ordered(), which has barriers - this adds another
  incentive to remove the wrapper in the future. )
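
The shape of such a merge helper can be tried outside the kernel. In
this sketch rdtsc_ordered_stub() is a made-up stand-in that returns a
fake increasing count, and rdtscll_compat() and legacy_sample() are
hypothetical names; only the do/while(0) form mirrors the macro added
by this patch.

```c
#include <stdint.h>

/* Stub standing in for rdtsc_ordered(); returns a fake, increasing count. */
static uint64_t fake_tsc;

static uint64_t rdtsc_ordered_stub(void)
{
	return ++fake_tsc;
}

/* Same shape as the merge helper: statement-like, assigns through its argument. */
#define rdtscll_compat(now)	do { (now) = rdtsc_ordered_stub(); } while (0)

/* Old-style caller: passes an lvalue instead of using a return value. */
static uint64_t legacy_sample(int enabled)
{
	uint64_t t = 0;

	if (enabled)
		rdtscll_compat(t);	/* safe without braces thanks to do/while */
	else
		t = 0;
	return t;
}
```

Converting such a caller later is mechanical: the statement
"rdtscll_compat(t);" becomes an assignment from the function that
returns the value.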

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm ML <kvm@vger.kernel.org>
Link: http://lkml.kernel.org/r/dddbf98a2af53312e9aa73a5a2b1622fe5d6f52b.1434501121.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/msr.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
index 131eec2..54e9f08 100644
--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -152,6 +152,9 @@ static __always_inline unsigned long long rdtsc_ordered(void)
 	return rdtsc();
 }
 
+/* Deprecated, keep it for a cycle for easier merging: */
+#define rdtscll(now)	do { (now) = rdtsc_ordered(); } while (0)
+
 static inline unsigned long long native_read_pmc(int counter)
 {
 	DECLARE_ARGS(val, low, high);
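
For anyone converting an in-flight call site, here is a minimal userspace
sketch of what the merge helper does. It is not kernel code. The
rdtsc_ordered() defined below is a stand-in that returns an increasing
counter (an assumption made so the example builds on any architecture; the
real helper performs a properly ordered RDTSC). The names fake_tsc,
read_cycles_old(), read_cycles_new() and check_call_sites() are hypothetical
and exist only for illustration.

```c
#include <assert.h>

/*
 * Stand-in for the kernel's rdtsc_ordered(). ASSUMPTION: a plain counter
 * replaces the barrier + RDTSC sequence so this builds outside the kernel.
 */
static unsigned long long fake_tsc;

static inline unsigned long long rdtsc_ordered(void)
{
	return ++fake_tsc;
}

/* The merge helper, written the same way as in the patch. */
#define rdtscll(now)	do { (now) = rdtsc_ordered(); } while (0)

/* Old-style call site: the macro stores the result into its argument. */
static inline unsigned long long read_cycles_old(void)
{
	unsigned long long now;

	rdtscll(now);
	return now;
}

/* Converted call site: use the function's return value directly. */
static inline unsigned long long read_cycles_new(void)
{
	return rdtsc_ordered();
}

/* Both forms read the same counter, so a later read is strictly larger. */
static inline void check_call_sites(void)
{
	unsigned long long a = read_cycles_old();
	unsigned long long b = read_cycles_new();

	assert(b > a);
}
```

Because the wrapper expands to a single assignment from rdtsc_ordered(),
a conversion of the form "rdtscll(x);" to "x = rdtsc_ordered();" should not
change behavior. Whether a given call site could use the unordered rdtsc()
instead is a separate question that depends on its ordering requirements.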



Thread overview: 57+ messages
2015-06-17  0:35 [PATCH v3 00/18] x86/tsc: Clean up rdtsc helpers Andy Lutomirski
2015-06-17  0:35 ` [PATCH v3 01/18] x86/tsc: Inline native_read_tsc and remove __native_read_tsc Andy Lutomirski
2015-06-17  9:26   ` Borislav Petkov
2015-07-06 15:39   ` [tip:x86/asm] x86/asm/tsc: Inline native_read_tsc() and remove __native_read_tsc() tip-bot for Andy Lutomirski
2015-06-17  0:35 ` [PATCH v3 02/18] x86/msr/kvm: Remove vget_cycles() Andy Lutomirski
2015-06-17  9:42   ` Borislav Petkov
2015-06-17 13:34   ` Paolo Bonzini
2015-07-06 15:40   ` [tip:x86/asm] x86/asm/tsc, kvm: " tip-bot for Andy Lutomirski
2015-06-17  0:35 ` [PATCH v3 03/18] x86/tsc/paravirt: Remove the read_tsc and read_tscp paravirt hooks Andy Lutomirski
2015-06-17  9:56   ` Borislav Petkov
2015-06-19 15:32   ` Borislav Petkov
2015-06-19 16:14     ` Andy Lutomirski
2015-06-19 17:13       ` Borislav Petkov
2015-07-06 15:40   ` [tip:x86/asm] x86/asm/tsc, x86/paravirt: Remove read_tsc() and read_tscp() " tip-bot for Andy Lutomirski
2015-06-17  0:35 ` [PATCH v3 04/18] x86/tsc: Replace rdtscll with native_read_tsc Andy Lutomirski
2015-06-17 10:03   ` Borislav Petkov
2015-07-06 15:40   ` [tip:x86/asm] x86/asm/tsc: Replace rdtscll() with native_read_tsc() tip-bot for Andy Lutomirski
2015-06-17  0:35 ` [PATCH v3 05/18] x86/tsc: Remove the rdtscp and rdtscpll macros Andy Lutomirski
2015-07-06 15:41   ` [tip:x86/asm] x86/asm/tsc: Remove the rdtscp() and rdtscpll() macros tip-bot for Andy Lutomirski
2015-06-17  0:35 ` [PATCH v3 06/18] x86/tsc: Use the full 64-bit tsc in tsc_delay Andy Lutomirski
2015-07-06 15:41   ` [tip:x86/asm] x86/asm/tsc: Use the full 64-bit TSC in delay_tsc() tip-bot for Andy Lutomirski
2015-06-17  0:35 ` [PATCH v3 07/18] x86/cpu/amd: Use the full 64-bit TSC to detect the 2.6.2 bug Andy Lutomirski
2015-07-06 15:41   ` [tip:x86/asm] x86/asm/tsc, " tip-bot for Andy Lutomirski
2015-06-17  0:35 ` [PATCH v3 08/18] baycom_epp: Replace rdtscl() with native_read_tsc() Andy Lutomirski
2015-06-17  0:49   ` Thomas Sailer
2015-06-20 13:54   ` walter harms
2015-06-20 14:14     ` Thomas Gleixner
2015-06-20 14:26       ` Andy Lutomirski
2015-06-20 16:30         ` Thomas Gleixner
2015-07-06 15:42   ` [tip:x86/asm] x86/asm/tsc, drivers/net/hamradio/baycom_epp: " tip-bot for Andy Lutomirski
2015-06-17  0:35 ` [PATCH v3 09/18] staging/lirc_serial: Remove TSC-based timing Andy Lutomirski
2015-07-06 15:42   ` [tip:x86/asm] x86/asm/tsc, " tip-bot for Andy Lutomirski
2015-06-17  0:35 ` [PATCH v3 10/18] input/joystick/analog: Switch from rdtscl() to native_read_tsc() Andy Lutomirski
2015-07-06 15:42   ` [tip:x86/asm] x86/asm/tsc, " tip-bot for Andy Lutomirski
2015-06-17  0:35 ` [PATCH v3 11/18] drivers/input/gameport: Replace rdtscl() with native_read_tsc() Andy Lutomirski
2015-07-06 15:43   ` [tip:x86/asm] x86/asm/tsc, drivers/input/gameport: Replace rdtscl() " tip-bot for Andy Lutomirski
2015-06-17  0:36 ` [PATCH v3 12/18] x86/tsc: Remove rdtscl() Andy Lutomirski
2015-07-06 15:43   ` [tip:x86/asm] x86/asm/tsc: " tip-bot for Andy Lutomirski
2015-06-17  0:36 ` [PATCH v3 13/18] x86/tsc: Rename native_read_tsc() to rdtsc() Andy Lutomirski
2015-06-24 21:38   ` Borislav Petkov
2015-07-06 15:43   ` [tip:x86/asm] x86/asm/tsc: " tip-bot for Andy Lutomirski
2015-06-17  0:36 ` [PATCH v3 14/18] x86: Add rdtsc_ordered() and use it in trivial call sites Andy Lutomirski
2015-07-06 15:44   ` [tip:x86/asm] x86/asm/tsc: " tip-bot for Andy Lutomirski
2015-08-21  7:45   ` [tip:x86/asm] x86/asm/tsc: Add rdtscll() merge helper tip-bot for Ingo Molnar
2015-06-17  0:36 ` [PATCH v3 15/18] x86/tsc: Use rdtsc_ordered() in check_tsc_warp() and drop extra barriers Andy Lutomirski
2015-07-06 15:44   ` [tip:x86/asm] x86/asm/tsc/sync: " tip-bot for Andy Lutomirski
2015-06-17  0:36 ` [PATCH v3 16/18] x86/tsc: In read_tsc, use rdtsc_ordered() instead of get_cycles() Andy Lutomirski
2015-07-06 15:44   ` [tip:x86/asm] x86/asm/tsc: Use rdtsc_ordered() in read_tsc() " tip-bot for Andy Lutomirski
2015-06-17  0:36 ` [PATCH v3 17/18] x86/kvm/tsc: Drop extra barrier and use rdtsc_ordered in kvmclock Andy Lutomirski
2015-06-17  7:47   ` Paolo Bonzini
2015-06-17 13:31     ` Paolo Bonzini
2015-06-20 21:50     ` Borislav Petkov
2015-07-06 15:45   ` [tip:x86/asm] x86/asm/tsc, x86/kvm: Drop open-coded barrier and use rdtsc_ordered() " tip-bot for Andy Lutomirski
2015-06-17  0:36 ` [PATCH v3 18/18] x86/tsc: Remove rdtsc_barrier() Andy Lutomirski
2015-07-06 15:45   ` [tip:x86/asm] x86/asm/tsc: " tip-bot for Andy Lutomirski
2015-06-17 11:11 ` [PATCH v3 00/18] x86/tsc: Clean up rdtsc helpers Borislav Petkov
2015-06-17 13:37   ` Paolo Bonzini
