linux-kernel.vger.kernel.org archive mirror
* [PATCH v8 0/6] Early boot time stamps for x86
@ 2017-11-09  3:01 Pavel Tatashin
  2017-11-09  3:01 ` [PATCH v8 1/6] x86/tsc: remove tsc_disabled flag Pavel Tatashin
                   ` (5 more replies)
  0 siblings, 6 replies; 17+ messages in thread
From: Pavel Tatashin @ 2017-11-09  3:01 UTC (permalink / raw)
  To: steven.sistare, daniel.m.jordan, linux, schwidefsky,
	heiko.carstens, john.stultz, sboyd, x86, linux-kernel, mingo,
	tglx, hpa, douly.fnst

changelog
---------
v8 - v7
	- Addressed comments from Dou Liyang:
	- Moved tsc_early_init() and tsc_early_fini() to be all inside
	  tsc.c, and changed them to be static.
	- Removed warning when notsc parameter is used.
	- Merged with:
	  https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git

v7 - v6
	- Removed tsc_disabled flag, now notsc is equivalent of
	  tsc=unstable
	- Simplified changes to sched/clock.c by removing
	  sched_clock_early() and friends, as requested by Peter
	  Zijlstra. We now always use sched_clock().
	- Modified x86 sched_clock() to return either early boot time or
	  regular.
	- Added another example of why early boot time is important

v6 - v5
	- Added a new patch:
		time: sync read_boot_clock64() with persistent clock
	  which fixes a missing __init macro and enables the time
	  discrepancy fix that was noted by Thomas Gleixner
	- Split "x86/time: read_boot_clock64() implementation" into a
	  separate patch
v4 - v5
	- Fix compiler warnings on systems with stable clocks.

v3 - v4
	- Fixed tsc_early_fini() call to be in the 2nd patch as reported
	  by Dou Liyang
	- Improved comment before __use_sched_clock_early to explain why
	  we need both booleans.
	- Simplified valid_clock logic in read_boot_clock64().

v2 - v3
	- Addressed comment from Thomas Gleixner
	- Timestamps are available a little later in boot but still much
	  earlier than in mainline. This significantly simplified this
	  work.

v1 - v2
	In patch "x86/tsc: tsc early":
	- added tsc_adjusted_early()
	- fixed a 32-bit compile error by using do_div()

This series adds early boot time stamp support for x86 machines.
SPARC patches for early boot time stamps are already integrated into
mainline Linux.

Sample output
-------------
Before:
https://hastebin.com/jadaqukubu.scala

After:
https://hastebin.com/nubipozacu.scala

For more examples of how early time stamps are used, see the following:

Example 1:
https://lwn.net/Articles/734374/
- Without early boot time stamps we would not know about the extra time
  that is spent zeroing struct pages early in boot, even when deferred
  page initialization is used.

Example 2:
https://patchwork.kernel.org/patch/10021247/
- If early boot timestamps were available, the engineer who introduced
  this bug would have noticed the extra time that is spent early in boot.

Pavel Tatashin (6):
  x86/tsc: remove tsc_disabled flag
  time: sync read_boot_clock64() with persistent clock
  x86/time: read_boot_clock64() implementation
  sched: early boot clock
  x86/paravirt: add active_sched_clock to pv_time_ops
  x86/tsc: use tsc early

 arch/arm/kernel/time.c                |   2 +-
 arch/s390/kernel/time.c               |   2 +-
 arch/x86/include/asm/paravirt.h       |   2 +-
 arch/x86/include/asm/paravirt_types.h |   1 +
 arch/x86/kernel/paravirt.c            |   1 +
 arch/x86/kernel/time.c                |  30 ++++++++++
 arch/x86/kernel/tsc.c                 | 104 ++++++++++++++++++++++++++++------
 arch/x86/xen/time.c                   |   7 ++-
 include/linux/timekeeping.h           |   7 +--
 kernel/sched/clock.c                  |  10 +++-
 kernel/time/timekeeping.c             |   8 ++-
 11 files changed, 144 insertions(+), 30 deletions(-)

-- 
2.15.0

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v8 1/6] x86/tsc: remove tsc_disabled flag
  2017-11-09  3:01 [PATCH v8 0/6] Early boot time stamps for x86 Pavel Tatashin
@ 2017-11-09  3:01 ` Pavel Tatashin
  2017-11-13  3:52   ` Dou Liyang
  2017-11-09  3:01 ` [PATCH v8 2/6] time: sync read_boot_clock64() with persistent clock Pavel Tatashin
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 17+ messages in thread
From: Pavel Tatashin @ 2017-11-09  3:01 UTC (permalink / raw)
  To: steven.sistare, daniel.m.jordan, linux, schwidefsky,
	heiko.carstens, john.stultz, sboyd, x86, linux-kernel, mingo,
	tglx, hpa, douly.fnst

tsc_disabled is set when notsc is passed as a kernel parameter. The reason
we have notsc is to avoid timing problems on multi-socket systems. We
already have a mechanism, however, to detect and resolve these issues by
invoking the tsc unstable path. Thus, make notsc behave the same as
tsc=unstable.

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
---
 arch/x86/kernel/tsc.c | 19 +++----------------
 1 file changed, 3 insertions(+), 16 deletions(-)

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 97907e152356..dbce6fa32aa9 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -37,11 +37,6 @@ EXPORT_SYMBOL(tsc_khz);
  */
 static int __read_mostly tsc_unstable;
 
-/* native_sched_clock() is called before tsc_init(), so
-   we must start with the TSC soft disabled to prevent
-   erroneous rdtsc usage on !boot_cpu_has(X86_FEATURE_TSC) processors */
-static int __read_mostly tsc_disabled = -1;
-
 static DEFINE_STATIC_KEY_FALSE(__use_tsc);
 
 int tsc_clocksource_reliable;
@@ -247,8 +242,7 @@ EXPORT_SYMBOL_GPL(check_tsc_unstable);
 #ifdef CONFIG_X86_TSC
 int __init notsc_setup(char *str)
 {
-	pr_warn("Kernel compiled with CONFIG_X86_TSC, cannot disable TSC completely\n");
-	tsc_disabled = 1;
+	mark_tsc_unstable("boot parameter notsc");
 	return 1;
 }
 #else
@@ -1229,7 +1223,7 @@ static void tsc_refine_calibration_work(struct work_struct *work)
 
 static int __init init_tsc_clocksource(void)
 {
-	if (!boot_cpu_has(X86_FEATURE_TSC) || tsc_disabled > 0 || !tsc_khz)
+	if (!boot_cpu_has(X86_FEATURE_TSC) || !tsc_khz)
 		return 0;
 
 	if (tsc_clocksource_reliable)
@@ -1330,12 +1324,6 @@ void __init tsc_init(void)
 		set_cyc2ns_scale(tsc_khz, cpu, cyc);
 	}
 
-	if (tsc_disabled > 0)
-		return;
-
-	/* now allow native_sched_clock() to use rdtsc */
-
-	tsc_disabled = 0;
 	static_branch_enable(&__use_tsc);
 
 	if (!no_sched_irq_time)
@@ -1365,10 +1353,9 @@ void __init tsc_init(void)
 unsigned long calibrate_delay_is_known(void)
 {
 	int sibling, cpu = smp_processor_id();
-	int constant_tsc = cpu_has(&cpu_data(cpu), X86_FEATURE_CONSTANT_TSC);
 	const struct cpumask *mask = topology_core_cpumask(cpu);
 
-	if (tsc_disabled || !constant_tsc || !mask)
+	if (!cpu_has(&cpu_data(cpu), X86_FEATURE_CONSTANT_TSC) || !mask)
 		return 0;
 
 	sibling = cpumask_any_but(mask, cpu);
-- 
2.15.0


* [PATCH v8 2/6] time: sync read_boot_clock64() with persistent clock
  2017-11-09  3:01 [PATCH v8 0/6] Early boot time stamps for x86 Pavel Tatashin
  2017-11-09  3:01 ` [PATCH v8 1/6] x86/tsc: remove tsc_disabled flag Pavel Tatashin
@ 2017-11-09  3:01 ` Pavel Tatashin
  2017-11-13  3:47   ` Dou Liyang
  2017-11-09  3:01 ` [PATCH v8 3/6] x86/time: read_boot_clock64() implementation Pavel Tatashin
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 17+ messages in thread
From: Pavel Tatashin @ 2017-11-09  3:01 UTC (permalink / raw)
  To: steven.sistare, daniel.m.jordan, linux, schwidefsky,
	heiko.carstens, john.stultz, sboyd, x86, linux-kernel, mingo,
	tglx, hpa, douly.fnst

read_boot_clock64() returns a boot start timestamp from epoch. Some arches
may need to access the persistent clock interface in order to calculate the
epoch offset. The resolution of the persistent clock, however, might be
low.  Therefore, in order to avoid time discrepancies a new argument 'now'
is added to read_boot_clock64() parameters. Arch may decide to use it
instead of accessing persistent clock again.

Also, change read_boot_clock64() to have __init prototype since it is
accessed only during boot.

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
---
 arch/arm/kernel/time.c      | 2 +-
 arch/s390/kernel/time.c     | 2 +-
 include/linux/timekeeping.h | 7 +++----
 kernel/time/timekeeping.c   | 8 ++++++--
 4 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/arch/arm/kernel/time.c b/arch/arm/kernel/time.c
index 629f8e9981f1..5b259261a268 100644
--- a/arch/arm/kernel/time.c
+++ b/arch/arm/kernel/time.c
@@ -90,7 +90,7 @@ void read_persistent_clock64(struct timespec64 *ts)
 	__read_persistent_clock(ts);
 }
 
-void read_boot_clock64(struct timespec64 *ts)
+void __init read_boot_clock64(struct timespec64 *now, struct timespec64 *ts)
 {
 	__read_boot_clock(ts);
 }
diff --git a/arch/s390/kernel/time.c b/arch/s390/kernel/time.c
index 5cbd52169348..780b770e6a89 100644
--- a/arch/s390/kernel/time.c
+++ b/arch/s390/kernel/time.c
@@ -220,7 +220,7 @@ void read_persistent_clock64(struct timespec64 *ts)
 	ext_to_timespec64(clk, ts);
 }
 
-void read_boot_clock64(struct timespec64 *ts)
+void __init read_boot_clock64(struct timespec64 *now, struct timespec64 *ts)
 {
 	unsigned char clk[STORE_CLOCK_EXT_SIZE];
 	__u64 delta;
diff --git a/include/linux/timekeeping.h b/include/linux/timekeeping.h
index c198ab40c04f..73919982045e 100644
--- a/include/linux/timekeeping.h
+++ b/include/linux/timekeeping.h
@@ -211,9 +211,8 @@ extern void ktime_get_snapshot(struct system_time_snapshot *systime_snapshot);
  */
 extern int persistent_clock_is_local;
 
-extern void read_persistent_clock64(struct timespec64 *ts);
-extern void read_boot_clock64(struct timespec64 *ts);
-extern int update_persistent_clock64(struct timespec64 now);
-
+void read_persistent_clock64(struct timespec64 *ts);
+void read_boot_clock64(struct timespec64 *now, struct timespec64 *ts);
+int update_persistent_clock64(struct timespec64 now);
 
 #endif
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 353f7bd1eeb0..59b64ff47468 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1539,9 +1539,13 @@ void __weak read_persistent_clock64(struct timespec64 *ts64)
  * Function to read the exact time the system has been started.
  * Returns a timespec64 with tv_sec=0 and tv_nsec=0 if unsupported.
  *
+ * Argument 'now' contains time from persistent clock to calculate offset from
+ * epoch. May contain zeros if the persistent clock is not available.
+ *
  *  XXX - Do be sure to remove it once all arches implement it.
  */
-void __weak read_boot_clock64(struct timespec64 *ts)
+void __weak __init read_boot_clock64(struct timespec64 *now,
+				     struct timespec64 *ts)
 {
 	ts->tv_sec = 0;
 	ts->tv_nsec = 0;
@@ -1572,7 +1576,7 @@ void __init timekeeping_init(void)
 	} else if (now.tv_sec || now.tv_nsec)
 		persistent_clock_exists = true;
 
-	read_boot_clock64(&boot);
+	read_boot_clock64(&now, &boot);
 	if (!timespec64_valid_strict(&boot)) {
 		pr_warn("WARNING: Boot clock returned invalid value!\n"
 			"         Check your CMOS/BIOS settings.\n");
-- 
2.15.0


* [PATCH v8 3/6] x86/time: read_boot_clock64() implementation
  2017-11-09  3:01 [PATCH v8 0/6] Early boot time stamps for x86 Pavel Tatashin
  2017-11-09  3:01 ` [PATCH v8 1/6] x86/tsc: remove tsc_disabled flag Pavel Tatashin
  2017-11-09  3:01 ` [PATCH v8 2/6] time: sync read_boot_clock64() with persistent clock Pavel Tatashin
@ 2017-11-09  3:01 ` Pavel Tatashin
  2017-11-09  3:01 ` [PATCH v8 4/6] sched: early boot clock Pavel Tatashin
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 17+ messages in thread
From: Pavel Tatashin @ 2017-11-09  3:01 UTC (permalink / raw)
  To: steven.sistare, daniel.m.jordan, linux, schwidefsky,
	heiko.carstens, john.stultz, sboyd, x86, linux-kernel, mingo,
	tglx, hpa, douly.fnst

read_boot_clock64() returns the time when the system was started. Now that
the early boot clock is available on x86, it is possible to implement an
x86-specific version of read_boot_clock64() that takes advantage of this
new feature.

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
---
 arch/x86/kernel/time.c | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/arch/x86/kernel/time.c b/arch/x86/kernel/time.c
index 749d189f8cd4..18bee0b7b8a0 100644
--- a/arch/x86/kernel/time.c
+++ b/arch/x86/kernel/time.c
@@ -15,6 +15,7 @@
 #include <linux/i8253.h>
 #include <linux/time.h>
 #include <linux/export.h>
+#include <linux/sched/clock.h>
 
 #include <asm/vsyscall.h>
 #include <asm/x86_init.h>
@@ -101,3 +102,32 @@ void __init time_init(void)
 {
 	late_time_init = x86_late_time_init;
 }
+
+/*
+ * Called once during boot to initialize the boot time.
+ * This function returns a timestamp in timespec format, which is sec/nsec
+ * from epoch of when boot started.
+ * We use sched_clock_cpu(), which gives us nanoseconds from when this clock
+ * was started, and that happens quite early during the boot process. To
+ * calculate the offset from epoch we use the information provided in 'now'
+ * by the caller.
+ * If sched_clock_cpu() is not available, or if there is any kind of error
+ * (i.e. time from epoch is smaller than boot time), we must return zeros in
+ * ts, and the caller will handle the error by assuming that the time when
+ * this function was called is the beginning of boot time.
+ */
+void __init read_boot_clock64(struct timespec64 *now, struct timespec64 *ts)
+{
+	u64 ns_boot = sched_clock_cpu(smp_processor_id());
+	bool valid_clock;
+	u64 ns_now;
+
+	ns_now = timespec64_to_ns(now);
+	valid_clock = ns_boot && timespec64_valid_strict(now) &&
+			(ns_now > ns_boot);
+
+	if (!valid_clock)
+		*ts = (struct timespec64){0, 0};
+	else
+		*ts = ns_to_timespec64(ns_now - ns_boot);
+}
-- 
2.15.0


* [PATCH v8 4/6] sched: early boot clock
  2017-11-09  3:01 [PATCH v8 0/6] Early boot time stamps for x86 Pavel Tatashin
                   ` (2 preceding siblings ...)
  2017-11-09  3:01 ` [PATCH v8 3/6] x86/time: read_boot_clock64() implementation Pavel Tatashin
@ 2017-11-09  3:01 ` Pavel Tatashin
  2017-11-09  3:02 ` [PATCH v8 5/6] x86/paravirt: add active_sched_clock to pv_time_ops Pavel Tatashin
  2017-11-09  3:02 ` [PATCH v8 6/6] x86/tsc: use tsc early Pavel Tatashin
  5 siblings, 0 replies; 17+ messages in thread
From: Pavel Tatashin @ 2017-11-09  3:01 UTC (permalink / raw)
  To: steven.sistare, daniel.m.jordan, linux, schwidefsky,
	heiko.carstens, john.stultz, sboyd, x86, linux-kernel, mingo,
	tglx, hpa, douly.fnst

Allow sched_clock() to be used before sched_clock_init() and
sched_clock_init_late() are called. This provides us with a way to get
early boot timestamps on machines with unstable clocks.

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
---
 kernel/sched/clock.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/clock.c b/kernel/sched/clock.c
index e086babe6c61..8bc603951c2d 100644
--- a/kernel/sched/clock.c
+++ b/kernel/sched/clock.c
@@ -217,6 +217,11 @@ void clear_sched_clock_stable(void)
  */
 static int __init sched_clock_init_late(void)
 {
+	/* Transition to unstable clock from early clock */
+	local_irq_disable();
+	__gtod_offset = sched_clock() + __sched_clock_offset - ktime_get_ns();
+	local_irq_enable();
+
 	sched_clock_running = 2;
 	/*
 	 * Ensure that it is impossible to not do a static_key update.
@@ -362,8 +367,9 @@ u64 sched_clock_cpu(int cpu)
 	if (sched_clock_stable())
 		return sched_clock() + __sched_clock_offset;
 
-	if (unlikely(!sched_clock_running))
-		return 0ull;
+	/* Use early clock until sched_clock_init_late() */
+	if (unlikely(sched_clock_running < 2))
+		return sched_clock() + __sched_clock_offset;
 
 	preempt_disable_notrace();
 	scd = cpu_sdc(cpu);
-- 
2.15.0


* [PATCH v8 5/6] x86/paravirt: add active_sched_clock to pv_time_ops
  2017-11-09  3:01 [PATCH v8 0/6] Early boot time stamps for x86 Pavel Tatashin
                   ` (3 preceding siblings ...)
  2017-11-09  3:01 ` [PATCH v8 4/6] sched: early boot clock Pavel Tatashin
@ 2017-11-09  3:02 ` Pavel Tatashin
  2017-11-09  3:02 ` [PATCH v8 6/6] x86/tsc: use tsc early Pavel Tatashin
  5 siblings, 0 replies; 17+ messages in thread
From: Pavel Tatashin @ 2017-11-09  3:02 UTC (permalink / raw)
  To: steven.sistare, daniel.m.jordan, linux, schwidefsky,
	heiko.carstens, john.stultz, sboyd, x86, linux-kernel, mingo,
	tglx, hpa, douly.fnst

The early boot clock might differ from the clock that is used later on;
therefore, add a new field to pv_time_ops that holds the currently active
clock. If a platform supports the early boot clock, this field will be set
to that clock early in boot, and later will be replaced with the permanent
clock.

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
---
 arch/x86/include/asm/paravirt_types.h | 1 +
 arch/x86/kernel/paravirt.c            | 1 +
 arch/x86/xen/time.c                   | 7 ++++---
 3 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index 6ec54d01972d..96518f027c15 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -98,6 +98,7 @@ struct pv_lazy_ops {
 struct pv_time_ops {
 	unsigned long long (*sched_clock)(void);
 	unsigned long long (*steal_clock)(int cpu);
+	unsigned long long (*active_sched_clock)(void);
 } __no_randomize_layout;
 
 struct pv_cpu_ops {
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index 041096bdef86..91af9e6b913b 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -320,6 +320,7 @@ struct pv_init_ops pv_init_ops = {
 struct pv_time_ops pv_time_ops = {
 	.sched_clock = native_sched_clock,
 	.steal_clock = native_steal_clock,
+	.active_sched_clock = native_sched_clock,
 };
 
 __visible struct pv_irq_ops pv_irq_ops = {
diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c
index 80c2a4bdf230..fdd241748dc8 100644
--- a/arch/x86/xen/time.c
+++ b/arch/x86/xen/time.c
@@ -408,8 +408,8 @@ static void __init xen_time_init(void)
 
 void __ref xen_init_time_ops(void)
 {
-	pv_time_ops = xen_time_ops;
-
+	pv_time_ops.sched_clock = xen_time_ops.sched_clock;
+	pv_time_ops.steal_clock = xen_time_ops.steal_clock;
 	x86_init.timers.timer_init = xen_time_init;
 	x86_init.timers.setup_percpu_clockev = x86_init_noop;
 	x86_cpuinit.setup_percpu_clockev = x86_init_noop;
@@ -450,7 +450,8 @@ void __init xen_hvm_init_time_ops(void)
 		return;
 	}
 
-	pv_time_ops = xen_time_ops;
+	pv_time_ops.sched_clock = xen_time_ops.sched_clock;
+	pv_time_ops.steal_clock = xen_time_ops.steal_clock;
 	x86_init.timers.setup_percpu_clockev = xen_time_init;
 	x86_cpuinit.setup_percpu_clockev = xen_hvm_setup_cpu_clockevents;
 
-- 
2.15.0


* [PATCH v8 6/6] x86/tsc: use tsc early
  2017-11-09  3:01 [PATCH v8 0/6] Early boot time stamps for x86 Pavel Tatashin
                   ` (4 preceding siblings ...)
  2017-11-09  3:02 ` [PATCH v8 5/6] x86/paravirt: add active_sched_clock to pv_time_ops Pavel Tatashin
@ 2017-11-09  3:02 ` Pavel Tatashin
  2017-11-13  3:47   ` Dou Liyang
  5 siblings, 1 reply; 17+ messages in thread
From: Pavel Tatashin @ 2017-11-09  3:02 UTC (permalink / raw)
  To: steven.sistare, daniel.m.jordan, linux, schwidefsky,
	heiko.carstens, john.stultz, sboyd, x86, linux-kernel, mingo,
	tglx, hpa, douly.fnst

This patch adds an early clock feature to x86 platforms.

tsc_early_init():
Determines the offset, shift, and multiplier for the early clock based on
the TSC frequency.

tsc_early_fini():
Implements the finish part of the early tsc feature; prints a message about
the offset, which can be useful to find out how much time was spent in POST
and the boot manager (if the TSC starts from 0 during boot).

sched_clock_early():
A TSC-based implementation of the early clock, called from sched_clock().

Call tsc_early_init() to initialize early boot time stamp functionality on
the supported x86 platforms, and call tsc_early_fini() to finish this
feature after the permanent clock has been initialized. The supported x86
systems are those where the TSC frequency is determined early in boot.

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
---
 arch/x86/include/asm/paravirt.h |  2 +-
 arch/x86/kernel/tsc.c           | 85 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 86 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index 283efcaac8af..b4ba220163ce 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -171,7 +171,7 @@ static inline int rdmsrl_safe(unsigned msr, unsigned long long *p)
 
 static inline unsigned long long paravirt_sched_clock(void)
 {
-	return PVOP_CALL0(unsigned long long, pv_time_ops.sched_clock);
+	return PVOP_CALL0(unsigned long long, pv_time_ops.active_sched_clock);
 }
 
 struct static_key;
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index dbce6fa32aa9..24da1ff96481 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -181,6 +181,80 @@ static void set_cyc2ns_scale(unsigned long khz, int cpu, unsigned long long tsc_
 	local_irq_restore(flags);
 }
 
+#ifdef CONFIG_X86_TSC
+static struct cyc2ns_data  cyc2ns_early;
+
+static u64 sched_clock_early(void)
+{
+	u64 ns = mul_u64_u32_shr(rdtsc(), cyc2ns_early.cyc2ns_mul,
+				 cyc2ns_early.cyc2ns_shift);
+	return ns + cyc2ns_early.cyc2ns_offset;
+}
+
+#ifdef CONFIG_PARAVIRT
+static inline void __init tsc_early_enable(void)
+{
+	pv_time_ops.active_sched_clock = sched_clock_early;
+}
+
+static inline void __init tsc_early_disable(void)
+{
+	pv_time_ops.active_sched_clock = pv_time_ops.sched_clock;
+}
+#else /* CONFIG_PARAVIRT */
+/*
+ * For native clock we use two switches static and dynamic, the static switch is
+ * initially true, so we check the dynamic switch, which is initially false.
+ * Later  when early clock is disabled, we can alter the static switch in order
+ * to avoid branch check on every sched_clock() call.
+ */
+static bool __tsc_early;
+static DEFINE_STATIC_KEY_TRUE(__tsc_early_static);
+
+static inline void __init tsc_early_enable(void)
+{
+	__tsc_early = true;
+}
+
+static inline void __init tsc_early_disable(void)
+{
+	__tsc_early = false;
+	static_branch_disable(&__tsc_early_static);
+}
+#endif /* CONFIG_PARAVIRT */
+
+/*
+ * Initialize clock for early time stamps
+ */
+static void __init tsc_early_init(unsigned int khz)
+{
+	clocks_calc_mult_shift(&cyc2ns_early.cyc2ns_mul,
+			       &cyc2ns_early.cyc2ns_shift,
+			       khz, NSEC_PER_MSEC, 0);
+	cyc2ns_early.cyc2ns_offset = -sched_clock_early();
+	tsc_early_enable();
+}
+
+static void __init tsc_early_fini(void)
+{
+	unsigned long long t;
+	unsigned long r;
+
+	/* We did not have early sched clock if multiplier is 0 */
+	if (cyc2ns_early.cyc2ns_mul == 0) {
+		tsc_early_disable();
+		return;
+	}
+
+	t = -cyc2ns_early.cyc2ns_offset;
+	r = do_div(t, NSEC_PER_SEC);
+
+	tsc_early_disable();
+	__sched_clock_offset = sched_clock_early() - sched_clock();
+	pr_info("sched clock early is finished, offset [%lld.%09lds]\n", t, r);
+}
+#endif /* CONFIG_X86_TSC */
+
 /*
  * Scheduler clock - returns current time in nanosec units.
  */
@@ -193,6 +267,13 @@ u64 native_sched_clock(void)
 		return cycles_2_ns(tsc_now);
 	}
 
+#if !defined(CONFIG_PARAVIRT) && defined(CONFIG_X86_TSC)
+	if (static_branch_unlikely(&__tsc_early_static)) {
+		if (__tsc_early)
+			return sched_clock_early();
+	}
+#endif /* !CONFIG_PARAVIRT && CONFIG_X86_TSC */
+
 	/*
 	 * Fall back to jiffies if there's no TSC available:
 	 * ( But note that we still use it if the TSC is marked
@@ -1274,6 +1355,7 @@ void __init tsc_early_delay_calibrate(void)
 	lpj = tsc_khz * 1000;
 	do_div(lpj, HZ);
 	loops_per_jiffy = lpj;
+	tsc_early_init(tsc_khz);
 }
 
 void __init tsc_init(void)
@@ -1283,6 +1365,7 @@ void __init tsc_init(void)
 
 	if (!boot_cpu_has(X86_FEATURE_TSC)) {
 		setup_clear_cpu_cap(X86_FEATURE_TSC_DEADLINE_TIMER);
+		tsc_early_fini();
 		return;
 	}
 
@@ -1302,6 +1385,7 @@ void __init tsc_init(void)
 	if (!tsc_khz) {
 		mark_tsc_unstable("could not calculate TSC khz");
 		setup_clear_cpu_cap(X86_FEATURE_TSC_DEADLINE_TIMER);
+		tsc_early_fini();
 		return;
 	}
 
@@ -1341,6 +1425,7 @@ void __init tsc_init(void)
 		mark_tsc_unstable("TSCs unsynchronized");
 
 	detect_art();
+	tsc_early_fini();
 }
 
 #ifdef CONFIG_SMP
-- 
2.15.0


* Re: [PATCH v8 2/6] time: sync read_boot_clock64() with persistent clock
  2017-11-09  3:01 ` [PATCH v8 2/6] time: sync read_boot_clock64() with persistent clock Pavel Tatashin
@ 2017-11-13  3:47   ` Dou Liyang
  2017-11-14 19:57     ` Pavel Tatashin
  0 siblings, 1 reply; 17+ messages in thread
From: Dou Liyang @ 2017-11-13  3:47 UTC (permalink / raw)
  To: Pavel Tatashin, steven.sistare, daniel.m.jordan, linux,
	schwidefsky, heiko.carstens, john.stultz, sboyd, x86,
	linux-kernel, mingo, tglx, hpa

Hi Pavel,

At 11/09/2017 11:01 AM, Pavel Tatashin wrote:
> read_boot_clock64() returns a boot start timestamp from epoch. Some arches
> may need to access the persistent clock interface in order to calculate the
> epoch offset. The resolution of the persistent clock, however, might be
> low.  Therefore, in order to avoid time discrepancies a new argument 'now'
> is added to read_boot_clock64() parameters. Arch may decide to use it
> instead of accessing persistent clock again.
>
> Also, change read_boot_clock64() to have __init prototype since it is
> accessed only during boot.
>
> Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
> ---
>  arch/arm/kernel/time.c      | 2 +-
>  arch/s390/kernel/time.c     | 2 +-
>  include/linux/timekeeping.h | 7 +++----
>  kernel/time/timekeeping.c   | 8 ++++++--
>  4 files changed, 11 insertions(+), 8 deletions(-)
>
> diff --git a/arch/arm/kernel/time.c b/arch/arm/kernel/time.c
> index 629f8e9981f1..5b259261a268 100644
> --- a/arch/arm/kernel/time.c
> +++ b/arch/arm/kernel/time.c
> @@ -90,7 +90,7 @@ void read_persistent_clock64(struct timespec64 *ts)
>  	__read_persistent_clock(ts);
>  }
>
> -void read_boot_clock64(struct timespec64 *ts)
> +void __init read_boot_clock64(struct timespec64 *now, struct timespec64 *ts)
>  {
>  	__read_boot_clock(ts);
>  }
> diff --git a/arch/s390/kernel/time.c b/arch/s390/kernel/time.c
> index 5cbd52169348..780b770e6a89 100644
> --- a/arch/s390/kernel/time.c
> +++ b/arch/s390/kernel/time.c
> @@ -220,7 +220,7 @@ void read_persistent_clock64(struct timespec64 *ts)
>  	ext_to_timespec64(clk, ts);
>  }
>
> -void read_boot_clock64(struct timespec64 *ts)
> +void __init read_boot_clock64(struct timespec64 *now, struct timespec64 *ts)
>  {
>  	unsigned char clk[STORE_CLOCK_EXT_SIZE];
>  	__u64 delta;
> diff --git a/include/linux/timekeeping.h b/include/linux/timekeeping.h
> index c198ab40c04f..73919982045e 100644
> --- a/include/linux/timekeeping.h
> +++ b/include/linux/timekeeping.h
> @@ -211,9 +211,8 @@ extern void ktime_get_snapshot(struct system_time_snapshot *systime_snapshot);
>   */
>  extern int persistent_clock_is_local;
>
> -extern void read_persistent_clock64(struct timespec64 *ts);
> -extern void read_boot_clock64(struct timespec64 *ts);
> -extern int update_persistent_clock64(struct timespec64 now);
> -
> +void read_persistent_clock64(struct timespec64 *ts);
> +void read_boot_clock64(struct timespec64 *now, struct timespec64 *ts);
> +int update_persistent_clock64(struct timespec64 now);
>

Why should we remove the *extern* keyword?

Thanks,
	dou.


* Re: [PATCH v8 6/6] x86/tsc: use tsc early
  2017-11-09  3:02 ` [PATCH v8 6/6] x86/tsc: use tsc early Pavel Tatashin
@ 2017-11-13  3:47   ` Dou Liyang
  2017-11-14 21:46     ` Pavel Tatashin
  0 siblings, 1 reply; 17+ messages in thread
From: Dou Liyang @ 2017-11-13  3:47 UTC (permalink / raw)
  To: Pavel Tatashin, steven.sistare, daniel.m.jordan, linux,
	schwidefsky, heiko.carstens, john.stultz, sboyd, x86,
	linux-kernel, mingo, tglx, hpa

Hi Pavel,

At 11/09/2017 11:02 AM, Pavel Tatashin wrote:
> This patch adds an early clock feature to x86 platforms.
>
> tsc_early_init():
> Determines the offset, shift, and multiplier for the early clock based on
> the TSC frequency.
>
> tsc_early_fini():
> Implements the finish part of the early tsc feature; prints a message about
> the offset, which can be useful to find out how much time was spent in POST
> and the boot manager (if the TSC starts from 0 during boot).
>
> sched_clock_early():
> A TSC-based implementation of the early clock, called from sched_clock().
>
> Call tsc_early_init() to initialize early boot time stamp functionality on
> the supported x86 platforms, and call tsc_early_fini() to finish this
> feature after the permanent clock has been initialized. The supported x86
> systems are those where the TSC frequency is determined early in boot.
>
> Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
> ---
>  arch/x86/include/asm/paravirt.h |  2 +-
>  arch/x86/kernel/tsc.c           | 85 +++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 86 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
> index 283efcaac8af..b4ba220163ce 100644
> --- a/arch/x86/include/asm/paravirt.h
> +++ b/arch/x86/include/asm/paravirt.h
> @@ -171,7 +171,7 @@ static inline int rdmsrl_safe(unsigned msr, unsigned long long *p)
>
>  static inline unsigned long long paravirt_sched_clock(void)
>  {
> -	return PVOP_CALL0(unsigned long long, pv_time_ops.sched_clock);
> +	return PVOP_CALL0(unsigned long long, pv_time_ops.active_sched_clock);
>  }
>

Should in the 5th patch

>  struct static_key;
> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> index dbce6fa32aa9..24da1ff96481 100644
> --- a/arch/x86/kernel/tsc.c
> +++ b/arch/x86/kernel/tsc.c
> @@ -181,6 +181,80 @@ static void set_cyc2ns_scale(unsigned long khz, int cpu, unsigned long long tsc_
>  	local_irq_restore(flags);
>  }
>
> +#ifdef CONFIG_X86_TSC
> +static struct cyc2ns_data  cyc2ns_early;
> +
> +static u64 sched_clock_early(void)
> +{
> +	u64 ns = mul_u64_u32_shr(rdtsc(), cyc2ns_early.cyc2ns_mul,
> +				 cyc2ns_early.cyc2ns_shift);
> +	return ns + cyc2ns_early.cyc2ns_offset;
> +}
> +
> +#ifdef CONFIG_PARAVIRT
> +static inline void __init tsc_early_enable(void)
> +{
> +	pv_time_ops.active_sched_clock = sched_clock_early;
> +}
> +
> +static inline void __init tsc_early_disable(void)
> +{
> +	pv_time_ops.active_sched_clock = pv_time_ops.sched_clock;
> +}
> +#else /* CONFIG_PARAVIRT */
> +/*
> + * For native clock we use two switches static and dynamic, the static switch is
> + * initially true, so we check the dynamic switch, which is initially false.
> + * Later  when early clock is disabled, we can alter the static switch in order
> + * to avoid branch check on every sched_clock() call.
> + */
> +static bool __tsc_early;
> +static DEFINE_STATIC_KEY_TRUE(__tsc_early_static);
> +
> +static inline void __init tsc_early_enable(void)
> +{
> +	__tsc_early = true;
> +}
> +
> +static inline void __init tsc_early_disable(void)
> +{
> +	__tsc_early = false;
> +	static_branch_disable(&__tsc_early_static);
> +}
> +#endif /* CONFIG_PARAVIRT */
> +
> +/*
> + * Initialize clock for early time stamps
> + */
> +static void __init tsc_early_init(unsigned int khz)
> +{
> +	clocks_calc_mult_shift(&cyc2ns_early.cyc2ns_mul,
> +			       &cyc2ns_early.cyc2ns_shift,
> +			       khz, NSEC_PER_MSEC, 0);
> +	cyc2ns_early.cyc2ns_offset = -sched_clock_early();
> +	tsc_early_enable();
> +}
> +
> +static void __init tsc_early_fini(void)
> +{
> +	unsigned long long t;
> +	unsigned long r;
> +
> +	/* We did not have early sched clock if multiplier is 0 */
> +	if (cyc2ns_early.cyc2ns_mul == 0) {
> +		tsc_early_disable();
> +		return;
> +	}
> +
> +	t = -cyc2ns_early.cyc2ns_offset;
> +	r = do_div(t, NSEC_PER_SEC);
> +
> +	tsc_early_disable();
> +	__sched_clock_offset = sched_clock_early() - sched_clock();
> +	pr_info("sched clock early is finished, offset [%lld.%09lds]\n", t, r);
> +}
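[Editorial aside: the mult/shift conversion quoted above can be sketched in
plain user-space C. calc_mult_shift() and cycles_to_ns() below are
hypothetical stand-ins for the kernel's clocks_calc_mult_shift() and
mul_u64_u32_shr(), not the actual implementations.]

```c
#include <stdint.h>

#define NSEC_PER_MSEC 1000000ULL

/* Stand-in for clocks_calc_mult_shift(): pick mult/shift so that
 * (cycles * mult) >> shift approximates cycles * NSEC_PER_MSEC / khz.
 * A fixed shift of 22 keeps enough precision for GHz-range TSC rates. */
void calc_mult_shift(uint32_t *mult, uint32_t *shift, uint32_t khz)
{
	*shift = 22;
	*mult = (uint32_t)((NSEC_PER_MSEC << *shift) / khz);
}

/* Stand-in for mul_u64_u32_shr(): the cycles-to-nanoseconds conversion
 * done in sched_clock_early(), minus the boot-time offset. */
uint64_t cycles_to_ns(uint64_t cycles, uint32_t mult, uint32_t shift)
{
	return (cycles * (uint64_t)mult) >> shift;
}
```

For a 2.5 GHz TSC (khz = 2500000), 2.5e9 cycles convert to roughly one
second (1e9 ns), with only rounding error from the fixed-point math.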

Add definitions for the X86_TSC=no case:

#else /* CONFIG_X86_TSC */
static inline void tsc_early_init(unsigned int khz) { }
static inline void tsc_early_fini(void) { }

> +#endif /* CONFIG_X86_TSC */
> +


>  /*
>   * Scheduler clock - returns current time in nanosec units.
>   */
> @@ -193,6 +267,13 @@ u64 native_sched_clock(void)
>  		return cycles_2_ns(tsc_now);
>  	}
>
> +#if !defined(CONFIG_PARAVIRT) && defined(CONFIG_X86_TSC)
> +	if (static_branch_unlikely(&__tsc_early_static)) {
> +		if (__tsc_early)
> +			return sched_clock_early();
> +	}
> +#endif /* !CONFIG_PARAVIRT && CONFIG_X86_TSC */
> +
>  	/*
>  	 * Fall back to jiffies if there's no TSC available:
>  	 * ( But note that we still use it if the TSC is marked
> @@ -1274,6 +1355,7 @@ void __init tsc_early_delay_calibrate(void)
>  	lpj = tsc_khz * 1000;
>  	do_div(lpj, HZ);
>  	loops_per_jiffy = lpj;
> +	tsc_early_init(tsc_khz);
>  }
>
>  void __init tsc_init(void)
> @@ -1283,6 +1365,7 @@ void __init tsc_init(void)
>
>  	if (!boot_cpu_has(X86_FEATURE_TSC)) {
>  		setup_clear_cpu_cap(X86_FEATURE_TSC_DEADLINE_TIMER);
> +		tsc_early_fini();

According to tsc_early_delay_calibrate(), if
(!boot_cpu_has(X86_FEATURE_TSC) || !tsc_khz), tsc_early_init(tsc_khz)
will never be called, so the call here is redundant.

>  		return;
>  	}
>
> @@ -1302,6 +1385,7 @@ void __init tsc_init(void)
>  	if (!tsc_khz) {
>  		mark_tsc_unstable("could not calculate TSC khz");
>  		setup_clear_cpu_cap(X86_FEATURE_TSC_DEADLINE_TIMER);
> +		tsc_early_fini();

ditto

>  		return;
>  	}
>
> @@ -1341,6 +1425,7 @@ void __init tsc_init(void)
>  		mark_tsc_unstable("TSCs unsynchronized");
>
>  	detect_art();
> +	tsc_early_fini();

IMO, just keeping the finishing call here is enough.

BTW, seems you forgot to cc Peter Zijlstra <peterz@infradead.org> in 
both V7 and V8 patchsets.

Thanks,
	dou
>  }
>
>  #ifdef CONFIG_SMP
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v8 1/6] x86/tsc: remove tsc_disabled flag
  2017-11-09  3:01 ` [PATCH v8 1/6] x86/tsc: remove tsc_disabled flag Pavel Tatashin
@ 2017-11-13  3:52   ` Dou Liyang
  2017-11-13 18:51     ` Pavel Tatashin
  0 siblings, 1 reply; 17+ messages in thread
From: Dou Liyang @ 2017-11-13  3:52 UTC (permalink / raw)
  To: Pavel Tatashin, steven.sistare, daniel.m.jordan, linux,
	schwidefsky, heiko.carstens, john.stultz, sboyd, x86,
	linux-kernel, mingo, tglx, hpa

Hi Pavel,

At 11/09/2017 11:01 AM, Pavel Tatashin wrote:
> tsc_disabled is set when notsc is passed as a kernel parameter. The reason
> we have notsc is to avoid timing problems on multi-socket systems. We
> already have a mechanism, however, to detect and resolve these issues by
> invoking the tsc unstable path. Thus, make notsc behave the same as
> tsc=unstable.
>
> Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>

I am not sure if I could add the signature.

Anyway, it looks good to me.

Reviewed-by: Dou Liyang <douly.fnst@cn.fujitsu.com>

> ---
>  arch/x86/kernel/tsc.c | 19 +++----------------
>  1 file changed, 3 insertions(+), 16 deletions(-)
>
> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> index 97907e152356..dbce6fa32aa9 100644
> --- a/arch/x86/kernel/tsc.c
> +++ b/arch/x86/kernel/tsc.c
> @@ -37,11 +37,6 @@ EXPORT_SYMBOL(tsc_khz);
>   */
>  static int __read_mostly tsc_unstable;
>
> -/* native_sched_clock() is called before tsc_init(), so
> -   we must start with the TSC soft disabled to prevent
> -   erroneous rdtsc usage on !boot_cpu_has(X86_FEATURE_TSC) processors */
> -static int __read_mostly tsc_disabled = -1;
> -
>  static DEFINE_STATIC_KEY_FALSE(__use_tsc);
>
>  int tsc_clocksource_reliable;
> @@ -247,8 +242,7 @@ EXPORT_SYMBOL_GPL(check_tsc_unstable);
>  #ifdef CONFIG_X86_TSC
>  int __init notsc_setup(char *str)
>  {
> -	pr_warn("Kernel compiled with CONFIG_X86_TSC, cannot disable TSC completely\n");
> -	tsc_disabled = 1;
> +	mark_tsc_unstable("boot parameter notsc");
>  	return 1;
>  }
>  #else
> @@ -1229,7 +1223,7 @@ static void tsc_refine_calibration_work(struct work_struct *work)
>
>  static int __init init_tsc_clocksource(void)
>  {
> -	if (!boot_cpu_has(X86_FEATURE_TSC) || tsc_disabled > 0 || !tsc_khz)
> +	if (!boot_cpu_has(X86_FEATURE_TSC) || !tsc_khz)
>  		return 0;
>
>  	if (tsc_clocksource_reliable)
> @@ -1330,12 +1324,6 @@ void __init tsc_init(void)
>  		set_cyc2ns_scale(tsc_khz, cpu, cyc);
>  	}
>
> -	if (tsc_disabled > 0)
> -		return;
> -
> -	/* now allow native_sched_clock() to use rdtsc */
> -
> -	tsc_disabled = 0;
>  	static_branch_enable(&__use_tsc);
>
>  	if (!no_sched_irq_time)
> @@ -1365,10 +1353,9 @@ void __init tsc_init(void)
>  unsigned long calibrate_delay_is_known(void)
>  {
>  	int sibling, cpu = smp_processor_id();
> -	int constant_tsc = cpu_has(&cpu_data(cpu), X86_FEATURE_CONSTANT_TSC);
>  	const struct cpumask *mask = topology_core_cpumask(cpu);
>
> -	if (tsc_disabled || !constant_tsc || !mask)
> +	if (!cpu_has(&cpu_data(cpu), X86_FEATURE_CONSTANT_TSC) || !mask)
>  		return 0;
>
>  	sibling = cpumask_any_but(mask, cpu);
>


* Re: [PATCH v8 1/6] x86/tsc: remove tsc_disabled flag
  2017-11-13  3:52   ` Dou Liyang
@ 2017-11-13 18:51     ` Pavel Tatashin
  0 siblings, 0 replies; 17+ messages in thread
From: Pavel Tatashin @ 2017-11-13 18:51 UTC (permalink / raw)
  To: Dou Liyang
  Cc: Steve Sistare, Daniel Jordan, linux, schwidefsky, heiko.carstens,
	John Stultz, sboyd, x86, linux-kernel, mingo, Thomas Gleixner,
	hpa

> Reviewed-by: Dou Liyang <douly.fnst@cn.fujitsu.com>

Thank you!

Pavel


* Re: [PATCH v8 2/6] time: sync read_boot_clock64() with persistent clock
  2017-11-13  3:47   ` Dou Liyang
@ 2017-11-14 19:57     ` Pavel Tatashin
  2017-11-15  2:47       ` Dou Liyang
  0 siblings, 1 reply; 17+ messages in thread
From: Pavel Tatashin @ 2017-11-14 19:57 UTC (permalink / raw)
  To: Dou Liyang
  Cc: Steve Sistare, Daniel Jordan, linux, schwidefsky, heiko.carstens,
	John Stultz, sboyd, x86, linux-kernel, mingo, Thomas Gleixner,
	hpa

> why we should remove the *extern* keyword?

Hi Dou,

While I am not sure why it was decided to stop using externs in
headers, this is the warning printed by scripts/checkpatch.pl:

CHECK: extern prototypes should be avoided in .h files

To have clean checkpatch output, I removed the externs.

Pavel


* Re: [PATCH v8 6/6] x86/tsc: use tsc early
  2017-11-13  3:47   ` Dou Liyang
@ 2017-11-14 21:46     ` Pavel Tatashin
  2017-11-15  3:53       ` Dou Liyang
  0 siblings, 1 reply; 17+ messages in thread
From: Pavel Tatashin @ 2017-11-14 21:46 UTC (permalink / raw)
  To: Dou Liyang
  Cc: Steve Sistare, Daniel Jordan, linux, schwidefsky, heiko.carstens,
	John Stultz, sboyd, x86, linux-kernel, mingo, Thomas Gleixner,
	hpa

Hi Dou,

Great comments, my replies below:

>>  static inline unsigned long long paravirt_sched_clock(void)
>>  {
>> -       return PVOP_CALL0(unsigned long long, pv_time_ops.sched_clock);
>> +       return PVOP_CALL0(unsigned long long,
>> pv_time_ops.active_sched_clock);
>>  }
>>
>
> Should in the 5th patch

Actually, it has to be in patch 6, because otherwise patch 5 without
patch 6 would cause native_sched_clock() to be used even when a
platform-specific clock is set, which may cause performance regressions
where using the tsc for the clock is not optimal.

> Add definitions for the situation of X86_TSC = no :
>
> #else /* CONFIG_X86_TSC */
> static inline void tsc_early_init(unsigned int khz) { }
> static inline void tsc_early_fini(void) { }

Excellent point, I totally forgot about X86_TSC=no. However, a
better fix is to simply remove the #ifdef CONFIG_X86_TSC around my
functions. Apparently, even with X86_TSC=no we can use the TSC unless
the notsc kernel parameter is passed. This will be in the next patchset.

>
> According to tsc_early_delay_calibrate(), if
> (!boot_cpu_has(X86_FEATURE_TSC) || !tsc_khz), tsc_early_init(tsc_khz)
> will never be called, so the call here is redundant.
>
>>                 return;
>>         }
>>
>> @@ -1302,6 +1385,7 @@ void __init tsc_init(void)
>>         if (!tsc_khz) {
>>                 mark_tsc_unstable("could not calculate TSC khz");
>>                 setup_clear_cpu_cap(X86_FEATURE_TSC_DEADLINE_TIMER);
>> +               tsc_early_fini();
>
>
> ditto

Right, in both cases we still want to call tsc_early_fini(), because
it calls tsc_early_disable() even when tsc_early_init() was never
called. tsc_early_disable() either sets the static branch
__tsc_early_static to false or changes active_sched_clock to the
platform-specific clock, depending on CONFIG_PARAVIRT.
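[Editorial aside: the point above — that the fini path is safe to run even
when init never happened — can be sketched in a few lines of user-space C.
The names below are hypothetical stand-ins, not the kernel code.]

```c
#include <stdint.h>

/* Sketch of the tsc_early_fini() guard: a zero multiplier means
 * tsc_early_init() never ran, so fini only switches the early clock
 * off and skips the offset computation and log message. */
struct cyc2ns_sketch {
	uint32_t mul;
	int64_t offset;
};

static struct cyc2ns_sketch early;  /* zero-initialized, like kernel BSS */
static int early_clock_active = 1;  /* stands in for the static branch */

int early_fini(void)
{
	early_clock_active = 0;     /* always safe, even without init */
	if (early.mul == 0)
		return 0;           /* init never ran: nothing to report */
	return 1;                   /* would compute and log the offset */
}
```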

> BTW, seems you forgot to cc Peter Zijlstra <peterz@infradead.org> in both V7
> and V8 patchsets.

Thank you for noticing this! I will include Peter when I send out
patchset version 9.

Thank you,
Pavel


* Re: [PATCH v8 2/6] time: sync read_boot_clock64() with persistent clock
  2017-11-14 19:57     ` Pavel Tatashin
@ 2017-11-15  2:47       ` Dou Liyang
  2017-11-15  3:56         ` Pavel Tatashin
  0 siblings, 1 reply; 17+ messages in thread
From: Dou Liyang @ 2017-11-15  2:47 UTC (permalink / raw)
  To: Pavel Tatashin
  Cc: Steve Sistare, Daniel Jordan, linux, schwidefsky, heiko.carstens,
	John Stultz, sboyd, x86, linux-kernel, mingo, Thomas Gleixner,
	hpa

Hi Pavel,

At 11/15/2017 03:57 AM, Pavel Tatashin wrote:
>> why we should remove the *extern* keyword?
>
> Hi Dou,
>
> While, I am not sure why it was decided to stop using externs in
> headers, this is a warning printed by scripts/checkpatch.pl:
>
> CHECK: extern prototypes should be avoided in .h files
>
> To have a clean checkpatch output I removed externs.
>

I see.

IMO, using the extern keyword on function prototypes in *.h files
is superfluous, but it doesn't matter for functionality; *extern*
is the default keyword.

AFAIK, it's a code-style issue. In the x86 arch, we prefer to
keep *extern* explicit, so let's keep it as before for
code consistency.
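[Editorial aside: a tiny illustration of the point, with a made-up function
name. Both prototypes below declare the same external-linkage function,
which is why checkpatch considers the keyword redundant, while x86 style
keeps it for consistency.]

```c
/* Function prototypes have external linkage by default, so 'extern'
 * on a prototype is implicit; both declarations are identical and the
 * compiler happily accepts the redeclaration. */
extern int add_one(int x);  /* x86 style: explicit extern */
int add_one(int x);         /* same declaration, extern implied */

int add_one(int x)
{
	return x + 1;
}
```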

Thank,
	dou


* Re: [PATCH v8 6/6] x86/tsc: use tsc early
  2017-11-14 21:46     ` Pavel Tatashin
@ 2017-11-15  3:53       ` Dou Liyang
  2017-11-15 16:01         ` Pavel Tatashin
  0 siblings, 1 reply; 17+ messages in thread
From: Dou Liyang @ 2017-11-15  3:53 UTC (permalink / raw)
  To: Pavel Tatashin
  Cc: Steve Sistare, Daniel Jordan, linux, schwidefsky, heiko.carstens,
	John Stultz, sboyd, x86, linux-kernel, mingo, Thomas Gleixner,
	hpa

Hi Pavel,

At 11/15/2017 05:46 AM, Pavel Tatashin wrote:
> Hi Dou,
>
> Great comments, my replies below:
>
>>>  static inline unsigned long long paravirt_sched_clock(void)
>>>  {
>>> -       return PVOP_CALL0(unsigned long long, pv_time_ops.sched_clock);
>>> +       return PVOP_CALL0(unsigned long long,
>>> pv_time_ops.active_sched_clock);
>>>  }
>>>
>>
>> Should in the 5th patch
>
> Actually, it has to be in patch 6, because otherwise patch 5 without
> patch 6 would cause native_sched_clock() to be used even when a
> platform specific clock is set, thus may cause performance regressions
> where it is not optimal to use tsc for clock.
>
Indeed.

 >> +	tsc_early_disable();
 >> +	__sched_clock_offset = sched_clock_early() - sched_clock();
 >> +	pr_info("sched clock early is finished, offset[%lld.%09lds]\n", t, r);
 >> +}

s/sched clock early/early sched clock/

 >> +#endif /* CONFIG_X86_TSC */
>> Add definitions for the situation of X86_TSC = no :
>>
>> #else /* CONFIG_X86_TSC */
>> static inline void tsc_early_init(unsigned int khz) { }
>> static inline void tsc_early_fini(void) { }
>
> Excellent point, I totally forgot about  X86_TSC = no, however, a
> better fix is to simply remove #ifdef CONFIG_X86_TSC from my
> functions. Apparently, even with X86_TSC=no we can use TSC unless

Agree with removing #ifdef CONFIG_X86_TSC

> notsc kernel parameter is passed. This will be in the next patchset.
>
>>
>> According to tsc_early_delay_calibrate(), if
>> (!boot_cpu_has(X86_FEATURE_TSC) || !tsc_khz), tsc_early_init(tsc_khz)
>> will never be called, so the call here is redundant.
>>
>>>                 return;
>>>         }
>>>
>>> @@ -1302,6 +1385,7 @@ void __init tsc_init(void)
>>>         if (!tsc_khz) {
>>>                 mark_tsc_unstable("could not calculate TSC khz");
>>>                 setup_clear_cpu_cap(X86_FEATURE_TSC_DEADLINE_TIMER);
>>> +               tsc_early_fini();
>>
>>
>> ditto
>
> Right, in both case we still want to call tsc_early_fini(). Because,
> it calls tsc_early_disable() even when tsc_early_init() was never
> called.  tsc_early_disable()  either sets static branch
> __tsc_early_static to false or changes active_sched_clock to be
> platform specific, depending on CONFIG_PARAVIRT.

Yes, the function names confused me; the actual purpose of
tsc_early_fini() is switching the sched clock to the final one, right?

How about replacing *return;* with *goto final_sched_clock;*

Thanks,
	dou.
>
>> BTW, seems you forgot to cc Peter Zijlstra <peterz@infradead.org> in both V7
>> and V8 patchsets.
>
> Thank you for noticing this! I will include Peter when I send out
> patchset version 9.
>
> Thank you,
> Pavel
>
>
>


* Re: [PATCH v8 2/6] time: sync read_boot_clock64() with persistent clock
  2017-11-15  2:47       ` Dou Liyang
@ 2017-11-15  3:56         ` Pavel Tatashin
  0 siblings, 0 replies; 17+ messages in thread
From: Pavel Tatashin @ 2017-11-15  3:56 UTC (permalink / raw)
  To: Dou Liyang
  Cc: Steve Sistare, Daniel Jordan, linux, schwidefsky, heiko.carstens,
	John Stultz, sboyd, x86, linux-kernel, mingo, Thomas Gleixner,
	hpa

> IMO, using the extern keyword on function prototypes in *.h files
> is superfluous, but, It doesn't matter for functionality. *extern*
> is default keywords.
>
> AFAIK, it's a code style problem. In x86 arch, we prefer to
> keep *extern* explicitly, so, let's keep it like before for
> code consistency.

Sounds good, I will restore externs.

Thank you,
Pavel


* Re: [PATCH v8 6/6] x86/tsc: use tsc early
  2017-11-15  3:53       ` Dou Liyang
@ 2017-11-15 16:01         ` Pavel Tatashin
  0 siblings, 0 replies; 17+ messages in thread
From: Pavel Tatashin @ 2017-11-15 16:01 UTC (permalink / raw)
  To: Dou Liyang
  Cc: Steve Sistare, Daniel Jordan, linux, schwidefsky, heiko.carstens,
	John Stultz, sboyd, x86, linux-kernel, mingo, Thomas Gleixner,
	hpa

> s/sched clock early/early sched clock/

Ok, changed

>
> Yes, the function names confused me, the actually purpose of
> tsc_early_fini() is switching schedule clock to the final one. right?
>
> How about replacing *return;* with *goto final_sched_clock;*

Done

I will send new patchset soon, with all the changes included.

Thank you,
Pavel


end of thread, other threads:[~2017-11-15 16:01 UTC | newest]

Thread overview: 17+ messages
2017-11-09  3:01 [PATCH v8 0/6] Early boot time stamps for x86 Pavel Tatashin
2017-11-09  3:01 ` [PATCH v8 1/6] x86/tsc: remove tsc_disabled flag Pavel Tatashin
2017-11-13  3:52   ` Dou Liyang
2017-11-13 18:51     ` Pavel Tatashin
2017-11-09  3:01 ` [PATCH v8 2/6] time: sync read_boot_clock64() with persistent clock Pavel Tatashin
2017-11-13  3:47   ` Dou Liyang
2017-11-14 19:57     ` Pavel Tatashin
2017-11-15  2:47       ` Dou Liyang
2017-11-15  3:56         ` Pavel Tatashin
2017-11-09  3:01 ` [PATCH v8 3/6] x86/time: read_boot_clock64() implementation Pavel Tatashin
2017-11-09  3:01 ` [PATCH v8 4/6] sched: early boot clock Pavel Tatashin
2017-11-09  3:02 ` [PATCH v8 5/6] x86/paravirt: add active_sched_clock to pv_time_ops Pavel Tatashin
2017-11-09  3:02 ` [PATCH v8 6/6] x86/tsc: use tsc early Pavel Tatashin
2017-11-13  3:47   ` Dou Liyang
2017-11-14 21:46     ` Pavel Tatashin
2017-11-15  3:53       ` Dou Liyang
2017-11-15 16:01         ` Pavel Tatashin
