linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/3] tracing: Introduce trace clock tai
@ 2022-04-14  9:18 Kurt Kanzenbach
  2022-04-14  9:18 ` [PATCH v2 1/3] timekeeping: Introduce fast accessor to " Kurt Kanzenbach
                   ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Kurt Kanzenbach @ 2022-04-14  9:18 UTC (permalink / raw)
  To: John Stultz, Thomas Gleixner, Stephen Boyd, Steven Rostedt,
	Ingo Molnar, Jonathan Corbet
  Cc: Richard Cochran, linux-doc, linux-kernel, Kurt Kanzenbach

Hi,

the Linux kernel tracing infrastructure has support for using different clocks
to generate timestamps for trace events. Especially in TSN networks it's useful
to have TAI as trace clock, because the application scheduling is done in
accordance to the network time, which is based on TAI. With a tai trace_clock in
place, it becomes very convenient to correlate network activity with Linux
kernel application traces.

However, there's no fast accessor for CLOCK_TAI yet. Therefore, patch #1 is
adding one. Patch #2 introduces the clock and the last one adds documentation
for it.

Changes since v1:

 - Add missing data_race() annotation for TAI offset which is accessed without
   any form of locking for KCSAN (Thomas)
 - Add missing documentation entry for ktime_get_tai_fast_ns()
 - Minor changelog tweaks

Previous versions:

 - https://lore.kernel.org/all/20220409081300.4762-1-kurt@linutronix.de/

Thanks,
Kurt

Kurt Kanzenbach (3):
  timekeeping: Introduce fast accessor to clock tai
  tracing: Introduce trace clock tai
  tracing: Add documentation for trace clock tai

 Documentation/core-api/timekeeping.rst |  1 +
 Documentation/trace/ftrace.rst         | 12 ++++++++++++
 include/linux/timekeeping.h            |  1 +
 kernel/time/timekeeping.c              | 17 +++++++++++++++++
 kernel/trace/trace.c                   |  1 +
 5 files changed, 32 insertions(+)

-- 
2.30.2


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v2 1/3] timekeeping: Introduce fast accessor to clock tai
  2022-04-14  9:18 [PATCH v2 0/3] tracing: Introduce trace clock tai Kurt Kanzenbach
@ 2022-04-14  9:18 ` Kurt Kanzenbach
  2022-04-14 14:59   ` Thomas Gleixner
                     ` (2 more replies)
  2022-04-14  9:18 ` [PATCH v2 2/3] tracing: Introduce trace " Kurt Kanzenbach
  2022-04-14  9:18 ` [PATCH v2 3/3] tracing: Add documentation for " Kurt Kanzenbach
  2 siblings, 3 replies; 20+ messages in thread
From: Kurt Kanzenbach @ 2022-04-14  9:18 UTC (permalink / raw)
  To: John Stultz, Thomas Gleixner, Stephen Boyd, Steven Rostedt,
	Ingo Molnar, Jonathan Corbet
  Cc: Richard Cochran, linux-doc, linux-kernel, Kurt Kanzenbach

Introduce fast/NMI safe accessor to clock tai for tracing. The Linux kernel
tracing infrastructure has support for using different clocks to generate
timestamps for trace events. Especially in TSN networks it's useful to have TAI
as trace clock, because the application scheduling is done in accordance to the
network time, which is based on TAI. With a tai trace_clock in place, it becomes
very convenient to correlate network activity with Linux kernel application
traces.

Use the same implementation as ktime_get_boot_fast_ns() does by reading the
monotonic time and adding the TAI offset. The same limitations as for the fast
boot implementation apply. The TAI offset may change at run time e.g., by
setting the time or using adjtimex() with an offset. However, these kind of
offset changes are rare events. Nevertheless, the user has to be aware and deal
with it in post processing.

An alternative approach would be to use the same implementation as
ktime_get_real_fast_ns() does. However, this requires to add an additional u64
member to the tk_read_base struct. This struct together with a seqcount is
designed to fit into a single cache line on 64 bit architectures. Adding a new
member would violate this constraint.

Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
---
 Documentation/core-api/timekeeping.rst |  1 +
 include/linux/timekeeping.h            |  1 +
 kernel/time/timekeeping.c              | 17 +++++++++++++++++
 3 files changed, 19 insertions(+)

diff --git a/Documentation/core-api/timekeeping.rst b/Documentation/core-api/timekeeping.rst
index 729e24864fe7..22ec68f24421 100644
--- a/Documentation/core-api/timekeeping.rst
+++ b/Documentation/core-api/timekeeping.rst
@@ -132,6 +132,7 @@ Some additional variants exist for more specialized cases:
 .. c:function:: u64 ktime_get_mono_fast_ns( void )
 		u64 ktime_get_raw_fast_ns( void )
 		u64 ktime_get_boot_fast_ns( void )
+		u64 ktime_get_tai_fast_ns( void )
 		u64 ktime_get_real_fast_ns( void )
 
 	These variants are safe to call from any context, including from
diff --git a/include/linux/timekeeping.h b/include/linux/timekeeping.h
index 78a98bdff76d..fe1e467ba046 100644
--- a/include/linux/timekeeping.h
+++ b/include/linux/timekeeping.h
@@ -177,6 +177,7 @@ static inline u64 ktime_get_raw_ns(void)
 extern u64 ktime_get_mono_fast_ns(void);
 extern u64 ktime_get_raw_fast_ns(void);
 extern u64 ktime_get_boot_fast_ns(void);
+extern u64 ktime_get_tai_fast_ns(void);
 extern u64 ktime_get_real_fast_ns(void);
 
 /*
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index dcdcb85121e4..2c22023fbf5f 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -532,6 +532,23 @@ u64 notrace ktime_get_boot_fast_ns(void)
 }
 EXPORT_SYMBOL_GPL(ktime_get_boot_fast_ns);
 
+/**
+ * ktime_get_tai_fast_ns - NMI safe and fast access to tai clock.
+ *
+ * The same limitations as described for ktime_get_boot_fast_ns() apply. The
+ * mono time and the TAI offset are not read atomically which may yield wrong
+ * readouts. However, an update of the TAI offset is an rare event e.g., caused
+ * by settime or adjtimex with an offset. The user of this function has to deal
+ * with the possibility of wrong timestamps in post processing.
+ */
+u64 notrace ktime_get_tai_fast_ns(void)
+{
+	struct timekeeper *tk = &tk_core.timekeeper;
+
+	return (ktime_get_mono_fast_ns() + ktime_to_ns(data_race(tk->offs_tai)));
+}
+EXPORT_SYMBOL_GPL(ktime_get_tai_fast_ns);
+
 static __always_inline u64 __ktime_get_real_fast(struct tk_fast *tkf, u64 *mono)
 {
 	struct tk_read_base *tkr;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v2 2/3] tracing: Introduce trace clock tai
  2022-04-14  9:18 [PATCH v2 0/3] tracing: Introduce trace clock tai Kurt Kanzenbach
  2022-04-14  9:18 ` [PATCH v2 1/3] timekeeping: Introduce fast accessor to " Kurt Kanzenbach
@ 2022-04-14  9:18 ` Kurt Kanzenbach
  2022-04-14  9:18 ` [PATCH v2 3/3] tracing: Add documentation for " Kurt Kanzenbach
  2 siblings, 0 replies; 20+ messages in thread
From: Kurt Kanzenbach @ 2022-04-14  9:18 UTC (permalink / raw)
  To: John Stultz, Thomas Gleixner, Stephen Boyd, Steven Rostedt,
	Ingo Molnar, Jonathan Corbet
  Cc: Richard Cochran, linux-doc, linux-kernel, Kurt Kanzenbach

A fast/NMI safe accessor for CLOCK_TAI has been introduced.
Use it for adding the additional trace clock "tai".

Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
---
 kernel/trace/trace.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index f4de111fa18f..7be87b29c6c1 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -1542,6 +1542,7 @@ static struct {
 	{ ktime_get_mono_fast_ns,	"mono",		1 },
 	{ ktime_get_raw_fast_ns,	"mono_raw",	1 },
 	{ ktime_get_boot_fast_ns,	"boot",		1 },
+	{ ktime_get_tai_fast_ns,	"tai",		1 },
 	ARCH_TRACE_CLOCKS
 };
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v2 3/3] tracing: Add documentation for trace clock tai
  2022-04-14  9:18 [PATCH v2 0/3] tracing: Introduce trace clock tai Kurt Kanzenbach
  2022-04-14  9:18 ` [PATCH v2 1/3] timekeeping: Introduce fast accessor to " Kurt Kanzenbach
  2022-04-14  9:18 ` [PATCH v2 2/3] tracing: Introduce trace " Kurt Kanzenbach
@ 2022-04-14  9:18 ` Kurt Kanzenbach
  2022-04-14 11:27   ` Bagas Sanjaya
  2 siblings, 1 reply; 20+ messages in thread
From: Kurt Kanzenbach @ 2022-04-14  9:18 UTC (permalink / raw)
  To: John Stultz, Thomas Gleixner, Stephen Boyd, Steven Rostedt,
	Ingo Molnar, Jonathan Corbet
  Cc: Richard Cochran, linux-doc, linux-kernel, Kurt Kanzenbach

Add documentation for newly introduced trace clock "tai".
This clock corresponds to CLOCK_TAI.

Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
---
 Documentation/trace/ftrace.rst | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/Documentation/trace/ftrace.rst b/Documentation/trace/ftrace.rst
index 45b8c56af67a..b37dc19e4d40 100644
--- a/Documentation/trace/ftrace.rst
+++ b/Documentation/trace/ftrace.rst
@@ -517,6 +517,18 @@ of ftrace. Here is a list of some of the key files:
 		processing should be able to handle them. See comments in the
 		ktime_get_boot_fast_ns() function for more information.
 
+	tai:
+		This is the tai clock (CLOCK_TAI) and is derived from the wall-
+		clock time. However, this clock does not experience
+		discontinuities and backwards jumps caused by NTP inserting leap
+		seconds. Since the clock access is designed for use in tracing,
+		side effects are possible. The clock access may yield wrong
+		readouts in case the internal TAI offset is updated e.g., caused
+		by setting the system time or using adjtimex() with an offset.
+		These effects are rare and post processing should be able to
+		handle them. See comments in the ktime_get_tai_fast_ns()
+		function for more information.
+
 	To set a clock, simply echo the clock name into this file::
 
 	  # echo global > trace_clock
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 3/3] tracing: Add documentation for trace clock tai
  2022-04-14  9:18 ` [PATCH v2 3/3] tracing: Add documentation for " Kurt Kanzenbach
@ 2022-04-14 11:27   ` Bagas Sanjaya
  2022-04-14 11:38     ` Kurt Kanzenbach
  0 siblings, 1 reply; 20+ messages in thread
From: Bagas Sanjaya @ 2022-04-14 11:27 UTC (permalink / raw)
  To: Kurt Kanzenbach, John Stultz, Thomas Gleixner, Stephen Boyd,
	Steven Rostedt, Ingo Molnar, Jonathan Corbet
  Cc: Richard Cochran, linux-doc, linux-kernel

On 4/14/22 16:18, Kurt Kanzenbach wrote:
> +	tai:
> +		This is the tai clock (CLOCK_TAI) and is derived from the wall-
> +		clock time. However, this clock does not experience
> +		discontinuities and backwards jumps caused by NTP inserting leap
> +		seconds. Since the clock access is designed for use in tracing,
> +		side effects are possible. The clock access may yield wrong
> +		readouts in case the internal TAI offset is updated e.g., caused
> +		by setting the system time or using adjtimex() with an offset.
> +		These effects are rare and post processing should be able to
> +		handle them. See comments in the ktime_get_tai_fast_ns()
> +		function for more information.
> +

In what file are the ktime_get_tai_fast_ns() comments?

-- 
An old man doll... just what I always wanted! - Clara

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 3/3] tracing: Add documentation for trace clock tai
  2022-04-14 11:27   ` Bagas Sanjaya
@ 2022-04-14 11:38     ` Kurt Kanzenbach
  0 siblings, 0 replies; 20+ messages in thread
From: Kurt Kanzenbach @ 2022-04-14 11:38 UTC (permalink / raw)
  To: Bagas Sanjaya, John Stultz, Thomas Gleixner, Stephen Boyd,
	Steven Rostedt, Ingo Molnar, Jonathan Corbet
  Cc: Richard Cochran, linux-doc, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 951 bytes --]

On Thu Apr 14 2022, Bagas Sanjaya wrote:
> On 4/14/22 16:18, Kurt Kanzenbach wrote:
>> +	tai:
>> +		This is the tai clock (CLOCK_TAI) and is derived from the wall-
>> +		clock time. However, this clock does not experience
>> +		discontinuities and backwards jumps caused by NTP inserting leap
>> +		seconds. Since the clock access is designed for use in tracing,
>> +		side effects are possible. The clock access may yield wrong
>> +		readouts in case the internal TAI offset is updated e.g., caused
>> +		by setting the system time or using adjtimex() with an offset.
>> +		These effects are rare and post processing should be able to
>> +		handle them. See comments in the ktime_get_tai_fast_ns()
>> +		function for more information.
>> +
>
> In what file are the ktime_get_tai_fast_ns() comments?

In kernel/time/timekeeping.c. That function is introduced in patch #1
and has kernel doc comments. Similar to ktime_get_boot_fast_ns().

Thanks,
Kurt

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 861 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/3] timekeeping: Introduce fast accessor to clock tai
  2022-04-14  9:18 ` [PATCH v2 1/3] timekeeping: Introduce fast accessor to " Kurt Kanzenbach
@ 2022-04-14 14:59   ` Thomas Gleixner
  2022-04-18 17:15     ` Steven Rostedt
  2022-04-14 15:01   ` [tip: timers/core] " tip-bot2 for Kurt Kanzenbach
  2022-04-26 21:53   ` [PATCH v2 1/3] " Steven Rostedt
  2 siblings, 1 reply; 20+ messages in thread
From: Thomas Gleixner @ 2022-04-14 14:59 UTC (permalink / raw)
  To: Kurt Kanzenbach, John Stultz, Stephen Boyd, Steven Rostedt,
	Ingo Molnar, Jonathan Corbet
  Cc: Richard Cochran, linux-doc, linux-kernel, Kurt Kanzenbach

On Thu, Apr 14 2022 at 11:18, Kurt Kanzenbach wrote:
> Introduce fast/NMI safe accessor to clock tai for tracing. The Linux kernel
> tracing infrastructure has support for using different clocks to generate
> timestamps for trace events. Especially in TSN networks it's useful to have TAI
> as trace clock, because the application scheduling is done in accordance to the
> network time, which is based on TAI. With a tai trace_clock in place, it becomes
> very convenient to correlate network activity with Linux kernel application
> traces.

Steven, this is available for you to pull via:

  git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip tai-for-tracing

It's a single tagged commit on top of 5.18-rc2

Thanks,

        tglx


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [tip: timers/core] timekeeping: Introduce fast accessor to clock tai
  2022-04-14  9:18 ` [PATCH v2 1/3] timekeeping: Introduce fast accessor to " Kurt Kanzenbach
  2022-04-14 14:59   ` Thomas Gleixner
@ 2022-04-14 15:01   ` tip-bot2 for Kurt Kanzenbach
  2022-04-26 21:53   ` [PATCH v2 1/3] " Steven Rostedt
  2 siblings, 0 replies; 20+ messages in thread
From: tip-bot2 for Kurt Kanzenbach @ 2022-04-14 15:01 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Kurt Kanzenbach, Thomas Gleixner, Steven Rostedt, x86, linux-kernel

The following commit has been merged into the timers/core branch of tip:

Commit-ID:     3dc6ffae2da201284cb24af66af77ee0bbb2efaa
Gitweb:        https://git.kernel.org/tip/3dc6ffae2da201284cb24af66af77ee0bbb2efaa
Author:        Kurt Kanzenbach <kurt@linutronix.de>
AuthorDate:    Thu, 14 Apr 2022 11:18:03 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 14 Apr 2022 16:19:30 +02:00

timekeeping: Introduce fast accessor to clock tai

Introduce fast/NMI safe accessor to clock tai for tracing. The Linux kernel
tracing infrastructure has support for using different clocks to generate
timestamps for trace events. Especially in TSN networks it's useful to have TAI
as trace clock, because the application scheduling is done in accordance to the
network time, which is based on TAI. With a tai trace_clock in place, it becomes
very convenient to correlate network activity with Linux kernel application
traces.

Use the same implementation as ktime_get_boot_fast_ns() does by reading the
monotonic time and adding the TAI offset. The same limitations as for the fast
boot implementation apply. The TAI offset may change at run time e.g., by
setting the time or using adjtimex() with an offset. However, these kind of
offset changes are rare events. Nevertheless, the user has to be aware and deal
with it in post processing.

An alternative approach would be to use the same implementation as
ktime_get_real_fast_ns() does. However, this requires to add an additional u64
member to the tk_read_base struct. This struct together with a seqcount is
designed to fit into a single cache line on 64 bit architectures. Adding a new
member would violate this constraint.

Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20220414091805.89667-2-kurt@linutronix.de

---
 Documentation/core-api/timekeeping.rst |  1 +
 include/linux/timekeeping.h            |  1 +
 kernel/time/timekeeping.c              | 17 +++++++++++++++++
 3 files changed, 19 insertions(+)

diff --git a/Documentation/core-api/timekeeping.rst b/Documentation/core-api/timekeeping.rst
index 729e248..22ec68f 100644
--- a/Documentation/core-api/timekeeping.rst
+++ b/Documentation/core-api/timekeeping.rst
@@ -132,6 +132,7 @@ Some additional variants exist for more specialized cases:
 .. c:function:: u64 ktime_get_mono_fast_ns( void )
 		u64 ktime_get_raw_fast_ns( void )
 		u64 ktime_get_boot_fast_ns( void )
+		u64 ktime_get_tai_fast_ns( void )
 		u64 ktime_get_real_fast_ns( void )
 
 	These variants are safe to call from any context, including from
diff --git a/include/linux/timekeeping.h b/include/linux/timekeeping.h
index 78a98bd..fe1e467 100644
--- a/include/linux/timekeeping.h
+++ b/include/linux/timekeeping.h
@@ -177,6 +177,7 @@ static inline u64 ktime_get_raw_ns(void)
 extern u64 ktime_get_mono_fast_ns(void);
 extern u64 ktime_get_raw_fast_ns(void);
 extern u64 ktime_get_boot_fast_ns(void);
+extern u64 ktime_get_tai_fast_ns(void);
 extern u64 ktime_get_real_fast_ns(void);
 
 /*
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index dcdcb85..2c22023 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -532,6 +532,23 @@ u64 notrace ktime_get_boot_fast_ns(void)
 }
 EXPORT_SYMBOL_GPL(ktime_get_boot_fast_ns);
 
+/**
+ * ktime_get_tai_fast_ns - NMI safe and fast access to tai clock.
+ *
+ * The same limitations as described for ktime_get_boot_fast_ns() apply. The
+ * mono time and the TAI offset are not read atomically which may yield wrong
+ * readouts. However, an update of the TAI offset is an rare event e.g., caused
+ * by settime or adjtimex with an offset. The user of this function has to deal
+ * with the possibility of wrong timestamps in post processing.
+ */
+u64 notrace ktime_get_tai_fast_ns(void)
+{
+	struct timekeeper *tk = &tk_core.timekeeper;
+
+	return (ktime_get_mono_fast_ns() + ktime_to_ns(data_race(tk->offs_tai)));
+}
+EXPORT_SYMBOL_GPL(ktime_get_tai_fast_ns);
+
 static __always_inline u64 __ktime_get_real_fast(struct tk_fast *tkf, u64 *mono)
 {
 	struct tk_read_base *tkr;

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/3] timekeeping: Introduce fast accessor to clock tai
  2022-04-14 14:59   ` Thomas Gleixner
@ 2022-04-18 17:15     ` Steven Rostedt
  0 siblings, 0 replies; 20+ messages in thread
From: Steven Rostedt @ 2022-04-18 17:15 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Kurt Kanzenbach, John Stultz, Stephen Boyd, Ingo Molnar,
	Jonathan Corbet, Richard Cochran, linux-doc, linux-kernel,
	Linus Torvalds

On Thu, 14 Apr 2022 16:59:24 +0200
Thomas Gleixner <tglx@linutronix.de> wrote:

> Steven, this is available for you to pull via:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip tai-for-tracing
> 
> It's a single tagged commit on top of 5.18-rc2

Thanks. As I have not started putting together my next merge window series
yet, I can start off of this commit.

-- Steve

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/3] timekeeping: Introduce fast accessor to clock tai
  2022-04-14  9:18 ` [PATCH v2 1/3] timekeeping: Introduce fast accessor to " Kurt Kanzenbach
  2022-04-14 14:59   ` Thomas Gleixner
  2022-04-14 15:01   ` [tip: timers/core] " tip-bot2 for Kurt Kanzenbach
@ 2022-04-26 21:53   ` Steven Rostedt
  2022-04-27  8:17     ` Thomas Gleixner
  2022-04-27  8:38     ` Kurt Kanzenbach
  2 siblings, 2 replies; 20+ messages in thread
From: Steven Rostedt @ 2022-04-26 21:53 UTC (permalink / raw)
  To: Kurt Kanzenbach
  Cc: John Stultz, Thomas Gleixner, Stephen Boyd, Ingo Molnar,
	Jonathan Corbet, Richard Cochran, linux-doc, linux-kernel

On Thu, 14 Apr 2022 11:18:03 +0200
Kurt Kanzenbach <kurt@linutronix.de> wrote:

I finally ran this series through my tests, and it has some issues.

> Introduce fast/NMI safe accessor to clock tai for tracing. The Linux kernel
> tracing infrastructure has support for using different clocks to generate
> timestamps for trace events. Especially in TSN networks it's useful to have TAI
> as trace clock, because the application scheduling is done in accordance to the
> network time, which is based on TAI. With a tai trace_clock in place, it becomes
> very convenient to correlate network activity with Linux kernel application
> traces.
> 
> Use the same implementation as ktime_get_boot_fast_ns() does by reading the
> monotonic time and adding the TAI offset. The same limitations as for the fast
> boot implementation apply. The TAI offset may change at run time e.g., by
> setting the time or using adjtimex() with an offset. However, these kind of
> offset changes are rare events. Nevertheless, the user has to be aware and deal
> with it in post processing.
> 
> An alternative approach would be to use the same implementation as
> ktime_get_real_fast_ns() does. However, this requires to add an additional u64
> member to the tk_read_base struct. This struct together with a seqcount is
> designed to fit into a single cache line on 64 bit architectures. Adding a new
> member would violate this constraint.
> 
> Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
> ---
>  Documentation/core-api/timekeeping.rst |  1 +
>  include/linux/timekeeping.h            |  1 +
>  kernel/time/timekeeping.c              | 17 +++++++++++++++++
>  3 files changed, 19 insertions(+)
> 
> diff --git a/Documentation/core-api/timekeeping.rst b/Documentation/core-api/timekeeping.rst
> index 729e24864fe7..22ec68f24421 100644
> --- a/Documentation/core-api/timekeeping.rst
> +++ b/Documentation/core-api/timekeeping.rst
> @@ -132,6 +132,7 @@ Some additional variants exist for more specialized cases:
>  .. c:function:: u64 ktime_get_mono_fast_ns( void )
>  		u64 ktime_get_raw_fast_ns( void )
>  		u64 ktime_get_boot_fast_ns( void )
> +		u64 ktime_get_tai_fast_ns( void )
>  		u64 ktime_get_real_fast_ns( void )
>  
>  	These variants are safe to call from any context, including from
> diff --git a/include/linux/timekeeping.h b/include/linux/timekeeping.h
> index 78a98bdff76d..fe1e467ba046 100644
> --- a/include/linux/timekeeping.h
> +++ b/include/linux/timekeeping.h
> @@ -177,6 +177,7 @@ static inline u64 ktime_get_raw_ns(void)
>  extern u64 ktime_get_mono_fast_ns(void);
>  extern u64 ktime_get_raw_fast_ns(void);
>  extern u64 ktime_get_boot_fast_ns(void);
> +extern u64 ktime_get_tai_fast_ns(void);
>  extern u64 ktime_get_real_fast_ns(void);
>  
>  /*
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index dcdcb85121e4..2c22023fbf5f 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -532,6 +532,23 @@ u64 notrace ktime_get_boot_fast_ns(void)
>  }
>  EXPORT_SYMBOL_GPL(ktime_get_boot_fast_ns);
>  
> +/**
> + * ktime_get_tai_fast_ns - NMI safe and fast access to tai clock.
> + *
> + * The same limitations as described for ktime_get_boot_fast_ns() apply. The
> + * mono time and the TAI offset are not read atomically which may yield wrong
> + * readouts. However, an update of the TAI offset is an rare event e.g., caused
> + * by settime or adjtimex with an offset. The user of this function has to deal
> + * with the possibility of wrong timestamps in post processing.
> + */
> +u64 notrace ktime_get_tai_fast_ns(void)
> +{
> +	struct timekeeper *tk = &tk_core.timekeeper;
> +
> +	return (ktime_get_mono_fast_ns() + ktime_to_ns(data_race(tk->offs_tai)));

As you are using this for tracing, can you open code the
ktime_get_mono_fast_ns(), otherwise we need to mark that function as
notrace. Not to mention, this is a fast path and using the noinline of
__ktime_get_fast_ns() should be less overhead.

That said, I hit this too:

            less-5071    [000] d.h2. 498087876.351330: do_raw_spin_trylock <-_raw_spin_lock
            less-5071    [000] d.h4. 498087876.351334: ktime_get_mono_fast_ns <-ktime_get_tai_fast_ns
            less-5071    [000] d.h5. 498087876.351334: ktime_get_mono_fast_ns <-ktime_get_tai_fast_ns
            less-5071    [000] d.h3. 498087876.351334: rcu_read_lock_sched_held <-lock_acquired
            less-5071    [000] d.h5. 498087876.351337: ktime_get_mono_fast_ns <-ktime_get_tai_fast_ns
    kworker/u8:1-45      [003] d.h7. 1651009380.982749: ktime_get_mono_fast_ns <-ktime_get_tai_fast_ns
    kworker/u8:1-45      [003] d.h7. 1651009380.982749: ktime_get_mono_fast_ns <-ktime_get_tai_fast_ns
    kworker/u8:1-45      [003] d.h5. 1651009380.982749: rcu_read_lock_held_common <-rcu_read_lock_sched_held
    kworker/u8:1-45      [003] d.h7. 498087876.375905: ktime_get_mono_fast_ns <-ktime_get_tai_fast_ns
    kworker/u8:1-45      [003] d.h7. 498087876.375905: ktime_get_mono_fast_ns <-ktime_get_tai_fast_ns
    kworker/u8:1-45      [003] d.h5. 498087876.375905: update_cfs_group <-task_tick_fair
    kworker/u8:1-45      [003] d.h7. 498087876.375909: ktime_get_mono_fast_ns <-ktime_get_tai_fast_ns

The clock seems to be toggling between 1651009380 and 498087876 causing the
ftrace ring buffer to shutdown (it doesn't allow for time to go backwards).

This is running on a 32 bit x86.

-- Steve


> +}
> +EXPORT_SYMBOL_GPL(ktime_get_tai_fast_ns);
> +
>  static __always_inline u64 __ktime_get_real_fast(struct tk_fast *tkf, u64 *mono)
>  {
>  	struct tk_read_base *tkr;


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/3] timekeeping: Introduce fast accessor to clock tai
  2022-04-26 21:53   ` [PATCH v2 1/3] " Steven Rostedt
@ 2022-04-27  8:17     ` Thomas Gleixner
  2022-04-27  8:38     ` Kurt Kanzenbach
  1 sibling, 0 replies; 20+ messages in thread
From: Thomas Gleixner @ 2022-04-27  8:17 UTC (permalink / raw)
  To: Steven Rostedt, Kurt Kanzenbach
  Cc: John Stultz, Stephen Boyd, Ingo Molnar, Jonathan Corbet,
	Richard Cochran, linux-doc, linux-kernel

On Tue, Apr 26 2022 at 17:53, Steven Rostedt wrote:
>> +/**
>> + * ktime_get_tai_fast_ns - NMI safe and fast access to tai clock.
>> + *
>> + * The same limitations as described for ktime_get_boot_fast_ns() apply. The
>> + * mono time and the TAI offset are not read atomically which may yield wrong
>> + * readouts. However, an update of the TAI offset is an rare event e.g., caused
>> + * by settime or adjtimex with an offset. The user of this function has to deal
>> + * with the possibility of wrong timestamps in post processing.
>> + */
>> +u64 notrace ktime_get_tai_fast_ns(void)
>> +{
>> +	struct timekeeper *tk = &tk_core.timekeeper;
>> +
>> +	return (ktime_get_mono_fast_ns() + ktime_to_ns(data_race(tk->offs_tai)));
>
> As you are using this for tracing, can you open code the
> ktime_get_mono_fast_ns(), otherwise we need to mark that function as
> notrace.

which we should do anyway....

kernel/trace/trace.c:   { ktime_get_mono_fast_ns,       "mono", 1 },

> Not to mention, this is a fast path and using the noinline of
> __ktime_get_fast_ns() should be less overhead.
>
> That said, I hit this too:
>
>             less-5071    [000] d.h2. 498087876.351330: do_raw_spin_trylock <-_raw_spin_lock
>             less-5071    [000] d.h4. 498087876.351334: ktime_get_mono_fast_ns <-ktime_get_tai_fast_ns
>             less-5071    [000] d.h5. 498087876.351334: ktime_get_mono_fast_ns <-ktime_get_tai_fast_ns
>             less-5071    [000] d.h3. 498087876.351334: rcu_read_lock_sched_held <-lock_acquired
>             less-5071    [000] d.h5. 498087876.351337: ktime_get_mono_fast_ns <-ktime_get_tai_fast_ns
>     kworker/u8:1-45      [003] d.h7. 1651009380.982749: ktime_get_mono_fast_ns <-ktime_get_tai_fast_ns
>     kworker/u8:1-45      [003] d.h7. 1651009380.982749: ktime_get_mono_fast_ns <-ktime_get_tai_fast_ns
>     kworker/u8:1-45      [003] d.h5. 1651009380.982749: rcu_read_lock_held_common <-rcu_read_lock_sched_held
>     kworker/u8:1-45      [003] d.h7. 498087876.375905: ktime_get_mono_fast_ns <-ktime_get_tai_fast_ns
>     kworker/u8:1-45      [003] d.h7. 498087876.375905: ktime_get_mono_fast_ns <-ktime_get_tai_fast_ns
>     kworker/u8:1-45      [003] d.h5. 498087876.375905: update_cfs_group <-task_tick_fair
>     kworker/u8:1-45      [003] d.h7. 498087876.375909: ktime_get_mono_fast_ns <-ktime_get_tai_fast_ns
>
> The clock seems to be toggling between 1651009380 and 498087876 causing the
> ftrace ring buffer to shutdown (it doesn't allow for time to go backwards).

That does not make sense. offs_tai changes only when clock is set via
adjtimex() or settimeofday().

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/3] timekeeping: Introduce fast accessor to clock tai
  2022-04-26 21:53   ` [PATCH v2 1/3] " Steven Rostedt
  2022-04-27  8:17     ` Thomas Gleixner
@ 2022-04-27  8:38     ` Kurt Kanzenbach
  2022-04-27 15:27       ` Steven Rostedt
  1 sibling, 1 reply; 20+ messages in thread
From: Kurt Kanzenbach @ 2022-04-27  8:38 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: John Stultz, Thomas Gleixner, Stephen Boyd, Ingo Molnar,
	Jonathan Corbet, Richard Cochran, linux-doc, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 3245 bytes --]

On Tue Apr 26 2022, Steven Rostedt wrote:
> On Thu, 14 Apr 2022 11:18:03 +0200
> Kurt Kanzenbach <kurt@linutronix.de> wrote:
>
> I finally ran this series through my tests, and it has some issues.

Thanks!

[snip]

>> +/**
>> + * ktime_get_tai_fast_ns - NMI safe and fast access to tai clock.
>> + *
>> + * The same limitations as described for ktime_get_boot_fast_ns() apply. The
>> + * mono time and the TAI offset are not read atomically which may yield wrong
>> + * readouts. However, an update of the TAI offset is an rare event e.g., caused
>> + * by settime or adjtimex with an offset. The user of this function has to deal
>> + * with the possibility of wrong timestamps in post processing.
>> + */
>> +u64 notrace ktime_get_tai_fast_ns(void)
>> +{
>> +	struct timekeeper *tk = &tk_core.timekeeper;
>> +
>> +	return (ktime_get_mono_fast_ns() + ktime_to_ns(data_race(tk->offs_tai)));
>
> As you are using this for tracing, can you open code the
> ktime_get_mono_fast_ns(), otherwise we need to mark that function as
> notrace. Not to mention, this is a fast path and using the noinline of
> __ktime_get_fast_ns() should be less overhead.

Looking at the other functions used for tracing:

 - ktime_get_mono_fast_ns - no annotations
 - ktime_get_raw_fast_ns  - no annotations
 - ktime_get_boot_fast_ns - notrace
 - ktime_get_tai_fast_ns  - notrace

Seems a little bit inconsistent.

>
> That said, I hit this too:
>
>             less-5071    [000] d.h2. 498087876.351330: do_raw_spin_trylock <-_raw_spin_lock
>             less-5071    [000] d.h4. 498087876.351334: ktime_get_mono_fast_ns <-ktime_get_tai_fast_ns
>             less-5071    [000] d.h5. 498087876.351334: ktime_get_mono_fast_ns <-ktime_get_tai_fast_ns
>             less-5071    [000] d.h3. 498087876.351334: rcu_read_lock_sched_held <-lock_acquired
>             less-5071    [000] d.h5. 498087876.351337: ktime_get_mono_fast_ns <-ktime_get_tai_fast_ns
>     kworker/u8:1-45      [003] d.h7. 1651009380.982749: ktime_get_mono_fast_ns <-ktime_get_tai_fast_ns
>     kworker/u8:1-45      [003] d.h7. 1651009380.982749: ktime_get_mono_fast_ns <-ktime_get_tai_fast_ns
>     kworker/u8:1-45      [003] d.h5. 1651009380.982749: rcu_read_lock_held_common <-rcu_read_lock_sched_held
>     kworker/u8:1-45      [003] d.h7. 498087876.375905: ktime_get_mono_fast_ns <-ktime_get_tai_fast_ns
>     kworker/u8:1-45      [003] d.h7. 498087876.375905: ktime_get_mono_fast_ns <-ktime_get_tai_fast_ns
>     kworker/u8:1-45      [003] d.h5. 498087876.375905: update_cfs_group <-task_tick_fair
>     kworker/u8:1-45      [003] d.h7. 498087876.375909: ktime_get_mono_fast_ns <-ktime_get_tai_fast_ns
>
> The clock seems to be toggling between 1651009380 and 498087876 causing the
> ftrace ring buffer to shutdown (it doesn't allow for time to go backwards).
>
> This is running on a 32 bit x86.

Hm. The only problem is that the TAI offset may change. That should only
happen if the time is set or similar. For the TSN use case I've ran
ptp4l and phc2sys. phc2sys in the default configuration sets the offset
hard once and uses frequency adjustments from that point on. I didn't
observe any time jumps. Nevertheless, my test systems is based on
arm64. I'll do some testing on x86.

Thanks,
Kurt

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 861 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/3] timekeeping: Introduce fast accessor to clock tai
  2022-04-27  8:38     ` Kurt Kanzenbach
@ 2022-04-27 15:27       ` Steven Rostedt
  2022-04-27 17:22         ` Steven Rostedt
                           ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Steven Rostedt @ 2022-04-27 15:27 UTC (permalink / raw)
  To: Kurt Kanzenbach
  Cc: John Stultz, Thomas Gleixner, Stephen Boyd, Ingo Molnar,
	Jonathan Corbet, Richard Cochran, linux-doc, linux-kernel

On Wed, 27 Apr 2022 10:38:59 +0200
Kurt Kanzenbach <kurt@linutronix.de> wrote:

> Looking at the other functions used for tracing:
> 
>  - ktime_get_mono_fast_ns - no annotations
>  - ktime_get_raw_fast_ns  - no annotations
>  - ktime_get_boot_fast_ns - notrace
>  - ktime_get_tai_fast_ns  - notrace

Yeah, all should be notrace.

> 
> Seems a little bit inconsistent.
> 
> >
> > That said, I hit this too:
> >
> >             less-5071    [000] d.h2. 498087876.351330: do_raw_spin_trylock <-_raw_spin_lock
> >             less-5071    [000] d.h4. 498087876.351334: ktime_get_mono_fast_ns <-ktime_get_tai_fast_ns
> >             less-5071    [000] d.h5. 498087876.351334: ktime_get_mono_fast_ns <-ktime_get_tai_fast_ns
> >             less-5071    [000] d.h3. 498087876.351334: rcu_read_lock_sched_held <-lock_acquired
> >             less-5071    [000] d.h5. 498087876.351337: ktime_get_mono_fast_ns <-ktime_get_tai_fast_ns
> >     kworker/u8:1-45      [003] d.h7. 1651009380.982749: ktime_get_mono_fast_ns <-ktime_get_tai_fast_ns
> >     kworker/u8:1-45      [003] d.h7. 1651009380.982749: ktime_get_mono_fast_ns <-ktime_get_tai_fast_ns
> >     kworker/u8:1-45      [003] d.h5. 1651009380.982749: rcu_read_lock_held_common <-rcu_read_lock_sched_held
> >     kworker/u8:1-45      [003] d.h7. 498087876.375905: ktime_get_mono_fast_ns <-ktime_get_tai_fast_ns
> >     kworker/u8:1-45      [003] d.h7. 498087876.375905: ktime_get_mono_fast_ns <-ktime_get_tai_fast_ns
> >     kworker/u8:1-45      [003] d.h5. 498087876.375905: update_cfs_group <-task_tick_fair
> >     kworker/u8:1-45      [003] d.h7. 498087876.375909: ktime_get_mono_fast_ns <-ktime_get_tai_fast_ns
> >
> > The clock seems to be toggling between 1651009380 and 498087876 causing the
> > ftrace ring buffer to shutdown (it doesn't allow for time to go backwards).
> >
> > This is running on a 32 bit x86.  
> 
> Hm. The only problem is that the TAI offset may change. That should only
> happen if the time is set or similar. For the TSN use case I've ran
> ptp4l and phc2sys. phc2sys in the default configuration sets the offset
> hard once and uses frequency adjustments from that point on. I didn't
> observe any time jumps. Nevertheless, my test systems is based on
> arm64. I'll do some testing on x86.


I'm able to trigger this on x86 64bit too.

One thing I noticed, is that the two numbers I have (from a different
trace, but very similar to the above)

$ printf "%llx\n" 498151194674148935
6e9c9df4afd3647

$ printf "%llx\n" 1651072699280995911
16e9c9df4afd3647

That is, the last nibble either is 0 or 1, causing the change?

06e9c9df4afd3647
16e9c9df4afd3647

The numbers are the same except for that last nibble.

-- Steve




^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/3] timekeeping: Introduce fast accessor to clock tai
  2022-04-27 15:27       ` Steven Rostedt
@ 2022-04-27 17:22         ` Steven Rostedt
  2022-04-27 19:23           ` Steven Rostedt
  2022-04-27 20:06         ` Thomas Gleixner
  2022-04-27 20:42         ` Kurt Kanzenbach
  2 siblings, 1 reply; 20+ messages in thread
From: Steven Rostedt @ 2022-04-27 17:22 UTC (permalink / raw)
  To: Kurt Kanzenbach
  Cc: John Stultz, Thomas Gleixner, Stephen Boyd, Ingo Molnar,
	Jonathan Corbet, Richard Cochran, linux-doc, linux-kernel

On Wed, 27 Apr 2022 11:27:59 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> I'm able to trigger this on x86 64bit too.
> 
> One thing I noticed, is that the two numbers I have (from a different
> trace, but very similar to the above)
> 
> $ printf "%llx\n" 498151194674148935
> 6e9c9df4afd3647
> 
> $ printf "%llx\n" 1651072699280995911
> 16e9c9df4afd3647
> 
> That is, the last nibble either is 0 or 1, causing the change?
> 
> 06e9c9df4afd3647
> 16e9c9df4afd3647
> 
> The numbers are the same except for that last nibble.

And I just realized what the issue is. The ring buffer only allows for 59
bits of timestamp, as there's times it needs to inject a timestamp into the
buffer, which is 16 bytes, and the first 5 bits are meta data, leaving 59
bits for the timestamp. Since no timestamp came close to that before this
timer, everything worked.

Thus, when a full timestamp is injected, it takes off the 5 MSBs of the
timestamp (which took off that 1) and that is what is happening.

I can see if I can play some games to detect this and replace the top 5
bits with the saved timestamp at the head of the sub buffer.

-- Steve

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/3] timekeeping: Introduce fast accessor to clock tai
  2022-04-27 17:22         ` Steven Rostedt
@ 2022-04-27 19:23           ` Steven Rostedt
  0 siblings, 0 replies; 20+ messages in thread
From: Steven Rostedt @ 2022-04-27 19:23 UTC (permalink / raw)
  To: Kurt Kanzenbach
  Cc: John Stultz, Thomas Gleixner, Stephen Boyd, Ingo Molnar,
	Jonathan Corbet, Richard Cochran, linux-doc, linux-kernel

On Wed, 27 Apr 2022 13:22:05 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> I can see if I can play some games to detect this and replace the top 5
> bits with the saved timestamp at the head of the sub buffer.

This appears to fix the issue. It adds a function when reading the absolute
time stamp comparing it to a previous time stamp. If the previous time
stamp has any of the 5 MSB set, then it will OR it into the absolute time
stamp, and then compare it to the previous time stamp to check for overflow.

It also does not complain of "big deltas" if the update is an absolute
time stamp and the current time stamp has any of the 5 MSB set.

I will need to update libtraceevent to handle this case as well.

But looks like there's nothing on your end that needs to be done. Probably
just set those other functions you found to notrace. But that can be a
separate patch.

-- Steve

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 655d6db3e3c3..3a0c7ed0e93f 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -29,6 +29,14 @@
 
 #include <asm/local.h>
 
+/*
+ * The "absolute" timestamp in the buffer is only 59 bits.
+ * If a clock has the 5 MSBs set, it needs to be saved and
+ * reinserted.
+ */
+#define TS_MSB		(0xf8ULL << 56)
+#define ABS_TS_MASK	(~TS_MSB)
+
 static void update_pages_handler(struct work_struct *work);
 
 /*
@@ -783,6 +791,24 @@ static inline void verify_event(struct ring_buffer_per_cpu *cpu_buffer,
 }
 #endif
 
+/*
+ * The absolute time stamp drops the 5 MSBs and some clocks may
+ * require them. The rb_fix_abs_ts() will take a previous full
+ * time stamp, and add the 5 MSB of that time stamp on to the
+ * saved absolute time stamp. Then they are compared in case of
+ * the unlikely event that the latest time stamp incremented
+ * the 5 MSB.
+ */
+static inline u64 rb_fix_abs_ts(u64 abs, u64 save_ts)
+{
+	if (save_ts & TS_MSB) {
+		abs |= save_ts & TS_MSB;
+		/* Check for overflow */
+		if (unlikely(abs < save_ts))
+			abs += 1ULL << 59;
+	}
+	return abs;
+}
 
 static inline u64 rb_time_stamp(struct trace_buffer *buffer);
 
@@ -811,8 +837,10 @@ u64 ring_buffer_event_time_stamp(struct trace_buffer *buffer,
 	u64 ts;
 
 	/* If the event includes an absolute time, then just use that */
-	if (event->type_len == RINGBUF_TYPE_TIME_STAMP)
-		return rb_event_time_stamp(event);
+	if (event->type_len == RINGBUF_TYPE_TIME_STAMP) {
+		ts = rb_event_time_stamp(event);
+		return rb_fix_abs_ts(ts, cpu_buffer->tail_page->page->time_stamp);
+	}
 
 	nest = local_read(&cpu_buffer->committing);
 	verify_event(cpu_buffer, event);
@@ -2754,8 +2782,15 @@ static void rb_add_timestamp(struct ring_buffer_per_cpu *cpu_buffer,
 		(RB_ADD_STAMP_FORCE | RB_ADD_STAMP_ABSOLUTE);
 
 	if (unlikely(info->delta > (1ULL << 59))) {
+		/*
+		 * Some timers can use more than 59 bits, and when a timestamp
+		 * is added to the buffer, it will lose those bits.
+		 */
+		if (abs && (info->ts & TS_MSB)) {
+			info->delta &= ABS_TS_MASK;
+
 		/* did the clock go backwards */
-		if (info->before == info->after && info->before > info->ts) {
+		} else if (info->before == info->after && info->before > info->ts) {
 			/* not interrupted */
 			static int once;
 
@@ -3304,7 +3339,7 @@ static void dump_buffer_page(struct buffer_data_page *bpage,
 
 		case RINGBUF_TYPE_TIME_STAMP:
 			delta = rb_event_time_stamp(event);
-			ts = delta;
+			ts = rb_fix_abs_ts(delta, ts);
 			pr_warn("  [%lld] absolute:%lld TIME STAMP\n", ts, delta);
 			break;
 
@@ -3380,7 +3415,7 @@ static void check_buffer(struct ring_buffer_per_cpu *cpu_buffer,
 
 		case RINGBUF_TYPE_TIME_STAMP:
 			delta = rb_event_time_stamp(event);
-			ts = delta;
+			ts = rb_fix_abs_ts(delta, ts);
 			break;
 
 		case RINGBUF_TYPE_PADDING:
@@ -4367,6 +4402,7 @@ rb_update_read_stamp(struct ring_buffer_per_cpu *cpu_buffer,
 
 	case RINGBUF_TYPE_TIME_STAMP:
 		delta = rb_event_time_stamp(event);
+		delta = rb_fix_abs_ts(delta, cpu_buffer->read_stamp);
 		cpu_buffer->read_stamp = delta;
 		return;
 
@@ -4397,6 +4433,7 @@ rb_update_iter_read_stamp(struct ring_buffer_iter *iter,
 
 	case RINGBUF_TYPE_TIME_STAMP:
 		delta = rb_event_time_stamp(event);
+		delta = rb_fix_abs_ts(delta, iter->read_stamp);
 		iter->read_stamp = delta;
 		return;
 
@@ -4650,6 +4687,7 @@ rb_buffer_peek(struct ring_buffer_per_cpu *cpu_buffer, u64 *ts,
 	case RINGBUF_TYPE_TIME_STAMP:
 		if (ts) {
 			*ts = rb_event_time_stamp(event);
+			*ts = rb_fix_abs_ts(*ts, reader->page->time_stamp);
 			ring_buffer_normalize_time_stamp(cpu_buffer->buffer,
 							 cpu_buffer->cpu, ts);
 		}
@@ -4741,6 +4779,7 @@ rb_iter_peek(struct ring_buffer_iter *iter, u64 *ts)
 	case RINGBUF_TYPE_TIME_STAMP:
 		if (ts) {
 			*ts = rb_event_time_stamp(event);
+			*ts = rb_fix_abs_ts(*ts, iter->head_page->page->time_stamp);
 			ring_buffer_normalize_time_stamp(cpu_buffer->buffer,
 							 cpu_buffer->cpu, ts);
 		}

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/3] timekeeping: Introduce fast accessor to clock tai
  2022-04-27 15:27       ` Steven Rostedt
  2022-04-27 17:22         ` Steven Rostedt
@ 2022-04-27 20:06         ` Thomas Gleixner
  2022-04-27 20:08           ` Steven Rostedt
  2022-04-27 20:42         ` Kurt Kanzenbach
  2 siblings, 1 reply; 20+ messages in thread
From: Thomas Gleixner @ 2022-04-27 20:06 UTC (permalink / raw)
  To: Steven Rostedt, Kurt Kanzenbach
  Cc: John Stultz, Stephen Boyd, Ingo Molnar, Jonathan Corbet,
	Richard Cochran, linux-doc, linux-kernel

On Wed, Apr 27 2022 at 11:27, Steven Rostedt wrote:
>> >             less-5071    [000] d.h2. 498087876.351330: do_raw_spin_trylock <-_raw_spin_lock
>> >             less-5071    [000] d.h4. 498087876.351334: ktime_get_mono_fast_ns <-ktime_get_tai_fast_ns
>> >             less-5071    [000] d.h5. 498087876.351334: ktime_get_mono_fast_ns <-ktime_get_tai_fast_ns
>> >             less-5071    [000] d.h3. 498087876.351334: rcu_read_lock_sched_held <-lock_acquired
>> >             less-5071    [000] d.h5. 498087876.351337: ktime_get_mono_fast_ns <-ktime_get_tai_fast_ns
>> >     kworker/u8:1-45      [003] d.h7. 1651009380.982749: ktime_get_mono_fast_ns <-ktime_get_tai_fast_ns
>> >     kworker/u8:1-45      [003] d.h7. 1651009380.982749: ktime_get_mono_fast_ns <-ktime_get_tai_fast_ns
>> >     kworker/u8:1-45      [003] d.h5. 1651009380.982749: rcu_read_lock_held_common <-rcu_read_lock_sched_held
>> >     kworker/u8:1-45      [003] d.h7. 498087876.375905: ktime_get_mono_fast_ns <-ktime_get_tai_fast_ns
>> >     kworker/u8:1-45      [003] d.h7. 498087876.375905: ktime_get_mono_fast_ns <-ktime_get_tai_fast_ns
>> >     kworker/u8:1-45      [003] d.h5. 498087876.375905: update_cfs_group <-task_tick_fair
>> >     kworker/u8:1-45      [003] d.h7. 498087876.375909: ktime_get_mono_fast_ns <-ktime_get_tai_fast_ns
>> >
>> > The clock seems to be toggling between 1651009380 and 498087876 causing the
>> > ftrace ring buffer to shutdown (it doesn't allow for time to go backwards).
>> >
>> > This is running on a 32 bit x86.  
>> 
>> Hm. The only problem is that the TAI offset may change. That should only
>> happen if the time is set or similar. For the TSN use case I've ran
>> ptp4l and phc2sys. phc2sys in the default configuration sets the offset
>> hard once and uses frequency adjustments from that point on. I didn't
>> observe any time jumps. Nevertheless, my test systems is based on
>> arm64. I'll do some testing on x86.
>
> I'm able to trigger this on x86 64bit too.
>
> One thing I noticed, is that the two numbers I have (from a different
> trace, but very similar to the above)
>
> $ printf "%llx\n" 498151194674148935
> 6e9c9df4afd3647
>
> $ printf "%llx\n" 1651072699280995911
> 16e9c9df4afd3647

TAI offset after boot is somewhere in this region:

It's 16e9d5dca2e7ac57 here. Converted to ctime:

Wed Apr 27 20:58:02 2022

and that does not change after booting on the VM I'm testing. But then
ftrace suddenly throws this warning:

[  456.444350] Delta way too big! 1152921504606861182
   ts=1651087617720667106 before=498166113113805924
   after=498166113113820130 write stamp=498166113113820130

I first set tai as trace clock and then enabled a trace point. It seems
the first hit on this trace point ended up there.

> That is, the last nibble either is 0 or 1, causing the change?
>
> 06e9c9df4afd3647
> 16e9c9df4afd3647

I added debug into ktime_get_tai_fast_ns() to emit a pr_warn() when
either the TAI offset or the result is less than 1 << 60.

That does not trigger, but ftrace still confuses itself....

There is some mumble about timestamps only requiring 59 bits and the
check which triggers the warning has:

      if (unlikely(info->delta > (1ULL << 59))) {

So something cuts off bit 60 somewhere and then the whole thing goes
into confusion mode.

When I set wall time to something before 1985, it works.

ctime(0x06e9c9df4afd3647/1e9) = Mon Oct 14 16:19:54 1985

Haven't found where ftrace loses the upper bit, but it does so
somewhere.

Thanks,

        tglx




^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/3] timekeeping: Introduce fast accessor to clock tai
  2022-04-27 20:06         ` Thomas Gleixner
@ 2022-04-27 20:08           ` Steven Rostedt
  0 siblings, 0 replies; 20+ messages in thread
From: Steven Rostedt @ 2022-04-27 20:08 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Kurt Kanzenbach, John Stultz, Stephen Boyd, Ingo Molnar,
	Jonathan Corbet, Richard Cochran, linux-doc, linux-kernel

On Wed, 27 Apr 2022 22:06:49 +0200
Thomas Gleixner <tglx@linutronix.de> wrote:

> Haven't found where ftrace loses the upper bit, but it does so
> somewhere.

https://lore.kernel.org/all/20220427153339.16c33f75@gandalf.local.home/

 ;-)

-- Steve

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/3] timekeeping: Introduce fast accessor to clock tai
  2022-04-27 15:27       ` Steven Rostedt
  2022-04-27 17:22         ` Steven Rostedt
  2022-04-27 20:06         ` Thomas Gleixner
@ 2022-04-27 20:42         ` Kurt Kanzenbach
  2022-04-27 21:05           ` Steven Rostedt
  2022-04-27 21:12           ` Thomas Gleixner
  2 siblings, 2 replies; 20+ messages in thread
From: Kurt Kanzenbach @ 2022-04-27 20:42 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: John Stultz, Thomas Gleixner, Stephen Boyd, Ingo Molnar,
	Jonathan Corbet, Richard Cochran, linux-doc, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 3467 bytes --]

On Wed Apr 27 2022, Steven Rostedt wrote:
> I'm able to trigger this on x86 64bit too.

OK, reproduced it by using the function tracer (instead of events).

Anyway, with your patch applied [1] and the one below it looks way
better:

|            bash-4252    [003] ...1. 1651091299.936121: file_ra_state_init <-do_dentry_open
|            bash-4252    [003] ...1. 1651091299.936121: inode_to_bdi <-file_ra_state_init
|            bash-4252    [003] ...1. 1651091299.936121: do_truncate <-path_openat
|            bash-4252    [003] ...1. 1651091299.936122: dentry_needs_remove_privs <-do_truncate
|            bash-4252    [003] ...1. 1651091299.936122: should_remove_suid <-dentry_needs_remove_privs
|            bash-4252    [003] ...1. 1651091299.936122: security_inode_need_killpriv <-dentry_needs_remove_privs
|            bash-4252    [003] ...1. 1651091299.936122: cap_inode_need_killpriv <-security_inode_need_killpriv
|            bash-4252    [003] ...1. 1651091299.936123: __vfs_getxattr <-cap_inode_need_killpriv
|            bash-4252    [003] ...1. 1651091299.936123: xattr_resolve_name <-__vfs_getxattr
|            bash-4252    [003] ...1. 1651091299.936123: is_bad_inode <-xattr_resolve_name
|            bash-4252    [003] ...1. 1651091299.936123: down_write <-do_truncate
|            bash-4252    [003] ...1. 1651091299.936123: notify_change <-do_truncate
|            bash-4252    [003] ...1. 1651091299.936123: may_setattr <-notify_change
|            bash-4252    [003] ...1. 1651091299.936123: current_time <-notify_change
|            bash-4252    [003] ...1. 1651091299.936124: ktime_get_coarse_real_ts64 <-current_time
|            bash-4252    [003] ...1. 1651091299.936124: security_inode_setattr <-notify_change
|            bash-4252    [003] ...1. 1651091299.936124: selinux_inode_setattr <-security_inode_setattr
|            bash-4252    [003] ...1. 1651091299.936125: inode_has_perm <-selinux_inode_setattr
|/sys/kernel/debug/tracing # phc_ctl /dev/ptp0 get
|phc_ctl[760.604]: clock time is 1651091302.512977591 or Wed Apr 27 22:28:22 2022

Thanks for debugging it!

[1] - https://lore.kernel.org/all/20220427153339.16c33f75@gandalf.local.home/

Thanks,
Kurt

commit 81c4f2de420cc4ac08efc39e78ffd80e146bfbd7
Author: Kurt Kanzenbach <kurt@linutronix.de>
Date:   Wed Apr 27 21:59:58 2022 +0200

    timekeeping: Mark mono fast time accessors as notrace
    
    Mark the CLOCK_MONOTONIC fast time accessors as notrace. These functions are
    used in tracing to retrieve timestamps.
    
    Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 2c22023fbf5f..dddc37035215 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -482,7 +482,7 @@ static __always_inline u64 __ktime_get_fast_ns(struct tk_fast *tkf)
  * of the following timestamps. Callers need to be aware of that and
  * deal with it.
  */
-u64 ktime_get_mono_fast_ns(void)
+u64 notrace ktime_get_mono_fast_ns(void)
 {
        return __ktime_get_fast_ns(&tk_fast_mono);
 }
@@ -494,7 +494,7 @@ EXPORT_SYMBOL_GPL(ktime_get_mono_fast_ns);
  * Contrary to ktime_get_mono_fast_ns() this is always correct because the
  * conversion factor is not affected by NTP/PTP correction.
  */
-u64 ktime_get_raw_fast_ns(void)
+u64 notrace ktime_get_raw_fast_ns(void)
 {
        return __ktime_get_fast_ns(&tk_fast_raw);
 }

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 861 bytes --]

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/3] timekeeping: Introduce fast accessor to clock tai
  2022-04-27 20:42         ` Kurt Kanzenbach
@ 2022-04-27 21:05           ` Steven Rostedt
  2022-04-27 21:12           ` Thomas Gleixner
  1 sibling, 0 replies; 20+ messages in thread
From: Steven Rostedt @ 2022-04-27 21:05 UTC (permalink / raw)
  To: Kurt Kanzenbach
  Cc: John Stultz, Thomas Gleixner, Stephen Boyd, Ingo Molnar,
	Jonathan Corbet, Richard Cochran, linux-doc, linux-kernel

On Wed, 27 Apr 2022 22:42:57 +0200
Kurt Kanzenbach <kurt@linutronix.de> wrote:

> Thanks for debugging it!

Unfortunately, I found that 32 bit is still broken. Will have that fixed
soon.

-- Steve

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/3] timekeeping: Introduce fast accessor to clock tai
  2022-04-27 20:42         ` Kurt Kanzenbach
  2022-04-27 21:05           ` Steven Rostedt
@ 2022-04-27 21:12           ` Thomas Gleixner
  1 sibling, 0 replies; 20+ messages in thread
From: Thomas Gleixner @ 2022-04-27 21:12 UTC (permalink / raw)
  To: Kurt Kanzenbach, Steven Rostedt
  Cc: John Stultz, Stephen Boyd, Ingo Molnar, Jonathan Corbet,
	Richard Cochran, linux-doc, linux-kernel

On Wed, Apr 27 2022 at 22:42, Kurt Kanzenbach wrote:
> Anyway, with your patch applied [1] and the one below it looks way
> better:
>
> commit 81c4f2de420cc4ac08efc39e78ffd80e146bfbd7
> Author: Kurt Kanzenbach <kurt@linutronix.de>
> Date:   Wed Apr 27 21:59:58 2022 +0200
>
>     timekeeping: Mark mono fast time accessors as notrace

Can you please post that proper?
     
>     Mark the CLOCK_MONOTONIC fast time accessors as notrace. These functions are
>     used in tracing to retrieve timestamps.
>

That lacks a Fixes: tag. These accessors should have been notrace from
day one.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2022-04-27 21:13 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-14  9:18 [PATCH v2 0/3] tracing: Introduce trace clock tai Kurt Kanzenbach
2022-04-14  9:18 ` [PATCH v2 1/3] timekeeping: Introduce fast accessor to " Kurt Kanzenbach
2022-04-14 14:59   ` Thomas Gleixner
2022-04-18 17:15     ` Steven Rostedt
2022-04-14 15:01   ` [tip: timers/core] " tip-bot2 for Kurt Kanzenbach
2022-04-26 21:53   ` [PATCH v2 1/3] " Steven Rostedt
2022-04-27  8:17     ` Thomas Gleixner
2022-04-27  8:38     ` Kurt Kanzenbach
2022-04-27 15:27       ` Steven Rostedt
2022-04-27 17:22         ` Steven Rostedt
2022-04-27 19:23           ` Steven Rostedt
2022-04-27 20:06         ` Thomas Gleixner
2022-04-27 20:08           ` Steven Rostedt
2022-04-27 20:42         ` Kurt Kanzenbach
2022-04-27 21:05           ` Steven Rostedt
2022-04-27 21:12           ` Thomas Gleixner
2022-04-14  9:18 ` [PATCH v2 2/3] tracing: Introduce trace " Kurt Kanzenbach
2022-04-14  9:18 ` [PATCH v2 3/3] tracing: Add documentation for " Kurt Kanzenbach
2022-04-14 11:27   ` Bagas Sanjaya
2022-04-14 11:38     ` Kurt Kanzenbach

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).