All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] printk, Add printk.clock kernel parameter [v2]
@ 2016-01-13 12:34 Prarit Bhargava
  2016-01-13 12:34 ` [PATCH 1/2] kernel, timekeeping, add ktime_get_[boot|real|tai]_fast_ns functions Prarit Bhargava
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Prarit Bhargava @ 2016-01-13 12:34 UTC (permalink / raw)
  To: linux-kernel
  Cc: Prarit Bhargava, John Stultz, Xunlei Pang, Thomas Gleixner,
	Baolin Wang, Andrew Morton, Greg Kroah-Hartman, Petr Mladek,
	Tejun Heo, Peter Hurley, Vasily Averin, Joe Perches

The script used in the analysis below:

dmesg_with_human_timestamps () {
    $(type -P dmesg) "$@" | perl -w -e 'use strict;
        my ($uptime) = do { local @ARGV="/proc/uptime";<>}; ($uptime) = ($uptime =~ /^(\d+)\./);
        foreach my $line (<>) {
            printf( ($line=~/^\[\s*(\d+)\.\d+\](.+)/) ? ( "[%s]%s\n", scalar localtime(time - $uptime + $1), $2 ) : $line )
        }'
}

dmesg_with_human_timestamps

----8<----

Over the past years I've seen many reports of bugs that include
time-stamped kernel logs (enabled when CONFIG_PRINTK_TIME=y or
print.time=1 is specified as a kernel parameter) that do not align
with either external time stamped logs or /var/log/messages.

For example,

[root@intel-wildcatpass-06 ~]# date; echo "Hello!" > /dev/kmsg ; date
Thu Dec 17 13:58:31 EST 2015
Thu Dec 17 13:58:31 EST 2015

which displays

[83973.768912] Hello!

on the serial console.

Running a script to convert this to "boot time",

[root@intel-wildcatpass-06 ~]# ./human.sh  | tail -1
[Thu Dec 17 13:59:57 2015] Hello!

which is already off by 1 minute and 26 seconds off after ~24 hours of
uptime.

This occurs because the time stamp is obtained from a call to
local_clock() which (on x86) is a direct call to the hardware.  These
hardware clock reads are not modified by the standard ntp or ptp protocol,
while the other timestamps are, and that results in situations external
time sources are further and further offset from the kernel log
timestamps.

This patchset introduces additional NMI safe timekeeping functions and the
kernel parameter printk.clock=[local|boot|real|tai] allowing a
user to specify an adjusted clock to use with printk timestamps.  The
hardware clock, or the existing functionality, is preserved by default.

[v2]: use NMI safe timekeeping access functions

Cc: John Stultz <john.stultz@linaro.org>
Cc: Xunlei Pang <pang.xunlei@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Baolin Wang <baolin.wang@linaro.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Petr Mladek <pmladek@suse.cz>
Cc: Tejun Heo <tj@kernel.org>
Cc: Peter Hurley <peter@hurleysoftware.com>
Cc: Vasily Averin <vvs@virtuozzo.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Prarit Bhargava <prarit@redhat.com>


Prarit Bhargava (2):
  kernel, timekeeping, add ktime_get_[boot|real|tai]_fast_ns functions
  printk, Add printk.clock kernel parameter

 include/linux/timekeeping.h |    3 +++
 kernel/printk/printk.c      |   54 +++++++++++++++++++++++++++++++++++++++++--
 kernel/time/timekeeping.c   |   52 +++++++++++++++++++++++++++++++++--------
 3 files changed, 97 insertions(+), 12 deletions(-)

-- 
1.7.9.3

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/2] kernel, timekeeping, add ktime_get_[boot|real|tai]_fast_ns functions
  2016-01-13 12:34 [PATCH 0/2] printk, Add printk.clock kernel parameter [v2] Prarit Bhargava
@ 2016-01-13 12:34 ` Prarit Bhargava
  2016-01-13 12:34 ` [PATCH 2/2] printk, Add printk.clock kernel parameter Prarit Bhargava
  2016-01-13 13:45 ` [PATCH 0/2] printk, Add printk.clock kernel parameter [v2] Thomas Gleixner
  2 siblings, 0 replies; 12+ messages in thread
From: Prarit Bhargava @ 2016-01-13 12:34 UTC (permalink / raw)
  To: linux-kernel
  Cc: Prarit Bhargava, John Stultz, Thomas Gleixner, Ingo Molnar,
	Xunlei Pang, Peter Zijlstra, Baolin Wang, Arnd Bergmann

This patch introduces NMI safe time functions,
ktime_get[boot|real|tai]_fast_ns() for use by the printk() timestamping
code, which are based on the existing NMI safe versions of accessing the
monotonic clock.

Cc: John Stultz <john.stultz@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Xunlei Pang <pang.xunlei@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Baolin Wang <baolin.wang@linaro.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
---
 include/linux/timekeeping.h |    3 +++
 kernel/time/timekeeping.c   |   52 ++++++++++++++++++++++++++++++++++---------
 2 files changed, 45 insertions(+), 10 deletions(-)

diff --git a/include/linux/timekeeping.h b/include/linux/timekeeping.h
index ec89d84..c3f10f5 100644
--- a/include/linux/timekeeping.h
+++ b/include/linux/timekeeping.h
@@ -233,6 +233,9 @@ static inline u64 ktime_get_raw_ns(void)
 
 extern u64 ktime_get_mono_fast_ns(void);
 extern u64 ktime_get_raw_fast_ns(void);
+extern u64 ktime_get_boot_fast_ns(void);
+extern u64 ktime_get_real_fast_ns(void);
+extern u64 ktime_get_tai_fast_ns(void);
 
 /*
  * Timespec interfaces utilizing the ktime based ones
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index d563c19..5449a5cc 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -44,6 +44,14 @@ static struct {
 static DEFINE_RAW_SPINLOCK(timekeeper_lock);
 static struct timekeeper shadow_timekeeper;
 
+static ktime_t *offsets[TK_OFFS_MAX] = {
+	[TK_OFFS_REAL]	= &tk_core.timekeeper.offs_real,
+	[TK_OFFS_BOOT]	= &tk_core.timekeeper.offs_boot,
+	[TK_OFFS_TAI]	= &tk_core.timekeeper.offs_tai,
+};
+
+static int timekeeping_initialized;
+
 /**
  * struct tk_fast - NMI safe timekeeper
  * @seq:	Sequence counter for protecting updates. The lowest bit
@@ -375,16 +383,22 @@ static void update_fast_timekeeper(struct tk_read_base *tkr, struct tk_fast *tkf
  * of the following timestamps. Callers need to be aware of that and
  * deal with it.
  */
-static __always_inline u64 __ktime_get_fast_ns(struct tk_fast *tkf)
+static __always_inline u64 __ktime_get_fast_ns(struct tk_fast *tkf,
+					       ktime_t *offset)
 {
 	struct tk_read_base *tkr;
 	unsigned int seq;
 	u64 now;
+	ktime_t base;
+
+	if (unlikely(!timekeeping_initialized))
+		return 0;
 
 	do {
 		seq = raw_read_seqcount_latch(&tkf->seq);
 		tkr = tkf->base + (seq & 0x01);
-		now = ktime_to_ns(tkr->base) + timekeeping_get_ns(tkr);
+		base = ktime_add(tkr->base, *offset);
+		now = ktime_to_ns(base) + timekeeping_get_ns(tkr);
 	} while (read_seqcount_retry(&tkf->seq, seq));
 
 	return now;
@@ -392,16 +406,38 @@ static __always_inline u64 __ktime_get_fast_ns(struct tk_fast *tkf)
 
 u64 ktime_get_mono_fast_ns(void)
 {
-	return __ktime_get_fast_ns(&tk_fast_mono);
+	ktime_t zero = ktime_set(0, 0);
+
+	return __ktime_get_fast_ns(&tk_fast_mono, &zero);
 }
 EXPORT_SYMBOL_GPL(ktime_get_mono_fast_ns);
 
 u64 ktime_get_raw_fast_ns(void)
 {
-	return __ktime_get_fast_ns(&tk_fast_raw);
+	ktime_t zero = ktime_set(0, 0);
+
+	return __ktime_get_fast_ns(&tk_fast_raw, &zero);
 }
 EXPORT_SYMBOL_GPL(ktime_get_raw_fast_ns);
 
+u64 ktime_get_boot_fast_ns(void)
+{
+	return __ktime_get_fast_ns(&tk_fast_raw, offsets[TK_OFFS_BOOT]);
+}
+EXPORT_SYMBOL_GPL(ktime_get_boot_fast_ns);
+
+u64 ktime_get_real_fast_ns(void)
+{
+	return __ktime_get_fast_ns(&tk_fast_raw, offsets[TK_OFFS_REAL]);
+}
+EXPORT_SYMBOL_GPL(ktime_get_real_fast_ns);
+
+u64 ktime_get_tai_fast_ns(void)
+{
+	return __ktime_get_fast_ns(&tk_fast_raw, offsets[TK_OFFS_TAI]);
+}
+EXPORT_SYMBOL_GPL(ktime_get_tai_fast_ns);
+
 /* Suspend-time cycles value for halted fast timekeeper. */
 static cycle_t cycles_at_suspend;
 
@@ -699,12 +735,6 @@ u32 ktime_get_resolution_ns(void)
 }
 EXPORT_SYMBOL_GPL(ktime_get_resolution_ns);
 
-static ktime_t *offsets[TK_OFFS_MAX] = {
-	[TK_OFFS_REAL]	= &tk_core.timekeeper.offs_real,
-	[TK_OFFS_BOOT]	= &tk_core.timekeeper.offs_boot,
-	[TK_OFFS_TAI]	= &tk_core.timekeeper.offs_tai,
-};
-
 ktime_t ktime_get_with_offset(enum tk_offsets offs)
 {
 	struct timekeeper *tk = &tk_core.timekeeper;
@@ -1255,6 +1285,8 @@ void __init timekeeping_init(void)
 
 	write_seqcount_end(&tk_core.seq);
 	raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
+
+	timekeeping_initialized = 1;
 }
 
 /* time in seconds when suspend began for persistent clock */
-- 
1.7.9.3

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/2] printk, Add printk.clock kernel parameter
  2016-01-13 12:34 [PATCH 0/2] printk, Add printk.clock kernel parameter [v2] Prarit Bhargava
  2016-01-13 12:34 ` [PATCH 1/2] kernel, timekeeping, add ktime_get_[boot|real|tai]_fast_ns functions Prarit Bhargava
@ 2016-01-13 12:34 ` Prarit Bhargava
  2016-01-13 13:45 ` [PATCH 0/2] printk, Add printk.clock kernel parameter [v2] Thomas Gleixner
  2 siblings, 0 replies; 12+ messages in thread
From: Prarit Bhargava @ 2016-01-13 12:34 UTC (permalink / raw)
  To: linux-kernel
  Cc: Prarit Bhargava, John Stultz, Xunlei Pang, Thomas Gleixner,
	Baolin Wang, Andrew Morton, Greg Kroah-Hartman, Petr Mladek,
	Tejun Heo, Peter Hurley, Vasily Averin, Joe Perches

The script used in the analysis below:

dmesg_with_human_timestamps () {
    $(type -P dmesg) "$@" | perl -w -e 'use strict;
        my ($uptime) = do { local @ARGV="/proc/uptime";<>}; ($uptime) = ($uptime =~ /^(\d+)\./);
        foreach my $line (<>) {
            printf( ($line=~/^\[\s*(\d+)\.\d+\](.+)/) ? ( "[%s]%s\n", scalar localtime(time - $uptime + $1), $2 ) : $line )
        }'
}

dmesg_with_human_timestamps

----8<----

Over the past years I've seen many reports of bugs that include
time-stamped kernel logs (enabled when CONFIG_PRINTK_TIME=y or
print.time=1 is specified as a kernel parameter) that do not align
with either external time stamped logs or /var/log/messages.

For example,

[root@intel-wildcatpass-06 ~]# date; echo "Hello!" > /dev/kmsg ; date
Thu Dec 17 13:58:31 EST 2015
Thu Dec 17 13:58:31 EST 2015

which displays

[83973.768912] Hello!

on the serial console.

Running a script to convert this to "boot time",

[root@intel-wildcatpass-06 ~]# ./human.sh  | tail -1
[Thu Dec 17 13:59:57 2015] Hello!

which is already off by 1 minute and 26 seconds off after ~24 hours of
uptime.

This occurs because the time stamp is obtained from a call to
local_clock() which (on x86) is a direct call to the hardware.  These
hardware clock reads are not modified by the standard ntp or ptp protocol,
while the other timestamps are, and that results in situations external
time sources are further and further offset from the kernel log
timestamps.

This patch introduces printk.clock=[local|boot|real|tai] allowing a
user to specify an adjusted clock to use with printk timestamps.  The
hardware clock, or the existing functionality, is preserved by default.

[v2]: use NMI safe timekeeping access functions

Cc: John Stultz <john.stultz@linaro.org>
Cc: Xunlei Pang <pang.xunlei@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Baolin Wang <baolin.wang@linaro.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Petr Mladek <pmladek@suse.cz>
Cc: Tejun Heo <tj@kernel.org>
Cc: Peter Hurley <peter@hurleysoftware.com>
Cc: Vasily Averin <vvs@virtuozzo.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
---
 kernel/printk/printk.c |   54 ++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 52 insertions(+), 2 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 2ce8826..671f0ec 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -419,6 +419,56 @@ static u32 truncate_msg(u16 *text_len, u16 *trunc_msg_len,
 	return msg_used_size(*text_len + *trunc_msg_len, 0, pad_len);
 }
 
+/* Default timestamp is local clock */
+static char *printk_clock = "local";
+static u64 (*printk_clock_fn)(void) = &local_clock;
+
+static int printk_clock_param_set(const char *val,
+				  const struct kernel_param *kp)
+{
+	char *printk_clock_new = strstrip((char *)val);
+	int ret;
+
+	/* check if change is needed */
+	if (!strcmp(printk_clock_new, printk_clock))
+		return 0;
+
+	if (!strncmp("local", printk_clock_new, 5)) {
+		ret = param_set_charp(printk_clock_new, kp);
+		if (ret)
+			return ret;
+		printk_clock_fn = &local_clock;
+	} else if (!strncmp("real", printk_clock_new, 4)) {
+		ret = param_set_charp(printk_clock_new, kp);
+		if (ret)
+			return ret;
+		printk_clock_fn = &ktime_get_real_fast_ns;
+	} else if (!strncmp("boot", printk_clock_new, 4)) {
+		ret = param_set_charp(printk_clock_new, kp);
+		if (ret)
+			return ret;
+		printk_clock_fn = &ktime_get_boot_fast_ns;
+	} else if (!strncmp("tai", printk_clock_new, 3)) {
+		ret = param_set_charp(printk_clock_new, kp);
+		if (ret)
+			return ret;
+		printk_clock_fn = &ktime_get_tai_fast_ns;
+	} else {
+		return -EINVAL;
+	}
+
+	pr_info("printk: timestamp set to %s clock.\n", printk_clock);
+	return 0;
+}
+
+static struct kernel_param_ops printk_clock_param_ops = {
+	.set =		printk_clock_param_set,
+	.get =		param_get_charp,
+};
+
+module_param_cb(clock, &printk_clock_param_ops, &printk_clock, 0644);
+MODULE_PARM_DESC(clock, "Clock used for printk timestamps.  Can be local (hardware/default), boot, real, or tai.\n");
+
 /* insert record into the buffer, discard old ones, update heads */
 static int log_store(int facility, int level,
 		     enum log_flags flags, u64 ts_nsec,
@@ -467,7 +517,7 @@ static int log_store(int facility, int level,
 	if (ts_nsec > 0)
 		msg->ts_nsec = ts_nsec;
 	else
-		msg->ts_nsec = local_clock();
+		msg->ts_nsec = printk_clock_fn();
 	memset(log_dict(msg) + dict_len, 0, pad_len);
 	msg->len = size;
 
@@ -1613,7 +1663,7 @@ static bool cont_add(int facility, int level, const char *text, size_t len)
 		cont.facility = facility;
 		cont.level = level;
 		cont.owner = current;
-		cont.ts_nsec = local_clock();
+		cont.ts_nsec = printk_clock_fn();
 		cont.flags = 0;
 		cont.cons = 0;
 		cont.flushed = false;
-- 
1.7.9.3

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/2] printk, Add printk.clock kernel parameter [v2]
  2016-01-13 12:34 [PATCH 0/2] printk, Add printk.clock kernel parameter [v2] Prarit Bhargava
  2016-01-13 12:34 ` [PATCH 1/2] kernel, timekeeping, add ktime_get_[boot|real|tai]_fast_ns functions Prarit Bhargava
  2016-01-13 12:34 ` [PATCH 2/2] printk, Add printk.clock kernel parameter Prarit Bhargava
@ 2016-01-13 13:45 ` Thomas Gleixner
  2016-01-13 14:36   ` Prarit Bhargava
  2 siblings, 1 reply; 12+ messages in thread
From: Thomas Gleixner @ 2016-01-13 13:45 UTC (permalink / raw)
  To: Prarit Bhargava
  Cc: linux-kernel, John Stultz, Xunlei Pang, Baolin Wang,
	Andrew Morton, Greg Kroah-Hartman, Petr Mladek, Tejun Heo,
	Peter Hurley, Vasily Averin, Joe Perches

On Wed, 13 Jan 2016, Prarit Bhargava wrote:
> This patchset introduces additional NMI safe timekeeping functions and the
> kernel parameter printk.clock=[local|boot|real|tai] allowing a
> user to specify an adjusted clock to use with printk timestamps.  The
> hardware clock, or the existing functionality, is preserved by default.

You still fail to explain WHY we need a gazillion of different clocks
here.

What's the problem with using the fast monotonic clock instead of local_clock
and be done with it? I really don't see the point why we would need
boot/real/tai and all the extra churn in the fast clock.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/2] printk, Add printk.clock kernel parameter [v2]
  2016-01-13 13:45 ` [PATCH 0/2] printk, Add printk.clock kernel parameter [v2] Thomas Gleixner
@ 2016-01-13 14:36   ` Prarit Bhargava
  2016-01-13 17:28     ` Thomas Gleixner
  0 siblings, 1 reply; 12+ messages in thread
From: Prarit Bhargava @ 2016-01-13 14:36 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: linux-kernel, John Stultz, Xunlei Pang, Baolin Wang,
	Andrew Morton, Greg Kroah-Hartman, Petr Mladek, Tejun Heo,
	Peter Hurley, Vasily Averin, Joe Perches



On 01/13/2016 08:45 AM, Thomas Gleixner wrote:
> On Wed, 13 Jan 2016, Prarit Bhargava wrote:
>> This patchset introduces additional NMI safe timekeeping functions and the
>> kernel parameter printk.clock=[local|boot|real|tai] allowing a
>> user to specify an adjusted clock to use with printk timestamps.  The
>> hardware clock, or the existing functionality, is preserved by default.
> 
> You still fail to explain WHY we need a gazillion of different clocks
> here.

I've had cases in the past where an earlier warning/failures have resulted in a
much later panics, across several systems.  Trying to synchronize all of these
events with wall clock time is all but impossible after the event has occurred.
 I've seen cases where earlier MCAs lead to panics, earlier I/O warnings have
lead to panics, panics/problems at a specific time, etc.  Attempting to figure
out what happened in the lab or cluster is not trivial without having a
timestamp that can actually be synchronized against a wall clock.

In the case that made me finally submit this, the disks were generating
seemingly random I/O timeout errors which meant at that point I had no logging
to disk (and this assumes the systems are logging to disk because I'm seeing
more and more systems that are not).  I did manage to get dmesg from crash
dumps, however, the problem then became trying to figure out exactly what time
the system started having problems (Was there an external event that lead to the
failures and panics?  Are the early failures across systems at the same time, or
did they occur over several hours?  Did the systems all panic at the same time?
 Was the failure at a specific time after boot and due to a weird timeout?  etc.)

Trying to figure out what actually is happening & debugging becomes much easier
with the above timestamp patch because I can actually tell what time something
happened.

Admittedly, I have not used TAI.  I started by using REAL, and then the BOOT
clock to see if this was some sort of strange 10-day timeout on the system.  I
only included TAI option for completeness.

> 
> What's the problem with using the fast monotonic clock instead of local_clock
> and be done with it? I really don't see the point why we would need
> boot/real/tai and all the extra churn in the fast clock.

AFAICT the local_clock() (on x86 at least) is accessed without accessing a lock
and is just a tsc read.  I assumed that local_clock() fast and lockless access
was the "best" method for obtaining a time stamp.  I would only suggest using
the other clocks on systems that are "known stable", or running kernels that are
considered to have stable timekeeping code.

P.

> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/2] printk, Add printk.clock kernel parameter [v2]
  2016-01-13 14:36   ` Prarit Bhargava
@ 2016-01-13 17:28     ` Thomas Gleixner
  2016-01-14 12:52       ` Petr Mladek
  0 siblings, 1 reply; 12+ messages in thread
From: Thomas Gleixner @ 2016-01-13 17:28 UTC (permalink / raw)
  To: Prarit Bhargava
  Cc: linux-kernel, John Stultz, Xunlei Pang, Baolin Wang,
	Andrew Morton, Greg Kroah-Hartman, Petr Mladek, Tejun Heo,
	Peter Hurley, Vasily Averin, Joe Perches

On Wed, 13 Jan 2016, Prarit Bhargava wrote:
> On 01/13/2016 08:45 AM, Thomas Gleixner wrote:
> > On Wed, 13 Jan 2016, Prarit Bhargava wrote:
> >> This patchset introduces additional NMI safe timekeeping functions and the
> >> kernel parameter printk.clock=[local|boot|real|tai] allowing a
> >> user to specify an adjusted clock to use with printk timestamps.  The
> >> hardware clock, or the existing functionality, is preserved by default.
> > 
> > You still fail to explain WHY we need a gazillion of different clocks
> > here.
> 
> I've had cases in the past where an earlier warning/failures have resulted in a
> much later panics, across several systems.  Trying to synchronize all of these
> events with wall clock time is all but impossible after the event has occurred.
>  I've seen cases where earlier MCAs lead to panics, earlier I/O warnings have
> lead to panics, panics/problems at a specific time, etc.  Attempting to figure
> out what happened in the lab or cluster is not trivial without having a
> timestamp that can actually be synchronized against a wall clock.
> 
> In the case that made me finally submit this, the disks were generating
> seemingly random I/O timeout errors which meant at that point I had no logging
> to disk (and this assumes the systems are logging to disk because I'm seeing
> more and more systems that are not).  I did manage to get dmesg from crash
> dumps, however, the problem then became trying to figure out exactly what time
> the system started having problems (Was there an external event that lead to the
> failures and panics?  Are the early failures across systems at the same time, or
> did they occur over several hours?  Did the systems all panic at the same time?
>  Was the failure at a specific time after boot and due to a weird timeout?  etc.)
> 
> Trying to figure out what actually is happening & debugging becomes much easier
> with the above timestamp patch because I can actually tell what time something
> happened.
> 
> Admittedly, I have not used TAI.  I started by using REAL, and then the BOOT
> clock to see if this was some sort of strange 10-day timeout on the system.  I
> only included TAI option for completeness.

So I really don't see a reason why we would need TAI for that. Just because we
can is not one.

BOOT is questionable as well simply because the boot offset only changes when
the system suspends/resumes. So we rather note that time delta in dmesg than
having a gazillion of command line options.

That leaves REAL as a possible useful option. Though the points where the MONO
to REAL offset changes are rather limited as well (Initial RTC readout,
do_settimeofday(), adjtimex(), leap seconds, suspend/resume).

> > What's the problem with using the fast monotonic clock instead of local_clock
> > and be done with it? I really don't see the point why we would need
> > boot/real/tai and all the extra churn in the fast clock.
> 
> AFAICT the local_clock() (on x86 at least) is accessed without accessing a lock
> and is just a tsc read.  I assumed that local_clock() fast and lockless access
> was the "best" method for obtaining a time stamp.

printk() is hardly a hotpath function and local_clock() is not much faster
than ktime_get_mono_fast_ns(). It's lockless as well otherwise it would not be
NMI safe.

So what's the point? 

> I would only suggest using the other clocks on systems that are "known
> stable", or running kernels that are considered to have stable timekeeping
> code.

I neither run production stuff on systems which are known to be unstable nor
on a kernel which does not have a stable timekeeping code.

Sorry, but your argumentation does not make any sense.

You can solve the whole business by changing the timestamp in printk_log to

   u64	    mono;
   u64	    offset_real;

and have a function which does:

u64 ktime_get_log_ts(u64 *offset_real)
{
	*offset_real = tk_core.timekeeper.offs_real;

	if (timekeeping_active)
		return ktime_get_mono_fast_ns();
	else
		return local_clock();
}

That uses extra 8 bytes of data per log buffer entry, but that's really a non
issue. That already gives you the data for crash analysis. If you want to be
able to have the clock REAL time stamps in console/dmesg/syslog then you
simply can make printk_time an integer and act depending on the value:

     printk_time=0  timestamps off
     printk_time=1  timestamps clock MONO (default and backwards compatible)
     printk_time=2  timestamps clock REAL
     
That's minimal intrusive and does the job without the need of a gazillion of
command line options and new functions which are just pointless bloat.

Note, that on 32bit systems accessing tk_core.timekeeper.offs_real is racy
vs. a concurrent update, but with your proposed solution it's not any
different.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/2] printk, Add printk.clock kernel parameter [v2]
  2016-01-13 17:28     ` Thomas Gleixner
@ 2016-01-14 12:52       ` Petr Mladek
  2016-01-14 14:39         ` Prarit Bhargava
  2016-01-14 14:44         ` Thomas Gleixner
  0 siblings, 2 replies; 12+ messages in thread
From: Petr Mladek @ 2016-01-14 12:52 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Prarit Bhargava, linux-kernel, John Stultz, Xunlei Pang,
	Baolin Wang, Andrew Morton, Greg Kroah-Hartman, Tejun Heo,
	Peter Hurley, Vasily Averin, Joe Perches

On Wed 2016-01-13 18:28:50, Thomas Gleixner wrote:
> You can solve the whole business by changing the timestamp in printk_log to
> 
>    u64	    mono;
>    u64	    offset_real;

This is not so easy because the structure is proceed by userspace tool,
e.g. crash, see log_buf_kexec_setup(). We would need to update all
the tools as well.


> and have a function which does:
> 
> u64 ktime_get_log_ts(u64 *offset_real)
> {
> 	*offset_real = tk_core.timekeeper.offs_real;
> 
> 	if (timekeeping_active)
> 		return ktime_get_mono_fast_ns();
> 	else
> 		return local_clock();
> }

A solution would be to apply the offset_real immediately. I wonder if
any tool expects the messages to be sorted by a monotonic clock. In
fact, it might be useful to see that some messages are disordered
against the real time, e.g. because of the leaf second.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/2] printk, Add printk.clock kernel parameter [v2]
  2016-01-14 12:52       ` Petr Mladek
@ 2016-01-14 14:39         ` Prarit Bhargava
  2016-01-14 14:44         ` Thomas Gleixner
  1 sibling, 0 replies; 12+ messages in thread
From: Prarit Bhargava @ 2016-01-14 14:39 UTC (permalink / raw)
  To: Petr Mladek, Thomas Gleixner
  Cc: linux-kernel, John Stultz, Xunlei Pang, Baolin Wang,
	Andrew Morton, Greg Kroah-Hartman, Tejun Heo, Peter Hurley,
	Vasily Averin, Joe Perches



On 01/14/2016 07:52 AM, Petr Mladek wrote:
> On Wed 2016-01-13 18:28:50, Thomas Gleixner wrote:
>> You can solve the whole business by changing the timestamp in printk_log to
>>
>>    u64	    mono;
>>    u64	    offset_real;
> 
> This is not so easy because the structure is proceed by userspace tool,
> e.g. crash, see log_buf_kexec_setup(). We would need to update all
> the tools as well.
> 
> 
>> and have a function which does:
>>
>> u64 ktime_get_log_ts(u64 *offset_real)
>> {
>> 	*offset_real = tk_core.timekeeper.offs_real;
>>
>> 	if (timekeeping_active)
>> 		return ktime_get_mono_fast_ns();
>> 	else
>> 		return local_clock();
>> }
> 
> A solution would be to apply the offset_real immediately. I wonder if
> any tool expects the messages to be sorted by a monotonic clock. In

/var/log/messages from systemd will have to be fixed, but that's something that
was brought up previously (and IMO should be trivial based on the value in
/sys/modules/printk/parameters/time).

> fact, it might be useful to see that some messages are disordered
> against the real time, e.g. because of the leaf second.

I kicked a leap seconds during my testing (I have been running tests from
/tools/tests/selftests/timers) and didn't see anything strange with both a stock
leap-a-day.c and a modified leap-a-day.c which only does leap insertions.

P.

> 
> Best Regards,
> Petr
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/2] printk, Add printk.clock kernel parameter [v2]
  2016-01-14 12:52       ` Petr Mladek
  2016-01-14 14:39         ` Prarit Bhargava
@ 2016-01-14 14:44         ` Thomas Gleixner
  2016-01-21 16:09           ` Prarit Bhargava
  1 sibling, 1 reply; 12+ messages in thread
From: Thomas Gleixner @ 2016-01-14 14:44 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Prarit Bhargava, linux-kernel, John Stultz, Xunlei Pang,
	Baolin Wang, Andrew Morton, Greg Kroah-Hartman, Tejun Heo,
	Peter Hurley, Vasily Averin, Joe Perches

On Thu, 14 Jan 2016, Petr Mladek wrote:
> On Wed 2016-01-13 18:28:50, Thomas Gleixner wrote:
> > You can solve the whole business by changing the timestamp in printk_log to
> > 
> >    u64	    mono;
> >    u64	    offset_real;
> 
> This is not so easy because the structure is proceed by userspace tool,
> e.g. crash, see log_buf_kexec_setup(). We would need to update all
> the tools as well.

Fair enough.
 
> > and have a function which does:
> > 
> > u64 ktime_get_log_ts(u64 *offset_real)
> > {
> > 	*offset_real = tk_core.timekeeper.offs_real;
> > 
> > 	if (timekeeping_active)
> > 		return ktime_get_mono_fast_ns();
> > 	else
> > 		return local_clock();
> > }
> 
> A solution would be to apply the offset_real immediately. I wonder if
> any tool expects the messages to be sorted by a monotonic clock. In
> fact, it might be useful to see that some messages are disordered
> against the real time, e.g. because of the leaf second.

Not only leap seconds, it's also settimeofday and NTP might make the wall time
jump under certain conditions.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/2] printk, Add printk.clock kernel parameter [v2]
  2016-01-14 14:44         ` Thomas Gleixner
@ 2016-01-21 16:09           ` Prarit Bhargava
  2016-01-22  8:04             ` Thomas Gleixner
  0 siblings, 1 reply; 12+ messages in thread
From: Prarit Bhargava @ 2016-01-21 16:09 UTC (permalink / raw)
  To: Thomas Gleixner, Petr Mladek
  Cc: linux-kernel, John Stultz, Xunlei Pang, Baolin Wang,
	Andrew Morton, Greg Kroah-Hartman, Tejun Heo, Peter Hurley,
	Vasily Averin, Joe Perches



On 01/14/2016 09:44 AM, Thomas Gleixner wrote:
> On Thu, 14 Jan 2016, Petr Mladek wrote:
>> On Wed 2016-01-13 18:28:50, Thomas Gleixner wrote:
>>> You can solve the whole business by changing the timestamp in printk_log to
>>>
>>>    u64	    mono;
>>>    u64	    offset_real;
>>
>> This is not so easy because the structure is proceed by userspace tool,
>> e.g. crash, see log_buf_kexec_setup(). We would need to update all
>> the tools as well.
> 
> Fair enough.
>  
>>> and have a function which does:
>>>
>>> u64 ktime_get_log_ts(u64 *offset_real)
>>> {
>>> 	*offset_real = tk_core.timekeeper.offs_real;
>>>
>>> 	if (timekeeping_active)
>>> 		return ktime_get_mono_fast_ns();
>>> 	else
>>> 		return local_clock();
>>> }
>>
>> A solution would be to apply the offset_real immediately. I wonder if
>> any tool expects the messages to be sorted by a monotonic clock. In
>> fact, it might be useful to see that some messages are disordered
>> against the real time, e.g. because of the leaf second.

I've tested v2 with the leap second and haven't seen anything unusual on Fedora23.

> 
> Not only leap seconds, it's also settimeofday and NTP might make the wall time
> jump under certain conditions.

Also tried this as well to set the time to different timezones as well as
setting the RTC in BIOS to the wrong time to see if there were any issues.
Again, nothing seemed to happen.

... noting of course the previously mentioned issue with /var/log/messages &
systemd that Petr found.

So IIUC ... I want real time to be reported (not boot or TAI), use tglx's
suggestion of ktime_get_log_ts(), and implement printk.time=1 (local_clock())
and printk.time=2 (real clock).

Is that correct?

P.

> 
> Thanks,
> 
> 	tglx
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/2] printk, Add printk.clock kernel parameter [v2]
  2016-01-21 16:09           ` Prarit Bhargava
@ 2016-01-22  8:04             ` Thomas Gleixner
  0 siblings, 0 replies; 12+ messages in thread
From: Thomas Gleixner @ 2016-01-22  8:04 UTC (permalink / raw)
  To: Prarit Bhargava
  Cc: Petr Mladek, linux-kernel, John Stultz, Xunlei Pang, Baolin Wang,
	Andrew Morton, Greg Kroah-Hartman, Tejun Heo, Peter Hurley,
	Vasily Averin, Joe Perches

On Thu, 21 Jan 2016, Prarit Bhargava wrote:
> So IIUC ... I want real time to be reported (not boot or TAI), use tglx's
> suggestion of ktime_get_log_ts(), and implement printk.time=1 (local_clock())
> and printk.time=2 (real clock).

I'd like to see an option for clock mono as well. That helps to correlate
other things like user space tracing.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 0/2] printk, Add printk.clock kernel parameter [v2]
@ 2016-01-13 12:34 Prarit Bhargava
  0 siblings, 0 replies; 12+ messages in thread
From: Prarit Bhargava @ 2016-01-13 12:34 UTC (permalink / raw)
  To: linux-kernel
  Cc: Prarit Bhargava, John Stultz, Xunlei Pang, Thomas Gleixner,
	Baolin Wang, Andrew Morton, Greg Kroah-Hartman, Petr Mladek,
	Tejun Heo, Peter Hurley, Vasily Averin, Joe Perches

The script used in the analysis below:

dmesg_with_human_timestamps () {
    $(type -P dmesg) "$@" | perl -w -e 'use strict;
        my ($uptime) = do { local @ARGV="/proc/uptime";<>}; ($uptime) = ($uptime =~ /^(\d+)\./);
        foreach my $line (<>) {
            printf( ($line=~/^\[\s*(\d+)\.\d+\](.+)/) ? ( "[%s]%s\n", scalar localtime(time - $uptime + $1), $2 ) : $line )
        }'
}

dmesg_with_human_timestamps

----8<----

Over the past years I've seen many reports of bugs that include
time-stamped kernel logs (enabled when CONFIG_PRINTK_TIME=y or
print.time=1 is specified as a kernel parameter) that do not align
with either external time stamped logs or /var/log/messages.

For example,

[root@intel-wildcatpass-06 ~]# date; echo "Hello!" > /dev/kmsg ; date
Thu Dec 17 13:58:31 EST 2015
Thu Dec 17 13:58:31 EST 2015

which displays

[83973.768912] Hello!

on the serial console.

Running a script to convert this to "boot time",

[root@intel-wildcatpass-06 ~]# ./human.sh  | tail -1
[Thu Dec 17 13:59:57 2015] Hello!

which is already off by 1 minute and 26 seconds off after ~24 hours of
uptime.

This occurs because the time stamp is obtained from a call to
local_clock() which (on x86) is a direct call to the hardware.  These
hardware clock reads are not modified by the standard ntp or ptp protocol,
while the other timestamps are, and that results in situations external
time sources are further and further offset from the kernel log
timestamps.

This patchset introduces additional NMI safe timekeeping functions and the
kernel parameter printk.clock=[local|boot|real|tai] allowing a
user to specify an adjusted clock to use with printk timestamps.  The
hardware clock, or the existing functionality, is preserved by default.

[v2]: use NMI safe timekeeping access functions

Cc: John Stultz <john.stultz@linaro.org>
Cc: Xunlei Pang <pang.xunlei@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Baolin Wang <baolin.wang@linaro.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Petr Mladek <pmladek@suse.cz>
Cc: Tejun Heo <tj@kernel.org>
Cc: Peter Hurley <peter@hurleysoftware.com>
Cc: Vasily Averin <vvs@virtuozzo.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Prarit Bhargava <prarit@redhat.com>


Prarit Bhargava (2):
  kernel, timekeeping, add ktime_get_[boot|real|tai]_fast_ns functions
  printk, Add printk.clock kernel parameter

 include/linux/timekeeping.h |    3 +++
 kernel/printk/printk.c      |   54 +++++++++++++++++++++++++++++++++++++++++--
 kernel/time/timekeeping.c   |   52 +++++++++++++++++++++++++++++++++--------
 3 files changed, 97 insertions(+), 12 deletions(-)

-- 
1.7.9.3

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2016-01-22  8:05 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-01-13 12:34 [PATCH 0/2] printk, Add printk.clock kernel parameter [v2] Prarit Bhargava
2016-01-13 12:34 ` [PATCH 1/2] kernel, timekeeping, add ktime_get_[boot|real|tai]_fast_ns functions Prarit Bhargava
2016-01-13 12:34 ` [PATCH 2/2] printk, Add printk.clock kernel parameter Prarit Bhargava
2016-01-13 13:45 ` [PATCH 0/2] printk, Add printk.clock kernel parameter [v2] Thomas Gleixner
2016-01-13 14:36   ` Prarit Bhargava
2016-01-13 17:28     ` Thomas Gleixner
2016-01-14 12:52       ` Petr Mladek
2016-01-14 14:39         ` Prarit Bhargava
2016-01-14 14:44         ` Thomas Gleixner
2016-01-21 16:09           ` Prarit Bhargava
2016-01-22  8:04             ` Thomas Gleixner
  -- strict thread matches above, loose matches on Subject: below --
2016-01-13 12:34 Prarit Bhargava

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.