* [PATCH 0/2] printk, Add printk.clock kernel parameter [v2] @ 2016-01-13 12:34 Prarit Bhargava 2016-01-13 12:34 ` [PATCH 1/2] kernel, timekeeping, add ktime_get_[boot|real|tai]_fast_ns functions Prarit Bhargava ` (2 more replies) 0 siblings, 3 replies; 16+ messages in thread From: Prarit Bhargava @ 2016-01-13 12:34 UTC (permalink / raw) To: linux-kernel Cc: Prarit Bhargava, John Stultz, Xunlei Pang, Thomas Gleixner, Baolin Wang, Andrew Morton, Greg Kroah-Hartman, Petr Mladek, Tejun Heo, Peter Hurley, Vasily Averin, Joe Perches The script used in the analysis below: dmesg_with_human_timestamps () { $(type -P dmesg) "$@" | perl -w -e 'use strict; my ($uptime) = do { local @ARGV="/proc/uptime";<>}; ($uptime) = ($uptime =~ /^(\d+)\./); foreach my $line (<>) { printf( ($line=~/^\[\s*(\d+)\.\d+\](.+)/) ? ( "[%s]%s\n", scalar localtime(time - $uptime + $1), $2 ) : $line ) }' } dmesg_with_human_timestamps ----8<---- Over the past years I've seen many reports of bugs that include time-stamped kernel logs (enabled when CONFIG_PRINTK_TIME=y or print.time=1 is specified as a kernel parameter) that do not align with either external time stamped logs or /var/log/messages. For example, [root@intel-wildcatpass-06 ~]# date; echo "Hello!" > /dev/kmsg ; date Thu Dec 17 13:58:31 EST 2015 Thu Dec 17 13:58:31 EST 2015 which displays [83973.768912] Hello! on the serial console. Running a script to convert this to "boot time", [root@intel-wildcatpass-06 ~]# ./human.sh | tail -1 [Thu Dec 17 13:59:57 2015] Hello! which is already off by 1 minute and 26 seconds off after ~24 hours of uptime. This occurs because the time stamp is obtained from a call to local_clock() which (on x86) is a direct call to the hardware. These hardware clock reads are not modified by the standard ntp or ptp protocol, while the other timestamps are, and that results in situations external time sources are further and further offset from the kernel log timestamps. This patchset introduces additional NMI safe timekeeping functions and the kernel parameter printk.clock=[local|boot|real|tai] allowing a user to specify an adjusted clock to use with printk timestamps. The hardware clock, or the existing functionality, is preserved by default. [v2]: use NMI safe timekeeping access functions Cc: John Stultz <john.stultz@linaro.org> Cc: Xunlei Pang <pang.xunlei@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Baolin Wang <baolin.wang@linaro.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Petr Mladek <pmladek@suse.cz> Cc: Tejun Heo <tj@kernel.org> Cc: Peter Hurley <peter@hurleysoftware.com> Cc: Vasily Averin <vvs@virtuozzo.com> Cc: Joe Perches <joe@perches.com> Signed-off-by: Prarit Bhargava <prarit@redhat.com> Prarit Bhargava (2): kernel, timekeeping, add ktime_get_[boot|real|tai]_fast_ns functions printk, Add printk.clock kernel parameter include/linux/timekeeping.h | 3 +++ kernel/printk/printk.c | 54 +++++++++++++++++++++++++++++++++++++++++-- kernel/time/timekeeping.c | 52 +++++++++++++++++++++++++++++++++-------- 3 files changed, 97 insertions(+), 12 deletions(-) -- 1.7.9.3 ^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 1/2] kernel, timekeeping, add ktime_get_[boot|real|tai]_fast_ns functions 2016-01-13 12:34 [PATCH 0/2] printk, Add printk.clock kernel parameter [v2] Prarit Bhargava @ 2016-01-13 12:34 ` Prarit Bhargava 2016-01-13 12:34 ` [PATCH 2/2] printk, Add printk.clock kernel parameter Prarit Bhargava 2016-01-13 13:45 ` [PATCH 0/2] printk, Add printk.clock kernel parameter [v2] Thomas Gleixner 2 siblings, 0 replies; 16+ messages in thread From: Prarit Bhargava @ 2016-01-13 12:34 UTC (permalink / raw) To: linux-kernel Cc: Prarit Bhargava, John Stultz, Thomas Gleixner, Ingo Molnar, Xunlei Pang, Peter Zijlstra, Baolin Wang, Arnd Bergmann This patch introduces NMI safe time functions, ktime_get[boot|real|tai]_fast_ns() for use by the printk() timestamping code, which are based on the existing NMI safe versions of accessing the monotonic clock. Cc: John Stultz <john.stultz@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Xunlei Pang <pang.xunlei@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Baolin Wang <baolin.wang@linaro.org> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Prarit Bhargava <prarit@redhat.com> --- include/linux/timekeeping.h | 3 +++ kernel/time/timekeeping.c | 52 ++++++++++++++++++++++++++++++++++--------- 2 files changed, 45 insertions(+), 10 deletions(-) diff --git a/include/linux/timekeeping.h b/include/linux/timekeeping.h index ec89d84..c3f10f5 100644 --- a/include/linux/timekeeping.h +++ b/include/linux/timekeeping.h @@ -233,6 +233,9 @@ static inline u64 ktime_get_raw_ns(void) extern u64 ktime_get_mono_fast_ns(void); extern u64 ktime_get_raw_fast_ns(void); +extern u64 ktime_get_boot_fast_ns(void); +extern u64 ktime_get_real_fast_ns(void); +extern u64 ktime_get_tai_fast_ns(void); /* * Timespec interfaces utilizing the ktime based ones diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c index d563c19..5449a5cc 100644 --- a/kernel/time/timekeeping.c +++ b/kernel/time/timekeeping.c @@ -44,6 +44,14 @@ static struct { static DEFINE_RAW_SPINLOCK(timekeeper_lock); static struct timekeeper shadow_timekeeper; +static ktime_t *offsets[TK_OFFS_MAX] = { + [TK_OFFS_REAL] = &tk_core.timekeeper.offs_real, + [TK_OFFS_BOOT] = &tk_core.timekeeper.offs_boot, + [TK_OFFS_TAI] = &tk_core.timekeeper.offs_tai, +}; + +static int timekeeping_initialized; + /** * struct tk_fast - NMI safe timekeeper * @seq: Sequence counter for protecting updates. The lowest bit @@ -375,16 +383,22 @@ static void update_fast_timekeeper(struct tk_read_base *tkr, struct tk_fast *tkf * of the following timestamps. Callers need to be aware of that and * deal with it. */ -static __always_inline u64 __ktime_get_fast_ns(struct tk_fast *tkf) +static __always_inline u64 __ktime_get_fast_ns(struct tk_fast *tkf, + ktime_t *offset) { struct tk_read_base *tkr; unsigned int seq; u64 now; + ktime_t base; + + if (unlikely(!timekeeping_initialized)) + return 0; do { seq = raw_read_seqcount_latch(&tkf->seq); tkr = tkf->base + (seq & 0x01); - now = ktime_to_ns(tkr->base) + timekeeping_get_ns(tkr); + base = ktime_add(tkr->base, *offset); + now = ktime_to_ns(base) + timekeeping_get_ns(tkr); } while (read_seqcount_retry(&tkf->seq, seq)); return now; @@ -392,16 +406,38 @@ static __always_inline u64 __ktime_get_fast_ns(struct tk_fast *tkf) u64 ktime_get_mono_fast_ns(void) { - return __ktime_get_fast_ns(&tk_fast_mono); + ktime_t zero = ktime_set(0, 0); + + return __ktime_get_fast_ns(&tk_fast_mono, &zero); } EXPORT_SYMBOL_GPL(ktime_get_mono_fast_ns); u64 ktime_get_raw_fast_ns(void) { - return __ktime_get_fast_ns(&tk_fast_raw); + ktime_t zero = ktime_set(0, 0); + + return __ktime_get_fast_ns(&tk_fast_raw, &zero); } EXPORT_SYMBOL_GPL(ktime_get_raw_fast_ns); +u64 ktime_get_boot_fast_ns(void) +{ + return __ktime_get_fast_ns(&tk_fast_raw, offsets[TK_OFFS_BOOT]); +} +EXPORT_SYMBOL_GPL(ktime_get_boot_fast_ns); + +u64 ktime_get_real_fast_ns(void) +{ + return __ktime_get_fast_ns(&tk_fast_raw, offsets[TK_OFFS_REAL]); +} +EXPORT_SYMBOL_GPL(ktime_get_real_fast_ns); + +u64 ktime_get_tai_fast_ns(void) +{ + return __ktime_get_fast_ns(&tk_fast_raw, offsets[TK_OFFS_TAI]); +} +EXPORT_SYMBOL_GPL(ktime_get_tai_fast_ns); + /* Suspend-time cycles value for halted fast timekeeper. */ static cycle_t cycles_at_suspend; @@ -699,12 +735,6 @@ u32 ktime_get_resolution_ns(void) } EXPORT_SYMBOL_GPL(ktime_get_resolution_ns); -static ktime_t *offsets[TK_OFFS_MAX] = { - [TK_OFFS_REAL] = &tk_core.timekeeper.offs_real, - [TK_OFFS_BOOT] = &tk_core.timekeeper.offs_boot, - [TK_OFFS_TAI] = &tk_core.timekeeper.offs_tai, -}; - ktime_t ktime_get_with_offset(enum tk_offsets offs) { struct timekeeper *tk = &tk_core.timekeeper; @@ -1255,6 +1285,8 @@ void __init timekeeping_init(void) write_seqcount_end(&tk_core.seq); raw_spin_unlock_irqrestore(&timekeeper_lock, flags); + + timekeeping_initialized = 1; } /* time in seconds when suspend began for persistent clock */ -- 1.7.9.3 ^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 2/2] printk, Add printk.clock kernel parameter 2016-01-13 12:34 [PATCH 0/2] printk, Add printk.clock kernel parameter [v2] Prarit Bhargava 2016-01-13 12:34 ` [PATCH 1/2] kernel, timekeeping, add ktime_get_[boot|real|tai]_fast_ns functions Prarit Bhargava @ 2016-01-13 12:34 ` Prarit Bhargava 2016-01-13 13:45 ` [PATCH 0/2] printk, Add printk.clock kernel parameter [v2] Thomas Gleixner 2 siblings, 0 replies; 16+ messages in thread From: Prarit Bhargava @ 2016-01-13 12:34 UTC (permalink / raw) To: linux-kernel Cc: Prarit Bhargava, John Stultz, Xunlei Pang, Thomas Gleixner, Baolin Wang, Andrew Morton, Greg Kroah-Hartman, Petr Mladek, Tejun Heo, Peter Hurley, Vasily Averin, Joe Perches The script used in the analysis below: dmesg_with_human_timestamps () { $(type -P dmesg) "$@" | perl -w -e 'use strict; my ($uptime) = do { local @ARGV="/proc/uptime";<>}; ($uptime) = ($uptime =~ /^(\d+)\./); foreach my $line (<>) { printf( ($line=~/^\[\s*(\d+)\.\d+\](.+)/) ? ( "[%s]%s\n", scalar localtime(time - $uptime + $1), $2 ) : $line ) }' } dmesg_with_human_timestamps ----8<---- Over the past years I've seen many reports of bugs that include time-stamped kernel logs (enabled when CONFIG_PRINTK_TIME=y or print.time=1 is specified as a kernel parameter) that do not align with either external time stamped logs or /var/log/messages. For example, [root@intel-wildcatpass-06 ~]# date; echo "Hello!" > /dev/kmsg ; date Thu Dec 17 13:58:31 EST 2015 Thu Dec 17 13:58:31 EST 2015 which displays [83973.768912] Hello! on the serial console. Running a script to convert this to "boot time", [root@intel-wildcatpass-06 ~]# ./human.sh | tail -1 [Thu Dec 17 13:59:57 2015] Hello! which is already off by 1 minute and 26 seconds off after ~24 hours of uptime. This occurs because the time stamp is obtained from a call to local_clock() which (on x86) is a direct call to the hardware. These hardware clock reads are not modified by the standard ntp or ptp protocol, while the other timestamps are, and that results in situations external time sources are further and further offset from the kernel log timestamps. This patch introduces printk.clock=[local|boot|real|tai] allowing a user to specify an adjusted clock to use with printk timestamps. The hardware clock, or the existing functionality, is preserved by default. [v2]: use NMI safe timekeeping access functions Cc: John Stultz <john.stultz@linaro.org> Cc: Xunlei Pang <pang.xunlei@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Baolin Wang <baolin.wang@linaro.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Petr Mladek <pmladek@suse.cz> Cc: Tejun Heo <tj@kernel.org> Cc: Peter Hurley <peter@hurleysoftware.com> Cc: Vasily Averin <vvs@virtuozzo.com> Cc: Joe Perches <joe@perches.com> Signed-off-by: Prarit Bhargava <prarit@redhat.com> --- kernel/printk/printk.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 52 insertions(+), 2 deletions(-) diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index 2ce8826..671f0ec 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -419,6 +419,56 @@ static u32 truncate_msg(u16 *text_len, u16 *trunc_msg_len, return msg_used_size(*text_len + *trunc_msg_len, 0, pad_len); } +/* Default timestamp is local clock */ +static char *printk_clock = "local"; +static u64 (*printk_clock_fn)(void) = &local_clock; + +static int printk_clock_param_set(const char *val, + const struct kernel_param *kp) +{ + char *printk_clock_new = strstrip((char *)val); + int ret; + + /* check if change is needed */ + if (!strcmp(printk_clock_new, printk_clock)) + return 0; + + if (!strncmp("local", printk_clock_new, 5)) { + ret = param_set_charp(printk_clock_new, kp); + if (ret) + return ret; + printk_clock_fn = &local_clock; + } else if (!strncmp("real", printk_clock_new, 4)) { + ret = param_set_charp(printk_clock_new, kp); + if (ret) + return ret; + printk_clock_fn = &ktime_get_real_fast_ns; + } else if (!strncmp("boot", printk_clock_new, 4)) { + ret = param_set_charp(printk_clock_new, kp); + if (ret) + return ret; + printk_clock_fn = &ktime_get_boot_fast_ns; + } else if (!strncmp("tai", printk_clock_new, 3)) { + ret = param_set_charp(printk_clock_new, kp); + if (ret) + return ret; + printk_clock_fn = &ktime_get_tai_fast_ns; + } else { + return -EINVAL; + } + + pr_info("printk: timestamp set to %s clock.\n", printk_clock); + return 0; +} + +static struct kernel_param_ops printk_clock_param_ops = { + .set = printk_clock_param_set, + .get = param_get_charp, +}; + +module_param_cb(clock, &printk_clock_param_ops, &printk_clock, 0644); +MODULE_PARM_DESC(clock, "Clock used for printk timestamps. Can be local (hardware/default), boot, real, or tai.\n"); + /* insert record into the buffer, discard old ones, update heads */ static int log_store(int facility, int level, enum log_flags flags, u64 ts_nsec, @@ -467,7 +517,7 @@ static int log_store(int facility, int level, if (ts_nsec > 0) msg->ts_nsec = ts_nsec; else - msg->ts_nsec = local_clock(); + msg->ts_nsec = printk_clock_fn(); memset(log_dict(msg) + dict_len, 0, pad_len); msg->len = size; @@ -1613,7 +1663,7 @@ static bool cont_add(int facility, int level, const char *text, size_t len) cont.facility = facility; cont.level = level; cont.owner = current; - cont.ts_nsec = local_clock(); + cont.ts_nsec = printk_clock_fn(); cont.flags = 0; cont.cons = 0; cont.flushed = false; -- 1.7.9.3 ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH 0/2] printk, Add printk.clock kernel parameter [v2] 2016-01-13 12:34 [PATCH 0/2] printk, Add printk.clock kernel parameter [v2] Prarit Bhargava 2016-01-13 12:34 ` [PATCH 1/2] kernel, timekeeping, add ktime_get_[boot|real|tai]_fast_ns functions Prarit Bhargava 2016-01-13 12:34 ` [PATCH 2/2] printk, Add printk.clock kernel parameter Prarit Bhargava @ 2016-01-13 13:45 ` Thomas Gleixner 2016-01-13 14:36 ` Prarit Bhargava 2 siblings, 1 reply; 16+ messages in thread From: Thomas Gleixner @ 2016-01-13 13:45 UTC (permalink / raw) To: Prarit Bhargava Cc: linux-kernel, John Stultz, Xunlei Pang, Baolin Wang, Andrew Morton, Greg Kroah-Hartman, Petr Mladek, Tejun Heo, Peter Hurley, Vasily Averin, Joe Perches On Wed, 13 Jan 2016, Prarit Bhargava wrote: > This patchset introduces additional NMI safe timekeeping functions and the > kernel parameter printk.clock=[local|boot|real|tai] allowing a > user to specify an adjusted clock to use with printk timestamps. The > hardware clock, or the existing functionality, is preserved by default. You still fail to explain WHY we need a gazillion of different clocks here. What's the problem with using the fast monotonic clock instead of local_clock and be done with it? I really don't see the point why we would need boot/real/tai and all the extra churn in the fast clock. Thanks, tglx ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 0/2] printk, Add printk.clock kernel parameter [v2] 2016-01-13 13:45 ` [PATCH 0/2] printk, Add printk.clock kernel parameter [v2] Thomas Gleixner @ 2016-01-13 14:36 ` Prarit Bhargava 2016-01-13 17:28 ` Thomas Gleixner 0 siblings, 1 reply; 16+ messages in thread From: Prarit Bhargava @ 2016-01-13 14:36 UTC (permalink / raw) To: Thomas Gleixner Cc: linux-kernel, John Stultz, Xunlei Pang, Baolin Wang, Andrew Morton, Greg Kroah-Hartman, Petr Mladek, Tejun Heo, Peter Hurley, Vasily Averin, Joe Perches On 01/13/2016 08:45 AM, Thomas Gleixner wrote: > On Wed, 13 Jan 2016, Prarit Bhargava wrote: >> This patchset introduces additional NMI safe timekeeping functions and the >> kernel parameter printk.clock=[local|boot|real|tai] allowing a >> user to specify an adjusted clock to use with printk timestamps. The >> hardware clock, or the existing functionality, is preserved by default. > > You still fail to explain WHY we need a gazillion of different clocks > here. I've had cases in the past where an earlier warning/failures have resulted in a much later panics, across several systems. Trying to synchronize all of these events with wall clock time is all but impossible after the event has occurred. I've seen cases where earlier MCAs lead to panics, earlier I/O warnings have lead to panics, panics/problems at a specific time, etc. Attempting to figure out what happened in the lab or cluster is not trivial without having a timestamp that can actually be synchronized against a wall clock. In the case that made me finally submit this, the disks were generating seemingly random I/O timeout errors which meant at that point I had no logging to disk (and this assumes the systems are logging to disk because I'm seeing more and more systems that are not). I did manage to get dmesg from crash dumps, however, the problem then became trying to figure out exactly what time the system started having problems (Was there an external event that lead to the failures and panics? Are the early failures across systems at the same time, or did they occur over several hours? Did the systems all panic at the same time? Was the failure at a specific time after boot and due to a weird timeout? etc.) Trying to figure out what actually is happening & debugging becomes much easier with the above timestamp patch because I can actually tell what time something happened. Admittedly, I have not used TAI. I started by using REAL, and then the BOOT clock to see if this was some sort of strange 10-day timeout on the system. I only included TAI option for completeness. > > What's the problem with using the fast monotonic clock instead of local_clock > and be done with it? I really don't see the point why we would need > boot/real/tai and all the extra churn in the fast clock. AFAICT the local_clock() (on x86 at least) is accessed without accessing a lock and is just a tsc read. I assumed that local_clock() fast and lockless access was the "best" method for obtaining a time stamp. I would only suggest using the other clocks on systems that are "known stable", or running kernels that are considered to have stable timekeeping code. P. > ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 0/2] printk, Add printk.clock kernel parameter [v2] 2016-01-13 14:36 ` Prarit Bhargava @ 2016-01-13 17:28 ` Thomas Gleixner 2016-01-14 12:52 ` Petr Mladek 0 siblings, 1 reply; 16+ messages in thread From: Thomas Gleixner @ 2016-01-13 17:28 UTC (permalink / raw) To: Prarit Bhargava Cc: linux-kernel, John Stultz, Xunlei Pang, Baolin Wang, Andrew Morton, Greg Kroah-Hartman, Petr Mladek, Tejun Heo, Peter Hurley, Vasily Averin, Joe Perches On Wed, 13 Jan 2016, Prarit Bhargava wrote: > On 01/13/2016 08:45 AM, Thomas Gleixner wrote: > > On Wed, 13 Jan 2016, Prarit Bhargava wrote: > >> This patchset introduces additional NMI safe timekeeping functions and the > >> kernel parameter printk.clock=[local|boot|real|tai] allowing a > >> user to specify an adjusted clock to use with printk timestamps. The > >> hardware clock, or the existing functionality, is preserved by default. > > > > You still fail to explain WHY we need a gazillion of different clocks > > here. > > I've had cases in the past where an earlier warning/failures have resulted in a > much later panics, across several systems. Trying to synchronize all of these > events with wall clock time is all but impossible after the event has occurred. > I've seen cases where earlier MCAs lead to panics, earlier I/O warnings have > lead to panics, panics/problems at a specific time, etc. Attempting to figure > out what happened in the lab or cluster is not trivial without having a > timestamp that can actually be synchronized against a wall clock. > > In the case that made me finally submit this, the disks were generating > seemingly random I/O timeout errors which meant at that point I had no logging > to disk (and this assumes the systems are logging to disk because I'm seeing > more and more systems that are not). I did manage to get dmesg from crash > dumps, however, the problem then became trying to figure out exactly what time > the system started having problems (Was there an external event that lead to the > failures and panics? Are the early failures across systems at the same time, or > did they occur over several hours? Did the systems all panic at the same time? > Was the failure at a specific time after boot and due to a weird timeout? etc.) > > Trying to figure out what actually is happening & debugging becomes much easier > with the above timestamp patch because I can actually tell what time something > happened. > > Admittedly, I have not used TAI. I started by using REAL, and then the BOOT > clock to see if this was some sort of strange 10-day timeout on the system. I > only included TAI option for completeness. So I really don't see a reason why we would need TAI for that. Just because we can is not one. BOOT is questionable as well simply because the boot offset only changes when the system suspends/resumes. So we rather note that time delta in dmesg than having a gazillion of command line options. That leaves REAL as a possible useful option. Though the points where the MONO to REAL offset changes are rather limited as well (Initial RTC readout, do_settimeofday(), adjtimex(), leap seconds, suspend/resume). > > What's the problem with using the fast monotonic clock instead of local_clock > > and be done with it? I really don't see the point why we would need > > boot/real/tai and all the extra churn in the fast clock. > > AFAICT the local_clock() (on x86 at least) is accessed without accessing a lock > and is just a tsc read. I assumed that local_clock() fast and lockless access > was the "best" method for obtaining a time stamp. printk() is hardly a hotpath function and local_clock() is not much faster than ktime_get_mono_fast_ns(). It's lockless as well otherwise it would not be NMI safe. So what's the point? > I would only suggest using the other clocks on systems that are "known > stable", or running kernels that are considered to have stable timekeeping > code. I neither run production stuff on systems which are known to be unstable nor on a kernel which does not have a stable timekeeping code. Sorry, but your argumentation does not make any sense. You can solve the whole business by changing the timestamp in printk_log to u64 mono; u64 offset_real; and have a function which does: u64 ktime_get_log_ts(u64 *offset_real) { *offset_real = tk_core.timekeeper.offs_real; if (timekeeping_active) return ktime_get_mono_fast_ns(); else return local_clock(); } That uses extra 8 bytes of data per log buffer entry, but that's really a non issue. That already gives you the data for crash analysis. If you want to be able to have the clock REAL time stamps in console/dmesg/syslog then you simply can make printk_time an integer and act depending on the value: printk_time=0 timestamps off printk_time=1 timestamps clock MONO (default and backwards compatible) printk_time=2 timestamps clock REAL That's minimal intrusive and does the job without the need of a gazillion of command line options and new functions which are just pointless bloat. Note, that on 32bit systems accessing tk_core.timekeeper.offs_real is racy vs. a concurrent update, but with your proposed solution it's not any different. Thanks, tglx ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 0/2] printk, Add printk.clock kernel parameter [v2] 2016-01-13 17:28 ` Thomas Gleixner @ 2016-01-14 12:52 ` Petr Mladek 2016-01-14 14:39 ` Prarit Bhargava 2016-01-14 14:44 ` Thomas Gleixner 0 siblings, 2 replies; 16+ messages in thread From: Petr Mladek @ 2016-01-14 12:52 UTC (permalink / raw) To: Thomas Gleixner Cc: Prarit Bhargava, linux-kernel, John Stultz, Xunlei Pang, Baolin Wang, Andrew Morton, Greg Kroah-Hartman, Tejun Heo, Peter Hurley, Vasily Averin, Joe Perches On Wed 2016-01-13 18:28:50, Thomas Gleixner wrote: > You can solve the whole business by changing the timestamp in printk_log to > > u64 mono; > u64 offset_real; This is not so easy because the structure is proceed by userspace tool, e.g. crash, see log_buf_kexec_setup(). We would need to update all the tools as well. > and have a function which does: > > u64 ktime_get_log_ts(u64 *offset_real) > { > *offset_real = tk_core.timekeeper.offs_real; > > if (timekeeping_active) > return ktime_get_mono_fast_ns(); > else > return local_clock(); > } A solution would be to apply the offset_real immediately. I wonder if any tool expects the messages to be sorted by a monotonic clock. In fact, it might be useful to see that some messages are disordered against the real time, e.g. because of the leaf second. Best Regards, Petr ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 0/2] printk, Add printk.clock kernel parameter [v2] 2016-01-14 12:52 ` Petr Mladek @ 2016-01-14 14:39 ` Prarit Bhargava 2016-01-14 14:44 ` Thomas Gleixner 1 sibling, 0 replies; 16+ messages in thread From: Prarit Bhargava @ 2016-01-14 14:39 UTC (permalink / raw) To: Petr Mladek, Thomas Gleixner Cc: linux-kernel, John Stultz, Xunlei Pang, Baolin Wang, Andrew Morton, Greg Kroah-Hartman, Tejun Heo, Peter Hurley, Vasily Averin, Joe Perches On 01/14/2016 07:52 AM, Petr Mladek wrote: > On Wed 2016-01-13 18:28:50, Thomas Gleixner wrote: >> You can solve the whole business by changing the timestamp in printk_log to >> >> u64 mono; >> u64 offset_real; > > This is not so easy because the structure is proceed by userspace tool, > e.g. crash, see log_buf_kexec_setup(). We would need to update all > the tools as well. > > >> and have a function which does: >> >> u64 ktime_get_log_ts(u64 *offset_real) >> { >> *offset_real = tk_core.timekeeper.offs_real; >> >> if (timekeeping_active) >> return ktime_get_mono_fast_ns(); >> else >> return local_clock(); >> } > > A solution would be to apply the offset_real immediately. I wonder if > any tool expects the messages to be sorted by a monotonic clock. In /var/log/messages from systemd will have to be fixed, but that's something that was brought up previously (and IMO should be trivial based on the value in /sys/modules/printk/parameters/time). > fact, it might be useful to see that some messages are disordered > against the real time, e.g. because of the leaf second. I kicked a leap seconds during my testing (I have been running tests from /tools/tests/selftests/timers) and didn't see anything strange with both a stock leap-a-day.c and a modified leap-a-day.c which only does leap insertions. P. > > Best Regards, > Petr > ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 0/2] printk, Add printk.clock kernel parameter [v2] 2016-01-14 12:52 ` Petr Mladek 2016-01-14 14:39 ` Prarit Bhargava @ 2016-01-14 14:44 ` Thomas Gleixner 2016-01-21 16:09 ` Prarit Bhargava 1 sibling, 1 reply; 16+ messages in thread From: Thomas Gleixner @ 2016-01-14 14:44 UTC (permalink / raw) To: Petr Mladek Cc: Prarit Bhargava, linux-kernel, John Stultz, Xunlei Pang, Baolin Wang, Andrew Morton, Greg Kroah-Hartman, Tejun Heo, Peter Hurley, Vasily Averin, Joe Perches On Thu, 14 Jan 2016, Petr Mladek wrote: > On Wed 2016-01-13 18:28:50, Thomas Gleixner wrote: > > You can solve the whole business by changing the timestamp in printk_log to > > > > u64 mono; > > u64 offset_real; > > This is not so easy because the structure is proceed by userspace tool, > e.g. crash, see log_buf_kexec_setup(). We would need to update all > the tools as well. Fair enough. > > and have a function which does: > > > > u64 ktime_get_log_ts(u64 *offset_real) > > { > > *offset_real = tk_core.timekeeper.offs_real; > > > > if (timekeeping_active) > > return ktime_get_mono_fast_ns(); > > else > > return local_clock(); > > } > > A solution would be to apply the offset_real immediately. I wonder if > any tool expects the messages to be sorted by a monotonic clock. In > fact, it might be useful to see that some messages are disordered > against the real time, e.g. because of the leaf second. Not only leap seconds, it's also settimeofday and NTP might make the wall time jump under certain conditions. Thanks, tglx ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 0/2] printk, Add printk.clock kernel parameter [v2] 2016-01-14 14:44 ` Thomas Gleixner @ 2016-01-21 16:09 ` Prarit Bhargava 2016-01-22 8:04 ` Thomas Gleixner 0 siblings, 1 reply; 16+ messages in thread From: Prarit Bhargava @ 2016-01-21 16:09 UTC (permalink / raw) To: Thomas Gleixner, Petr Mladek Cc: linux-kernel, John Stultz, Xunlei Pang, Baolin Wang, Andrew Morton, Greg Kroah-Hartman, Tejun Heo, Peter Hurley, Vasily Averin, Joe Perches On 01/14/2016 09:44 AM, Thomas Gleixner wrote: > On Thu, 14 Jan 2016, Petr Mladek wrote: >> On Wed 2016-01-13 18:28:50, Thomas Gleixner wrote: >>> You can solve the whole business by changing the timestamp in printk_log to >>> >>> u64 mono; >>> u64 offset_real; >> >> This is not so easy because the structure is proceed by userspace tool, >> e.g. crash, see log_buf_kexec_setup(). We would need to update all >> the tools as well. > > Fair enough. > >>> and have a function which does: >>> >>> u64 ktime_get_log_ts(u64 *offset_real) >>> { >>> *offset_real = tk_core.timekeeper.offs_real; >>> >>> if (timekeeping_active) >>> return ktime_get_mono_fast_ns(); >>> else >>> return local_clock(); >>> } >> >> A solution would be to apply the offset_real immediately. I wonder if >> any tool expects the messages to be sorted by a monotonic clock. In >> fact, it might be useful to see that some messages are disordered >> against the real time, e.g. because of the leaf second. I've tested v2 with the leap second and haven't seen anything unusual on Fedora23. > > Not only leap seconds, it's also settimeofday and NTP might make the wall time > jump under certain conditions. Also tried this as well to set the time to different timezones as well as setting the RTC in BIOS to the wrong time to see if there were any issues. Again, nothing seemed to happen. ... noting of course the previously mentioned issue with /var/log/messages & systemd that Petr found. So IIUC ... I want real time to be reported (not boot or TAI), use tglx's suggestion of ktime_get_log_ts(), and implement printk.time=1 (local_clock()) and printk.time=2 (real clock). Is that correct? P. > > Thanks, > > tglx > ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 0/2] printk, Add printk.clock kernel parameter [v2] 2016-01-21 16:09 ` Prarit Bhargava @ 2016-01-22 8:04 ` Thomas Gleixner 0 siblings, 0 replies; 16+ messages in thread From: Thomas Gleixner @ 2016-01-22 8:04 UTC (permalink / raw) To: Prarit Bhargava Cc: Petr Mladek, linux-kernel, John Stultz, Xunlei Pang, Baolin Wang, Andrew Morton, Greg Kroah-Hartman, Tejun Heo, Peter Hurley, Vasily Averin, Joe Perches On Thu, 21 Jan 2016, Prarit Bhargava wrote: > So IIUC ... I want real time to be reported (not boot or TAI), use tglx's > suggestion of ktime_get_log_ts(), and implement printk.time=1 (local_clock()) > and printk.time=2 (real clock). I'd like to see an option for clock mono as well. That helps to correlate other things like user space tracing. Thanks, tglx ^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 0/2] printk, Add a printk.clock kernel parameter @ 2016-01-06 13:00 Prarit Bhargava 2016-01-06 13:00 ` [PATCH 2/2] printk, Add " Prarit Bhargava 0 siblings, 1 reply; 16+ messages in thread From: Prarit Bhargava @ 2016-01-06 13:00 UTC (permalink / raw) To: linux-kernel Cc: Prarit Bhargava, John Stultz, Xunlei Pang, Thomas Gleixner, Baolin Wang, Andrew Morton, Greg Kroah-Hartman, Petr Mladek, Tejun Heo, Peter Hurley, Vasily Averin, Joe Perches Over the past years I've seen many reports of bugs that include time-stamped kernel logs (enabled when CONFIG_PRINTK_TIME=y or print.time=1 is specified as a kernel parameter) that do not align with either external time stamped logs or /var/log/messages. For example, [root@intel-wildcatpass-06 ~]# date; echo "Hello!" > /dev/kmsg ; date Thu Dec 17 13:58:31 EST 2015 Thu Dec 17 13:58:31 EST 2015 which displays [83973.768912] Hello! on the serial console. Running a script to convert this to "boot time", [root@intel-wildcatpass-06 ~]# ./human.sh | tail -1 [Thu Dec 17 13:59:57 2015] Hello! which is already off by 1 minute and 26 seconds off after ~24 hours of uptime. This occurs because the time stamp is obtained from a call to local_clock() which (on x86) is a direct call to the hardware. These hardware clock reads are not modified by the standard ntp or ptp protocol, while the other timestamps are, and that results in situations where /var/log/messages timestamps are further and further offset from the hardware clock timestamps. This patch introduces printk.clock=[local|boot|real|tai] allowing a user to specify an adjusted clock to use with printk timestamps. The hardware clock, or the existing functionality, is preserved by default. Cc: John Stultz <john.stultz@linaro.org> Cc: Xunlei Pang <pang.xunlei@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Baolin Wang <baolin.wang@linaro.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Petr Mladek <pmladek@suse.cz> Cc: Tejun Heo <tj@kernel.org> Cc: Peter Hurley <peter@hurleysoftware.com> Cc: Vasily Averin <vvs@virtuozzo.com> Cc: Joe Perches <joe@perches.com> Signed-off-by: Prarit Bhargava <prarit@redhat.com> Prarit Bhargava (2): kernel, timekeeping, add trylock option to ktime_get_with_offset() printk, Add printk.clock kernel parameter include/linux/time64.h | 2 + include/linux/timekeeping.h | 50 +++++++++++++++++++-- kernel/printk/printk.c | 105 +++++++++++++++++++++++++++++++++++++++---- kernel/time/timekeeping.c | 15 ++++++- 4 files changed, 159 insertions(+), 13 deletions(-) -- 1.7.9.3 ^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 2/2] printk, Add printk.clock kernel parameter 2016-01-06 13:00 [PATCH 0/2] printk, Add a printk.clock kernel parameter Prarit Bhargava @ 2016-01-06 13:00 ` Prarit Bhargava 2016-01-07 14:57 ` Petr Mladek 0 siblings, 1 reply; 16+ messages in thread From: Prarit Bhargava @ 2016-01-06 13:00 UTC (permalink / raw) To: linux-kernel Cc: Prarit Bhargava, John Stultz, Xunlei Pang, Thomas Gleixner, Baolin Wang, Andrew Morton, Greg Kroah-Hartman, Petr Mladek, Tejun Heo, Peter Hurley, Vasily Averin, Joe Perches The script used in the analysis below: dmesg_with_human_timestamps () { $(type -P dmesg) "$@" | perl -w -e 'use strict; my ($uptime) = do { local @ARGV="/proc/uptime";<>}; ($uptime) = ($uptime =~ /^(\d+)\./); foreach my $line (<>) { printf( ($line=~/^\[\s*(\d+)\.\d+\](.+)/) ? ( "[%s]%s\n", scalar localtime(time - $uptime + $1), $2 ) : $line ) }' } dmesg_with_human_timestamps ----8<---- Over the past years I've seen many reports of bugs that include time-stamped kernel logs (enabled when CONFIG_PRINTK_TIME=y or print.time=1 is specified as a kernel parameter) that do not align with either external time stamped logs or /var/log/messages. For example, [root@intel-wildcatpass-06 ~]# date; echo "Hello!" > /dev/kmsg ; date Thu Dec 17 13:58:31 EST 2015 Thu Dec 17 13:58:31 EST 2015 which displays [83973.768912] Hello! on the serial console. Running a script to convert this to "boot time", [root@intel-wildcatpass-06 ~]# ./human.sh | tail -1 [Thu Dec 17 13:59:57 2015] Hello! which is already off by 1 minute and 26 seconds off after ~24 hours of uptime. This occurs because the time stamp is obtained from a call to local_clock() which (on x86) is a direct call to the hardware. These hardware clock reads are not modified by the standard ntp or ptp protocol, while the other timestamps are, and that results in situations where /var/log/messages timestamps are further and further offset from the hardware clock timestamps. This patch introduces printk.clock=[local|boot|real|tai] allowing a user to specify an adjusted clock to use with printk timestamps. The hardware clock, or the existing functionality, is preserved by default. If the timekeeper_lock (used to protect writes from reads) is under contention, the last known good time is output again and "??" is used as the last two digits of time. This avoids contention issues with a CPU panicking while another CPU is writing time data. Cc: John Stultz <john.stultz@linaro.org> Cc: Xunlei Pang <pang.xunlei@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Baolin Wang <baolin.wang@linaro.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Petr Mladek <pmladek@suse.cz> Cc: Tejun Heo <tj@kernel.org> Cc: Peter Hurley <peter@hurleysoftware.com> Cc: Vasily Averin <vvs@virtuozzo.com> Cc: Joe Perches <joe@perches.com> Signed-off-by: Prarit Bhargava <prarit@redhat.com> --- include/linux/time64.h | 2 + kernel/printk/printk.c | 105 ++++++++++++++++++++++++++++++++++++++++++++---- 2 files changed, 99 insertions(+), 8 deletions(-) diff --git a/include/linux/time64.h b/include/linux/time64.h index 367d5af..f4a35d9 100644 --- a/include/linux/time64.h +++ b/include/linux/time64.h @@ -39,6 +39,8 @@ struct itimerspec64 { #define TIME64_MAX ((s64)~((u64)1 << 63)) #define KTIME_MAX ((s64)~((u64)1 << 63)) #define KTIME_SEC_MAX (KTIME_MAX / NSEC_PER_SEC) +/* Maximum value returned by ktime_to_ns() */ +#define KTIME_MAX_NS ((u64)((U64_MAX>>1))) #if __BITS_PER_LONG == 64 diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index 2ce8826..ffc05ee 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -419,6 +419,83 @@ static u32 truncate_msg(u16 *text_len, u16 *trunc_msg_len, return msg_used_size(*text_len + *trunc_msg_len, 0, pad_len); } +static u64 printk_try_real_ns(void) +{ + return ktime_try_real_ns(); +} + +static u64 printk_try_boot_ns(void) +{ + return ktime_try_boot_ns(); +} + +static u64 printk_try_tai_ns(void) +{ + return ktime_try_tai_ns(); +} + +/* default to local clock */ +static u64 last_time_stamp; +static char *printk_clock = "local"; +static u64 (*printk_clock_fn)(void) = &local_clock; + +static u64 printk_get_timestamp(void) +{ + u64 ret = printk_clock_fn(); + + if (likely(ret != KTIME_MAX_NS)) + last_time_stamp = ret; + return ret; +} + +static int printk_clock_param_set(const char *val, + const struct kernel_param *kp) +{ + char *printk_clock_new = strstrip((char *)val); + int ret; + + /* check if change is needed */ + if (!strcmp(printk_clock_new, printk_clock)) + return 0; + + if (!strncmp("local", printk_clock_new, 5)) { + ret = param_set_charp(printk_clock_new, kp); + if (ret) + return ret; + printk_clock_fn = &local_clock; + } else if (!strncmp("real", printk_clock_new, 4)) { + ret = param_set_charp(printk_clock_new, kp); + if (ret) + return ret; + printk_clock_fn = &printk_try_real_ns; + } else if (!strncmp("boot", printk_clock_new, 4)) { + ret = param_set_charp(printk_clock_new, kp); + if (ret) + return ret; + printk_clock_fn = &printk_try_boot_ns; + } else if (!strncmp("tai", printk_clock_new, 3)) { + ret = param_set_charp(printk_clock_new, kp); + if (ret) + return ret; + printk_clock_fn = &printk_try_tai_ns; + } + + /* Did the clock change ? */ + if (strcmp(printk_clock_new, printk_clock)) + return -EINVAL; + + pr_info("printk: timestamp set to %s clock.\n", printk_clock); + return 0; +} + +static struct kernel_param_ops printk_clock_param_ops = { + .set = printk_clock_param_set, + .get = param_get_charp, +}; + +module_param_cb(clock, &printk_clock_param_ops, &printk_clock, 0644); +MODULE_PARM_DESC(clock, "Clock used for printk timestamps. Can be local (hardware/default), boot, real, or tai.\n"); + /* insert record into the buffer, discard old ones, update heads */ static int log_store(int facility, int level, enum log_flags flags, u64 ts_nsec, @@ -467,7 +544,7 @@ static int log_store(int facility, int level, if (ts_nsec > 0) msg->ts_nsec = ts_nsec; else - msg->ts_nsec = local_clock(); + msg->ts_nsec = printk_get_timestamp(); memset(log_dict(msg) + dict_len, 0, pad_len); msg->len = size; @@ -1039,18 +1116,30 @@ module_param_named(time, printk_time, bool, S_IRUGO | S_IWUSR); static size_t print_time(u64 ts, char *buf) { - unsigned long rem_nsec; + unsigned long rem; + size_t size; if (!printk_time) return 0; - rem_nsec = do_div(ts, 1000000000); + if (ts == KTIME_MAX_NS) + ts = last_time_stamp; - if (!buf) - return snprintf(NULL, 0, "[%5lu.000000] ", (unsigned long)ts); + if (buf) { + size = INT_MAX; + rem = do_div(ts, 1000000000) / 1000; + } else { + size = 0; + rem = 0; + } - return sprintf(buf, "[%5lu.%06lu] ", - (unsigned long)ts, rem_nsec / 1000); + if (ts != KTIME_MAX_NS) { + return snprintf(buf, size, "[%5lu.%06lu] ", + (unsigned long)ts, rem); + } + /* return last_time_stamp with ?? warning */ + return snprintf(buf, size, "[%5lu.%04lu??] ", + (unsigned long)ts, rem); } static size_t print_prefix(const struct printk_log *msg, bool syslog, char *buf) @@ -1613,7 +1702,7 @@ static bool cont_add(int facility, int level, const char *text, size_t len) cont.facility = facility; cont.level = level; cont.owner = current; - cont.ts_nsec = local_clock(); + cont.ts_nsec = printk_get_timestamp(); cont.flags = 0; cont.cons = 0; cont.flushed = false; -- 1.7.9.3 ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH 2/2] printk, Add printk.clock kernel parameter 2016-01-06 13:00 ` [PATCH 2/2] printk, Add " Prarit Bhargava @ 2016-01-07 14:57 ` Petr Mladek 2016-01-07 15:38 ` Prarit Bhargava 0 siblings, 1 reply; 16+ messages in thread From: Petr Mladek @ 2016-01-07 14:57 UTC (permalink / raw) To: Prarit Bhargava Cc: linux-kernel, John Stultz, Xunlei Pang, Thomas Gleixner, Baolin Wang, Andrew Morton, Greg Kroah-Hartman, Tejun Heo, Peter Hurley, Vasily Averin, Joe Perches On Wed 2016-01-06 08:00:34, Prarit Bhargava wrote: > This patch introduces printk.clock=[local|boot|real|tai] allowing a > user to specify an adjusted clock to use with printk timestamps. The > hardware clock, or the existing functionality, is preserved by default. > > If the timekeeper_lock (used to protect writes from reads) is under > contention, the last known good time is output again and "??" is used as > the last two digits of time. This avoids contention issues with a CPU > panicking while another CPU is writing time data. > > --- > include/linux/time64.h | 2 + > kernel/printk/printk.c | 105 ++++++++++++++++++++++++++++++++++++++++++++---- > 2 files changed, 99 insertions(+), 8 deletions(-) > > diff --git a/include/linux/time64.h b/include/linux/time64.h > index 367d5af..f4a35d9 100644 > --- a/include/linux/time64.h > +++ b/include/linux/time64.h > @@ -39,6 +39,8 @@ struct itimerspec64 { > #define TIME64_MAX ((s64)~((u64)1 << 63)) > #define KTIME_MAX ((s64)~((u64)1 << 63)) > #define KTIME_SEC_MAX (KTIME_MAX / NSEC_PER_SEC) > +/* Maximum value returned by ktime_to_ns() */ > +#define KTIME_MAX_NS ((u64)((U64_MAX>>1))) Please, use KTIME_NS_MAX to use the same naming scheme as KTIME_SEC_MAX. Well, I would return 0 when the time is not known. See also below. > #if __BITS_PER_LONG == 64 > > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c > index 2ce8826..ffc05ee 100644 > --- a/kernel/printk/printk.c > +++ b/kernel/printk/printk.c > @@ -419,6 +419,83 @@ static u32 truncate_msg(u16 *text_len, u16 *trunc_msg_len, > return msg_used_size(*text_len + *trunc_msg_len, 0, pad_len); > } > > +static u64 printk_try_real_ns(void) > +{ > + return ktime_try_real_ns(); > +} > + > +static u64 printk_try_boot_ns(void) > +{ > + return ktime_try_boot_ns(); > +} > + > +static u64 printk_try_tai_ns(void) > +{ > + return ktime_try_tai_ns(); > +} Do we need these wrappers? Why not using ktime_try_xxx_ns() directly? You already use local_clock(). > +static int printk_clock_param_set(const char *val, > + const struct kernel_param *kp) > +{ > + char *printk_clock_new = strstrip((char *)val); > + int ret; > + > + /* check if change is needed */ > + if (!strcmp(printk_clock_new, printk_clock)) > + return 0; > + > + if (!strncmp("local", printk_clock_new, 5)) { > + ret = param_set_charp(printk_clock_new, kp); > + if (ret) > + return ret; > + printk_clock_fn = &local_clock; > + } else if (!strncmp("real", printk_clock_new, 4)) { > + ret = param_set_charp(printk_clock_new, kp); > + if (ret) > + return ret; > + printk_clock_fn = &printk_try_real_ns; > + } else if (!strncmp("boot", printk_clock_new, 4)) { > + ret = param_set_charp(printk_clock_new, kp); > + if (ret) > + return ret; > + printk_clock_fn = &printk_try_boot_ns; > + } else if (!strncmp("tai", printk_clock_new, 3)) { > + ret = param_set_charp(printk_clock_new, kp); > + if (ret) > + return ret; > + printk_clock_fn = &printk_try_tai_ns; > + } > + > + /* Did the clock change ? */ > + if (strcmp(printk_clock_new, printk_clock)) > + return -EINVAL; I wonder if we need this extra check. It might be enough to print the error message in the final "else" above. > + pr_info("printk: timestamp set to %s clock.\n", printk_clock); > + return 0; > +} > + > +static struct kernel_param_ops printk_clock_param_ops = { > + .set = printk_clock_param_set, > + .get = param_get_charp, > +}; > + > +module_param_cb(clock, &printk_clock_param_ops, &printk_clock, 0644); > +MODULE_PARM_DESC(clock, "Clock used for printk timestamps. Can be local (hardware/default), boot, real, or tai.\n"); I have problems to parse this. It seems that the most common style used for possible values is to putthem into brackets. I wonder if this is better readable. MODULE_PARM_DESC(clock, "Clock used for printk timestamps (local = default, boot, real, tai)."); > + > /* insert record into the buffer, discard old ones, update heads */ > static int log_store(int facility, int level, > enum log_flags flags, u64 ts_nsec, > @@ -467,7 +544,7 @@ static int log_store(int facility, int level, > if (ts_nsec > 0) > msg->ts_nsec = ts_nsec; > else > - msg->ts_nsec = local_clock(); > + msg->ts_nsec = printk_get_timestamp(); Hmm, one problem is that each clock returns another type of time. "local" and "boot" clocks returns the number of ns since the boot. While "real" and "tai" clocks returns the number of ns since 1.1.1970 or so. The tools reading the timestamps are confused by this. For example, I get this: $> echo boot >/sys/module/printk/parameters/clock $> echo local >/sys/module/printk/parameters/clock $> echo real >/sys/module/printk/parameters/clock $> echo tai >/sys/module/printk/parameters/clock $> dmesg | tail -6 [ 6.976593] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX [ 6.977168] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready [ 77.500483] printk: timestamp set to boot clock. [ 81.419957] printk: timestamp set to local clock. [1452177961.544909] printk: timestamp set to real clock. [1452177965.224824] printk: timestamp set to tai clock. $> dmesg -T | tail -6 [Thu Jan 7 15:44:41 2016] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX [Thu Jan 7 15:44:41 2016] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready [Thu Jan 7 15:45:52 2016] printk: timestamp set to boot clock. [Thu Jan 7 15:45:56 2016] printk: timestamp set to local clock. [Fri Jan 13 06:30:36 2062] printk: timestamp set to real clock. [Fri Jan 13 06:30:40 2062] printk: timestamp set to tai clock. Please, note that the last messages looks as they are printed in a far far future ;-) > memset(log_dict(msg) + dict_len, 0, pad_len); > msg->len = size; > > @@ -1039,18 +1116,30 @@ module_param_named(time, printk_time, bool, S_IRUGO | S_IWUSR); > > static size_t print_time(u64 ts, char *buf) > { > - unsigned long rem_nsec; > + unsigned long rem; > + size_t size; > > if (!printk_time) > return 0; > > - rem_nsec = do_div(ts, 1000000000); > + if (ts == KTIME_MAX_NS) > + ts = last_time_stamp; > > - if (!buf) > - return snprintf(NULL, 0, "[%5lu.000000] ", (unsigned long)ts); > + if (buf) { > + size = INT_MAX; > + rem = do_div(ts, 1000000000) / 1000; > + } else { > + size = 0; > + rem = 0; > + } > > - return sprintf(buf, "[%5lu.%06lu] ", > - (unsigned long)ts, rem_nsec / 1000); > + if (ts != KTIME_MAX_NS) { > + return snprintf(buf, size, "[%5lu.%06lu] ", > + (unsigned long)ts, rem); > + } > + /* return last_time_stamp with ?? warning */ > + return snprintf(buf, size, "[%5lu.%04lu??] ", > + (unsigned long)ts, rem); I am not sure if this is worth the effort. All other tools parsing the log will be confused by this. Also there are tools that read the ring buffer dirrectly, e.g. "crash" reading a crashdump. They will need to get updated to handle KTIME_MAX_NS correctly. Instead, I would modify printk_get_timestamp() to return the last time stamp if the current time is not known. Then log_buf will included reasonable values. IMHO, it is not a big deal if two messages have the same time stamp. It already happens for messages that are printed before timekeeping machine is initialized. They all have zero time. Also the time stamp is created under the logbuf_lock(). It means that it is not the time when the printk() was called but it is the time when the printk() got access to the log_buf. By other words, the timestamp is not perfect anyway. Even better solution would be to always get some reasonable time. Especially, when the message is printed long time after the previous one. Best Regards, Petr ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 2/2] printk, Add printk.clock kernel parameter 2016-01-07 14:57 ` Petr Mladek @ 2016-01-07 15:38 ` Prarit Bhargava 2016-01-07 15:52 ` Petr Mladek 0 siblings, 1 reply; 16+ messages in thread From: Prarit Bhargava @ 2016-01-07 15:38 UTC (permalink / raw) To: Petr Mladek Cc: linux-kernel, John Stultz, Xunlei Pang, Thomas Gleixner, Baolin Wang, Andrew Morton, Greg Kroah-Hartman, Tejun Heo, Peter Hurley, Vasily Averin, Joe Perches On 01/07/2016 09:57 AM, Petr Mladek wrote: > On Wed 2016-01-06 08:00:34, Prarit Bhargava wrote: >> This patch introduces printk.clock=[local|boot|real|tai] allowing a >> user to specify an adjusted clock to use with printk timestamps. The >> hardware clock, or the existing functionality, is preserved by default. >> >> If the timekeeper_lock (used to protect writes from reads) is under >> contention, the last known good time is output again and "??" is used as >> the last two digits of time. This avoids contention issues with a CPU >> panicking while another CPU is writing time data. >> >> --- >> include/linux/time64.h | 2 + >> kernel/printk/printk.c | 105 ++++++++++++++++++++++++++++++++++++++++++++---- >> 2 files changed, 99 insertions(+), 8 deletions(-) >> >> diff --git a/include/linux/time64.h b/include/linux/time64.h >> index 367d5af..f4a35d9 100644 >> --- a/include/linux/time64.h >> +++ b/include/linux/time64.h >> @@ -39,6 +39,8 @@ struct itimerspec64 { >> #define TIME64_MAX ((s64)~((u64)1 << 63)) >> #define KTIME_MAX ((s64)~((u64)1 << 63)) >> #define KTIME_SEC_MAX (KTIME_MAX / NSEC_PER_SEC) >> +/* Maximum value returned by ktime_to_ns() */ >> +#define KTIME_MAX_NS ((u64)((U64_MAX>>1))) > > Please, use KTIME_NS_MAX to use the same naming scheme as > KTIME_SEC_MAX. > > Well, I would return 0 when the time is not known. See also below. > >> #if __BITS_PER_LONG == 64 >> >> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c >> index 2ce8826..ffc05ee 100644 >> --- a/kernel/printk/printk.c >> +++ b/kernel/printk/printk.c >> @@ -419,6 +419,83 @@ static u32 truncate_msg(u16 *text_len, u16 *trunc_msg_len, >> return msg_used_size(*text_len + *trunc_msg_len, 0, pad_len); >> } >> >> +static u64 printk_try_real_ns(void) >> +{ >> + return ktime_try_real_ns(); >> +} >> + >> +static u64 printk_try_boot_ns(void) >> +{ >> + return ktime_try_boot_ns(); >> +} >> + >> +static u64 printk_try_tai_ns(void) >> +{ >> + return ktime_try_tai_ns(); >> +} > > Do we need these wrappers? Why not using ktime_try_xxx_ns() directly? > You already use local_clock(). Unfortunately ktime_* are inlined functions, and I've mimicked the currently coding style there. In any case I'm doing a v2 with different functions. I'll see if making this stuff not-inlined is any better. > > >> +static int printk_clock_param_set(const char *val, >> + const struct kernel_param *kp) >> +{ >> + char *printk_clock_new = strstrip((char *)val); >> + int ret; >> + >> + /* check if change is needed */ >> + if (!strcmp(printk_clock_new, printk_clock)) >> + return 0; >> + >> + if (!strncmp("local", printk_clock_new, 5)) { >> + ret = param_set_charp(printk_clock_new, kp); >> + if (ret) >> + return ret; >> + printk_clock_fn = &local_clock; >> + } else if (!strncmp("real", printk_clock_new, 4)) { >> + ret = param_set_charp(printk_clock_new, kp); >> + if (ret) >> + return ret; >> + printk_clock_fn = &printk_try_real_ns; >> + } else if (!strncmp("boot", printk_clock_new, 4)) { >> + ret = param_set_charp(printk_clock_new, kp); >> + if (ret) >> + return ret; >> + printk_clock_fn = &printk_try_boot_ns; >> + } else if (!strncmp("tai", printk_clock_new, 3)) { >> + ret = param_set_charp(printk_clock_new, kp); >> + if (ret) >> + return ret; >> + printk_clock_fn = &printk_try_tai_ns; >> + } >> + >> + /* Did the clock change ? */ >> + if (strcmp(printk_clock_new, printk_clock)) >> + return -EINVAL; > > I wonder if we need this extra check. It might be enough to > print the error message in the final "else" above. I was trying to catch a local to local change or a real to real change. But yeah ... I could sort of do that in the final else above. > >> + pr_info("printk: timestamp set to %s clock.\n", printk_clock); >> + return 0; >> +} >> + >> +static struct kernel_param_ops printk_clock_param_ops = { >> + .set = printk_clock_param_set, >> + .get = param_get_charp, >> +}; >> + >> +module_param_cb(clock, &printk_clock_param_ops, &printk_clock, 0644); >> +MODULE_PARM_DESC(clock, "Clock used for printk timestamps. Can be local (hardware/default), boot, real, or tai.\n"); > > I have problems to parse this. It seems that the most common style > used for possible values is to putthem into brackets. I wonder > if this is better readable. > > MODULE_PARM_DESC(clock, "Clock used for printk timestamps (local = default, boot, real, tai)."); > Ah, thanks! :) I was wondering if there was a coding style preference there when I wrote this up. > >> + >> /* insert record into the buffer, discard old ones, update heads */ >> static int log_store(int facility, int level, >> enum log_flags flags, u64 ts_nsec, >> @@ -467,7 +544,7 @@ static int log_store(int facility, int level, >> if (ts_nsec > 0) >> msg->ts_nsec = ts_nsec; >> else >> - msg->ts_nsec = local_clock(); >> + msg->ts_nsec = printk_get_timestamp(); > > Hmm, one problem is that each clock returns another type of time. > "local" and "boot" clocks returns the number of ns since the boot. > While "real" and "tai" clocks returns the number of ns since 1.1.1970 > or so. > > The tools reading the timestamps are confused by this. For example, > I get this: > > $> echo boot >/sys/module/printk/parameters/clock > $> echo local >/sys/module/printk/parameters/clock > $> echo real >/sys/module/printk/parameters/clock > $> echo tai >/sys/module/printk/parameters/clock > $> dmesg | tail -6 > [ 6.976593] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX > [ 6.977168] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready > [ 77.500483] printk: timestamp set to boot clock. > [ 81.419957] printk: timestamp set to local clock. > [1452177961.544909] printk: timestamp set to real clock. > [1452177965.224824] printk: timestamp set to tai clock. > $> dmesg -T | tail -6 > [Thu Jan 7 15:44:41 2016] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX > [Thu Jan 7 15:44:41 2016] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready > [Thu Jan 7 15:45:52 2016] printk: timestamp set to boot clock. > [Thu Jan 7 15:45:56 2016] printk: timestamp set to local clock. > [Fri Jan 13 06:30:36 2062] printk: timestamp set to real clock. > [Fri Jan 13 06:30:40 2062] printk: timestamp set to tai clock. > > Please, note that the last messages looks as they are printed in > a far far future ;-) Heh :) I didn't even think of testing that. I'll have to think about that a bit more. systemd logging must assume boot time/local time. I'll ping those guys with a patch if/when the next version of this gets accepted. Thanks for catching this. > > >> memset(log_dict(msg) + dict_len, 0, pad_len); >> msg->len = size; >> >> @@ -1039,18 +1116,30 @@ module_param_named(time, printk_time, bool, S_IRUGO | S_IWUSR); >> >> static size_t print_time(u64 ts, char *buf) >> { >> - unsigned long rem_nsec; >> + unsigned long rem; >> + size_t size; >> >> if (!printk_time) >> return 0; >> >> - rem_nsec = do_div(ts, 1000000000); >> + if (ts == KTIME_MAX_NS) >> + ts = last_time_stamp; >> >> - if (!buf) >> - return snprintf(NULL, 0, "[%5lu.000000] ", (unsigned long)ts); >> + if (buf) { >> + size = INT_MAX; >> + rem = do_div(ts, 1000000000) / 1000; >> + } else { >> + size = 0; >> + rem = 0; >> + } >> >> - return sprintf(buf, "[%5lu.%06lu] ", >> - (unsigned long)ts, rem_nsec / 1000); >> + if (ts != KTIME_MAX_NS) { >> + return snprintf(buf, size, "[%5lu.%06lu] ", >> + (unsigned long)ts, rem); >> + } >> + /* return last_time_stamp with ?? warning */ >> + return snprintf(buf, size, "[%5lu.%04lu??] ", >> + (unsigned long)ts, rem); > > I am not sure if this is worth the effort. All other tools parsing the > log will be confused by this. Also there are tools that read the ring > buffer dirrectly, e.g. "crash" reading a crashdump. They will need to > get updated to handle KTIME_MAX_NS correctly. Yep -- the thread contains suggestions on how to avoid all of this by using the NMI safe ktime_* calls (and implementing new ones). I'm going to do that which should nicely clean all of this up. Thanks for the review Petr, P. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 2/2] printk, Add printk.clock kernel parameter 2016-01-07 15:38 ` Prarit Bhargava @ 2016-01-07 15:52 ` Petr Mladek 2016-01-08 9:04 ` Thomas Gleixner 0 siblings, 1 reply; 16+ messages in thread From: Petr Mladek @ 2016-01-07 15:52 UTC (permalink / raw) To: Prarit Bhargava Cc: linux-kernel, John Stultz, Xunlei Pang, Thomas Gleixner, Baolin Wang, Andrew Morton, Greg Kroah-Hartman, Tejun Heo, Peter Hurley, Vasily Averin, Joe Perches On Thu 2016-01-07 10:38:37, Prarit Bhargava wrote: > > > On 01/07/2016 09:57 AM, Petr Mladek wrote: > > On Wed 2016-01-06 08:00:34, Prarit Bhargava wrote: > >> This patch introduces printk.clock=[local|boot|real|tai] allowing a > >> user to specify an adjusted clock to use with printk timestamps. The > >> hardware clock, or the existing functionality, is preserved by default. > >> + pr_info("printk: timestamp set to %s clock.\n", printk_clock); > >> + return 0; > >> +} > >> + > >> +static struct kernel_param_ops printk_clock_param_ops = { > >> + .set = printk_clock_param_set, > >> + .get = param_get_charp, > >> +}; > >> + > >> +module_param_cb(clock, &printk_clock_param_ops, &printk_clock, 0644); > >> +MODULE_PARM_DESC(clock, "Clock used for printk timestamps. Can be local (hardware/default), boot, real, or tai.\n"); > > > > I have problems to parse this. It seems that the most common style > > used for possible values is to putthem into brackets. I wonder > > if this is better readable. > > > > MODULE_PARM_DESC(clock, "Clock used for printk timestamps (local = default, boot, real, tai)."); > > > > Ah, thanks! :) I was wondering if there was a coding style preference there > when I wrote this up. Heh, I am not sure if there is any coding style for this. I just went over the output from git grep MODULE_PARM_DESC > > > >> + > >> /* insert record into the buffer, discard old ones, update heads */ > >> static int log_store(int facility, int level, > >> enum log_flags flags, u64 ts_nsec, > >> @@ -467,7 +544,7 @@ static int log_store(int facility, int level, > >> if (ts_nsec > 0) > >> msg->ts_nsec = ts_nsec; > >> else > >> - msg->ts_nsec = local_clock(); > >> + msg->ts_nsec = printk_get_timestamp(); > > > > Hmm, one problem is that each clock returns another type of time. > > "local" and "boot" clocks returns the number of ns since the boot. > > While "real" and "tai" clocks returns the number of ns since 1.1.1970 > > or so. > > > > The tools reading the timestamps are confused by this. For example, > > I get this: > > > > $> echo boot >/sys/module/printk/parameters/clock > > $> echo local >/sys/module/printk/parameters/clock > > $> echo real >/sys/module/printk/parameters/clock > > $> echo tai >/sys/module/printk/parameters/clock > > $> dmesg | tail -6 > > [ 6.976593] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX > > [ 6.977168] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready > > [ 77.500483] printk: timestamp set to boot clock. > > [ 81.419957] printk: timestamp set to local clock. > > [1452177961.544909] printk: timestamp set to real clock. > > [1452177965.224824] printk: timestamp set to tai clock. > > $> dmesg -T | tail -6 > > [Thu Jan 7 15:44:41 2016] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX > > [Thu Jan 7 15:44:41 2016] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready > > [Thu Jan 7 15:45:52 2016] printk: timestamp set to boot clock. > > [Thu Jan 7 15:45:56 2016] printk: timestamp set to local clock. > > [Fri Jan 13 06:30:36 2062] printk: timestamp set to real clock. > > [Fri Jan 13 06:30:40 2062] printk: timestamp set to tai clock. > > > > Please, note that the last messages looks as they are printed in > > a far far future ;-) > > Heh :) I didn't even think of testing that. I'll have to think about that a > bit more. systemd logging must assume boot time/local time. I'll ping those > guys with a patch if/when the next version of this gets accepted. I am not sure how many tools would be affected by this. An easier solution would be to normalize all times and always save the relative time since the boot as it is done now. Best Regards, Petr ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 2/2] printk, Add printk.clock kernel parameter 2016-01-07 15:52 ` Petr Mladek @ 2016-01-08 9:04 ` Thomas Gleixner 0 siblings, 0 replies; 16+ messages in thread From: Thomas Gleixner @ 2016-01-08 9:04 UTC (permalink / raw) To: Petr Mladek Cc: Prarit Bhargava, linux-kernel, John Stultz, Xunlei Pang, Baolin Wang, Andrew Morton, Greg Kroah-Hartman, Tejun Heo, Peter Hurley, Vasily Averin, Joe Perches On Thu, 7 Jan 2016, Petr Mladek wrote: > On Thu 2016-01-07 10:38:37, Prarit Bhargava wrote: > > Heh :) I didn't even think of testing that. I'll have to think about that a > > bit more. systemd logging must assume boot time/local time. I'll ping those > > guys with a patch if/when the next version of this gets accepted. > > I am not sure how many tools would be affected by this. I don't see a point in a gazillion of different clocks for printk either. > An easier solution would be to normalize all times and always save the > relative time since the boot as it is done now. Why? What's wrong with simply using the existing ktime_get_mono_fast_ns() instead of local_clock() and be done with it? Thanks, tglx ^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2016-01-22 8:05 UTC | newest] Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2016-01-13 12:34 [PATCH 0/2] printk, Add printk.clock kernel parameter [v2] Prarit Bhargava 2016-01-13 12:34 ` [PATCH 1/2] kernel, timekeeping, add ktime_get_[boot|real|tai]_fast_ns functions Prarit Bhargava 2016-01-13 12:34 ` [PATCH 2/2] printk, Add printk.clock kernel parameter Prarit Bhargava 2016-01-13 13:45 ` [PATCH 0/2] printk, Add printk.clock kernel parameter [v2] Thomas Gleixner 2016-01-13 14:36 ` Prarit Bhargava 2016-01-13 17:28 ` Thomas Gleixner 2016-01-14 12:52 ` Petr Mladek 2016-01-14 14:39 ` Prarit Bhargava 2016-01-14 14:44 ` Thomas Gleixner 2016-01-21 16:09 ` Prarit Bhargava 2016-01-22 8:04 ` Thomas Gleixner -- strict thread matches above, loose matches on Subject: below -- 2016-01-06 13:00 [PATCH 0/2] printk, Add a printk.clock kernel parameter Prarit Bhargava 2016-01-06 13:00 ` [PATCH 2/2] printk, Add " Prarit Bhargava 2016-01-07 14:57 ` Petr Mladek 2016-01-07 15:38 ` Prarit Bhargava 2016-01-07 15:52 ` Petr Mladek 2016-01-08 9:04 ` Thomas Gleixner
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.