linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] arm64: Allow early timestamping of kernel log
@ 2019-07-22 10:33 Marc Zyngier
  2019-07-22 10:33 ` [PATCH 1/3] printk: Allow architecture-specific timestamping function Marc Zyngier
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: Marc Zyngier @ 2019-07-22 10:33 UTC (permalink / raw)
  To: Thomas Gleixner, John Stultz, Pavel Tatashin, Petr Mladek,
	Sergey Senozhatsky, Steven Rostedt, Will Deacon, Catalin Marinas,
	Mark Rutland
  Cc: linux-kernel, linux-arm-kernel

So far, we've let the arm64 kernel start its meaningful time stamping
of the kernel log pretty late, which is caused by sched_clock() being
initialised rather late compared to other architectures.

Pavel Tatashin proposed[1] to move the initialisation of sched_clock
much earlier, which I had objections to. The reason for initialising
sched_clock late is that a number of systems have broken counters, and
we need to apply all kind of terrifying workarounds to avoid time
going backward on the affected platforms. Being able to identify the
right workaround comes pretty late in the kernel boot, and providing
an unreliable sched_clock, even for a short period of time, isn't an
appealing prospect.

To address this, I'm proposing that we allow an architecture to chose
to (1) divorce time stamping and sched_clock during the early phase of
booting, and (2) inherit the time stamping clock as the new epoch the
first time a sched_sched clock gets registered.

(1) would allow arm64 to provide a time stamping clock, however
unreliable it might be, while (2) would allow sched_clock to provide
time stamps that are continuous with the time-stamping clock.

The last patch in the series adds the necessary logic to arm64,
allowing the (potentially unreliable) time stamping of early kernel
messages.

Tested on a bunch of arm64 systems, both bare-metal and in VMs. Boot
tested on a x86 guest.

[1] https://lore.kernel.org/patchwork/patch/1015110/

Marc Zyngier (3):
  printk: Allow architecture-specific timestamping function
  sched/clock: Allow sched_clock to inherit timestamp_clock epoch
  arm64: Allow early time stamping

 arch/arm64/Kconfig          |  3 +++
 arch/arm64/kernel/setup.c   | 44 +++++++++++++++++++++++++++++++++++++
 include/linux/sched/clock.h | 13 +++++++++++
 kernel/printk/printk.c      |  4 ++--
 kernel/time/sched_clock.c   | 10 +++++++++
 5 files changed, 72 insertions(+), 2 deletions(-)

-- 
2.20.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/3] printk: Allow architecture-specific timestamping function
  2019-07-22 10:33 [PATCH 0/3] arm64: Allow early timestamping of kernel log Marc Zyngier
@ 2019-07-22 10:33 ` Marc Zyngier
  2019-07-22 11:25   ` Petr Mladek
  2019-07-22 10:33 ` [PATCH 2/3] sched/clock: Allow sched_clock to inherit timestamp_clock epoch Marc Zyngier
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 12+ messages in thread
From: Marc Zyngier @ 2019-07-22 10:33 UTC (permalink / raw)
  To: Thomas Gleixner, John Stultz, Pavel Tatashin, Petr Mladek,
	Sergey Senozhatsky, Steven Rostedt, Will Deacon, Catalin Marinas,
	Mark Rutland
  Cc: linux-kernel, linux-arm-kernel

printk currently relies on local_clock to time-stamp the kernel
messages. In order to allow the timestamping (and only that)
to be overridden by architecture-specific code, let's declare
a new timestamp_clock() function, which gets used by the printk
code. Architectures willing to make use of this facility will
have to define CONFIG_ARCH_HAS_TIMESTAMP_CLOCK.

The default is of course to return local_clock(), so that the
existing behaviour stays unchanged.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 include/linux/sched/clock.h | 13 +++++++++++++
 kernel/printk/printk.c      |  4 ++--
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/include/linux/sched/clock.h b/include/linux/sched/clock.h
index 867d588314e0..3cf4b2a8ce18 100644
--- a/include/linux/sched/clock.h
+++ b/include/linux/sched/clock.h
@@ -98,4 +98,17 @@ static inline void enable_sched_clock_irqtime(void) {}
 static inline void disable_sched_clock_irqtime(void) {}
 #endif
 
+#ifdef CONFIG_ARCH_HAS_TIMESTAMP_CLOCK
+/* Special need architectures can provide their timestamping function */
+extern u64 timestamp_clock(void);
+
+#else
+
+static inline u64 timestamp_clock(void)
+{
+	return local_clock();
+}
+
+#endif
+
 #endif /* _LINUX_SCHED_CLOCK_H */
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 1888f6a3b694..166702316714 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -638,7 +638,7 @@ static int log_store(u32 caller_id, int facility, int level,
 	if (ts_nsec > 0)
 		msg->ts_nsec = ts_nsec;
 	else
-		msg->ts_nsec = local_clock();
+		msg->ts_nsec = timestamp_clock();
 #ifdef CONFIG_PRINTK_CALLER
 	msg->caller_id = caller_id;
 #endif
@@ -1841,7 +1841,7 @@ static bool cont_add(u32 caller_id, int facility, int level,
 		cont.facility = facility;
 		cont.level = level;
 		cont.caller_id = caller_id;
-		cont.ts_nsec = local_clock();
+		cont.ts_nsec = timestamp_clock();
 		cont.flags = flags;
 	}
 
-- 
2.20.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/3] sched/clock: Allow sched_clock to inherit timestamp_clock epoch
  2019-07-22 10:33 [PATCH 0/3] arm64: Allow early timestamping of kernel log Marc Zyngier
  2019-07-22 10:33 ` [PATCH 1/3] printk: Allow architecture-specific timestamping function Marc Zyngier
@ 2019-07-22 10:33 ` Marc Zyngier
  2019-07-22 10:33 ` [PATCH 3/3] arm64: Allow early time stamping Marc Zyngier
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: Marc Zyngier @ 2019-07-22 10:33 UTC (permalink / raw)
  To: Thomas Gleixner, John Stultz, Pavel Tatashin, Petr Mladek,
	Sergey Senozhatsky, Steven Rostedt, Will Deacon, Catalin Marinas,
	Mark Rutland
  Cc: linux-kernel, linux-arm-kernel

Now that we can let an architecture override the timestamping
function, it becomes desirable to ensure that, should the
architecture code switch its timestamping code to sched_clock
once it has been registered, the sched_clock inherits the
timestamp value as its new epoch.

This ensures that the time stamps are continuous and that there
is no jitter other than that introduced by the lack of quality
of the timestamping clock.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 kernel/time/sched_clock.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/kernel/time/sched_clock.c b/kernel/time/sched_clock.c
index 142b07619918..ee1bd449ec81 100644
--- a/kernel/time/sched_clock.c
+++ b/kernel/time/sched_clock.c
@@ -192,6 +192,16 @@ sched_clock_register(u64 (*read)(void), int bits, unsigned long rate)
 	new_epoch = read();
 	cyc = cd.actual_read_sched_clock();
 	ns = rd.epoch_ns + cyc_to_ns((cyc - rd.epoch_cyc) & rd.sched_clock_mask, rd.mult, rd.shift);
+
+	/*
+	 * If the architecture has a timestamp clock, and this is the
+	 * first time we register a new sched_clock, use the timestamp
+	 * clock as the epoch.
+	 */
+	if (IS_ENABLED(CONFIG_ARCH_HAS_TIMESTAMP_CLOCK) &&
+	    unlikely(cd.actual_read_sched_clock == jiffy_sched_clock_read))
+		ns = timestamp_clock();
+
 	cd.actual_read_sched_clock = read;
 
 	rd.read_sched_clock	= read;
-- 
2.20.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/3] arm64: Allow early time stamping
  2019-07-22 10:33 [PATCH 0/3] arm64: Allow early timestamping of kernel log Marc Zyngier
  2019-07-22 10:33 ` [PATCH 1/3] printk: Allow architecture-specific timestamping function Marc Zyngier
  2019-07-22 10:33 ` [PATCH 2/3] sched/clock: Allow sched_clock to inherit timestamp_clock epoch Marc Zyngier
@ 2019-07-22 10:33 ` Marc Zyngier
  2019-07-22 20:52 ` [PATCH 0/3] arm64: Allow early timestamping of kernel log Pavel Tatashin
  2019-07-23  2:42 ` Hanjun Guo
  4 siblings, 0 replies; 12+ messages in thread
From: Marc Zyngier @ 2019-07-22 10:33 UTC (permalink / raw)
  To: Thomas Gleixner, John Stultz, Pavel Tatashin, Petr Mladek,
	Sergey Senozhatsky, Steven Rostedt, Will Deacon, Catalin Marinas,
	Mark Rutland
  Cc: linux-kernel, linux-arm-kernel

In order to provide early timestamping on arm64 systems, we provide
a timestamp_clock() function that is available as early as possible.
This function simply returns the current counter value scales in
nanoseconds, and 0-based.

In order to deal with the idiosyncrasies of some broken platforms,
we condition this on the frequency being set in the CNTFRQ_EL0
register, and revert back to using local_clock() as soon as it starts
returning non-zero values.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/Kconfig        |  3 +++
 arch/arm64/kernel/setup.c | 44 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 47 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 3adcec05b1f6..ac3882ddc1c2 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -911,6 +911,9 @@ config ARCH_WANT_HUGE_PMD_SHARE
 config ARCH_HAS_CACHE_LINE_SIZE
 	def_bool y
 
+config ARCH_HAS_TIMESTAMP_CLOCK
+	def_bool y
+
 config ARCH_ENABLE_SPLIT_PMD_PTLOCK
 	def_bool y if PGTABLE_LEVELS > 2
 
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 9c4bad7d7131..cf21e3df7165 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -15,6 +15,7 @@
 #include <linux/initrd.h>
 #include <linux/console.h>
 #include <linux/cache.h>
+#include <linux/clocksource.h>
 #include <linux/screen_info.h>
 #include <linux/init.h>
 #include <linux/kexec.h>
@@ -28,10 +29,12 @@
 #include <linux/of_fdt.h>
 #include <linux/efi.h>
 #include <linux/psci.h>
+#include <linux/sched/clock.h>
 #include <linux/sched/task.h>
 #include <linux/mm.h>
 
 #include <asm/acpi.h>
+#include <asm/arch_timer.h>
 #include <asm/fixmap.h>
 #include <asm/cpu.h>
 #include <asm/cputype.h>
@@ -269,8 +272,49 @@ arch_initcall(reserve_memblock_reserved_regions);
 
 u64 __cpu_logical_map[NR_CPUS] = { [0 ... NR_CPUS-1] = INVALID_HWID };
 
+static u64 notrace ts_counter_read_cc(const struct cyclecounter *cc)
+{
+	return __arch_counter_get_cntvct();
+}
+
+static struct cyclecounter ts_cc __ro_after_init = {
+	.read	= ts_counter_read_cc,
+	.mask	= CLOCKSOURCE_MASK(56),
+};
+
+static struct timecounter ts_tc;
+
+static bool ts_cc_valid __ro_after_init = false;
+
+static __init void timestamp_clock_init(void)
+{
+	u64 frq = arch_timer_get_cntfrq();
+
+	if (!frq)
+		return;
+
+	clocks_calc_mult_shift(&ts_cc.mult, &ts_cc.shift,
+			       frq, NSEC_PER_SEC, 3600);
+	/* timestamp starts at 0 (local_clock is a good enough approximation) */
+	timecounter_init(&ts_tc, &ts_cc, local_clock());
+	ts_cc_valid = true;
+	pr_info("Using timestamp clock @%lluMHz\n", frq / 1000 / 1000);
+}
+
+u64 timestamp_clock(void)
+{
+	u64 ns = local_clock();
+
+	if (likely(ns || !ts_cc_valid))
+		return ns;
+
+	return timecounter_read(&ts_tc);
+}
+
 void __init setup_arch(char **cmdline_p)
 {
+	timestamp_clock_init();
+
 	init_mm.start_code = (unsigned long) _text;
 	init_mm.end_code   = (unsigned long) _etext;
 	init_mm.end_data   = (unsigned long) _edata;
-- 
2.20.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] printk: Allow architecture-specific timestamping function
  2019-07-22 10:33 ` [PATCH 1/3] printk: Allow architecture-specific timestamping function Marc Zyngier
@ 2019-07-22 11:25   ` Petr Mladek
  2019-07-22 12:47     ` Marc Zyngier
  0 siblings, 1 reply; 12+ messages in thread
From: Petr Mladek @ 2019-07-22 11:25 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Mark Rutland, Pavel Tatashin, Catalin Marinas, Will Deacon,
	linux-kernel, Steven Rostedt, Sergey Senozhatsky, John Stultz,
	Thomas Gleixner, linux-arm-kernel

On Mon 2019-07-22 11:33:28, Marc Zyngier wrote:
> printk currently relies on local_clock to time-stamp the kernel
> messages. In order to allow the timestamping (and only that)
> to be overridden by architecture-specific code, let's declare
> a new timestamp_clock() function, which gets used by the printk
> code. Architectures willing to make use of this facility will
> have to define CONFIG_ARCH_HAS_TIMESTAMP_CLOCK.
>
> The default is of course to return local_clock(), so that the
> existing behaviour stays unchanged.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  include/linux/sched/clock.h | 13 +++++++++++++
>  kernel/printk/printk.c      |  4 ++--
>  2 files changed, 15 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/sched/clock.h b/include/linux/sched/clock.h
> index 867d588314e0..3cf4b2a8ce18 100644
> --- a/include/linux/sched/clock.h
> +++ b/include/linux/sched/clock.h
> @@ -98,4 +98,17 @@ static inline void enable_sched_clock_irqtime(void) {}
>  static inline void disable_sched_clock_irqtime(void) {}
>  #endif
>  
> +#ifdef CONFIG_ARCH_HAS_TIMESTAMP_CLOCK
> +/* Special need architectures can provide their timestamping function */

The commit message and the above comment should be more specific
about what are the special needs.

It must be clear how and why the clock differs from the other
clocks, especially from lock_clock().

Also the first mail says that timestamp_clock() might be
unstable. Is this true only during the early boot or all
the time?

The timestamp helps to order the events. An unstable clock
might be better than nothing during the boot. But it would
look strange to use it all the time, especially when it was
unrelated to any other clock used by the system.

Best Regards,
Petr

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] printk: Allow architecture-specific timestamping function
  2019-07-22 11:25   ` Petr Mladek
@ 2019-07-22 12:47     ` Marc Zyngier
  2019-07-22 13:03       ` Russell King - ARM Linux admin
  0 siblings, 1 reply; 12+ messages in thread
From: Marc Zyngier @ 2019-07-22 12:47 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Mark Rutland, Pavel Tatashin, Catalin Marinas, Will Deacon,
	linux-kernel, Steven Rostedt, Sergey Senozhatsky, John Stultz,
	Thomas Gleixner, linux-arm-kernel

On 22/07/2019 12:25, Petr Mladek wrote:
> On Mon 2019-07-22 11:33:28, Marc Zyngier wrote:
>> printk currently relies on local_clock to time-stamp the kernel
>> messages. In order to allow the timestamping (and only that)
>> to be overridden by architecture-specific code, let's declare
>> a new timestamp_clock() function, which gets used by the printk
>> code. Architectures willing to make use of this facility will
>> have to define CONFIG_ARCH_HAS_TIMESTAMP_CLOCK.
>>
>> The default is of course to return local_clock(), so that the
>> existing behaviour stays unchanged.
>>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> ---
>>  include/linux/sched/clock.h | 13 +++++++++++++
>>  kernel/printk/printk.c      |  4 ++--
>>  2 files changed, 15 insertions(+), 2 deletions(-)
>>
>> diff --git a/include/linux/sched/clock.h b/include/linux/sched/clock.h
>> index 867d588314e0..3cf4b2a8ce18 100644
>> --- a/include/linux/sched/clock.h
>> +++ b/include/linux/sched/clock.h
>> @@ -98,4 +98,17 @@ static inline void enable_sched_clock_irqtime(void) {}
>>  static inline void disable_sched_clock_irqtime(void) {}
>>  #endif
>>  
>> +#ifdef CONFIG_ARCH_HAS_TIMESTAMP_CLOCK
>> +/* Special need architectures can provide their timestamping function */
> 
> The commit message and the above comment should be more specific
> about what are the special needs.
> 
> It must be clear how and why the clock differs from the other
> clocks, especially from lock_clock().

Fair enough. How about something along the lines of:

"An architecture can override the timestamp clock (which defaults to
local_clock) if local_clock is not significant early enough (sched_clock
being available too late)."

> Also the first mail says that timestamp_clock() might be
> unstable. Is this true only during the early boot or all
> the time?

With the current proposal, the instability period only exists until
sched_clock gets initialized, at which point it takes over and the
timestamping becomes stable.

Note that this is the arm64 implementation, and not something that is
currently guaranteed by just selecting CONFIG_ARCH_HAS_TIMESTAMP_CLOCK.

> 
> The timestamp helps to order the events. An unstable clock
> might be better than nothing during the boot. But it would
> look strange to use it all the time, especially when it was
> unrelated to any other clock used by the system.

And that's exactly what patch #3 implements. Once local_clock() returns
something significant, we use that.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] printk: Allow architecture-specific timestamping function
  2019-07-22 12:47     ` Marc Zyngier
@ 2019-07-22 13:03       ` Russell King - ARM Linux admin
  2019-07-22 13:26         ` Marc Zyngier
  0 siblings, 1 reply; 12+ messages in thread
From: Russell King - ARM Linux admin @ 2019-07-22 13:03 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Mark Rutland, Petr Mladek, Pavel Tatashin, Catalin Marinas,
	Will Deacon, linux-kernel, Steven Rostedt, Sergey Senozhatsky,
	John Stultz, Thomas Gleixner, linux-arm-kernel

On Mon, Jul 22, 2019 at 01:47:57PM +0100, Marc Zyngier wrote:
> On 22/07/2019 12:25, Petr Mladek wrote:
> > On Mon 2019-07-22 11:33:28, Marc Zyngier wrote:
> >> printk currently relies on local_clock to time-stamp the kernel
> >> messages. In order to allow the timestamping (and only that)
> >> to be overridden by architecture-specific code, let's declare
> >> a new timestamp_clock() function, which gets used by the printk
> >> code. Architectures willing to make use of this facility will
> >> have to define CONFIG_ARCH_HAS_TIMESTAMP_CLOCK.
> >>
> >> The default is of course to return local_clock(), so that the
> >> existing behaviour stays unchanged.
> >>
> >> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> >> ---
> >>  include/linux/sched/clock.h | 13 +++++++++++++
> >>  kernel/printk/printk.c      |  4 ++--
> >>  2 files changed, 15 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/include/linux/sched/clock.h b/include/linux/sched/clock.h
> >> index 867d588314e0..3cf4b2a8ce18 100644
> >> --- a/include/linux/sched/clock.h
> >> +++ b/include/linux/sched/clock.h
> >> @@ -98,4 +98,17 @@ static inline void enable_sched_clock_irqtime(void) {}
> >>  static inline void disable_sched_clock_irqtime(void) {}
> >>  #endif
> >>  
> >> +#ifdef CONFIG_ARCH_HAS_TIMESTAMP_CLOCK
> >> +/* Special need architectures can provide their timestamping function */
> > 
> > The commit message and the above comment should be more specific
> > about what are the special needs.
> > 
> > It must be clear how and why the clock differs from the other
> > clocks, especially from lock_clock().
> 
> Fair enough. How about something along the lines of:
> 
> "An architecture can override the timestamp clock (which defaults to
> local_clock) if local_clock is not significant early enough (sched_clock
> being available too late)."

We have:
1) the standard clocksource
2) the sched_clock, which is _supposed_ to be initialised early
3) persistent_clock

Do we really need another clock?

Why not initialise sched_clock() early (as in, before sched_init(),
which is where the first sched_clock() read occurs) ?

We've already been around the argument that sched_clock() apparently
can't be initialised early enough (which is the argument I had in reply
to the sched_clock() situation on ARM32) then how does inventing
timestamp_clock() solve this problem?

Wouldn't timestamp_clock() also suffer from the very same "we can't
initialise it early enough" issue, and it'll just be setup along side
clocksources, just like sched_clock() has become?

I fail to see what adding yet another architecture specific clock
implementation buys, apart from yet more complexity.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] printk: Allow architecture-specific timestamping function
  2019-07-22 13:03       ` Russell King - ARM Linux admin
@ 2019-07-22 13:26         ` Marc Zyngier
  0 siblings, 0 replies; 12+ messages in thread
From: Marc Zyngier @ 2019-07-22 13:26 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: Mark Rutland, Petr Mladek, Pavel Tatashin, Catalin Marinas,
	Will Deacon, linux-kernel, Steven Rostedt, Sergey Senozhatsky,
	John Stultz, Thomas Gleixner, linux-arm-kernel

On 22/07/2019 14:03, Russell King - ARM Linux admin wrote:
> On Mon, Jul 22, 2019 at 01:47:57PM +0100, Marc Zyngier wrote:
>> On 22/07/2019 12:25, Petr Mladek wrote:
>>> On Mon 2019-07-22 11:33:28, Marc Zyngier wrote:
>>>> printk currently relies on local_clock to time-stamp the kernel
>>>> messages. In order to allow the timestamping (and only that)
>>>> to be overridden by architecture-specific code, let's declare
>>>> a new timestamp_clock() function, which gets used by the printk
>>>> code. Architectures willing to make use of this facility will
>>>> have to define CONFIG_ARCH_HAS_TIMESTAMP_CLOCK.
>>>>
>>>> The default is of course to return local_clock(), so that the
>>>> existing behaviour stays unchanged.
>>>>
>>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>>>> ---
>>>>  include/linux/sched/clock.h | 13 +++++++++++++
>>>>  kernel/printk/printk.c      |  4 ++--
>>>>  2 files changed, 15 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/include/linux/sched/clock.h b/include/linux/sched/clock.h
>>>> index 867d588314e0..3cf4b2a8ce18 100644
>>>> --- a/include/linux/sched/clock.h
>>>> +++ b/include/linux/sched/clock.h
>>>> @@ -98,4 +98,17 @@ static inline void enable_sched_clock_irqtime(void) {}
>>>>  static inline void disable_sched_clock_irqtime(void) {}
>>>>  #endif
>>>>  
>>>> +#ifdef CONFIG_ARCH_HAS_TIMESTAMP_CLOCK
>>>> +/* Special need architectures can provide their timestamping function */
>>>
>>> The commit message and the above comment should be more specific
>>> about what are the special needs.
>>>
>>> It must be clear how and why the clock differs from the other
>>> clocks, especially from lock_clock().
>>
>> Fair enough. How about something along the lines of:
>>
>> "An architecture can override the timestamp clock (which defaults to
>> local_clock) if local_clock is not significant early enough (sched_clock
>> being available too late)."
> 
> We have:
> 1) the standard clocksource
> 2) the sched_clock, which is _supposed_ to be initialised early
> 3) persistent_clock
> 
> Do we really need another clock?
> 
> Why not initialise sched_clock() early (as in, before sched_init(),
> which is where the first sched_clock() read occurs) ?

Because, as you hint at below, that's not generally possible if you need
to identify the system early enough to discover that you need to apply
an erratum workaround. If you init sched_clock() before you know what
you're running on, you may end-up with a clock that can jump in either
direction.

And while the first call to sched_clock happens pretty late, the
timestamping code uses it pretty early, via the local_clock() indirection.

> 
> We've already been around the argument that sched_clock() apparently
> can't be initialised early enough (which is the argument I had in reply
> to the sched_clock() situation on ARM32) then how does inventing
> timestamp_clock() solve this problem?

It allows the kernel message to be timestamped with a potentially
unreliable clock without breaking the promise that sched_clock() will
not go backward or otherwise behave erratically.

> Wouldn't timestamp_clock() also suffer from the very same "we can't
> initialise it early enough" issue, and it'll just be setup along side
> clocksources, just like sched_clock() has become?

At least on arm64, the architected counter is always available, and
doesn't require any setup (at least none by the time the kernel is booted).

> I fail to see what adding yet another architecture specific clock
> implementation buys, apart from yet more complexity.
> 

It buys us early timestamping without forcing us to deal with an
unreliable. The additional complexity looks pretty minimal to me, and no
other architecture is forced to use it.

	M.
-- 
Jazz is not dead. It just smells funny...

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/3] arm64: Allow early timestamping of kernel log
  2019-07-22 10:33 [PATCH 0/3] arm64: Allow early timestamping of kernel log Marc Zyngier
                   ` (2 preceding siblings ...)
  2019-07-22 10:33 ` [PATCH 3/3] arm64: Allow early time stamping Marc Zyngier
@ 2019-07-22 20:52 ` Pavel Tatashin
  2019-07-23  7:17   ` Marc Zyngier
  2019-07-23  2:42 ` Hanjun Guo
  4 siblings, 1 reply; 12+ messages in thread
From: Pavel Tatashin @ 2019-07-22 20:52 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Mark Rutland, Petr Mladek, Catalin Marinas, Will Deacon, LKML,
	Steven Rostedt, Sergey Senozhatsky, John Stultz, Thomas Gleixner,
	Linux ARM

On Mon, Jul 22, 2019 at 3:33 AM Marc Zyngier <marc.zyngier@arm.com> wrote:
>
> So far, we've let the arm64 kernel start its meaningful time stamping
> of the kernel log pretty late, which is caused by sched_clock() being
> initialised rather late compared to other architectures.
>
> Pavel Tatashin proposed[1] to move the initialisation of sched_clock
> much earlier, which I had objections to. The reason for initialising
> sched_clock late is that a number of systems have broken counters, and
> we need to apply all kind of terrifying workarounds to avoid time
> going backward on the affected platforms. Being able to identify the
> right workaround comes pretty late in the kernel boot, and providing
> an unreliable sched_clock, even for a short period of time, isn't an
> appealing prospect.
>
> To address this, I'm proposing that we allow an architecture to chose
> to (1) divorce time stamping and sched_clock during the early phase of
> booting, and (2) inherit the time stamping clock as the new epoch the
> first time a sched_sched clock gets registered.

Could we have a stable clock config for arm64: if it is known that
this Linux build is going to run on non-broken firmware, and with a
known stable cntvct_el0 register it can be optionally configured to
use that stable sched_clock instead of generic clock that arm64 is
using? This way, the early printk are going to be available on those
systems without adding a different method for printk's only. It would
also mean that other users of early sched_clock() such as ftrace could
benefit from it.

>
> (1) would allow arm64 to provide a time stamping clock, however
> unreliable it might be, while (2) would allow sched_clock to provide
> time stamps that are continuous with the time-stamping clock.
>
> The last patch in the series adds the necessary logic to arm64,
> allowing the (potentially unreliable) time stamping of early kernel
> messages.
>
> Tested on a bunch of arm64 systems, both bare-metal and in VMs. Boot
> tested on a x86 guest.
>
> [1] https://lore.kernel.org/patchwork/patch/1015110/
>
> Marc Zyngier (3):
>   printk: Allow architecture-specific timestamping function
>   sched/clock: Allow sched_clock to inherit timestamp_clock epoch
>   arm64: Allow early time stamping
>
>  arch/arm64/Kconfig          |  3 +++
>  arch/arm64/kernel/setup.c   | 44 +++++++++++++++++++++++++++++++++++++
>  include/linux/sched/clock.h | 13 +++++++++++
>  kernel/printk/printk.c      |  4 ++--
>  kernel/time/sched_clock.c   | 10 +++++++++
>  5 files changed, 72 insertions(+), 2 deletions(-)
>
> --
> 2.20.1
>

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/3] arm64: Allow early timestamping of kernel log
  2019-07-22 10:33 [PATCH 0/3] arm64: Allow early timestamping of kernel log Marc Zyngier
                   ` (3 preceding siblings ...)
  2019-07-22 20:52 ` [PATCH 0/3] arm64: Allow early timestamping of kernel log Pavel Tatashin
@ 2019-07-23  2:42 ` Hanjun Guo
  4 siblings, 0 replies; 12+ messages in thread
From: Hanjun Guo @ 2019-07-23  2:42 UTC (permalink / raw)
  To: Marc Zyngier, Thomas Gleixner, John Stultz, Pavel Tatashin,
	Petr Mladek, Sergey Senozhatsky, Steven Rostedt, Will Deacon,
	Catalin Marinas, Mark Rutland
  Cc: linux-kernel, linux-arm-kernel

On 2019/7/22 18:33, Marc Zyngier wrote:
> So far, we've let the arm64 kernel start its meaningful time stamping
> of the kernel log pretty late, which is caused by sched_clock() being
> initialised rather late compared to other architectures.
> 
> Pavel Tatashin proposed[1] to move the initialisation of sched_clock
> much earlier, which I had objections to. The reason for initialising
> sched_clock late is that a number of systems have broken counters, and
> we need to apply all kind of terrifying workarounds to avoid time
> going backward on the affected platforms. Being able to identify the
> right workaround comes pretty late in the kernel boot, and providing
> an unreliable sched_clock, even for a short period of time, isn't an
> appealing prospect.
> 
> To address this, I'm proposing that we allow an architecture to chose
> to (1) divorce time stamping and sched_clock during the early phase of
> booting, and (2) inherit the time stamping clock as the new epoch the
> first time a sched_sched clock gets registered.
> 
> (1) would allow arm64 to provide a time stamping clock, however
> unreliable it might be, while (2) would allow sched_clock to provide
> time stamps that are continuous with the time-stamping clock.
> 
> The last patch in the series adds the necessary logic to arm64,
> allowing the (potentially unreliable) time stamping of early kernel
> messages.
> 
> Tested on a bunch of arm64 systems, both bare-metal and in VMs. Boot
> tested on a x86 guest.

This makes the boot log more useful and I can debug some time consuming
issue easier before the arch timer initialization, tested on my ARM64
server, I can see the timestamping from the start [1],

Tested-by: Hanjun Guo <guohanjun@huawei.com>

Thanks
Hanjun

[1]:
[    0.000000] Booting Linux on physical CPU 0x0000080000 [0x481fd010]
[    0.000000] Linux version 5.2.0+ (root@localhost.localdomain) (gcc version 9.0.1 20190312 (Red Hat 9.0.1-0.10) (GCC)) #45 SMP Tue Jul 23 09:17:48 CST 2019
[    0.000000] Using timestamp clock @100MHz
[    0.000074] efi: Getting EFI parameters from FDT:
[    0.000082] efi: EFI v2.70 by EDK II
[    0.000083] efi:  ACPI 2.0=0x3a300000  SMBIOS 3.0=0x39f80000  MEMATTR=0x30996018  MEMRESERVE=0x30997e18
[    0.000122] crashkernel reserved: 0x000000000ba00000 - 0x000000002ba00000 (512 MB)
[    0.000126] cma: Failed to reserve 512 MiB
[    0.185111] ACPI: Early table checksum verification disabled
[    0.185115] ACPI: RSDP 0x000000003A300000 000024 (v02 HISI  )
[    0.185120] ACPI: XSDT 0x000000003A270000 00009C (v01 HISI   HIP08    00000000      01000013)
[    0.185127] ACPI: FACP 0x0000000039B10000 000114 (v06 HISI   HIP08    00000000 HISI 20151124)
[    0.185134] ACPI: DSDT 0x0000000039AB0000 0084E4 (v02 HISI   HIP08    00000000 INTL 20181213)
[    0.185139] ACPI: PCCT 0x0000000039FB0000 00008A (v01 HISI   HIP08    00000000 HISI 20151124)
[    0.185143] ACPI: SSDT 0x0000000039F90000 01021A (v02 HISI   HIP07    00000000 INTL 20181213)
[    0.185147] ACPI: BERT 0x0000000039F50000 000030 (v01 HISI   HIP08    00000000 HISI 20151124)
[    0.185150] ACPI: HEST 0x0000000039F30000 000308 (v01 HISI   HIP08    00000000 HISI 20151124)
[    0.185154] ACPI: ERST 0x0000000039EF0000 000230 (v01 HISI   HIP08    00000000 HISI 20151124)
[    0.185158] ACPI: EINJ 0x0000000039EE0000 000170 (v01 HISI   HIP08    00000000 HISI 20151124)
[    0.185162] ACPI: SLIT 0x0000000039B30000 00003C (v01 HISI   HIP08    00000000 HISI 20151124)
[    0.185166] ACPI: GTDT 0x0000000039B00000 00007C (v02 HISI   HIP08    00000000 HISI 20151124)
[    0.185169] ACPI: MCFG 0x0000000039AF0000 00003C (v01 HISI   HIP08    00000000 HISI 20151124)
[    0.185173] ACPI: SPCR 0x0000000039AE0000 000050 (v02 HISI   HIP08    00000000 HISI 20151124)
[    0.185177] ACPI: SRAT 0x0000000039AD0000 0007D0 (v03 HISI   HIP08    00000000 HISI 20151124)
[    0.185181] ACPI: APIC 0x0000000039AC0000 001E6C (v04 HISI   HIP08    00000000 HISI 20151124)
[    0.185185] ACPI: IORT 0x0000000039AA0000 001310 (v00 HISI   HIP08    00000000 INTL 20181213)
[    0.185189] ACPI: PPTT 0x0000000030970000 0031B0 (v01 HISI   HIP08    00000000 HISI 20151124)
[    0.185196] ACPI: SPCR: console: pl011,mmio32,0x94080000,115200
[    0.185208] ACPI: SRAT: Node 0 PXM 0 [mem 0x2080000000-0x2fffffffff]
[    0.185210] ACPI: SRAT: Node 1 PXM 1 [mem 0x3000000000-0x3fffffffff]
[    0.185212] ACPI: SRAT: Node 0 PXM 0 [mem 0x00000000-0x7fffffff]
[    0.185213] ACPI: SRAT: Node 2 PXM 2 [mem 0x202000000000-0x202fffffffff]
[    0.185215] ACPI: SRAT: Node 3 PXM 3 [mem 0x203000000000-0x203fffffffff]
[    0.185221] NUMA: NODE_DATA [mem 0x2fffffe3c0-0x2fffffffff]
[    0.185224] NUMA: NODE_DATA [mem 0x3fffffe3c0-0x3fffffffff]
[    0.185226] NUMA: NODE_DATA [mem 0x202fffffe3c0-0x202fffffffff]
[    0.185229] NUMA: NODE_DATA [mem 0x203ffdfde3c0-0x203ffdfdffff]



_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/3] arm64: Allow early timestamping of kernel log
  2019-07-22 20:52 ` [PATCH 0/3] arm64: Allow early timestamping of kernel log Pavel Tatashin
@ 2019-07-23  7:17   ` Marc Zyngier
  2019-09-23 19:13     ` Pavel Tatashin
  0 siblings, 1 reply; 12+ messages in thread
From: Marc Zyngier @ 2019-07-23  7:17 UTC (permalink / raw)
  To: Pavel Tatashin
  Cc: Mark Rutland, Petr Mladek, Catalin Marinas, Will Deacon, LKML,
	Steven Rostedt, Sergey Senozhatsky, John Stultz, Thomas Gleixner,
	Linux ARM

On Mon, 22 Jul 2019 21:52:42 +0100,
Pavel Tatashin <pasha.tatashin@soleen.com> wrote:
> 
> On Mon, Jul 22, 2019 at 3:33 AM Marc Zyngier <marc.zyngier@arm.com> wrote:
> >
> > So far, we've let the arm64 kernel start its meaningful time stamping
> > of the kernel log pretty late, which is caused by sched_clock() being
> > initialised rather late compared to other architectures.
> >
> > Pavel Tatashin proposed[1] to move the initialisation of sched_clock
> > much earlier, which I had objections to. The reason for initialising
> > sched_clock late is that a number of systems have broken counters, and
> > we need to apply all kind of terrifying workarounds to avoid time
> > going backward on the affected platforms. Being able to identify the
> > right workaround comes pretty late in the kernel boot, and providing
> > an unreliable sched_clock, even for a short period of time, isn't an
> > appealing prospect.
> >
> > To address this, I'm proposing that we allow an architecture to chose
> > to (1) divorce time stamping and sched_clock during the early phase of
> > booting, and (2) inherit the time stamping clock as the new epoch the
> > first time a sched_sched clock gets registered.
> 
> Could we have a stable clock config for arm64: if it is known that
> this Linux build is going to run on non-broken firmware, and with a
> known stable cntvct_el0 register it can be optionally configured to
> use that stable sched_clock instead of generic clock that arm64 is
> using?

Hmmm. Then I guess a prerequisite is this patch:

https://lwn.net/Articles/490040/

as the number of systems this will reliably run on is a close
approximation to zero. Yes, counting is hard...

More seriously, we've been there before (cue the 32bit ARM port 8
years ago), and really don't want to go back to a time where we had
multiple config options for everything. There is one kernel, which
should run *reliably* on everything.

> This way, the early printk are going to be available on those
> systems without adding a different method for printk's only. It would
> also mean that other users of early sched_clock() such as ftrace could
> benefit from it.

See above. But maybe the real thing to do is to allow local_clock() to
be overridden. Same effect, same complexity, just hidden away from any
given subsystem.

Thanks,

	M.

-- 
Jazz is not dead, it just smells funny.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/3] arm64: Allow early timestamping of kernel log
  2019-07-23  7:17   ` Marc Zyngier
@ 2019-09-23 19:13     ` Pavel Tatashin
  0 siblings, 0 replies; 12+ messages in thread
From: Pavel Tatashin @ 2019-09-23 19:13 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Mark Rutland, Petr Mladek, Catalin Marinas, Will Deacon, LKML,
	Steven Rostedt, Sergey Senozhatsky, John Stultz, Thomas Gleixner,
	Linux ARM

On Tue, Jul 23, 2019 at 3:17 AM Marc Zyngier <marc.zyngier@arm.com> wrote:
>
> On Mon, 22 Jul 2019 21:52:42 +0100,
> Pavel Tatashin <pasha.tatashin@soleen.com> wrote:
> >
> > On Mon, Jul 22, 2019 at 3:33 AM Marc Zyngier <marc.zyngier@arm.com> wrote:
> > >
> > > So far, we've let the arm64 kernel start its meaningful time stamping
> > > of the kernel log pretty late, which is caused by sched_clock() being
> > > initialised rather late compared to other architectures.
> > >
> > > Pavel Tatashin proposed[1] to move the initialisation of sched_clock
> > > much earlier, which I had objections to. The reason for initialising
> > > sched_clock late is that a number of systems have broken counters, and
> > > we need to apply all kind of terrifying workarounds to avoid time
> > > going backward on the affected platforms. Being able to identify the
> > > right workaround comes pretty late in the kernel boot, and providing
> > > an unreliable sched_clock, even for a short period of time, isn't an
> > > appealing prospect.
> > >
> > > To address this, I'm proposing that we allow an architecture to chose
> > > to (1) divorce time stamping and sched_clock during the early phase of
> > > booting, and (2) inherit the time stamping clock as the new epoch the
> > > first time a sched_sched clock gets registered.

Hi Marc,

I know we briefly discussed this at plumbers, but I want to bring it
up again, because I am still puzzled why it is not possible to
stabilize unstable clock early in boot.

Here is an example where clock is stabilized:
https://soleen.com/source/xref/linux/kernel/sched/clock.c?r=457c8996#265

It uses a value that is read at last ticks to normalize clock, and
because ticks are not available early in boot instead we can make sure
that early in boot sched_clock() never returns value smaller than
previously returned value, and if we want to be extra careful, we can
also make sure that sched_clock() early in boot does not jump ahead by
more than some fixed amount of time i.e. more than one hour.

If sched_clock() is available early we will get the benefit of having
other tracers that use it to debug early boot information.

Pasha

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2019-09-23 19:13 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-22 10:33 [PATCH 0/3] arm64: Allow early timestamping of kernel log Marc Zyngier
2019-07-22 10:33 ` [PATCH 1/3] printk: Allow architecture-specific timestamping function Marc Zyngier
2019-07-22 11:25   ` Petr Mladek
2019-07-22 12:47     ` Marc Zyngier
2019-07-22 13:03       ` Russell King - ARM Linux admin
2019-07-22 13:26         ` Marc Zyngier
2019-07-22 10:33 ` [PATCH 2/3] sched/clock: Allow sched_clock to inherit timestamp_clock epoch Marc Zyngier
2019-07-22 10:33 ` [PATCH 3/3] arm64: Allow early time stamping Marc Zyngier
2019-07-22 20:52 ` [PATCH 0/3] arm64: Allow early timestamping of kernel log Pavel Tatashin
2019-07-23  7:17   ` Marc Zyngier
2019-09-23 19:13     ` Pavel Tatashin
2019-07-23  2:42 ` Hanjun Guo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).