All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/8] Add persistent clock support
@ 2018-06-13 11:32 ` Baolin Wang
  0 siblings, 0 replies; 38+ messages in thread
From: Baolin Wang @ 2018-06-13 11:32 UTC (permalink / raw)
  To: tglx, john.stultz, daniel.lezcano, arnd, tony, aaro.koskinen,
	linux, mark.rutland, marc.zyngier
  Cc: baolin.wang, broonie, paulmck, mlichvar, rdunlap, kstewart,
	gregkh, pombredanne, thierry.reding, jonathanh, heiko,
	linus.walleij, viresh.kumar, mingo, hpa, peterz, douly.fnst,
	len.brown, rajvi.jingar, alexandre.belloni, x86,
	linux-arm-kernel, linux-tegra, linux-kernel, linux-omap

Hi,

We will meet below issues when compensating the suspend time for the timekeeping.

1. We have too many different ways of dealing with persistent timekeeping
across architectures, so it is hard for one driver to be compatible with
different architectures. For example, we should register register_persistent_clock()
on arm architecture, but we should set x86_platform.get_wallclock() on x86
architecture, and we should implement the read_persistent_clock64() on arm64
architecture and so on.

2. On some platforms (such as Spreadtrum platform), we registered the high
resolution timer as one clocksource to update the OS time, but the high
resolution timer will be stopped when system goes into suspend state.
So we use another one always-on timer (but low resolution) to calculate
the suspend time to compensate the OS time. Though we can register the
always-on timer as one clocksource, we need re-calculate the mult/shift
with one larger conversion range to calculate the suspend time and need
update the clock in case of running over the always-on timer. 

More duplicate code will be added if other platforms meet this case.

3. Now we have 3 sources that could be used to compensate the OS time:
Nonstop clocksource during suspend, persistent clock and rtc device,
which is complicated. Another hand is that the nonstop clocksource can
risk wrapping if the system suspend time is too long, so we need one
mechanism to wake up the system before the nonstop clocksource wrapping.

According to above issues, we can introduce one common persistent clock
framework to be compatible with different architectures, in future we will
remove the persistent clock implementation for each architecture. Also
this framework will implement common code to help drivers to register easily.
Moreover if we converted all SUSPEND_NONSTOP clocksource to register to
be one persistent clock, we can remove the SUSPEND_NONSTOP clocksource
accounting in timekeeping, which means we can only compensate the OS time
from persistent clock and RTC.

Will be appreciated for any comments. Thank you all.

Arnd posted some comments as below last time, but we did not get a general
consensus, so I post them again. Arnd said:

"I was planning to discuss this with Daniel and John during Linaro Connect,
but that didn't happen, so I'd like to bring up the bigger picture here again.

Today, we have a number of subsystem-type interfaces that deal with
time keeping in the wider sense (I might be missing some):
 - clock source
 - clock event
 - sched clock
 - real time clock
 - ptp clock
 - persistent clock

The first five have separate registration interfaces and may all refer
to different hardware blocks, or (more commonly) have some overlap
in the hardware. The fifth one is generalized by your series, without it
it's really architecture specific (as the other ones were one one point).

Are we happy with that structure in the long run? One of my earlier
comments on this series was that I thought it should be combined with
the clocksource registration, but upon talking to Baolin about it more,
I realized that this just follows the same structure that we have for the
others.

In theory, we could have a more abstract way of registering a clock
hardware that interfaces with any combination of the six subsystems
I mentioned above, with a superset of the subsystem specific structures
and a set of flags that indicate what a particular device is usable for.

Combining all six might be a bit too much (in particular rtc, though
it clearly overlaps the persistent-clk case), but what your general
ideas on where we should be heading? Is it worth reworking the
core kernel portion of the subsystems to simplify the individual
drivers?"

John also posted some important comments for this patch set, that I quite
agree with his points. John said:

"For context, these abstractions have grown out of the need for using
different hardware components for all of these. It was quite common
for x86 hardware to use the acpi_pm for clocksource, lapic/PIT for
clockevent, tsc for sched_clock and CMOS RTC for persistent clock.
While some of these could be backed by the same hardware, it wasn't
common. However, hardware with less restrictions have allowed in some
cases for these to be more unified, but I'm not sure if its particularly common.

Another part of the reason that we don't combine the above
abstractions, even when they are backed by the same hardware, is
because some of the fields used for freq conversion (mult/shift) have
different needs for the different types of accounting.

For instance, with a clocksource, we are very focused on avoiding
error to keep timekeeing accurate, thus we want to use as large a
shift (and thus mult) as possible (and do our shifting as late as
possible in our accounting). However, that then shrinks the amount of
time that can be accumulated in one go w/o causing an overflow.

Where as with sched_clock, we don't worry as much as about accuracy,
so we can use smaller shifts (and thus mults), and thus can go for
longer periods of time between accumulating without worrying.

Similarly for the persistent clock case we don't need need to worry as
much about accuracy, so we can can use smaller shifts, but we are not
in as much of a hot patch, so we can also"

Changes since RFC V1:
 - Move the alarmtimer starting into alarmtimer.c.
 - Export persistent_clock_start_alarmtimer().
 - Add one memeber to indicate if the alarmtimer was initialized.
 - Fix build issues.

Baolin Wang (8):
  time: Add persistent clock support
  clocksource: sprd: Add one persistent timer for Spreadtrum platform
  arm: time: Remove the persistent clock support for ARM architecture
  clocksource: arm_arch_timer: Register the persistent clock
  clocksource: timer-ti-32k: Register the persistent clock
  clocksource: time-pistachio: Register the persistent clock
  x86: tsc: Register the persistent clock
  time: timekeeping: Remove time compensating from nonstop clocksources

 arch/arm/include/asm/mach/time.h     |    4 -
 arch/arm/kernel/time.c               |   36 -------
 arch/arm/plat-omap/Kconfig           |    1 +
 arch/arm/plat-omap/counter_32k.c     |   44 ++------
 arch/x86/Kconfig                     |    1 +
 arch/x86/kernel/tsc.c                |   21 ++++
 drivers/clocksource/Kconfig          |    4 +
 drivers/clocksource/arm_arch_timer.c |   10 ++
 drivers/clocksource/tegra20_timer.c  |   12 ++-
 drivers/clocksource/time-pistachio.c |    3 +
 drivers/clocksource/timer-sprd.c     |   80 +++++++++++++++
 drivers/clocksource/timer-ti-32k.c   |    4 +
 include/linux/persistent_clock.h     |   23 +++++
 kernel/time/Kconfig                  |    4 +
 kernel/time/Makefile                 |    1 +
 kernel/time/alarmtimer.c             |    4 +
 kernel/time/persistent_clock.c       |  184 ++++++++++++++++++++++++++++++++++
 kernel/time/timekeeping.c            |   19 +---
 18 files changed, 360 insertions(+), 95 deletions(-)
 create mode 100644 include/linux/persistent_clock.h
 create mode 100644 kernel/time/persistent_clock.c

-- 
1.7.9.5

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH 0/8] Add persistent clock support
@ 2018-06-13 11:32 ` Baolin Wang
  0 siblings, 0 replies; 38+ messages in thread
From: Baolin Wang @ 2018-06-13 11:32 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

We will meet below issues when compensating the suspend time for the timekeeping.

1. We have too many different ways of dealing with persistent timekeeping
across architectures, so it is hard for one driver to be compatible with
different architectures. For example, we should register register_persistent_clock()
on arm architecture, but we should set x86_platform.get_wallclock() on x86
architecture, and we should implement the read_persistent_clock64() on arm64
architecture and so on.

2. On some platforms (such as Spreadtrum platform), we registered the high
resolution timer as one clocksource to update the OS time, but the high
resolution timer will be stopped when system goes into suspend state.
So we use another one always-on timer (but low resolution) to calculate
the suspend time to compensate the OS time. Though we can register the
always-on timer as one clocksource, we need re-calculate the mult/shift
with one larger conversion range to calculate the suspend time and need
update the clock in case of running over the always-on timer. 

More duplicate code will be added if other platforms meet this case.

3. Now we have 3 sources that could be used to compensate the OS time:
Nonstop clocksource during suspend, persistent clock and rtc device,
which is complicated. Another hand is that the nonstop clocksource can
risk wrapping if the system suspend time is too long, so we need one
mechanism to wake up the system before the nonstop clocksource wrapping.

According to above issues, we can introduce one common persistent clock
framework to be compatible with different architectures, in future we will
remove the persistent clock implementation for each architecture. Also
this framework will implement common code to help drivers to register easily.
Moreover if we converted all SUSPEND_NONSTOP clocksource to register to
be one persistent clock, we can remove the SUSPEND_NONSTOP clocksource
accounting in timekeeping, which means we can only compensate the OS time
from persistent clock and RTC.

Will be appreciated for any comments. Thank you all.

Arnd posted some comments as below last time, but we did not get a general
consensus, so I post them again. Arnd said:

"I was planning to discuss this with Daniel and John during Linaro Connect,
but that didn't happen, so I'd like to bring up the bigger picture here again.

Today, we have a number of subsystem-type interfaces that deal with
time keeping in the wider sense (I might be missing some):
 - clock source
 - clock event
 - sched clock
 - real time clock
 - ptp clock
 - persistent clock

The first five have separate registration interfaces and may all refer
to different hardware blocks, or (more commonly) have some overlap
in the hardware. The fifth one is generalized by your series, without it
it's really architecture specific (as the other ones were one one point).

Are we happy with that structure in the long run? One of my earlier
comments on this series was that I thought it should be combined with
the clocksource registration, but upon talking to Baolin about it more,
I realized that this just follows the same structure that we have for the
others.

In theory, we could have a more abstract way of registering a clock
hardware that interfaces with any combination of the six subsystems
I mentioned above, with a superset of the subsystem specific structures
and a set of flags that indicate what a particular device is usable for.

Combining all six might be a bit too much (in particular rtc, though
it clearly overlaps the persistent-clk case), but what your general
ideas on where we should be heading? Is it worth reworking the
core kernel portion of the subsystems to simplify the individual
drivers?"

John also posted some important comments for this patch set, that I quite
agree with his points. John said:

"For context, these abstractions have grown out of the need for using
different hardware components for all of these. It was quite common
for x86 hardware to use the acpi_pm for clocksource, lapic/PIT for
clockevent, tsc for sched_clock and CMOS RTC for persistent clock.
While some of these could be backed by the same hardware, it wasn't
common. However, hardware with less restrictions have allowed in some
cases for these to be more unified, but I'm not sure if its particularly common.

Another part of the reason that we don't combine the above
abstractions, even when they are backed by the same hardware, is
because some of the fields used for freq conversion (mult/shift) have
different needs for the different types of accounting.

For instance, with a clocksource, we are very focused on avoiding
error to keep timekeeing accurate, thus we want to use as large a
shift (and thus mult) as possible (and do our shifting as late as
possible in our accounting). However, that then shrinks the amount of
time that can be accumulated in one go w/o causing an overflow.

Where as with sched_clock, we don't worry as much as about accuracy,
so we can use smaller shifts (and thus mults), and thus can go for
longer periods of time between accumulating without worrying.

Similarly for the persistent clock case we don't need need to worry as
much about accuracy, so we can can use smaller shifts, but we are not
in as much of a hot patch, so we can also"

Changes since RFC V1:
 - Move the alarmtimer starting into alarmtimer.c.
 - Export persistent_clock_start_alarmtimer().
 - Add one memeber to indicate if the alarmtimer was initialized.
 - Fix build issues.

Baolin Wang (8):
  time: Add persistent clock support
  clocksource: sprd: Add one persistent timer for Spreadtrum platform
  arm: time: Remove the persistent clock support for ARM architecture
  clocksource: arm_arch_timer: Register the persistent clock
  clocksource: timer-ti-32k: Register the persistent clock
  clocksource: time-pistachio: Register the persistent clock
  x86: tsc: Register the persistent clock
  time: timekeeping: Remove time compensating from nonstop clocksources

 arch/arm/include/asm/mach/time.h     |    4 -
 arch/arm/kernel/time.c               |   36 -------
 arch/arm/plat-omap/Kconfig           |    1 +
 arch/arm/plat-omap/counter_32k.c     |   44 ++------
 arch/x86/Kconfig                     |    1 +
 arch/x86/kernel/tsc.c                |   21 ++++
 drivers/clocksource/Kconfig          |    4 +
 drivers/clocksource/arm_arch_timer.c |   10 ++
 drivers/clocksource/tegra20_timer.c  |   12 ++-
 drivers/clocksource/time-pistachio.c |    3 +
 drivers/clocksource/timer-sprd.c     |   80 +++++++++++++++
 drivers/clocksource/timer-ti-32k.c   |    4 +
 include/linux/persistent_clock.h     |   23 +++++
 kernel/time/Kconfig                  |    4 +
 kernel/time/Makefile                 |    1 +
 kernel/time/alarmtimer.c             |    4 +
 kernel/time/persistent_clock.c       |  184 ++++++++++++++++++++++++++++++++++
 kernel/time/timekeeping.c            |   19 +---
 18 files changed, 360 insertions(+), 95 deletions(-)
 create mode 100644 include/linux/persistent_clock.h
 create mode 100644 kernel/time/persistent_clock.c

-- 
1.7.9.5

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH 1/8] time: Add persistent clock support
  2018-06-13 11:32 ` Baolin Wang
@ 2018-06-13 11:32   ` Baolin Wang
  -1 siblings, 0 replies; 38+ messages in thread
From: Baolin Wang @ 2018-06-13 11:32 UTC (permalink / raw)
  To: tglx, john.stultz, daniel.lezcano, arnd, tony, aaro.koskinen,
	linux, mark.rutland, marc.zyngier
  Cc: baolin.wang, broonie, paulmck, mlichvar, rdunlap, kstewart,
	gregkh, pombredanne, thierry.reding, jonathanh, heiko,
	linus.walleij, viresh.kumar, mingo, hpa, peterz, douly.fnst,
	len.brown, rajvi.jingar, alexandre.belloni, x86,
	linux-arm-kernel, linux-tegra, linux-kernel, linux-omap

On our Spreadtrum SC9860 platform, we registered the high resolution
ARM generic timer as one clocksource to update the OS time, but the
ARM generic timer will be stopped in suspend state. So we use one 64bit
always-on timer (but low resolution) of Spreadtrum to calculate the
suspend time to compensate the OS time. Though we can register the
always-on timer as one clocksource, we need re-calculate the
mult/shift with one larger conversion range to calculate the suspend
time.

But now we have too many different ways of dealing with persistent
timekeeping across architectures, and there will be many duplicate
code if we register one timer to be one persistent clock. Thus it
will be more helpful if we add one common framework for timer drivers
to be registered as one persistent clock and implement the common
read_persistent_clock64() to compensate the OS time.

Moreover we can register the clocksource with CLOCK_SOURCE_SUSPEND_NONSTOP
to be one persistent clock, then we can simplify the suspend/resume
accounting by removing CLOCK_SOURCE_SUSPEND_NONSTOP timing. After that
we can only compensate the OS time by persistent clock or RTC.

Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
---
 include/linux/persistent_clock.h |   23 +++++
 kernel/time/Kconfig              |    4 +
 kernel/time/Makefile             |    1 +
 kernel/time/alarmtimer.c         |    4 +
 kernel/time/persistent_clock.c   |  184 ++++++++++++++++++++++++++++++++++++++
 5 files changed, 216 insertions(+)
 create mode 100644 include/linux/persistent_clock.h
 create mode 100644 kernel/time/persistent_clock.c

diff --git a/include/linux/persistent_clock.h b/include/linux/persistent_clock.h
new file mode 100644
index 0000000..7d42c1a
--- /dev/null
+++ b/include/linux/persistent_clock.h
@@ -0,0 +1,23 @@
+// SPDX-License-Identifier: GPL-2.0
+#ifndef __PERSISTENT_CLOCK_H__
+#define __PERSISTENT_CLOCK_H__
+
+#ifdef CONFIG_PERSISTENT_CLOCK
+extern int persistent_clock_init_and_register(u64 (*read)(void),
+					      u64 mask, u32 freq,
+					      u64 maxsec);
+extern void persistent_clock_cleanup(void);
+extern void persistent_clock_start_alarmtimer(void);
+#else
+static inline int persistent_clock_init_and_register(u64 (*read)(void),
+						     u64 mask, u32 freq,
+						     u64 maxsec)
+{
+	return 0;
+}
+
+static inline void persistent_clock_cleanup(void) { }
+static inline void persistent_clock_start_alarmtimer(void) { }
+#endif
+
+#endif
diff --git a/kernel/time/Kconfig b/kernel/time/Kconfig
index 78eabc4..7188600 100644
--- a/kernel/time/Kconfig
+++ b/kernel/time/Kconfig
@@ -47,6 +47,10 @@ config GENERIC_CLOCKEVENTS_MIN_ADJUST
 config GENERIC_CMOS_UPDATE
 	bool
 
+# Persistent clock support
+config PERSISTENT_CLOCK
+	bool
+
 if GENERIC_CLOCKEVENTS
 menu "Timers subsystem"
 
diff --git a/kernel/time/Makefile b/kernel/time/Makefile
index f1e46f3..f6d368f 100644
--- a/kernel/time/Makefile
+++ b/kernel/time/Makefile
@@ -18,3 +18,4 @@ obj-$(CONFIG_GENERIC_SCHED_CLOCK)		+= sched_clock.o
 obj-$(CONFIG_TICK_ONESHOT)			+= tick-oneshot.o tick-sched.o
 obj-$(CONFIG_DEBUG_FS)				+= timekeeping_debug.o
 obj-$(CONFIG_TEST_UDELAY)			+= test_udelay.o
+obj-$(CONFIG_PERSISTENT_CLOCK)			+= persistent_clock.o
diff --git a/kernel/time/alarmtimer.c b/kernel/time/alarmtimer.c
index 639321b..1518fdb 100644
--- a/kernel/time/alarmtimer.c
+++ b/kernel/time/alarmtimer.c
@@ -29,6 +29,7 @@
 #include <linux/freezer.h>
 #include <linux/compat.h>
 #include <linux/module.h>
+#include <linux/persistent_clock.h>
 
 #include "posix-timers.h"
 
@@ -892,6 +893,9 @@ static int __init alarmtimer_init(void)
 		error = PTR_ERR(pdev);
 		goto out_drv;
 	}
+
+	/* Start one alarmtimer to update persistent clock */
+	persistent_clock_start_alarmtimer();
 	return 0;
 
 out_drv:
diff --git a/kernel/time/persistent_clock.c b/kernel/time/persistent_clock.c
new file mode 100644
index 0000000..edaa659
--- /dev/null
+++ b/kernel/time/persistent_clock.c
@@ -0,0 +1,184 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2018 Linaro, Inc.
+ *
+ * Author: Baolin Wang <baolin.wang@linaro.org>
+ */
+
+#include <linux/alarmtimer.h>
+#include <linux/clocksource.h>
+#include <linux/persistent_clock.h>
+
+/**
+ * persistent_clock_read_data - data required to read persistent clock
+ * @read: Returns a cycle value from persistent clock.
+ * @last_cycles: Clock cycle value at last update.
+ * @last_ns: Time value (nanoseconds) at last update.
+ * @mask: Bitmask for two's complement subtraction of non 64bit clocks.
+ * @mult: Cycle to nanosecond multiplier.
+ * @shift: Cycle to nanosecond divisor.
+ */
+struct persistent_clock_read_data {
+	u64 (*read)(void);
+	u64 last_cycles;
+	u64 last_ns;
+	u64 mask;
+	u32 mult;
+	u32 shift;
+};
+
+/**
+ * persistent_clock - represent the persistent clock
+ * @read_data: Data required to read from persistent clock.
+ * @seq: Sequence counter for protecting updates.
+ * @freq: The frequency of the persistent clock.
+ * @wrap: Duration for persistent clock can run before wrapping.
+ * @alarm: Update timeout for persistent clock wrap.
+ * @alarm_inited: Indicate if the alarm has been initialized.
+ */
+struct persistent_clock {
+	struct persistent_clock_read_data read_data;
+	seqcount_t seq;
+	u32 freq;
+	ktime_t wrap;
+	struct alarm alarm;
+	bool alarm_inited;
+};
+
+static struct persistent_clock p;
+
+void read_persistent_clock64(struct timespec64 *ts)
+{
+	struct persistent_clock_read_data *read_data = &p.read_data;
+	unsigned long seq;
+	u64 delta, nsecs;
+
+	if (!read_data->read) {
+		ts->tv_sec = 0;
+		ts->tv_nsec = 0;
+		return;
+	}
+
+	do {
+		seq = read_seqcount_begin(&p.seq);
+		delta = (read_data->read() - read_data->last_cycles) &
+			read_data->mask;
+
+		nsecs = read_data->last_ns +
+			clocksource_cyc2ns(delta, read_data->mult,
+					   read_data->shift);
+		*ts = ns_to_timespec64(nsecs);
+	} while (read_seqcount_retry(&p.seq, seq));
+}
+
+static void persistent_clock_update(void)
+{
+	struct persistent_clock_read_data *read_data = &p.read_data;
+	u64 cycles, delta;
+
+	write_seqcount_begin(&p.seq);
+
+	cycles = read_data->read();
+	delta = (cycles - read_data->last_cycles) & read_data->mask;
+	read_data->last_ns += clocksource_cyc2ns(delta, read_data->mult,
+						 read_data->shift);
+	read_data->last_cycles = cycles;
+
+	write_seqcount_end(&p.seq);
+}
+
+static enum alarmtimer_restart persistent_clock_alarm_fired(struct alarm *alarm,
+							    ktime_t now)
+{
+	persistent_clock_update();
+
+	alarm_forward(&p.alarm, now, p.wrap);
+	return ALARMTIMER_RESTART;
+}
+
+int persistent_clock_init_and_register(u64 (*read)(void), u64 mask,
+				       u32 freq, u64 maxsec)
+{
+	struct persistent_clock_read_data *read_data = &p.read_data;
+	u64 wrap, res, secs = maxsec;
+
+	if (!read || !mask || !freq)
+		return -EINVAL;
+
+	if (!secs) {
+		/*
+		 * If the timer driver did not specify the maximum conversion
+		 * seconds of the persistent clock, then we calculate the
+		 * conversion range with the persistent clock's bits and
+		 * frequency.
+		 */
+		secs = mask;
+		do_div(secs, freq);
+
+		/*
+		 * Some persistent counter can be larger than 32bit, so we
+		 * need limit the max suspend time to have a good conversion
+		 * precision. So 24 hours may be enough usually.
+		 */
+		if (secs > 86400)
+			secs = 86400;
+	}
+
+	/* Calculate the mult/shift to convert cycles to ns. */
+	clocks_calc_mult_shift(&read_data->mult, &read_data->shift, freq,
+			       NSEC_PER_SEC, (u32)secs);
+
+	/* Calculate how many nanoseconds until we risk wrapping. */
+	wrap = clocks_calc_max_nsecs(read_data->mult, read_data->shift, 0,
+				     mask, NULL);
+	p.wrap = ns_to_ktime(wrap);
+
+	p.freq = freq;
+	read_data->mask = mask;
+	read_data->read = read;
+
+	persistent_clock_update();
+
+	/* Calculate the ns resolution of this persistent clock. */
+	res = clocksource_cyc2ns(1ULL, read_data->mult, read_data->shift);
+
+	pr_info("persistent clock: mask %llu at %uHz, resolution %lluns, wraps every %lluns\n",
+		mask, freq, res, wrap);
+	return 0;
+}
+
+void persistent_clock_cleanup(void)
+{
+	p.read_data.read = NULL;
+
+	if (p.alarm_inited) {
+		alarm_cancel(&p.alarm);
+		p.alarm_inited = false;
+	}
+}
+
+void persistent_clock_start_alarmtimer(void)
+{
+	struct persistent_clock_read_data *read_data = &p.read_data;
+	ktime_t now;
+
+	/*
+	 * If no persistent clock function has been provided or the alarmtimer
+	 * has been initialized at that point, just return.
+	 */
+	if (!read_data->read || p.alarm_inited)
+		return;
+
+	persistent_clock_update();
+
+	/*
+	 * Since the persistent clock will not be stopped when system enters the
+	 * suspend state, thus we need start one alarmtimer to wakeup the system
+	 * to update the persistent clock before wrapping. We should start the
+	 * update alarmtimer after the alarmtimer subsystem was initialized.
+	 */
+	alarm_init(&p.alarm, ALARM_BOOTTIME, persistent_clock_alarm_fired);
+	now = ktime_get_boottime();
+	alarm_start(&p.alarm, ktime_add(now, p.wrap));
+	p.alarm_inited = true;
+}
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 1/8] time: Add persistent clock support
@ 2018-06-13 11:32   ` Baolin Wang
  0 siblings, 0 replies; 38+ messages in thread
From: Baolin Wang @ 2018-06-13 11:32 UTC (permalink / raw)
  To: linux-arm-kernel

On our Spreadtrum SC9860 platform, we registered the high resolution
ARM generic timer as one clocksource to update the OS time, but the
ARM generic timer will be stopped in suspend state. So we use one 64bit
always-on timer (but low resolution) of Spreadtrum to calculate the
suspend time to compensate the OS time. Though we can register the
always-on timer as one clocksource, we need re-calculate the
mult/shift with one larger conversion range to calculate the suspend
time.

But now we have too many different ways of dealing with persistent
timekeeping across architectures, and there will be many duplicate
code if we register one timer to be one persistent clock. Thus it
will be more helpful if we add one common framework for timer drivers
to be registered as one persistent clock and implement the common
read_persistent_clock64() to compensate the OS time.

Moreover we can register the clocksource with CLOCK_SOURCE_SUSPEND_NONSTOP
to be one persistent clock, then we can simplify the suspend/resume
accounting by removing CLOCK_SOURCE_SUSPEND_NONSTOP timing. After that
we can only compensate the OS time by persistent clock or RTC.

Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
---
 include/linux/persistent_clock.h |   23 +++++
 kernel/time/Kconfig              |    4 +
 kernel/time/Makefile             |    1 +
 kernel/time/alarmtimer.c         |    4 +
 kernel/time/persistent_clock.c   |  184 ++++++++++++++++++++++++++++++++++++++
 5 files changed, 216 insertions(+)
 create mode 100644 include/linux/persistent_clock.h
 create mode 100644 kernel/time/persistent_clock.c

diff --git a/include/linux/persistent_clock.h b/include/linux/persistent_clock.h
new file mode 100644
index 0000000..7d42c1a
--- /dev/null
+++ b/include/linux/persistent_clock.h
@@ -0,0 +1,23 @@
+// SPDX-License-Identifier: GPL-2.0
+#ifndef __PERSISTENT_CLOCK_H__
+#define __PERSISTENT_CLOCK_H__
+
+#ifdef CONFIG_PERSISTENT_CLOCK
+extern int persistent_clock_init_and_register(u64 (*read)(void),
+					      u64 mask, u32 freq,
+					      u64 maxsec);
+extern void persistent_clock_cleanup(void);
+extern void persistent_clock_start_alarmtimer(void);
+#else
+static inline int persistent_clock_init_and_register(u64 (*read)(void),
+						     u64 mask, u32 freq,
+						     u64 maxsec)
+{
+	return 0;
+}
+
+static inline void persistent_clock_cleanup(void) { }
+static inline void persistent_clock_start_alarmtimer(void) { }
+#endif
+
+#endif
diff --git a/kernel/time/Kconfig b/kernel/time/Kconfig
index 78eabc4..7188600 100644
--- a/kernel/time/Kconfig
+++ b/kernel/time/Kconfig
@@ -47,6 +47,10 @@ config GENERIC_CLOCKEVENTS_MIN_ADJUST
 config GENERIC_CMOS_UPDATE
 	bool
 
+# Persistent clock support
+config PERSISTENT_CLOCK
+	bool
+
 if GENERIC_CLOCKEVENTS
 menu "Timers subsystem"
 
diff --git a/kernel/time/Makefile b/kernel/time/Makefile
index f1e46f3..f6d368f 100644
--- a/kernel/time/Makefile
+++ b/kernel/time/Makefile
@@ -18,3 +18,4 @@ obj-$(CONFIG_GENERIC_SCHED_CLOCK)		+= sched_clock.o
 obj-$(CONFIG_TICK_ONESHOT)			+= tick-oneshot.o tick-sched.o
 obj-$(CONFIG_DEBUG_FS)				+= timekeeping_debug.o
 obj-$(CONFIG_TEST_UDELAY)			+= test_udelay.o
+obj-$(CONFIG_PERSISTENT_CLOCK)			+= persistent_clock.o
diff --git a/kernel/time/alarmtimer.c b/kernel/time/alarmtimer.c
index 639321b..1518fdb 100644
--- a/kernel/time/alarmtimer.c
+++ b/kernel/time/alarmtimer.c
@@ -29,6 +29,7 @@
 #include <linux/freezer.h>
 #include <linux/compat.h>
 #include <linux/module.h>
+#include <linux/persistent_clock.h>
 
 #include "posix-timers.h"
 
@@ -892,6 +893,9 @@ static int __init alarmtimer_init(void)
 		error = PTR_ERR(pdev);
 		goto out_drv;
 	}
+
+	/* Start one alarmtimer to update persistent clock */
+	persistent_clock_start_alarmtimer();
 	return 0;
 
 out_drv:
diff --git a/kernel/time/persistent_clock.c b/kernel/time/persistent_clock.c
new file mode 100644
index 0000000..edaa659
--- /dev/null
+++ b/kernel/time/persistent_clock.c
@@ -0,0 +1,184 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2018 Linaro, Inc.
+ *
+ * Author: Baolin Wang <baolin.wang@linaro.org>
+ */
+
+#include <linux/alarmtimer.h>
+#include <linux/clocksource.h>
+#include <linux/persistent_clock.h>
+
+/**
+ * persistent_clock_read_data - data required to read persistent clock
+ * @read: Returns a cycle value from persistent clock.
+ * @last_cycles: Clock cycle value at last update.
+ * @last_ns: Time value (nanoseconds) at last update.
+ * @mask: Bitmask for two's complement subtraction of non 64bit clocks.
+ * @mult: Cycle to nanosecond multiplier.
+ * @shift: Cycle to nanosecond divisor.
+ */
+struct persistent_clock_read_data {
+	u64 (*read)(void);
+	u64 last_cycles;
+	u64 last_ns;
+	u64 mask;
+	u32 mult;
+	u32 shift;
+};
+
+/**
+ * persistent_clock - represent the persistent clock
+ * @read_data: Data required to read from persistent clock.
+ * @seq: Sequence counter for protecting updates.
+ * @freq: The frequency of the persistent clock.
+ * @wrap: Duration for persistent clock can run before wrapping.
+ * @alarm: Update timeout for persistent clock wrap.
+ * @alarm_inited: Indicate if the alarm has been initialized.
+ */
+struct persistent_clock {
+	struct persistent_clock_read_data read_data;
+	seqcount_t seq;
+	u32 freq;
+	ktime_t wrap;
+	struct alarm alarm;
+	bool alarm_inited;
+};
+
+static struct persistent_clock p;
+
+void read_persistent_clock64(struct timespec64 *ts)
+{
+	struct persistent_clock_read_data *read_data = &p.read_data;
+	unsigned long seq;
+	u64 delta, nsecs;
+
+	if (!read_data->read) {
+		ts->tv_sec = 0;
+		ts->tv_nsec = 0;
+		return;
+	}
+
+	do {
+		seq = read_seqcount_begin(&p.seq);
+		delta = (read_data->read() - read_data->last_cycles) &
+			read_data->mask;
+
+		nsecs = read_data->last_ns +
+			clocksource_cyc2ns(delta, read_data->mult,
+					   read_data->shift);
+		*ts = ns_to_timespec64(nsecs);
+	} while (read_seqcount_retry(&p.seq, seq));
+}
+
+static void persistent_clock_update(void)
+{
+	struct persistent_clock_read_data *read_data = &p.read_data;
+	u64 cycles, delta;
+
+	write_seqcount_begin(&p.seq);
+
+	cycles = read_data->read();
+	delta = (cycles - read_data->last_cycles) & read_data->mask;
+	read_data->last_ns += clocksource_cyc2ns(delta, read_data->mult,
+						 read_data->shift);
+	read_data->last_cycles = cycles;
+
+	write_seqcount_end(&p.seq);
+}
+
+static enum alarmtimer_restart persistent_clock_alarm_fired(struct alarm *alarm,
+							    ktime_t now)
+{
+	persistent_clock_update();
+
+	alarm_forward(&p.alarm, now, p.wrap);
+	return ALARMTIMER_RESTART;
+}
+
+int persistent_clock_init_and_register(u64 (*read)(void), u64 mask,
+				       u32 freq, u64 maxsec)
+{
+	struct persistent_clock_read_data *read_data = &p.read_data;
+	u64 wrap, res, secs = maxsec;
+
+	if (!read || !mask || !freq)
+		return -EINVAL;
+
+	if (!secs) {
+		/*
+		 * If the timer driver did not specify the maximum conversion
+		 * seconds of the persistent clock, then we calculate the
+		 * conversion range with the persistent clock's bits and
+		 * frequency.
+		 */
+		secs = mask;
+		do_div(secs, freq);
+
+		/*
+		 * Some persistent counter can be larger than 32bit, so we
+		 * need limit the max suspend time to have a good conversion
+		 * precision. So 24 hours may be enough usually.
+		 */
+		if (secs > 86400)
+			secs = 86400;
+	}
+
+	/* Calculate the mult/shift to convert cycles to ns. */
+	clocks_calc_mult_shift(&read_data->mult, &read_data->shift, freq,
+			       NSEC_PER_SEC, (u32)secs);
+
+	/* Calculate how many nanoseconds until we risk wrapping. */
+	wrap = clocks_calc_max_nsecs(read_data->mult, read_data->shift, 0,
+				     mask, NULL);
+	p.wrap = ns_to_ktime(wrap);
+
+	p.freq = freq;
+	read_data->mask = mask;
+	read_data->read = read;
+
+	persistent_clock_update();
+
+	/* Calculate the ns resolution of this persistent clock. */
+	res = clocksource_cyc2ns(1ULL, read_data->mult, read_data->shift);
+
+	pr_info("persistent clock: mask %llu at %uHz, resolution %lluns, wraps every %lluns\n",
+		mask, freq, res, wrap);
+	return 0;
+}
+
+void persistent_clock_cleanup(void)
+{
+	p.read_data.read = NULL;
+
+	if (p.alarm_inited) {
+		alarm_cancel(&p.alarm);
+		p.alarm_inited = false;
+	}
+}
+
+void persistent_clock_start_alarmtimer(void)
+{
+	struct persistent_clock_read_data *read_data = &p.read_data;
+	ktime_t now;
+
+	/*
+	 * If no persistent clock function has been provided or the alarmtimer
+	 * has been initialized at that point, just return.
+	 */
+	if (!read_data->read || p.alarm_inited)
+		return;
+
+	persistent_clock_update();
+
+	/*
+	 * Since the persistent clock will not be stopped when system enters the
+	 * suspend state, thus we need start one alarmtimer to wakeup the system
+	 * to update the persistent clock before wrapping. We should start the
+	 * update alarmtimer after the alarmtimer subsystem was initialized.
+	 */
+	alarm_init(&p.alarm, ALARM_BOOTTIME, persistent_clock_alarm_fired);
+	now = ktime_get_boottime();
+	alarm_start(&p.alarm, ktime_add(now, p.wrap));
+	p.alarm_inited = true;
+}
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 2/8] clocksource: sprd: Add one persistent timer for Spreadtrum platform
  2018-06-13 11:32 ` Baolin Wang
@ 2018-06-13 11:32   ` Baolin Wang
  -1 siblings, 0 replies; 38+ messages in thread
From: Baolin Wang @ 2018-06-13 11:32 UTC (permalink / raw)
  To: tglx, john.stultz, daniel.lezcano, arnd, tony, aaro.koskinen,
	linux, mark.rutland, marc.zyngier
  Cc: baolin.wang, broonie, paulmck, mlichvar, rdunlap, kstewart,
	gregkh, pombredanne, thierry.reding, jonathanh, heiko,
	linus.walleij, viresh.kumar, mingo, hpa, peterz, douly.fnst,
	len.brown, rajvi.jingar, alexandre.belloni, x86,
	linux-arm-kernel, linux-tegra, linux-kernel, linux-omap

On Spreadtrum SC9860 platform, we need one persistent timer to calculate
the suspend time to compensate the OS time.

This patch registers one Spreadtrum AON timer as persistent timer, which
runs at 32bit and periodic mode.

Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
---
 drivers/clocksource/Kconfig      |    1 +
 drivers/clocksource/timer-sprd.c |   80 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 81 insertions(+)

diff --git a/drivers/clocksource/Kconfig b/drivers/clocksource/Kconfig
index dec0dd8..7f11c6c 100644
--- a/drivers/clocksource/Kconfig
+++ b/drivers/clocksource/Kconfig
@@ -455,6 +455,7 @@ config SPRD_TIMER
 	depends on (ARCH_SPRD || COMPILE_TEST)
 	default ARCH_SPRD
 	select TIMER_OF
+	select PERSISTENT_CLOCK
 	help
 	  Enables support for the Spreadtrum timer driver.
 
diff --git a/drivers/clocksource/timer-sprd.c b/drivers/clocksource/timer-sprd.c
index ef9ebea..c6f657a 100644
--- a/drivers/clocksource/timer-sprd.c
+++ b/drivers/clocksource/timer-sprd.c
@@ -3,8 +3,11 @@
  * Copyright (C) 2017 Spreadtrum Communications Inc.
  */
 
+#include <linux/clk.h>
 #include <linux/init.h>
 #include <linux/interrupt.h>
+#include <linux/of_address.h>
+#include <linux/persistent_clock.h>
 
 #include "timer-of.h"
 
@@ -157,3 +160,80 @@ static int __init sprd_timer_init(struct device_node *np)
 }
 
 TIMER_OF_DECLARE(sc9860_timer, "sprd,sc9860-timer", sprd_timer_init);
+
+void __iomem *pbase;
+
+static u64 sprd_persistent_timer_read(void)
+{
+	return ~(u64)readl_relaxed(pbase + TIMER_VALUE_SHDW_LO) &
+		CLOCKSOURCE_MASK(32);
+}
+
+static void sprd_persistent_timer_disable(void)
+{
+	sprd_timer_disable(pbase);
+}
+
+static void sprd_persistent_timer_enable(void)
+{
+	sprd_timer_disable(pbase);
+	sprd_timer_update_counter(pbase, TIMER_VALUE_LO_MASK);
+	sprd_timer_enable(pbase, TIMER_CTL_PERIOD_MODE);
+}
+
+static int __init sprd_persistent_timer_init(struct device_node *np)
+{
+	struct clk *clk;
+	u32 freq;
+	int ret;
+
+	clk = of_clk_get(np, 0);
+	if (IS_ERR(clk)) {
+		pr_err("Can't get timer clock for %pOF\n", np);
+		return PTR_ERR(clk);
+	}
+
+	ret = clk_prepare_enable(clk);
+	if (ret) {
+		pr_err("Failed to enable clock for %pOF\n", np);
+		clk_put(clk);
+		return ret;
+	}
+
+	freq = clk_get_rate(clk);
+	if (!freq) {
+		pr_err("Failed to get clock rate for %pOF\n", np);
+		ret = -EINVAL;
+		goto clk_rate_err;
+	}
+
+	pbase = of_io_request_and_map(np, 0, of_node_full_name(np));
+	if (IS_ERR(pbase)) {
+		pr_err("Can't map timer registers for %pOF\n", np);
+		ret = PTR_ERR(pbase);
+		goto clk_rate_err;
+	}
+
+	sprd_persistent_timer_enable();
+
+	ret = persistent_clock_init_and_register(sprd_persistent_timer_read,
+						 CLOCKSOURCE_MASK(32), freq, 0);
+	if (ret) {
+		pr_err("Failed to register persistent clock for %pOF\n", np);
+		goto persist_err;
+	}
+
+	return 0;
+
+persist_err:
+	sprd_persistent_timer_disable();
+	iounmap(pbase);
+clk_rate_err:
+	clk_disable_unprepare(clk);
+	clk_put(clk);
+
+	return ret;
+}
+
+TIMER_OF_DECLARE(sc9860_persistent_timer, "sprd,sc9860-persistent-timer",
+		 sprd_persistent_timer_init);
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 2/8] clocksource: sprd: Add one persistent timer for Spreadtrum platform
@ 2018-06-13 11:32   ` Baolin Wang
  0 siblings, 0 replies; 38+ messages in thread
From: Baolin Wang @ 2018-06-13 11:32 UTC (permalink / raw)
  To: linux-arm-kernel

On Spreadtrum SC9860 platform, we need one persistent timer to calculate
the suspend time to compensate the OS time.

This patch registers one Spreadtrum AON timer as persistent timer, which
runs at 32bit and periodic mode.

Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
---
 drivers/clocksource/Kconfig      |    1 +
 drivers/clocksource/timer-sprd.c |   80 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 81 insertions(+)

diff --git a/drivers/clocksource/Kconfig b/drivers/clocksource/Kconfig
index dec0dd8..7f11c6c 100644
--- a/drivers/clocksource/Kconfig
+++ b/drivers/clocksource/Kconfig
@@ -455,6 +455,7 @@ config SPRD_TIMER
 	depends on (ARCH_SPRD || COMPILE_TEST)
 	default ARCH_SPRD
 	select TIMER_OF
+	select PERSISTENT_CLOCK
 	help
 	  Enables support for the Spreadtrum timer driver.
 
diff --git a/drivers/clocksource/timer-sprd.c b/drivers/clocksource/timer-sprd.c
index ef9ebea..c6f657a 100644
--- a/drivers/clocksource/timer-sprd.c
+++ b/drivers/clocksource/timer-sprd.c
@@ -3,8 +3,11 @@
  * Copyright (C) 2017 Spreadtrum Communications Inc.
  */
 
+#include <linux/clk.h>
 #include <linux/init.h>
 #include <linux/interrupt.h>
+#include <linux/of_address.h>
+#include <linux/persistent_clock.h>
 
 #include "timer-of.h"
 
@@ -157,3 +160,80 @@ static int __init sprd_timer_init(struct device_node *np)
 }
 
 TIMER_OF_DECLARE(sc9860_timer, "sprd,sc9860-timer", sprd_timer_init);
+
+void __iomem *pbase;
+
+static u64 sprd_persistent_timer_read(void)
+{
+	return ~(u64)readl_relaxed(pbase + TIMER_VALUE_SHDW_LO) &
+		CLOCKSOURCE_MASK(32);
+}
+
+static void sprd_persistent_timer_disable(void)
+{
+	sprd_timer_disable(pbase);
+}
+
+static void sprd_persistent_timer_enable(void)
+{
+	sprd_timer_disable(pbase);
+	sprd_timer_update_counter(pbase, TIMER_VALUE_LO_MASK);
+	sprd_timer_enable(pbase, TIMER_CTL_PERIOD_MODE);
+}
+
+static int __init sprd_persistent_timer_init(struct device_node *np)
+{
+	struct clk *clk;
+	u32 freq;
+	int ret;
+
+	clk = of_clk_get(np, 0);
+	if (IS_ERR(clk)) {
+		pr_err("Can't get timer clock for %pOF\n", np);
+		return PTR_ERR(clk);
+	}
+
+	ret = clk_prepare_enable(clk);
+	if (ret) {
+		pr_err("Failed to enable clock for %pOF\n", np);
+		clk_put(clk);
+		return ret;
+	}
+
+	freq = clk_get_rate(clk);
+	if (!freq) {
+		pr_err("Failed to get clock rate for %pOF\n", np);
+		ret = -EINVAL;
+		goto clk_rate_err;
+	}
+
+	pbase = of_io_request_and_map(np, 0, of_node_full_name(np));
+	if (IS_ERR(pbase)) {
+		pr_err("Can't map timer registers for %pOF\n", np);
+		ret = PTR_ERR(pbase);
+		goto clk_rate_err;
+	}
+
+	sprd_persistent_timer_enable();
+
+	ret = persistent_clock_init_and_register(sprd_persistent_timer_read,
+						 CLOCKSOURCE_MASK(32), freq, 0);
+	if (ret) {
+		pr_err("Failed to register persistent clock for %pOF\n", np);
+		goto persist_err;
+	}
+
+	return 0;
+
+persist_err:
+	sprd_persistent_timer_disable();
+	iounmap(pbase);
+clk_rate_err:
+	clk_disable_unprepare(clk);
+	clk_put(clk);
+
+	return ret;
+}
+
+TIMER_OF_DECLARE(sc9860_persistent_timer, "sprd,sc9860-persistent-timer",
+		 sprd_persistent_timer_init);
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 3/8] arm: time: Remove the persistent clock support for ARM architecture
  2018-06-13 11:32 ` Baolin Wang
@ 2018-06-13 11:32   ` Baolin Wang
  -1 siblings, 0 replies; 38+ messages in thread
From: Baolin Wang @ 2018-06-13 11:32 UTC (permalink / raw)
  To: tglx, john.stultz, daniel.lezcano, arnd, tony, aaro.koskinen,
	linux, mark.rutland, marc.zyngier
  Cc: baolin.wang, broonie, paulmck, mlichvar, rdunlap, kstewart,
	gregkh, pombredanne, thierry.reding, jonathanh, heiko,
	linus.walleij, viresh.kumar, mingo, hpa, peterz, douly.fnst,
	len.brown, rajvi.jingar, alexandre.belloni, x86,
	linux-arm-kernel, linux-tegra, linux-kernel, linux-omap

We have introduced the persistent clock framework to support the OS time
compensating from persistent clock, and we will convert all drivers to
use common persistent clock framework instead of the persistent clock
support used only for the ARM architecture. So we can remove these code
with converting the Omap 32k counter and tegra20 timer.

Moreover there are no drivers will register read_boot_clock64(), so
remove it too.

Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
---
 arch/arm/include/asm/mach/time.h    |    4 ----
 arch/arm/kernel/time.c              |   36 ----------------------------
 arch/arm/plat-omap/Kconfig          |    1 +
 arch/arm/plat-omap/counter_32k.c    |   44 ++++++-----------------------------
 drivers/clocksource/tegra20_timer.c |   12 +++++++---
 5 files changed, 17 insertions(+), 80 deletions(-)

diff --git a/arch/arm/include/asm/mach/time.h b/arch/arm/include/asm/mach/time.h
index 0f79e4d..3cbcafc 100644
--- a/arch/arm/include/asm/mach/time.h
+++ b/arch/arm/include/asm/mach/time.h
@@ -12,8 +12,4 @@
 
 extern void timer_tick(void);
 
-typedef void (*clock_access_fn)(struct timespec64 *);
-extern int register_persistent_clock(clock_access_fn read_boot,
-				     clock_access_fn read_persistent);
-
 #endif
diff --git a/arch/arm/kernel/time.c b/arch/arm/kernel/time.c
index cf2701c..713905c 100644
--- a/arch/arm/kernel/time.c
+++ b/arch/arm/kernel/time.c
@@ -76,42 +76,6 @@ void timer_tick(void)
 }
 #endif
 
-static void dummy_clock_access(struct timespec64 *ts)
-{
-	ts->tv_sec = 0;
-	ts->tv_nsec = 0;
-}
-
-static clock_access_fn __read_persistent_clock = dummy_clock_access;
-static clock_access_fn __read_boot_clock = dummy_clock_access;
-
-void read_persistent_clock64(struct timespec64 *ts)
-{
-	__read_persistent_clock(ts);
-}
-
-void read_boot_clock64(struct timespec64 *ts)
-{
-	__read_boot_clock(ts);
-}
-
-int __init register_persistent_clock(clock_access_fn read_boot,
-				     clock_access_fn read_persistent)
-{
-	/* Only allow the clockaccess functions to be registered once */
-	if (__read_persistent_clock == dummy_clock_access &&
-	    __read_boot_clock == dummy_clock_access) {
-		if (read_boot)
-			__read_boot_clock = read_boot;
-		if (read_persistent)
-			__read_persistent_clock = read_persistent;
-
-		return 0;
-	}
-
-	return -EINVAL;
-}
-
 void __init time_init(void)
 {
 	if (machine_desc->init_time) {
diff --git a/arch/arm/plat-omap/Kconfig b/arch/arm/plat-omap/Kconfig
index c0a242c..073a80f 100644
--- a/arch/arm/plat-omap/Kconfig
+++ b/arch/arm/plat-omap/Kconfig
@@ -1,5 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0
 config ARCH_OMAP
+	select PERSISTENT_CLOCK
 	bool
 
 if ARCH_OMAP
diff --git a/arch/arm/plat-omap/counter_32k.c b/arch/arm/plat-omap/counter_32k.c
index 2438b96..5d52f7c 100644
--- a/arch/arm/plat-omap/counter_32k.c
+++ b/arch/arm/plat-omap/counter_32k.c
@@ -19,8 +19,7 @@
 #include <linux/io.h>
 #include <linux/clocksource.h>
 #include <linux/sched_clock.h>
-
-#include <asm/mach/time.h>
+#include <linux/persistent_clock.h>
 
 #include <plat/counter-32k.h>
 
@@ -44,33 +43,6 @@ static u64 notrace omap_32k_read_sched_clock(void)
 }
 
 /**
- * omap_read_persistent_clock64 -  Return time from a persistent clock.
- *
- * Reads the time from a source which isn't disabled during PM, the
- * 32k sync timer.  Convert the cycles elapsed since last read into
- * nsecs and adds to a monotonically increasing timespec64.
- */
-static struct timespec64 persistent_ts;
-static cycles_t cycles;
-static unsigned int persistent_mult, persistent_shift;
-
-static void omap_read_persistent_clock64(struct timespec64 *ts)
-{
-	unsigned long long nsecs;
-	cycles_t last_cycles;
-
-	last_cycles = cycles;
-	cycles = sync32k_cnt_reg ? readl_relaxed(sync32k_cnt_reg) : 0;
-
-	nsecs = clocksource_cyc2ns(cycles - last_cycles,
-					persistent_mult, persistent_shift);
-
-	timespec64_add_ns(&persistent_ts, nsecs);
-
-	*ts = persistent_ts;
-}
-
-/**
  * omap_init_clocksource_32k - setup and register counter 32k as a
  * kernel clocksource
  * @pbase: base addr of counter_32k module
@@ -95,13 +67,6 @@ int __init omap_init_clocksource_32k(void __iomem *vbase)
 	else
 		sync32k_cnt_reg = vbase + OMAP2_32KSYNCNT_CR_OFF_LOW;
 
-	/*
-	 * 120000 rough estimate from the calculations in
-	 * __clocksource_update_freq_scale.
-	 */
-	clocks_calc_mult_shift(&persistent_mult, &persistent_shift,
-			32768, NSEC_PER_SEC, 120000);
-
 	ret = clocksource_mmio_init(sync32k_cnt_reg, "32k_counter", 32768,
 				250, 32, clocksource_mmio_readl_up);
 	if (ret) {
@@ -110,7 +75,12 @@ int __init omap_init_clocksource_32k(void __iomem *vbase)
 	}
 
 	sched_clock_register(omap_32k_read_sched_clock, 32, 32768);
-	register_persistent_clock(NULL, omap_read_persistent_clock64);
+	/*
+	 * 120000 rough estimate from the calculations in
+	 * __clocksource_update_freq_scale.
+	 */
+	persistent_clock_init_and_register(omap_32k_read_sched_clock,
+					   CLOCKSOURCE_MASK(32), 32768, 120000);
 	pr_info("OMAP clocksource: 32k_counter at 32768 Hz\n");
 
 	return 0;
diff --git a/drivers/clocksource/tegra20_timer.c b/drivers/clocksource/tegra20_timer.c
index c337a81..97a34cb 100644
--- a/drivers/clocksource/tegra20_timer.c
+++ b/drivers/clocksource/tegra20_timer.c
@@ -124,7 +124,7 @@ static u64 tegra_rtc_read_ms(void)
 }
 
 /*
- * tegra_read_persistent_clock64 -  Return time from a persistent clock.
+ * read_persistent_clock64 -  Return time from a persistent clock.
  *
  * Reads the time from a source which isn't disabled during PM, the
  * 32k sync timer.  Convert the cycles elapsed since last read into
@@ -133,10 +133,16 @@ static u64 tegra_rtc_read_ms(void)
  * tegra_rtc driver could be executing to avoid race conditions
  * on the RTC shadow register
  */
-static void tegra_read_persistent_clock64(struct timespec64 *ts)
+void read_persistent_clock64(struct timespec64 *ts)
 {
 	u64 delta;
 
+	if (!rtc_base) {
+		ts->tv_sec = 0;
+		ts->tv_nsec = 0;
+		return;
+	}
+
 	last_persistent_ms = persistent_ms;
 	persistent_ms = tegra_rtc_read_ms();
 	delta = persistent_ms - last_persistent_ms;
@@ -259,6 +265,6 @@ static int __init tegra20_init_rtc(struct device_node *np)
 	else
 		clk_prepare_enable(clk);
 
-	return register_persistent_clock(NULL, tegra_read_persistent_clock64);
+	return 0;
 }
 TIMER_OF_DECLARE(tegra20_rtc, "nvidia,tegra20-rtc", tegra20_init_rtc);
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 3/8] arm: time: Remove the persistent clock support for ARM architecture
@ 2018-06-13 11:32   ` Baolin Wang
  0 siblings, 0 replies; 38+ messages in thread
From: Baolin Wang @ 2018-06-13 11:32 UTC (permalink / raw)
  To: linux-arm-kernel

We have introduced the persistent clock framework to support the OS time
compensating from persistent clock, and we will convert all drivers to
use common persistent clock framework instead of the persistent clock
support used only for the ARM architecture. So we can remove these code
with converting the Omap 32k counter and tegra20 timer.

Moreover there are no drivers will register read_boot_clock64(), so
remove it too.

Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
---
 arch/arm/include/asm/mach/time.h    |    4 ----
 arch/arm/kernel/time.c              |   36 ----------------------------
 arch/arm/plat-omap/Kconfig          |    1 +
 arch/arm/plat-omap/counter_32k.c    |   44 ++++++-----------------------------
 drivers/clocksource/tegra20_timer.c |   12 +++++++---
 5 files changed, 17 insertions(+), 80 deletions(-)

diff --git a/arch/arm/include/asm/mach/time.h b/arch/arm/include/asm/mach/time.h
index 0f79e4d..3cbcafc 100644
--- a/arch/arm/include/asm/mach/time.h
+++ b/arch/arm/include/asm/mach/time.h
@@ -12,8 +12,4 @@
 
 extern void timer_tick(void);
 
-typedef void (*clock_access_fn)(struct timespec64 *);
-extern int register_persistent_clock(clock_access_fn read_boot,
-				     clock_access_fn read_persistent);
-
 #endif
diff --git a/arch/arm/kernel/time.c b/arch/arm/kernel/time.c
index cf2701c..713905c 100644
--- a/arch/arm/kernel/time.c
+++ b/arch/arm/kernel/time.c
@@ -76,42 +76,6 @@ void timer_tick(void)
 }
 #endif
 
-static void dummy_clock_access(struct timespec64 *ts)
-{
-	ts->tv_sec = 0;
-	ts->tv_nsec = 0;
-}
-
-static clock_access_fn __read_persistent_clock = dummy_clock_access;
-static clock_access_fn __read_boot_clock = dummy_clock_access;
-
-void read_persistent_clock64(struct timespec64 *ts)
-{
-	__read_persistent_clock(ts);
-}
-
-void read_boot_clock64(struct timespec64 *ts)
-{
-	__read_boot_clock(ts);
-}
-
-int __init register_persistent_clock(clock_access_fn read_boot,
-				     clock_access_fn read_persistent)
-{
-	/* Only allow the clockaccess functions to be registered once */
-	if (__read_persistent_clock == dummy_clock_access &&
-	    __read_boot_clock == dummy_clock_access) {
-		if (read_boot)
-			__read_boot_clock = read_boot;
-		if (read_persistent)
-			__read_persistent_clock = read_persistent;
-
-		return 0;
-	}
-
-	return -EINVAL;
-}
-
 void __init time_init(void)
 {
 	if (machine_desc->init_time) {
diff --git a/arch/arm/plat-omap/Kconfig b/arch/arm/plat-omap/Kconfig
index c0a242c..073a80f 100644
--- a/arch/arm/plat-omap/Kconfig
+++ b/arch/arm/plat-omap/Kconfig
@@ -1,5 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0
 config ARCH_OMAP
+	select PERSISTENT_CLOCK
 	bool
 
 if ARCH_OMAP
diff --git a/arch/arm/plat-omap/counter_32k.c b/arch/arm/plat-omap/counter_32k.c
index 2438b96..5d52f7c 100644
--- a/arch/arm/plat-omap/counter_32k.c
+++ b/arch/arm/plat-omap/counter_32k.c
@@ -19,8 +19,7 @@
 #include <linux/io.h>
 #include <linux/clocksource.h>
 #include <linux/sched_clock.h>
-
-#include <asm/mach/time.h>
+#include <linux/persistent_clock.h>
 
 #include <plat/counter-32k.h>
 
@@ -44,33 +43,6 @@ static u64 notrace omap_32k_read_sched_clock(void)
 }
 
 /**
- * omap_read_persistent_clock64 -  Return time from a persistent clock.
- *
- * Reads the time from a source which isn't disabled during PM, the
- * 32k sync timer.  Convert the cycles elapsed since last read into
- * nsecs and adds to a monotonically increasing timespec64.
- */
-static struct timespec64 persistent_ts;
-static cycles_t cycles;
-static unsigned int persistent_mult, persistent_shift;
-
-static void omap_read_persistent_clock64(struct timespec64 *ts)
-{
-	unsigned long long nsecs;
-	cycles_t last_cycles;
-
-	last_cycles = cycles;
-	cycles = sync32k_cnt_reg ? readl_relaxed(sync32k_cnt_reg) : 0;
-
-	nsecs = clocksource_cyc2ns(cycles - last_cycles,
-					persistent_mult, persistent_shift);
-
-	timespec64_add_ns(&persistent_ts, nsecs);
-
-	*ts = persistent_ts;
-}
-
-/**
  * omap_init_clocksource_32k - setup and register counter 32k as a
  * kernel clocksource
  * @pbase: base addr of counter_32k module
@@ -95,13 +67,6 @@ int __init omap_init_clocksource_32k(void __iomem *vbase)
 	else
 		sync32k_cnt_reg = vbase + OMAP2_32KSYNCNT_CR_OFF_LOW;
 
-	/*
-	 * 120000 rough estimate from the calculations in
-	 * __clocksource_update_freq_scale.
-	 */
-	clocks_calc_mult_shift(&persistent_mult, &persistent_shift,
-			32768, NSEC_PER_SEC, 120000);
-
 	ret = clocksource_mmio_init(sync32k_cnt_reg, "32k_counter", 32768,
 				250, 32, clocksource_mmio_readl_up);
 	if (ret) {
@@ -110,7 +75,12 @@ int __init omap_init_clocksource_32k(void __iomem *vbase)
 	}
 
 	sched_clock_register(omap_32k_read_sched_clock, 32, 32768);
-	register_persistent_clock(NULL, omap_read_persistent_clock64);
+	/*
+	 * 120000 rough estimate from the calculations in
+	 * __clocksource_update_freq_scale.
+	 */
+	persistent_clock_init_and_register(omap_32k_read_sched_clock,
+					   CLOCKSOURCE_MASK(32), 32768, 120000);
 	pr_info("OMAP clocksource: 32k_counter at 32768 Hz\n");
 
 	return 0;
diff --git a/drivers/clocksource/tegra20_timer.c b/drivers/clocksource/tegra20_timer.c
index c337a81..97a34cb 100644
--- a/drivers/clocksource/tegra20_timer.c
+++ b/drivers/clocksource/tegra20_timer.c
@@ -124,7 +124,7 @@ static u64 tegra_rtc_read_ms(void)
 }
 
 /*
- * tegra_read_persistent_clock64 -  Return time from a persistent clock.
+ * read_persistent_clock64 -  Return time from a persistent clock.
  *
  * Reads the time from a source which isn't disabled during PM, the
  * 32k sync timer.  Convert the cycles elapsed since last read into
@@ -133,10 +133,16 @@ static u64 tegra_rtc_read_ms(void)
  * tegra_rtc driver could be executing to avoid race conditions
  * on the RTC shadow register
  */
-static void tegra_read_persistent_clock64(struct timespec64 *ts)
+void read_persistent_clock64(struct timespec64 *ts)
 {
 	u64 delta;
 
+	if (!rtc_base) {
+		ts->tv_sec = 0;
+		ts->tv_nsec = 0;
+		return;
+	}
+
 	last_persistent_ms = persistent_ms;
 	persistent_ms = tegra_rtc_read_ms();
 	delta = persistent_ms - last_persistent_ms;
@@ -259,6 +265,6 @@ static int __init tegra20_init_rtc(struct device_node *np)
 	else
 		clk_prepare_enable(clk);
 
-	return register_persistent_clock(NULL, tegra_read_persistent_clock64);
+	return 0;
 }
 TIMER_OF_DECLARE(tegra20_rtc, "nvidia,tegra20-rtc", tegra20_init_rtc);
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 4/8] clocksource: arm_arch_timer: Register the persistent clock
  2018-06-13 11:32 ` Baolin Wang
@ 2018-06-13 11:32   ` Baolin Wang
  -1 siblings, 0 replies; 38+ messages in thread
From: Baolin Wang @ 2018-06-13 11:32 UTC (permalink / raw)
  To: tglx, john.stultz, daniel.lezcano, arnd, tony, aaro.koskinen,
	linux, mark.rutland, marc.zyngier
  Cc: baolin.wang, broonie, paulmck, mlichvar, rdunlap, kstewart,
	gregkh, pombredanne, thierry.reding, jonathanh, heiko,
	linus.walleij, viresh.kumar, mingo, hpa, peterz, douly.fnst,
	len.brown, rajvi.jingar, alexandre.belloni, x86,
	linux-arm-kernel, linux-tegra, linux-kernel, linux-omap

Register the persistent clock to compensate the suspend time for OS time,
if the ARM counter clocksource will not be stopped in suspend state.

Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
---
 drivers/clocksource/Kconfig          |    1 +
 drivers/clocksource/arm_arch_timer.c |   10 ++++++++++
 2 files changed, 11 insertions(+)

diff --git a/drivers/clocksource/Kconfig b/drivers/clocksource/Kconfig
index 7f11c6c..5e51fcf 100644
--- a/drivers/clocksource/Kconfig
+++ b/drivers/clocksource/Kconfig
@@ -308,6 +308,7 @@ config ARC_TIMERS_64BIT
 
 config ARM_ARCH_TIMER
 	bool
+	select PERSISTENT_CLOCK
 	select TIMER_OF if OF
 	select TIMER_ACPI if ACPI
 
diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
index 57cb2f0..671be63 100644
--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -32,6 +32,7 @@
 #include <asm/virt.h>
 
 #include <clocksource/arm_arch_timer.h>
+#include <linux/persistent_clock.h>
 
 #undef pr_fmt
 #define pr_fmt(fmt) "arch_timer: " fmt
@@ -950,6 +951,15 @@ static void __init arch_counter_register(unsigned type)
 
 	/* 56 bits minimum, so we assume worst case rollover */
 	sched_clock_register(arch_timer_read_counter, 56, arch_timer_rate);
+
+	/*
+	 * Register the persistent clock if the clocksource will not be stopped
+	 * in suspend state.
+	 */
+	if (!arch_counter_suspend_stop)
+		persistent_clock_init_and_register(arch_timer_read_counter,
+						   CLOCKSOURCE_MASK(56),
+						   arch_timer_rate, 0);
 }
 
 static void arch_timer_stop(struct clock_event_device *clk)
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 4/8] clocksource: arm_arch_timer: Register the persistent clock
@ 2018-06-13 11:32   ` Baolin Wang
  0 siblings, 0 replies; 38+ messages in thread
From: Baolin Wang @ 2018-06-13 11:32 UTC (permalink / raw)
  To: linux-arm-kernel

Register the persistent clock to compensate the suspend time for OS time,
if the ARM counter clocksource will not be stopped in suspend state.

Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
---
 drivers/clocksource/Kconfig          |    1 +
 drivers/clocksource/arm_arch_timer.c |   10 ++++++++++
 2 files changed, 11 insertions(+)

diff --git a/drivers/clocksource/Kconfig b/drivers/clocksource/Kconfig
index 7f11c6c..5e51fcf 100644
--- a/drivers/clocksource/Kconfig
+++ b/drivers/clocksource/Kconfig
@@ -308,6 +308,7 @@ config ARC_TIMERS_64BIT
 
 config ARM_ARCH_TIMER
 	bool
+	select PERSISTENT_CLOCK
 	select TIMER_OF if OF
 	select TIMER_ACPI if ACPI
 
diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
index 57cb2f0..671be63 100644
--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -32,6 +32,7 @@
 #include <asm/virt.h>
 
 #include <clocksource/arm_arch_timer.h>
+#include <linux/persistent_clock.h>
 
 #undef pr_fmt
 #define pr_fmt(fmt) "arch_timer: " fmt
@@ -950,6 +951,15 @@ static void __init arch_counter_register(unsigned type)
 
 	/* 56 bits minimum, so we assume worst case rollover */
 	sched_clock_register(arch_timer_read_counter, 56, arch_timer_rate);
+
+	/*
+	 * Register the persistent clock if the clocksource will not be stopped
+	 * in suspend state.
+	 */
+	if (!arch_counter_suspend_stop)
+		persistent_clock_init_and_register(arch_timer_read_counter,
+						   CLOCKSOURCE_MASK(56),
+						   arch_timer_rate, 0);
 }
 
 static void arch_timer_stop(struct clock_event_device *clk)
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 5/8] clocksource: timer-ti-32k: Register the persistent clock
  2018-06-13 11:32 ` Baolin Wang
@ 2018-06-13 11:32   ` Baolin Wang
  -1 siblings, 0 replies; 38+ messages in thread
From: Baolin Wang @ 2018-06-13 11:32 UTC (permalink / raw)
  To: tglx, john.stultz, daniel.lezcano, arnd, tony, aaro.koskinen,
	linux, mark.rutland, marc.zyngier
  Cc: baolin.wang, broonie, paulmck, mlichvar, rdunlap, kstewart,
	gregkh, pombredanne, thierry.reding, jonathanh, heiko,
	linus.walleij, viresh.kumar, mingo, hpa, peterz, douly.fnst,
	len.brown, rajvi.jingar, alexandre.belloni, x86,
	linux-arm-kernel, linux-tegra, linux-kernel, linux-omap

Since the 32K counter is always available, then we can register the
persistent clock to compensate the suspend time for the OS time.

Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
---
 drivers/clocksource/Kconfig        |    1 +
 drivers/clocksource/timer-ti-32k.c |    4 ++++
 2 files changed, 5 insertions(+)

diff --git a/drivers/clocksource/Kconfig b/drivers/clocksource/Kconfig
index 5e51fcf..3cd136f 100644
--- a/drivers/clocksource/Kconfig
+++ b/drivers/clocksource/Kconfig
@@ -262,6 +262,7 @@ config CLKSRC_TI_32K
 	bool "Texas Instruments 32.768 Hz Clocksource" if COMPILE_TEST
 	depends on GENERIC_SCHED_CLOCK
 	select TIMER_OF if OF
+	select PERSISTENT_CLOCK
 	help
 	  This option enables support for Texas Instruments 32.768 Hz clocksource
 	  available on many OMAP-like platforms.
diff --git a/drivers/clocksource/timer-ti-32k.c b/drivers/clocksource/timer-ti-32k.c
index 880a861..353ff9d 100644
--- a/drivers/clocksource/timer-ti-32k.c
+++ b/drivers/clocksource/timer-ti-32k.c
@@ -41,6 +41,7 @@
 #include <linux/clocksource.h>
 #include <linux/of.h>
 #include <linux/of_address.h>
+#include <linux/persistent_clock.h>
 
 /*
  * 32KHz clocksource ... always available, on pretty most chips except
@@ -120,6 +121,9 @@ static int __init ti_32k_timer_init(struct device_node *np)
 	}
 
 	sched_clock_register(omap_32k_read_sched_clock, 32, 32768);
+	persistent_clock_init_and_register(omap_32k_read_sched_clock,
+					   CLOCKSOURCE_MASK(32), 32768, 0);
+
 	pr_info("OMAP clocksource: 32k_counter at 32768 Hz\n");
 
 	return 0;
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 5/8] clocksource: timer-ti-32k: Register the persistent clock
@ 2018-06-13 11:32   ` Baolin Wang
  0 siblings, 0 replies; 38+ messages in thread
From: Baolin Wang @ 2018-06-13 11:32 UTC (permalink / raw)
  To: linux-arm-kernel

Since the 32K counter is always available, then we can register the
persistent clock to compensate the suspend time for the OS time.

Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
---
 drivers/clocksource/Kconfig        |    1 +
 drivers/clocksource/timer-ti-32k.c |    4 ++++
 2 files changed, 5 insertions(+)

diff --git a/drivers/clocksource/Kconfig b/drivers/clocksource/Kconfig
index 5e51fcf..3cd136f 100644
--- a/drivers/clocksource/Kconfig
+++ b/drivers/clocksource/Kconfig
@@ -262,6 +262,7 @@ config CLKSRC_TI_32K
 	bool "Texas Instruments 32.768 Hz Clocksource" if COMPILE_TEST
 	depends on GENERIC_SCHED_CLOCK
 	select TIMER_OF if OF
+	select PERSISTENT_CLOCK
 	help
 	  This option enables support for Texas Instruments 32.768 Hz clocksource
 	  available on many OMAP-like platforms.
diff --git a/drivers/clocksource/timer-ti-32k.c b/drivers/clocksource/timer-ti-32k.c
index 880a861..353ff9d 100644
--- a/drivers/clocksource/timer-ti-32k.c
+++ b/drivers/clocksource/timer-ti-32k.c
@@ -41,6 +41,7 @@
 #include <linux/clocksource.h>
 #include <linux/of.h>
 #include <linux/of_address.h>
+#include <linux/persistent_clock.h>
 
 /*
  * 32KHz clocksource ... always available, on pretty most chips except
@@ -120,6 +121,9 @@ static int __init ti_32k_timer_init(struct device_node *np)
 	}
 
 	sched_clock_register(omap_32k_read_sched_clock, 32, 32768);
+	persistent_clock_init_and_register(omap_32k_read_sched_clock,
+					   CLOCKSOURCE_MASK(32), 32768, 0);
+
 	pr_info("OMAP clocksource: 32k_counter at 32768 Hz\n");
 
 	return 0;
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 6/8] clocksource: time-pistachio: Register the persistent clock
  2018-06-13 11:32 ` Baolin Wang
@ 2018-06-13 11:32   ` Baolin Wang
  -1 siblings, 0 replies; 38+ messages in thread
From: Baolin Wang @ 2018-06-13 11:32 UTC (permalink / raw)
  To: tglx, john.stultz, daniel.lezcano, arnd, tony, aaro.koskinen,
	linux, mark.rutland, marc.zyngier
  Cc: baolin.wang, broonie, paulmck, mlichvar, rdunlap, kstewart,
	gregkh, pombredanne, thierry.reding, jonathanh, heiko,
	linus.walleij, viresh.kumar, mingo, hpa, peterz, douly.fnst,
	len.brown, rajvi.jingar, alexandre.belloni, x86,
	linux-arm-kernel, linux-tegra, linux-kernel, linux-omap

Since the timer on pistachio platform is always available, we can
register it as one persistent clock to compensate the suspend time
for the OS time.

Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
---
 drivers/clocksource/Kconfig          |    1 +
 drivers/clocksource/time-pistachio.c |    3 +++
 2 files changed, 4 insertions(+)

diff --git a/drivers/clocksource/Kconfig b/drivers/clocksource/Kconfig
index 3cd136f..af552ba 100644
--- a/drivers/clocksource/Kconfig
+++ b/drivers/clocksource/Kconfig
@@ -255,6 +255,7 @@ config CLKSRC_PISTACHIO
 	bool "Clocksource for Pistachio SoC" if COMPILE_TEST
 	depends on HAS_IOMEM
 	select TIMER_OF
+	select PERSISTENT_CLOCK
 	help
 	  Enables the clocksource for the Pistachio SoC.
 
diff --git a/drivers/clocksource/time-pistachio.c b/drivers/clocksource/time-pistachio.c
index a2dd85d..5c3eb71 100644
--- a/drivers/clocksource/time-pistachio.c
+++ b/drivers/clocksource/time-pistachio.c
@@ -20,6 +20,7 @@
 #include <linux/mfd/syscon.h>
 #include <linux/of.h>
 #include <linux/of_address.h>
+#include <linux/persistent_clock.h>
 #include <linux/platform_device.h>
 #include <linux/regmap.h>
 #include <linux/sched_clock.h>
@@ -212,6 +213,8 @@ static int __init pistachio_clksrc_of_init(struct device_node *node)
 
 	raw_spin_lock_init(&pcs_gpt.lock);
 	sched_clock_register(pistachio_read_sched_clock, 32, rate);
+	persistent_clock_init_and_register(pistachio_read_sched_clock,
+					   CLOCKSOURCE_MASK(32), rate, 0);
 	return clocksource_register_hz(&pcs_gpt.cs, rate);
 }
 TIMER_OF_DECLARE(pistachio_gptimer, "img,pistachio-gptimer",
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 6/8] clocksource: time-pistachio: Register the persistent clock
@ 2018-06-13 11:32   ` Baolin Wang
  0 siblings, 0 replies; 38+ messages in thread
From: Baolin Wang @ 2018-06-13 11:32 UTC (permalink / raw)
  To: linux-arm-kernel

Since the timer on pistachio platform is always available, we can
register it as one persistent clock to compensate the suspend time
for the OS time.

Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
---
 drivers/clocksource/Kconfig          |    1 +
 drivers/clocksource/time-pistachio.c |    3 +++
 2 files changed, 4 insertions(+)

diff --git a/drivers/clocksource/Kconfig b/drivers/clocksource/Kconfig
index 3cd136f..af552ba 100644
--- a/drivers/clocksource/Kconfig
+++ b/drivers/clocksource/Kconfig
@@ -255,6 +255,7 @@ config CLKSRC_PISTACHIO
 	bool "Clocksource for Pistachio SoC" if COMPILE_TEST
 	depends on HAS_IOMEM
 	select TIMER_OF
+	select PERSISTENT_CLOCK
 	help
 	  Enables the clocksource for the Pistachio SoC.
 
diff --git a/drivers/clocksource/time-pistachio.c b/drivers/clocksource/time-pistachio.c
index a2dd85d..5c3eb71 100644
--- a/drivers/clocksource/time-pistachio.c
+++ b/drivers/clocksource/time-pistachio.c
@@ -20,6 +20,7 @@
 #include <linux/mfd/syscon.h>
 #include <linux/of.h>
 #include <linux/of_address.h>
+#include <linux/persistent_clock.h>
 #include <linux/platform_device.h>
 #include <linux/regmap.h>
 #include <linux/sched_clock.h>
@@ -212,6 +213,8 @@ static int __init pistachio_clksrc_of_init(struct device_node *node)
 
 	raw_spin_lock_init(&pcs_gpt.lock);
 	sched_clock_register(pistachio_read_sched_clock, 32, rate);
+	persistent_clock_init_and_register(pistachio_read_sched_clock,
+					   CLOCKSOURCE_MASK(32), rate, 0);
 	return clocksource_register_hz(&pcs_gpt.cs, rate);
 }
 TIMER_OF_DECLARE(pistachio_gptimer, "img,pistachio-gptimer",
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 7/8] x86: tsc: Register the persistent clock
  2018-06-13 11:32 ` Baolin Wang
@ 2018-06-13 11:32   ` Baolin Wang
  -1 siblings, 0 replies; 38+ messages in thread
From: Baolin Wang @ 2018-06-13 11:32 UTC (permalink / raw)
  To: tglx, john.stultz, daniel.lezcano, arnd, tony, aaro.koskinen,
	linux, mark.rutland, marc.zyngier
  Cc: baolin.wang, broonie, paulmck, mlichvar, rdunlap, kstewart,
	gregkh, pombredanne, thierry.reding, jonathanh, heiko,
	linus.walleij, viresh.kumar, mingo, hpa, peterz, douly.fnst,
	len.brown, rajvi.jingar, alexandre.belloni, x86,
	linux-arm-kernel, linux-tegra, linux-kernel, linux-omap

Register the tsc as one persistent clock to compensate the suspend time
if the tsc clocksource is always available.

Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
---
 arch/x86/Kconfig      |    1 +
 arch/x86/kernel/tsc.c |   21 +++++++++++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 297789a..549dd01 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -200,6 +200,7 @@ config X86
 	select USER_STACKTRACE_SUPPORT
 	select VIRT_TO_BUS
 	select X86_FEATURE_NAMES		if PROC_FS
+	select PERSISTENT_CLOCK
 
 config INSTRUCTION_DECODER
 	def_bool y
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 74392d9..cb4f495 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -11,6 +11,7 @@
 #include <linux/delay.h>
 #include <linux/clocksource.h>
 #include <linux/percpu.h>
+#include <linux/persistent_clock.h>
 #include <linux/timex.h>
 #include <linux/static_key.h>
 
@@ -1032,6 +1033,11 @@ static u64 read_tsc(struct clocksource *cs)
 	return (u64)rdtsc_ordered();
 }
 
+static u64 notrace tsc_read_persistent_clock(void)
+{
+	return (u64)rdtsc_ordered();
+}
+
 static void tsc_cs_mark_unstable(struct clocksource *cs)
 {
 	if (tsc_unstable)
@@ -1300,6 +1306,14 @@ static void tsc_refine_calibration_work(struct work_struct *work)
 	if (boot_cpu_has(X86_FEATURE_ART))
 		art_related_clocksource = &clocksource_tsc;
 	clocksource_register_khz(&clocksource_tsc, tsc_khz);
+
+	if (clocksource_tsc.flags & CLOCK_SOURCE_SUSPEND_NONSTOP) {
+		persistent_clock_init_and_register(tsc_read_persistent_clock,
+						   CLOCKSOURCE_MASK(64),
+						   tsc_khz * 1000, 0);
+		persistent_clock_start_alarmtimer();
+	}
+
 unreg:
 	clocksource_unregister(&clocksource_tsc_early);
 }
@@ -1327,6 +1341,13 @@ static int __init init_tsc_clocksource(void)
 		if (boot_cpu_has(X86_FEATURE_ART))
 			art_related_clocksource = &clocksource_tsc;
 		clocksource_register_khz(&clocksource_tsc, tsc_khz);
+
+		if (clocksource_tsc.flags & CLOCK_SOURCE_SUSPEND_NONSTOP) {
+			persistent_clock_init_and_register(tsc_read_persistent_clock,
+							   CLOCKSOURCE_MASK(64),
+							   tsc_khz * 1000, 0);
+			persistent_clock_start_alarmtimer();
+		}
 unreg:
 		clocksource_unregister(&clocksource_tsc_early);
 		return 0;
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 7/8] x86: tsc: Register the persistent clock
@ 2018-06-13 11:32   ` Baolin Wang
  0 siblings, 0 replies; 38+ messages in thread
From: Baolin Wang @ 2018-06-13 11:32 UTC (permalink / raw)
  To: linux-arm-kernel

Register the tsc as one persistent clock to compensate the suspend time
if the tsc clocksource is always available.

Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
---
 arch/x86/Kconfig      |    1 +
 arch/x86/kernel/tsc.c |   21 +++++++++++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 297789a..549dd01 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -200,6 +200,7 @@ config X86
 	select USER_STACKTRACE_SUPPORT
 	select VIRT_TO_BUS
 	select X86_FEATURE_NAMES		if PROC_FS
+	select PERSISTENT_CLOCK
 
 config INSTRUCTION_DECODER
 	def_bool y
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 74392d9..cb4f495 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -11,6 +11,7 @@
 #include <linux/delay.h>
 #include <linux/clocksource.h>
 #include <linux/percpu.h>
+#include <linux/persistent_clock.h>
 #include <linux/timex.h>
 #include <linux/static_key.h>
 
@@ -1032,6 +1033,11 @@ static u64 read_tsc(struct clocksource *cs)
 	return (u64)rdtsc_ordered();
 }
 
+static u64 notrace tsc_read_persistent_clock(void)
+{
+	return (u64)rdtsc_ordered();
+}
+
 static void tsc_cs_mark_unstable(struct clocksource *cs)
 {
 	if (tsc_unstable)
@@ -1300,6 +1306,14 @@ static void tsc_refine_calibration_work(struct work_struct *work)
 	if (boot_cpu_has(X86_FEATURE_ART))
 		art_related_clocksource = &clocksource_tsc;
 	clocksource_register_khz(&clocksource_tsc, tsc_khz);
+
+	if (clocksource_tsc.flags & CLOCK_SOURCE_SUSPEND_NONSTOP) {
+		persistent_clock_init_and_register(tsc_read_persistent_clock,
+						   CLOCKSOURCE_MASK(64),
+						   tsc_khz * 1000, 0);
+		persistent_clock_start_alarmtimer();
+	}
+
 unreg:
 	clocksource_unregister(&clocksource_tsc_early);
 }
@@ -1327,6 +1341,13 @@ static int __init init_tsc_clocksource(void)
 		if (boot_cpu_has(X86_FEATURE_ART))
 			art_related_clocksource = &clocksource_tsc;
 		clocksource_register_khz(&clocksource_tsc, tsc_khz);
+
+		if (clocksource_tsc.flags & CLOCK_SOURCE_SUSPEND_NONSTOP) {
+			persistent_clock_init_and_register(tsc_read_persistent_clock,
+							   CLOCKSOURCE_MASK(64),
+							   tsc_khz * 1000, 0);
+			persistent_clock_start_alarmtimer();
+		}
 unreg:
 		clocksource_unregister(&clocksource_tsc_early);
 		return 0;
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 8/8] time: timekeeping: Remove time compensating from nonstop clocksources
  2018-06-13 11:32 ` Baolin Wang
@ 2018-06-13 11:32   ` Baolin Wang
  -1 siblings, 0 replies; 38+ messages in thread
From: Baolin Wang @ 2018-06-13 11:32 UTC (permalink / raw)
  To: tglx, john.stultz, daniel.lezcano, arnd, tony, aaro.koskinen,
	linux, mark.rutland, marc.zyngier
  Cc: baolin.wang, broonie, paulmck, mlichvar, rdunlap, kstewart,
	gregkh, pombredanne, thierry.reding, jonathanh, heiko,
	linus.walleij, viresh.kumar, mingo, hpa, peterz, douly.fnst,
	len.brown, rajvi.jingar, alexandre.belloni, x86,
	linux-arm-kernel, linux-tegra, linux-kernel, linux-omap

Since we have converted all nonstop clocksources to use persistent clock,
thus we can simplify the time compensating by removing the nonstop
clocksources. Now we can compensate the suspend time for the OS time from
the persistent clock or rtc device.

Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
---
 kernel/time/timekeeping.c |   19 ++++---------------
 1 file changed, 4 insertions(+), 15 deletions(-)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 4786df9..3026d98 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1666,7 +1666,6 @@ void timekeeping_inject_sleeptime64(struct timespec64 *delta)
 void timekeeping_resume(void)
 {
 	struct timekeeper *tk = &tk_core.timekeeper;
-	struct clocksource *clock = tk->tkr_mono.clock;
 	unsigned long flags;
 	struct timespec64 ts_new, ts_delta;
 	u64 cycle_now;
@@ -1682,27 +1681,17 @@ void timekeeping_resume(void)
 
 	/*
 	 * After system resumes, we need to calculate the suspended time and
-	 * compensate it for the OS time. There are 3 sources that could be
-	 * used: Nonstop clocksource during suspend, persistent clock and rtc
-	 * device.
+	 * compensate it for the OS time. There are 2 sources that could be
+	 * used: persistent clock and rtc device.
 	 *
 	 * One specific platform may have 1 or 2 or all of them, and the
 	 * preference will be:
-	 *	suspend-nonstop clocksource -> persistent clock -> rtc
+	 *	persistent clock -> rtc
 	 * The less preferred source will only be tried if there is no better
 	 * usable source. The rtc part is handled separately in rtc core code.
 	 */
 	cycle_now = tk_clock_read(&tk->tkr_mono);
-	if ((clock->flags & CLOCK_SOURCE_SUSPEND_NONSTOP) &&
-		cycle_now > tk->tkr_mono.cycle_last) {
-		u64 nsec, cyc_delta;
-
-		cyc_delta = clocksource_delta(cycle_now, tk->tkr_mono.cycle_last,
-					      tk->tkr_mono.mask);
-		nsec = mul_u64_u32_shr(cyc_delta, clock->mult, clock->shift);
-		ts_delta = ns_to_timespec64(nsec);
-		sleeptime_injected = true;
-	} else if (timespec64_compare(&ts_new, &timekeeping_suspend_time) > 0) {
+	if (timespec64_compare(&ts_new, &timekeeping_suspend_time) > 0) {
 		ts_delta = timespec64_sub(ts_new, timekeeping_suspend_time);
 		sleeptime_injected = true;
 	}
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 8/8] time: timekeeping: Remove time compensating from nonstop clocksources
@ 2018-06-13 11:32   ` Baolin Wang
  0 siblings, 0 replies; 38+ messages in thread
From: Baolin Wang @ 2018-06-13 11:32 UTC (permalink / raw)
  To: linux-arm-kernel

Since we have converted all nonstop clocksources to use persistent clock,
thus we can simplify the time compensating by removing the nonstop
clocksources. Now we can compensate the suspend time for the OS time from
the persistent clock or rtc device.

Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
---
 kernel/time/timekeeping.c |   19 ++++---------------
 1 file changed, 4 insertions(+), 15 deletions(-)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 4786df9..3026d98 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1666,7 +1666,6 @@ void timekeeping_inject_sleeptime64(struct timespec64 *delta)
 void timekeeping_resume(void)
 {
 	struct timekeeper *tk = &tk_core.timekeeper;
-	struct clocksource *clock = tk->tkr_mono.clock;
 	unsigned long flags;
 	struct timespec64 ts_new, ts_delta;
 	u64 cycle_now;
@@ -1682,27 +1681,17 @@ void timekeeping_resume(void)
 
 	/*
 	 * After system resumes, we need to calculate the suspended time and
-	 * compensate it for the OS time. There are 3 sources that could be
-	 * used: Nonstop clocksource during suspend, persistent clock and rtc
-	 * device.
+	 * compensate it for the OS time. There are 2 sources that could be
+	 * used: persistent clock and rtc device.
 	 *
 	 * One specific platform may have 1 or 2 or all of them, and the
 	 * preference will be:
-	 *	suspend-nonstop clocksource -> persistent clock -> rtc
+	 *	persistent clock -> rtc
 	 * The less preferred source will only be tried if there is no better
 	 * usable source. The rtc part is handled separately in rtc core code.
 	 */
 	cycle_now = tk_clock_read(&tk->tkr_mono);
-	if ((clock->flags & CLOCK_SOURCE_SUSPEND_NONSTOP) &&
-		cycle_now > tk->tkr_mono.cycle_last) {
-		u64 nsec, cyc_delta;
-
-		cyc_delta = clocksource_delta(cycle_now, tk->tkr_mono.cycle_last,
-					      tk->tkr_mono.mask);
-		nsec = mul_u64_u32_shr(cyc_delta, clock->mult, clock->shift);
-		ts_delta = ns_to_timespec64(nsec);
-		sleeptime_injected = true;
-	} else if (timespec64_compare(&ts_new, &timekeeping_suspend_time) > 0) {
+	if (timespec64_compare(&ts_new, &timekeeping_suspend_time) > 0) {
 		ts_delta = timespec64_sub(ts_new, timekeeping_suspend_time);
 		sleeptime_injected = true;
 	}
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH 1/8] time: Add persistent clock support
  2018-06-13 11:32   ` Baolin Wang
  (?)
@ 2018-06-24  0:14     ` Thomas Gleixner
  -1 siblings, 0 replies; 38+ messages in thread
From: Thomas Gleixner @ 2018-06-24  0:14 UTC (permalink / raw)
  To: Baolin Wang
  Cc: mark.rutland, kstewart, alexandre.belloni, x86, heiko, tony,
	viresh.kumar, linus.walleij, linux-kernel, thierry.reding, hpa,
	mingo, aaro.koskinen, daniel.lezcano, linux, jonathanh, peterz,
	paulmck, len.brown, arnd, marc.zyngier, mlichvar, broonie,
	john.stultz, linux-tegra, linux-omap, linux-arm-kernel,
	douly.fnst, gregkh, rdunlap, rajvi.jingar, pombredanne

On Wed, 13 Jun 2018, Baolin Wang wrote:
> Moreover we can register the clocksource with CLOCK_SOURCE_SUSPEND_NONSTOP
> to be one persistent clock, then we can simplify the suspend/resume
> accounting by removing CLOCK_SOURCE_SUSPEND_NONSTOP timing. After that
> we can only compensate the OS time by persistent clock or RTC.

That makes sense because it adds a gazillion lines of code and removes 5?
Not really,

> +/**
> + * persistent_clock_read_data - data required to read persistent clock
> + * @read: Returns a cycle value from persistent clock.
> + * @last_cycles: Clock cycle value at last update.
> + * @last_ns: Time value (nanoseconds) at last update.
> + * @mask: Bitmask for two's complement subtraction of non 64bit clocks.
> + * @mult: Cycle to nanosecond multiplier.
> + * @shift: Cycle to nanosecond divisor.
> + */
> +struct persistent_clock_read_data {
> +	u64 (*read)(void);
> +	u64 last_cycles;
> +	u64 last_ns;
> +	u64 mask;
> +	u32 mult;
> +	u32 shift;
> +};
> +/**
> + * persistent_clock - represent the persistent clock
> + * @read_data: Data required to read from persistent clock.
> + * @seq: Sequence counter for protecting updates.
> + * @freq: The frequency of the persistent clock.
> + * @wrap: Duration for persistent clock can run before wrapping.
> + * @alarm: Update timeout for persistent clock wrap.
> + * @alarm_inited: Indicate if the alarm has been initialized.
> + */
> +struct persistent_clock {
> +	struct persistent_clock_read_data read_data;
> +	seqcount_t seq;
> +	u32 freq;
> +	ktime_t wrap;
> +	struct alarm alarm;
> +	bool alarm_inited;
> +};

NAK!

There is no reason to invent yet another set of data structures and yet
more read functions with a sequence counter. which are just a bad and
broken copy of the existing timekeeping/clocksource code. And of course the
stuff is not serialized against multiple registrations, etc. etc.

Plus the utter nonsense that any call site has to do the same thing over
and over:

    register():
    start_alarm_timer();

Why is this required in the first place? It's not at all. The only place
where such an alarm timer will be required is when the system actually goes
to suspend. Starting it at registration time is pointless and even counter
productive. Assume the clocksource wraps every 2 hours. So you start it at
boot time and after 119 minutes uptime the system suspends. So it will
wakeup one minute later to update the clocksource. Heck no. If the timer is
started when the machine actually suspends it will wake up earliest in 120
minutes.

And you even add that to the TSC which does not need it at all. It will
wrap in about 400 years on a 2GHZ machine. So you degrade the functionality
instead of improving it.

So no, this is not going anywhere.

Let's look at the problem itself:

   You want to use one clocksource for timekeeping during runtime which is
   fast and accurate and another one for suspend time injection which is
   slower and/or less accurate because the fast one stops in suspend.

   Plus you need an alarmtimer which makes sure that the clocksource does
   not wrap around during suspend.

Now lets look what we have already:

   Both clocksources already exist and are registered as clocksources with
   all required data in the clocksource core.

Ergo the only sane and logical conclusion is to expand the existing
infrastructure to handle that.

When a clocksource is registered, then the registration function already
makes decisions about using it as timekeeping clocksource. So add a few
lines of code to check whether the newly registered clocksource is suitable
and preferred for suspend.

	if (!stops_in_suspend(newcs)) {
		if (!suspend_cs || is_preferred_suspend_cs(newcs))
			suspend_cs = newcs;
	}		
		
The is_preferred_suspend_cs() can be based on rating, the maximum suspend
length which can be achieved or whatever is sensible. It should start of as
a very simple decision function based on rating and not an prematurely
overengineered monstrosity.

The suspend/resume() code needs a few very simple changes:
   
xxx_suspend():
	clocksource_prepare_suspend();

  Note, this is _NOT_ timekeeping_suspend() because that is invoked _AFTER_
  alarmtimer_suspend(). So if an alarm timer is required it needs to be
  armed before that. A trivial solution might be to just call it from
  alarmtimer_suspend(), but that a minor detail to worry about.

timekeeping_suspend()
{
	clocksource_enter_suspend();
	...

timekeeping_resume()
{
...
	if (clocksource_leave_suspend(&nsec)) {
		ts_delta = ns_to_timespec64(nsec);
		sleeptime_injected = true;
	} else if (......


Now lets look at the new functions:

void clocksource_prepare_suspend(void)
{
	if (!suspend_cs)
		return;

	if (needs_alarmtimer(suspend_cs))
		start_suspend_alarm(suspend_cs);
}

void clocksource_enter_suspend(void)
{
	if (!suspend_cs)
		return;
	suspend_start = suspend_cs->read();
}

bool clocksource_leave_suspend(u64 *nsec)
{
	u64 now, delta;

	if (!suspend_cs)
		return false;

	now = suspend_cs->read();
	delta = clocksource_delta(now, suspend_start, suspend_cs->mask);
	*nsec = mul_u64_u32_shr(delta, suspend_cs->mult, suspend_cs->shift);
	return true;
}

See? It does not need any of this totally nonsensical stuff in your
registration function and not any new read functions and whatever, because
it simply can use the bog standard mult/shift values. Why? Because the
conversion above can cope with a full 64 bit * 32 bit multiply without
falling apart. It's already there in timekeeping_resume() otherwise
resuming with a NONSTOP TSC would result in bogus sleep times after a few
minutes. It's slower than the normal clocksource conversion which is
optimized for performance, but thats completely irrelevant on resume. This
whole blurb about requiring separate mult/shift values is just plain
drivel.

Plus any reasonably broad clocksource will not need an alarmtimer at
all. Because the only reason it is needed is when the clocksource itself
wraps around. And that has absolutely nothing to do with mult/shift
values. That just depends on the frequency and the bitwidth of the counter,

So it does not need an update function either because in case of broad
enough clocksources there is absolutely no need for update and in case of
wrapping ones the alarmtimer brings it out of suspend on time. And because
the only interesting thing is the delta between suspend and resume this is
all a complete non issue.

The clocksource core already has all the registration/unregistration
functionality plus an interface to reconfigure the frequency, so
clocksources can come and go and be reconfigured and all of this just
works.

Once the extra few lines of code are in place, then you can go and cleanup
the existing mess of homebrewn interfaces and claim that this is
consolidation and simplification.

<rant>

What's wrong with you people? Didn't they teach you in school that the
first thing to do is proper problem and code analysis? If they did not, go
back to them and ask your money back,

I'm really tired of these overengineered trainwrecks which are then
advertised with bullshit marketing like the best invention since sliced
bread. This might work in random corporates, but not here.

</rant>

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 1/8] time: Add persistent clock support
@ 2018-06-24  0:14     ` Thomas Gleixner
  0 siblings, 0 replies; 38+ messages in thread
From: Thomas Gleixner @ 2018-06-24  0:14 UTC (permalink / raw)
  To: Baolin Wang
  Cc: john.stultz, daniel.lezcano, arnd, tony, aaro.koskinen, linux,
	mark.rutland, marc.zyngier, broonie, paulmck, mlichvar, rdunlap,
	kstewart, gregkh, pombredanne, thierry.reding, jonathanh, heiko,
	linus.walleij, viresh.kumar, mingo, hpa, peterz, douly.fnst,
	len.brown, rajvi.jingar, alexandre.belloni, x86,
	linux-arm-kernel, linux-tegra, linux-kernel, linux-omap

On Wed, 13 Jun 2018, Baolin Wang wrote:
> Moreover we can register the clocksource with CLOCK_SOURCE_SUSPEND_NONSTOP
> to be one persistent clock, then we can simplify the suspend/resume
> accounting by removing CLOCK_SOURCE_SUSPEND_NONSTOP timing. After that
> we can only compensate the OS time by persistent clock or RTC.

That makes sense because it adds a gazillion lines of code and removes 5?
Not really,

> +/**
> + * persistent_clock_read_data - data required to read persistent clock
> + * @read: Returns a cycle value from persistent clock.
> + * @last_cycles: Clock cycle value at last update.
> + * @last_ns: Time value (nanoseconds) at last update.
> + * @mask: Bitmask for two's complement subtraction of non 64bit clocks.
> + * @mult: Cycle to nanosecond multiplier.
> + * @shift: Cycle to nanosecond divisor.
> + */
> +struct persistent_clock_read_data {
> +	u64 (*read)(void);
> +	u64 last_cycles;
> +	u64 last_ns;
> +	u64 mask;
> +	u32 mult;
> +	u32 shift;
> +};
> +/**
> + * persistent_clock - represent the persistent clock
> + * @read_data: Data required to read from persistent clock.
> + * @seq: Sequence counter for protecting updates.
> + * @freq: The frequency of the persistent clock.
> + * @wrap: Duration for persistent clock can run before wrapping.
> + * @alarm: Update timeout for persistent clock wrap.
> + * @alarm_inited: Indicate if the alarm has been initialized.
> + */
> +struct persistent_clock {
> +	struct persistent_clock_read_data read_data;
> +	seqcount_t seq;
> +	u32 freq;
> +	ktime_t wrap;
> +	struct alarm alarm;
> +	bool alarm_inited;
> +};

NAK!

There is no reason to invent yet another set of data structures and yet
more read functions with a sequence counter. which are just a bad and
broken copy of the existing timekeeping/clocksource code. And of course the
stuff is not serialized against multiple registrations, etc. etc.

Plus the utter nonsense that any call site has to do the same thing over
and over:

    register():
    start_alarm_timer();

Why is this required in the first place? It's not at all. The only place
where such an alarm timer will be required is when the system actually goes
to suspend. Starting it at registration time is pointless and even counter
productive. Assume the clocksource wraps every 2 hours. So you start it at
boot time and after 119 minutes uptime the system suspends. So it will
wakeup one minute later to update the clocksource. Heck no. If the timer is
started when the machine actually suspends it will wake up earliest in 120
minutes.

And you even add that to the TSC which does not need it at all. It will
wrap in about 400 years on a 2GHZ machine. So you degrade the functionality
instead of improving it.

So no, this is not going anywhere.

Let's look at the problem itself:

   You want to use one clocksource for timekeeping during runtime which is
   fast and accurate and another one for suspend time injection which is
   slower and/or less accurate because the fast one stops in suspend.

   Plus you need an alarmtimer which makes sure that the clocksource does
   not wrap around during suspend.

Now lets look what we have already:

   Both clocksources already exist and are registered as clocksources with
   all required data in the clocksource core.

Ergo the only sane and logical conclusion is to expand the existing
infrastructure to handle that.

When a clocksource is registered, then the registration function already
makes decisions about using it as timekeeping clocksource. So add a few
lines of code to check whether the newly registered clocksource is suitable
and preferred for suspend.

	if (!stops_in_suspend(newcs)) {
		if (!suspend_cs || is_preferred_suspend_cs(newcs))
			suspend_cs = newcs;
	}		
		
The is_preferred_suspend_cs() can be based on rating, the maximum suspend
length which can be achieved or whatever is sensible. It should start of as
a very simple decision function based on rating and not an prematurely
overengineered monstrosity.

The suspend/resume() code needs a few very simple changes:
   
xxx_suspend():
	clocksource_prepare_suspend();

  Note, this is _NOT_ timekeeping_suspend() because that is invoked _AFTER_
  alarmtimer_suspend(). So if an alarm timer is required it needs to be
  armed before that. A trivial solution might be to just call it from
  alarmtimer_suspend(), but that a minor detail to worry about.

timekeeping_suspend()
{
	clocksource_enter_suspend();
	...

timekeeping_resume()
{
...
	if (clocksource_leave_suspend(&nsec)) {
		ts_delta = ns_to_timespec64(nsec);
		sleeptime_injected = true;
	} else if (......


Now lets look at the new functions:

void clocksource_prepare_suspend(void)
{
	if (!suspend_cs)
		return;

	if (needs_alarmtimer(suspend_cs))
		start_suspend_alarm(suspend_cs);
}

void clocksource_enter_suspend(void)
{
	if (!suspend_cs)
		return;
	suspend_start = suspend_cs->read();
}

bool clocksource_leave_suspend(u64 *nsec)
{
	u64 now, delta;

	if (!suspend_cs)
		return false;

	now = suspend_cs->read();
	delta = clocksource_delta(now, suspend_start, suspend_cs->mask);
	*nsec = mul_u64_u32_shr(delta, suspend_cs->mult, suspend_cs->shift);
	return true;
}

See? It does not need any of this totally nonsensical stuff in your
registration function and not any new read functions and whatever, because
it simply can use the bog standard mult/shift values. Why? Because the
conversion above can cope with a full 64 bit * 32 bit multiply without
falling apart. It's already there in timekeeping_resume() otherwise
resuming with a NONSTOP TSC would result in bogus sleep times after a few
minutes. It's slower than the normal clocksource conversion which is
optimized for performance, but thats completely irrelevant on resume. This
whole blurb about requiring separate mult/shift values is just plain
drivel.

Plus any reasonably broad clocksource will not need an alarmtimer at
all. Because the only reason it is needed is when the clocksource itself
wraps around. And that has absolutely nothing to do with mult/shift
values. That just depends on the frequency and the bitwidth of the counter,

So it does not need an update function either because in case of broad
enough clocksources there is absolutely no need for update and in case of
wrapping ones the alarmtimer brings it out of suspend on time. And because
the only interesting thing is the delta between suspend and resume this is
all a complete non issue.

The clocksource core already has all the registration/unregistration
functionality plus an interface to reconfigure the frequency, so
clocksources can come and go and be reconfigured and all of this just
works.

Once the extra few lines of code are in place, then you can go and cleanup
the existing mess of homebrewn interfaces and claim that this is
consolidation and simplification.

<rant>

What's wrong with you people? Didn't they teach you in school that the
first thing to do is proper problem and code analysis? If they did not, go
back to them and ask your money back,

I'm really tired of these overengineered trainwrecks which are then
advertised with bullshit marketing like the best invention since sliced
bread. This might work in random corporates, but not here.

</rant>

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH 1/8] time: Add persistent clock support
@ 2018-06-24  0:14     ` Thomas Gleixner
  0 siblings, 0 replies; 38+ messages in thread
From: Thomas Gleixner @ 2018-06-24  0:14 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 13 Jun 2018, Baolin Wang wrote:
> Moreover we can register the clocksource with CLOCK_SOURCE_SUSPEND_NONSTOP
> to be one persistent clock, then we can simplify the suspend/resume
> accounting by removing CLOCK_SOURCE_SUSPEND_NONSTOP timing. After that
> we can only compensate the OS time by persistent clock or RTC.

That makes sense because it adds a gazillion lines of code and removes 5?
Not really,

> +/**
> + * persistent_clock_read_data - data required to read persistent clock
> + * @read: Returns a cycle value from persistent clock.
> + * @last_cycles: Clock cycle value at last update.
> + * @last_ns: Time value (nanoseconds) at last update.
> + * @mask: Bitmask for two's complement subtraction of non 64bit clocks.
> + * @mult: Cycle to nanosecond multiplier.
> + * @shift: Cycle to nanosecond divisor.
> + */
> +struct persistent_clock_read_data {
> +	u64 (*read)(void);
> +	u64 last_cycles;
> +	u64 last_ns;
> +	u64 mask;
> +	u32 mult;
> +	u32 shift;
> +};
> +/**
> + * persistent_clock - represent the persistent clock
> + * @read_data: Data required to read from persistent clock.
> + * @seq: Sequence counter for protecting updates.
> + * @freq: The frequency of the persistent clock.
> + * @wrap: Duration for persistent clock can run before wrapping.
> + * @alarm: Update timeout for persistent clock wrap.
> + * @alarm_inited: Indicate if the alarm has been initialized.
> + */
> +struct persistent_clock {
> +	struct persistent_clock_read_data read_data;
> +	seqcount_t seq;
> +	u32 freq;
> +	ktime_t wrap;
> +	struct alarm alarm;
> +	bool alarm_inited;
> +};

NAK!

There is no reason to invent yet another set of data structures and yet
more read functions with a sequence counter. which are just a bad and
broken copy of the existing timekeeping/clocksource code. And of course the
stuff is not serialized against multiple registrations, etc. etc.

Plus the utter nonsense that any call site has to do the same thing over
and over:

    register():
    start_alarm_timer();

Why is this required in the first place? It's not at all. The only place
where such an alarm timer will be required is when the system actually goes
to suspend. Starting it at registration time is pointless and even counter
productive. Assume the clocksource wraps every 2 hours. So you start it at
boot time and after 119 minutes uptime the system suspends. So it will
wakeup one minute later to update the clocksource. Heck no. If the timer is
started when the machine actually suspends it will wake up earliest in 120
minutes.

And you even add that to the TSC which does not need it at all. It will
wrap in about 400 years on a 2GHZ machine. So you degrade the functionality
instead of improving it.

So no, this is not going anywhere.

Let's look at the problem itself:

   You want to use one clocksource for timekeeping during runtime which is
   fast and accurate and another one for suspend time injection which is
   slower and/or less accurate because the fast one stops in suspend.

   Plus you need an alarmtimer which makes sure that the clocksource does
   not wrap around during suspend.

Now lets look what we have already:

   Both clocksources already exist and are registered as clocksources with
   all required data in the clocksource core.

Ergo the only sane and logical conclusion is to expand the existing
infrastructure to handle that.

When a clocksource is registered, then the registration function already
makes decisions about using it as timekeeping clocksource. So add a few
lines of code to check whether the newly registered clocksource is suitable
and preferred for suspend.

	if (!stops_in_suspend(newcs)) {
		if (!suspend_cs || is_preferred_suspend_cs(newcs))
			suspend_cs = newcs;
	}		
		
The is_preferred_suspend_cs() can be based on rating, the maximum suspend
length which can be achieved or whatever is sensible. It should start of as
a very simple decision function based on rating and not an prematurely
overengineered monstrosity.

The suspend/resume() code needs a few very simple changes:
   
xxx_suspend():
	clocksource_prepare_suspend();

  Note, this is _NOT_ timekeeping_suspend() because that is invoked _AFTER_
  alarmtimer_suspend(). So if an alarm timer is required it needs to be
  armed before that. A trivial solution might be to just call it from
  alarmtimer_suspend(), but that a minor detail to worry about.

timekeeping_suspend()
{
	clocksource_enter_suspend();
	...

timekeeping_resume()
{
...
	if (clocksource_leave_suspend(&nsec)) {
		ts_delta = ns_to_timespec64(nsec);
		sleeptime_injected = true;
	} else if (......


Now lets look at the new functions:

void clocksource_prepare_suspend(void)
{
	if (!suspend_cs)
		return;

	if (needs_alarmtimer(suspend_cs))
		start_suspend_alarm(suspend_cs);
}

void clocksource_enter_suspend(void)
{
	if (!suspend_cs)
		return;
	suspend_start = suspend_cs->read();
}

bool clocksource_leave_suspend(u64 *nsec)
{
	u64 now, delta;

	if (!suspend_cs)
		return false;

	now = suspend_cs->read();
	delta = clocksource_delta(now, suspend_start, suspend_cs->mask);
	*nsec = mul_u64_u32_shr(delta, suspend_cs->mult, suspend_cs->shift);
	return true;
}

See? It does not need any of this totally nonsensical stuff in your
registration function and not any new read functions and whatever, because
it simply can use the bog standard mult/shift values. Why? Because the
conversion above can cope with a full 64 bit * 32 bit multiply without
falling apart. It's already there in timekeeping_resume() otherwise
resuming with a NONSTOP TSC would result in bogus sleep times after a few
minutes. It's slower than the normal clocksource conversion which is
optimized for performance, but thats completely irrelevant on resume. This
whole blurb about requiring separate mult/shift values is just plain
drivel.

Plus any reasonably broad clocksource will not need an alarmtimer at
all. Because the only reason it is needed is when the clocksource itself
wraps around. And that has absolutely nothing to do with mult/shift
values. That just depends on the frequency and the bitwidth of the counter,

So it does not need an update function either because in case of broad
enough clocksources there is absolutely no need for update and in case of
wrapping ones the alarmtimer brings it out of suspend on time. And because
the only interesting thing is the delta between suspend and resume this is
all a complete non issue.

The clocksource core already has all the registration/unregistration
functionality plus an interface to reconfigure the frequency, so
clocksources can come and go and be reconfigured and all of this just
works.

Once the extra few lines of code are in place, then you can go and cleanup
the existing mess of homebrewn interfaces and claim that this is
consolidation and simplification.

<rant>

What's wrong with you people? Didn't they teach you in school that the
first thing to do is proper problem and code analysis? If they did not, go
back to them and ask your money back,

I'm really tired of these overengineered trainwrecks which are then
advertised with bullshit marketing like the best invention since sliced
bread. This might work in random corporates, but not here.

</rant>

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 3/8] arm: time: Remove the persistent clock support for ARM architecture
  2018-06-13 11:32   ` Baolin Wang
  (?)
@ 2018-06-24  0:21     ` Thomas Gleixner
  -1 siblings, 0 replies; 38+ messages in thread
From: Thomas Gleixner @ 2018-06-24  0:21 UTC (permalink / raw)
  To: Baolin Wang
  Cc: mark.rutland, kstewart, alexandre.belloni, x86, heiko, tony,
	viresh.kumar, linus.walleij, linux-kernel, thierry.reding, hpa,
	mingo, aaro.koskinen, daniel.lezcano, linux, jonathanh, peterz,
	paulmck, len.brown, arnd, marc.zyngier, mlichvar, broonie,
	john.stultz, linux-tegra, linux-omap, linux-arm-kernel,
	douly.fnst, gregkh, rdunlap, rajvi.jingar, pombredanne

On Wed, 13 Jun 2018, Baolin Wang wrote:

> We have introduced the persistent clock framework to support the OS time
> compensating from persistent clock, and we will convert all drivers to
> use common persistent clock framework instead of the persistent clock
> support used only for the ARM architecture. So we can remove these code
> with converting the Omap 32k counter and tegra20 timer.

Why did I look at that in the first place? But as I did, I just have to
say, it's just consistent trainwreck engineering. Remove working code first
and then add new one.

Hell NO! This is not how it works. We add new infrastructure - if required
with some extra temporary helpers - while keeping the existing
functionality intact. Then we convert the users of the old infrastructure
over and when the last user is gone, we remove that including all temporary
helpers.

That's not something fundamentally new. It's documented all over the place.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 3/8] arm: time: Remove the persistent clock support for ARM architecture
@ 2018-06-24  0:21     ` Thomas Gleixner
  0 siblings, 0 replies; 38+ messages in thread
From: Thomas Gleixner @ 2018-06-24  0:21 UTC (permalink / raw)
  To: Baolin Wang
  Cc: john.stultz, daniel.lezcano, arnd, tony, aaro.koskinen, linux,
	mark.rutland, marc.zyngier, broonie, paulmck, mlichvar, rdunlap,
	kstewart, gregkh, pombredanne, thierry.reding, jonathanh, heiko,
	linus.walleij, viresh.kumar, mingo, hpa, peterz, douly.fnst,
	len.brown, rajvi.jingar, alexandre.belloni, x86,
	linux-arm-kernel, linux-tegra, linux-kernel, linux-omap

On Wed, 13 Jun 2018, Baolin Wang wrote:

> We have introduced the persistent clock framework to support the OS time
> compensating from persistent clock, and we will convert all drivers to
> use common persistent clock framework instead of the persistent clock
> support used only for the ARM architecture. So we can remove these code
> with converting the Omap 32k counter and tegra20 timer.

Why did I look at that in the first place? But as I did, I just have to
say, it's just consistent trainwreck engineering. Remove working code first
and then add new one.

Hell NO! This is not how it works. We add new infrastructure - if required
with some extra temporary helpers - while keeping the existing
functionality intact. Then we convert the users of the old infrastructure
over and when the last user is gone, we remove that including all temporary
helpers.

That's not something fundamentally new. It's documented all over the place.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH 3/8] arm: time: Remove the persistent clock support for ARM architecture
@ 2018-06-24  0:21     ` Thomas Gleixner
  0 siblings, 0 replies; 38+ messages in thread
From: Thomas Gleixner @ 2018-06-24  0:21 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 13 Jun 2018, Baolin Wang wrote:

> We have introduced the persistent clock framework to support the OS time
> compensating from persistent clock, and we will convert all drivers to
> use common persistent clock framework instead of the persistent clock
> support used only for the ARM architecture. So we can remove these code
> with converting the Omap 32k counter and tegra20 timer.

Why did I look at that in the first place? But as I did, I just have to
say, it's just consistent trainwreck engineering. Remove working code first
and then add new one.

Hell NO! This is not how it works. We add new infrastructure - if required
with some extra temporary helpers - while keeping the existing
functionality intact. Then we convert the users of the old infrastructure
over and when the last user is gone, we remove that including all temporary
helpers.

That's not something fundamentally new. It's documented all over the place.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 1/8] time: Add persistent clock support
  2018-06-24  0:14     ` Thomas Gleixner
@ 2018-06-25 16:18       ` Thomas Gleixner
  -1 siblings, 0 replies; 38+ messages in thread
From: Thomas Gleixner @ 2018-06-25 16:18 UTC (permalink / raw)
  To: Baolin Wang
  Cc: john.stultz, daniel.lezcano, arnd, tony, aaro.koskinen, linux,
	mark.rutland, marc.zyngier, broonie, paulmck, mlichvar, rdunlap,
	kstewart, gregkh, pombredanne, thierry.reding, jonathanh, heiko,
	linus.walleij, viresh.kumar, mingo, hpa, peterz, douly.fnst,
	len.brown, rajvi.jingar, alexandre.belloni, x86,
	linux-arm-kernel, linux-tegra, linux-kernel, linux-omap

On Sun, 24 Jun 2018, Thomas Gleixner wrote:
> The clocksource core already has all the registration/unregistration
> functionality plus an interface to reconfigure the frequency, so
> clocksources can come and go and be reconfigured and all of this just
> works.

Just for completeness sake.

If this extra suspend clocksource is not needed, i.e. because the system is
not suspended, then it does not need to be enabled. clocksources already
have an enable/disable callback exactly for that purpose. So instead of
having the clocks running during normal system operation the thing can be
OFF and save power and only be switched to ON when it is actually needed,
i.e. when going into suspend.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH 1/8] time: Add persistent clock support
@ 2018-06-25 16:18       ` Thomas Gleixner
  0 siblings, 0 replies; 38+ messages in thread
From: Thomas Gleixner @ 2018-06-25 16:18 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, 24 Jun 2018, Thomas Gleixner wrote:
> The clocksource core already has all the registration/unregistration
> functionality plus an interface to reconfigure the frequency, so
> clocksources can come and go and be reconfigured and all of this just
> works.

Just for completeness sake.

If this extra suspend clocksource is not needed, i.e. because the system is
not suspended, then it does not need to be enabled. clocksources already
have an enable/disable callback exactly for that purpose. So instead of
having the clocks running during normal system operation the thing can be
OFF and save power and only be switched to ON when it is actually needed,
i.e. when going into suspend.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 1/8] time: Add persistent clock support
  2018-06-24  0:14     ` Thomas Gleixner
  (?)
@ 2018-06-25 17:23       ` John Stultz
  -1 siblings, 0 replies; 38+ messages in thread
From: John Stultz @ 2018-06-25 17:23 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Baolin Wang, Daniel Lezcano, Arnd Bergmann, Tony Lindgren,
	aaro.koskinen, Russell King - ARM Linux, Mark Rutland,
	Marc Zyngier, Mark Brown, Paul E. McKenney, Miroslav Lichvar,
	Randy Dunlap, Kate Stewart, Greg KH, pombredanne, Thierry Reding,
	Jon Hunter, Heiko Stübner, Linus Walleij, Viresh Kumar

On Sat, Jun 23, 2018 at 5:14 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Wed, 13 Jun 2018, Baolin Wang wrote:
>> Moreover we can register the clocksource with CLOCK_SOURCE_SUSPEND_NONSTOP
>> to be one persistent clock, then we can simplify the suspend/resume
>> accounting by removing CLOCK_SOURCE_SUSPEND_NONSTOP timing. After that
>> we can only compensate the OS time by persistent clock or RTC.
>
> That makes sense because it adds a gazillion lines of code and removes 5?
> Not really,
>
>> +/**
>> + * persistent_clock_read_data - data required to read persistent clock
>> + * @read: Returns a cycle value from persistent clock.
>> + * @last_cycles: Clock cycle value at last update.
>> + * @last_ns: Time value (nanoseconds) at last update.
>> + * @mask: Bitmask for two's complement subtraction of non 64bit clocks.
>> + * @mult: Cycle to nanosecond multiplier.
>> + * @shift: Cycle to nanosecond divisor.
>> + */
>> +struct persistent_clock_read_data {
>> +     u64 (*read)(void);
>> +     u64 last_cycles;
>> +     u64 last_ns;
>> +     u64 mask;
>> +     u32 mult;
>> +     u32 shift;
>> +};
>> +/**
>> + * persistent_clock - represent the persistent clock
>> + * @read_data: Data required to read from persistent clock.
>> + * @seq: Sequence counter for protecting updates.
>> + * @freq: The frequency of the persistent clock.
>> + * @wrap: Duration for persistent clock can run before wrapping.
>> + * @alarm: Update timeout for persistent clock wrap.
>> + * @alarm_inited: Indicate if the alarm has been initialized.
>> + */
>> +struct persistent_clock {
>> +     struct persistent_clock_read_data read_data;
>> +     seqcount_t seq;
>> +     u32 freq;
>> +     ktime_t wrap;
>> +     struct alarm alarm;
>> +     bool alarm_inited;
>> +};
>
> NAK!
>
> There is no reason to invent yet another set of data structures and yet
> more read functions with a sequence counter. which are just a bad and
> broken copy of the existing timekeeping/clocksource code. And of course the
> stuff is not serialized against multiple registrations, etc. etc.
>
> Plus the utter nonsense that any call site has to do the same thing over
> and over:
>
>     register():
>     start_alarm_timer();
>
> Why is this required in the first place? It's not at all. The only place
> where such an alarm timer will be required is when the system actually goes
> to suspend. Starting it at registration time is pointless and even counter
> productive. Assume the clocksource wraps every 2 hours. So you start it at
> boot time and after 119 minutes uptime the system suspends. So it will
> wakeup one minute later to update the clocksource. Heck no. If the timer is
> started when the machine actually suspends it will wake up earliest in 120
> minutes.
>
> And you even add that to the TSC which does not need it at all. It will
> wrap in about 400 years on a 2GHZ machine. So you degrade the functionality
> instead of improving it.
>
> So no, this is not going anywhere.
>
> Let's look at the problem itself:
>
>    You want to use one clocksource for timekeeping during runtime which is
>    fast and accurate and another one for suspend time injection which is
>    slower and/or less accurate because the fast one stops in suspend.
>
>    Plus you need an alarmtimer which makes sure that the clocksource does
>    not wrap around during suspend.
>
> Now lets look what we have already:
>
>    Both clocksources already exist and are registered as clocksources with
>    all required data in the clocksource core.
>
> Ergo the only sane and logical conclusion is to expand the existing
> infrastructure to handle that.
>
> When a clocksource is registered, then the registration function already
> makes decisions about using it as timekeeping clocksource. So add a few
> lines of code to check whether the newly registered clocksource is suitable
> and preferred for suspend.
>
>         if (!stops_in_suspend(newcs)) {
>                 if (!suspend_cs || is_preferred_suspend_cs(newcs))
>                         suspend_cs = newcs;
>         }
>
> The is_preferred_suspend_cs() can be based on rating, the maximum suspend
> length which can be achieved or whatever is sensible. It should start of as
> a very simple decision function based on rating and not an prematurely
> overengineered monstrosity.
>
> The suspend/resume() code needs a few very simple changes:
>
> xxx_suspend():
>         clocksource_prepare_suspend();
>
>   Note, this is _NOT_ timekeeping_suspend() because that is invoked _AFTER_
>   alarmtimer_suspend(). So if an alarm timer is required it needs to be
>   armed before that. A trivial solution might be to just call it from
>   alarmtimer_suspend(), but that a minor detail to worry about.
>
> timekeeping_suspend()
> {
>         clocksource_enter_suspend();
>         ...
>
> timekeeping_resume()
> {
> ...
>         if (clocksource_leave_suspend(&nsec)) {
>                 ts_delta = ns_to_timespec64(nsec);
>                 sleeptime_injected = true;
>         } else if (......
>
>
> Now lets look at the new functions:
>
> void clocksource_prepare_suspend(void)
> {
>         if (!suspend_cs)
>                 return;
>
>         if (needs_alarmtimer(suspend_cs))
>                 start_suspend_alarm(suspend_cs);
> }
>
> void clocksource_enter_suspend(void)
> {
>         if (!suspend_cs)
>                 return;
>         suspend_start = suspend_cs->read();
> }
>
> bool clocksource_leave_suspend(u64 *nsec)
> {
>         u64 now, delta;
>
>         if (!suspend_cs)
>                 return false;
>
>         now = suspend_cs->read();
>         delta = clocksource_delta(now, suspend_start, suspend_cs->mask);
>         *nsec = mul_u64_u32_shr(delta, suspend_cs->mult, suspend_cs->shift);
>         return true;
> }
>
> See? It does not need any of this totally nonsensical stuff in your
> registration function and not any new read functions and whatever, because
> it simply can use the bog standard mult/shift values. Why? Because the
> conversion above can cope with a full 64 bit * 32 bit multiply without
> falling apart. It's already there in timekeeping_resume() otherwise
> resuming with a NONSTOP TSC would result in bogus sleep times after a few
> minutes. It's slower than the normal clocksource conversion which is
> optimized for performance, but thats completely irrelevant on resume. This
> whole blurb about requiring separate mult/shift values is just plain
> drivel.
>
> Plus any reasonably broad clocksource will not need an alarmtimer at
> all. Because the only reason it is needed is when the clocksource itself
> wraps around. And that has absolutely nothing to do with mult/shift
> values. That just depends on the frequency and the bitwidth of the counter,
>
> So it does not need an update function either because in case of broad
> enough clocksources there is absolutely no need for update and in case of
> wrapping ones the alarmtimer brings it out of suspend on time. And because
> the only interesting thing is the delta between suspend and resume this is
> all a complete non issue.
>
> The clocksource core already has all the registration/unregistration
> functionality plus an interface to reconfigure the frequency, so
> clocksources can come and go and be reconfigured and all of this just
> works.
>
> Once the extra few lines of code are in place, then you can go and cleanup
> the existing mess of homebrewn interfaces and claim that this is
> consolidation and simplification.
>
> <rant>
>
> What's wrong with you people? Didn't they teach you in school that the
> first thing to do is proper problem and code analysis? If they did not, go
> back to them and ask your money back,
>
> I'm really tired of these overengineered trainwrecks which are then
> advertised with bullshit marketing like the best invention since sliced
> bread. This might work in random corporates, but not here.
>
> </rant>

Thomas, what is wrong with *you*?  This is completely unnecessary.

First of all, I sent Baolin on this approach, as I was getting sick of
seeing the suspend/resume timing continually adding extra warts and
complexity to handle odd hardware that needs some solution. So I
suggested he clean that up in a way that lets these multiple ways of
solving the problem be done in device specific code w/o adding more
complexity to the core logic - in a fashion how the clocksources
allowed us to support nice fast/well-behaving hardware w/o preventing
ugly-but-neccessary solutions like the pit clocksource.

Now, I've also been quite a poor maintainer of late, and haven't done
much to help with pre-review of Baloin's code.  So that's on me, not
him.

Admittedly, my taste isn't always great, so there is likely to be a
better approach, and having your guidance here is *really* valued. I
just wish we could get it without all this unnecessary vinegar.

I was really lucky to have your guidance and support when I was
starting in the community. Your helping bring me up as a co-maintainer
(only a a relatively small set of responsibility compared to your much
wider maintainership), does let me see that having the responsibility
of billions of devices running your code is immense and comes with no
small amount of stress. Juggling the infinite incoming stream of
review requests on naieve or otherwise lacking code with other
important development priorities is a major burden, so *I really get
how frustrating it is*.

And its super annoying to have lots of short-term development requests
being thrown at you when you're the one who has to maintain it for the
long term.  But long-long-term, no one is going to be a maintainer
forever, and we are not going to develop more olympic swimmers if we
keep peeing in the pool.


Baolin: My apologies for leading you poorly here. Work on something
else for a day to let the flames burn off you, then review Thomas'
technical feedback, ignoring the harsh style, and try to address his
technical comments with the approach. I'll try to do better in finding
time to review your work.

-john

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 1/8] time: Add persistent clock support
@ 2018-06-25 17:23       ` John Stultz
  0 siblings, 0 replies; 38+ messages in thread
From: John Stultz @ 2018-06-25 17:23 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Baolin Wang, Daniel Lezcano, Arnd Bergmann, Tony Lindgren,
	aaro.koskinen, Russell King - ARM Linux, Mark Rutland,
	Marc Zyngier, Mark Brown, Paul E. McKenney, Miroslav Lichvar,
	Randy Dunlap, Kate Stewart, Greg KH, pombredanne, Thierry Reding,
	Jon Hunter, Heiko Stübner, Linus Walleij, Viresh Kumar,
	Ingo Molnar, H. Peter Anvin, Peter Zijlstra, douly.fnst,
	Len Brown, rajvi.jingar, alexandre.belloni, x86,
	linux-arm-kernel, linux-tegra, lkml, linux-omap

On Sat, Jun 23, 2018 at 5:14 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Wed, 13 Jun 2018, Baolin Wang wrote:
>> Moreover we can register the clocksource with CLOCK_SOURCE_SUSPEND_NONSTOP
>> to be one persistent clock, then we can simplify the suspend/resume
>> accounting by removing CLOCK_SOURCE_SUSPEND_NONSTOP timing. After that
>> we can only compensate the OS time by persistent clock or RTC.
>
> That makes sense because it adds a gazillion lines of code and removes 5?
> Not really,
>
>> +/**
>> + * persistent_clock_read_data - data required to read persistent clock
>> + * @read: Returns a cycle value from persistent clock.
>> + * @last_cycles: Clock cycle value at last update.
>> + * @last_ns: Time value (nanoseconds) at last update.
>> + * @mask: Bitmask for two's complement subtraction of non 64bit clocks.
>> + * @mult: Cycle to nanosecond multiplier.
>> + * @shift: Cycle to nanosecond divisor.
>> + */
>> +struct persistent_clock_read_data {
>> +     u64 (*read)(void);
>> +     u64 last_cycles;
>> +     u64 last_ns;
>> +     u64 mask;
>> +     u32 mult;
>> +     u32 shift;
>> +};
>> +/**
>> + * persistent_clock - represent the persistent clock
>> + * @read_data: Data required to read from persistent clock.
>> + * @seq: Sequence counter for protecting updates.
>> + * @freq: The frequency of the persistent clock.
>> + * @wrap: Duration for persistent clock can run before wrapping.
>> + * @alarm: Update timeout for persistent clock wrap.
>> + * @alarm_inited: Indicate if the alarm has been initialized.
>> + */
>> +struct persistent_clock {
>> +     struct persistent_clock_read_data read_data;
>> +     seqcount_t seq;
>> +     u32 freq;
>> +     ktime_t wrap;
>> +     struct alarm alarm;
>> +     bool alarm_inited;
>> +};
>
> NAK!
>
> There is no reason to invent yet another set of data structures and yet
> more read functions with a sequence counter. which are just a bad and
> broken copy of the existing timekeeping/clocksource code. And of course the
> stuff is not serialized against multiple registrations, etc. etc.
>
> Plus the utter nonsense that any call site has to do the same thing over
> and over:
>
>     register():
>     start_alarm_timer();
>
> Why is this required in the first place? It's not at all. The only place
> where such an alarm timer will be required is when the system actually goes
> to suspend. Starting it at registration time is pointless and even counter
> productive. Assume the clocksource wraps every 2 hours. So you start it at
> boot time and after 119 minutes uptime the system suspends. So it will
> wakeup one minute later to update the clocksource. Heck no. If the timer is
> started when the machine actually suspends it will wake up earliest in 120
> minutes.
>
> And you even add that to the TSC which does not need it at all. It will
> wrap in about 400 years on a 2GHZ machine. So you degrade the functionality
> instead of improving it.
>
> So no, this is not going anywhere.
>
> Let's look at the problem itself:
>
>    You want to use one clocksource for timekeeping during runtime which is
>    fast and accurate and another one for suspend time injection which is
>    slower and/or less accurate because the fast one stops in suspend.
>
>    Plus you need an alarmtimer which makes sure that the clocksource does
>    not wrap around during suspend.
>
> Now lets look what we have already:
>
>    Both clocksources already exist and are registered as clocksources with
>    all required data in the clocksource core.
>
> Ergo the only sane and logical conclusion is to expand the existing
> infrastructure to handle that.
>
> When a clocksource is registered, then the registration function already
> makes decisions about using it as timekeeping clocksource. So add a few
> lines of code to check whether the newly registered clocksource is suitable
> and preferred for suspend.
>
>         if (!stops_in_suspend(newcs)) {
>                 if (!suspend_cs || is_preferred_suspend_cs(newcs))
>                         suspend_cs = newcs;
>         }
>
> The is_preferred_suspend_cs() can be based on rating, the maximum suspend
> length which can be achieved or whatever is sensible. It should start of as
> a very simple decision function based on rating and not an prematurely
> overengineered monstrosity.
>
> The suspend/resume() code needs a few very simple changes:
>
> xxx_suspend():
>         clocksource_prepare_suspend();
>
>   Note, this is _NOT_ timekeeping_suspend() because that is invoked _AFTER_
>   alarmtimer_suspend(). So if an alarm timer is required it needs to be
>   armed before that. A trivial solution might be to just call it from
>   alarmtimer_suspend(), but that a minor detail to worry about.
>
> timekeeping_suspend()
> {
>         clocksource_enter_suspend();
>         ...
>
> timekeeping_resume()
> {
> ...
>         if (clocksource_leave_suspend(&nsec)) {
>                 ts_delta = ns_to_timespec64(nsec);
>                 sleeptime_injected = true;
>         } else if (......
>
>
> Now lets look at the new functions:
>
> void clocksource_prepare_suspend(void)
> {
>         if (!suspend_cs)
>                 return;
>
>         if (needs_alarmtimer(suspend_cs))
>                 start_suspend_alarm(suspend_cs);
> }
>
> void clocksource_enter_suspend(void)
> {
>         if (!suspend_cs)
>                 return;
>         suspend_start = suspend_cs->read();
> }
>
> bool clocksource_leave_suspend(u64 *nsec)
> {
>         u64 now, delta;
>
>         if (!suspend_cs)
>                 return false;
>
>         now = suspend_cs->read();
>         delta = clocksource_delta(now, suspend_start, suspend_cs->mask);
>         *nsec = mul_u64_u32_shr(delta, suspend_cs->mult, suspend_cs->shift);
>         return true;
> }
>
> See? It does not need any of this totally nonsensical stuff in your
> registration function and not any new read functions and whatever, because
> it simply can use the bog standard mult/shift values. Why? Because the
> conversion above can cope with a full 64 bit * 32 bit multiply without
> falling apart. It's already there in timekeeping_resume() otherwise
> resuming with a NONSTOP TSC would result in bogus sleep times after a few
> minutes. It's slower than the normal clocksource conversion which is
> optimized for performance, but thats completely irrelevant on resume. This
> whole blurb about requiring separate mult/shift values is just plain
> drivel.
>
> Plus any reasonably broad clocksource will not need an alarmtimer at
> all. Because the only reason it is needed is when the clocksource itself
> wraps around. And that has absolutely nothing to do with mult/shift
> values. That just depends on the frequency and the bitwidth of the counter,
>
> So it does not need an update function either because in case of broad
> enough clocksources there is absolutely no need for update and in case of
> wrapping ones the alarmtimer brings it out of suspend on time. And because
> the only interesting thing is the delta between suspend and resume this is
> all a complete non issue.
>
> The clocksource core already has all the registration/unregistration
> functionality plus an interface to reconfigure the frequency, so
> clocksources can come and go and be reconfigured and all of this just
> works.
>
> Once the extra few lines of code are in place, then you can go and cleanup
> the existing mess of homebrewn interfaces and claim that this is
> consolidation and simplification.
>
> <rant>
>
> What's wrong with you people? Didn't they teach you in school that the
> first thing to do is proper problem and code analysis? If they did not, go
> back to them and ask your money back,
>
> I'm really tired of these overengineered trainwrecks which are then
> advertised with bullshit marketing like the best invention since sliced
> bread. This might work in random corporates, but not here.
>
> </rant>

Thomas, what is wrong with *you*?  This is completely unnecessary.

First of all, I sent Baolin on this approach, as I was getting sick of
seeing the suspend/resume timing continually adding extra warts and
complexity to handle odd hardware that needs some solution. So I
suggested he clean that up in a way that lets these multiple ways of
solving the problem be done in device specific code w/o adding more
complexity to the core logic - in a fashion how the clocksources
allowed us to support nice fast/well-behaving hardware w/o preventing
ugly-but-neccessary solutions like the pit clocksource.

Now, I've also been quite a poor maintainer of late, and haven't done
much to help with pre-review of Baloin's code.  So that's on me, not
him.

Admittedly, my taste isn't always great, so there is likely to be a
better approach, and having your guidance here is *really* valued. I
just wish we could get it without all this unnecessary vinegar.

I was really lucky to have your guidance and support when I was
starting in the community. Your helping bring me up as a co-maintainer
(only a a relatively small set of responsibility compared to your much
wider maintainership), does let me see that having the responsibility
of billions of devices running your code is immense and comes with no
small amount of stress. Juggling the infinite incoming stream of
review requests on naieve or otherwise lacking code with other
important development priorities is a major burden, so *I really get
how frustrating it is*.

And its super annoying to have lots of short-term development requests
being thrown at you when you're the one who has to maintain it for the
long term.  But long-long-term, no one is going to be a maintainer
forever, and we are not going to develop more olympic swimmers if we
keep peeing in the pool.


Baolin: My apologies for leading you poorly here. Work on something
else for a day to let the flames burn off you, then review Thomas'
technical feedback, ignoring the harsh style, and try to address his
technical comments with the approach. I'll try to do better in finding
time to review your work.

-john

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH 1/8] time: Add persistent clock support
@ 2018-06-25 17:23       ` John Stultz
  0 siblings, 0 replies; 38+ messages in thread
From: John Stultz @ 2018-06-25 17:23 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, Jun 23, 2018 at 5:14 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Wed, 13 Jun 2018, Baolin Wang wrote:
>> Moreover we can register the clocksource with CLOCK_SOURCE_SUSPEND_NONSTOP
>> to be one persistent clock, then we can simplify the suspend/resume
>> accounting by removing CLOCK_SOURCE_SUSPEND_NONSTOP timing. After that
>> we can only compensate the OS time by persistent clock or RTC.
>
> That makes sense because it adds a gazillion lines of code and removes 5?
> Not really,
>
>> +/**
>> + * persistent_clock_read_data - data required to read persistent clock
>> + * @read: Returns a cycle value from persistent clock.
>> + * @last_cycles: Clock cycle value at last update.
>> + * @last_ns: Time value (nanoseconds) at last update.
>> + * @mask: Bitmask for two's complement subtraction of non 64bit clocks.
>> + * @mult: Cycle to nanosecond multiplier.
>> + * @shift: Cycle to nanosecond divisor.
>> + */
>> +struct persistent_clock_read_data {
>> +     u64 (*read)(void);
>> +     u64 last_cycles;
>> +     u64 last_ns;
>> +     u64 mask;
>> +     u32 mult;
>> +     u32 shift;
>> +};
>> +/**
>> + * persistent_clock - represent the persistent clock
>> + * @read_data: Data required to read from persistent clock.
>> + * @seq: Sequence counter for protecting updates.
>> + * @freq: The frequency of the persistent clock.
>> + * @wrap: Duration for persistent clock can run before wrapping.
>> + * @alarm: Update timeout for persistent clock wrap.
>> + * @alarm_inited: Indicate if the alarm has been initialized.
>> + */
>> +struct persistent_clock {
>> +     struct persistent_clock_read_data read_data;
>> +     seqcount_t seq;
>> +     u32 freq;
>> +     ktime_t wrap;
>> +     struct alarm alarm;
>> +     bool alarm_inited;
>> +};
>
> NAK!
>
> There is no reason to invent yet another set of data structures and yet
> more read functions with a sequence counter. which are just a bad and
> broken copy of the existing timekeeping/clocksource code. And of course the
> stuff is not serialized against multiple registrations, etc. etc.
>
> Plus the utter nonsense that any call site has to do the same thing over
> and over:
>
>     register():
>     start_alarm_timer();
>
> Why is this required in the first place? It's not at all. The only place
> where such an alarm timer will be required is when the system actually goes
> to suspend. Starting it at registration time is pointless and even counter
> productive. Assume the clocksource wraps every 2 hours. So you start it at
> boot time and after 119 minutes uptime the system suspends. So it will
> wakeup one minute later to update the clocksource. Heck no. If the timer is
> started when the machine actually suspends it will wake up earliest in 120
> minutes.
>
> And you even add that to the TSC which does not need it at all. It will
> wrap in about 400 years on a 2GHZ machine. So you degrade the functionality
> instead of improving it.
>
> So no, this is not going anywhere.
>
> Let's look at the problem itself:
>
>    You want to use one clocksource for timekeeping during runtime which is
>    fast and accurate and another one for suspend time injection which is
>    slower and/or less accurate because the fast one stops in suspend.
>
>    Plus you need an alarmtimer which makes sure that the clocksource does
>    not wrap around during suspend.
>
> Now lets look what we have already:
>
>    Both clocksources already exist and are registered as clocksources with
>    all required data in the clocksource core.
>
> Ergo the only sane and logical conclusion is to expand the existing
> infrastructure to handle that.
>
> When a clocksource is registered, then the registration function already
> makes decisions about using it as timekeeping clocksource. So add a few
> lines of code to check whether the newly registered clocksource is suitable
> and preferred for suspend.
>
>         if (!stops_in_suspend(newcs)) {
>                 if (!suspend_cs || is_preferred_suspend_cs(newcs))
>                         suspend_cs = newcs;
>         }
>
> The is_preferred_suspend_cs() can be based on rating, the maximum suspend
> length which can be achieved or whatever is sensible. It should start of as
> a very simple decision function based on rating and not an prematurely
> overengineered monstrosity.
>
> The suspend/resume() code needs a few very simple changes:
>
> xxx_suspend():
>         clocksource_prepare_suspend();
>
>   Note, this is _NOT_ timekeeping_suspend() because that is invoked _AFTER_
>   alarmtimer_suspend(). So if an alarm timer is required it needs to be
>   armed before that. A trivial solution might be to just call it from
>   alarmtimer_suspend(), but that a minor detail to worry about.
>
> timekeeping_suspend()
> {
>         clocksource_enter_suspend();
>         ...
>
> timekeeping_resume()
> {
> ...
>         if (clocksource_leave_suspend(&nsec)) {
>                 ts_delta = ns_to_timespec64(nsec);
>                 sleeptime_injected = true;
>         } else if (......
>
>
> Now lets look at the new functions:
>
> void clocksource_prepare_suspend(void)
> {
>         if (!suspend_cs)
>                 return;
>
>         if (needs_alarmtimer(suspend_cs))
>                 start_suspend_alarm(suspend_cs);
> }
>
> void clocksource_enter_suspend(void)
> {
>         if (!suspend_cs)
>                 return;
>         suspend_start = suspend_cs->read();
> }
>
> bool clocksource_leave_suspend(u64 *nsec)
> {
>         u64 now, delta;
>
>         if (!suspend_cs)
>                 return false;
>
>         now = suspend_cs->read();
>         delta = clocksource_delta(now, suspend_start, suspend_cs->mask);
>         *nsec = mul_u64_u32_shr(delta, suspend_cs->mult, suspend_cs->shift);
>         return true;
> }
>
> See? It does not need any of this totally nonsensical stuff in your
> registration function and not any new read functions and whatever, because
> it simply can use the bog standard mult/shift values. Why? Because the
> conversion above can cope with a full 64 bit * 32 bit multiply without
> falling apart. It's already there in timekeeping_resume() otherwise
> resuming with a NONSTOP TSC would result in bogus sleep times after a few
> minutes. It's slower than the normal clocksource conversion which is
> optimized for performance, but thats completely irrelevant on resume. This
> whole blurb about requiring separate mult/shift values is just plain
> drivel.
>
> Plus any reasonably broad clocksource will not need an alarmtimer at
> all. Because the only reason it is needed is when the clocksource itself
> wraps around. And that has absolutely nothing to do with mult/shift
> values. That just depends on the frequency and the bitwidth of the counter,
>
> So it does not need an update function either because in case of broad
> enough clocksources there is absolutely no need for update and in case of
> wrapping ones the alarmtimer brings it out of suspend on time. And because
> the only interesting thing is the delta between suspend and resume this is
> all a complete non issue.
>
> The clocksource core already has all the registration/unregistration
> functionality plus an interface to reconfigure the frequency, so
> clocksources can come and go and be reconfigured and all of this just
> works.
>
> Once the extra few lines of code are in place, then you can go and cleanup
> the existing mess of homebrewn interfaces and claim that this is
> consolidation and simplification.
>
> <rant>
>
> What's wrong with you people? Didn't they teach you in school that the
> first thing to do is proper problem and code analysis? If they did not, go
> back to them and ask your money back,
>
> I'm really tired of these overengineered trainwrecks which are then
> advertised with bullshit marketing like the best invention since sliced
> bread. This might work in random corporates, but not here.
>
> </rant>

Thomas, what is wrong with *you*?  This is completely unnecessary.

First of all, I sent Baolin on this approach, as I was getting sick of
seeing the suspend/resume timing continually adding extra warts and
complexity to handle odd hardware that needs some solution. So I
suggested he clean that up in a way that lets these multiple ways of
solving the problem be done in device specific code w/o adding more
complexity to the core logic - in a fashion how the clocksources
allowed us to support nice fast/well-behaving hardware w/o preventing
ugly-but-neccessary solutions like the pit clocksource.

Now, I've also been quite a poor maintainer of late, and haven't done
much to help with pre-review of Baloin's code.  So that's on me, not
him.

Admittedly, my taste isn't always great, so there is likely to be a
better approach, and having your guidance here is *really* valued. I
just wish we could get it without all this unnecessary vinegar.

I was really lucky to have your guidance and support when I was
starting in the community. Your helping bring me up as a co-maintainer
(only a a relatively small set of responsibility compared to your much
wider maintainership), does let me see that having the responsibility
of billions of devices running your code is immense and comes with no
small amount of stress. Juggling the infinite incoming stream of
review requests on naieve or otherwise lacking code with other
important development priorities is a major burden, so *I really get
how frustrating it is*.

And its super annoying to have lots of short-term development requests
being thrown at you when you're the one who has to maintain it for the
long term.  But long-long-term, no one is going to be a maintainer
forever, and we are not going to develop more olympic swimmers if we
keep peeing in the pool.


Baolin: My apologies for leading you poorly here. Work on something
else for a day to let the flames burn off you, then review Thomas'
technical feedback, ignoring the harsh style, and try to address his
technical comments with the approach. I'll try to do better in finding
time to review your work.

-john

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 1/8] time: Add persistent clock support
  2018-06-25 17:23       ` John Stultz
  (?)
@ 2018-06-25 20:06         ` Thomas Gleixner
  -1 siblings, 0 replies; 38+ messages in thread
From: Thomas Gleixner @ 2018-06-25 20:06 UTC (permalink / raw)
  To: John Stultz
  Cc: Mark Rutland, Kate Stewart, alexandre.belloni, x86,
	Heiko Stübner, Tony Lindgren, Viresh Kumar, Linus Walleij,
	lkml, Thierry Reding, H. Peter Anvin, Ingo Molnar, aaro.koskinen,
	Daniel Lezcano, Russell King - ARM Linux, Jon Hunter,
	Peter Zijlstra, Paul E. McKenney, Len Brown, Arnd Bergmann,
	Marc Zyngier, Miroslav Lichvar, Mark Brown, linux-tegra

On Mon, 25 Jun 2018, John Stultz wrote:
> On Sat, Jun 23, 2018 at 5:14 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
> 
> Thomas, what is wrong with *you*?  This is completely unnecessary.

Nothing is wrong with me. I just hit a point where the amount of half baken
crap tripped me over the edge and I let of steam. Sorry for that, it surely
wasn't necessary. It was one of those days where grumpiness took
control. My steam pressure is back to normal and I'm more than happy to put
that back on the pure technical level.

To be clear, that rant was not a shot at Baolin. It was definitely
addressed to the people behind him who set him on this project and then
failed to do any sanity checking.

> Now, I've also been quite a poor maintainer of late, and haven't done
> much to help with pre-review of Baloin's code.  So that's on me, not
> him.

Fair enough.

> I was really lucky to have your guidance and support when I was
> starting in the community. Your helping bring me up as a co-maintainer
> (only a a relatively small set of responsibility compared to your much
> wider maintainership), does let me see that having the responsibility
> of billions of devices running your code is immense and comes with no
> small amount of stress. Juggling the infinite incoming stream of
> review requests on naieve or otherwise lacking code with other
> important development priorities is a major burden, so *I really get
> how frustrating it is*.

Yes, and I have to admit that I'm not a super human and despite trying to
control my emotions, there are points where I fail in doing so. It makes me
explode and that's it. I can assure Baolin, that this is nothing which
sticks. I'll look at the next patchset from a pure technical level. There
are surely precautions, but definitely no prejudice.

> And its super annoying to have lots of short-term development requests
> being thrown at you when you're the one who has to maintain it for the
> long term.  But long-long-term, no one is going to be a maintainer
> forever, and we are not going to develop more olympic swimmers if we
> keep peeing in the pool.

Right, that's why I early looked for co/sub-maintainers, but as you
admitted yourself, that's not always working out and I surely would
appreciate if there would be more help on all ends. I'm the last person who
rejects help, quite the contrary I'm very happy when I can mark mails as
'none of my business, somebody else takes care of that'.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 1/8] time: Add persistent clock support
@ 2018-06-25 20:06         ` Thomas Gleixner
  0 siblings, 0 replies; 38+ messages in thread
From: Thomas Gleixner @ 2018-06-25 20:06 UTC (permalink / raw)
  To: John Stultz
  Cc: Baolin Wang, Daniel Lezcano, Arnd Bergmann, Tony Lindgren,
	aaro.koskinen, Russell King - ARM Linux, Mark Rutland,
	Marc Zyngier, Mark Brown, Paul E. McKenney, Miroslav Lichvar,
	Randy Dunlap, Kate Stewart, Greg KH, pombredanne, Thierry Reding,
	Jon Hunter, Heiko Stübner, Linus Walleij, Viresh Kumar,
	Ingo Molnar, H. Peter Anvin, Peter Zijlstra, douly.fnst,
	Len Brown, rajvi.jingar, alexandre.belloni, x86,
	linux-arm-kernel, linux-tegra, lkml, linux-omap

On Mon, 25 Jun 2018, John Stultz wrote:
> On Sat, Jun 23, 2018 at 5:14 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
> 
> Thomas, what is wrong with *you*?  This is completely unnecessary.

Nothing is wrong with me. I just hit a point where the amount of half baken
crap tripped me over the edge and I let of steam. Sorry for that, it surely
wasn't necessary. It was one of those days where grumpiness took
control. My steam pressure is back to normal and I'm more than happy to put
that back on the pure technical level.

To be clear, that rant was not a shot at Baolin. It was definitely
addressed to the people behind him who set him on this project and then
failed to do any sanity checking.

> Now, I've also been quite a poor maintainer of late, and haven't done
> much to help with pre-review of Baloin's code.  So that's on me, not
> him.

Fair enough.

> I was really lucky to have your guidance and support when I was
> starting in the community. Your helping bring me up as a co-maintainer
> (only a a relatively small set of responsibility compared to your much
> wider maintainership), does let me see that having the responsibility
> of billions of devices running your code is immense and comes with no
> small amount of stress. Juggling the infinite incoming stream of
> review requests on naieve or otherwise lacking code with other
> important development priorities is a major burden, so *I really get
> how frustrating it is*.

Yes, and I have to admit that I'm not a super human and despite trying to
control my emotions, there are points where I fail in doing so. It makes me
explode and that's it. I can assure Baolin, that this is nothing which
sticks. I'll look at the next patchset from a pure technical level. There
are surely precautions, but definitely no prejudice.

> And its super annoying to have lots of short-term development requests
> being thrown at you when you're the one who has to maintain it for the
> long term.  But long-long-term, no one is going to be a maintainer
> forever, and we are not going to develop more olympic swimmers if we
> keep peeing in the pool.

Right, that's why I early looked for co/sub-maintainers, but as you
admitted yourself, that's not always working out and I surely would
appreciate if there would be more help on all ends. I'm the last person who
rejects help, quite the contrary I'm very happy when I can mark mails as
'none of my business, somebody else takes care of that'.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH 1/8] time: Add persistent clock support
@ 2018-06-25 20:06         ` Thomas Gleixner
  0 siblings, 0 replies; 38+ messages in thread
From: Thomas Gleixner @ 2018-06-25 20:06 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, 25 Jun 2018, John Stultz wrote:
> On Sat, Jun 23, 2018 at 5:14 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
> 
> Thomas, what is wrong with *you*?  This is completely unnecessary.

Nothing is wrong with me. I just hit a point where the amount of half baken
crap tripped me over the edge and I let of steam. Sorry for that, it surely
wasn't necessary. It was one of those days where grumpiness took
control. My steam pressure is back to normal and I'm more than happy to put
that back on the pure technical level.

To be clear, that rant was not a shot at Baolin. It was definitely
addressed to the people behind him who set him on this project and then
failed to do any sanity checking.

> Now, I've also been quite a poor maintainer of late, and haven't done
> much to help with pre-review of Baloin's code.  So that's on me, not
> him.

Fair enough.

> I was really lucky to have your guidance and support when I was
> starting in the community. Your helping bring me up as a co-maintainer
> (only a a relatively small set of responsibility compared to your much
> wider maintainership), does let me see that having the responsibility
> of billions of devices running your code is immense and comes with no
> small amount of stress. Juggling the infinite incoming stream of
> review requests on naieve or otherwise lacking code with other
> important development priorities is a major burden, so *I really get
> how frustrating it is*.

Yes, and I have to admit that I'm not a super human and despite trying to
control my emotions, there are points where I fail in doing so. It makes me
explode and that's it. I can assure Baolin, that this is nothing which
sticks. I'll look at the next patchset from a pure technical level. There
are surely precautions, but definitely no prejudice.

> And its super annoying to have lots of short-term development requests
> being thrown at you when you're the one who has to maintain it for the
> long term.  But long-long-term, no one is going to be a maintainer
> forever, and we are not going to develop more olympic swimmers if we
> keep peeing in the pool.

Right, that's why I early looked for co/sub-maintainers, but as you
admitted yourself, that's not always working out and I surely would
appreciate if there would be more help on all ends. I'm the last person who
rejects help, quite the contrary I'm very happy when I can mark mails as
'none of my business, somebody else takes care of that'.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 1/8] time: Add persistent clock support
  2018-06-24  0:14     ` Thomas Gleixner
  (?)
@ 2018-06-26  2:08       ` Baolin Wang
  -1 siblings, 0 replies; 38+ messages in thread
From: Baolin Wang @ 2018-06-26  2:08 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: John Stultz, Daniel Lezcano, Arnd Bergmann, Tony Lindgren,
	aaro.koskinen, linux, Mark Rutland, Marc Zyngier, Mark Brown,
	Paul McKenney, mlichvar, Randy Dunlap, Kate Stewart, Greg KH,
	Philippe Ombredanne, Thierry Reding, Jon Hunter,
	Heiko Stübner, Linus Walleij, Viresh Kumar, Ingo Molnar

Hi Thomas,

On 24 June 2018 at 08:14, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Wed, 13 Jun 2018, Baolin Wang wrote:
>> Moreover we can register the clocksource with CLOCK_SOURCE_SUSPEND_NONSTOP
>> to be one persistent clock, then we can simplify the suspend/resume
>> accounting by removing CLOCK_SOURCE_SUSPEND_NONSTOP timing. After that
>> we can only compensate the OS time by persistent clock or RTC.
>
> That makes sense because it adds a gazillion lines of code and removes 5?
> Not really,
>
>> +/**
>> + * persistent_clock_read_data - data required to read persistent clock
>> + * @read: Returns a cycle value from persistent clock.
>> + * @last_cycles: Clock cycle value at last update.
>> + * @last_ns: Time value (nanoseconds) at last update.
>> + * @mask: Bitmask for two's complement subtraction of non 64bit clocks.
>> + * @mult: Cycle to nanosecond multiplier.
>> + * @shift: Cycle to nanosecond divisor.
>> + */
>> +struct persistent_clock_read_data {
>> +     u64 (*read)(void);
>> +     u64 last_cycles;
>> +     u64 last_ns;
>> +     u64 mask;
>> +     u32 mult;
>> +     u32 shift;
>> +};
>> +/**
>> + * persistent_clock - represent the persistent clock
>> + * @read_data: Data required to read from persistent clock.
>> + * @seq: Sequence counter for protecting updates.
>> + * @freq: The frequency of the persistent clock.
>> + * @wrap: Duration for persistent clock can run before wrapping.
>> + * @alarm: Update timeout for persistent clock wrap.
>> + * @alarm_inited: Indicate if the alarm has been initialized.
>> + */
>> +struct persistent_clock {
>> +     struct persistent_clock_read_data read_data;
>> +     seqcount_t seq;
>> +     u32 freq;
>> +     ktime_t wrap;
>> +     struct alarm alarm;
>> +     bool alarm_inited;
>> +};
>
> NAK!
>
> There is no reason to invent yet another set of data structures and yet
> more read functions with a sequence counter. which are just a bad and
> broken copy of the existing timekeeping/clocksource code. And of course the
> stuff is not serialized against multiple registrations, etc. etc.
>
> Plus the utter nonsense that any call site has to do the same thing over
> and over:
>
>     register():
>     start_alarm_timer();
>
> Why is this required in the first place? It's not at all. The only place
> where such an alarm timer will be required is when the system actually goes
> to suspend. Starting it at registration time is pointless and even counter
> productive. Assume the clocksource wraps every 2 hours. So you start it at
> boot time and after 119 minutes uptime the system suspends. So it will
> wakeup one minute later to update the clocksource. Heck no. If the timer is
> started when the machine actually suspends it will wake up earliest in 120
> minutes.
>
> And you even add that to the TSC which does not need it at all. It will
> wrap in about 400 years on a 2GHZ machine. So you degrade the functionality
> instead of improving it.
>
> So no, this is not going anywhere.
>
> Let's look at the problem itself:
>
>    You want to use one clocksource for timekeeping during runtime which is
>    fast and accurate and another one for suspend time injection which is
>    slower and/or less accurate because the fast one stops in suspend.
>
>    Plus you need an alarmtimer which makes sure that the clocksource does
>    not wrap around during suspend.
>
> Now lets look what we have already:
>
>    Both clocksources already exist and are registered as clocksources with
>    all required data in the clocksource core.
>
> Ergo the only sane and logical conclusion is to expand the existing
> infrastructure to handle that.
>
> When a clocksource is registered, then the registration function already
> makes decisions about using it as timekeeping clocksource. So add a few
> lines of code to check whether the newly registered clocksource is suitable
> and preferred for suspend.
>
>         if (!stops_in_suspend(newcs)) {
>                 if (!suspend_cs || is_preferred_suspend_cs(newcs))
>                         suspend_cs = newcs;
>         }
>
> The is_preferred_suspend_cs() can be based on rating, the maximum suspend
> length which can be achieved or whatever is sensible. It should start of as
> a very simple decision function based on rating and not an prematurely
> overengineered monstrosity.
>
> The suspend/resume() code needs a few very simple changes:
>
> xxx_suspend():
>         clocksource_prepare_suspend();
>
>   Note, this is _NOT_ timekeeping_suspend() because that is invoked _AFTER_
>   alarmtimer_suspend(). So if an alarm timer is required it needs to be
>   armed before that. A trivial solution might be to just call it from
>   alarmtimer_suspend(), but that a minor detail to worry about.
>
> timekeeping_suspend()
> {
>         clocksource_enter_suspend();
>         ...
>
> timekeeping_resume()
> {
> ...
>         if (clocksource_leave_suspend(&nsec)) {
>                 ts_delta = ns_to_timespec64(nsec);
>                 sleeptime_injected = true;
>         } else if (......
>
>
> Now lets look at the new functions:
>
> void clocksource_prepare_suspend(void)
> {
>         if (!suspend_cs)
>                 return;
>
>         if (needs_alarmtimer(suspend_cs))
>                 start_suspend_alarm(suspend_cs);
> }
>
> void clocksource_enter_suspend(void)
> {
>         if (!suspend_cs)
>                 return;
>         suspend_start = suspend_cs->read();
> }
>
> bool clocksource_leave_suspend(u64 *nsec)
> {
>         u64 now, delta;
>
>         if (!suspend_cs)
>                 return false;
>
>         now = suspend_cs->read();
>         delta = clocksource_delta(now, suspend_start, suspend_cs->mask);
>         *nsec = mul_u64_u32_shr(delta, suspend_cs->mult, suspend_cs->shift);
>         return true;
> }
>
> See? It does not need any of this totally nonsensical stuff in your
> registration function and not any new read functions and whatever, because
> it simply can use the bog standard mult/shift values. Why? Because the
> conversion above can cope with a full 64 bit * 32 bit multiply without
> falling apart. It's already there in timekeeping_resume() otherwise
> resuming with a NONSTOP TSC would result in bogus sleep times after a few
> minutes. It's slower than the normal clocksource conversion which is
> optimized for performance, but thats completely irrelevant on resume. This
> whole blurb about requiring separate mult/shift values is just plain
> drivel.
>
> Plus any reasonably broad clocksource will not need an alarmtimer at
> all. Because the only reason it is needed is when the clocksource itself
> wraps around. And that has absolutely nothing to do with mult/shift
> values. That just depends on the frequency and the bitwidth of the counter,
>
> So it does not need an update function either because in case of broad
> enough clocksources there is absolutely no need for update and in case of
> wrapping ones the alarmtimer brings it out of suspend on time. And because
> the only interesting thing is the delta between suspend and resume this is
> all a complete non issue.
>
> The clocksource core already has all the registration/unregistration
> functionality plus an interface to reconfigure the frequency, so
> clocksources can come and go and be reconfigured and all of this just
> works.
>
> Once the extra few lines of code are in place, then you can go and cleanup
> the existing mess of homebrewn interfaces and claim that this is
> consolidation and simplification.

I am very grateful for your suggestion, and I am sorry I made a stupid
mistake that I missed the mul_u64_u32_shr() function which can avoid
introducing extra mult/shift for suspend clocksource. I will use that
and follow your other suggestions too. Thanks again.

>
> <rant>
>
> What's wrong with you people? Didn't they teach you in school that the
> first thing to do is proper problem and code analysis? If they did not, go
> back to them and ask your money back,
>
> I'm really tired of these overengineered trainwrecks which are then
> advertised with bullshit marketing like the best invention since sliced
> bread. This might work in random corporates, but not here.
>
> </rant>
>
> Thanks,
>
>         tglx



---
Baolin Wang
Best Regards

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 1/8] time: Add persistent clock support
@ 2018-06-26  2:08       ` Baolin Wang
  0 siblings, 0 replies; 38+ messages in thread
From: Baolin Wang @ 2018-06-26  2:08 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: John Stultz, Daniel Lezcano, Arnd Bergmann, Tony Lindgren,
	aaro.koskinen, linux, Mark Rutland, Marc Zyngier, Mark Brown,
	Paul McKenney, mlichvar, Randy Dunlap, Kate Stewart, Greg KH,
	Philippe Ombredanne, Thierry Reding, Jon Hunter,
	Heiko Stübner, Linus Walleij, Viresh Kumar, Ingo Molnar,
	H. Peter Anvin, peterz, douly.fnst, len.brown, rajvi.jingar,
	Alexandre Belloni, x86, Linux ARM, linux-tegra, LKML, linux-omap

Hi Thomas,

On 24 June 2018 at 08:14, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Wed, 13 Jun 2018, Baolin Wang wrote:
>> Moreover we can register the clocksource with CLOCK_SOURCE_SUSPEND_NONSTOP
>> to be one persistent clock, then we can simplify the suspend/resume
>> accounting by removing CLOCK_SOURCE_SUSPEND_NONSTOP timing. After that
>> we can only compensate the OS time by persistent clock or RTC.
>
> That makes sense because it adds a gazillion lines of code and removes 5?
> Not really,
>
>> +/**
>> + * persistent_clock_read_data - data required to read persistent clock
>> + * @read: Returns a cycle value from persistent clock.
>> + * @last_cycles: Clock cycle value at last update.
>> + * @last_ns: Time value (nanoseconds) at last update.
>> + * @mask: Bitmask for two's complement subtraction of non 64bit clocks.
>> + * @mult: Cycle to nanosecond multiplier.
>> + * @shift: Cycle to nanosecond divisor.
>> + */
>> +struct persistent_clock_read_data {
>> +     u64 (*read)(void);
>> +     u64 last_cycles;
>> +     u64 last_ns;
>> +     u64 mask;
>> +     u32 mult;
>> +     u32 shift;
>> +};
>> +/**
>> + * persistent_clock - represent the persistent clock
>> + * @read_data: Data required to read from persistent clock.
>> + * @seq: Sequence counter for protecting updates.
>> + * @freq: The frequency of the persistent clock.
>> + * @wrap: Duration for persistent clock can run before wrapping.
>> + * @alarm: Update timeout for persistent clock wrap.
>> + * @alarm_inited: Indicate if the alarm has been initialized.
>> + */
>> +struct persistent_clock {
>> +     struct persistent_clock_read_data read_data;
>> +     seqcount_t seq;
>> +     u32 freq;
>> +     ktime_t wrap;
>> +     struct alarm alarm;
>> +     bool alarm_inited;
>> +};
>
> NAK!
>
> There is no reason to invent yet another set of data structures and yet
> more read functions with a sequence counter. which are just a bad and
> broken copy of the existing timekeeping/clocksource code. And of course the
> stuff is not serialized against multiple registrations, etc. etc.
>
> Plus the utter nonsense that any call site has to do the same thing over
> and over:
>
>     register():
>     start_alarm_timer();
>
> Why is this required in the first place? It's not at all. The only place
> where such an alarm timer will be required is when the system actually goes
> to suspend. Starting it at registration time is pointless and even counter
> productive. Assume the clocksource wraps every 2 hours. So you start it at
> boot time and after 119 minutes uptime the system suspends. So it will
> wakeup one minute later to update the clocksource. Heck no. If the timer is
> started when the machine actually suspends it will wake up earliest in 120
> minutes.
>
> And you even add that to the TSC which does not need it at all. It will
> wrap in about 400 years on a 2GHZ machine. So you degrade the functionality
> instead of improving it.
>
> So no, this is not going anywhere.
>
> Let's look at the problem itself:
>
>    You want to use one clocksource for timekeeping during runtime which is
>    fast and accurate and another one for suspend time injection which is
>    slower and/or less accurate because the fast one stops in suspend.
>
>    Plus you need an alarmtimer which makes sure that the clocksource does
>    not wrap around during suspend.
>
> Now lets look what we have already:
>
>    Both clocksources already exist and are registered as clocksources with
>    all required data in the clocksource core.
>
> Ergo the only sane and logical conclusion is to expand the existing
> infrastructure to handle that.
>
> When a clocksource is registered, then the registration function already
> makes decisions about using it as timekeeping clocksource. So add a few
> lines of code to check whether the newly registered clocksource is suitable
> and preferred for suspend.
>
>         if (!stops_in_suspend(newcs)) {
>                 if (!suspend_cs || is_preferred_suspend_cs(newcs))
>                         suspend_cs = newcs;
>         }
>
> The is_preferred_suspend_cs() can be based on rating, the maximum suspend
> length which can be achieved or whatever is sensible. It should start of as
> a very simple decision function based on rating and not an prematurely
> overengineered monstrosity.
>
> The suspend/resume() code needs a few very simple changes:
>
> xxx_suspend():
>         clocksource_prepare_suspend();
>
>   Note, this is _NOT_ timekeeping_suspend() because that is invoked _AFTER_
>   alarmtimer_suspend(). So if an alarm timer is required it needs to be
>   armed before that. A trivial solution might be to just call it from
>   alarmtimer_suspend(), but that a minor detail to worry about.
>
> timekeeping_suspend()
> {
>         clocksource_enter_suspend();
>         ...
>
> timekeeping_resume()
> {
> ...
>         if (clocksource_leave_suspend(&nsec)) {
>                 ts_delta = ns_to_timespec64(nsec);
>                 sleeptime_injected = true;
>         } else if (......
>
>
> Now lets look at the new functions:
>
> void clocksource_prepare_suspend(void)
> {
>         if (!suspend_cs)
>                 return;
>
>         if (needs_alarmtimer(suspend_cs))
>                 start_suspend_alarm(suspend_cs);
> }
>
> void clocksource_enter_suspend(void)
> {
>         if (!suspend_cs)
>                 return;
>         suspend_start = suspend_cs->read();
> }
>
> bool clocksource_leave_suspend(u64 *nsec)
> {
>         u64 now, delta;
>
>         if (!suspend_cs)
>                 return false;
>
>         now = suspend_cs->read();
>         delta = clocksource_delta(now, suspend_start, suspend_cs->mask);
>         *nsec = mul_u64_u32_shr(delta, suspend_cs->mult, suspend_cs->shift);
>         return true;
> }
>
> See? It does not need any of this totally nonsensical stuff in your
> registration function and not any new read functions and whatever, because
> it simply can use the bog standard mult/shift values. Why? Because the
> conversion above can cope with a full 64 bit * 32 bit multiply without
> falling apart. It's already there in timekeeping_resume() otherwise
> resuming with a NONSTOP TSC would result in bogus sleep times after a few
> minutes. It's slower than the normal clocksource conversion which is
> optimized for performance, but thats completely irrelevant on resume. This
> whole blurb about requiring separate mult/shift values is just plain
> drivel.
>
> Plus any reasonably broad clocksource will not need an alarmtimer at
> all. Because the only reason it is needed is when the clocksource itself
> wraps around. And that has absolutely nothing to do with mult/shift
> values. That just depends on the frequency and the bitwidth of the counter,
>
> So it does not need an update function either because in case of broad
> enough clocksources there is absolutely no need for update and in case of
> wrapping ones the alarmtimer brings it out of suspend on time. And because
> the only interesting thing is the delta between suspend and resume this is
> all a complete non issue.
>
> The clocksource core already has all the registration/unregistration
> functionality plus an interface to reconfigure the frequency, so
> clocksources can come and go and be reconfigured and all of this just
> works.
>
> Once the extra few lines of code are in place, then you can go and cleanup
> the existing mess of homebrewn interfaces and claim that this is
> consolidation and simplification.

I am very grateful for your suggestion, and I am sorry I made a stupid
mistake that I missed the mul_u64_u32_shr() function which can avoid
introducing extra mult/shift for suspend clocksource. I will use that
and follow your other suggestions too. Thanks again.

>
> <rant>
>
> What's wrong with you people? Didn't they teach you in school that the
> first thing to do is proper problem and code analysis? If they did not, go
> back to them and ask your money back,
>
> I'm really tired of these overengineered trainwrecks which are then
> advertised with bullshit marketing like the best invention since sliced
> bread. This might work in random corporates, but not here.
>
> </rant>
>
> Thanks,
>
>         tglx



---
Baolin Wang
Best Regards

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH 1/8] time: Add persistent clock support
@ 2018-06-26  2:08       ` Baolin Wang
  0 siblings, 0 replies; 38+ messages in thread
From: Baolin Wang @ 2018-06-26  2:08 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Thomas,

On 24 June 2018 at 08:14, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Wed, 13 Jun 2018, Baolin Wang wrote:
>> Moreover we can register the clocksource with CLOCK_SOURCE_SUSPEND_NONSTOP
>> to be one persistent clock, then we can simplify the suspend/resume
>> accounting by removing CLOCK_SOURCE_SUSPEND_NONSTOP timing. After that
>> we can only compensate the OS time by persistent clock or RTC.
>
> That makes sense because it adds a gazillion lines of code and removes 5?
> Not really,
>
>> +/**
>> + * persistent_clock_read_data - data required to read persistent clock
>> + * @read: Returns a cycle value from persistent clock.
>> + * @last_cycles: Clock cycle value at last update.
>> + * @last_ns: Time value (nanoseconds) at last update.
>> + * @mask: Bitmask for two's complement subtraction of non 64bit clocks.
>> + * @mult: Cycle to nanosecond multiplier.
>> + * @shift: Cycle to nanosecond divisor.
>> + */
>> +struct persistent_clock_read_data {
>> +     u64 (*read)(void);
>> +     u64 last_cycles;
>> +     u64 last_ns;
>> +     u64 mask;
>> +     u32 mult;
>> +     u32 shift;
>> +};
>> +/**
>> + * persistent_clock - represent the persistent clock
>> + * @read_data: Data required to read from persistent clock.
>> + * @seq: Sequence counter for protecting updates.
>> + * @freq: The frequency of the persistent clock.
>> + * @wrap: Duration for persistent clock can run before wrapping.
>> + * @alarm: Update timeout for persistent clock wrap.
>> + * @alarm_inited: Indicate if the alarm has been initialized.
>> + */
>> +struct persistent_clock {
>> +     struct persistent_clock_read_data read_data;
>> +     seqcount_t seq;
>> +     u32 freq;
>> +     ktime_t wrap;
>> +     struct alarm alarm;
>> +     bool alarm_inited;
>> +};
>
> NAK!
>
> There is no reason to invent yet another set of data structures and yet
> more read functions with a sequence counter. which are just a bad and
> broken copy of the existing timekeeping/clocksource code. And of course the
> stuff is not serialized against multiple registrations, etc. etc.
>
> Plus the utter nonsense that any call site has to do the same thing over
> and over:
>
>     register():
>     start_alarm_timer();
>
> Why is this required in the first place? It's not at all. The only place
> where such an alarm timer will be required is when the system actually goes
> to suspend. Starting it at registration time is pointless and even counter
> productive. Assume the clocksource wraps every 2 hours. So you start it at
> boot time and after 119 minutes uptime the system suspends. So it will
> wakeup one minute later to update the clocksource. Heck no. If the timer is
> started when the machine actually suspends it will wake up earliest in 120
> minutes.
>
> And you even add that to the TSC which does not need it at all. It will
> wrap in about 400 years on a 2GHZ machine. So you degrade the functionality
> instead of improving it.
>
> So no, this is not going anywhere.
>
> Let's look at the problem itself:
>
>    You want to use one clocksource for timekeeping during runtime which is
>    fast and accurate and another one for suspend time injection which is
>    slower and/or less accurate because the fast one stops in suspend.
>
>    Plus you need an alarmtimer which makes sure that the clocksource does
>    not wrap around during suspend.
>
> Now lets look what we have already:
>
>    Both clocksources already exist and are registered as clocksources with
>    all required data in the clocksource core.
>
> Ergo the only sane and logical conclusion is to expand the existing
> infrastructure to handle that.
>
> When a clocksource is registered, then the registration function already
> makes decisions about using it as timekeeping clocksource. So add a few
> lines of code to check whether the newly registered clocksource is suitable
> and preferred for suspend.
>
>         if (!stops_in_suspend(newcs)) {
>                 if (!suspend_cs || is_preferred_suspend_cs(newcs))
>                         suspend_cs = newcs;
>         }
>
> The is_preferred_suspend_cs() can be based on rating, the maximum suspend
> length which can be achieved or whatever is sensible. It should start of as
> a very simple decision function based on rating and not an prematurely
> overengineered monstrosity.
>
> The suspend/resume() code needs a few very simple changes:
>
> xxx_suspend():
>         clocksource_prepare_suspend();
>
>   Note, this is _NOT_ timekeeping_suspend() because that is invoked _AFTER_
>   alarmtimer_suspend(). So if an alarm timer is required it needs to be
>   armed before that. A trivial solution might be to just call it from
>   alarmtimer_suspend(), but that a minor detail to worry about.
>
> timekeeping_suspend()
> {
>         clocksource_enter_suspend();
>         ...
>
> timekeeping_resume()
> {
> ...
>         if (clocksource_leave_suspend(&nsec)) {
>                 ts_delta = ns_to_timespec64(nsec);
>                 sleeptime_injected = true;
>         } else if (......
>
>
> Now lets look at the new functions:
>
> void clocksource_prepare_suspend(void)
> {
>         if (!suspend_cs)
>                 return;
>
>         if (needs_alarmtimer(suspend_cs))
>                 start_suspend_alarm(suspend_cs);
> }
>
> void clocksource_enter_suspend(void)
> {
>         if (!suspend_cs)
>                 return;
>         suspend_start = suspend_cs->read();
> }
>
> bool clocksource_leave_suspend(u64 *nsec)
> {
>         u64 now, delta;
>
>         if (!suspend_cs)
>                 return false;
>
>         now = suspend_cs->read();
>         delta = clocksource_delta(now, suspend_start, suspend_cs->mask);
>         *nsec = mul_u64_u32_shr(delta, suspend_cs->mult, suspend_cs->shift);
>         return true;
> }
>
> See? It does not need any of this totally nonsensical stuff in your
> registration function and not any new read functions and whatever, because
> it simply can use the bog standard mult/shift values. Why? Because the
> conversion above can cope with a full 64 bit * 32 bit multiply without
> falling apart. It's already there in timekeeping_resume() otherwise
> resuming with a NONSTOP TSC would result in bogus sleep times after a few
> minutes. It's slower than the normal clocksource conversion which is
> optimized for performance, but thats completely irrelevant on resume. This
> whole blurb about requiring separate mult/shift values is just plain
> drivel.
>
> Plus any reasonably broad clocksource will not need an alarmtimer at
> all. Because the only reason it is needed is when the clocksource itself
> wraps around. And that has absolutely nothing to do with mult/shift
> values. That just depends on the frequency and the bitwidth of the counter,
>
> So it does not need an update function either because in case of broad
> enough clocksources there is absolutely no need for update and in case of
> wrapping ones the alarmtimer brings it out of suspend on time. And because
> the only interesting thing is the delta between suspend and resume this is
> all a complete non issue.
>
> The clocksource core already has all the registration/unregistration
> functionality plus an interface to reconfigure the frequency, so
> clocksources can come and go and be reconfigured and all of this just
> works.
>
> Once the extra few lines of code are in place, then you can go and cleanup
> the existing mess of homebrewn interfaces and claim that this is
> consolidation and simplification.

I am very grateful for your suggestion, and I am sorry I made a stupid
mistake that I missed the mul_u64_u32_shr() function which can avoid
introducing extra mult/shift for suspend clocksource. I will use that
and follow your other suggestions too. Thanks again.

>
> <rant>
>
> What's wrong with you people? Didn't they teach you in school that the
> first thing to do is proper problem and code analysis? If they did not, go
> back to them and ask your money back,
>
> I'm really tired of these overengineered trainwrecks which are then
> advertised with bullshit marketing like the best invention since sliced
> bread. This might work in random corporates, but not here.
>
> </rant>
>
> Thanks,
>
>         tglx



---
Baolin Wang
Best Regards

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 1/8] time: Add persistent clock support
  2018-06-25 17:23       ` John Stultz
  (?)
@ 2018-06-26  2:12         ` Baolin Wang
  -1 siblings, 0 replies; 38+ messages in thread
From: Baolin Wang @ 2018-06-26  2:12 UTC (permalink / raw)
  To: John Stultz
  Cc: Thomas Gleixner, Daniel Lezcano, Arnd Bergmann, Tony Lindgren,
	aaro.koskinen, Russell King - ARM Linux, Mark Rutland,
	Marc Zyngier, Mark Brown, Paul E. McKenney, Miroslav Lichvar,
	Randy Dunlap, Kate Stewart, Greg KH, Philippe Ombredanne,
	Thierry Reding, Jon Hunter, Heiko Stübner, Linus Walleij

Hi John,

On 26 June 2018 at 01:23, John Stultz <john.stultz@linaro.org> wrote:
> On Sat, Jun 23, 2018 at 5:14 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
>> On Wed, 13 Jun 2018, Baolin Wang wrote:
>>> Moreover we can register the clocksource with CLOCK_SOURCE_SUSPEND_NONSTOP
>>> to be one persistent clock, then we can simplify the suspend/resume
>>> accounting by removing CLOCK_SOURCE_SUSPEND_NONSTOP timing. After that
>>> we can only compensate the OS time by persistent clock or RTC.
>>
>> That makes sense because it adds a gazillion lines of code and removes 5?
>> Not really,
>>
>>> +/**
>>> + * persistent_clock_read_data - data required to read persistent clock
>>> + * @read: Returns a cycle value from persistent clock.
>>> + * @last_cycles: Clock cycle value at last update.
>>> + * @last_ns: Time value (nanoseconds) at last update.
>>> + * @mask: Bitmask for two's complement subtraction of non 64bit clocks.
>>> + * @mult: Cycle to nanosecond multiplier.
>>> + * @shift: Cycle to nanosecond divisor.
>>> + */
>>> +struct persistent_clock_read_data {
>>> +     u64 (*read)(void);
>>> +     u64 last_cycles;
>>> +     u64 last_ns;
>>> +     u64 mask;
>>> +     u32 mult;
>>> +     u32 shift;
>>> +};
>>> +/**
>>> + * persistent_clock - represent the persistent clock
>>> + * @read_data: Data required to read from persistent clock.
>>> + * @seq: Sequence counter for protecting updates.
>>> + * @freq: The frequency of the persistent clock.
>>> + * @wrap: Duration for persistent clock can run before wrapping.
>>> + * @alarm: Update timeout for persistent clock wrap.
>>> + * @alarm_inited: Indicate if the alarm has been initialized.
>>> + */
>>> +struct persistent_clock {
>>> +     struct persistent_clock_read_data read_data;
>>> +     seqcount_t seq;
>>> +     u32 freq;
>>> +     ktime_t wrap;
>>> +     struct alarm alarm;
>>> +     bool alarm_inited;
>>> +};
>>
>> NAK!
>>
>> There is no reason to invent yet another set of data structures and yet
>> more read functions with a sequence counter. which are just a bad and
>> broken copy of the existing timekeeping/clocksource code. And of course the
>> stuff is not serialized against multiple registrations, etc. etc.
>>
>> Plus the utter nonsense that any call site has to do the same thing over
>> and over:
>>
>>     register():
>>     start_alarm_timer();
>>
>> Why is this required in the first place? It's not at all. The only place
>> where such an alarm timer will be required is when the system actually goes
>> to suspend. Starting it at registration time is pointless and even counter
>> productive. Assume the clocksource wraps every 2 hours. So you start it at
>> boot time and after 119 minutes uptime the system suspends. So it will
>> wakeup one minute later to update the clocksource. Heck no. If the timer is
>> started when the machine actually suspends it will wake up earliest in 120
>> minutes.
>>
>> And you even add that to the TSC which does not need it at all. It will
>> wrap in about 400 years on a 2GHZ machine. So you degrade the functionality
>> instead of improving it.
>>
>> So no, this is not going anywhere.
>>
>> Let's look at the problem itself:
>>
>>    You want to use one clocksource for timekeeping during runtime which is
>>    fast and accurate and another one for suspend time injection which is
>>    slower and/or less accurate because the fast one stops in suspend.
>>
>>    Plus you need an alarmtimer which makes sure that the clocksource does
>>    not wrap around during suspend.
>>
>> Now lets look what we have already:
>>
>>    Both clocksources already exist and are registered as clocksources with
>>    all required data in the clocksource core.
>>
>> Ergo the only sane and logical conclusion is to expand the existing
>> infrastructure to handle that.
>>
>> When a clocksource is registered, then the registration function already
>> makes decisions about using it as timekeeping clocksource. So add a few
>> lines of code to check whether the newly registered clocksource is suitable
>> and preferred for suspend.
>>
>>         if (!stops_in_suspend(newcs)) {
>>                 if (!suspend_cs || is_preferred_suspend_cs(newcs))
>>                         suspend_cs = newcs;
>>         }
>>
>> The is_preferred_suspend_cs() can be based on rating, the maximum suspend
>> length which can be achieved or whatever is sensible. It should start of as
>> a very simple decision function based on rating and not an prematurely
>> overengineered monstrosity.
>>
>> The suspend/resume() code needs a few very simple changes:
>>
>> xxx_suspend():
>>         clocksource_prepare_suspend();
>>
>>   Note, this is _NOT_ timekeeping_suspend() because that is invoked _AFTER_
>>   alarmtimer_suspend(). So if an alarm timer is required it needs to be
>>   armed before that. A trivial solution might be to just call it from
>>   alarmtimer_suspend(), but that a minor detail to worry about.
>>
>> timekeeping_suspend()
>> {
>>         clocksource_enter_suspend();
>>         ...
>>
>> timekeeping_resume()
>> {
>> ...
>>         if (clocksource_leave_suspend(&nsec)) {
>>                 ts_delta = ns_to_timespec64(nsec);
>>                 sleeptime_injected = true;
>>         } else if (......
>>
>>
>> Now lets look at the new functions:
>>
>> void clocksource_prepare_suspend(void)
>> {
>>         if (!suspend_cs)
>>                 return;
>>
>>         if (needs_alarmtimer(suspend_cs))
>>                 start_suspend_alarm(suspend_cs);
>> }
>>
>> void clocksource_enter_suspend(void)
>> {
>>         if (!suspend_cs)
>>                 return;
>>         suspend_start = suspend_cs->read();
>> }
>>
>> bool clocksource_leave_suspend(u64 *nsec)
>> {
>>         u64 now, delta;
>>
>>         if (!suspend_cs)
>>                 return false;
>>
>>         now = suspend_cs->read();
>>         delta = clocksource_delta(now, suspend_start, suspend_cs->mask);
>>         *nsec = mul_u64_u32_shr(delta, suspend_cs->mult, suspend_cs->shift);
>>         return true;
>> }
>>
>> See? It does not need any of this totally nonsensical stuff in your
>> registration function and not any new read functions and whatever, because
>> it simply can use the bog standard mult/shift values. Why? Because the
>> conversion above can cope with a full 64 bit * 32 bit multiply without
>> falling apart. It's already there in timekeeping_resume() otherwise
>> resuming with a NONSTOP TSC would result in bogus sleep times after a few
>> minutes. It's slower than the normal clocksource conversion which is
>> optimized for performance, but thats completely irrelevant on resume. This
>> whole blurb about requiring separate mult/shift values is just plain
>> drivel.
>>
>> Plus any reasonably broad clocksource will not need an alarmtimer at
>> all. Because the only reason it is needed is when the clocksource itself
>> wraps around. And that has absolutely nothing to do with mult/shift
>> values. That just depends on the frequency and the bitwidth of the counter,
>>
>> So it does not need an update function either because in case of broad
>> enough clocksources there is absolutely no need for update and in case of
>> wrapping ones the alarmtimer brings it out of suspend on time. And because
>> the only interesting thing is the delta between suspend and resume this is
>> all a complete non issue.
>>
>> The clocksource core already has all the registration/unregistration
>> functionality plus an interface to reconfigure the frequency, so
>> clocksources can come and go and be reconfigured and all of this just
>> works.
>>
>> Once the extra few lines of code are in place, then you can go and cleanup
>> the existing mess of homebrewn interfaces and claim that this is
>> consolidation and simplification.
>>
>> <rant>
>>
>> What's wrong with you people? Didn't they teach you in school that the
>> first thing to do is proper problem and code analysis? If they did not, go
>> back to them and ask your money back,
>>
>> I'm really tired of these overengineered trainwrecks which are then
>> advertised with bullshit marketing like the best invention since sliced
>> bread. This might work in random corporates, but not here.
>>
>> </rant>
>
> Thomas, what is wrong with *you*?  This is completely unnecessary.
>
> First of all, I sent Baolin on this approach, as I was getting sick of
> seeing the suspend/resume timing continually adding extra warts and
> complexity to handle odd hardware that needs some solution. So I
> suggested he clean that up in a way that lets these multiple ways of
> solving the problem be done in device specific code w/o adding more
> complexity to the core logic - in a fashion how the clocksources
> allowed us to support nice fast/well-behaving hardware w/o preventing
> ugly-but-neccessary solutions like the pit clocksource.
>
> Now, I've also been quite a poor maintainer of late, and haven't done
> much to help with pre-review of Baloin's code.  So that's on me, not
> him.
>
> Admittedly, my taste isn't always great, so there is likely to be a
> better approach, and having your guidance here is *really* valued. I
> just wish we could get it without all this unnecessary vinegar.
>
> I was really lucky to have your guidance and support when I was
> starting in the community. Your helping bring me up as a co-maintainer
> (only a a relatively small set of responsibility compared to your much
> wider maintainership), does let me see that having the responsibility
> of billions of devices running your code is immense and comes with no
> small amount of stress. Juggling the infinite incoming stream of
> review requests on naieve or otherwise lacking code with other
> important development priorities is a major burden, so *I really get
> how frustrating it is*.
>
> And its super annoying to have lots of short-term development requests
> being thrown at you when you're the one who has to maintain it for the
> long term.  But long-long-term, no one is going to be a maintainer
> forever, and we are not going to develop more olympic swimmers if we
> keep peeing in the pool.
>
>
> Baolin: My apologies for leading you poorly here. Work on something
> else for a day to let the flames burn off you, then review Thomas'
> technical feedback, ignoring the harsh style, and try to address his
> technical comments with the approach. I'll try to do better in finding
> time to review your work.

Thanks John, I will follow Thomas' suggestion and re-implement the approach.

---
Baolin Wang
Best Regards

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 1/8] time: Add persistent clock support
@ 2018-06-26  2:12         ` Baolin Wang
  0 siblings, 0 replies; 38+ messages in thread
From: Baolin Wang @ 2018-06-26  2:12 UTC (permalink / raw)
  To: John Stultz
  Cc: Thomas Gleixner, Daniel Lezcano, Arnd Bergmann, Tony Lindgren,
	aaro.koskinen, Russell King - ARM Linux, Mark Rutland,
	Marc Zyngier, Mark Brown, Paul E. McKenney, Miroslav Lichvar,
	Randy Dunlap, Kate Stewart, Greg KH, Philippe Ombredanne,
	Thierry Reding, Jon Hunter, Heiko Stübner, Linus Walleij,
	Viresh Kumar, Ingo Molnar, H. Peter Anvin, Peter Zijlstra,
	douly.fnst, Len Brown, rajvi.jingar, Alexandre Belloni, x86,
	Linux ARM, linux-tegra, lkml, linux-omap

Hi John,

On 26 June 2018 at 01:23, John Stultz <john.stultz@linaro.org> wrote:
> On Sat, Jun 23, 2018 at 5:14 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
>> On Wed, 13 Jun 2018, Baolin Wang wrote:
>>> Moreover we can register the clocksource with CLOCK_SOURCE_SUSPEND_NONSTOP
>>> to be one persistent clock, then we can simplify the suspend/resume
>>> accounting by removing CLOCK_SOURCE_SUSPEND_NONSTOP timing. After that
>>> we can only compensate the OS time by persistent clock or RTC.
>>
>> That makes sense because it adds a gazillion lines of code and removes 5?
>> Not really,
>>
>>> +/**
>>> + * persistent_clock_read_data - data required to read persistent clock
>>> + * @read: Returns a cycle value from persistent clock.
>>> + * @last_cycles: Clock cycle value at last update.
>>> + * @last_ns: Time value (nanoseconds) at last update.
>>> + * @mask: Bitmask for two's complement subtraction of non 64bit clocks.
>>> + * @mult: Cycle to nanosecond multiplier.
>>> + * @shift: Cycle to nanosecond divisor.
>>> + */
>>> +struct persistent_clock_read_data {
>>> +     u64 (*read)(void);
>>> +     u64 last_cycles;
>>> +     u64 last_ns;
>>> +     u64 mask;
>>> +     u32 mult;
>>> +     u32 shift;
>>> +};
>>> +/**
>>> + * persistent_clock - represent the persistent clock
>>> + * @read_data: Data required to read from persistent clock.
>>> + * @seq: Sequence counter for protecting updates.
>>> + * @freq: The frequency of the persistent clock.
>>> + * @wrap: Duration for persistent clock can run before wrapping.
>>> + * @alarm: Update timeout for persistent clock wrap.
>>> + * @alarm_inited: Indicate if the alarm has been initialized.
>>> + */
>>> +struct persistent_clock {
>>> +     struct persistent_clock_read_data read_data;
>>> +     seqcount_t seq;
>>> +     u32 freq;
>>> +     ktime_t wrap;
>>> +     struct alarm alarm;
>>> +     bool alarm_inited;
>>> +};
>>
>> NAK!
>>
>> There is no reason to invent yet another set of data structures and yet
>> more read functions with a sequence counter. which are just a bad and
>> broken copy of the existing timekeeping/clocksource code. And of course the
>> stuff is not serialized against multiple registrations, etc. etc.
>>
>> Plus the utter nonsense that any call site has to do the same thing over
>> and over:
>>
>>     register():
>>     start_alarm_timer();
>>
>> Why is this required in the first place? It's not at all. The only place
>> where such an alarm timer will be required is when the system actually goes
>> to suspend. Starting it at registration time is pointless and even counter
>> productive. Assume the clocksource wraps every 2 hours. So you start it at
>> boot time and after 119 minutes uptime the system suspends. So it will
>> wakeup one minute later to update the clocksource. Heck no. If the timer is
>> started when the machine actually suspends it will wake up earliest in 120
>> minutes.
>>
>> And you even add that to the TSC which does not need it at all. It will
>> wrap in about 400 years on a 2GHZ machine. So you degrade the functionality
>> instead of improving it.
>>
>> So no, this is not going anywhere.
>>
>> Let's look at the problem itself:
>>
>>    You want to use one clocksource for timekeeping during runtime which is
>>    fast and accurate and another one for suspend time injection which is
>>    slower and/or less accurate because the fast one stops in suspend.
>>
>>    Plus you need an alarmtimer which makes sure that the clocksource does
>>    not wrap around during suspend.
>>
>> Now lets look what we have already:
>>
>>    Both clocksources already exist and are registered as clocksources with
>>    all required data in the clocksource core.
>>
>> Ergo the only sane and logical conclusion is to expand the existing
>> infrastructure to handle that.
>>
>> When a clocksource is registered, then the registration function already
>> makes decisions about using it as timekeeping clocksource. So add a few
>> lines of code to check whether the newly registered clocksource is suitable
>> and preferred for suspend.
>>
>>         if (!stops_in_suspend(newcs)) {
>>                 if (!suspend_cs || is_preferred_suspend_cs(newcs))
>>                         suspend_cs = newcs;
>>         }
>>
>> The is_preferred_suspend_cs() can be based on rating, the maximum suspend
>> length which can be achieved or whatever is sensible. It should start of as
>> a very simple decision function based on rating and not an prematurely
>> overengineered monstrosity.
>>
>> The suspend/resume() code needs a few very simple changes:
>>
>> xxx_suspend():
>>         clocksource_prepare_suspend();
>>
>>   Note, this is _NOT_ timekeeping_suspend() because that is invoked _AFTER_
>>   alarmtimer_suspend(). So if an alarm timer is required it needs to be
>>   armed before that. A trivial solution might be to just call it from
>>   alarmtimer_suspend(), but that a minor detail to worry about.
>>
>> timekeeping_suspend()
>> {
>>         clocksource_enter_suspend();
>>         ...
>>
>> timekeeping_resume()
>> {
>> ...
>>         if (clocksource_leave_suspend(&nsec)) {
>>                 ts_delta = ns_to_timespec64(nsec);
>>                 sleeptime_injected = true;
>>         } else if (......
>>
>>
>> Now lets look at the new functions:
>>
>> void clocksource_prepare_suspend(void)
>> {
>>         if (!suspend_cs)
>>                 return;
>>
>>         if (needs_alarmtimer(suspend_cs))
>>                 start_suspend_alarm(suspend_cs);
>> }
>>
>> void clocksource_enter_suspend(void)
>> {
>>         if (!suspend_cs)
>>                 return;
>>         suspend_start = suspend_cs->read();
>> }
>>
>> bool clocksource_leave_suspend(u64 *nsec)
>> {
>>         u64 now, delta;
>>
>>         if (!suspend_cs)
>>                 return false;
>>
>>         now = suspend_cs->read();
>>         delta = clocksource_delta(now, suspend_start, suspend_cs->mask);
>>         *nsec = mul_u64_u32_shr(delta, suspend_cs->mult, suspend_cs->shift);
>>         return true;
>> }
>>
>> See? It does not need any of this totally nonsensical stuff in your
>> registration function and not any new read functions and whatever, because
>> it simply can use the bog standard mult/shift values. Why? Because the
>> conversion above can cope with a full 64 bit * 32 bit multiply without
>> falling apart. It's already there in timekeeping_resume() otherwise
>> resuming with a NONSTOP TSC would result in bogus sleep times after a few
>> minutes. It's slower than the normal clocksource conversion which is
>> optimized for performance, but thats completely irrelevant on resume. This
>> whole blurb about requiring separate mult/shift values is just plain
>> drivel.
>>
>> Plus any reasonably broad clocksource will not need an alarmtimer at
>> all. Because the only reason it is needed is when the clocksource itself
>> wraps around. And that has absolutely nothing to do with mult/shift
>> values. That just depends on the frequency and the bitwidth of the counter,
>>
>> So it does not need an update function either because in case of broad
>> enough clocksources there is absolutely no need for update and in case of
>> wrapping ones the alarmtimer brings it out of suspend on time. And because
>> the only interesting thing is the delta between suspend and resume this is
>> all a complete non issue.
>>
>> The clocksource core already has all the registration/unregistration
>> functionality plus an interface to reconfigure the frequency, so
>> clocksources can come and go and be reconfigured and all of this just
>> works.
>>
>> Once the extra few lines of code are in place, then you can go and cleanup
>> the existing mess of homebrewn interfaces and claim that this is
>> consolidation and simplification.
>>
>> <rant>
>>
>> What's wrong with you people? Didn't they teach you in school that the
>> first thing to do is proper problem and code analysis? If they did not, go
>> back to them and ask your money back,
>>
>> I'm really tired of these overengineered trainwrecks which are then
>> advertised with bullshit marketing like the best invention since sliced
>> bread. This might work in random corporates, but not here.
>>
>> </rant>
>
> Thomas, what is wrong with *you*?  This is completely unnecessary.
>
> First of all, I sent Baolin on this approach, as I was getting sick of
> seeing the suspend/resume timing continually adding extra warts and
> complexity to handle odd hardware that needs some solution. So I
> suggested he clean that up in a way that lets these multiple ways of
> solving the problem be done in device specific code w/o adding more
> complexity to the core logic - in a fashion how the clocksources
> allowed us to support nice fast/well-behaving hardware w/o preventing
> ugly-but-neccessary solutions like the pit clocksource.
>
> Now, I've also been quite a poor maintainer of late, and haven't done
> much to help with pre-review of Baloin's code.  So that's on me, not
> him.
>
> Admittedly, my taste isn't always great, so there is likely to be a
> better approach, and having your guidance here is *really* valued. I
> just wish we could get it without all this unnecessary vinegar.
>
> I was really lucky to have your guidance and support when I was
> starting in the community. Your helping bring me up as a co-maintainer
> (only a a relatively small set of responsibility compared to your much
> wider maintainership), does let me see that having the responsibility
> of billions of devices running your code is immense and comes with no
> small amount of stress. Juggling the infinite incoming stream of
> review requests on naieve or otherwise lacking code with other
> important development priorities is a major burden, so *I really get
> how frustrating it is*.
>
> And its super annoying to have lots of short-term development requests
> being thrown at you when you're the one who has to maintain it for the
> long term.  But long-long-term, no one is going to be a maintainer
> forever, and we are not going to develop more olympic swimmers if we
> keep peeing in the pool.
>
>
> Baolin: My apologies for leading you poorly here. Work on something
> else for a day to let the flames burn off you, then review Thomas'
> technical feedback, ignoring the harsh style, and try to address his
> technical comments with the approach. I'll try to do better in finding
> time to review your work.

Thanks John, I will follow Thomas' suggestion and re-implement the approach.

---
Baolin Wang
Best Regards

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH 1/8] time: Add persistent clock support
@ 2018-06-26  2:12         ` Baolin Wang
  0 siblings, 0 replies; 38+ messages in thread
From: Baolin Wang @ 2018-06-26  2:12 UTC (permalink / raw)
  To: linux-arm-kernel

Hi John,

On 26 June 2018 at 01:23, John Stultz <john.stultz@linaro.org> wrote:
> On Sat, Jun 23, 2018 at 5:14 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
>> On Wed, 13 Jun 2018, Baolin Wang wrote:
>>> Moreover we can register the clocksource with CLOCK_SOURCE_SUSPEND_NONSTOP
>>> to be one persistent clock, then we can simplify the suspend/resume
>>> accounting by removing CLOCK_SOURCE_SUSPEND_NONSTOP timing. After that
>>> we can only compensate the OS time by persistent clock or RTC.
>>
>> That makes sense because it adds a gazillion lines of code and removes 5?
>> Not really,
>>
>>> +/**
>>> + * persistent_clock_read_data - data required to read persistent clock
>>> + * @read: Returns a cycle value from persistent clock.
>>> + * @last_cycles: Clock cycle value at last update.
>>> + * @last_ns: Time value (nanoseconds) at last update.
>>> + * @mask: Bitmask for two's complement subtraction of non 64bit clocks.
>>> + * @mult: Cycle to nanosecond multiplier.
>>> + * @shift: Cycle to nanosecond divisor.
>>> + */
>>> +struct persistent_clock_read_data {
>>> +     u64 (*read)(void);
>>> +     u64 last_cycles;
>>> +     u64 last_ns;
>>> +     u64 mask;
>>> +     u32 mult;
>>> +     u32 shift;
>>> +};
>>> +/**
>>> + * persistent_clock - represent the persistent clock
>>> + * @read_data: Data required to read from persistent clock.
>>> + * @seq: Sequence counter for protecting updates.
>>> + * @freq: The frequency of the persistent clock.
>>> + * @wrap: Duration for persistent clock can run before wrapping.
>>> + * @alarm: Update timeout for persistent clock wrap.
>>> + * @alarm_inited: Indicate if the alarm has been initialized.
>>> + */
>>> +struct persistent_clock {
>>> +     struct persistent_clock_read_data read_data;
>>> +     seqcount_t seq;
>>> +     u32 freq;
>>> +     ktime_t wrap;
>>> +     struct alarm alarm;
>>> +     bool alarm_inited;
>>> +};
>>
>> NAK!
>>
>> There is no reason to invent yet another set of data structures and yet
>> more read functions with a sequence counter. which are just a bad and
>> broken copy of the existing timekeeping/clocksource code. And of course the
>> stuff is not serialized against multiple registrations, etc. etc.
>>
>> Plus the utter nonsense that any call site has to do the same thing over
>> and over:
>>
>>     register():
>>     start_alarm_timer();
>>
>> Why is this required in the first place? It's not at all. The only place
>> where such an alarm timer will be required is when the system actually goes
>> to suspend. Starting it at registration time is pointless and even counter
>> productive. Assume the clocksource wraps every 2 hours. So you start it at
>> boot time and after 119 minutes uptime the system suspends. So it will
>> wakeup one minute later to update the clocksource. Heck no. If the timer is
>> started when the machine actually suspends it will wake up earliest in 120
>> minutes.
>>
>> And you even add that to the TSC which does not need it at all. It will
>> wrap in about 400 years on a 2GHZ machine. So you degrade the functionality
>> instead of improving it.
>>
>> So no, this is not going anywhere.
>>
>> Let's look at the problem itself:
>>
>>    You want to use one clocksource for timekeeping during runtime which is
>>    fast and accurate and another one for suspend time injection which is
>>    slower and/or less accurate because the fast one stops in suspend.
>>
>>    Plus you need an alarmtimer which makes sure that the clocksource does
>>    not wrap around during suspend.
>>
>> Now lets look what we have already:
>>
>>    Both clocksources already exist and are registered as clocksources with
>>    all required data in the clocksource core.
>>
>> Ergo the only sane and logical conclusion is to expand the existing
>> infrastructure to handle that.
>>
>> When a clocksource is registered, then the registration function already
>> makes decisions about using it as timekeeping clocksource. So add a few
>> lines of code to check whether the newly registered clocksource is suitable
>> and preferred for suspend.
>>
>>         if (!stops_in_suspend(newcs)) {
>>                 if (!suspend_cs || is_preferred_suspend_cs(newcs))
>>                         suspend_cs = newcs;
>>         }
>>
>> The is_preferred_suspend_cs() can be based on rating, the maximum suspend
>> length which can be achieved or whatever is sensible. It should start of as
>> a very simple decision function based on rating and not an prematurely
>> overengineered monstrosity.
>>
>> The suspend/resume() code needs a few very simple changes:
>>
>> xxx_suspend():
>>         clocksource_prepare_suspend();
>>
>>   Note, this is _NOT_ timekeeping_suspend() because that is invoked _AFTER_
>>   alarmtimer_suspend(). So if an alarm timer is required it needs to be
>>   armed before that. A trivial solution might be to just call it from
>>   alarmtimer_suspend(), but that a minor detail to worry about.
>>
>> timekeeping_suspend()
>> {
>>         clocksource_enter_suspend();
>>         ...
>>
>> timekeeping_resume()
>> {
>> ...
>>         if (clocksource_leave_suspend(&nsec)) {
>>                 ts_delta = ns_to_timespec64(nsec);
>>                 sleeptime_injected = true;
>>         } else if (......
>>
>>
>> Now lets look at the new functions:
>>
>> void clocksource_prepare_suspend(void)
>> {
>>         if (!suspend_cs)
>>                 return;
>>
>>         if (needs_alarmtimer(suspend_cs))
>>                 start_suspend_alarm(suspend_cs);
>> }
>>
>> void clocksource_enter_suspend(void)
>> {
>>         if (!suspend_cs)
>>                 return;
>>         suspend_start = suspend_cs->read();
>> }
>>
>> bool clocksource_leave_suspend(u64 *nsec)
>> {
>>         u64 now, delta;
>>
>>         if (!suspend_cs)
>>                 return false;
>>
>>         now = suspend_cs->read();
>>         delta = clocksource_delta(now, suspend_start, suspend_cs->mask);
>>         *nsec = mul_u64_u32_shr(delta, suspend_cs->mult, suspend_cs->shift);
>>         return true;
>> }
>>
>> See? It does not need any of this totally nonsensical stuff in your
>> registration function and not any new read functions and whatever, because
>> it simply can use the bog standard mult/shift values. Why? Because the
>> conversion above can cope with a full 64 bit * 32 bit multiply without
>> falling apart. It's already there in timekeeping_resume() otherwise
>> resuming with a NONSTOP TSC would result in bogus sleep times after a few
>> minutes. It's slower than the normal clocksource conversion which is
>> optimized for performance, but thats completely irrelevant on resume. This
>> whole blurb about requiring separate mult/shift values is just plain
>> drivel.
>>
>> Plus any reasonably broad clocksource will not need an alarmtimer at
>> all. Because the only reason it is needed is when the clocksource itself
>> wraps around. And that has absolutely nothing to do with mult/shift
>> values. That just depends on the frequency and the bitwidth of the counter,
>>
>> So it does not need an update function either because in case of broad
>> enough clocksources there is absolutely no need for update and in case of
>> wrapping ones the alarmtimer brings it out of suspend on time. And because
>> the only interesting thing is the delta between suspend and resume this is
>> all a complete non issue.
>>
>> The clocksource core already has all the registration/unregistration
>> functionality plus an interface to reconfigure the frequency, so
>> clocksources can come and go and be reconfigured and all of this just
>> works.
>>
>> Once the extra few lines of code are in place, then you can go and cleanup
>> the existing mess of homebrewn interfaces and claim that this is
>> consolidation and simplification.
>>
>> <rant>
>>
>> What's wrong with you people? Didn't they teach you in school that the
>> first thing to do is proper problem and code analysis? If they did not, go
>> back to them and ask your money back,
>>
>> I'm really tired of these overengineered trainwrecks which are then
>> advertised with bullshit marketing like the best invention since sliced
>> bread. This might work in random corporates, but not here.
>>
>> </rant>
>
> Thomas, what is wrong with *you*?  This is completely unnecessary.
>
> First of all, I sent Baolin on this approach, as I was getting sick of
> seeing the suspend/resume timing continually adding extra warts and
> complexity to handle odd hardware that needs some solution. So I
> suggested he clean that up in a way that lets these multiple ways of
> solving the problem be done in device specific code w/o adding more
> complexity to the core logic - in a fashion how the clocksources
> allowed us to support nice fast/well-behaving hardware w/o preventing
> ugly-but-neccessary solutions like the pit clocksource.
>
> Now, I've also been quite a poor maintainer of late, and haven't done
> much to help with pre-review of Baloin's code.  So that's on me, not
> him.
>
> Admittedly, my taste isn't always great, so there is likely to be a
> better approach, and having your guidance here is *really* valued. I
> just wish we could get it without all this unnecessary vinegar.
>
> I was really lucky to have your guidance and support when I was
> starting in the community. Your helping bring me up as a co-maintainer
> (only a a relatively small set of responsibility compared to your much
> wider maintainership), does let me see that having the responsibility
> of billions of devices running your code is immense and comes with no
> small amount of stress. Juggling the infinite incoming stream of
> review requests on naieve or otherwise lacking code with other
> important development priorities is a major burden, so *I really get
> how frustrating it is*.
>
> And its super annoying to have lots of short-term development requests
> being thrown at you when you're the one who has to maintain it for the
> long term.  But long-long-term, no one is going to be a maintainer
> forever, and we are not going to develop more olympic swimmers if we
> keep peeing in the pool.
>
>
> Baolin: My apologies for leading you poorly here. Work on something
> else for a day to let the flames burn off you, then review Thomas'
> technical feedback, ignoring the harsh style, and try to address his
> technical comments with the approach. I'll try to do better in finding
> time to review your work.

Thanks John, I will follow Thomas' suggestion and re-implement the approach.

---
Baolin Wang
Best Regards

^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2018-06-26  2:12 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-13 11:32 [PATCH 0/8] Add persistent clock support Baolin Wang
2018-06-13 11:32 ` Baolin Wang
2018-06-13 11:32 ` [PATCH 1/8] time: " Baolin Wang
2018-06-13 11:32   ` Baolin Wang
2018-06-24  0:14   ` Thomas Gleixner
2018-06-24  0:14     ` Thomas Gleixner
2018-06-24  0:14     ` Thomas Gleixner
2018-06-25 16:18     ` Thomas Gleixner
2018-06-25 16:18       ` Thomas Gleixner
2018-06-25 17:23     ` John Stultz
2018-06-25 17:23       ` John Stultz
2018-06-25 17:23       ` John Stultz
2018-06-25 20:06       ` Thomas Gleixner
2018-06-25 20:06         ` Thomas Gleixner
2018-06-25 20:06         ` Thomas Gleixner
2018-06-26  2:12       ` Baolin Wang
2018-06-26  2:12         ` Baolin Wang
2018-06-26  2:12         ` Baolin Wang
2018-06-26  2:08     ` Baolin Wang
2018-06-26  2:08       ` Baolin Wang
2018-06-26  2:08       ` Baolin Wang
2018-06-13 11:32 ` [PATCH 2/8] clocksource: sprd: Add one persistent timer for Spreadtrum platform Baolin Wang
2018-06-13 11:32   ` Baolin Wang
2018-06-13 11:32 ` [PATCH 3/8] arm: time: Remove the persistent clock support for ARM architecture Baolin Wang
2018-06-13 11:32   ` Baolin Wang
2018-06-24  0:21   ` Thomas Gleixner
2018-06-24  0:21     ` Thomas Gleixner
2018-06-24  0:21     ` Thomas Gleixner
2018-06-13 11:32 ` [PATCH 4/8] clocksource: arm_arch_timer: Register the persistent clock Baolin Wang
2018-06-13 11:32   ` Baolin Wang
2018-06-13 11:32 ` [PATCH 5/8] clocksource: timer-ti-32k: " Baolin Wang
2018-06-13 11:32   ` Baolin Wang
2018-06-13 11:32 ` [PATCH 6/8] clocksource: time-pistachio: " Baolin Wang
2018-06-13 11:32   ` Baolin Wang
2018-06-13 11:32 ` [PATCH 7/8] x86: tsc: " Baolin Wang
2018-06-13 11:32   ` Baolin Wang
2018-06-13 11:32 ` [PATCH 8/8] time: timekeeping: Remove time compensating from nonstop clocksources Baolin Wang
2018-06-13 11:32   ` Baolin Wang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.