All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC v5 0/6] Patchset enabling hardware based cross-timestamps for next gen Intel platforms
@ 2016-01-04 12:45 ` Christopher S. Hall
  0 siblings, 0 replies; 38+ messages in thread
From: Christopher S. Hall @ 2016-01-04 12:45 UTC (permalink / raw)
  To: tglx, richardcochran, mingo, john.stultz, hpa, jeffrey.t.kirsher
  Cc: Christopher S. Hall, x86, linux-kernel, intel-wired-lan, netdev,
	kevin.b.stanton

Modern Intel hardware adds an Always Running Timer (ART) that allows the
network and audio device clocks to precisely cross timestamp the device
clock with the system clock. This allows a precise correlation of the
device time and system time.

This patchset adds interfaces to the timekeeping code allowing drivers
to translate ART time to system time.

Changelog:

Changes from v4 to v5:

*	Changes the history mechanism to interpolate system time using
	a single historic system time pair (monotonic raw, realtime)
	rather than implementing a precise history using shadow
	timekeeper (see v4 changes). The advantage of this approach is
	that the history can be arbitrarily long. This approach may
	also be simpler in terms of coding. The major disadvantage is
	that the realtime clock can be adjusted.  When adjusted, the
	realtime clock time (when interpolating from history) is
	always approximate. In general, the longer the interpolation
	period the larger the potential error. There isn't any error
	interpolating the monotonic raw clock time.
*	This patchset also addresses objections to the previous
	patchsets overly complex correlated timestamp structure. This
	patchset splits that structure into several smaller
	structures.  The correlated timestamp interface is renamed
	cross timestamp to avoid any confusion with the correlated
	clocksource.
*	The correlated clocksource is separated from the cross
	timestamp mechanism.
*	Add monotonic raw to the PTP user interface
*	Add e1000e driver configuration option that wraps Intel PCH
	specific code

Changes v3 to v4: 

*	Adds a history mechanism to accomodate slower devices. In this
	case the response time for timestamp reads to the Intel DSP
	are too slow to be accomodated by the original correlated time
	mechanism. The history mechanism turns shadow timekeeper into
	an array where the history is stored.

Christopher S. Hall (6):
  Timekeeping cross timestamp interface for device drivers
  Always Running Timer (ART) correlated clocksource
  Add history to cross timestamp interface supporting slower devices
  Remove duplicate code from ktime_get_raw_and_real code
  Add PTP_SYS_OFFSET_PRECISE for driver crosstimestamping
  Adds hardware supported cross timestamp

 Documentation/ptp/testptp.c                 |   6 +-
 arch/x86/include/asm/cpufeature.h           |   2 +-
 arch/x86/include/asm/tsc.h                  |   2 +
 arch/x86/kernel/tsc.c                       |  46 ++++++
 drivers/net/ethernet/intel/Kconfig          |   9 ++
 drivers/net/ethernet/intel/e1000e/defines.h |   5 +
 drivers/net/ethernet/intel/e1000e/ptp.c     | 92 +++++++++++++++++++++++++++++
 drivers/net/ethernet/intel/e1000e/regs.h    |   4 +
 drivers/ptp/ptp_chardev.c                   |  31 ++++
 include/linux/clocksource.h                 |  43 +++++
 include/linux/ptp_clock_kernel.h            |   8 +
 include/linux/timekeeper_internal.h         |   4 +
 include/linux/timekeeping.h                 |  42 +++++
 include/uapi/linux/ptp_clock.h              |  13 +-
 kernel/time/timekeeping.c                   | 243 ++++++++++++++++++++++++++--
 15 files changed, 521 insertions(+), 21 deletions(-)

-- 
2.1.4


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [Intel-wired-lan] [RFC v5 0/6] Patchset enabling hardware based cross-timestamps for next gen Intel platforms
@ 2016-01-04 12:45 ` Christopher S. Hall
  0 siblings, 0 replies; 38+ messages in thread
From: Christopher S. Hall @ 2016-01-04 12:45 UTC (permalink / raw)
  To: intel-wired-lan

Modern Intel hardware adds an Always Running Timer (ART) that allows the
network and audio device clocks to precisely cross timestamp the device
clock with the system clock. This allows a precise correlation of the
device time and system time.

This patchset adds interfaces to the timekeeping code allowing drivers
to translate ART time to system time.

Changelog:

Changes from v4 to v5:

*	Changes the history mechanism to interpolate system time using
	a single historic system time pair (monotonic raw, realtime)
	rather than implementing a precise history using shadow
	timekeeper (see v4 changes). The advantage of this approach is
	that the history can be arbitrarily long. This approach may
	also be simpler in terms of coding. The major disadvantage is
	that the realtime clock can be adjusted.  When adjusted, the
	realtime clock time (when interpolating from history) is
	always approximate. In general, the longer the interpolation
	period the larger the potential error. There isn't any error
	interpolating the monotonic raw clock time.
*	This patchset also addresses objections to the previous
	patchsets overly complex correlated timestamp structure. This
	patchset splits that structure into several smaller
	structures.  The correlated timestamp interface is renamed
	cross timestamp to avoid any confusion with the correlated
	clocksource.
*	The correlated clocksource is separated from the cross
	timestamp mechanism.
*	Add monotonic raw to the PTP user interface
*	Add e1000e driver configuration option that wraps Intel PCH
	specific code

Changes v3 to v4: 

*	Adds a history mechanism to accomodate slower devices. In this
	case the response time for timestamp reads to the Intel DSP
	are too slow to be accomodated by the original correlated time
	mechanism. The history mechanism turns shadow timekeeper into
	an array where the history is stored.

Christopher S. Hall (6):
  Timekeeping cross timestamp interface for device drivers
  Always Running Timer (ART) correlated clocksource
  Add history to cross timestamp interface supporting slower devices
  Remove duplicate code from ktime_get_raw_and_real code
  Add PTP_SYS_OFFSET_PRECISE for driver crosstimestamping
  Adds hardware supported cross timestamp

 Documentation/ptp/testptp.c                 |   6 +-
 arch/x86/include/asm/cpufeature.h           |   2 +-
 arch/x86/include/asm/tsc.h                  |   2 +
 arch/x86/kernel/tsc.c                       |  46 ++++++
 drivers/net/ethernet/intel/Kconfig          |   9 ++
 drivers/net/ethernet/intel/e1000e/defines.h |   5 +
 drivers/net/ethernet/intel/e1000e/ptp.c     | 92 +++++++++++++++++++++++++++++
 drivers/net/ethernet/intel/e1000e/regs.h    |   4 +
 drivers/ptp/ptp_chardev.c                   |  31 ++++
 include/linux/clocksource.h                 |  43 +++++
 include/linux/ptp_clock_kernel.h            |   8 +
 include/linux/timekeeper_internal.h         |   4 +
 include/linux/timekeeping.h                 |  42 +++++
 include/uapi/linux/ptp_clock.h              |  13 +-
 kernel/time/timekeeping.c                   | 243 ++++++++++++++++++++++++++--
 15 files changed, 521 insertions(+), 21 deletions(-)

-- 
2.1.4


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [RFC v5 1/6] Timekeeping cross timestamp interface for device drivers
  2016-01-04 12:45 ` [Intel-wired-lan] " Christopher S. Hall
@ 2016-01-04 12:45   ` Christopher S. Hall
  -1 siblings, 0 replies; 38+ messages in thread
From: Christopher S. Hall @ 2016-01-04 12:45 UTC (permalink / raw)
  To: tglx, richardcochran, mingo, john.stultz, hpa, jeffrey.t.kirsher
  Cc: Christopher S. Hall, x86, linux-kernel, intel-wired-lan, netdev,
	kevin.b.stanton

ACKNOWLEDGMENT: The original correlated clock source and cross
timestamp code was developed by Thomas Gleixner
<tglx@linutronix.de>. It has changed considerably and any mistakes are
mine.

The precision with which events on multiple networked systems can be
synchronized using, as an example, PTP (IEEE 1588, 802.1AS) is limited
by the precision of the cross timestamps between the system clock and
the device (timestamp) clock. Precision here is the degree of
simultaneity when capturing the cross timestamp.

Currently the PTP cross timestamp is captured in software using the
PTP device driver ioctl PTP_SYS_OFFSET. Reads of the device clock are
interleaved with reads of the realtime clock. At best, the precision
of this cross timestamp is on the order of several microseconds due to
software latencies. Sub-microsecond precision is required for
industrial control and some media applications. To achieve this level
of precision hardware supported cross timestamping is needed.

Hardware cross timestamps are derived from simultaneously capturing
the device clock and the system clock. Applications use timestamps in
nanoseconds rather than clock ticks. The device driver can scale
device clock ticks to device time in nanoseconds, but cannot transform
system clock ticks. Only the kernel timekeeping code can do this. The
function get_device_system_crosststamp() allows device drivers to
return a cross timestamp with system time properly scaled to
nanoseconds.

The cross timestamps contain the realtime and monotonic raw clock
times. The realtime value is needed to discipline that clock using PTP
and the monotonic raw value is used for applications that don't
require a "real" time, but need an unadjusted clock time.

The get_device_system_crosststamp() code calls back into the driver
to ensure that the system counter is within the current timekeeping
update interval. Because of possible NTP/PTP frequency adjustments,
extrapolating a realtime clock time outside the current interval with
a potentially different scaling factor can result in a small amount of
error.

Modern Intel hardware provides an Always Running Timer (ART) which is
exactly related to TSC through a known frequency ratio. The ART is
routed to devices on the system and is used to precisely and
simultaneously capture the device clock with the ART. The kernel
timekeeping code requires a system clock value from the current clock
source to calculate the system time. The correlated clocksource adds a
means to transform a clock (in this case ART) exactly to the system
clock (TSC clocksource) that is used to calculate system time.

Signed-off-by: Christopher S. Hall <christopher.s.hall@intel.com>
---
 include/linux/clocksource.h | 27 ++++++++++++++++
 include/linux/timekeeping.h | 35 +++++++++++++++++++++
 kernel/time/timekeeping.c   | 77 ++++++++++++++++++++++++++++++++++++++++++---
 3 files changed, 135 insertions(+), 4 deletions(-)

diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 7784b59..4b7973d 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -255,4 +255,31 @@ static inline void clocksource_probe(void) {}
 #define CLOCKSOURCE_ACPI_DECLARE(name, table_id, fn)		\
 	ACPI_DECLARE_PROBE_ENTRY(clksrc, name, table_id, 0, NULL, 0, fn)
 
+/*
+ * struct correlated_cs - Descriptor for a clocksource correlated to another
+ *	clocksource
+ * @related_cs:		Pointer to the related timekeeping clocksource
+ * @convert:		Conversion function to convert a timestamp from
+ *			the correlated clocksource to cycles of the related
+ *			timekeeping clocksource
+ */
+struct correlated_cs {
+	struct clocksource	*related_cs;
+	cycle_t			(*convert)(struct correlated_cs *cs,
+					   cycle_t cycles);
+};
+
+/*
+ * struct raw_system_counterval - system counter value captured in device
+ *	driver used to produce system/device cross-timestamp
+ * @system:	System counter value
+ * @cs:		Clocksource related to system counter value. This is used by
+ *		timekeeping code to verify validity of counter for system time
+ *		conversion
+ */
+struct raw_system_counterval {
+	cycle_t			cycles;
+	struct clocksource	*cs;
+};
+
 #endif /* _LINUX_CLOCKSOURCE_H */
diff --git a/include/linux/timekeeping.h b/include/linux/timekeeping.h
index ec89d84..2209943 100644
--- a/include/linux/timekeeping.h
+++ b/include/linux/timekeeping.h
@@ -266,6 +266,41 @@ extern void timekeeping_inject_sleeptime64(struct timespec64 *delta);
 extern void ktime_get_raw_and_real_ts64(struct timespec64 *ts_raw,
 				        struct timespec64 *ts_real);
 
+
+/*
+ * struct system_device_crosststamp - system/device cross-timestamp
+ *	(syncronized capture)
+ * @device:	Device time
+ * @realtime:	Realtime simultaneous with device time
+ * @monoraw:	Monotonic raw simultaneous with device time
+ */
+struct system_device_crosststamp {
+	ktime_t device;
+	ktime_t sys_realtime;
+	ktime_t sys_monoraw;
+};
+
+struct raw_system_counterval;
+/*
+ * struct get_sync_device_time - Provides method to capture device time
+ *	synchronized with raw system counter value
+ * @get_time:	Callback providing synchronized capture of device time
+ *		and system counter. Returns 0 on success, < 0 on failure
+ * @ctx:	Context provided to callback function
+ */
+struct get_sync_device_time {
+	int	(*get_time)(ktime_t *device,
+			    struct raw_system_counterval *system,
+			    void *ctx);
+	void	 *ctx;
+};
+
+/*
+ * Get cross timestamp between system clock and device clock
+ */
+extern int get_device_system_crosststamp(struct system_device_crosststamp *ct,
+					 struct get_sync_device_time *dt);
+
 /*
  * Persistent clock related interfaces
  */
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index d563c19..9c1ddc3 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -298,13 +298,11 @@ u32 (*arch_gettimeoffset)(void) = default_arch_gettimeoffset;
 static inline u32 arch_gettimeoffset(void) { return 0; }
 #endif
 
-static inline s64 timekeeping_get_ns(struct tk_read_base *tkr)
+static inline s64 timekeeping_delta_to_ns(struct tk_read_base *tkr,
+					  cycle_t delta)
 {
-	cycle_t delta;
 	s64 nsec;
 
-	delta = timekeeping_get_delta(tkr);
-
 	nsec = delta * tkr->mult + tkr->xtime_nsec;
 	nsec >>= tkr->shift;
 
@@ -312,6 +310,24 @@ static inline s64 timekeeping_get_ns(struct tk_read_base *tkr)
 	return nsec + arch_gettimeoffset();
 }
 
+static inline s64 timekeeping_get_ns(struct tk_read_base *tkr)
+{
+	cycle_t delta;
+
+	delta = timekeeping_get_delta(tkr);
+	return timekeeping_delta_to_ns(tkr, delta);
+}
+
+static inline s64 timekeeping_cycles_to_ns(struct tk_read_base *tkr,
+					    cycle_t cycles)
+{
+	cycle_t delta;
+
+	/* calculate the delta since the last update_wall_time */
+	delta = clocksource_delta(cycles, tkr->cycle_last, tkr->mask);
+	return timekeeping_delta_to_ns(tkr, delta);
+}
+
 /**
  * update_fast_timekeeper - Update the fast and NMI safe monotonic timekeeper.
  * @tkr: Timekeeping readout base from which we take the update
@@ -885,6 +901,59 @@ EXPORT_SYMBOL(ktime_get_raw_and_real_ts64);
 #endif /* CONFIG_NTP_PPS */
 
 /**
+ * get_device_system_crosststamp - Synchronously capture system/device timestamp
+ * @xtstamp:		Receives simultaneously captured system and device time
+ * @sync_devicetime:	Callback to get simultaneous device time and
+ *	system counter from the device driver
+ *
+ * Reads a timestamp from a device and correlates it to system time
+ */
+int get_device_system_crosststamp(struct system_device_crosststamp *xtstamp,
+				  struct get_sync_device_time *sync_devicetime)
+{
+	struct timekeeper *tk = &tk_core.timekeeper;
+	unsigned long seq;
+	struct raw_system_counterval raw_sys;
+	ktime_t base_raw;
+	ktime_t base_real;
+	s64 nsec_raw;
+	s64 nsec_real;
+	int ret;
+
+	do {
+		seq = read_seqcount_begin(&tk_core.seq);
+		/*
+		 * Try to get a timestamp from the device driver.
+		 */
+		ret = sync_devicetime->get_time(&xtstamp->device, &raw_sys,
+						sync_devicetime->ctx);
+		if (ret)
+			return ret;
+
+		/*
+		 * Verify that the correlated clocksource is related to
+		 * the currently installed timekeeper clocksource
+		 */
+		if (tk->tkr_mono.clock != raw_sys.cs)
+			return -ENODEV;
+
+		base_real = ktime_add(tk->tkr_mono.base,
+				      tk_core.timekeeper.offs_real);
+		base_raw = tk->tkr_raw.base;
+
+		nsec_real = timekeeping_cycles_to_ns(&tk->tkr_mono,
+						     raw_sys.cycles);
+		nsec_raw = timekeeping_cycles_to_ns(&tk->tkr_raw,
+						    raw_sys.cycles);
+	} while (read_seqcount_retry(&tk_core.seq, seq));
+
+	xtstamp->sys_realtime = ktime_add_ns(base_real, nsec_real);
+	xtstamp->sys_monoraw = ktime_add_ns(base_raw, nsec_raw);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(get_device_system_crosststamp);
+
+/**
  * do_gettimeofday - Returns the time of day in a timeval
  * @tv:		pointer to the timeval to be set
  *
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [Intel-wired-lan] [RFC v5 1/6] Timekeeping cross timestamp interface for device drivers
@ 2016-01-04 12:45   ` Christopher S. Hall
  0 siblings, 0 replies; 38+ messages in thread
From: Christopher S. Hall @ 2016-01-04 12:45 UTC (permalink / raw)
  To: intel-wired-lan

ACKNOWLEDGMENT: The original correlated clock source and cross
timestamp code was developed by Thomas Gleixner
<tglx@linutronix.de>. It has changed considerably and any mistakes are
mine.

The precision with which events on multiple networked systems can be
synchronized using, as an example, PTP (IEEE 1588, 802.1AS) is limited
by the precision of the cross timestamps between the system clock and
the device (timestamp) clock. Precision here is the degree of
simultaneity when capturing the cross timestamp.

Currently the PTP cross timestamp is captured in software using the
PTP device driver ioctl PTP_SYS_OFFSET. Reads of the device clock are
interleaved with reads of the realtime clock. At best, the precision
of this cross timestamp is on the order of several microseconds due to
software latencies. Sub-microsecond precision is required for
industrial control and some media applications. To achieve this level
of precision hardware supported cross timestamping is needed.

Hardware cross timestamps are derived from simultaneously capturing
the device clock and the system clock. Applications use timestamps in
nanoseconds rather than clock ticks. The device driver can scale
device clock ticks to device time in nanoseconds, but cannot transform
system clock ticks. Only the kernel timekeeping code can do this. The
function get_device_system_crosststamp() allows device drivers to
return a cross timestamp with system time properly scaled to
nanoseconds.

The cross timestamps contain the realtime and monotonic raw clock
times. The realtime value is needed to discipline that clock using PTP
and the monotonic raw value is used for applications that don't
require a "real" time, but need an unadjusted clock time.

The get_device_system_crosststamp() code calls back into the driver
to ensure that the system counter is within the current timekeeping
update interval. Because of possible NTP/PTP frequency adjustments,
extrapolating a realtime clock time outside the current interval with
a potentially different scaling factor can result in a small amount of
error.

Modern Intel hardware provides an Always Running Timer (ART) which is
exactly related to TSC through a known frequency ratio. The ART is
routed to devices on the system and is used to precisely and
simultaneously capture the device clock with the ART. The kernel
timekeeping code requires a system clock value from the current clock
source to calculate the system time. The correlated clocksource adds a
means to transform a clock (in this case ART) exactly to the system
clock (TSC clocksource) that is used to calculate system time.

Signed-off-by: Christopher S. Hall <christopher.s.hall@intel.com>
---
 include/linux/clocksource.h | 27 ++++++++++++++++
 include/linux/timekeeping.h | 35 +++++++++++++++++++++
 kernel/time/timekeeping.c   | 77 ++++++++++++++++++++++++++++++++++++++++++---
 3 files changed, 135 insertions(+), 4 deletions(-)

diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 7784b59..4b7973d 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -255,4 +255,31 @@ static inline void clocksource_probe(void) {}
 #define CLOCKSOURCE_ACPI_DECLARE(name, table_id, fn)		\
 	ACPI_DECLARE_PROBE_ENTRY(clksrc, name, table_id, 0, NULL, 0, fn)
 
+/*
+ * struct correlated_cs - Descriptor for a clocksource correlated to another
+ *	clocksource
+ * @related_cs:		Pointer to the related timekeeping clocksource
+ * @convert:		Conversion function to convert a timestamp from
+ *			the correlated clocksource to cycles of the related
+ *			timekeeping clocksource
+ */
+struct correlated_cs {
+	struct clocksource	*related_cs;
+	cycle_t			(*convert)(struct correlated_cs *cs,
+					   cycle_t cycles);
+};
+
+/*
+ * struct raw_system_counterval - system counter value captured in device
+ *	driver used to produce system/device cross-timestamp
+ * @system:	System counter value
+ * @cs:		Clocksource related to system counter value. This is used by
+ *		timekeeping code to verify validity of counter for system time
+ *		conversion
+ */
+struct raw_system_counterval {
+	cycle_t			cycles;
+	struct clocksource	*cs;
+};
+
 #endif /* _LINUX_CLOCKSOURCE_H */
diff --git a/include/linux/timekeeping.h b/include/linux/timekeeping.h
index ec89d84..2209943 100644
--- a/include/linux/timekeeping.h
+++ b/include/linux/timekeeping.h
@@ -266,6 +266,41 @@ extern void timekeeping_inject_sleeptime64(struct timespec64 *delta);
 extern void ktime_get_raw_and_real_ts64(struct timespec64 *ts_raw,
 				        struct timespec64 *ts_real);
 
+
+/*
+ * struct system_device_crosststamp - system/device cross-timestamp
+ *	(syncronized capture)
+ * @device:	Device time
+ * @realtime:	Realtime simultaneous with device time
+ * @monoraw:	Monotonic raw simultaneous with device time
+ */
+struct system_device_crosststamp {
+	ktime_t device;
+	ktime_t sys_realtime;
+	ktime_t sys_monoraw;
+};
+
+struct raw_system_counterval;
+/*
+ * struct get_sync_device_time - Provides method to capture device time
+ *	synchronized with raw system counter value
+ * @get_time:	Callback providing synchronized capture of device time
+ *		and system counter. Returns 0 on success, < 0 on failure
+ * @ctx:	Context provided to callback function
+ */
+struct get_sync_device_time {
+	int	(*get_time)(ktime_t *device,
+			    struct raw_system_counterval *system,
+			    void *ctx);
+	void	 *ctx;
+};
+
+/*
+ * Get cross timestamp between system clock and device clock
+ */
+extern int get_device_system_crosststamp(struct system_device_crosststamp *ct,
+					 struct get_sync_device_time *dt);
+
 /*
  * Persistent clock related interfaces
  */
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index d563c19..9c1ddc3 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -298,13 +298,11 @@ u32 (*arch_gettimeoffset)(void) = default_arch_gettimeoffset;
 static inline u32 arch_gettimeoffset(void) { return 0; }
 #endif
 
-static inline s64 timekeeping_get_ns(struct tk_read_base *tkr)
+static inline s64 timekeeping_delta_to_ns(struct tk_read_base *tkr,
+					  cycle_t delta)
 {
-	cycle_t delta;
 	s64 nsec;
 
-	delta = timekeeping_get_delta(tkr);
-
 	nsec = delta * tkr->mult + tkr->xtime_nsec;
 	nsec >>= tkr->shift;
 
@@ -312,6 +310,24 @@ static inline s64 timekeeping_get_ns(struct tk_read_base *tkr)
 	return nsec + arch_gettimeoffset();
 }
 
+static inline s64 timekeeping_get_ns(struct tk_read_base *tkr)
+{
+	cycle_t delta;
+
+	delta = timekeeping_get_delta(tkr);
+	return timekeeping_delta_to_ns(tkr, delta);
+}
+
+static inline s64 timekeeping_cycles_to_ns(struct tk_read_base *tkr,
+					    cycle_t cycles)
+{
+	cycle_t delta;
+
+	/* calculate the delta since the last update_wall_time */
+	delta = clocksource_delta(cycles, tkr->cycle_last, tkr->mask);
+	return timekeeping_delta_to_ns(tkr, delta);
+}
+
 /**
  * update_fast_timekeeper - Update the fast and NMI safe monotonic timekeeper.
  * @tkr: Timekeeping readout base from which we take the update
@@ -885,6 +901,59 @@ EXPORT_SYMBOL(ktime_get_raw_and_real_ts64);
 #endif /* CONFIG_NTP_PPS */
 
 /**
+ * get_device_system_crosststamp - Synchronously capture system/device timestamp
+ * @xtstamp:		Receives simultaneously captured system and device time
+ * @sync_devicetime:	Callback to get simultaneous device time and
+ *	system counter from the device driver
+ *
+ * Reads a timestamp from a device and correlates it to system time
+ */
+int get_device_system_crosststamp(struct system_device_crosststamp *xtstamp,
+				  struct get_sync_device_time *sync_devicetime)
+{
+	struct timekeeper *tk = &tk_core.timekeeper;
+	unsigned long seq;
+	struct raw_system_counterval raw_sys;
+	ktime_t base_raw;
+	ktime_t base_real;
+	s64 nsec_raw;
+	s64 nsec_real;
+	int ret;
+
+	do {
+		seq = read_seqcount_begin(&tk_core.seq);
+		/*
+		 * Try to get a timestamp from the device driver.
+		 */
+		ret = sync_devicetime->get_time(&xtstamp->device, &raw_sys,
+						sync_devicetime->ctx);
+		if (ret)
+			return ret;
+
+		/*
+		 * Verify that the correlated clocksource is related to
+		 * the currently installed timekeeper clocksource
+		 */
+		if (tk->tkr_mono.clock != raw_sys.cs)
+			return -ENODEV;
+
+		base_real = ktime_add(tk->tkr_mono.base,
+				      tk_core.timekeeper.offs_real);
+		base_raw = tk->tkr_raw.base;
+
+		nsec_real = timekeeping_cycles_to_ns(&tk->tkr_mono,
+						     raw_sys.cycles);
+		nsec_raw = timekeeping_cycles_to_ns(&tk->tkr_raw,
+						    raw_sys.cycles);
+	} while (read_seqcount_retry(&tk_core.seq, seq));
+
+	xtstamp->sys_realtime = ktime_add_ns(base_real, nsec_real);
+	xtstamp->sys_monoraw = ktime_add_ns(base_raw, nsec_raw);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(get_device_system_crosststamp);
+
+/**
  * do_gettimeofday - Returns the time of day in a timeval
  * @tv:		pointer to the timeval to be set
  *
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [RFC v5 2/6] Always Running Timer (ART) correlated clocksource
  2016-01-04 12:45 ` [Intel-wired-lan] " Christopher S. Hall
@ 2016-01-04 12:45   ` Christopher S. Hall
  -1 siblings, 0 replies; 38+ messages in thread
From: Christopher S. Hall @ 2016-01-04 12:45 UTC (permalink / raw)
  To: tglx, richardcochran, mingo, john.stultz, hpa, jeffrey.t.kirsher
  Cc: Christopher S. Hall, x86, linux-kernel, intel-wired-lan, netdev,
	kevin.b.stanton

On modern Intel systems TSC is derived from the new Always Running Timer
(ART). ART can be captured simultaneous to the capture of
audio and network device clocks, allowing a correlation between timebases
to be constructed. Upon capture, the driver converts the captured ART
value to the appropriate system clock using the correlated clocksource
mechanism.

On systems that support ART a new CPUID leaf (0x15) returns parameters
“m” and “n” such that:

TSC_value = (ART_value * m) / n + k [n >= 2]

[k is an offset that can adjusted by a privileged agent. The
IA32_TSC_ADJUST MSR is an example of an interface to adjust k.
See 17.14.4 of the Intel SDM for more details]

Signed-off-by: Christopher S. Hall <christopher.s.hall@intel.com>
---
 arch/x86/include/asm/cpufeature.h |  2 +-
 arch/x86/include/asm/tsc.h        |  2 ++
 arch/x86/kernel/tsc.c             | 46 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 49 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index e4f8010..58c799e 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -85,7 +85,7 @@
 #define X86_FEATURE_P4		( 3*32+ 7) /* "" P4 */
 #define X86_FEATURE_CONSTANT_TSC ( 3*32+ 8) /* TSC ticks at a constant rate */
 #define X86_FEATURE_UP		( 3*32+ 9) /* smp kernel running on up */
-/* free, was #define X86_FEATURE_FXSAVE_LEAK ( 3*32+10) * "" FXSAVE leaks FOP/FIP/FOP */
+#define X86_FEATURE_ART		(3*32+10) /* Platform has always running timer (ART) */
 #define X86_FEATURE_ARCH_PERFMON ( 3*32+11) /* Intel Architectural PerfMon */
 #define X86_FEATURE_PEBS	( 3*32+12) /* Precise-Event Based Sampling */
 #define X86_FEATURE_BTS		( 3*32+13) /* Branch Trace Store */
diff --git a/arch/x86/include/asm/tsc.h b/arch/x86/include/asm/tsc.h
index 6d7c547..9474c9c 100644
--- a/arch/x86/include/asm/tsc.h
+++ b/arch/x86/include/asm/tsc.h
@@ -29,6 +29,8 @@ static inline cycles_t get_cycles(void)
 	return rdtsc();
 }
 
+extern struct correlated_cs art_timestamper;
+
 extern void tsc_init(void);
 extern void mark_tsc_unstable(char *reason);
 extern int unsynchronized_tsc(void);
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index c7c4d9c..26dcf63 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -949,10 +949,36 @@ static struct notifier_block time_cpufreq_notifier_block = {
 	.notifier_call  = time_cpufreq_notifier
 };
 
+#define ART_CPUID_LEAF (0x15)
+/* The denominator will never be less that 2 */
+#define ART_MIN_DENOMINATOR (2)
+
+static u32 art_to_tsc_numerator;
+static u32 art_to_tsc_denominator;
+
+/*
+ * If ART is present detect the numerator:denominator to convert to TSC
+ */
+static void detect_art(void)
+{
+	unsigned int unused[2];
+
+	if (boot_cpu_data.cpuid_level >= ART_CPUID_LEAF) {
+		cpuid(ART_CPUID_LEAF, &art_to_tsc_denominator,
+		      &art_to_tsc_numerator, unused, unused+1);
+
+		if (art_to_tsc_denominator >= ART_MIN_DENOMINATOR)
+			set_cpu_cap(&boot_cpu_data, X86_FEATURE_ART);
+	}
+}
+
 static int __init cpufreq_tsc(void)
 {
 	if (!cpu_has_tsc)
 		return 0;
+
+	detect_art();
+
 	if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC))
 		return 0;
 	cpufreq_register_notifier(&time_cpufreq_notifier_block,
@@ -1071,6 +1097,24 @@ int unsynchronized_tsc(void)
 	return 0;
 }
 
+/*
+ * Convert ART to TSC given numerator/denominator found in detect_art()
+ */
+static u64 convert_art_to_tsc(struct correlated_cs *cs, u64 cycles)
+{
+	u64 tmp, res;
+
+	res = (cycles / art_to_tsc_denominator) * art_to_tsc_numerator;
+	tmp = (cycles % art_to_tsc_denominator) * art_to_tsc_numerator;
+	res += tmp / art_to_tsc_denominator;
+
+	return res;
+}
+
+struct correlated_cs art_timestamper = {
+	.convert	= convert_art_to_tsc,
+};
+EXPORT_SYMBOL(art_timestamper);
 
 static void tsc_refine_calibration_work(struct work_struct *work);
 static DECLARE_DELAYED_WORK(tsc_irqwork, tsc_refine_calibration_work);
@@ -1142,6 +1186,8 @@ static void tsc_refine_calibration_work(struct work_struct *work)
 		(unsigned long)tsc_khz % 1000);
 
 out:
+	if (boot_cpu_has(X86_FEATURE_ART))
+		art_timestamper.related_cs = &clocksource_tsc;
 	clocksource_register_khz(&clocksource_tsc, tsc_khz);
 }
 
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [Intel-wired-lan] [RFC v5 2/6] Always Running Timer (ART) correlated clocksource
@ 2016-01-04 12:45   ` Christopher S. Hall
  0 siblings, 0 replies; 38+ messages in thread
From: Christopher S. Hall @ 2016-01-04 12:45 UTC (permalink / raw)
  To: intel-wired-lan

On modern Intel systems TSC is derived from the new Always Running Timer
(ART). ART can be captured simultaneous to the capture of
audio and network device clocks, allowing a correlation between timebases
to be constructed. Upon capture, the driver converts the captured ART
value to the appropriate system clock using the correlated clocksource
mechanism.

On systems that support ART a new CPUID leaf (0x15) returns parameters
?m? and ?n? such that:

TSC_value = (ART_value * m) / n + k [n >= 2]

[k is an offset that can adjusted by a privileged agent. The
IA32_TSC_ADJUST MSR is an example of an interface to adjust k.
See 17.14.4 of the Intel SDM for more details]

Signed-off-by: Christopher S. Hall <christopher.s.hall@intel.com>
---
 arch/x86/include/asm/cpufeature.h |  2 +-
 arch/x86/include/asm/tsc.h        |  2 ++
 arch/x86/kernel/tsc.c             | 46 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 49 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index e4f8010..58c799e 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -85,7 +85,7 @@
 #define X86_FEATURE_P4		( 3*32+ 7) /* "" P4 */
 #define X86_FEATURE_CONSTANT_TSC ( 3*32+ 8) /* TSC ticks at a constant rate */
 #define X86_FEATURE_UP		( 3*32+ 9) /* smp kernel running on up */
-/* free, was #define X86_FEATURE_FXSAVE_LEAK ( 3*32+10) * "" FXSAVE leaks FOP/FIP/FOP */
+#define X86_FEATURE_ART		(3*32+10) /* Platform has always running timer (ART) */
 #define X86_FEATURE_ARCH_PERFMON ( 3*32+11) /* Intel Architectural PerfMon */
 #define X86_FEATURE_PEBS	( 3*32+12) /* Precise-Event Based Sampling */
 #define X86_FEATURE_BTS		( 3*32+13) /* Branch Trace Store */
diff --git a/arch/x86/include/asm/tsc.h b/arch/x86/include/asm/tsc.h
index 6d7c547..9474c9c 100644
--- a/arch/x86/include/asm/tsc.h
+++ b/arch/x86/include/asm/tsc.h
@@ -29,6 +29,8 @@ static inline cycles_t get_cycles(void)
 	return rdtsc();
 }
 
+extern struct correlated_cs art_timestamper;
+
 extern void tsc_init(void);
 extern void mark_tsc_unstable(char *reason);
 extern int unsynchronized_tsc(void);
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index c7c4d9c..26dcf63 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -949,10 +949,36 @@ static struct notifier_block time_cpufreq_notifier_block = {
 	.notifier_call  = time_cpufreq_notifier
 };
 
+#define ART_CPUID_LEAF (0x15)
+/* The denominator will never be less that 2 */
+#define ART_MIN_DENOMINATOR (2)
+
+static u32 art_to_tsc_numerator;
+static u32 art_to_tsc_denominator;
+
+/*
+ * If ART is present detect the numerator:denominator to convert to TSC
+ */
+static void detect_art(void)
+{
+	unsigned int unused[2];
+
+	if (boot_cpu_data.cpuid_level >= ART_CPUID_LEAF) {
+		cpuid(ART_CPUID_LEAF, &art_to_tsc_denominator,
+		      &art_to_tsc_numerator, unused, unused+1);
+
+		if (art_to_tsc_denominator >= ART_MIN_DENOMINATOR)
+			set_cpu_cap(&boot_cpu_data, X86_FEATURE_ART);
+	}
+}
+
 static int __init cpufreq_tsc(void)
 {
 	if (!cpu_has_tsc)
 		return 0;
+
+	detect_art();
+
 	if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC))
 		return 0;
 	cpufreq_register_notifier(&time_cpufreq_notifier_block,
@@ -1071,6 +1097,24 @@ int unsynchronized_tsc(void)
 	return 0;
 }
 
+/*
+ * Convert ART to TSC given numerator/denominator found in detect_art()
+ */
+static u64 convert_art_to_tsc(struct correlated_cs *cs, u64 cycles)
+{
+	u64 tmp, res;
+
+	res = (cycles / art_to_tsc_denominator) * art_to_tsc_numerator;
+	tmp = (cycles % art_to_tsc_denominator) * art_to_tsc_numerator;
+	res += tmp / art_to_tsc_denominator;
+
+	return res;
+}
+
+struct correlated_cs art_timestamper = {
+	.convert	= convert_art_to_tsc,
+};
+EXPORT_SYMBOL(art_timestamper);
 
 static void tsc_refine_calibration_work(struct work_struct *work);
 static DECLARE_DELAYED_WORK(tsc_irqwork, tsc_refine_calibration_work);
@@ -1142,6 +1186,8 @@ static void tsc_refine_calibration_work(struct work_struct *work)
 		(unsigned long)tsc_khz % 1000);
 
 out:
+	if (boot_cpu_has(X86_FEATURE_ART))
+		art_timestamper.related_cs = &clocksource_tsc;
 	clocksource_register_khz(&clocksource_tsc, tsc_khz);
 }
 
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [RFC v5 3/6] Add history to cross timestamp interface supporting slower devices
  2016-01-04 12:45 ` [Intel-wired-lan] " Christopher S. Hall
@ 2016-01-04 12:45   ` Christopher S. Hall
  -1 siblings, 0 replies; 38+ messages in thread
From: Christopher S. Hall @ 2016-01-04 12:45 UTC (permalink / raw)
  To: tglx, richardcochran, mingo, john.stultz, hpa, jeffrey.t.kirsher
  Cc: Christopher S. Hall, x86, linux-kernel, intel-wired-lan, netdev,
	kevin.b.stanton

Another representative use case of time sync and the correlated
clocksource (in addition to PTP noted above) is PTP synchronized
audio.

In a streaming application, as an example, samples will be sent and/or
received by multiple devices with a presentation time that is in terms
of the PTP master clock. Synchronizing the audio output on these
devices requires correlating the audio clock with the PTP master
clock. The more precise this correlation is, the better the audio
quality (i.e. out of sync audio sounds bad).

>From an application standpoint, to correlate the PTP master clock with
the audio device clock, the system clock is used as a intermediate
timebase. The transforms such an application would perform are:

    System Clock <-> Audio clock
    System Clock <-> Network Device Clock [<-> PTP Master Clock]

Such audio applications make use of some existing ALSA library calls
that provide audio/system cross-timestamps (e.g.
snd_pcm_status_get_htstamp()). Previous driver implementations capture
these cross timestamps by reading the system clock (raw/mono/real) and
the device clock with greatest degree of simultaneity possible in
software.

Modern Intel platforms can perform a more accurate cross timestamp in
hardware (ART,audio device clock).  The audio driver requires
ART->system time transforms -- the same as required for the network
driver. These platforms offload audio processing (including
cross-timestamps) to a DSP which to ensure uninterrupted audio
processing, communicates and response to the host only once every
millsecond. As a result is takes up to a millisecond for the DSP to
receive a request, the request is processed by the DSP, the audio
output hardware is polled for completion, the result is copied into
shared memory, and the host is notified. All of these operation occur
on a millisecond cadence.  This transaction requires about 2 ms, but
under heavier workloads it may take up to 4 ms.

If update_wall_time() is called while waiting for a response within
get_device_system_crosststamp() (from previous patch), a retry is
attempted. This will occur if the cycle_interval (determined by
CONFIG_HZ and mult/shift values) cycles elapse.

Adding a history allows these slow devices the option of providing an
ART value outside of the retry loop. In this case, the callback
provided is an accessor function for the previously obtained counter
value. If get_system_device_crosststamp() receives a counter value
previous to cycle_last, it consults the history provided as an
argument in history_ref and interpolates the realtime and monotonic
raw system time using the provided counter value. If there are any
clock discontinuities, e.g. from calling settimeofday(), the monotonic
raw time is interpolated in the usual way, but the realtime clock time
is adjusted by scaling the monotonic raw adjustment.

When an accessor function is used a history argument *must* be
provided. The history is initialized using ktime_get_snapshot() and
must be called before the counter values are read.

When the history is used to interpolate timestamp values, the realtime
clock time may be inaccurate to some degree. In general, the
longer the length of history the larger the interpolation error. If
there are discontinuities (large step changes) to the time, the error
can be very large.

Signed-off-by: Christopher S. Hall <christopher.s.hall@intel.com>
---
 include/linux/clocksource.h         |  16 ++++
 include/linux/timekeeper_internal.h |   4 +
 include/linux/timekeeping.h         |   9 +-
 kernel/time/timekeeping.c           | 162 ++++++++++++++++++++++++++++++++++--
 4 files changed, 185 insertions(+), 6 deletions(-)

diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 4b7973d..f413157 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -282,4 +282,20 @@ struct raw_system_counterval {
 	struct clocksource	*cs;
 };
 
+/*
+ * struct system_time_snapshot - simultaneous raw/real time capture with
+ *	counter value
+ * @cycles:	Clocksource counter value to produce the system times
+ * @real:	Realtime system time
+ * @raw:	Monotonic raw system time
+ * @cs_seq:	Sequence number associated with changed clocksource
+ */
+struct system_time_snapshot {
+	cycles_t	cycles;
+	ktime_t		real;
+	ktime_t		raw;
+	u8		cs_seq;
+	unsigned int	clock_set_seq;
+};
+
 #endif /* _LINUX_CLOCKSOURCE_H */
diff --git a/include/linux/timekeeper_internal.h b/include/linux/timekeeper_internal.h
index 2524722..156b51d 100644
--- a/include/linux/timekeeper_internal.h
+++ b/include/linux/timekeeper_internal.h
@@ -13,6 +13,9 @@
 /**
  * struct tk_read_base - base structure for timekeeping readout
  * @clock:	Current clocksource used for timekeeping.
+ * @cs_seq:	Clocksource sequence is incremented per clocksource change.
+ *	It's used to determine whether past system time can be related to
+ *	current system time
  * @read:	Read function of @clock
  * @mask:	Bitmask for two's complement subtraction of non 64bit clocks
  * @cycle_last: @clock cycle value at last update
@@ -29,6 +32,7 @@
  */
 struct tk_read_base {
 	struct clocksource	*clock;
+	u8			cs_seq;
 	cycle_t			(*read)(struct clocksource *cs);
 	cycle_t			mask;
 	cycle_t			cycle_last;
diff --git a/include/linux/timekeeping.h b/include/linux/timekeeping.h
index 2209943..2f290a2 100644
--- a/include/linux/timekeeping.h
+++ b/include/linux/timekeeping.h
@@ -295,11 +295,18 @@ struct get_sync_device_time {
 	void	 *ctx;
 };
 
+struct system_time_snapshot;
+/*
+ * Simultaneously snapshot realtime and monotonic raw clocks
+ */
+extern void ktime_get_snapshot(struct system_time_snapshot *systime_snapshot);
+
 /*
  * Get cross timestamp between system clock and device clock
  */
 extern int get_device_system_crosststamp(struct system_device_crosststamp *ct,
-					 struct get_sync_device_time *dt);
+					 struct get_sync_device_time *dt,
+					 struct system_time_snapshot *history);
 
 /*
  * Persistent clock related interfaces
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 9c1ddc3..5a7f784 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -235,11 +235,13 @@ static void tk_setup_internals(struct timekeeper *tk, struct clocksource *clock)
 
 	old_clock = tk->tkr_mono.clock;
 	tk->tkr_mono.clock = clock;
+	++tk->tkr_mono.cs_seq;
 	tk->tkr_mono.read = clock->read;
 	tk->tkr_mono.mask = clock->mask;
 	tk->tkr_mono.cycle_last = tk->tkr_mono.read(clock);
 
 	tk->tkr_raw.clock = clock;
+	++tk->tkr_raw.cs_seq;
 	tk->tkr_raw.read = clock->read;
 	tk->tkr_raw.mask = clock->mask;
 	tk->tkr_raw.cycle_last = tk->tkr_mono.cycle_last;
@@ -862,6 +864,39 @@ time64_t ktime_get_real_seconds(void)
 }
 EXPORT_SYMBOL_GPL(ktime_get_real_seconds);
 
+/**
+ * ktime_get_snapshot - snapshots the realtime/monotonic raw clocks with counter
+ * @snapshot:	pointer to struct receiving the system time snapshot
+ */
+void ktime_get_snapshot(struct system_time_snapshot *systime_snapshot)
+{
+	struct timekeeper *tk = &tk_core.timekeeper;
+	unsigned long seq;
+	ktime_t base_raw;
+	ktime_t base_real;
+	s64 nsec_raw;
+	s64 nsec_real;
+	cycle_t now;
+
+	do {
+		seq = read_seqcount_begin(&tk_core.seq);
+
+		now = tk->tkr_mono.read(tk->tkr_mono.clock);
+		systime_snapshot->cs_seq = tk->tkr_mono.cs_seq;
+		systime_snapshot->clock_set_seq = tk->clock_was_set_seq;
+		base_real = ktime_add(tk->tkr_mono.base,
+				      tk_core.timekeeper.offs_real);
+		base_raw = tk->tkr_raw.base;
+		nsec_real = timekeeping_cycles_to_ns(&tk->tkr_mono, now);
+		nsec_raw  = timekeeping_cycles_to_ns(&tk->tkr_raw, now);
+	} while (read_seqcount_retry(&tk_core.seq, seq));
+
+	systime_snapshot->cycles = now;
+	systime_snapshot->real = ktime_add_ns(base_real, nsec_real);
+	systime_snapshot->raw = ktime_add_ns(base_raw, nsec_raw);
+}
+EXPORT_SYMBOL_GPL(ktime_get_snapshot);
+
 #ifdef CONFIG_NTP_PPS
 
 /**
@@ -901,15 +936,82 @@ EXPORT_SYMBOL(ktime_get_raw_and_real_ts64);
 #endif /* CONFIG_NTP_PPS */
 
 /**
+ * adjust_historical_crosststamp - adjust crosstimestamp previous to current interval
+ * @total_history_cycles:	Total history length in cycles
+ * @partial_history_cycles:	Cycle offset into history (fractional part)
+ * @total_history_monoraw:	Total history length in monotonic raw ns
+ * @ts:				Cross timestamp that should be adjusted using
+ *	partial/total ratio
+ *
+ * Helper function used by get_device_system_crosststamp() to correct the
+ * crosstimestamp corresponding to the start of the current interval to the
+ * system counter value (timestamp point) provided by the driver. The
+ * total_history_* quantities are the total history starting at the provided
+ * reference point and ending at the start of the current interval. The cycle
+ * count between the driver timestamp point and the start of the current
+ * interval is partial_history_cycles.
+ */
+static void adjust_historical_crosststamp(cycle_t total_history_cycles,
+					  cycle_t partial_history_cycles,
+					  ktime_t total_history_monoraw,
+					  ktime_t total_history_realtime,
+					  bool discontinuity,
+					  struct system_device_crosststamp *ts)
+{
+	struct timekeeper *tk = &tk_core.timekeeper;
+	u64 corr_monoraw;
+	u64 corr_realtime;
+
+	/*
+	 * Scale the monotonic raw time delta by:
+	 *	partial_history_cycles / total_history_cycles
+	 */
+	corr_monoraw = (ktime_to_ns(total_history_monoraw) *
+			partial_history_cycles) / total_history_cycles;
+	/*
+	 * If there is a discontinuity in the history, scale monotonic raw
+	 *	correction by:
+	 *		mult(real)/mult(raw) yielding the realtime correction
+	 * Otherwise, calculate the realtime correction similar to monotonic
+	 *	raw calculation
+	 */
+	if (discontinuity)
+		corr_realtime = (corr_monoraw * tk->tkr_mono.mult) /
+			tk->tkr_raw.mult;
+	else
+		corr_realtime = (ktime_to_ns(total_history_realtime) *
+				partial_history_cycles) / total_history_cycles;
+
+	/* Fixup monotonic raw and real time time values */
+	ts->sys_monoraw = ktime_sub_ns(ts->sys_monoraw, corr_monoraw);
+	ts->sys_realtime = ktime_sub_ns(ts->sys_realtime, corr_realtime);
+}
+
+/*
+ * cycle_between - true if test occurs chronologically between before and after
+ */
+static bool cycle_between(cycles_t before, cycles_t test, cycles_t after)
+{
+	if (test > before && test < after)
+		return true;
+	if (test < before && before > after)
+		return true;
+	return false;
+}
+
+/**
  * get_device_system_crosststamp - Synchronously capture system/device timestamp
  * @xtstamp:		Receives simultaneously captured system and device time
  * @sync_devicetime:	Callback to get simultaneous device time and
  *	system counter from the device driver
+ * @history_ref:	Historical reference point used to interpolate system
+ *	time when counter provided by the driver is before the current interval
  *
  * Reads a timestamp from a device and correlates it to system time
  */
 int get_device_system_crosststamp(struct system_device_crosststamp *xtstamp,
-				  struct get_sync_device_time *sync_devicetime)
+				  struct get_sync_device_time *sync_devicetime,
+				  struct system_time_snapshot *history_ref)
 {
 	struct timekeeper *tk = &tk_core.timekeeper;
 	unsigned long seq;
@@ -918,6 +1020,12 @@ int get_device_system_crosststamp(struct system_device_crosststamp *xtstamp,
 	ktime_t base_real;
 	s64 nsec_raw;
 	s64 nsec_real;
+	cycles_t cycles;
+	cycle_t now;
+	cycle_t interval_start;
+	unsigned int clock_was_set_seq;
+	u8 cs_seq;
+	bool do_interp;
 	int ret;
 
 	do {
@@ -936,19 +1044,63 @@ int get_device_system_crosststamp(struct system_device_crosststamp *xtstamp,
 		 */
 		if (tk->tkr_mono.clock != raw_sys.cs)
 			return -ENODEV;
+		cycles = raw_sys.cycles;
+
+		/*
+		 * Check whether the system counter value provided by the
+		 * device driver is on the current interval.
+		 */
+		now = tk->tkr_mono.read(tk->tkr_mono.clock);
+		interval_start = tk->tkr_mono.cycle_last;
+		if (!cycle_between(interval_start, cycles, now)) {
+			cs_seq = tk->tkr_mono.cs_seq;
+			clock_was_set_seq = tk->clock_was_set_seq;
+			cycles = interval_start;
+			do_interp = true;
+		} else {
+			do_interp = false;
+		}
 
 		base_real = ktime_add(tk->tkr_mono.base,
 				      tk_core.timekeeper.offs_real);
 		base_raw = tk->tkr_raw.base;
 
-		nsec_real = timekeeping_cycles_to_ns(&tk->tkr_mono,
-						     raw_sys.cycles);
-		nsec_raw = timekeeping_cycles_to_ns(&tk->tkr_raw,
-						    raw_sys.cycles);
+		nsec_real = timekeeping_cycles_to_ns(&tk->tkr_mono, cycles);
+		nsec_raw = timekeeping_cycles_to_ns(&tk->tkr_raw, cycles);
 	} while (read_seqcount_retry(&tk_core.seq, seq));
 
 	xtstamp->sys_realtime = ktime_add_ns(base_real, nsec_real);
 	xtstamp->sys_monoraw = ktime_add_ns(base_raw, nsec_raw);
+
+	/*
+	 * Interpolate if necessary, working back from the start of the current
+	 * interval
+	 */
+	if (do_interp) {
+		cycle_t total_history_cycles;
+		ktime_t history_monoraw;
+		ktime_t history_realtime;
+		bool discontinuity;
+		cycle_t partial_history_cycles = cycles - raw_sys.cycles;
+
+		if (!history_ref || history_ref->cs_seq != cs_seq ||
+		    !cycle_between(history_ref->cycles, raw_sys.cycles,
+				   interval_start))
+			return -EINVAL;
+		history_monoraw = ktime_sub(xtstamp->sys_monoraw,
+					    history_ref->raw);
+		history_realtime = ktime_sub(xtstamp->sys_realtime,
+					     history_ref->real);
+		total_history_cycles = cycles - history_ref->cycles;
+		discontinuity =
+			history_ref->clock_set_seq != clock_was_set_seq;
+		adjust_historical_crosststamp(total_history_cycles,
+					      partial_history_cycles,
+					      history_monoraw,
+					      history_realtime, discontinuity,
+					      xtstamp);
+	}
+
 	return 0;
 }
 EXPORT_SYMBOL_GPL(get_device_system_crosststamp);
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [Intel-wired-lan] [RFC v5 3/6] Add history to cross timestamp interface supporting slower devices
@ 2016-01-04 12:45   ` Christopher S. Hall
  0 siblings, 0 replies; 38+ messages in thread
From: Christopher S. Hall @ 2016-01-04 12:45 UTC (permalink / raw)
  To: intel-wired-lan

Another representative use case of time sync and the correlated
clocksource (in addition to PTP noted above) is PTP synchronized
audio.

In a streaming application, as an example, samples will be sent and/or
received by multiple devices with a presentation time that is in terms
of the PTP master clock. Synchronizing the audio output on these
devices requires correlating the audio clock with the PTP master
clock. The more precise this correlation is, the better the audio
quality (i.e. out of sync audio sounds bad).

From an application standpoint, to correlate the PTP master clock with
the audio device clock, the system clock is used as a intermediate
timebase. The transforms such an application would perform are:

    System Clock <-> Audio clock
    System Clock <-> Network Device Clock [<-> PTP Master Clock]

Such audio applications make use of some existing ALSA library calls
that provide audio/system cross-timestamps (e.g.
snd_pcm_status_get_htstamp()). Previous driver implementations capture
these cross timestamps by reading the system clock (raw/mono/real) and
the device clock with greatest degree of simultaneity possible in
software.

Modern Intel platforms can perform a more accurate cross timestamp in
hardware (ART,audio device clock).  The audio driver requires
ART->system time transforms -- the same as required for the network
driver. These platforms offload audio processing (including
cross-timestamps) to a DSP which to ensure uninterrupted audio
processing, communicates and response to the host only once every
millsecond. As a result is takes up to a millisecond for the DSP to
receive a request, the request is processed by the DSP, the audio
output hardware is polled for completion, the result is copied into
shared memory, and the host is notified. All of these operation occur
on a millisecond cadence.  This transaction requires about 2 ms, but
under heavier workloads it may take up to 4 ms.

If update_wall_time() is called while waiting for a response within
get_device_system_crosststamp() (from previous patch), a retry is
attempted. This will occur if the cycle_interval (determined by
CONFIG_HZ and mult/shift values) cycles elapse.

Adding a history allows these slow devices the option of providing an
ART value outside of the retry loop. In this case, the callback
provided is an accessor function for the previously obtained counter
value. If get_system_device_crosststamp() receives a counter value
previous to cycle_last, it consults the history provided as an
argument in history_ref and interpolates the realtime and monotonic
raw system time using the provided counter value. If there are any
clock discontinuities, e.g. from calling settimeofday(), the monotonic
raw time is interpolated in the usual way, but the realtime clock time
is adjusted by scaling the monotonic raw adjustment.

When an accessor function is used a history argument *must* be
provided. The history is initialized using ktime_get_snapshot() and
must be called before the counter values are read.

When the history is used to interpolate timestamp values, the realtime
clock time may be inaccurate to some degree. In general, the
longer the length of history the larger the interpolation error. If
there are discontinuities (large step changes) to the time, the error
can be very large.

Signed-off-by: Christopher S. Hall <christopher.s.hall@intel.com>
---
 include/linux/clocksource.h         |  16 ++++
 include/linux/timekeeper_internal.h |   4 +
 include/linux/timekeeping.h         |   9 +-
 kernel/time/timekeeping.c           | 162 ++++++++++++++++++++++++++++++++++--
 4 files changed, 185 insertions(+), 6 deletions(-)

diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 4b7973d..f413157 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -282,4 +282,20 @@ struct raw_system_counterval {
 	struct clocksource	*cs;
 };
 
+/*
+ * struct system_time_snapshot - simultaneous raw/real time capture with
+ *	counter value
+ * @cycles:	Clocksource counter value to produce the system times
+ * @real:	Realtime system time
+ * @raw:	Monotonic raw system time
+ * @cs_seq:	Sequence number associated with changed clocksource
+ */
+struct system_time_snapshot {
+	cycles_t	cycles;
+	ktime_t		real;
+	ktime_t		raw;
+	u8		cs_seq;
+	unsigned int	clock_set_seq;
+};
+
 #endif /* _LINUX_CLOCKSOURCE_H */
diff --git a/include/linux/timekeeper_internal.h b/include/linux/timekeeper_internal.h
index 2524722..156b51d 100644
--- a/include/linux/timekeeper_internal.h
+++ b/include/linux/timekeeper_internal.h
@@ -13,6 +13,9 @@
 /**
  * struct tk_read_base - base structure for timekeeping readout
  * @clock:	Current clocksource used for timekeeping.
+ * @cs_seq:	Clocksource sequence is incremented per clocksource change.
+ *	It's used to determine whether past system time can be related to
+ *	current system time
  * @read:	Read function of @clock
  * @mask:	Bitmask for two's complement subtraction of non 64bit clocks
  * @cycle_last: @clock cycle value at last update
@@ -29,6 +32,7 @@
  */
 struct tk_read_base {
 	struct clocksource	*clock;
+	u8			cs_seq;
 	cycle_t			(*read)(struct clocksource *cs);
 	cycle_t			mask;
 	cycle_t			cycle_last;
diff --git a/include/linux/timekeeping.h b/include/linux/timekeeping.h
index 2209943..2f290a2 100644
--- a/include/linux/timekeeping.h
+++ b/include/linux/timekeeping.h
@@ -295,11 +295,18 @@ struct get_sync_device_time {
 	void	 *ctx;
 };
 
+struct system_time_snapshot;
+/*
+ * Simultaneously snapshot realtime and monotonic raw clocks
+ */
+extern void ktime_get_snapshot(struct system_time_snapshot *systime_snapshot);
+
 /*
  * Get cross timestamp between system clock and device clock
  */
 extern int get_device_system_crosststamp(struct system_device_crosststamp *ct,
-					 struct get_sync_device_time *dt);
+					 struct get_sync_device_time *dt,
+					 struct system_time_snapshot *history);
 
 /*
  * Persistent clock related interfaces
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 9c1ddc3..5a7f784 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -235,11 +235,13 @@ static void tk_setup_internals(struct timekeeper *tk, struct clocksource *clock)
 
 	old_clock = tk->tkr_mono.clock;
 	tk->tkr_mono.clock = clock;
+	++tk->tkr_mono.cs_seq;
 	tk->tkr_mono.read = clock->read;
 	tk->tkr_mono.mask = clock->mask;
 	tk->tkr_mono.cycle_last = tk->tkr_mono.read(clock);
 
 	tk->tkr_raw.clock = clock;
+	++tk->tkr_raw.cs_seq;
 	tk->tkr_raw.read = clock->read;
 	tk->tkr_raw.mask = clock->mask;
 	tk->tkr_raw.cycle_last = tk->tkr_mono.cycle_last;
@@ -862,6 +864,39 @@ time64_t ktime_get_real_seconds(void)
 }
 EXPORT_SYMBOL_GPL(ktime_get_real_seconds);
 
+/**
+ * ktime_get_snapshot - snapshots the realtime/monotonic raw clocks with counter
+ * @snapshot:	pointer to struct receiving the system time snapshot
+ */
+void ktime_get_snapshot(struct system_time_snapshot *systime_snapshot)
+{
+	struct timekeeper *tk = &tk_core.timekeeper;
+	unsigned long seq;
+	ktime_t base_raw;
+	ktime_t base_real;
+	s64 nsec_raw;
+	s64 nsec_real;
+	cycle_t now;
+
+	do {
+		seq = read_seqcount_begin(&tk_core.seq);
+
+		now = tk->tkr_mono.read(tk->tkr_mono.clock);
+		systime_snapshot->cs_seq = tk->tkr_mono.cs_seq;
+		systime_snapshot->clock_set_seq = tk->clock_was_set_seq;
+		base_real = ktime_add(tk->tkr_mono.base,
+				      tk_core.timekeeper.offs_real);
+		base_raw = tk->tkr_raw.base;
+		nsec_real = timekeeping_cycles_to_ns(&tk->tkr_mono, now);
+		nsec_raw  = timekeeping_cycles_to_ns(&tk->tkr_raw, now);
+	} while (read_seqcount_retry(&tk_core.seq, seq));
+
+	systime_snapshot->cycles = now;
+	systime_snapshot->real = ktime_add_ns(base_real, nsec_real);
+	systime_snapshot->raw = ktime_add_ns(base_raw, nsec_raw);
+}
+EXPORT_SYMBOL_GPL(ktime_get_snapshot);
+
 #ifdef CONFIG_NTP_PPS
 
 /**
@@ -901,15 +936,82 @@ EXPORT_SYMBOL(ktime_get_raw_and_real_ts64);
 #endif /* CONFIG_NTP_PPS */
 
 /**
+ * adjust_historical_crosststamp - adjust crosstimestamp previous to current interval
+ * @total_history_cycles:	Total history length in cycles
+ * @partial_history_cycles:	Cycle offset into history (fractional part)
+ * @total_history_monoraw:	Total history length in monotonic raw ns
+ * @ts:				Cross timestamp that should be adjusted using
+ *	partial/total ratio
+ *
+ * Helper function used by get_device_system_crosststamp() to correct the
+ * crosstimestamp corresponding to the start of the current interval to the
+ * system counter value (timestamp point) provided by the driver. The
+ * total_history_* quantities are the total history starting at the provided
+ * reference point and ending at the start of the current interval. The cycle
+ * count between the driver timestamp point and the start of the current
+ * interval is partial_history_cycles.
+ */
+static void adjust_historical_crosststamp(cycle_t total_history_cycles,
+					  cycle_t partial_history_cycles,
+					  ktime_t total_history_monoraw,
+					  ktime_t total_history_realtime,
+					  bool discontinuity,
+					  struct system_device_crosststamp *ts)
+{
+	struct timekeeper *tk = &tk_core.timekeeper;
+	u64 corr_monoraw;
+	u64 corr_realtime;
+
+	/*
+	 * Scale the monotonic raw time delta by:
+	 *	partial_history_cycles / total_history_cycles
+	 */
+	corr_monoraw = (ktime_to_ns(total_history_monoraw) *
+			partial_history_cycles) / total_history_cycles;
+	/*
+	 * If there is a discontinuity in the history, scale monotonic raw
+	 *	correction by:
+	 *		mult(real)/mult(raw) yielding the realtime correction
+	 * Otherwise, calculate the realtime correction similar to monotonic
+	 *	raw calculation
+	 */
+	if (discontinuity)
+		corr_realtime = (corr_monoraw * tk->tkr_mono.mult) /
+			tk->tkr_raw.mult;
+	else
+		corr_realtime = (ktime_to_ns(total_history_realtime) *
+				partial_history_cycles) / total_history_cycles;
+
+	/* Fixup monotonic raw and real time time values */
+	ts->sys_monoraw = ktime_sub_ns(ts->sys_monoraw, corr_monoraw);
+	ts->sys_realtime = ktime_sub_ns(ts->sys_realtime, corr_realtime);
+}
+
+/*
+ * cycle_between - true if test occurs chronologically between before and after
+ */
+static bool cycle_between(cycles_t before, cycles_t test, cycles_t after)
+{
+	if (test > before && test < after)
+		return true;
+	if (test < before && before > after)
+		return true;
+	return false;
+}
+
+/**
  * get_device_system_crosststamp - Synchronously capture system/device timestamp
  * @xtstamp:		Receives simultaneously captured system and device time
  * @sync_devicetime:	Callback to get simultaneous device time and
  *	system counter from the device driver
+ * @history_ref:	Historical reference point used to interpolate system
+ *	time when counter provided by the driver is before the current interval
  *
  * Reads a timestamp from a device and correlates it to system time
  */
 int get_device_system_crosststamp(struct system_device_crosststamp *xtstamp,
-				  struct get_sync_device_time *sync_devicetime)
+				  struct get_sync_device_time *sync_devicetime,
+				  struct system_time_snapshot *history_ref)
 {
 	struct timekeeper *tk = &tk_core.timekeeper;
 	unsigned long seq;
@@ -918,6 +1020,12 @@ int get_device_system_crosststamp(struct system_device_crosststamp *xtstamp,
 	ktime_t base_real;
 	s64 nsec_raw;
 	s64 nsec_real;
+	cycles_t cycles;
+	cycle_t now;
+	cycle_t interval_start;
+	unsigned int clock_was_set_seq;
+	u8 cs_seq;
+	bool do_interp;
 	int ret;
 
 	do {
@@ -936,19 +1044,63 @@ int get_device_system_crosststamp(struct system_device_crosststamp *xtstamp,
 		 */
 		if (tk->tkr_mono.clock != raw_sys.cs)
 			return -ENODEV;
+		cycles = raw_sys.cycles;
+
+		/*
+		 * Check whether the system counter value provided by the
+		 * device driver is on the current interval.
+		 */
+		now = tk->tkr_mono.read(tk->tkr_mono.clock);
+		interval_start = tk->tkr_mono.cycle_last;
+		if (!cycle_between(interval_start, cycles, now)) {
+			cs_seq = tk->tkr_mono.cs_seq;
+			clock_was_set_seq = tk->clock_was_set_seq;
+			cycles = interval_start;
+			do_interp = true;
+		} else {
+			do_interp = false;
+		}
 
 		base_real = ktime_add(tk->tkr_mono.base,
 				      tk_core.timekeeper.offs_real);
 		base_raw = tk->tkr_raw.base;
 
-		nsec_real = timekeeping_cycles_to_ns(&tk->tkr_mono,
-						     raw_sys.cycles);
-		nsec_raw = timekeeping_cycles_to_ns(&tk->tkr_raw,
-						    raw_sys.cycles);
+		nsec_real = timekeeping_cycles_to_ns(&tk->tkr_mono, cycles);
+		nsec_raw = timekeeping_cycles_to_ns(&tk->tkr_raw, cycles);
 	} while (read_seqcount_retry(&tk_core.seq, seq));
 
 	xtstamp->sys_realtime = ktime_add_ns(base_real, nsec_real);
 	xtstamp->sys_monoraw = ktime_add_ns(base_raw, nsec_raw);
+
+	/*
+	 * Interpolate if necessary, working back from the start of the current
+	 * interval
+	 */
+	if (do_interp) {
+		cycle_t total_history_cycles;
+		ktime_t history_monoraw;
+		ktime_t history_realtime;
+		bool discontinuity;
+		cycle_t partial_history_cycles = cycles - raw_sys.cycles;
+
+		if (!history_ref || history_ref->cs_seq != cs_seq ||
+		    !cycle_between(history_ref->cycles, raw_sys.cycles,
+				   interval_start))
+			return -EINVAL;
+		history_monoraw = ktime_sub(xtstamp->sys_monoraw,
+					    history_ref->raw);
+		history_realtime = ktime_sub(xtstamp->sys_realtime,
+					     history_ref->real);
+		total_history_cycles = cycles - history_ref->cycles;
+		discontinuity =
+			history_ref->clock_set_seq != clock_was_set_seq;
+		adjust_historical_crosststamp(total_history_cycles,
+					      partial_history_cycles,
+					      history_monoraw,
+					      history_realtime, discontinuity,
+					      xtstamp);
+	}
+
 	return 0;
 }
 EXPORT_SYMBOL_GPL(get_device_system_crosststamp);
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [RFC v5 4/6] Remove duplicate code from ktime_get_raw_and_real code
  2016-01-04 12:45 ` [Intel-wired-lan] " Christopher S. Hall
@ 2016-01-04 12:45   ` Christopher S. Hall
  -1 siblings, 0 replies; 38+ messages in thread
From: Christopher S. Hall @ 2016-01-04 12:45 UTC (permalink / raw)
  To: tglx, richardcochran, mingo, john.stultz, hpa, jeffrey.t.kirsher
  Cc: Christopher S. Hall, x86, linux-kernel, intel-wired-lan, netdev,
	kevin.b.stanton

The code in ktime_get_snapshot() is a superset of the code in
ktime_get_raw_and_real() code. Changes the latter to call the former. A
side effect of this is that ktime_get_raw_and_real() returns two clock
times corresponding to the *exact* same clock tick. Previously, this
code read the underlying counter twice.

Signed-off-by: Christopher S. Hall <christopher.s.hall@intel.com>
---
 kernel/time/timekeeping.c | 20 ++++----------------
 1 file changed, 4 insertions(+), 16 deletions(-)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 5a7f784..a0f096c 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -910,26 +910,14 @@ EXPORT_SYMBOL_GPL(ktime_get_snapshot);
  */
 void ktime_get_raw_and_real_ts64(struct timespec64 *ts_raw, struct timespec64 *ts_real)
 {
-	struct timekeeper *tk = &tk_core.timekeeper;
-	unsigned long seq;
-	s64 nsecs_raw, nsecs_real;
+	struct system_time_snapshot snap;
 
 	WARN_ON_ONCE(timekeeping_suspended);
 
-	do {
-		seq = read_seqcount_begin(&tk_core.seq);
-
-		*ts_raw = tk->raw_time;
-		ts_real->tv_sec = tk->xtime_sec;
-		ts_real->tv_nsec = 0;
-
-		nsecs_raw  = timekeeping_get_ns(&tk->tkr_raw);
-		nsecs_real = timekeeping_get_ns(&tk->tkr_mono);
-
-	} while (read_seqcount_retry(&tk_core.seq, seq));
+	ktime_get_snapshot(&snap);
 
-	timespec64_add_ns(ts_raw, nsecs_raw);
-	timespec64_add_ns(ts_real, nsecs_real);
+	*ts_raw = ktime_to_timespec64(snap.raw);
+	*ts_real = ktime_to_timespec64(snap.real);
 }
 EXPORT_SYMBOL(ktime_get_raw_and_real_ts64);
 
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [Intel-wired-lan] [RFC v5 4/6] Remove duplicate code from ktime_get_raw_and_real code
@ 2016-01-04 12:45   ` Christopher S. Hall
  0 siblings, 0 replies; 38+ messages in thread
From: Christopher S. Hall @ 2016-01-04 12:45 UTC (permalink / raw)
  To: intel-wired-lan

The code in ktime_get_snapshot() is a superset of the code in
ktime_get_raw_and_real() code. Changes the latter to call the former. A
side effect of this is that ktime_get_raw_and_real() returns two clock
times corresponding to the *exact* same clock tick. Previously, this
code read the underlying counter twice.

Signed-off-by: Christopher S. Hall <christopher.s.hall@intel.com>
---
 kernel/time/timekeeping.c | 20 ++++----------------
 1 file changed, 4 insertions(+), 16 deletions(-)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 5a7f784..a0f096c 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -910,26 +910,14 @@ EXPORT_SYMBOL_GPL(ktime_get_snapshot);
  */
 void ktime_get_raw_and_real_ts64(struct timespec64 *ts_raw, struct timespec64 *ts_real)
 {
-	struct timekeeper *tk = &tk_core.timekeeper;
-	unsigned long seq;
-	s64 nsecs_raw, nsecs_real;
+	struct system_time_snapshot snap;
 
 	WARN_ON_ONCE(timekeeping_suspended);
 
-	do {
-		seq = read_seqcount_begin(&tk_core.seq);
-
-		*ts_raw = tk->raw_time;
-		ts_real->tv_sec = tk->xtime_sec;
-		ts_real->tv_nsec = 0;
-
-		nsecs_raw  = timekeeping_get_ns(&tk->tkr_raw);
-		nsecs_real = timekeeping_get_ns(&tk->tkr_mono);
-
-	} while (read_seqcount_retry(&tk_core.seq, seq));
+	ktime_get_snapshot(&snap);
 
-	timespec64_add_ns(ts_raw, nsecs_raw);
-	timespec64_add_ns(ts_real, nsecs_real);
+	*ts_raw = ktime_to_timespec64(snap.raw);
+	*ts_real = ktime_to_timespec64(snap.real);
 }
 EXPORT_SYMBOL(ktime_get_raw_and_real_ts64);
 
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [RFC v5 5/6] Add PTP_SYS_OFFSET_PRECISE for driver crosstimestamping
  2016-01-04 12:45 ` [Intel-wired-lan] " Christopher S. Hall
@ 2016-01-04 12:45   ` Christopher S. Hall
  -1 siblings, 0 replies; 38+ messages in thread
From: Christopher S. Hall @ 2016-01-04 12:45 UTC (permalink / raw)
  To: tglx, richardcochran, mingo, john.stultz, hpa, jeffrey.t.kirsher
  Cc: Christopher S. Hall, x86, linux-kernel, intel-wired-lan, netdev,
	kevin.b.stanton

Currently, network /system cross-timestamping is performed in the
PTP_SYS_OFFSET ioctl. The PTP clock driver reads gettimeofday() and
the gettime64() callback provided by the driver. The cross-timestamp
is best effort where the latency between the capture of system time
(getnstimeofday()) and the device time (driver callback) may be
significant.

The getsynctime() callback and corresponding PTP_SYS_OFFSET_PRECISE
ioctl allows the driver to perform this device/system correlation when
for example cross timestamp hardware is available. Modern Intel
systems can do this for onboard Ethernet controllers using the ART
counter. There is virtually zero latency between captures of the ART
and network device clock.

The capabilities ioctl (PTP_CLOCK_GETCAPS), is augmented allowing
applications to query whether or not drivers implement the getsynctime
callback, providing more precise cross timestamping.

Signed-off-by: Christopher S. Hall <christopher.s.hall@intel.com>
---
 Documentation/ptp/testptp.c      |  6 ++++--
 drivers/ptp/ptp_chardev.c        | 31 +++++++++++++++++++++++++++++++
 include/linux/ptp_clock_kernel.h |  8 ++++++++
 include/uapi/linux/ptp_clock.h   | 13 ++++++++++++-
 4 files changed, 55 insertions(+), 3 deletions(-)

diff --git a/Documentation/ptp/testptp.c b/Documentation/ptp/testptp.c
index 6c6247a..d99012f 100644
--- a/Documentation/ptp/testptp.c
+++ b/Documentation/ptp/testptp.c
@@ -277,13 +277,15 @@ int main(int argc, char *argv[])
 			       "  %d external time stamp channels\n"
 			       "  %d programmable periodic signals\n"
 			       "  %d pulse per second\n"
-			       "  %d programmable pins\n",
+			       "  %d programmable pins\n"
+			       "  %d cross timestamping\n",
 			       caps.max_adj,
 			       caps.n_alarm,
 			       caps.n_ext_ts,
 			       caps.n_per_out,
 			       caps.pps,
-			       caps.n_pins);
+			       caps.n_pins,
+			       caps.cross_timestamping);
 		}
 	}
 
diff --git a/drivers/ptp/ptp_chardev.c b/drivers/ptp/ptp_chardev.c
index da7bae9..f39ee73 100644
--- a/drivers/ptp/ptp_chardev.c
+++ b/drivers/ptp/ptp_chardev.c
@@ -22,6 +22,7 @@
 #include <linux/poll.h>
 #include <linux/sched.h>
 #include <linux/slab.h>
+#include <linux/timekeeping.h>
 
 #include "ptp_private.h"
 
@@ -120,11 +121,14 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, unsigned long arg)
 	struct ptp_clock_caps caps;
 	struct ptp_clock_request req;
 	struct ptp_sys_offset *sysoff = NULL;
+	struct ptp_sys_offset_precise precise_offset;
 	struct ptp_pin_desc pd;
 	struct ptp_clock *ptp = container_of(pc, struct ptp_clock, clock);
 	struct ptp_clock_info *ops = ptp->info;
 	struct ptp_clock_time *pct;
 	struct timespec64 ts;
+	struct system_device_crosststamp xtstamp;
+	u32 rem;
 	int enable, err = 0;
 	unsigned int i, pin_index;
 
@@ -138,6 +142,7 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, unsigned long arg)
 		caps.n_per_out = ptp->info->n_per_out;
 		caps.pps = ptp->info->pps;
 		caps.n_pins = ptp->info->n_pins;
+		caps.cross_timestamping = ptp->info->getsynctime != NULL;
 		if (copy_to_user((void __user *)arg, &caps, sizeof(caps)))
 			err = -EFAULT;
 		break;
@@ -180,6 +185,32 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, unsigned long arg)
 		err = ops->enable(ops, &req, enable);
 		break;
 
+	case PTP_SYS_OFFSET_PRECISE:
+		if (!ptp->info->getsynctime) {
+			err = -EINVAL;
+			break;
+		}
+		err = ptp->info->getsynctime(ptp->info, &xtstamp);
+		if (err)
+			break;
+
+		precise_offset.sys_real.sec =
+			div_u64_rem(ktime_to_ns(xtstamp.sys_realtime),
+				    NSEC_PER_SEC, &rem);
+		precise_offset.sys_real.nsec = rem;
+		precise_offset.sys_raw.sec =
+			div_u64_rem(ktime_to_ns(xtstamp.sys_monoraw),
+				    NSEC_PER_SEC, &rem);
+		precise_offset.sys_raw.nsec = rem;
+		precise_offset.dev.sec =
+			div_u64_rem(ktime_to_ns(xtstamp.device), NSEC_PER_SEC,
+				    &rem);
+		precise_offset.dev.nsec = rem;
+		if (copy_to_user((void __user *)arg, &precise_offset,
+				 sizeof(precise_offset)))
+			err = -EFAULT;
+		break;
+
 	case PTP_SYS_OFFSET:
 		sysoff = kmalloc(sizeof(*sysoff), GFP_KERNEL);
 		if (!sysoff) {
diff --git a/include/linux/ptp_clock_kernel.h b/include/linux/ptp_clock_kernel.h
index b8b7306..b1c8d48 100644
--- a/include/linux/ptp_clock_kernel.h
+++ b/include/linux/ptp_clock_kernel.h
@@ -38,6 +38,7 @@ struct ptp_clock_request {
 	};
 };
 
+struct system_device_crosststamp;
 /**
  * struct ptp_clock_info - decribes a PTP hardware clock
  *
@@ -67,6 +68,11 @@ struct ptp_clock_request {
  * @gettime64:  Reads the current time from the hardware clock.
  *              parameter ts: Holds the result.
  *
+ * @getsynctime:  Reads the current time from the hardware clock and
+ *                system clock simultaneously.
+ *                parameter cts: Contains timestamp (device,system) pair,
+ *                where system time is realtime and monotonic.
+ *
  * @settime64:  Set the current time on the hardware clock.
  *              parameter ts: Time value to set.
  *
@@ -105,6 +111,8 @@ struct ptp_clock_info {
 	int (*adjfreq)(struct ptp_clock_info *ptp, s32 delta);
 	int (*adjtime)(struct ptp_clock_info *ptp, s64 delta);
 	int (*gettime64)(struct ptp_clock_info *ptp, struct timespec64 *ts);
+	int (*getsynctime)(struct ptp_clock_info *ptp,
+			   struct system_device_crosststamp *cts);
 	int (*settime64)(struct ptp_clock_info *p, const struct timespec64 *ts);
 	int (*enable)(struct ptp_clock_info *ptp,
 		      struct ptp_clock_request *request, int on);
diff --git a/include/uapi/linux/ptp_clock.h b/include/uapi/linux/ptp_clock.h
index f0b7bfe..c6ed818 100644
--- a/include/uapi/linux/ptp_clock.h
+++ b/include/uapi/linux/ptp_clock.h
@@ -51,7 +51,9 @@ struct ptp_clock_caps {
 	int n_per_out; /* Number of programmable periodic signals. */
 	int pps;       /* Whether the clock supports a PPS callback. */
 	int n_pins;    /* Number of input/output pins. */
-	int rsv[14];   /* Reserved for future use. */
+	/* Whether the clock supports precise system-device cross timestamps */
+	int cross_timestamping;
+	int rsv[13];   /* Reserved for future use. */
 };
 
 struct ptp_extts_request {
@@ -81,6 +83,13 @@ struct ptp_sys_offset {
 	struct ptp_clock_time ts[2 * PTP_MAX_SAMPLES + 1];
 };
 
+struct ptp_sys_offset_precise {
+	struct ptp_clock_time dev;
+	struct ptp_clock_time sys_real;
+	struct ptp_clock_time sys_raw;
+	unsigned int rsv[4];    /* Reserved for future use. */
+};
+
 enum ptp_pin_function {
 	PTP_PF_NONE,
 	PTP_PF_EXTTS,
@@ -124,6 +133,8 @@ struct ptp_pin_desc {
 #define PTP_SYS_OFFSET     _IOW(PTP_CLK_MAGIC, 5, struct ptp_sys_offset)
 #define PTP_PIN_GETFUNC    _IOWR(PTP_CLK_MAGIC, 6, struct ptp_pin_desc)
 #define PTP_PIN_SETFUNC    _IOW(PTP_CLK_MAGIC, 7, struct ptp_pin_desc)
+#define PTP_SYS_OFFSET_PRECISE \
+	_IOWR(PTP_CLK_MAGIC, 8, struct ptp_sys_offset_precise)
 
 struct ptp_extts_event {
 	struct ptp_clock_time t; /* Time event occured. */
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [Intel-wired-lan] [RFC v5 5/6] Add PTP_SYS_OFFSET_PRECISE for driver crosstimestamping
@ 2016-01-04 12:45   ` Christopher S. Hall
  0 siblings, 0 replies; 38+ messages in thread
From: Christopher S. Hall @ 2016-01-04 12:45 UTC (permalink / raw)
  To: intel-wired-lan

Currently, network /system cross-timestamping is performed in the
PTP_SYS_OFFSET ioctl. The PTP clock driver reads gettimeofday() and
the gettime64() callback provided by the driver. The cross-timestamp
is best effort where the latency between the capture of system time
(getnstimeofday()) and the device time (driver callback) may be
significant.

The getsynctime() callback and corresponding PTP_SYS_OFFSET_PRECISE
ioctl allows the driver to perform this device/system correlation when
for example cross timestamp hardware is available. Modern Intel
systems can do this for onboard Ethernet controllers using the ART
counter. There is virtually zero latency between captures of the ART
and network device clock.

The capabilities ioctl (PTP_CLOCK_GETCAPS), is augmented allowing
applications to query whether or not drivers implement the getsynctime
callback, providing more precise cross timestamping.

Signed-off-by: Christopher S. Hall <christopher.s.hall@intel.com>
---
 Documentation/ptp/testptp.c      |  6 ++++--
 drivers/ptp/ptp_chardev.c        | 31 +++++++++++++++++++++++++++++++
 include/linux/ptp_clock_kernel.h |  8 ++++++++
 include/uapi/linux/ptp_clock.h   | 13 ++++++++++++-
 4 files changed, 55 insertions(+), 3 deletions(-)

diff --git a/Documentation/ptp/testptp.c b/Documentation/ptp/testptp.c
index 6c6247a..d99012f 100644
--- a/Documentation/ptp/testptp.c
+++ b/Documentation/ptp/testptp.c
@@ -277,13 +277,15 @@ int main(int argc, char *argv[])
 			       "  %d external time stamp channels\n"
 			       "  %d programmable periodic signals\n"
 			       "  %d pulse per second\n"
-			       "  %d programmable pins\n",
+			       "  %d programmable pins\n"
+			       "  %d cross timestamping\n",
 			       caps.max_adj,
 			       caps.n_alarm,
 			       caps.n_ext_ts,
 			       caps.n_per_out,
 			       caps.pps,
-			       caps.n_pins);
+			       caps.n_pins,
+			       caps.cross_timestamping);
 		}
 	}
 
diff --git a/drivers/ptp/ptp_chardev.c b/drivers/ptp/ptp_chardev.c
index da7bae9..f39ee73 100644
--- a/drivers/ptp/ptp_chardev.c
+++ b/drivers/ptp/ptp_chardev.c
@@ -22,6 +22,7 @@
 #include <linux/poll.h>
 #include <linux/sched.h>
 #include <linux/slab.h>
+#include <linux/timekeeping.h>
 
 #include "ptp_private.h"
 
@@ -120,11 +121,14 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, unsigned long arg)
 	struct ptp_clock_caps caps;
 	struct ptp_clock_request req;
 	struct ptp_sys_offset *sysoff = NULL;
+	struct ptp_sys_offset_precise precise_offset;
 	struct ptp_pin_desc pd;
 	struct ptp_clock *ptp = container_of(pc, struct ptp_clock, clock);
 	struct ptp_clock_info *ops = ptp->info;
 	struct ptp_clock_time *pct;
 	struct timespec64 ts;
+	struct system_device_crosststamp xtstamp;
+	u32 rem;
 	int enable, err = 0;
 	unsigned int i, pin_index;
 
@@ -138,6 +142,7 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, unsigned long arg)
 		caps.n_per_out = ptp->info->n_per_out;
 		caps.pps = ptp->info->pps;
 		caps.n_pins = ptp->info->n_pins;
+		caps.cross_timestamping = ptp->info->getsynctime != NULL;
 		if (copy_to_user((void __user *)arg, &caps, sizeof(caps)))
 			err = -EFAULT;
 		break;
@@ -180,6 +185,32 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, unsigned long arg)
 		err = ops->enable(ops, &req, enable);
 		break;
 
+	case PTP_SYS_OFFSET_PRECISE:
+		if (!ptp->info->getsynctime) {
+			err = -EINVAL;
+			break;
+		}
+		err = ptp->info->getsynctime(ptp->info, &xtstamp);
+		if (err)
+			break;
+
+		precise_offset.sys_real.sec =
+			div_u64_rem(ktime_to_ns(xtstamp.sys_realtime),
+				    NSEC_PER_SEC, &rem);
+		precise_offset.sys_real.nsec = rem;
+		precise_offset.sys_raw.sec =
+			div_u64_rem(ktime_to_ns(xtstamp.sys_monoraw),
+				    NSEC_PER_SEC, &rem);
+		precise_offset.sys_raw.nsec = rem;
+		precise_offset.dev.sec =
+			div_u64_rem(ktime_to_ns(xtstamp.device), NSEC_PER_SEC,
+				    &rem);
+		precise_offset.dev.nsec = rem;
+		if (copy_to_user((void __user *)arg, &precise_offset,
+				 sizeof(precise_offset)))
+			err = -EFAULT;
+		break;
+
 	case PTP_SYS_OFFSET:
 		sysoff = kmalloc(sizeof(*sysoff), GFP_KERNEL);
 		if (!sysoff) {
diff --git a/include/linux/ptp_clock_kernel.h b/include/linux/ptp_clock_kernel.h
index b8b7306..b1c8d48 100644
--- a/include/linux/ptp_clock_kernel.h
+++ b/include/linux/ptp_clock_kernel.h
@@ -38,6 +38,7 @@ struct ptp_clock_request {
 	};
 };
 
+struct system_device_crosststamp;
 /**
  * struct ptp_clock_info - decribes a PTP hardware clock
  *
@@ -67,6 +68,11 @@ struct ptp_clock_request {
  * @gettime64:  Reads the current time from the hardware clock.
  *              parameter ts: Holds the result.
  *
+ * @getsynctime:  Reads the current time from the hardware clock and
+ *                system clock simultaneously.
+ *                parameter cts: Contains timestamp (device,system) pair,
+ *                where system time is realtime and monotonic.
+ *
  * @settime64:  Set the current time on the hardware clock.
  *              parameter ts: Time value to set.
  *
@@ -105,6 +111,8 @@ struct ptp_clock_info {
 	int (*adjfreq)(struct ptp_clock_info *ptp, s32 delta);
 	int (*adjtime)(struct ptp_clock_info *ptp, s64 delta);
 	int (*gettime64)(struct ptp_clock_info *ptp, struct timespec64 *ts);
+	int (*getsynctime)(struct ptp_clock_info *ptp,
+			   struct system_device_crosststamp *cts);
 	int (*settime64)(struct ptp_clock_info *p, const struct timespec64 *ts);
 	int (*enable)(struct ptp_clock_info *ptp,
 		      struct ptp_clock_request *request, int on);
diff --git a/include/uapi/linux/ptp_clock.h b/include/uapi/linux/ptp_clock.h
index f0b7bfe..c6ed818 100644
--- a/include/uapi/linux/ptp_clock.h
+++ b/include/uapi/linux/ptp_clock.h
@@ -51,7 +51,9 @@ struct ptp_clock_caps {
 	int n_per_out; /* Number of programmable periodic signals. */
 	int pps;       /* Whether the clock supports a PPS callback. */
 	int n_pins;    /* Number of input/output pins. */
-	int rsv[14];   /* Reserved for future use. */
+	/* Whether the clock supports precise system-device cross timestamps */
+	int cross_timestamping;
+	int rsv[13];   /* Reserved for future use. */
 };
 
 struct ptp_extts_request {
@@ -81,6 +83,13 @@ struct ptp_sys_offset {
 	struct ptp_clock_time ts[2 * PTP_MAX_SAMPLES + 1];
 };
 
+struct ptp_sys_offset_precise {
+	struct ptp_clock_time dev;
+	struct ptp_clock_time sys_real;
+	struct ptp_clock_time sys_raw;
+	unsigned int rsv[4];    /* Reserved for future use. */
+};
+
 enum ptp_pin_function {
 	PTP_PF_NONE,
 	PTP_PF_EXTTS,
@@ -124,6 +133,8 @@ struct ptp_pin_desc {
 #define PTP_SYS_OFFSET     _IOW(PTP_CLK_MAGIC, 5, struct ptp_sys_offset)
 #define PTP_PIN_GETFUNC    _IOWR(PTP_CLK_MAGIC, 6, struct ptp_pin_desc)
 #define PTP_PIN_SETFUNC    _IOW(PTP_CLK_MAGIC, 7, struct ptp_pin_desc)
+#define PTP_SYS_OFFSET_PRECISE \
+	_IOWR(PTP_CLK_MAGIC, 8, struct ptp_sys_offset_precise)
 
 struct ptp_extts_event {
 	struct ptp_clock_time t; /* Time event occured. */
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [RFC v5 6/6] Adds hardware supported cross timestamp
  2016-01-04 12:45 ` [Intel-wired-lan] " Christopher S. Hall
@ 2016-01-04 12:45   ` Christopher S. Hall
  -1 siblings, 0 replies; 38+ messages in thread
From: Christopher S. Hall @ 2016-01-04 12:45 UTC (permalink / raw)
  To: tglx, richardcochran, mingo, john.stultz, hpa, jeffrey.t.kirsher
  Cc: Christopher S. Hall, x86, linux-kernel, intel-wired-lan, netdev,
	kevin.b.stanton

Modern Intel systems supports cross timestamping of the network device
clock and Always Running Timer (ART) in hardware.  This allows the
device time and system time to be precisely correlated. The timestamp
pair is returned through e1000e_phc_get_sync_devicetime() used by
get_device_system_crosststamp().  The hardware cross-timestamp result
is made available to applications through the PTP_SYS_OFFSET_PRECISE
ioctl which call e1000e_phc_getsynctime().

Signed-off-by: Christopher S. Hall <christopher.s.hall@intel.com>
---
 drivers/net/ethernet/intel/Kconfig          |  9 +++
 drivers/net/ethernet/intel/e1000e/defines.h |  5 ++
 drivers/net/ethernet/intel/e1000e/ptp.c     | 92 +++++++++++++++++++++++++++++
 drivers/net/ethernet/intel/e1000e/regs.h    |  4 ++
 4 files changed, 110 insertions(+)

diff --git a/drivers/net/ethernet/intel/Kconfig b/drivers/net/ethernet/intel/Kconfig
index 4163b16..62bbddc 100644
--- a/drivers/net/ethernet/intel/Kconfig
+++ b/drivers/net/ethernet/intel/Kconfig
@@ -83,6 +83,15 @@ config E1000E
 	  To compile this driver as a module, choose M here. The module
 	  will be called e1000e.
 
+config E1000E_HWTS
+	bool "Support HW cross-timestamp on PCH devices"
+	default y
+	depends on E1000E && X86
+	---help---
+	 Say Y to enable hardware supported cross-timestamping on PCH
+	 devices. The cross-timestamp is available through the PTP clock
+	 driver precise cross-timestamp ioctl (PTP_SYS_OFFSET_PRECISE).
+
 config IGB
 	tristate "Intel(R) 82575/82576 PCI-Express Gigabit Ethernet support"
 	depends on PCI
diff --git a/drivers/net/ethernet/intel/e1000e/defines.h b/drivers/net/ethernet/intel/e1000e/defines.h
index 133d407..13cff75 100644
--- a/drivers/net/ethernet/intel/e1000e/defines.h
+++ b/drivers/net/ethernet/intel/e1000e/defines.h
@@ -527,6 +527,11 @@
 #define E1000_RXCW_C          0x20000000        /* Receive config */
 #define E1000_RXCW_SYNCH      0x40000000        /* Receive config synch */
 
+/* HH Time Sync */
+#define E1000_TSYNCTXCTL_MAX_ALLOWED_DLY_MASK	0x0000F000 /* max delay */
+#define E1000_TSYNCTXCTL_SYNC_COMP		0x40000000 /* sync complete */
+#define E1000_TSYNCTXCTL_START_SYNC		0x80000000 /* initiate sync */
+
 #define E1000_TSYNCTXCTL_VALID		0x00000001 /* Tx timestamp valid */
 #define E1000_TSYNCTXCTL_ENABLED	0x00000010 /* enable Tx timestamping */
 
diff --git a/drivers/net/ethernet/intel/e1000e/ptp.c b/drivers/net/ethernet/intel/e1000e/ptp.c
index 25a0ad5..44d2caa 100644
--- a/drivers/net/ethernet/intel/e1000e/ptp.c
+++ b/drivers/net/ethernet/intel/e1000e/ptp.c
@@ -26,6 +26,11 @@
 
 #include "e1000.h"
 
+#ifdef CONFIG_E1000E_HWTS
+#include <asm/tsc.h>
+#include <linux/ktime.h>
+#endif
+
 /**
  * e1000e_phc_adjfreq - adjust the frequency of the hardware clock
  * @ptp: ptp clock structure
@@ -98,6 +103,87 @@ static int e1000e_phc_adjtime(struct ptp_clock_info *ptp, s64 delta)
 	return 0;
 }
 
+#ifdef CONFIG_E1000E_HWTS
+#define MAX_HW_WAIT_COUNT (3)
+
+/**
+ * e1000e_phc_get_sync_devicetime - Callback given to timekeeping code reads system/device registers
+ * @device: current device time
+ * @system: raw system counter value read synchronously with device time
+ * @ctx: context provided by timekeeping code
+ *
+ * Read device and system (ART) clock simultaneously and return the corrected
+ * clock values in ns.
+ **/
+static int e1000e_phc_get_sync_devicetime(ktime_t *device,
+					  struct raw_system_counterval *system,
+					  void *ctx)
+{
+	struct e1000_adapter *adapter = (struct e1000_adapter *)ctx;
+	struct e1000_hw *hw = &adapter->hw;
+	unsigned long flags;
+	int i;
+	u32 tsync_ctrl;
+	cycle_t dev_cycles;
+	int ret;
+
+	tsync_ctrl = er32(TSYNCTXCTL);
+	tsync_ctrl |= E1000_TSYNCTXCTL_START_SYNC |
+		E1000_TSYNCTXCTL_MAX_ALLOWED_DLY_MASK;
+	ew32(TSYNCTXCTL, tsync_ctrl);
+	for (i = 0; i < MAX_HW_WAIT_COUNT; ++i) {
+		udelay(1);
+		tsync_ctrl = er32(TSYNCTXCTL);
+		if (tsync_ctrl & E1000_TSYNCTXCTL_SYNC_COMP)
+			break;
+	}
+
+	if (i == MAX_HW_WAIT_COUNT) {
+		ret = -ETIMEDOUT;
+	} else {
+		ret = 0;
+		system->cycles = er32(PLTSTMPH);
+		system->cycles <<= 32;
+		system->cycles |= er32(PLTSTMPL);
+		system->cs = art_timestamper.related_cs;
+		system->cycles = art_timestamper.convert(&art_timestamper,
+							 system->cycles);
+
+		dev_cycles = er32(SYSSTMPH);
+		dev_cycles <<= 32;
+		dev_cycles |= er32(SYSSTMPL);
+		spin_lock_irqsave(&adapter->systim_lock, flags);
+		device->tv64 = timecounter_cyc2time(&adapter->tc, dev_cycles);
+		spin_unlock_irqrestore(&adapter->systim_lock, flags);
+	}
+
+	return ret;
+}
+
+/**
+ * e1000e_phc_getsynctime - Reads the current system/device cross timestamp
+ * @ptp: ptp clock structure
+ * @cts: structure containing timestamp
+ *
+ * Read device and system (ART) clock simultaneously and return the scaled
+ * clock values in ns.
+ **/
+static int e1000e_phc_getsynctime(struct ptp_clock_info *ptp,
+				  struct system_device_crosststamp *xtstamp)
+{
+	struct e1000_adapter *adapter = container_of(ptp, struct e1000_adapter,
+						     ptp_clock_info);
+	struct get_sync_device_time get_sync_devicetime;
+	int ret;
+
+	get_sync_devicetime.get_time = e1000e_phc_get_sync_devicetime;
+	get_sync_devicetime.ctx = adapter;
+	ret = get_device_system_crosststamp(xtstamp, &get_sync_devicetime,
+					    NULL);
+	return ret;
+}
+#endif/*CONFIG_E1000E_HWTS*/
+
 /**
  * e1000e_phc_gettime - Reads the current time from the hardware clock
  * @ptp: ptp clock structure
@@ -236,6 +322,12 @@ void e1000e_ptp_init(struct e1000_adapter *adapter)
 		break;
 	}
 
+#ifdef CONFIG_E1000E_HWTS
+	/* CPU must have ART and GBe must be from Sunrise Point or greater */
+	if (hw->mac.type >= e1000_pch_spt && boot_cpu_has(X86_FEATURE_ART))
+		adapter->ptp_clock_info.getsynctime = e1000e_phc_getsynctime;
+#endif/*CONFIG_E1000E_HWTS*/
+
 	INIT_DELAYED_WORK(&adapter->systim_overflow_work,
 			  e1000e_systim_overflow_work);
 
diff --git a/drivers/net/ethernet/intel/e1000e/regs.h b/drivers/net/ethernet/intel/e1000e/regs.h
index 1d5e0b7..0cb4d36 100644
--- a/drivers/net/ethernet/intel/e1000e/regs.h
+++ b/drivers/net/ethernet/intel/e1000e/regs.h
@@ -245,6 +245,10 @@
 #define E1000_SYSTIML	0x0B600	/* System time register Low - RO */
 #define E1000_SYSTIMH	0x0B604	/* System time register High - RO */
 #define E1000_TIMINCA	0x0B608	/* Increment attributes register - RW */
+#define E1000_SYSSTMPL  0x0B648 /* HH Timesync system stamp low register */
+#define E1000_SYSSTMPH  0x0B64C /* HH Timesync system stamp hi register */
+#define E1000_PLTSTMPL  0x0B640 /* HH Timesync platform stamp low register */
+#define E1000_PLTSTMPH  0x0B644 /* HH Timesync platform stamp hi register */
 #define E1000_RXMTRL	0x0B634	/* Time sync Rx EtherType and Msg Type - RW */
 #define E1000_RXUDP	0x0B638	/* Time Sync Rx UDP Port - RW */
 
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [Intel-wired-lan] [RFC v5 6/6] Adds hardware supported cross timestamp
@ 2016-01-04 12:45   ` Christopher S. Hall
  0 siblings, 0 replies; 38+ messages in thread
From: Christopher S. Hall @ 2016-01-04 12:45 UTC (permalink / raw)
  To: intel-wired-lan

Modern Intel systems supports cross timestamping of the network device
clock and Always Running Timer (ART) in hardware.  This allows the
device time and system time to be precisely correlated. The timestamp
pair is returned through e1000e_phc_get_sync_devicetime() used by
get_device_system_crosststamp().  The hardware cross-timestamp result
is made available to applications through the PTP_SYS_OFFSET_PRECISE
ioctl which call e1000e_phc_getsynctime().

Signed-off-by: Christopher S. Hall <christopher.s.hall@intel.com>
---
 drivers/net/ethernet/intel/Kconfig          |  9 +++
 drivers/net/ethernet/intel/e1000e/defines.h |  5 ++
 drivers/net/ethernet/intel/e1000e/ptp.c     | 92 +++++++++++++++++++++++++++++
 drivers/net/ethernet/intel/e1000e/regs.h    |  4 ++
 4 files changed, 110 insertions(+)

diff --git a/drivers/net/ethernet/intel/Kconfig b/drivers/net/ethernet/intel/Kconfig
index 4163b16..62bbddc 100644
--- a/drivers/net/ethernet/intel/Kconfig
+++ b/drivers/net/ethernet/intel/Kconfig
@@ -83,6 +83,15 @@ config E1000E
 	  To compile this driver as a module, choose M here. The module
 	  will be called e1000e.
 
+config E1000E_HWTS
+	bool "Support HW cross-timestamp on PCH devices"
+	default y
+	depends on E1000E && X86
+	---help---
+	 Say Y to enable hardware supported cross-timestamping on PCH
+	 devices. The cross-timestamp is available through the PTP clock
+	 driver precise cross-timestamp ioctl (PTP_SYS_OFFSET_PRECISE).
+
 config IGB
 	tristate "Intel(R) 82575/82576 PCI-Express Gigabit Ethernet support"
 	depends on PCI
diff --git a/drivers/net/ethernet/intel/e1000e/defines.h b/drivers/net/ethernet/intel/e1000e/defines.h
index 133d407..13cff75 100644
--- a/drivers/net/ethernet/intel/e1000e/defines.h
+++ b/drivers/net/ethernet/intel/e1000e/defines.h
@@ -527,6 +527,11 @@
 #define E1000_RXCW_C          0x20000000        /* Receive config */
 #define E1000_RXCW_SYNCH      0x40000000        /* Receive config synch */
 
+/* HH Time Sync */
+#define E1000_TSYNCTXCTL_MAX_ALLOWED_DLY_MASK	0x0000F000 /* max delay */
+#define E1000_TSYNCTXCTL_SYNC_COMP		0x40000000 /* sync complete */
+#define E1000_TSYNCTXCTL_START_SYNC		0x80000000 /* initiate sync */
+
 #define E1000_TSYNCTXCTL_VALID		0x00000001 /* Tx timestamp valid */
 #define E1000_TSYNCTXCTL_ENABLED	0x00000010 /* enable Tx timestamping */
 
diff --git a/drivers/net/ethernet/intel/e1000e/ptp.c b/drivers/net/ethernet/intel/e1000e/ptp.c
index 25a0ad5..44d2caa 100644
--- a/drivers/net/ethernet/intel/e1000e/ptp.c
+++ b/drivers/net/ethernet/intel/e1000e/ptp.c
@@ -26,6 +26,11 @@
 
 #include "e1000.h"
 
+#ifdef CONFIG_E1000E_HWTS
+#include <asm/tsc.h>
+#include <linux/ktime.h>
+#endif
+
 /**
  * e1000e_phc_adjfreq - adjust the frequency of the hardware clock
  * @ptp: ptp clock structure
@@ -98,6 +103,87 @@ static int e1000e_phc_adjtime(struct ptp_clock_info *ptp, s64 delta)
 	return 0;
 }
 
+#ifdef CONFIG_E1000E_HWTS
+#define MAX_HW_WAIT_COUNT (3)
+
+/**
+ * e1000e_phc_get_sync_devicetime - Callback given to timekeeping code reads system/device registers
+ * @device: current device time
+ * @system: raw system counter value read synchronously with device time
+ * @ctx: context provided by timekeeping code
+ *
+ * Read device and system (ART) clock simultaneously and return the corrected
+ * clock values in ns.
+ **/
+static int e1000e_phc_get_sync_devicetime(ktime_t *device,
+					  struct raw_system_counterval *system,
+					  void *ctx)
+{
+	struct e1000_adapter *adapter = (struct e1000_adapter *)ctx;
+	struct e1000_hw *hw = &adapter->hw;
+	unsigned long flags;
+	int i;
+	u32 tsync_ctrl;
+	cycle_t dev_cycles;
+	int ret;
+
+	tsync_ctrl = er32(TSYNCTXCTL);
+	tsync_ctrl |= E1000_TSYNCTXCTL_START_SYNC |
+		E1000_TSYNCTXCTL_MAX_ALLOWED_DLY_MASK;
+	ew32(TSYNCTXCTL, tsync_ctrl);
+	for (i = 0; i < MAX_HW_WAIT_COUNT; ++i) {
+		udelay(1);
+		tsync_ctrl = er32(TSYNCTXCTL);
+		if (tsync_ctrl & E1000_TSYNCTXCTL_SYNC_COMP)
+			break;
+	}
+
+	if (i == MAX_HW_WAIT_COUNT) {
+		ret = -ETIMEDOUT;
+	} else {
+		ret = 0;
+		system->cycles = er32(PLTSTMPH);
+		system->cycles <<= 32;
+		system->cycles |= er32(PLTSTMPL);
+		system->cs = art_timestamper.related_cs;
+		system->cycles = art_timestamper.convert(&art_timestamper,
+							 system->cycles);
+
+		dev_cycles = er32(SYSSTMPH);
+		dev_cycles <<= 32;
+		dev_cycles |= er32(SYSSTMPL);
+		spin_lock_irqsave(&adapter->systim_lock, flags);
+		device->tv64 = timecounter_cyc2time(&adapter->tc, dev_cycles);
+		spin_unlock_irqrestore(&adapter->systim_lock, flags);
+	}
+
+	return ret;
+}
+
+/**
+ * e1000e_phc_getsynctime - Reads the current system/device cross timestamp
+ * @ptp: ptp clock structure
+ * @cts: structure containing timestamp
+ *
+ * Read device and system (ART) clock simultaneously and return the scaled
+ * clock values in ns.
+ **/
+static int e1000e_phc_getsynctime(struct ptp_clock_info *ptp,
+				  struct system_device_crosststamp *xtstamp)
+{
+	struct e1000_adapter *adapter = container_of(ptp, struct e1000_adapter,
+						     ptp_clock_info);
+	struct get_sync_device_time get_sync_devicetime;
+	int ret;
+
+	get_sync_devicetime.get_time = e1000e_phc_get_sync_devicetime;
+	get_sync_devicetime.ctx = adapter;
+	ret = get_device_system_crosststamp(xtstamp, &get_sync_devicetime,
+					    NULL);
+	return ret;
+}
+#endif/*CONFIG_E1000E_HWTS*/
+
 /**
  * e1000e_phc_gettime - Reads the current time from the hardware clock
  * @ptp: ptp clock structure
@@ -236,6 +322,12 @@ void e1000e_ptp_init(struct e1000_adapter *adapter)
 		break;
 	}
 
+#ifdef CONFIG_E1000E_HWTS
+	/* CPU must have ART and GBe must be from Sunrise Point or greater */
+	if (hw->mac.type >= e1000_pch_spt && boot_cpu_has(X86_FEATURE_ART))
+		adapter->ptp_clock_info.getsynctime = e1000e_phc_getsynctime;
+#endif/*CONFIG_E1000E_HWTS*/
+
 	INIT_DELAYED_WORK(&adapter->systim_overflow_work,
 			  e1000e_systim_overflow_work);
 
diff --git a/drivers/net/ethernet/intel/e1000e/regs.h b/drivers/net/ethernet/intel/e1000e/regs.h
index 1d5e0b7..0cb4d36 100644
--- a/drivers/net/ethernet/intel/e1000e/regs.h
+++ b/drivers/net/ethernet/intel/e1000e/regs.h
@@ -245,6 +245,10 @@
 #define E1000_SYSTIML	0x0B600	/* System time register Low - RO */
 #define E1000_SYSTIMH	0x0B604	/* System time register High - RO */
 #define E1000_TIMINCA	0x0B608	/* Increment attributes register - RW */
+#define E1000_SYSSTMPL  0x0B648 /* HH Timesync system stamp low register */
+#define E1000_SYSSTMPH  0x0B64C /* HH Timesync system stamp hi register */
+#define E1000_PLTSTMPL  0x0B640 /* HH Timesync platform stamp low register */
+#define E1000_PLTSTMPH  0x0B644 /* HH Timesync platform stamp hi register */
 #define E1000_RXMTRL	0x0B634	/* Time sync Rx EtherType and Msg Type - RW */
 #define E1000_RXUDP	0x0B638	/* Time Sync Rx UDP Port - RW */
 
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [RFC v5 5/6] Add PTP_SYS_OFFSET_PRECISE for driver crosstimestamping
  2016-01-04 12:45   ` [Intel-wired-lan] " Christopher S. Hall
@ 2016-01-05 15:27     ` Richard Cochran
  -1 siblings, 0 replies; 38+ messages in thread
From: Richard Cochran @ 2016-01-05 15:27 UTC (permalink / raw)
  To: Christopher S. Hall
  Cc: tglx, mingo, john.stultz, hpa, jeffrey.t.kirsher, x86,
	linux-kernel, intel-wired-lan, netdev, kevin.b.stanton

On Mon, Jan 04, 2016 at 04:45:22AM -0800, Christopher S. Hall wrote:
> @@ -138,6 +142,7 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, unsigned long arg)
>  		caps.n_per_out = ptp->info->n_per_out;
>  		caps.pps = ptp->info->pps;
>  		caps.n_pins = ptp->info->n_pins;
> +		caps.cross_timestamping = ptp->info->getsynctime != NULL;
>  		if (copy_to_user((void __user *)arg, &caps, sizeof(caps)))
>  			err = -EFAULT;
>  		break;
> @@ -180,6 +185,32 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, unsigned long arg)
>  		err = ops->enable(ops, &req, enable);
>  		break;
>  
> +	case PTP_SYS_OFFSET_PRECISE:
> +		if (!ptp->info->getsynctime) {
> +			err = -EINVAL;

-EOPNOTSUPP would be better here.

> +			break;
> +		}
> +		err = ptp->info->getsynctime(ptp->info, &xtstamp);
> +		if (err)
> +			break;
> +
> +		precise_offset.sys_real.sec =
> +			div_u64_rem(ktime_to_ns(xtstamp.sys_realtime),
> +				    NSEC_PER_SEC, &rem);
> +		precise_offset.sys_real.nsec = rem;

How about this instead:

		ts = ktime_to_timespec64(xtstamp.sys_realtime);
		precise_offset.sys_real.sec = ts.tv_sec;
		precise_offset.sys_real.nsec = ts.tv_nsec;

> +		precise_offset.sys_raw.sec =
> +			div_u64_rem(ktime_to_ns(xtstamp.sys_monoraw),
> +				    NSEC_PER_SEC, &rem);
> +		precise_offset.sys_raw.nsec = rem;
> +		precise_offset.dev.sec =
> +			div_u64_rem(ktime_to_ns(xtstamp.device), NSEC_PER_SEC,
> +				    &rem);
> +		precise_offset.dev.nsec = rem;

And for these as well.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [Intel-wired-lan] [RFC v5 5/6] Add PTP_SYS_OFFSET_PRECISE for driver crosstimestamping
@ 2016-01-05 15:27     ` Richard Cochran
  0 siblings, 0 replies; 38+ messages in thread
From: Richard Cochran @ 2016-01-05 15:27 UTC (permalink / raw)
  To: intel-wired-lan

On Mon, Jan 04, 2016 at 04:45:22AM -0800, Christopher S. Hall wrote:
> @@ -138,6 +142,7 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, unsigned long arg)
>  		caps.n_per_out = ptp->info->n_per_out;
>  		caps.pps = ptp->info->pps;
>  		caps.n_pins = ptp->info->n_pins;
> +		caps.cross_timestamping = ptp->info->getsynctime != NULL;
>  		if (copy_to_user((void __user *)arg, &caps, sizeof(caps)))
>  			err = -EFAULT;
>  		break;
> @@ -180,6 +185,32 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, unsigned long arg)
>  		err = ops->enable(ops, &req, enable);
>  		break;
>  
> +	case PTP_SYS_OFFSET_PRECISE:
> +		if (!ptp->info->getsynctime) {
> +			err = -EINVAL;

-EOPNOTSUPP would be better here.

> +			break;
> +		}
> +		err = ptp->info->getsynctime(ptp->info, &xtstamp);
> +		if (err)
> +			break;
> +
> +		precise_offset.sys_real.sec =
> +			div_u64_rem(ktime_to_ns(xtstamp.sys_realtime),
> +				    NSEC_PER_SEC, &rem);
> +		precise_offset.sys_real.nsec = rem;

How about this instead:

		ts = ktime_to_timespec64(xtstamp.sys_realtime);
		precise_offset.sys_real.sec = ts.tv_sec;
		precise_offset.sys_real.nsec = ts.tv_nsec;

> +		precise_offset.sys_raw.sec =
> +			div_u64_rem(ktime_to_ns(xtstamp.sys_monoraw),
> +				    NSEC_PER_SEC, &rem);
> +		precise_offset.sys_raw.nsec = rem;
> +		precise_offset.dev.sec =
> +			div_u64_rem(ktime_to_ns(xtstamp.device), NSEC_PER_SEC,
> +				    &rem);
> +		precise_offset.dev.nsec = rem;

And for these as well.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v5 1/6] Timekeeping cross timestamp interface for device drivers
  2016-01-04 12:45   ` [Intel-wired-lan] " Christopher S. Hall
@ 2016-01-06 18:55     ` John Stultz
  -1 siblings, 0 replies; 38+ messages in thread
From: John Stultz @ 2016-01-06 18:55 UTC (permalink / raw)
  To: Christopher S. Hall
  Cc: Thomas Gleixner, Richard Cochran, Ingo Molnar, H. Peter Anvin,
	Jeff Kirsher, x86, lkml, intel-wired-lan, netdev, Stanton,
	Kevin B

On Mon, Jan 4, 2016 at 4:45 AM, Christopher S. Hall
<christopher.s.hall@intel.com> wrote:
> ACKNOWLEDGMENT: The original correlated clock source and cross
> timestamp code was developed by Thomas Gleixner
> <tglx@linutronix.de>. It has changed considerably and any mistakes are
> mine.
>
> The precision with which events on multiple networked systems can be
> synchronized using, as an example, PTP (IEEE 1588, 802.1AS) is limited
> by the precision of the cross timestamps between the system clock and
> the device (timestamp) clock. Precision here is the degree of
> simultaneity when capturing the cross timestamp.
>
> Currently the PTP cross timestamp is captured in software using the
> PTP device driver ioctl PTP_SYS_OFFSET. Reads of the device clock are
> interleaved with reads of the realtime clock. At best, the precision
> of this cross timestamp is on the order of several microseconds due to
> software latencies. Sub-microsecond precision is required for
> industrial control and some media applications. To achieve this level
> of precision hardware supported cross timestamping is needed.
>
> Hardware cross timestamps are derived from simultaneously capturing
> the device clock and the system clock. Applications use timestamps in
> nanoseconds rather than clock ticks. The device driver can scale
> device clock ticks to device time in nanoseconds, but cannot transform
> system clock ticks. Only the kernel timekeeping code can do this. The
> function get_device_system_crosststamp() allows device drivers to
> return a cross timestamp with system time properly scaled to
> nanoseconds.
>
> The cross timestamps contain the realtime and monotonic raw clock
> times. The realtime value is needed to discipline that clock using PTP
> and the monotonic raw value is used for applications that don't
> require a "real" time, but need an unadjusted clock time.
>
> The get_device_system_crosststamp() code calls back into the driver
> to ensure that the system counter is within the current timekeeping
> update interval. Because of possible NTP/PTP frequency adjustments,
> extrapolating a realtime clock time outside the current interval with
> a potentially different scaling factor can result in a small amount of
> error.
>
> Modern Intel hardware provides an Always Running Timer (ART) which is
> exactly related to TSC through a known frequency ratio. The ART is
> routed to devices on the system and is used to precisely and
> simultaneously capture the device clock with the ART. The kernel
> timekeeping code requires a system clock value from the current clock
> source to calculate the system time. The correlated clocksource adds a
> means to transform a clock (in this case ART) exactly to the system
> clock (TSC clocksource) that is used to calculate system time.
>
> Signed-off-by: Christopher S. Hall <christopher.s.hall@intel.com>
> ---
>  include/linux/clocksource.h | 27 ++++++++++++++++
>  include/linux/timekeeping.h | 35 +++++++++++++++++++++
>  kernel/time/timekeeping.c   | 77 ++++++++++++++++++++++++++++++++++++++++++---
>  3 files changed, 135 insertions(+), 4 deletions(-)
>
> diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
> index 7784b59..4b7973d 100644
> --- a/include/linux/clocksource.h
> +++ b/include/linux/clocksource.h
> @@ -255,4 +255,31 @@ static inline void clocksource_probe(void) {}
>  #define CLOCKSOURCE_ACPI_DECLARE(name, table_id, fn)           \
>         ACPI_DECLARE_PROBE_ENTRY(clksrc, name, table_id, 0, NULL, 0, fn)
>
> +/*
> + * struct correlated_cs - Descriptor for a clocksource correlated to another
> + *     clocksource
> + * @related_cs:                Pointer to the related timekeeping clocksource
> + * @convert:           Conversion function to convert a timestamp from
> + *                     the correlated clocksource to cycles of the related
> + *                     timekeeping clocksource
> + */
> +struct correlated_cs {
> +       struct clocksource      *related_cs;
> +       cycle_t                 (*convert)(struct correlated_cs *cs,
> +                                          cycle_t cycles);
> +};
> +
> +/*
> + * struct raw_system_counterval - system counter value captured in device
> + *     driver used to produce system/device cross-timestamp
> + * @system:    System counter value
> + * @cs:                Clocksource related to system counter value. This is used by
> + *             timekeeping code to verify validity of counter for system time
> + *             conversion
> + */
> +struct raw_system_counterval {
> +       cycle_t                 cycles;
> +       struct clocksource      *cs;
> +};
> +
>  #endif /* _LINUX_CLOCKSOURCE_H */
> diff --git a/include/linux/timekeeping.h b/include/linux/timekeeping.h
> index ec89d84..2209943 100644
> --- a/include/linux/timekeeping.h
> +++ b/include/linux/timekeeping.h
> @@ -266,6 +266,41 @@ extern void timekeeping_inject_sleeptime64(struct timespec64 *delta);
>  extern void ktime_get_raw_and_real_ts64(struct timespec64 *ts_raw,
>                                         struct timespec64 *ts_real);
>
> +
> +/*
> + * struct system_device_crosststamp - system/device cross-timestamp
> + *     (syncronized capture)
> + * @device:    Device time
> + * @realtime:  Realtime simultaneous with device time
> + * @monoraw:   Monotonic raw simultaneous with device time
> + */
> +struct system_device_crosststamp {
> +       ktime_t device;
> +       ktime_t sys_realtime;
> +       ktime_t sys_monoraw;
> +};
> +
> +struct raw_system_counterval;
> +/*
> + * struct get_sync_device_time - Provides method to capture device time
> + *     synchronized with raw system counter value
> + * @get_time:  Callback providing synchronized capture of device time
> + *             and system counter. Returns 0 on success, < 0 on failure
> + * @ctx:       Context provided to callback function
> + */
> +struct get_sync_device_time {
> +       int     (*get_time)(ktime_t *device,
> +                           struct raw_system_counterval *system,
> +                           void *ctx);
> +       void     *ctx;
> +};
> +
> +/*
> + * Get cross timestamp between system clock and device clock
> + */
> +extern int get_device_system_crosststamp(struct system_device_crosststamp *ct,
> +                                        struct get_sync_device_time *dt);


I feel like this is introducing a lot of very small structures, which
to the casual reviewer aren't immediately obvious how they interlink
and are used.  It also adds to the number of types we have to keep in
our head. I dunno, maybe its just the correlated_cs is added but isn't
used in this patch, but I keep feeling like these should be more
obvious somehow.

That's terribly vague feedback, so I'm sorry.  Maybe some concrete suggestions?

* Maybe try renaming get_sync_device_time as just  crosststamp_device/struct ?

* raw_system_counterval feels like its of limited value. Maybe just
pass the clocksource and cycle value to the functions separately?




>  /*
>   * Persistent clock related interfaces
>   */
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index d563c19..9c1ddc3 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -298,13 +298,11 @@ u32 (*arch_gettimeoffset)(void) = default_arch_gettimeoffset;
>  static inline u32 arch_gettimeoffset(void) { return 0; }
>  #endif
>
> -static inline s64 timekeeping_get_ns(struct tk_read_base *tkr)
> +static inline s64 timekeeping_delta_to_ns(struct tk_read_base *tkr,
> +                                         cycle_t delta)
>  {
> -       cycle_t delta;
>         s64 nsec;
>
> -       delta = timekeeping_get_delta(tkr);
> -
>         nsec = delta * tkr->mult + tkr->xtime_nsec;
>         nsec >>= tkr->shift;
>
> @@ -312,6 +310,24 @@ static inline s64 timekeeping_get_ns(struct tk_read_base *tkr)
>         return nsec + arch_gettimeoffset();
>  }
>
> +static inline s64 timekeeping_get_ns(struct tk_read_base *tkr)
> +{
> +       cycle_t delta;
> +
> +       delta = timekeeping_get_delta(tkr);
> +       return timekeeping_delta_to_ns(tkr, delta);
> +}
> +
> +static inline s64 timekeeping_cycles_to_ns(struct tk_read_base *tkr,
> +                                           cycle_t cycles)
> +{
> +       cycle_t delta;
> +
> +       /* calculate the delta since the last update_wall_time */
> +       delta = clocksource_delta(cycles, tkr->cycle_last, tkr->mask);
> +       return timekeeping_delta_to_ns(tkr, delta);
> +}
>

Mind splitting the above out into its own small patch?



>  /**
>   * update_fast_timekeeper - Update the fast and NMI safe monotonic timekeeper.
>   * @tkr: Timekeeping readout base from which we take the update
> @@ -885,6 +901,59 @@ EXPORT_SYMBOL(ktime_get_raw_and_real_ts64);
>  #endif /* CONFIG_NTP_PPS */
>
>  /**
> + * get_device_system_crosststamp - Synchronously capture system/device timestamp
> + * @xtstamp:           Receives simultaneously captured system and device time
> + * @sync_devicetime:   Callback to get simultaneous device time and
> + *     system counter from the device driver
> + *
> + * Reads a timestamp from a device and correlates it to system time
> + */
> +int get_device_system_crosststamp(struct system_device_crosststamp *xtstamp,
> +                                 struct get_sync_device_time *sync_devicetime)

Another nit, but to me something like:
int get_device_system_crosststamp(struct get_sync_device_time *sync_devicetime,
                                                          struct
system_device_crosststamp *ret)

Reads better to me. ie: use this device_time -> return that crosststamp.


Otherwise this is looking pretty good.

thanks
-john

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [Intel-wired-lan] [RFC v5 1/6] Timekeeping cross timestamp interface for device drivers
@ 2016-01-06 18:55     ` John Stultz
  0 siblings, 0 replies; 38+ messages in thread
From: John Stultz @ 2016-01-06 18:55 UTC (permalink / raw)
  To: intel-wired-lan

On Mon, Jan 4, 2016 at 4:45 AM, Christopher S. Hall
<christopher.s.hall@intel.com> wrote:
> ACKNOWLEDGMENT: The original correlated clock source and cross
> timestamp code was developed by Thomas Gleixner
> <tglx@linutronix.de>. It has changed considerably and any mistakes are
> mine.
>
> The precision with which events on multiple networked systems can be
> synchronized using, as an example, PTP (IEEE 1588, 802.1AS) is limited
> by the precision of the cross timestamps between the system clock and
> the device (timestamp) clock. Precision here is the degree of
> simultaneity when capturing the cross timestamp.
>
> Currently the PTP cross timestamp is captured in software using the
> PTP device driver ioctl PTP_SYS_OFFSET. Reads of the device clock are
> interleaved with reads of the realtime clock. At best, the precision
> of this cross timestamp is on the order of several microseconds due to
> software latencies. Sub-microsecond precision is required for
> industrial control and some media applications. To achieve this level
> of precision hardware supported cross timestamping is needed.
>
> Hardware cross timestamps are derived from simultaneously capturing
> the device clock and the system clock. Applications use timestamps in
> nanoseconds rather than clock ticks. The device driver can scale
> device clock ticks to device time in nanoseconds, but cannot transform
> system clock ticks. Only the kernel timekeeping code can do this. The
> function get_device_system_crosststamp() allows device drivers to
> return a cross timestamp with system time properly scaled to
> nanoseconds.
>
> The cross timestamps contain the realtime and monotonic raw clock
> times. The realtime value is needed to discipline that clock using PTP
> and the monotonic raw value is used for applications that don't
> require a "real" time, but need an unadjusted clock time.
>
> The get_device_system_crosststamp() code calls back into the driver
> to ensure that the system counter is within the current timekeeping
> update interval. Because of possible NTP/PTP frequency adjustments,
> extrapolating a realtime clock time outside the current interval with
> a potentially different scaling factor can result in a small amount of
> error.
>
> Modern Intel hardware provides an Always Running Timer (ART) which is
> exactly related to TSC through a known frequency ratio. The ART is
> routed to devices on the system and is used to precisely and
> simultaneously capture the device clock with the ART. The kernel
> timekeeping code requires a system clock value from the current clock
> source to calculate the system time. The correlated clocksource adds a
> means to transform a clock (in this case ART) exactly to the system
> clock (TSC clocksource) that is used to calculate system time.
>
> Signed-off-by: Christopher S. Hall <christopher.s.hall@intel.com>
> ---
>  include/linux/clocksource.h | 27 ++++++++++++++++
>  include/linux/timekeeping.h | 35 +++++++++++++++++++++
>  kernel/time/timekeeping.c   | 77 ++++++++++++++++++++++++++++++++++++++++++---
>  3 files changed, 135 insertions(+), 4 deletions(-)
>
> diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
> index 7784b59..4b7973d 100644
> --- a/include/linux/clocksource.h
> +++ b/include/linux/clocksource.h
> @@ -255,4 +255,31 @@ static inline void clocksource_probe(void) {}
>  #define CLOCKSOURCE_ACPI_DECLARE(name, table_id, fn)           \
>         ACPI_DECLARE_PROBE_ENTRY(clksrc, name, table_id, 0, NULL, 0, fn)
>
> +/*
> + * struct correlated_cs - Descriptor for a clocksource correlated to another
> + *     clocksource
> + * @related_cs:                Pointer to the related timekeeping clocksource
> + * @convert:           Conversion function to convert a timestamp from
> + *                     the correlated clocksource to cycles of the related
> + *                     timekeeping clocksource
> + */
> +struct correlated_cs {
> +       struct clocksource      *related_cs;
> +       cycle_t                 (*convert)(struct correlated_cs *cs,
> +                                          cycle_t cycles);
> +};
> +
> +/*
> + * struct raw_system_counterval - system counter value captured in device
> + *     driver used to produce system/device cross-timestamp
> + * @system:    System counter value
> + * @cs:                Clocksource related to system counter value. This is used by
> + *             timekeeping code to verify validity of counter for system time
> + *             conversion
> + */
> +struct raw_system_counterval {
> +       cycle_t                 cycles;
> +       struct clocksource      *cs;
> +};
> +
>  #endif /* _LINUX_CLOCKSOURCE_H */
> diff --git a/include/linux/timekeeping.h b/include/linux/timekeeping.h
> index ec89d84..2209943 100644
> --- a/include/linux/timekeeping.h
> +++ b/include/linux/timekeeping.h
> @@ -266,6 +266,41 @@ extern void timekeeping_inject_sleeptime64(struct timespec64 *delta);
>  extern void ktime_get_raw_and_real_ts64(struct timespec64 *ts_raw,
>                                         struct timespec64 *ts_real);
>
> +
> +/*
> + * struct system_device_crosststamp - system/device cross-timestamp
> + *     (syncronized capture)
> + * @device:    Device time
> + * @realtime:  Realtime simultaneous with device time
> + * @monoraw:   Monotonic raw simultaneous with device time
> + */
> +struct system_device_crosststamp {
> +       ktime_t device;
> +       ktime_t sys_realtime;
> +       ktime_t sys_monoraw;
> +};
> +
> +struct raw_system_counterval;
> +/*
> + * struct get_sync_device_time - Provides method to capture device time
> + *     synchronized with raw system counter value
> + * @get_time:  Callback providing synchronized capture of device time
> + *             and system counter. Returns 0 on success, < 0 on failure
> + * @ctx:       Context provided to callback function
> + */
> +struct get_sync_device_time {
> +       int     (*get_time)(ktime_t *device,
> +                           struct raw_system_counterval *system,
> +                           void *ctx);
> +       void     *ctx;
> +};
> +
> +/*
> + * Get cross timestamp between system clock and device clock
> + */
> +extern int get_device_system_crosststamp(struct system_device_crosststamp *ct,
> +                                        struct get_sync_device_time *dt);


I feel like this is introducing a lot of very small structures, which
to the casual reviewer aren't immediately obvious how they interlink
and are used.  It also adds to the number of types we have to keep in
our head. I dunno, maybe its just the correlated_cs is added but isn't
used in this patch, but I keep feeling like these should be more
obvious somehow.

That's terribly vague feedback, so I'm sorry.  Maybe some concrete suggestions?

* Maybe try renaming get_sync_device_time as just  crosststamp_device/struct ?

* raw_system_counterval feels like its of limited value. Maybe just
pass the clocksource and cycle value to the functions separately?




>  /*
>   * Persistent clock related interfaces
>   */
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index d563c19..9c1ddc3 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -298,13 +298,11 @@ u32 (*arch_gettimeoffset)(void) = default_arch_gettimeoffset;
>  static inline u32 arch_gettimeoffset(void) { return 0; }
>  #endif
>
> -static inline s64 timekeeping_get_ns(struct tk_read_base *tkr)
> +static inline s64 timekeeping_delta_to_ns(struct tk_read_base *tkr,
> +                                         cycle_t delta)
>  {
> -       cycle_t delta;
>         s64 nsec;
>
> -       delta = timekeeping_get_delta(tkr);
> -
>         nsec = delta * tkr->mult + tkr->xtime_nsec;
>         nsec >>= tkr->shift;
>
> @@ -312,6 +310,24 @@ static inline s64 timekeeping_get_ns(struct tk_read_base *tkr)
>         return nsec + arch_gettimeoffset();
>  }
>
> +static inline s64 timekeeping_get_ns(struct tk_read_base *tkr)
> +{
> +       cycle_t delta;
> +
> +       delta = timekeeping_get_delta(tkr);
> +       return timekeeping_delta_to_ns(tkr, delta);
> +}
> +
> +static inline s64 timekeeping_cycles_to_ns(struct tk_read_base *tkr,
> +                                           cycle_t cycles)
> +{
> +       cycle_t delta;
> +
> +       /* calculate the delta since the last update_wall_time */
> +       delta = clocksource_delta(cycles, tkr->cycle_last, tkr->mask);
> +       return timekeeping_delta_to_ns(tkr, delta);
> +}
>

Mind splitting the above out into its own small patch?



>  /**
>   * update_fast_timekeeper - Update the fast and NMI safe monotonic timekeeper.
>   * @tkr: Timekeeping readout base from which we take the update
> @@ -885,6 +901,59 @@ EXPORT_SYMBOL(ktime_get_raw_and_real_ts64);
>  #endif /* CONFIG_NTP_PPS */
>
>  /**
> + * get_device_system_crosststamp - Synchronously capture system/device timestamp
> + * @xtstamp:           Receives simultaneously captured system and device time
> + * @sync_devicetime:   Callback to get simultaneous device time and
> + *     system counter from the device driver
> + *
> + * Reads a timestamp from a device and correlates it to system time
> + */
> +int get_device_system_crosststamp(struct system_device_crosststamp *xtstamp,
> +                                 struct get_sync_device_time *sync_devicetime)

Another nit, but to me something like:
int get_device_system_crosststamp(struct get_sync_device_time *sync_devicetime,
                                                          struct
system_device_crosststamp *ret)

Reads better to me. ie: use this device_time -> return that crosststamp.


Otherwise this is looking pretty good.

thanks
-john

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v5 3/6] Add history to cross timestamp interface supporting slower devices
  2016-01-04 12:45   ` [Intel-wired-lan] " Christopher S. Hall
@ 2016-01-06 19:37     ` John Stultz
  -1 siblings, 0 replies; 38+ messages in thread
From: John Stultz @ 2016-01-06 19:37 UTC (permalink / raw)
  To: Christopher S. Hall
  Cc: Thomas Gleixner, Richard Cochran, Ingo Molnar, H. Peter Anvin,
	Jeff Kirsher, x86, lkml, intel-wired-lan, netdev, Stanton,
	Kevin B

On Mon, Jan 4, 2016 at 4:45 AM, Christopher S. Hall
<christopher.s.hall@intel.com> wrote:
> @@ -13,6 +13,9 @@
>  /**
>   * struct tk_read_base - base structure for timekeeping readout
>   * @clock:     Current clocksource used for timekeeping.
> + * @cs_seq:    Clocksource sequence is incremented per clocksource change.
> + *     It's used to determine whether past system time can be related to
> + *     current system time
>   * @read:      Read function of @clock
>   * @mask:      Bitmask for two's complement subtraction of non 64bit clocks
>   * @cycle_last: @clock cycle value at last update
> @@ -29,6 +32,7 @@
>   */
>  struct tk_read_base {
>         struct clocksource      *clock;
> +       u8                      cs_seq;
>         cycle_t                 (*read)(struct clocksource *cs);
>         cycle_t                 mask;
>         cycle_t                 cycle_last;


So Thomas optimized the tk_read_bases to fit in a cacheline, and so I
worry about the u8 being added there. I'm also cautious about
exporting these seq values out further via the system_time_snapshot.
But I may just need some more time to warm up to the idea.


> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index 9c1ddc3..5a7f784 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -235,11 +235,13 @@ static void tk_setup_internals(struct timekeeper *tk, struct clocksource *clock)
>
>         old_clock = tk->tkr_mono.clock;
>         tk->tkr_mono.clock = clock;
> +       ++tk->tkr_mono.cs_seq;
>         tk->tkr_mono.read = clock->read;
>         tk->tkr_mono.mask = clock->mask;
>         tk->tkr_mono.cycle_last = tk->tkr_mono.read(clock);
>
>         tk->tkr_raw.clock = clock;
> +       ++tk->tkr_raw.cs_seq;
>         tk->tkr_raw.read = clock->read;
>         tk->tkr_raw.mask = clock->mask;
>         tk->tkr_raw.cycle_last = tk->tkr_mono.cycle_last;
> @@ -862,6 +864,39 @@ time64_t ktime_get_real_seconds(void)
>  }
>  EXPORT_SYMBOL_GPL(ktime_get_real_seconds);
>
> +/**
> + * ktime_get_snapshot - snapshots the realtime/monotonic raw clocks with counter
> + * @snapshot:  pointer to struct receiving the system time snapshot
> + */
> +void ktime_get_snapshot(struct system_time_snapshot *systime_snapshot)
> +{
> +       struct timekeeper *tk = &tk_core.timekeeper;
> +       unsigned long seq;
> +       ktime_t base_raw;
> +       ktime_t base_real;
> +       s64 nsec_raw;
> +       s64 nsec_real;
> +       cycle_t now;
> +
> +       do {
> +               seq = read_seqcount_begin(&tk_core.seq);
> +
> +               now = tk->tkr_mono.read(tk->tkr_mono.clock);
> +               systime_snapshot->cs_seq = tk->tkr_mono.cs_seq;
> +               systime_snapshot->clock_set_seq = tk->clock_was_set_seq;
> +               base_real = ktime_add(tk->tkr_mono.base,
> +                                     tk_core.timekeeper.offs_real);
> +               base_raw = tk->tkr_raw.base;
> +               nsec_real = timekeeping_cycles_to_ns(&tk->tkr_mono, now);
> +               nsec_raw  = timekeeping_cycles_to_ns(&tk->tkr_raw, now);
> +       } while (read_seqcount_retry(&tk_core.seq, seq));
> +
> +       systime_snapshot->cycles = now;
> +       systime_snapshot->real = ktime_add_ns(base_real, nsec_real);
> +       systime_snapshot->raw = ktime_add_ns(base_raw, nsec_raw);
> +}
> +EXPORT_SYMBOL_GPL(ktime_get_snapshot);


So can you split out this adding of ktime_get_snapshot()  (maybe
skipping the seqcount bits initially) into a separate patch?

> @@ -936,19 +1044,63 @@ int get_device_system_crosststamp(struct system_device_crosststamp *xtstamp,
>                  */
>                 if (tk->tkr_mono.clock != raw_sys.cs)
>                         return -ENODEV;
> +               cycles = raw_sys.cycles;
> +
> +               /*
> +                * Check whether the system counter value provided by the
> +                * device driver is on the current interval.
> +                */
> +               now = tk->tkr_mono.read(tk->tkr_mono.clock);
> +               interval_start = tk->tkr_mono.cycle_last;
> +               if (!cycle_between(interval_start, cycles, now)) {
> +                       cs_seq = tk->tkr_mono.cs_seq;
> +                       clock_was_set_seq = tk->clock_was_set_seq;
> +                       cycles = interval_start;
> +                       do_interp = true;
> +               } else {
> +                       do_interp = false;
> +               }
>
>                 base_real = ktime_add(tk->tkr_mono.base,
>                                       tk_core.timekeeper.offs_real);
>                 base_raw = tk->tkr_raw.base;
>
> -               nsec_real = timekeeping_cycles_to_ns(&tk->tkr_mono,
> -                                                    raw_sys.cycles);
> -               nsec_raw = timekeeping_cycles_to_ns(&tk->tkr_raw,
> -                                                   raw_sys.cycles);
> +               nsec_real = timekeeping_cycles_to_ns(&tk->tkr_mono, cycles);
> +               nsec_raw = timekeeping_cycles_to_ns(&tk->tkr_raw, cycles);
>         } while (read_seqcount_retry(&tk_core.seq, seq));
>
>         xtstamp->sys_realtime = ktime_add_ns(base_real, nsec_real);
>         xtstamp->sys_monoraw = ktime_add_ns(base_raw, nsec_raw);
> +
> +       /*
> +        * Interpolate if necessary, working back from the start of the current
> +        * interval
> +        */
> +       if (do_interp) {
> +               cycle_t total_history_cycles;
> +               ktime_t history_monoraw;
> +               ktime_t history_realtime;
> +               bool discontinuity;
> +               cycle_t partial_history_cycles = cycles - raw_sys.cycles;
> +
> +               if (!history_ref || history_ref->cs_seq != cs_seq ||

I've not done a super close reading here. But is it very likely the
the history_ref->cs_seq is the same as the captured seq? I thought
this history_ref was to allow old cross stamps to be used to improve
the back-calculation of the time at the given cycle value. So throwing
them out if they are older then the last tick seems strange.

thanks
-john

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [Intel-wired-lan] [RFC v5 3/6] Add history to cross timestamp interface supporting slower devices
@ 2016-01-06 19:37     ` John Stultz
  0 siblings, 0 replies; 38+ messages in thread
From: John Stultz @ 2016-01-06 19:37 UTC (permalink / raw)
  To: intel-wired-lan

On Mon, Jan 4, 2016 at 4:45 AM, Christopher S. Hall
<christopher.s.hall@intel.com> wrote:
> @@ -13,6 +13,9 @@
>  /**
>   * struct tk_read_base - base structure for timekeeping readout
>   * @clock:     Current clocksource used for timekeeping.
> + * @cs_seq:    Clocksource sequence is incremented per clocksource change.
> + *     It's used to determine whether past system time can be related to
> + *     current system time
>   * @read:      Read function of @clock
>   * @mask:      Bitmask for two's complement subtraction of non 64bit clocks
>   * @cycle_last: @clock cycle value at last update
> @@ -29,6 +32,7 @@
>   */
>  struct tk_read_base {
>         struct clocksource      *clock;
> +       u8                      cs_seq;
>         cycle_t                 (*read)(struct clocksource *cs);
>         cycle_t                 mask;
>         cycle_t                 cycle_last;


So Thomas optimized the tk_read_bases to fit in a cacheline, and so I
worry about the u8 being added there. I'm also cautious about
exporting these seq values out further via the system_time_snapshot.
But I may just need some more time to warm up to the idea.


> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index 9c1ddc3..5a7f784 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -235,11 +235,13 @@ static void tk_setup_internals(struct timekeeper *tk, struct clocksource *clock)
>
>         old_clock = tk->tkr_mono.clock;
>         tk->tkr_mono.clock = clock;
> +       ++tk->tkr_mono.cs_seq;
>         tk->tkr_mono.read = clock->read;
>         tk->tkr_mono.mask = clock->mask;
>         tk->tkr_mono.cycle_last = tk->tkr_mono.read(clock);
>
>         tk->tkr_raw.clock = clock;
> +       ++tk->tkr_raw.cs_seq;
>         tk->tkr_raw.read = clock->read;
>         tk->tkr_raw.mask = clock->mask;
>         tk->tkr_raw.cycle_last = tk->tkr_mono.cycle_last;
> @@ -862,6 +864,39 @@ time64_t ktime_get_real_seconds(void)
>  }
>  EXPORT_SYMBOL_GPL(ktime_get_real_seconds);
>
> +/**
> + * ktime_get_snapshot - snapshots the realtime/monotonic raw clocks with counter
> + * @snapshot:  pointer to struct receiving the system time snapshot
> + */
> +void ktime_get_snapshot(struct system_time_snapshot *systime_snapshot)
> +{
> +       struct timekeeper *tk = &tk_core.timekeeper;
> +       unsigned long seq;
> +       ktime_t base_raw;
> +       ktime_t base_real;
> +       s64 nsec_raw;
> +       s64 nsec_real;
> +       cycle_t now;
> +
> +       do {
> +               seq = read_seqcount_begin(&tk_core.seq);
> +
> +               now = tk->tkr_mono.read(tk->tkr_mono.clock);
> +               systime_snapshot->cs_seq = tk->tkr_mono.cs_seq;
> +               systime_snapshot->clock_set_seq = tk->clock_was_set_seq;
> +               base_real = ktime_add(tk->tkr_mono.base,
> +                                     tk_core.timekeeper.offs_real);
> +               base_raw = tk->tkr_raw.base;
> +               nsec_real = timekeeping_cycles_to_ns(&tk->tkr_mono, now);
> +               nsec_raw  = timekeeping_cycles_to_ns(&tk->tkr_raw, now);
> +       } while (read_seqcount_retry(&tk_core.seq, seq));
> +
> +       systime_snapshot->cycles = now;
> +       systime_snapshot->real = ktime_add_ns(base_real, nsec_real);
> +       systime_snapshot->raw = ktime_add_ns(base_raw, nsec_raw);
> +}
> +EXPORT_SYMBOL_GPL(ktime_get_snapshot);


So can you split out this adding of ktime_get_snapshot()  (maybe
skipping the seqcount bits initially) into a separate patch?

> @@ -936,19 +1044,63 @@ int get_device_system_crosststamp(struct system_device_crosststamp *xtstamp,
>                  */
>                 if (tk->tkr_mono.clock != raw_sys.cs)
>                         return -ENODEV;
> +               cycles = raw_sys.cycles;
> +
> +               /*
> +                * Check whether the system counter value provided by the
> +                * device driver is on the current interval.
> +                */
> +               now = tk->tkr_mono.read(tk->tkr_mono.clock);
> +               interval_start = tk->tkr_mono.cycle_last;
> +               if (!cycle_between(interval_start, cycles, now)) {
> +                       cs_seq = tk->tkr_mono.cs_seq;
> +                       clock_was_set_seq = tk->clock_was_set_seq;
> +                       cycles = interval_start;
> +                       do_interp = true;
> +               } else {
> +                       do_interp = false;
> +               }
>
>                 base_real = ktime_add(tk->tkr_mono.base,
>                                       tk_core.timekeeper.offs_real);
>                 base_raw = tk->tkr_raw.base;
>
> -               nsec_real = timekeeping_cycles_to_ns(&tk->tkr_mono,
> -                                                    raw_sys.cycles);
> -               nsec_raw = timekeeping_cycles_to_ns(&tk->tkr_raw,
> -                                                   raw_sys.cycles);
> +               nsec_real = timekeeping_cycles_to_ns(&tk->tkr_mono, cycles);
> +               nsec_raw = timekeeping_cycles_to_ns(&tk->tkr_raw, cycles);
>         } while (read_seqcount_retry(&tk_core.seq, seq));
>
>         xtstamp->sys_realtime = ktime_add_ns(base_real, nsec_real);
>         xtstamp->sys_monoraw = ktime_add_ns(base_raw, nsec_raw);
> +
> +       /*
> +        * Interpolate if necessary, working back from the start of the current
> +        * interval
> +        */
> +       if (do_interp) {
> +               cycle_t total_history_cycles;
> +               ktime_t history_monoraw;
> +               ktime_t history_realtime;
> +               bool discontinuity;
> +               cycle_t partial_history_cycles = cycles - raw_sys.cycles;
> +
> +               if (!history_ref || history_ref->cs_seq != cs_seq ||

I've not done a super close reading here. But is it very likely the
the history_ref->cs_seq is the same as the captured seq? I thought
this history_ref was to allow old cross stamps to be used to improve
the back-calculation of the time at the given cycle value. So throwing
them out if they are older then the last tick seems strange.

thanks
-john

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v5 4/6] Remove duplicate code from ktime_get_raw_and_real code
  2016-01-04 12:45   ` [Intel-wired-lan] " Christopher S. Hall
@ 2016-01-06 19:42     ` John Stultz
  -1 siblings, 0 replies; 38+ messages in thread
From: John Stultz @ 2016-01-06 19:42 UTC (permalink / raw)
  To: Christopher S. Hall
  Cc: Thomas Gleixner, Richard Cochran, Ingo Molnar, H. Peter Anvin,
	Jeff Kirsher, x86, lkml, intel-wired-lan, netdev, Stanton,
	Kevin B

On Mon, Jan 4, 2016 at 4:45 AM, Christopher S. Hall
<christopher.s.hall@intel.com> wrote:
> The code in ktime_get_snapshot() is a superset of the code in
> ktime_get_raw_and_real() code. Changes the latter to call the former. A
> side effect of this is that ktime_get_raw_and_real() returns two clock
> times corresponding to the *exact* same clock tick. Previously, this
> code read the underlying counter twice.
>
> Signed-off-by: Christopher S. Hall <christopher.s.hall@intel.com>
> ---
>  kernel/time/timekeeping.c | 20 ++++----------------
>  1 file changed, 4 insertions(+), 16 deletions(-)
>
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index 5a7f784..a0f096c 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -910,26 +910,14 @@ EXPORT_SYMBOL_GPL(ktime_get_snapshot);
>   */
>  void ktime_get_raw_and_real_ts64(struct timespec64 *ts_raw, struct timespec64 *ts_real)
>  {
> -       struct timekeeper *tk = &tk_core.timekeeper;
> -       unsigned long seq;
> -       s64 nsecs_raw, nsecs_real;
> +       struct system_time_snapshot snap;
>
>         WARN_ON_ONCE(timekeeping_suspended);
>
> -       do {
> -               seq = read_seqcount_begin(&tk_core.seq);
> -
> -               *ts_raw = tk->raw_time;
> -               ts_real->tv_sec = tk->xtime_sec;
> -               ts_real->tv_nsec = 0;
> -
> -               nsecs_raw  = timekeeping_get_ns(&tk->tkr_raw);
> -               nsecs_real = timekeeping_get_ns(&tk->tkr_mono);
> -
> -       } while (read_seqcount_retry(&tk_core.seq, seq));
> +       ktime_get_snapshot(&snap);
>
> -       timespec64_add_ns(ts_raw, nsecs_raw);
> -       timespec64_add_ns(ts_real, nsecs_real);
> +       *ts_raw = ktime_to_timespec64(snap.raw);
> +       *ts_real = ktime_to_timespec64(snap.real);
>  }
>  EXPORT_SYMBOL(ktime_get_raw_and_real_ts64);

Looks sane, but you might even be able to get rid of this function
entirely and modify the one user (pps_get_ts()) to use
ktime_get_snapshot().

thanks
-john

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [Intel-wired-lan] [RFC v5 4/6] Remove duplicate code from ktime_get_raw_and_real code
@ 2016-01-06 19:42     ` John Stultz
  0 siblings, 0 replies; 38+ messages in thread
From: John Stultz @ 2016-01-06 19:42 UTC (permalink / raw)
  To: intel-wired-lan

On Mon, Jan 4, 2016 at 4:45 AM, Christopher S. Hall
<christopher.s.hall@intel.com> wrote:
> The code in ktime_get_snapshot() is a superset of the code in
> ktime_get_raw_and_real() code. Changes the latter to call the former. A
> side effect of this is that ktime_get_raw_and_real() returns two clock
> times corresponding to the *exact* same clock tick. Previously, this
> code read the underlying counter twice.
>
> Signed-off-by: Christopher S. Hall <christopher.s.hall@intel.com>
> ---
>  kernel/time/timekeeping.c | 20 ++++----------------
>  1 file changed, 4 insertions(+), 16 deletions(-)
>
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index 5a7f784..a0f096c 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -910,26 +910,14 @@ EXPORT_SYMBOL_GPL(ktime_get_snapshot);
>   */
>  void ktime_get_raw_and_real_ts64(struct timespec64 *ts_raw, struct timespec64 *ts_real)
>  {
> -       struct timekeeper *tk = &tk_core.timekeeper;
> -       unsigned long seq;
> -       s64 nsecs_raw, nsecs_real;
> +       struct system_time_snapshot snap;
>
>         WARN_ON_ONCE(timekeeping_suspended);
>
> -       do {
> -               seq = read_seqcount_begin(&tk_core.seq);
> -
> -               *ts_raw = tk->raw_time;
> -               ts_real->tv_sec = tk->xtime_sec;
> -               ts_real->tv_nsec = 0;
> -
> -               nsecs_raw  = timekeeping_get_ns(&tk->tkr_raw);
> -               nsecs_real = timekeeping_get_ns(&tk->tkr_mono);
> -
> -       } while (read_seqcount_retry(&tk_core.seq, seq));
> +       ktime_get_snapshot(&snap);
>
> -       timespec64_add_ns(ts_raw, nsecs_raw);
> -       timespec64_add_ns(ts_real, nsecs_real);
> +       *ts_raw = ktime_to_timespec64(snap.raw);
> +       *ts_real = ktime_to_timespec64(snap.real);
>  }
>  EXPORT_SYMBOL(ktime_get_raw_and_real_ts64);

Looks sane, but you might even be able to get rid of this function
entirely and modify the one user (pps_get_ts()) to use
ktime_get_snapshot().

thanks
-john

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v5 5/6] Add PTP_SYS_OFFSET_PRECISE for driver crosstimestamping
  2016-01-05 15:27     ` [Intel-wired-lan] " Richard Cochran
@ 2016-01-07  1:42       ` Christopher Hall
  -1 siblings, 0 replies; 38+ messages in thread
From: Christopher Hall @ 2016-01-07  1:42 UTC (permalink / raw)
  To: Richard Cochran
  Cc: tglx, mingo, john.stultz, hpa, jeffrey.t.kirsher, x86,
	linux-kernel, intel-wired-lan, netdev, kevin.b.stanton

Hi Richard,

This all sounds fine.  Thanks for the feedback.  I'll roll this into the  
next patchset.

Chris

On Tue, 05 Jan 2016 07:27:32 -0800, Richard Cochran  
<richardcochran@gmail.com> wrote:
> On Mon, Jan 04, 2016 at 04:45:22AM -0800, Christopher S. Hall wrote:
>> +	case PTP_SYS_OFFSET_PRECISE:
>> +		if (!ptp->info->getsynctime) {
>> +			err = -EINVAL;
>
> -EOPNOTSUPP would be better here.
>
>> +		precise_offset.sys_real.sec =
>> +			div_u64_rem(ktime_to_ns(xtstamp.sys_realtime),
>> +				    NSEC_PER_SEC, &rem);
>> +		precise_offset.sys_real.nsec = rem;
>
> How about this instead:
>
> 		ts = ktime_to_timespec64(xtstamp.sys_realtime);
> 		precise_offset.sys_real.sec = ts.tv_sec;
> 		precise_offset.sys_real.nsec = ts.tv_nsec;
>
>> +		precise_offset.sys_raw.sec =
>> +			div_u64_rem(ktime_to_ns(xtstamp.sys_monoraw),
>> +				    NSEC_PER_SEC, &rem);
>> +		precise_offset.sys_raw.nsec = rem;
>> +		precise_offset.dev.sec =
>> +			div_u64_rem(ktime_to_ns(xtstamp.device), NSEC_PER_SEC,
>> +				    &rem);
>> +		precise_offset.dev.nsec = rem;
>
> And for these as well.
>
> Thanks,
> Richard

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [Intel-wired-lan] [RFC v5 5/6] Add PTP_SYS_OFFSET_PRECISE for driver crosstimestamping
@ 2016-01-07  1:42       ` Christopher Hall
  0 siblings, 0 replies; 38+ messages in thread
From: Christopher Hall @ 2016-01-07  1:42 UTC (permalink / raw)
  To: intel-wired-lan

Hi Richard,

This all sounds fine.  Thanks for the feedback.  I'll roll this into the  
next patchset.

Chris

On Tue, 05 Jan 2016 07:27:32 -0800, Richard Cochran  
<richardcochran@gmail.com> wrote:
> On Mon, Jan 04, 2016 at 04:45:22AM -0800, Christopher S. Hall wrote:
>> +	case PTP_SYS_OFFSET_PRECISE:
>> +		if (!ptp->info->getsynctime) {
>> +			err = -EINVAL;
>
> -EOPNOTSUPP would be better here.
>
>> +		precise_offset.sys_real.sec =
>> +			div_u64_rem(ktime_to_ns(xtstamp.sys_realtime),
>> +				    NSEC_PER_SEC, &rem);
>> +		precise_offset.sys_real.nsec = rem;
>
> How about this instead:
>
> 		ts = ktime_to_timespec64(xtstamp.sys_realtime);
> 		precise_offset.sys_real.sec = ts.tv_sec;
> 		precise_offset.sys_real.nsec = ts.tv_nsec;
>
>> +		precise_offset.sys_raw.sec =
>> +			div_u64_rem(ktime_to_ns(xtstamp.sys_monoraw),
>> +				    NSEC_PER_SEC, &rem);
>> +		precise_offset.sys_raw.nsec = rem;
>> +		precise_offset.dev.sec =
>> +			div_u64_rem(ktime_to_ns(xtstamp.device), NSEC_PER_SEC,
>> +				    &rem);
>> +		precise_offset.dev.nsec = rem;
>
> And for these as well.
>
> Thanks,
> Richard

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v5 1/6] Timekeeping cross timestamp interface for device drivers
  2016-01-06 18:55     ` [Intel-wired-lan] " John Stultz
@ 2016-01-08  0:42       ` Christopher Hall
  -1 siblings, 0 replies; 38+ messages in thread
From: Christopher Hall @ 2016-01-08  0:42 UTC (permalink / raw)
  To: John Stultz
  Cc: Thomas Gleixner, Richard Cochran, Ingo Molnar, H. Peter Anvin,
	Jeff Kirsher, x86, lkml, intel-wired-lan, netdev, Stanton,
	Kevin B

Hi John,

On Wed, 06 Jan 2016 10:55:22 -0800, John Stultz <john.stultz@linaro.org>  
wrote:
> On Mon, Jan 4, 2016 at 4:45 AM, Christopher S. Hall
> <christopher.s.hall@intel.com> wrote:
>> +/*
>> + * struct correlated_cs - Descriptor for a clocksource correlated to  
>> another
>> + *     clocksource
>> + * @related_cs:                Pointer to the related timekeeping  
>> clocksource
>> + * @convert:           Conversion function to convert a timestamp from
>> + *                     the correlated clocksource to cycles of the  
>> related
>> + *                     timekeeping clocksource
>> + */
>> +struct correlated_cs {
>> +       struct clocksource      *related_cs;
>> +       cycle_t                 (*convert)(struct correlated_cs *cs,
>> +                                          cycle_t cycles);
>> +};

How about making this another patch? It's not directly related to the  
cross timestamp. I would lean toward making it its own patch leaving the  
ART correlated clocksource patch touching arch/ only.

>> +/*
>> + * struct raw_system_counterval - system counter value captured in  
>> device
>> + *     driver used to produce system/device cross-timestamp
>> + * @system:    System counter value
>> + * @cs:                Clocksource related to system counter value.  
>> This is used by
>> + *             timekeeping code to verify validity of counter for  
>> system time
>> + *             conversion
>> + */
>> +struct raw_system_counterval {
>> +       cycle_t                 cycles;
>> +       struct clocksource      *cs;
>> +};
>> +
>>  #endif /* _LINUX_CLOCKSOURCE_H */
>> diff --git a/include/linux/timekeeping.h b/include/linux/timekeeping.h
>> index ec89d84..2209943 100644
>> --- a/include/linux/timekeeping.h
>> +++ b/include/linux/timekeeping.h
>> @@ -266,6 +266,41 @@ extern void timekeeping_inject_sleeptime64(struct  
>> timespec64 *delta);
>>  extern void ktime_get_raw_and_real_ts64(struct timespec64 *ts_raw,
>>                                         struct timespec64 *ts_real);
>>
>> +
>> +/*
>> + * struct system_device_crosststamp - system/device cross-timestamp
>> + *     (syncronized capture)
>> + * @device:    Device time
>> + * @realtime:  Realtime simultaneous with device time
>> + * @monoraw:   Monotonic raw simultaneous with device time
>> + */
>> +struct system_device_crosststamp {
>> +       ktime_t device;
>> +       ktime_t sys_realtime;
>> +       ktime_t sys_monoraw;
>> +};
>> +
>> +struct raw_system_counterval;
>> +/*
>> + * struct get_sync_device_time - Provides method to capture device time
>> + *     synchronized with raw system counter value
>> + * @get_time:  Callback providing synchronized capture of device time
>> + *             and system counter. Returns 0 on success, < 0 on failure
>> + * @ctx:       Context provided to callback function
>> + */
>> +struct get_sync_device_time {
>> +       int     (*get_time)(ktime_t *device,
>> +                           struct raw_system_counterval *system,
>> +                           void *ctx);
>> +       void     *ctx;
>> +};
>> +
>> +/*
>> + * Get cross timestamp between system clock and device clock
>> + */
>> +extern int get_device_system_crosststamp(struct  
>> system_device_crosststamp *ct,
>> +                                        struct get_sync_device_time  
>> *dt);
>
> I feel like this is introducing a lot of very small structures, which
> to the casual reviewer aren't immediately obvious how they interlink
> and are used.  It also adds to the number of types we have to keep in
> our head. I dunno, maybe its just the correlated_cs is added but isn't
> used in this patch, but I keep feeling like these should be more
> obvious somehow.
>
> That's terribly vague feedback, so I'm sorry.  Maybe some concrete  
> suggestions?
>
> * Maybe try renaming get_sync_device_time as just   
> crosststamp_device/struct ?

My goal here is to differentiate between the cross timestamp (final  
result) and the synchronized capture (intermediate value). Maybe it isn't  
particularly useful (or worse additionally confusing), but when reading  
the code I found it confusing to call everything a cross timestamp. I  
think it's the units - cycles and nanoseconds - that are the source of my  
confusion in this case. The units are obvious from the struct member  
types, but it seemed more straightforward to differentiate in the struct  
name.  Does this make sense?  If I missed the mark maybe it should be  
changed.

> * raw_system_counterval feels like its of limited value. Maybe just
> pass the clocksource and cycle value to the functions separately?

I agree it is of somewhat limited value, but I think it makes the cross  
timestamp code more intuitive (the header file, as you pointed out, not as  
much). Maybe better commenting of the struct declaration would be useful  
here? To your comment above, I can keep more structs in my head than  
arguments. Also, to my thinking there isn't any intuitive ordering if  
these struct members are changed to individual arguments which might make  
the driver callback confusing.

>>
>> -static inline s64 timekeeping_get_ns(struct tk_read_base *tkr)
>> +static inline s64 timekeeping_delta_to_ns(struct tk_read_base *tkr,
>> +                                         cycle_t delta)
>>  {
>> -       cycle_t delta;
>>         s64 nsec;
>>
>> -       delta = timekeeping_get_delta(tkr);
>> -
>>         nsec = delta * tkr->mult + tkr->xtime_nsec;
>>         nsec >>= tkr->shift;
>>
>> @@ -312,6 +310,24 @@ static inline s64 timekeeping_get_ns(struct  
>> tk_read_base *tkr)
>>         return nsec + arch_gettimeoffset();
>>  }

> Mind splitting the above out into its own small patch?

Makes sense. To be clear your suggestion is to make this patch 1 of X and  
then 2 of X will be the cross timestamp patch.  Is that right?

>>  /**
>>   * update_fast_timekeeper - Update the fast and NMI safe monotonic  
>> timekeeper.
>>   * @tkr: Timekeeping readout base from which we take the update
>> @@ -885,6 +901,59 @@ EXPORT_SYMBOL(ktime_get_raw_and_real_ts64);
>>  #endif /* CONFIG_NTP_PPS */
>>
>>  /**
>> + * get_device_system_crosststamp - Synchronously capture system/device  
>> timestamp
>> + * @xtstamp:           Receives simultaneously captured system and  
>> device time
>> + * @sync_devicetime:   Callback to get simultaneous device time and
>> + *     system counter from the device driver
>> + *
>> + * Reads a timestamp from a device and correlates it to system time
>> + */
>> +int get_device_system_crosststamp(struct system_device_crosststamp  
>> *xtstamp,
>> +                                 struct get_sync_device_time  
>> *sync_devicetime)
>
> Another nit, but to me something like:
> int get_device_system_crosststamp(struct get_sync_device_time  
> *sync_devicetime,
>                                                           struct
> system_device_crosststamp *ret)
>
> Reads better to me. ie: use this device_time -> return that crosststamp.

OK.  My brain "works" the opposite way, but I think (after a small amount  
of research) I'm in the minority:)

Thanks,
Chris

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [Intel-wired-lan] [RFC v5 1/6] Timekeeping cross timestamp interface for device drivers
@ 2016-01-08  0:42       ` Christopher Hall
  0 siblings, 0 replies; 38+ messages in thread
From: Christopher Hall @ 2016-01-08  0:42 UTC (permalink / raw)
  To: intel-wired-lan

Hi John,

On Wed, 06 Jan 2016 10:55:22 -0800, John Stultz <john.stultz@linaro.org>  
wrote:
> On Mon, Jan 4, 2016 at 4:45 AM, Christopher S. Hall
> <christopher.s.hall@intel.com> wrote:
>> +/*
>> + * struct correlated_cs - Descriptor for a clocksource correlated to  
>> another
>> + *     clocksource
>> + * @related_cs:                Pointer to the related timekeeping  
>> clocksource
>> + * @convert:           Conversion function to convert a timestamp from
>> + *                     the correlated clocksource to cycles of the  
>> related
>> + *                     timekeeping clocksource
>> + */
>> +struct correlated_cs {
>> +       struct clocksource      *related_cs;
>> +       cycle_t                 (*convert)(struct correlated_cs *cs,
>> +                                          cycle_t cycles);
>> +};

How about making this another patch? It's not directly related to the  
cross timestamp. I would lean toward making it its own patch leaving the  
ART correlated clocksource patch touching arch/ only.

>> +/*
>> + * struct raw_system_counterval - system counter value captured in  
>> device
>> + *     driver used to produce system/device cross-timestamp
>> + * @system:    System counter value
>> + * @cs:                Clocksource related to system counter value.  
>> This is used by
>> + *             timekeeping code to verify validity of counter for  
>> system time
>> + *             conversion
>> + */
>> +struct raw_system_counterval {
>> +       cycle_t                 cycles;
>> +       struct clocksource      *cs;
>> +};
>> +
>>  #endif /* _LINUX_CLOCKSOURCE_H */
>> diff --git a/include/linux/timekeeping.h b/include/linux/timekeeping.h
>> index ec89d84..2209943 100644
>> --- a/include/linux/timekeeping.h
>> +++ b/include/linux/timekeeping.h
>> @@ -266,6 +266,41 @@ extern void timekeeping_inject_sleeptime64(struct  
>> timespec64 *delta);
>>  extern void ktime_get_raw_and_real_ts64(struct timespec64 *ts_raw,
>>                                         struct timespec64 *ts_real);
>>
>> +
>> +/*
>> + * struct system_device_crosststamp - system/device cross-timestamp
>> + *     (syncronized capture)
>> + * @device:    Device time
>> + * @realtime:  Realtime simultaneous with device time
>> + * @monoraw:   Monotonic raw simultaneous with device time
>> + */
>> +struct system_device_crosststamp {
>> +       ktime_t device;
>> +       ktime_t sys_realtime;
>> +       ktime_t sys_monoraw;
>> +};
>> +
>> +struct raw_system_counterval;
>> +/*
>> + * struct get_sync_device_time - Provides method to capture device time
>> + *     synchronized with raw system counter value
>> + * @get_time:  Callback providing synchronized capture of device time
>> + *             and system counter. Returns 0 on success, < 0 on failure
>> + * @ctx:       Context provided to callback function
>> + */
>> +struct get_sync_device_time {
>> +       int     (*get_time)(ktime_t *device,
>> +                           struct raw_system_counterval *system,
>> +                           void *ctx);
>> +       void     *ctx;
>> +};
>> +
>> +/*
>> + * Get cross timestamp between system clock and device clock
>> + */
>> +extern int get_device_system_crosststamp(struct  
>> system_device_crosststamp *ct,
>> +                                        struct get_sync_device_time  
>> *dt);
>
> I feel like this is introducing a lot of very small structures, which
> to the casual reviewer aren't immediately obvious how they interlink
> and are used.  It also adds to the number of types we have to keep in
> our head. I dunno, maybe its just the correlated_cs is added but isn't
> used in this patch, but I keep feeling like these should be more
> obvious somehow.
>
> That's terribly vague feedback, so I'm sorry.  Maybe some concrete  
> suggestions?
>
> * Maybe try renaming get_sync_device_time as just   
> crosststamp_device/struct ?

My goal here is to differentiate between the cross timestamp (final  
result) and the synchronized capture (intermediate value). Maybe it isn't  
particularly useful (or worse additionally confusing), but when reading  
the code I found it confusing to call everything a cross timestamp. I  
think it's the units - cycles and nanoseconds - that are the source of my  
confusion in this case. The units are obvious from the struct member  
types, but it seemed more straightforward to differentiate in the struct  
name.  Does this make sense?  If I missed the mark maybe it should be  
changed.

> * raw_system_counterval feels like its of limited value. Maybe just
> pass the clocksource and cycle value to the functions separately?

I agree it is of somewhat limited value, but I think it makes the cross  
timestamp code more intuitive (the header file, as you pointed out, not as  
much). Maybe better commenting of the struct declaration would be useful  
here? To your comment above, I can keep more structs in my head than  
arguments. Also, to my thinking there isn't any intuitive ordering if  
these struct members are changed to individual arguments which might make  
the driver callback confusing.

>>
>> -static inline s64 timekeeping_get_ns(struct tk_read_base *tkr)
>> +static inline s64 timekeeping_delta_to_ns(struct tk_read_base *tkr,
>> +                                         cycle_t delta)
>>  {
>> -       cycle_t delta;
>>         s64 nsec;
>>
>> -       delta = timekeeping_get_delta(tkr);
>> -
>>         nsec = delta * tkr->mult + tkr->xtime_nsec;
>>         nsec >>= tkr->shift;
>>
>> @@ -312,6 +310,24 @@ static inline s64 timekeeping_get_ns(struct  
>> tk_read_base *tkr)
>>         return nsec + arch_gettimeoffset();
>>  }

> Mind splitting the above out into its own small patch?

Makes sense. To be clear your suggestion is to make this patch 1 of X and  
then 2 of X will be the cross timestamp patch.  Is that right?

>>  /**
>>   * update_fast_timekeeper - Update the fast and NMI safe monotonic  
>> timekeeper.
>>   * @tkr: Timekeeping readout base from which we take the update
>> @@ -885,6 +901,59 @@ EXPORT_SYMBOL(ktime_get_raw_and_real_ts64);
>>  #endif /* CONFIG_NTP_PPS */
>>
>>  /**
>> + * get_device_system_crosststamp - Synchronously capture system/device  
>> timestamp
>> + * @xtstamp:           Receives simultaneously captured system and  
>> device time
>> + * @sync_devicetime:   Callback to get simultaneous device time and
>> + *     system counter from the device driver
>> + *
>> + * Reads a timestamp from a device and correlates it to system time
>> + */
>> +int get_device_system_crosststamp(struct system_device_crosststamp  
>> *xtstamp,
>> +                                 struct get_sync_device_time  
>> *sync_devicetime)
>
> Another nit, but to me something like:
> int get_device_system_crosststamp(struct get_sync_device_time  
> *sync_devicetime,
>                                                           struct
> system_device_crosststamp *ret)
>
> Reads better to me. ie: use this device_time -> return that crosststamp.

OK.  My brain "works" the opposite way, but I think (after a small amount  
of research) I'm in the minority:)

Thanks,
Chris

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v5 1/6] Timekeeping cross timestamp interface for device drivers
  2016-01-08  0:42       ` [Intel-wired-lan] " Christopher Hall
@ 2016-01-08  1:05         ` John Stultz
  -1 siblings, 0 replies; 38+ messages in thread
From: John Stultz @ 2016-01-08  1:05 UTC (permalink / raw)
  To: Christopher Hall
  Cc: Thomas Gleixner, Richard Cochran, Ingo Molnar, H. Peter Anvin,
	Jeff Kirsher, x86, lkml, intel-wired-lan, netdev, Stanton,
	Kevin B

On Thu, Jan 7, 2016 at 4:42 PM, Christopher Hall
<christopher.s.hall@intel.com> wrote:
> Hi John,
>
> On Wed, 06 Jan 2016 10:55:22 -0800, John Stultz <john.stultz@linaro.org>
> wrote:
>>
>> On Mon, Jan 4, 2016 at 4:45 AM, Christopher S. Hall
>> <christopher.s.hall@intel.com> wrote:
>>>
>>> +/*
>>> + * struct correlated_cs - Descriptor for a clocksource correlated to
>>> another
>>> + *     clocksource
>>> + * @related_cs:                Pointer to the related timekeeping
>>> clocksource
>>> + * @convert:           Conversion function to convert a timestamp from
>>> + *                     the correlated clocksource to cycles of the
>>> related
>>> + *                     timekeeping clocksource
>>> + */
>>> +struct correlated_cs {
>>> +       struct clocksource      *related_cs;
>>> +       cycle_t                 (*convert)(struct correlated_cs *cs,
>>> +                                          cycle_t cycles);
>>> +};
>
>
> How about making this another patch? It's not directly related to the cross
> timestamp. I would lean toward making it its own patch leaving the ART
> correlated clocksource patch touching arch/ only.

Sure. Tying it more closely (via explicit commit message) to the
change that uses it would be better then kind of slipping it in here.


>> I feel like this is introducing a lot of very small structures, which
>> to the casual reviewer aren't immediately obvious how they interlink
>> and are used.  It also adds to the number of types we have to keep in
>> our head. I dunno, maybe its just the correlated_cs is added but isn't
>> used in this patch, but I keep feeling like these should be more
>> obvious somehow.
>>
>> That's terribly vague feedback, so I'm sorry.  Maybe some concrete
>> suggestions?
>>
>> * Maybe try renaming get_sync_device_time as just
>> crosststamp_device/struct ?
>
>
> My goal here is to differentiate between the cross timestamp (final result)
> and the synchronized capture (intermediate value). Maybe it isn't
> particularly useful (or worse additionally confusing), but when reading the
> code I found it confusing to call everything a cross timestamp. I think it's
> the units - cycles and nanoseconds - that are the source of my confusion in
> this case. The units are obvious from the struct member types, but it seemed
> more straightforward to differentiate in the struct name.  Does this make
> sense?  If I missed the mark maybe it should be changed.

Sure. I think having a different type then just cycles here is good,
and counterval is fine for an alternate name, but get_sync_device_time
looks like a function, not a structure name to me. The fact that the
structure is mostly an accessor to the hardware is fine, but maybe
counterval_device or something would be better? counterval_source? I
dunno.


>> * raw_system_counterval feels like its of limited value. Maybe just
>> pass the clocksource and cycle value to the functions separately?
>
>
> I agree it is of somewhat limited value, but I think it makes the cross
> timestamp code more intuitive (the header file, as you pointed out, not as
> much). Maybe better commenting of the struct declaration would be useful
> here? To your comment above, I can keep more structs in my head than
> arguments. Also, to my thinking there isn't any intuitive ordering if these
> struct members are changed to individual arguments which might make the
> driver callback confusing.

Yea. I just feel like lots of structures add extra abstraction that
makes the code harder to learn or re-learn. Not only is one trying to
remember the base types that are being passed around, but you also
have to remember what all the meta-structures are for. Especially for
these two-value structures.

And I apologize. I know this sort of nit-picking is sort of the worst.
"Its great you named your child that, but I'd prefer to call them..."
;)  But for long-term maintenance, under the fog of time, obviousness
is really important.


>>> -static inline s64 timekeeping_get_ns(struct tk_read_base *tkr)
>>> +static inline s64 timekeeping_delta_to_ns(struct tk_read_base *tkr,
>>> +                                         cycle_t delta)
>>>  {
>>> -       cycle_t delta;
>>>         s64 nsec;
>>>
>>> -       delta = timekeeping_get_delta(tkr);
>>> -
>>>         nsec = delta * tkr->mult + tkr->xtime_nsec;
>>>         nsec >>= tkr->shift;
>>>
>>> @@ -312,6 +310,24 @@ static inline s64 timekeeping_get_ns(struct
>>> tk_read_base *tkr)
>>>         return nsec + arch_gettimeoffset();
>>>  }
>
>
>> Mind splitting the above out into its own small patch?
>
>
> Makes sense. To be clear your suggestion is to make this patch 1 of X and
> then 2 of X will be the cross timestamp patch.  Is that right?

Yep.


>>>  /**
>>>   * update_fast_timekeeper - Update the fast and NMI safe monotonic
>>> timekeeper.
>>>   * @tkr: Timekeeping readout base from which we take the update
>>> @@ -885,6 +901,59 @@ EXPORT_SYMBOL(ktime_get_raw_and_real_ts64);
>>>  #endif /* CONFIG_NTP_PPS */
>>>
>>>  /**
>>> + * get_device_system_crosststamp - Synchronously capture system/device
>>> timestamp
>>> + * @xtstamp:           Receives simultaneously captured system and
>>> device time
>>> + * @sync_devicetime:   Callback to get simultaneous device time and
>>> + *     system counter from the device driver
>>> + *
>>> + * Reads a timestamp from a device and correlates it to system time
>>> + */
>>> +int get_device_system_crosststamp(struct system_device_crosststamp
>>> *xtstamp,
>>> +                                 struct get_sync_device_time
>>> *sync_devicetime)
>>
>>
>> Another nit, but to me something like:
>> int get_device_system_crosststamp(struct get_sync_device_time
>> *sync_devicetime,
>>                                                           struct
>> system_device_crosststamp *ret)
>>
>> Reads better to me. ie: use this device_time -> return that crosststamp.
>
>
> OK.  My brain "works" the opposite way, but I think (after a small amount of
> research) I'm in the minority:)

Yea. memcpy and others do work the other way, but my left-to-right
bias is pretty strong. :)

thanks
-john

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [Intel-wired-lan] [RFC v5 1/6] Timekeeping cross timestamp interface for device drivers
@ 2016-01-08  1:05         ` John Stultz
  0 siblings, 0 replies; 38+ messages in thread
From: John Stultz @ 2016-01-08  1:05 UTC (permalink / raw)
  To: intel-wired-lan

On Thu, Jan 7, 2016 at 4:42 PM, Christopher Hall
<christopher.s.hall@intel.com> wrote:
> Hi John,
>
> On Wed, 06 Jan 2016 10:55:22 -0800, John Stultz <john.stultz@linaro.org>
> wrote:
>>
>> On Mon, Jan 4, 2016 at 4:45 AM, Christopher S. Hall
>> <christopher.s.hall@intel.com> wrote:
>>>
>>> +/*
>>> + * struct correlated_cs - Descriptor for a clocksource correlated to
>>> another
>>> + *     clocksource
>>> + * @related_cs:                Pointer to the related timekeeping
>>> clocksource
>>> + * @convert:           Conversion function to convert a timestamp from
>>> + *                     the correlated clocksource to cycles of the
>>> related
>>> + *                     timekeeping clocksource
>>> + */
>>> +struct correlated_cs {
>>> +       struct clocksource      *related_cs;
>>> +       cycle_t                 (*convert)(struct correlated_cs *cs,
>>> +                                          cycle_t cycles);
>>> +};
>
>
> How about making this another patch? It's not directly related to the cross
> timestamp. I would lean toward making it its own patch leaving the ART
> correlated clocksource patch touching arch/ only.

Sure. Tying it more closely (via explicit commit message) to the
change that uses it would be better then kind of slipping it in here.


>> I feel like this is introducing a lot of very small structures, which
>> to the casual reviewer aren't immediately obvious how they interlink
>> and are used.  It also adds to the number of types we have to keep in
>> our head. I dunno, maybe its just the correlated_cs is added but isn't
>> used in this patch, but I keep feeling like these should be more
>> obvious somehow.
>>
>> That's terribly vague feedback, so I'm sorry.  Maybe some concrete
>> suggestions?
>>
>> * Maybe try renaming get_sync_device_time as just
>> crosststamp_device/struct ?
>
>
> My goal here is to differentiate between the cross timestamp (final result)
> and the synchronized capture (intermediate value). Maybe it isn't
> particularly useful (or worse additionally confusing), but when reading the
> code I found it confusing to call everything a cross timestamp. I think it's
> the units - cycles and nanoseconds - that are the source of my confusion in
> this case. The units are obvious from the struct member types, but it seemed
> more straightforward to differentiate in the struct name.  Does this make
> sense?  If I missed the mark maybe it should be changed.

Sure. I think having a different type then just cycles here is good,
and counterval is fine for an alternate name, but get_sync_device_time
looks like a function, not a structure name to me. The fact that the
structure is mostly an accessor to the hardware is fine, but maybe
counterval_device or something would be better? counterval_source? I
dunno.


>> * raw_system_counterval feels like its of limited value. Maybe just
>> pass the clocksource and cycle value to the functions separately?
>
>
> I agree it is of somewhat limited value, but I think it makes the cross
> timestamp code more intuitive (the header file, as you pointed out, not as
> much). Maybe better commenting of the struct declaration would be useful
> here? To your comment above, I can keep more structs in my head than
> arguments. Also, to my thinking there isn't any intuitive ordering if these
> struct members are changed to individual arguments which might make the
> driver callback confusing.

Yea. I just feel like lots of structures add extra abstraction that
makes the code harder to learn or re-learn. Not only is one trying to
remember the base types that are being passed around, but you also
have to remember what all the meta-structures are for. Especially for
these two-value structures.

And I apologize. I know this sort of nit-picking is sort of the worst.
"Its great you named your child that, but I'd prefer to call them..."
;)  But for long-term maintenance, under the fog of time, obviousness
is really important.


>>> -static inline s64 timekeeping_get_ns(struct tk_read_base *tkr)
>>> +static inline s64 timekeeping_delta_to_ns(struct tk_read_base *tkr,
>>> +                                         cycle_t delta)
>>>  {
>>> -       cycle_t delta;
>>>         s64 nsec;
>>>
>>> -       delta = timekeeping_get_delta(tkr);
>>> -
>>>         nsec = delta * tkr->mult + tkr->xtime_nsec;
>>>         nsec >>= tkr->shift;
>>>
>>> @@ -312,6 +310,24 @@ static inline s64 timekeeping_get_ns(struct
>>> tk_read_base *tkr)
>>>         return nsec + arch_gettimeoffset();
>>>  }
>
>
>> Mind splitting the above out into its own small patch?
>
>
> Makes sense. To be clear your suggestion is to make this patch 1 of X and
> then 2 of X will be the cross timestamp patch.  Is that right?

Yep.


>>>  /**
>>>   * update_fast_timekeeper - Update the fast and NMI safe monotonic
>>> timekeeper.
>>>   * @tkr: Timekeeping readout base from which we take the update
>>> @@ -885,6 +901,59 @@ EXPORT_SYMBOL(ktime_get_raw_and_real_ts64);
>>>  #endif /* CONFIG_NTP_PPS */
>>>
>>>  /**
>>> + * get_device_system_crosststamp - Synchronously capture system/device
>>> timestamp
>>> + * @xtstamp:           Receives simultaneously captured system and
>>> device time
>>> + * @sync_devicetime:   Callback to get simultaneous device time and
>>> + *     system counter from the device driver
>>> + *
>>> + * Reads a timestamp from a device and correlates it to system time
>>> + */
>>> +int get_device_system_crosststamp(struct system_device_crosststamp
>>> *xtstamp,
>>> +                                 struct get_sync_device_time
>>> *sync_devicetime)
>>
>>
>> Another nit, but to me something like:
>> int get_device_system_crosststamp(struct get_sync_device_time
>> *sync_devicetime,
>>                                                           struct
>> system_device_crosststamp *ret)
>>
>> Reads better to me. ie: use this device_time -> return that crosststamp.
>
>
> OK.  My brain "works" the opposite way, but I think (after a small amount of
> research) I'm in the minority:)

Yea. memcpy and others do work the other way, but my left-to-right
bias is pretty strong. :)

thanks
-john

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v5 3/6] Add history to cross timestamp interface supporting slower devices
  2016-01-06 19:37     ` [Intel-wired-lan] " John Stultz
@ 2016-01-08  1:07       ` Christopher Hall
  -1 siblings, 0 replies; 38+ messages in thread
From: Christopher Hall @ 2016-01-08  1:07 UTC (permalink / raw)
  To: John Stultz
  Cc: Thomas Gleixner, Richard Cochran, Ingo Molnar, H. Peter Anvin,
	Jeff Kirsher, x86, lkml, intel-wired-lan, netdev, Stanton,
	Kevin B

On Wed, 06 Jan 2016 11:37:23 -0800, John Stultz <john.stultz@linaro.org>  
wrote:
> On Mon, Jan 4, 2016 at 4:45 AM, Christopher S. Hall
> <christopher.s.hall@intel.com> wrote:
>> @@ -13,6 +13,9 @@
>>  /**
>>   * struct tk_read_base - base structure for timekeeping readout
>>   * @clock:     Current clocksource used for timekeeping.
>> + * @cs_seq:    Clocksource sequence is incremented per clocksource  
>> change.
>> + *     It's used to determine whether past system time can be related  
>> to
>> + *     current system time
>>   * @read:      Read function of @clock
>>   * @mask:      Bitmask for two's complement subtraction of non 64bit  
>> clocks
>>   * @cycle_last: @clock cycle value at last update
>> @@ -29,6 +32,7 @@
>>   */
>>  struct tk_read_base {
>>         struct clocksource      *clock;
>> +       u8                      cs_seq;
>>         cycle_t                 (*read)(struct clocksource *cs);
>>         cycle_t                 mask;
>>         cycle_t                 cycle_last;
>
>
> So Thomas optimized the tk_read_bases to fit in a cacheline, and so I
> worry about the u8 being added there. I'm also cautious about
> exporting these seq values out further via the system_time_snapshot.
> But I may just need some more time to warm up to the idea.

I think I missed the reason for the existence tk_read_base. Now I know.  
The clocksource sequence doesn't belong there. It should be moved  
timekeeper struct.  Below there is more discussion of why the sequence  
number needs to exist at all. I think it does.

Would it make more sense to take the sequence out of the snapshot struct  
and make it an additional argument? it makes the snapshot struct more  
intuitive but the arguments to ktime_get_snapshot() maybe less so.

>
>
>> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
>> index 9c1ddc3..5a7f784 100644
>> --- a/kernel/time/timekeeping.c
>> +++ b/kernel/time/timekeeping.c
>> @@ -235,11 +235,13 @@ static void tk_setup_internals(struct timekeeper  
>> *tk, struct clocksource *clock)
>>
>>         old_clock = tk->tkr_mono.clock;
>>         tk->tkr_mono.clock = clock;
>> +       ++tk->tkr_mono.cs_seq;
>>         tk->tkr_mono.read = clock->read;
>>         tk->tkr_mono.mask = clock->mask;
>>         tk->tkr_mono.cycle_last = tk->tkr_mono.read(clock);
>>
>>         tk->tkr_raw.clock = clock;
>> +       ++tk->tkr_raw.cs_seq;
>>         tk->tkr_raw.read = clock->read;
>>         tk->tkr_raw.mask = clock->mask;
>>         tk->tkr_raw.cycle_last = tk->tkr_mono.cycle_last;
>> @@ -862,6 +864,39 @@ time64_t ktime_get_real_seconds(void)
>>  }
>>  EXPORT_SYMBOL_GPL(ktime_get_real_seconds);
>>
>> +/**
>> + * ktime_get_snapshot - snapshots the realtime/monotonic raw clocks  
>> with counter
>> + * @snapshot:  pointer to struct receiving the system time snapshot
>> + */
>> +void ktime_get_snapshot(struct system_time_snapshot *systime_snapshot)
>> +{
>> +       struct timekeeper *tk = &tk_core.timekeeper;
>> +       unsigned long seq;
>> +       ktime_t base_raw;
>> +       ktime_t base_real;
>> +       s64 nsec_raw;
>> +       s64 nsec_real;
>> +       cycle_t now;
>> +
>> +       do {
>> +               seq = read_seqcount_begin(&tk_core.seq);
>> +
>> +               now = tk->tkr_mono.read(tk->tkr_mono.clock);
>> +               systime_snapshot->cs_seq = tk->tkr_mono.cs_seq;
>> +               systime_snapshot->clock_set_seq = tk->clock_was_set_seq;
>> +               base_real = ktime_add(tk->tkr_mono.base,
>> +                                     tk_core.timekeeper.offs_real);
>> +               base_raw = tk->tkr_raw.base;
>> +               nsec_real = timekeeping_cycles_to_ns(&tk->tkr_mono,  
>> now);
>> +               nsec_raw  = timekeeping_cycles_to_ns(&tk->tkr_raw, now);
>> +       } while (read_seqcount_retry(&tk_core.seq, seq));
>> +
>> +       systime_snapshot->cycles = now;
>> +       systime_snapshot->real = ktime_add_ns(base_real, nsec_real);
>> +       systime_snapshot->raw = ktime_add_ns(base_raw, nsec_raw);
>> +}
>> +EXPORT_SYMBOL_GPL(ktime_get_snapshot);
>
> So can you split out this adding of ktime_get_snapshot()  (maybe
> skipping the seqcount bits initially) into a separate patch?

Yes.  This should be fine.

>
>> @@ -936,19 +1044,63 @@ int get_device_system_crosststamp(struct  
>> system_device_crosststamp *xtstamp,
>>                  */
>>                 if (tk->tkr_mono.clock != raw_sys.cs)
>>                         return -ENODEV;
>> +               cycles = raw_sys.cycles;
>> +
>> +               /*
>> +                * Check whether the system counter value provided by  
>> the
>> +                * device driver is on the current interval.
>> +                */
>> +               now = tk->tkr_mono.read(tk->tkr_mono.clock);
>> +               interval_start = tk->tkr_mono.cycle_last;
>> +               if (!cycle_between(interval_start, cycles, now)) {
>> +                       cs_seq = tk->tkr_mono.cs_seq;
>> +                       clock_was_set_seq = tk->clock_was_set_seq;
>> +                       cycles = interval_start;
>> +                       do_interp = true;
>> +               } else {
>> +                       do_interp = false;
>> +               }
>>
>>                 base_real = ktime_add(tk->tkr_mono.base,
>>                                       tk_core.timekeeper.offs_real);
>>                 base_raw = tk->tkr_raw.base;
>>
>> -               nsec_real = timekeeping_cycles_to_ns(&tk->tkr_mono,
>> -                                                    raw_sys.cycles);
>> -               nsec_raw = timekeeping_cycles_to_ns(&tk->tkr_raw,
>> -                                                   raw_sys.cycles);
>> +               nsec_real = timekeeping_cycles_to_ns(&tk->tkr_mono,  
>> cycles);
>> +               nsec_raw = timekeeping_cycles_to_ns(&tk->tkr_raw,  
>> cycles);
>>         } while (read_seqcount_retry(&tk_core.seq, seq));
>>
>>         xtstamp->sys_realtime = ktime_add_ns(base_real, nsec_real);
>>         xtstamp->sys_monoraw = ktime_add_ns(base_raw, nsec_raw);
>> +
>> +       /*
>> +        * Interpolate if necessary, working back from the start of the  
>> current
>> +        * interval
>> +        */
>> +       if (do_interp) {
>> +               cycle_t total_history_cycles;
>> +               ktime_t history_monoraw;
>> +               ktime_t history_realtime;
>> +               bool discontinuity;
>> +               cycle_t partial_history_cycles = cycles -  
>> raw_sys.cycles;
>> +
>> +               if (!history_ref || history_ref->cs_seq != cs_seq ||
>
> I've not done a super close reading here. But is it very likely the
> the history_ref->cs_seq is the same as the captured seq? I thought
> this history_ref was to allow old cross stamps to be used to improve
> the back-calculation of the time at the given cycle value. So throwing
> them out if they are older then the last tick seems strange.

Maybe this needs more explanation. The clocksource sequence (cs_seq) is  
incremented for each change in clocksource. I use this to detect a rare  
corner case where the clocksource is changed from (on x86 anyway) TSC and  
then back. If the history crosses one of these changes then interpolation  
shouldn't be attempted (return error). It's not really enough when using  
the history to just check that the current clocksource is equal to the one  
used at the start of the history. The clocksource must not have changed at  
all. To answer your question, it's not at all likely that this would occur.

>
> thanks
> -john

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [Intel-wired-lan] [RFC v5 3/6] Add history to cross timestamp interface supporting slower devices
@ 2016-01-08  1:07       ` Christopher Hall
  0 siblings, 0 replies; 38+ messages in thread
From: Christopher Hall @ 2016-01-08  1:07 UTC (permalink / raw)
  To: intel-wired-lan

On Wed, 06 Jan 2016 11:37:23 -0800, John Stultz <john.stultz@linaro.org>  
wrote:
> On Mon, Jan 4, 2016 at 4:45 AM, Christopher S. Hall
> <christopher.s.hall@intel.com> wrote:
>> @@ -13,6 +13,9 @@
>>  /**
>>   * struct tk_read_base - base structure for timekeeping readout
>>   * @clock:     Current clocksource used for timekeeping.
>> + * @cs_seq:    Clocksource sequence is incremented per clocksource  
>> change.
>> + *     It's used to determine whether past system time can be related  
>> to
>> + *     current system time
>>   * @read:      Read function of @clock
>>   * @mask:      Bitmask for two's complement subtraction of non 64bit  
>> clocks
>>   * @cycle_last: @clock cycle value at last update
>> @@ -29,6 +32,7 @@
>>   */
>>  struct tk_read_base {
>>         struct clocksource      *clock;
>> +       u8                      cs_seq;
>>         cycle_t                 (*read)(struct clocksource *cs);
>>         cycle_t                 mask;
>>         cycle_t                 cycle_last;
>
>
> So Thomas optimized the tk_read_bases to fit in a cacheline, and so I
> worry about the u8 being added there. I'm also cautious about
> exporting these seq values out further via the system_time_snapshot.
> But I may just need some more time to warm up to the idea.

I think I missed the reason for the existence tk_read_base. Now I know.  
The clocksource sequence doesn't belong there. It should be moved  
timekeeper struct.  Below there is more discussion of why the sequence  
number needs to exist at all. I think it does.

Would it make more sense to take the sequence out of the snapshot struct  
and make it an additional argument? it makes the snapshot struct more  
intuitive but the arguments to ktime_get_snapshot() maybe less so.

>
>
>> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
>> index 9c1ddc3..5a7f784 100644
>> --- a/kernel/time/timekeeping.c
>> +++ b/kernel/time/timekeeping.c
>> @@ -235,11 +235,13 @@ static void tk_setup_internals(struct timekeeper  
>> *tk, struct clocksource *clock)
>>
>>         old_clock = tk->tkr_mono.clock;
>>         tk->tkr_mono.clock = clock;
>> +       ++tk->tkr_mono.cs_seq;
>>         tk->tkr_mono.read = clock->read;
>>         tk->tkr_mono.mask = clock->mask;
>>         tk->tkr_mono.cycle_last = tk->tkr_mono.read(clock);
>>
>>         tk->tkr_raw.clock = clock;
>> +       ++tk->tkr_raw.cs_seq;
>>         tk->tkr_raw.read = clock->read;
>>         tk->tkr_raw.mask = clock->mask;
>>         tk->tkr_raw.cycle_last = tk->tkr_mono.cycle_last;
>> @@ -862,6 +864,39 @@ time64_t ktime_get_real_seconds(void)
>>  }
>>  EXPORT_SYMBOL_GPL(ktime_get_real_seconds);
>>
>> +/**
>> + * ktime_get_snapshot - snapshots the realtime/monotonic raw clocks  
>> with counter
>> + * @snapshot:  pointer to struct receiving the system time snapshot
>> + */
>> +void ktime_get_snapshot(struct system_time_snapshot *systime_snapshot)
>> +{
>> +       struct timekeeper *tk = &tk_core.timekeeper;
>> +       unsigned long seq;
>> +       ktime_t base_raw;
>> +       ktime_t base_real;
>> +       s64 nsec_raw;
>> +       s64 nsec_real;
>> +       cycle_t now;
>> +
>> +       do {
>> +               seq = read_seqcount_begin(&tk_core.seq);
>> +
>> +               now = tk->tkr_mono.read(tk->tkr_mono.clock);
>> +               systime_snapshot->cs_seq = tk->tkr_mono.cs_seq;
>> +               systime_snapshot->clock_set_seq = tk->clock_was_set_seq;
>> +               base_real = ktime_add(tk->tkr_mono.base,
>> +                                     tk_core.timekeeper.offs_real);
>> +               base_raw = tk->tkr_raw.base;
>> +               nsec_real = timekeeping_cycles_to_ns(&tk->tkr_mono,  
>> now);
>> +               nsec_raw  = timekeeping_cycles_to_ns(&tk->tkr_raw, now);
>> +       } while (read_seqcount_retry(&tk_core.seq, seq));
>> +
>> +       systime_snapshot->cycles = now;
>> +       systime_snapshot->real = ktime_add_ns(base_real, nsec_real);
>> +       systime_snapshot->raw = ktime_add_ns(base_raw, nsec_raw);
>> +}
>> +EXPORT_SYMBOL_GPL(ktime_get_snapshot);
>
> So can you split out this adding of ktime_get_snapshot()  (maybe
> skipping the seqcount bits initially) into a separate patch?

Yes.  This should be fine.

>
>> @@ -936,19 +1044,63 @@ int get_device_system_crosststamp(struct  
>> system_device_crosststamp *xtstamp,
>>                  */
>>                 if (tk->tkr_mono.clock != raw_sys.cs)
>>                         return -ENODEV;
>> +               cycles = raw_sys.cycles;
>> +
>> +               /*
>> +                * Check whether the system counter value provided by  
>> the
>> +                * device driver is on the current interval.
>> +                */
>> +               now = tk->tkr_mono.read(tk->tkr_mono.clock);
>> +               interval_start = tk->tkr_mono.cycle_last;
>> +               if (!cycle_between(interval_start, cycles, now)) {
>> +                       cs_seq = tk->tkr_mono.cs_seq;
>> +                       clock_was_set_seq = tk->clock_was_set_seq;
>> +                       cycles = interval_start;
>> +                       do_interp = true;
>> +               } else {
>> +                       do_interp = false;
>> +               }
>>
>>                 base_real = ktime_add(tk->tkr_mono.base,
>>                                       tk_core.timekeeper.offs_real);
>>                 base_raw = tk->tkr_raw.base;
>>
>> -               nsec_real = timekeeping_cycles_to_ns(&tk->tkr_mono,
>> -                                                    raw_sys.cycles);
>> -               nsec_raw = timekeeping_cycles_to_ns(&tk->tkr_raw,
>> -                                                   raw_sys.cycles);
>> +               nsec_real = timekeeping_cycles_to_ns(&tk->tkr_mono,  
>> cycles);
>> +               nsec_raw = timekeeping_cycles_to_ns(&tk->tkr_raw,  
>> cycles);
>>         } while (read_seqcount_retry(&tk_core.seq, seq));
>>
>>         xtstamp->sys_realtime = ktime_add_ns(base_real, nsec_real);
>>         xtstamp->sys_monoraw = ktime_add_ns(base_raw, nsec_raw);
>> +
>> +       /*
>> +        * Interpolate if necessary, working back from the start of the  
>> current
>> +        * interval
>> +        */
>> +       if (do_interp) {
>> +               cycle_t total_history_cycles;
>> +               ktime_t history_monoraw;
>> +               ktime_t history_realtime;
>> +               bool discontinuity;
>> +               cycle_t partial_history_cycles = cycles -  
>> raw_sys.cycles;
>> +
>> +               if (!history_ref || history_ref->cs_seq != cs_seq ||
>
> I've not done a super close reading here. But is it very likely the
> the history_ref->cs_seq is the same as the captured seq? I thought
> this history_ref was to allow old cross stamps to be used to improve
> the back-calculation of the time at the given cycle value. So throwing
> them out if they are older then the last tick seems strange.

Maybe this needs more explanation. The clocksource sequence (cs_seq) is  
incremented for each change in clocksource. I use this to detect a rare  
corner case where the clocksource is changed from (on x86 anyway) TSC and  
then back. If the history crosses one of these changes then interpolation  
shouldn't be attempted (return error). It's not really enough when using  
the history to just check that the current clocksource is equal to the one  
used at the start of the history. The clocksource must not have changed at  
all. To answer your question, it's not at all likely that this would occur.

>
> thanks
> -john

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v5 3/6] Add history to cross timestamp interface supporting slower devices
  2016-01-08  1:07       ` [Intel-wired-lan] " Christopher Hall
@ 2016-01-08  1:12         ` John Stultz
  -1 siblings, 0 replies; 38+ messages in thread
From: John Stultz @ 2016-01-08  1:12 UTC (permalink / raw)
  To: Christopher Hall
  Cc: Thomas Gleixner, Richard Cochran, Ingo Molnar, H. Peter Anvin,
	Jeff Kirsher, x86, lkml, intel-wired-lan, netdev, Stanton,
	Kevin B

On Thu, Jan 7, 2016 at 5:07 PM, Christopher Hall
<christopher.s.hall@intel.com> wrote:
> On Wed, 06 Jan 2016 11:37:23 -0800, John Stultz <john.stultz@linaro.org>
> wrote:
>> On Mon, Jan 4, 2016 at 4:45 AM, Christopher S. Hall
>>>
>>> +               if (!history_ref || history_ref->cs_seq != cs_seq ||
>>
>>
>> I've not done a super close reading here. But is it very likely the
>> the history_ref->cs_seq is the same as the captured seq? I thought
>> this history_ref was to allow old cross stamps to be used to improve
>> the back-calculation of the time at the given cycle value. So throwing
>> them out if they are older then the last tick seems strange.
>
>
> Maybe this needs more explanation. The clocksource sequence (cs_seq) is
> incremented for each change in clocksource. I use this to detect a rare
> corner case where the clocksource is changed from (on x86 anyway) TSC and
> then back. If the history crosses one of these changes then interpolation
> shouldn't be attempted (return error). It's not really enough when using the
> history to just check that the current clocksource is equal to the one used
> at the start of the history. The clocksource must not have changed at all.
> To answer your question, it's not at all likely that this would occur.

Yes. Apologies. I had mis-skimmed the cs_seq increment as happening in
the update_wall_time not setup_internals.

Instead of cs_seq, which is easily confused as being related to the
seqcount lock, could you instead call it something more explicit like
clocksource_changed_count?

And yea, having it as part of the timekeeper structure would be the
better place for it.

thanks
-john

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [Intel-wired-lan] [RFC v5 3/6] Add history to cross timestamp interface supporting slower devices
@ 2016-01-08  1:12         ` John Stultz
  0 siblings, 0 replies; 38+ messages in thread
From: John Stultz @ 2016-01-08  1:12 UTC (permalink / raw)
  To: intel-wired-lan

On Thu, Jan 7, 2016 at 5:07 PM, Christopher Hall
<christopher.s.hall@intel.com> wrote:
> On Wed, 06 Jan 2016 11:37:23 -0800, John Stultz <john.stultz@linaro.org>
> wrote:
>> On Mon, Jan 4, 2016 at 4:45 AM, Christopher S. Hall
>>>
>>> +               if (!history_ref || history_ref->cs_seq != cs_seq ||
>>
>>
>> I've not done a super close reading here. But is it very likely the
>> the history_ref->cs_seq is the same as the captured seq? I thought
>> this history_ref was to allow old cross stamps to be used to improve
>> the back-calculation of the time at the given cycle value. So throwing
>> them out if they are older then the last tick seems strange.
>
>
> Maybe this needs more explanation. The clocksource sequence (cs_seq) is
> incremented for each change in clocksource. I use this to detect a rare
> corner case where the clocksource is changed from (on x86 anyway) TSC and
> then back. If the history crosses one of these changes then interpolation
> shouldn't be attempted (return error). It's not really enough when using the
> history to just check that the current clocksource is equal to the one used
> at the start of the history. The clocksource must not have changed at all.
> To answer your question, it's not at all likely that this would occur.

Yes. Apologies. I had mis-skimmed the cs_seq increment as happening in
the update_wall_time not setup_internals.

Instead of cs_seq, which is easily confused as being related to the
seqcount lock, could you instead call it something more explicit like
clocksource_changed_count?

And yea, having it as part of the timekeeper structure would be the
better place for it.

thanks
-john

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v5 1/6] Timekeeping cross timestamp interface for device drivers
  2016-01-08  1:05         ` [Intel-wired-lan] " John Stultz
@ 2016-01-08  9:13           ` Richard Cochran
  -1 siblings, 0 replies; 38+ messages in thread
From: Richard Cochran @ 2016-01-08  9:13 UTC (permalink / raw)
  To: John Stultz
  Cc: Christopher Hall, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Jeff Kirsher, x86, lkml, intel-wired-lan, netdev, Stanton,
	Kevin B

On Thu, Jan 07, 2016 at 05:05:24PM -0800, John Stultz wrote:
> Yea. I just feel like lots of structures add extra abstraction that
> makes the code harder to learn or re-learn. Not only is one trying to
> remember the base types that are being passed around, but you also
> have to remember what all the meta-structures are for. Especially for
> these two-value structures.

FWIW, I had exactly this trouble when reading this series.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [Intel-wired-lan] [RFC v5 1/6] Timekeeping cross timestamp interface for device drivers
@ 2016-01-08  9:13           ` Richard Cochran
  0 siblings, 0 replies; 38+ messages in thread
From: Richard Cochran @ 2016-01-08  9:13 UTC (permalink / raw)
  To: intel-wired-lan

On Thu, Jan 07, 2016 at 05:05:24PM -0800, John Stultz wrote:
> Yea. I just feel like lots of structures add extra abstraction that
> makes the code harder to learn or re-learn. Not only is one trying to
> remember the base types that are being passed around, but you also
> have to remember what all the meta-structures are for. Especially for
> these two-value structures.

FWIW, I had exactly this trouble when reading this series.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v5 3/6] Add history to cross timestamp interface supporting slower devices
  2016-01-08  1:07       ` [Intel-wired-lan] " Christopher Hall
@ 2016-01-08 14:04         ` Thomas Gleixner
  -1 siblings, 0 replies; 38+ messages in thread
From: Thomas Gleixner @ 2016-01-08 14:04 UTC (permalink / raw)
  To: Christopher Hall
  Cc: John Stultz, Richard Cochran, Ingo Molnar, H. Peter Anvin,
	Jeff Kirsher, x86, lkml, intel-wired-lan, netdev, Stanton,
	Kevin B

On Thu, 7 Jan 2016, Christopher Hall wrote:
> On Wed, 06 Jan 2016 11:37:23 -0800, John Stultz <john.stultz@linaro.org>
> wrote:
> > I've not done a super close reading here. But is it very likely the
> > the history_ref->cs_seq is the same as the captured seq? I thought
> > this history_ref was to allow old cross stamps to be used to improve
> > the back-calculation of the time at the given cycle value. So throwing
> > them out if they are older then the last tick seems strange.
> 
> Maybe this needs more explanation. The clocksource sequence (cs_seq) is
> incremented for each change in clocksource. I use this to detect a rare corner
> case where the clocksource is changed from (on x86 anyway) TSC and then back.
> If the history crosses one of these changes then interpolation shouldn't be
> attempted (return error). It's not really enough when using the history to
> just check that the current clocksource is equal to the one used at the start
> of the history. The clocksource must not have changed at all. To answer your
> question, it's not at all likely that this would occur.

You can flush the history when a clocksource change happens. No need to add
extra data to the core structures.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [Intel-wired-lan] [RFC v5 3/6] Add history to cross timestamp interface supporting slower devices
@ 2016-01-08 14:04         ` Thomas Gleixner
  0 siblings, 0 replies; 38+ messages in thread
From: Thomas Gleixner @ 2016-01-08 14:04 UTC (permalink / raw)
  To: intel-wired-lan

On Thu, 7 Jan 2016, Christopher Hall wrote:
> On Wed, 06 Jan 2016 11:37:23 -0800, John Stultz <john.stultz@linaro.org>
> wrote:
> > I've not done a super close reading here. But is it very likely the
> > the history_ref->cs_seq is the same as the captured seq? I thought
> > this history_ref was to allow old cross stamps to be used to improve
> > the back-calculation of the time at the given cycle value. So throwing
> > them out if they are older then the last tick seems strange.
> 
> Maybe this needs more explanation. The clocksource sequence (cs_seq) is
> incremented for each change in clocksource. I use this to detect a rare corner
> case where the clocksource is changed from (on x86 anyway) TSC and then back.
> If the history crosses one of these changes then interpolation shouldn't be
> attempted (return error). It's not really enough when using the history to
> just check that the current clocksource is equal to the one used at the start
> of the history. The clocksource must not have changed at all. To answer your
> question, it's not at all likely that this would occur.

You can flush the history when a clocksource change happens. No need to add
extra data to the core structures.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v5 3/6] Add history to cross timestamp interface supporting slower devices
  2016-01-08 14:04         ` [Intel-wired-lan] " Thomas Gleixner
@ 2016-01-08 22:28           ` Christopher Hall
  -1 siblings, 0 replies; 38+ messages in thread
From: Christopher Hall @ 2016-01-08 22:28 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: John Stultz, Richard Cochran, Ingo Molnar, H. Peter Anvin,
	Jeff Kirsher, x86, lkml, intel-wired-lan, netdev, Stanton,
	Kevin B

On Fri, 08 Jan 2016 06:04:28 -0800, Thomas Gleixner <tglx@linutronix.de>  
wrote:
> On Thu, 7 Jan 2016, Christopher Hall wrote:
>> On Wed, 06 Jan 2016 11:37:23 -0800, John Stultz <john.stultz@linaro.org>
>> wrote:
>> > I've not done a super close reading here. But is it very likely the
>> > the history_ref->cs_seq is the same as the captured seq? I thought
>> > this history_ref was to allow old cross stamps to be used to improve
>> > the back-calculation of the time at the given cycle value. So throwing
>> > them out if they are older then the last tick seems strange.
>>
>> Maybe this needs more explanation. The clocksource sequence (cs_seq) is
>> incremented for each change in clocksource. I use this to detect a rare  
>> corner
>> case where the clocksource is changed from (on x86 anyway) TSC and then  
>> back.
>> If the history crosses one of these changes then interpolation  
>> shouldn't be
>> attempted (return error). It's not really enough when using the history  
>> to
>> just check that the current clocksource is equal to the one used at the  
>> start
>> of the history. The clocksource must not have changed at all. To answer  
>> your
>> question, it's not at all likely that this would occur.
>
> You can flush the history when a clocksource change happens. No need to  
> add
> extra data to the core structures.

Hi Thomas,

For slower devices like Intel's audio DSP, it's the responsibility of the  
driver to manage the history which is supplied as an argument to  
get_device_system_crosststamp(). The history is a triple (counter value,  
mono-raw time, real time). The clocksource change sequence number allows  
the timekeeping code to determine whether the supplied history is stale  
(returning an error if it is). I'm calling any history stale that crosses  
a clocksource change boundary.

Logically, the keeping of the history is split between the driver and  
timekeeping code with most of the burden on the driver. This introduces a  
small amount of redundancy in the form of the clocksource change sequence,  
but on balance results in a lot simpler timekeeping code.

Chris

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [Intel-wired-lan] [RFC v5 3/6] Add history to cross timestamp interface supporting slower devices
@ 2016-01-08 22:28           ` Christopher Hall
  0 siblings, 0 replies; 38+ messages in thread
From: Christopher Hall @ 2016-01-08 22:28 UTC (permalink / raw)
  To: intel-wired-lan

On Fri, 08 Jan 2016 06:04:28 -0800, Thomas Gleixner <tglx@linutronix.de>  
wrote:
> On Thu, 7 Jan 2016, Christopher Hall wrote:
>> On Wed, 06 Jan 2016 11:37:23 -0800, John Stultz <john.stultz@linaro.org>
>> wrote:
>> > I've not done a super close reading here. But is it very likely the
>> > the history_ref->cs_seq is the same as the captured seq? I thought
>> > this history_ref was to allow old cross stamps to be used to improve
>> > the back-calculation of the time at the given cycle value. So throwing
>> > them out if they are older then the last tick seems strange.
>>
>> Maybe this needs more explanation. The clocksource sequence (cs_seq) is
>> incremented for each change in clocksource. I use this to detect a rare  
>> corner
>> case where the clocksource is changed from (on x86 anyway) TSC and then  
>> back.
>> If the history crosses one of these changes then interpolation  
>> shouldn't be
>> attempted (return error). It's not really enough when using the history  
>> to
>> just check that the current clocksource is equal to the one used at the  
>> start
>> of the history. The clocksource must not have changed at all. To answer  
>> your
>> question, it's not at all likely that this would occur.
>
> You can flush the history when a clocksource change happens. No need to  
> add
> extra data to the core structures.

Hi Thomas,

For slower devices like Intel's audio DSP, it's the responsibility of the  
driver to manage the history which is supplied as an argument to  
get_device_system_crosststamp(). The history is a triple (counter value,  
mono-raw time, real time). The clocksource change sequence number allows  
the timekeeping code to determine whether the supplied history is stale  
(returning an error if it is). I'm calling any history stale that crosses  
a clocksource change boundary.

Logically, the keeping of the history is split between the driver and  
timekeeping code with most of the burden on the driver. This introduces a  
small amount of redundancy in the form of the clocksource change sequence,  
but on balance results in a lot simpler timekeeping code.

Chris

^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2016-01-08 22:28 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-01-04 12:45 [RFC v5 0/6] Patchset enabling hardware based cross-timestamps for next gen Intel platforms Christopher S. Hall
2016-01-04 12:45 ` [Intel-wired-lan] " Christopher S. Hall
2016-01-04 12:45 ` [RFC v5 1/6] Timekeeping cross timestamp interface for device drivers Christopher S. Hall
2016-01-04 12:45   ` [Intel-wired-lan] " Christopher S. Hall
2016-01-06 18:55   ` John Stultz
2016-01-06 18:55     ` [Intel-wired-lan] " John Stultz
2016-01-08  0:42     ` Christopher Hall
2016-01-08  0:42       ` [Intel-wired-lan] " Christopher Hall
2016-01-08  1:05       ` John Stultz
2016-01-08  1:05         ` [Intel-wired-lan] " John Stultz
2016-01-08  9:13         ` Richard Cochran
2016-01-08  9:13           ` [Intel-wired-lan] " Richard Cochran
2016-01-04 12:45 ` [RFC v5 2/6] Always Running Timer (ART) correlated clocksource Christopher S. Hall
2016-01-04 12:45   ` [Intel-wired-lan] " Christopher S. Hall
2016-01-04 12:45 ` [RFC v5 3/6] Add history to cross timestamp interface supporting slower devices Christopher S. Hall
2016-01-04 12:45   ` [Intel-wired-lan] " Christopher S. Hall
2016-01-06 19:37   ` John Stultz
2016-01-06 19:37     ` [Intel-wired-lan] " John Stultz
2016-01-08  1:07     ` Christopher Hall
2016-01-08  1:07       ` [Intel-wired-lan] " Christopher Hall
2016-01-08  1:12       ` John Stultz
2016-01-08  1:12         ` [Intel-wired-lan] " John Stultz
2016-01-08 14:04       ` Thomas Gleixner
2016-01-08 14:04         ` [Intel-wired-lan] " Thomas Gleixner
2016-01-08 22:28         ` Christopher Hall
2016-01-08 22:28           ` [Intel-wired-lan] " Christopher Hall
2016-01-04 12:45 ` [RFC v5 4/6] Remove duplicate code from ktime_get_raw_and_real code Christopher S. Hall
2016-01-04 12:45   ` [Intel-wired-lan] " Christopher S. Hall
2016-01-06 19:42   ` John Stultz
2016-01-06 19:42     ` [Intel-wired-lan] " John Stultz
2016-01-04 12:45 ` [RFC v5 5/6] Add PTP_SYS_OFFSET_PRECISE for driver crosstimestamping Christopher S. Hall
2016-01-04 12:45   ` [Intel-wired-lan] " Christopher S. Hall
2016-01-05 15:27   ` Richard Cochran
2016-01-05 15:27     ` [Intel-wired-lan] " Richard Cochran
2016-01-07  1:42     ` Christopher Hall
2016-01-07  1:42       ` [Intel-wired-lan] " Christopher Hall
2016-01-04 12:45 ` [RFC v5 6/6] Adds hardware supported cross timestamp Christopher S. Hall
2016-01-04 12:45   ` [Intel-wired-lan] " Christopher S. Hall

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.