Netdev Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH spi for-5.4 0/5] Deterministic SPI latency with NXP DSPI driver
@ 2019-08-18 18:25 Vladimir Oltean
  2019-08-18 18:25 ` [PATCH spi for-5.4 1/5] spi: Use an abbreviated pointer to ctlr->cur_msg in __spi_pump_messages Vladimir Oltean
                   ` (5 more replies)
  0 siblings, 6 replies; 27+ messages in thread
From: Vladimir Oltean @ 2019-08-18 18:25 UTC (permalink / raw)
  To: broonie, h.feurstein, mlichvar, richardcochran, andrew, f.fainelli
  Cc: linux-spi, netdev, Vladimir Oltean

This patchset proposes an interface from the SPI subsystem for
software timestamping SPI transfers. There is a default implementation
provided in the core, as well as a mechanism for SPI slave drivers to
check which byte was in fact timestamped post-facto. The patchset also
adds the first user of this interface (the NXP DSPI driver in TCFQ mode).

The interface is somewhat similar to Hubert Feurstein's proposal for the
MDIO subsystem: https://lkml.org/lkml/2019/8/16/638

Original cover letter below. Also provided at the end some results with
an extra test (J - phc2sys using the timestamps taken by the SPI core).

===========================================================

Continuing the discussion created by Hubert Feurstein around the
mv88e6xxx driver for MDIO-controlled switches
(https://lkml.org/lkml/2019/8/2/1364), this patchset takes a similar
approach for the NXP LS1021A-TSN board, which has a SPI-controlled DSA
switch (SJA1105).

The patchset is motivated by some experiments done with a logic
analyzer, trying to understand the source of latency (and especially of
the jitter). SJA1105 SPI messages for reading the PTP clock are 12 bytes
in length: 4 for the SPI header and 8 for the timestamp. When looking at
the messages with a scope, there's jitter basically everywhere: between
bits of a frame and between frames in a transfer. The inter-bit jitter
is hardware and impacts us to a lesser extend (is smaller and caused by
the PVT stability of the oscillators, PLLs, etc). We will focus on the
latency between consecutive SPI frames within a 12-byte transfer.

As a preface, revisions of the DSPI controller IP are integrated in many
Freescale/NXP devices. As a result, the driver has 3 modes of operation:
- TCFQ (Transfer Complete Flag mode): The controller signals software
  that data has been sent/received after each individual word.
- EOQ (End of Queue mode): The driver can implement batching by making
  use of the controller's 4-word deep FIFO.
- DMA (Direct Memory Access mode): The SPI controller's FIFO is no
  longer in direct interaction with the driver, but is used to trigger
  the RX and TX channels of the eDMA module on the SoC.

In LS1021A, the driver works in the least efficient mode of the 3
(TCFQ). There is a well-known errata that the DSPI controller is broken
in conjunction with the eDMA module. As for the EOQ mode, I have tried
unsuccessfully for a few days to make use of the 4 entry FIFO, and the
hardware simply fails to reliably acknowledge the transmission when the
FIFO gets full. So it looks like we're stuck with the TCFQ mode.

The problem with phc2sys on the LS1021A-TSN board is that in order for
the gettime64() call to complete on the sja1105, the system has to
service 12 IRQs. Intuitively that is excessive and is the main source of
jitter, but let's not get ahead of ourselves.

An outline of the experiments that were done (unless otherwise
mentioned, all of these ran for 120 seconds):

A. First I have measured the (poor) performance of phc2sys under current
   conditions. (DSPI driver in IRQ mode, no PTP system timestamping)

   offset: min -53310 max 16107 mean -1737.18 std dev 11444.3
   delay: min 163680 max 237360 mean 201149 std dev 22446.6
   lost servo lock 1 times

B. I switched the .gettime64 callback to .gettimex64, snapshotting the
   PTP system timestamp within the sja1105 driver.

   offset: min -48923 max 64217 mean -904.137 std dev 17358.1
   delay: min 149600 max 203840 mean 169045 std dev 17993.3
   lost servo lock 8 times

C. I patched "struct spi_transfer" to contain the PTP system timestamp,
   and from the sja1105 driver, I passed this structure to be
   snapshotted by the SPI controller's driver (spi-fsl-dspi). This is
   the "transfer-level" snapshot.

   offset: min -64979 max 38979 mean -416.197 std dev 15367.9
   delay: min 125120 max 168320 mean 150286 std dev 17675.3
   lost servo lock 10 times

D. I changed the placement of the transfer snapshotting within the DSPI
   driver, from "transfer-level" to "byte-level".

   offset: min -9021 max 7149 mean -0.418803 std dev 3529.81
   delay: min 7840 max 23920 mean 14493.7 std dev 5982.17
   lost servo lock 0 times

E. I moved the DSPI driver to poll mode. I went back to collecting the
   PTP system timestamps from the sja1105 driver (same as B).

   offset: min -4199 max 46643 mean 418.214 std dev 4554.01
   delay: min 84000 max 194000 mean 99463.2 std dev 12936.5
   lost servo lock 1 times

F. Transfer-level snapshotting in the DSPI driver (same as C), but in
   poll mode.

   offset: min -24244 max 1115 mean -230.478 std dev 2297.28
   delay: min 69440 max 119040 mean 70312.9 std dev 8065.34
   lost servo lock 1 times

G. Byte-level snapshotting (same as D) but in poll mode.

   offset: min -314 max 288 mean -2.48718 std dev 118.045
   delay: min 4880 max 6000 mean 5118.63 std dev 507.258
   lost servo lock 0 times

   This seemed suspiciously good to me, so I let it run for longer
   (58 minutes):

   offset: min -26251 max 16416 mean -21.8672 std dev 863.416
   delay: min 4720 max 57280 mean 5182.49 std dev 1607.19
   lost servo lock 3 times

H. Transfer-level snapshotting (same as F), but with IRQs disabled.
   This ran for 86 minutes.

   offset: min -1927 max 1843 mean -0.209203 std dev 529.398
   delay: min 85440 max 93680 mean 88245 std dev 1454.71
   lost servo lock 0 times

I. Byte-level snapshotting (same as G), but with IRQs disabled.
   This ran for 102 minutes.

   offset: min -378 max 381 mean -0.0083089 std dev 101.495
   delay: min 4720 max 5920 mean 5129.38 std dev 154.899
   lost servo lock 0 times

J. Default snapshotting taken by the SPI core, with the DSPI driver
   running in poll mode, IRQs enabled. This ran for 274 minutes.

   offset: min -42568 max 44576 mean 2.91646 std dev 947.467
   delay: min 58480 max 171040 mean 80750.7 std dev 2001.61
   lost servo lock 3 times

As a result, this patchset proposes the implementation of scenario I.
The others were done through temporary patches which are not presented
here due to the difficulty of presenting a coherent git history without
resorting to reverts etc. The gist of each experiment should be clear
though.

The raw data is available for dissection at
https://drive.google.com/open?id=1r9raU9ZeqOqkqts6Lb-ISf5ubLDLP3wk.
The logic analyzer captures can be opened with a free-as-in-beer program
provided by Saleae: https://www.saleae.com/downloads/.

In the capture data one can find the MOSI, SCK SPI signals, as well as a
debug GPIO which was toggled at the same time as the PTP system
timestamp was taken, to give the viewer an impression of what the
software is capturing compared to the actual timing of the SPI transfer.

Attached are also some close-up screenshots of transfers where there is
a clear and huge delay in-between frames of the same 12-byte SPI
transfer. As it turns out, these were all caused by the CPU getting
interrupted by some other IRQ. Approaches H and I are the only ones that
get rid of these glitches. In theory, the byte-level snapshotting should
be less vulnerable to an IRQ interrupting the SPI transfer (because the
time window is much smaller) but as the 58 minutes experiment shows, it
is not immune.

Vladimir Oltean (5):
  spi: Use an abbreviated pointer to ctlr->cur_msg in
    __spi_pump_messages
  spi: Add a PTP system timestamp to the transfer structure
  spi: spi-fsl-dspi: Use poll mode in case the platform IRQ is missing
  spi: spi-fsl-dspi: Implement the PTP system timestamping for TCFQ mode
  spi: spi-fsl-dspi: Disable interrupts and preemption during poll mode
    transfer

 drivers/spi/spi-fsl-dspi.c | 117 +++++++++++++++++++++++++++++++------
 drivers/spi/spi.c          |  85 +++++++++++++++++++++++----
 include/linux/spi/spi.h    |  38 ++++++++++++
 3 files changed, 210 insertions(+), 30 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH spi for-5.4 1/5] spi: Use an abbreviated pointer to ctlr->cur_msg in __spi_pump_messages
  2019-08-18 18:25 [PATCH spi for-5.4 0/5] Deterministic SPI latency with NXP DSPI driver Vladimir Oltean
@ 2019-08-18 18:25 ` Vladimir Oltean
  2019-08-20 18:21   ` Mark Brown
  2019-08-18 18:25 ` [PATCH spi for-5.4 2/5] spi: Add a PTP system timestamp to the transfer structure Vladimir Oltean
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 27+ messages in thread
From: Vladimir Oltean @ 2019-08-18 18:25 UTC (permalink / raw)
  To: broonie, h.feurstein, mlichvar, richardcochran, andrew, f.fainelli
  Cc: linux-spi, netdev, Vladimir Oltean

This helps a bit with line fitting now (the list_first_entry call) as
well as during the next patch which needs to iterate through all
transfers of ctlr->cur_msg so it timestamps them.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
 drivers/spi/spi.c | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index aef55acb5ccd..d96e04627982 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -1265,8 +1265,9 @@ EXPORT_SYMBOL_GPL(spi_finalize_current_transfer);
  */
 static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread)
 {
-	unsigned long flags;
+	struct spi_message *mesg;
 	bool was_busy = false;
+	unsigned long flags;
 	int ret;
 
 	/* Lock queue */
@@ -1325,10 +1326,10 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread)
 	}
 
 	/* Extract head of queue */
-	ctlr->cur_msg =
-		list_first_entry(&ctlr->queue, struct spi_message, queue);
+	mesg = list_first_entry(&ctlr->queue, struct spi_message, queue);
+	ctlr->cur_msg = mesg;
 
-	list_del_init(&ctlr->cur_msg->queue);
+	list_del_init(&mesg->queue);
 	if (ctlr->busy)
 		was_busy = true;
 	else
@@ -1361,7 +1362,7 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread)
 			if (ctlr->auto_runtime_pm)
 				pm_runtime_put(ctlr->dev.parent);
 
-			ctlr->cur_msg->status = ret;
+			mesg->status = ret;
 			spi_finalize_current_message(ctlr);
 
 			mutex_unlock(&ctlr->io_mutex);
@@ -1369,28 +1370,28 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread)
 		}
 	}
 
-	trace_spi_message_start(ctlr->cur_msg);
+	trace_spi_message_start(mesg);
 
 	if (ctlr->prepare_message) {
-		ret = ctlr->prepare_message(ctlr, ctlr->cur_msg);
+		ret = ctlr->prepare_message(ctlr, mesg);
 		if (ret) {
 			dev_err(&ctlr->dev, "failed to prepare message: %d\n",
 				ret);
-			ctlr->cur_msg->status = ret;
+			mesg->status = ret;
 			spi_finalize_current_message(ctlr);
 			goto out;
 		}
 		ctlr->cur_msg_prepared = true;
 	}
 
-	ret = spi_map_msg(ctlr, ctlr->cur_msg);
+	ret = spi_map_msg(ctlr, mesg);
 	if (ret) {
-		ctlr->cur_msg->status = ret;
+		mesg->status = ret;
 		spi_finalize_current_message(ctlr);
 		goto out;
 	}
 
-	ret = ctlr->transfer_one_message(ctlr, ctlr->cur_msg);
+	ret = ctlr->transfer_one_message(ctlr, mesg);
 	if (ret) {
 		dev_err(&ctlr->dev,
 			"failed to transfer one message from queue\n");
-- 
2.17.1


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH spi for-5.4 2/5] spi: Add a PTP system timestamp to the transfer structure
  2019-08-18 18:25 [PATCH spi for-5.4 0/5] Deterministic SPI latency with NXP DSPI driver Vladimir Oltean
  2019-08-18 18:25 ` [PATCH spi for-5.4 1/5] spi: Use an abbreviated pointer to ctlr->cur_msg in __spi_pump_messages Vladimir Oltean
@ 2019-08-18 18:25 ` Vladimir Oltean
  2019-08-22 17:11   ` Mark Brown
  2019-08-18 18:25 ` [PATCH spi for-5.4 3/5] spi: spi-fsl-dspi: Use poll mode in case the platform IRQ is missing Vladimir Oltean
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 27+ messages in thread
From: Vladimir Oltean @ 2019-08-18 18:25 UTC (permalink / raw)
  To: broonie, h.feurstein, mlichvar, richardcochran, andrew, f.fainelli
  Cc: linux-spi, netdev, Vladimir Oltean

SPI is one of the interfaces used to access devices which have a POSIX
clock driver (real time clocks, 1588 timers etc).

Since there are lots of sources of timing jitter when retrieving the
time from such a device (queuing delays, locking contention, running in
interruptible context etc), introduce a PTP system timestamp structure
in struct spi_transfer. This is to be used by SPI device drivers when
they need to know the exact time at which the underlying device's time
was snapshotted.

Because SPI controllers may have jitter even between frames, also
introduce a field which specifies to the controller driver specifically
which byte needs to be snapshotted.

Add a default implementation of the PTP system timestamping in the SPI
core. There are 3 entry points from the core towards the SPI controller
drivers:
- transfer_one: The driver is passed individual spi_transfers to
  execute. This is the easiest to timestamp.
- transfer_one_message: The core passes the driver an entire spi_message
  (a potential batch of spi_transfers). The core puts the same pre and
  post timestamp to all transfers within a message. This is not ideal,
  but nothing better can be done by default anyway, since the core has
  no insight into how the driver batches the transfers.
- transfer: Like transfer_one_message, but for unqueued drivers (i.e.
  the driver implements its own queue scheduling).

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
 drivers/spi/spi.c       | 62 +++++++++++++++++++++++++++++++++++++++++
 include/linux/spi/spi.h | 38 +++++++++++++++++++++++++
 2 files changed, 100 insertions(+)

diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index d96e04627982..cf5c5435edcd 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -1171,6 +1171,11 @@ static int spi_transfer_one_message(struct spi_controller *ctlr,
 		spi_statistics_add_transfer_stats(statm, xfer, ctlr);
 		spi_statistics_add_transfer_stats(stats, xfer, ctlr);
 
+		if (!ctlr->ptp_sts_supported) {
+			xfer->ptp_sts_word_pre = 0;
+			ptp_read_system_prets(xfer->ptp_sts);
+		}
+
 		if (xfer->tx_buf || xfer->rx_buf) {
 			reinit_completion(&ctlr->xfer_completion);
 
@@ -1197,6 +1202,11 @@ static int spi_transfer_one_message(struct spi_controller *ctlr,
 					xfer->len);
 		}
 
+		if (!ctlr->ptp_sts_supported) {
+			ptp_read_system_postts(xfer->ptp_sts);
+			xfer->ptp_sts_word_post = xfer->len;
+		}
+
 		trace_spi_transfer_stop(msg, xfer);
 
 		if (msg->status != -EINPROGRESS)
@@ -1265,6 +1275,7 @@ EXPORT_SYMBOL_GPL(spi_finalize_current_transfer);
  */
 static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread)
 {
+	struct spi_transfer *xfer;
 	struct spi_message *mesg;
 	bool was_busy = false;
 	unsigned long flags;
@@ -1391,6 +1402,13 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread)
 		goto out;
 	}
 
+	if (!ctlr->ptp_sts_supported) {
+		list_for_each_entry(xfer, &mesg->transfers, transfer_list) {
+			xfer->ptp_sts_word_pre = 0;
+			ptp_read_system_prets(xfer->ptp_sts);
+		}
+	}
+
 	ret = ctlr->transfer_one_message(ctlr, mesg);
 	if (ret) {
 		dev_err(&ctlr->dev,
@@ -1418,6 +1436,34 @@ static void spi_pump_messages(struct kthread_work *work)
 	__spi_pump_messages(ctlr, true);
 }
 
+/**
+ * spi_xfer_ptp_sts_word - helper for drivers to retrieve the pointer to the
+ *			   word in the TX buffer which must be timestamped.
+ *			   The SPI slave does not provide a pointer directly
+ *			   because the TX and RX buffers may be reallocated
+ *			   (see @spi_map_msg).
+ * @xfer: Pointer to the transfer being timestamped
+ * @pre: If true, returns the pointer to @ptp_sts_word_pre, otherwise returns
+ *	the pointer to @ptp_sts_word_post.
+ */
+const void *spi_xfer_ptp_sts_word(struct spi_transfer *xfer, bool pre)
+{
+	unsigned int bytes_per_word;
+
+	if (xfer->bits_per_word <= 8)
+		bytes_per_word = 1;
+	else if (xfer->bits_per_word <= 16)
+		bytes_per_word = 2;
+	else
+		bytes_per_word = 4;
+
+	if (pre)
+		return xfer->tx_buf + xfer->ptp_sts_word_pre * bytes_per_word;
+
+	return xfer->tx_buf + xfer->ptp_sts_word_post * bytes_per_word;
+}
+EXPORT_SYMBOL_GPL(spi_xfer_ptp_sts_word);
+
 /**
  * spi_set_thread_rt - set the controller to pump at realtime priority
  * @ctlr: controller to boost priority of
@@ -1503,6 +1549,7 @@ EXPORT_SYMBOL_GPL(spi_get_next_queued_message);
  */
 void spi_finalize_current_message(struct spi_controller *ctlr)
 {
+	struct spi_transfer *xfer;
 	struct spi_message *mesg;
 	unsigned long flags;
 	int ret;
@@ -1511,6 +1558,13 @@ void spi_finalize_current_message(struct spi_controller *ctlr)
 	mesg = ctlr->cur_msg;
 	spin_unlock_irqrestore(&ctlr->queue_lock, flags);
 
+	if (!ctlr->ptp_sts_supported) {
+		list_for_each_entry(xfer, &mesg->transfers, transfer_list) {
+			ptp_read_system_postts(xfer->ptp_sts);
+			xfer->ptp_sts_word_post = xfer->len;
+		}
+	}
+
 	spi_unmap_msg(ctlr, mesg);
 
 	if (ctlr->cur_msg_prepared && ctlr->unprepare_message) {
@@ -3270,6 +3324,7 @@ static int __spi_validate(struct spi_device *spi, struct spi_message *message)
 static int __spi_async(struct spi_device *spi, struct spi_message *message)
 {
 	struct spi_controller *ctlr = spi->controller;
+	struct spi_transfer *xfer;
 
 	/*
 	 * Some controllers do not support doing regular SPI transfers. Return
@@ -3285,6 +3340,13 @@ static int __spi_async(struct spi_device *spi, struct spi_message *message)
 
 	trace_spi_message_submit(message);
 
+	if (!ctlr->ptp_sts_supported) {
+		list_for_each_entry(xfer, &message->transfers, transfer_list) {
+			xfer->ptp_sts_word_pre = 0;
+			ptp_read_system_prets(xfer->ptp_sts);
+		}
+	}
+
 	return ctlr->transfer(spi, message);
 }
 
diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h
index af4f265d0f67..bb7553c6e5d0 100644
--- a/include/linux/spi/spi.h
+++ b/include/linux/spi/spi.h
@@ -13,6 +13,7 @@
 #include <linux/completion.h>
 #include <linux/scatterlist.h>
 #include <linux/gpio/consumer.h>
+#include <linux/ptp_clock_kernel.h>
 
 struct dma_chan;
 struct property_entry;
@@ -409,6 +410,12 @@ static inline void spi_unregister_driver(struct spi_driver *sdrv)
  * @fw_translate_cs: If the boot firmware uses different numbering scheme
  *	what Linux expects, this optional hook can be used to translate
  *	between the two.
+ * @ptp_sts_supported: If the driver sets this to true, it must provide a
+ *	time snapshot in @spi_transfer->ptp_sts as close as possible to the
+ *	moment in time when @spi_transfer->ptp_sts_word_pre and
+ *	@spi_transfer->ptp_sts_word_post were transmitted.
+ *	If the driver does not set this, the SPI core takes the snapshot as
+ *	close to the driver hand-over as possible.
  *
  * Each SPI controller can communicate with one or more @spi_device
  * children.  These make a small bus, sharing MOSI, MISO and SCK signals
@@ -604,6 +611,12 @@ struct spi_controller {
 	void			*dummy_tx;
 
 	int (*fw_translate_cs)(struct spi_controller *ctlr, unsigned cs);
+
+	/*
+	 * Driver sets this field to indicate it is able to snapshot SPI
+	 * transfers (needed e.g. for reading the time of POSIX clocks)
+	 */
+	bool			ptp_sts_supported;
 };
 
 static inline void *spi_controller_get_devdata(struct spi_controller *ctlr)
@@ -644,6 +657,9 @@ extern struct spi_message *spi_get_next_queued_message(struct spi_controller *ct
 extern void spi_finalize_current_message(struct spi_controller *ctlr);
 extern void spi_finalize_current_transfer(struct spi_controller *ctlr);
 
+/* Helper calls for driver to get which buffer pointer must be timestamped */
+extern const void *spi_xfer_ptp_sts_word(struct spi_transfer *xfer, bool pre);
+
 /* the spi driver core manages memory for the spi_controller classdev */
 extern struct spi_controller *__spi_alloc_controller(struct device *host,
 						unsigned int size, bool slave);
@@ -753,6 +769,24 @@ extern void spi_res_release(struct spi_controller *ctlr,
  * @transfer_list: transfers are sequenced through @spi_message.transfers
  * @tx_sg: Scatterlist for transmit, currently not for client use
  * @rx_sg: Scatterlist for receive, currently not for client use
+ * @ptp_sts_word_pre: The word (subject to bits_per_word semantics) offset
+ *	within @tx_buf for which the SPI device is requesting that the time
+ *	snapshot for this transfer begins. Upon completing the SPI transfer,
+ *	this value may have changed compared to what was requested, depending
+ *	on the available snapshotting resolution (DMA transfer,
+ *	@ptp_sts_supported is false, etc).
+ * @ptp_sts_word_post: See @ptp_sts_word_post. The two can be equal (meaning
+ *	that a single byte should be snapshotted). The core will set
+ *	@ptp_sts_word_pre to 0, and @ptp_sts_word_post to the length of the
+ *	transfer, if @ptp_sts_supported is false for this controller. This is
+ *	done purposefully (instead of setting to spi_transfer->len - 1) to
+ *	denote that a transfer-level snapshot taken from within the driver may
+ *	still be of higher quality.
+ * @ptp_sts: Pointer to a memory location held by the SPI slave device where a
+ *	PTP system timestamp structure may lie. If drivers use PIO or their
+ *	hardware has some sort of assist for retrieving exact transfer timing,
+ *	they can (and should) assert @ptp_sts_supported and populate this
+ *	structure using the ptp_read_system_*ts helper functions.
  *
  * SPI transfers always write the same number of bytes as they read.
  * Protocol drivers should always provide @rx_buf and/or @tx_buf.
@@ -842,6 +876,10 @@ struct spi_transfer {
 
 	u32		effective_speed_hz;
 
+	unsigned int	ptp_sts_word_pre;
+	unsigned int	ptp_sts_word_post;
+	struct ptp_system_timestamp *ptp_sts;
+
 	struct list_head transfer_list;
 };
 
-- 
2.17.1


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH spi for-5.4 3/5] spi: spi-fsl-dspi: Use poll mode in case the platform IRQ is missing
  2019-08-18 18:25 [PATCH spi for-5.4 0/5] Deterministic SPI latency with NXP DSPI driver Vladimir Oltean
  2019-08-18 18:25 ` [PATCH spi for-5.4 1/5] spi: Use an abbreviated pointer to ctlr->cur_msg in __spi_pump_messages Vladimir Oltean
  2019-08-18 18:25 ` [PATCH spi for-5.4 2/5] spi: Add a PTP system timestamp to the transfer structure Vladimir Oltean
@ 2019-08-18 18:25 ` Vladimir Oltean
  2019-08-22 17:38   ` Mark Brown
  2019-08-18 18:25 ` [PATCH spi for-5.4 4/5] spi: spi-fsl-dspi: Implement the PTP system timestamping for TCFQ mode Vladimir Oltean
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 27+ messages in thread
From: Vladimir Oltean @ 2019-08-18 18:25 UTC (permalink / raw)
  To: broonie, h.feurstein, mlichvar, richardcochran, andrew, f.fainelli
  Cc: linux-spi, netdev, Vladimir Oltean

On platforms like LS1021A which use TCFQ mode, an interrupt needs to be
processed after each byte is TXed/RXed. I tried to make the DSPI
implementation on this SoC operate in other, more efficient modes (EOQ,
DMA) but it looks like it simply isn't possible.

Therefore allow the driver to operate in poll mode, to ease a bit of
this absurd amount of IRQ load generated in TCFQ mode. Doing so reduces
both the net time it takes to transmit a SPI message, as well as the
inter-frame jitter that occurs while doing so.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
 drivers/spi/spi-fsl-dspi.c | 81 ++++++++++++++++++++++++++++----------
 1 file changed, 61 insertions(+), 20 deletions(-)

diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c
index 238bbe172b79..4daf8c3d07b7 100644
--- a/drivers/spi/spi-fsl-dspi.c
+++ b/drivers/spi/spi-fsl-dspi.c
@@ -647,15 +647,11 @@ static void dspi_eoq_read(struct fsl_dspi *dspi)
 		dspi_push_rx(dspi, fifo_read(dspi));
 }
 
-static irqreturn_t dspi_interrupt(int irq, void *dev_id)
+static int dspi_rxtx(struct fsl_dspi *dspi)
 {
-	struct fsl_dspi *dspi = (struct fsl_dspi *)dev_id;
 	struct spi_message *msg = dspi->cur_msg;
-	u32 spi_sr, spi_tcr;
 	u16 spi_tcnt;
-
-	regmap_read(dspi->regmap, SPI_SR, &spi_sr);
-	regmap_write(dspi->regmap, SPI_SR, spi_sr);
+	u32 spi_tcr;
 
 	/* Get transfer counter (in number of SPI transfers). It was
 	 * reset to 0 when transfer(s) were started.
@@ -670,17 +666,52 @@ static irqreturn_t dspi_interrupt(int irq, void *dev_id)
 	else
 		dspi_tcfq_read(dspi);
 
-	if (!dspi->len) {
-		dspi->waitflags = 1;
-		wake_up_interruptible(&dspi->waitq);
-		return IRQ_HANDLED;
-	}
+	if (!dspi->len)
+		/* Success! */
+		return 0;
 
 	if (dspi->devtype_data->trans_mode == DSPI_EOQ_MODE)
 		dspi_eoq_write(dspi);
 	else
 		dspi_tcfq_write(dspi);
 
+	return -EINPROGRESS;
+}
+
+static int dspi_poll(struct fsl_dspi *dspi)
+{
+	int tries = 1000;
+	u32 spi_sr;
+
+	do {
+		regmap_read(dspi->regmap, SPI_SR, &spi_sr);
+		regmap_write(dspi->regmap, SPI_SR, spi_sr);
+
+		if (spi_sr & (SPI_SR_EOQF | SPI_SR_TCFQF))
+			break;
+	} while (--tries);
+
+	if (!tries)
+		return -ETIMEDOUT;
+
+	return dspi_rxtx(dspi);
+}
+
+static irqreturn_t dspi_interrupt(int irq, void *dev_id)
+{
+	struct fsl_dspi *dspi = (struct fsl_dspi *)dev_id;
+	u32 spi_sr;
+
+	regmap_read(dspi->regmap, SPI_SR, &spi_sr);
+	regmap_write(dspi->regmap, SPI_SR, spi_sr);
+
+	dspi_rxtx(dspi);
+
+	if (!dspi->len) {
+		dspi->waitflags = 1;
+		wake_up_interruptible(&dspi->waitq);
+	}
+
 	return IRQ_HANDLED;
 }
 
@@ -768,13 +799,18 @@ static int dspi_transfer_one_message(struct spi_controller *ctlr,
 			goto out;
 		}
 
-		if (trans_mode != DSPI_DMA_MODE) {
-			if (wait_event_interruptible(dspi->waitq,
-						     dspi->waitflags))
-				dev_err(&dspi->pdev->dev,
-					"wait transfer complete fail!\n");
+		if (!dspi->irq) {
+			do {
+				status = dspi_poll(dspi);
+			} while (status == -EINPROGRESS);
+		} else if (trans_mode != DSPI_DMA_MODE) {
+			status = wait_event_interruptible(dspi->waitq,
+							  dspi->waitflags);
 			dspi->waitflags = 0;
 		}
+		if (status)
+			dev_err(&dspi->pdev->dev,
+				"Waiting for transfer to complete failed!\n");
 
 		if (transfer->delay_usecs)
 			udelay(transfer->delay_usecs);
@@ -1074,10 +1110,13 @@ static int dspi_probe(struct platform_device *pdev)
 		goto out_ctlr_put;
 
 	dspi_init(dspi);
+
 	dspi->irq = platform_get_irq(pdev, 0);
-	if (dspi->irq < 0) {
-		ret = dspi->irq;
-		goto out_clk_put;
+	if (dspi->irq <= 0) {
+		dev_info(&pdev->dev,
+			 "can't get platform irq, using poll mode\n");
+		dspi->irq = 0;
+		goto poll_mode;
 	}
 
 	ret = devm_request_irq(&pdev->dev, dspi->irq, dspi_interrupt,
@@ -1087,6 +1126,9 @@ static int dspi_probe(struct platform_device *pdev)
 		goto out_clk_put;
 	}
 
+	init_waitqueue_head(&dspi->waitq);
+
+poll_mode:
 	if (dspi->devtype_data->trans_mode == DSPI_DMA_MODE) {
 		ret = dspi_request_dma(dspi, res->start);
 		if (ret < 0) {
@@ -1098,7 +1140,6 @@ static int dspi_probe(struct platform_device *pdev)
 	ctlr->max_speed_hz =
 		clk_get_rate(dspi->clk) / dspi->devtype_data->max_clock_factor;
 
-	init_waitqueue_head(&dspi->waitq);
 	platform_set_drvdata(pdev, ctlr);
 
 	ret = spi_register_controller(ctlr);
-- 
2.17.1


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH spi for-5.4 4/5] spi: spi-fsl-dspi: Implement the PTP system timestamping for TCFQ mode
  2019-08-18 18:25 [PATCH spi for-5.4 0/5] Deterministic SPI latency with NXP DSPI driver Vladimir Oltean
                   ` (2 preceding siblings ...)
  2019-08-18 18:25 ` [PATCH spi for-5.4 3/5] spi: spi-fsl-dspi: Use poll mode in case the platform IRQ is missing Vladimir Oltean
@ 2019-08-18 18:25 ` Vladimir Oltean
  2019-08-18 18:26 ` [PATCH spi for-5.4 5/5] spi: spi-fsl-dspi: Disable interrupts and preemption during poll mode transfer Vladimir Oltean
  2019-08-20 15:57 ` [PATCH spi for-5.4 0/5] Deterministic SPI latency with NXP DSPI driver Vladimir Oltean
  5 siblings, 0 replies; 27+ messages in thread
From: Vladimir Oltean @ 2019-08-18 18:25 UTC (permalink / raw)
  To: broonie, h.feurstein, mlichvar, richardcochran, andrew, f.fainelli
  Cc: linux-spi, netdev, Vladimir Oltean

In this mode, the DSPI controller uses PIO to transfer word by word. In
comparison, in EOQ mode the 4-word deep FIFO is being used, hence the
current logic will need some adaptation for which I do not have the
hardware (Coldfire) to test. It is not clear what is the timing of DMA
transfers and whether timestamping in the driver brings any overall
performance increase compared to regular timestamping done in the core.

Tested on LS1021A.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
 drivers/spi/spi-fsl-dspi.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c
index 4daf8c3d07b7..ea7169d18e09 100644
--- a/drivers/spi/spi-fsl-dspi.c
+++ b/drivers/spi/spi-fsl-dspi.c
@@ -129,6 +129,7 @@ enum dspi_trans_mode {
 struct fsl_dspi_devtype_data {
 	enum dspi_trans_mode	trans_mode;
 	u8			max_clock_factor;
+	bool			ptp_sts_supported;
 	bool			xspi_mode;
 };
 
@@ -140,12 +141,14 @@ static const struct fsl_dspi_devtype_data vf610_data = {
 static const struct fsl_dspi_devtype_data ls1021a_v1_data = {
 	.trans_mode		= DSPI_TCFQ_MODE,
 	.max_clock_factor	= 8,
+	.ptp_sts_supported	= true,
 	.xspi_mode		= true,
 };
 
 static const struct fsl_dspi_devtype_data ls2085a_data = {
 	.trans_mode		= DSPI_TCFQ_MODE,
 	.max_clock_factor	= 8,
+	.ptp_sts_supported	= true,
 };
 
 static const struct fsl_dspi_devtype_data coldfire_data = {
@@ -179,6 +182,11 @@ struct fsl_dspi {
 	int					irq;
 	struct clk				*clk;
 
+	struct ptp_system_timestamp		*ptp_sts;
+	const void				*ptp_sts_word_pre;
+	const void				*ptp_sts_word_post;
+	bool					take_snapshot_pre;
+	bool					take_snapshot_post;
 	struct spi_transfer			*cur_transfer;
 	struct spi_message			*cur_msg;
 	struct chip_data			*cur_chip;
@@ -653,6 +661,9 @@ static int dspi_rxtx(struct fsl_dspi *dspi)
 	u16 spi_tcnt;
 	u32 spi_tcr;
 
+	if (dspi->take_snapshot_post)
+		ptp_read_system_postts(dspi->ptp_sts);
+
 	/* Get transfer counter (in number of SPI transfers). It was
 	 * reset to 0 when transfer(s) were started.
 	 */
@@ -670,6 +681,12 @@ static int dspi_rxtx(struct fsl_dspi *dspi)
 		/* Success! */
 		return 0;
 
+	dspi->take_snapshot_pre = (dspi->tx == dspi->ptp_sts_word_pre);
+	dspi->take_snapshot_post = (dspi->tx == dspi->ptp_sts_word_post);
+
+	if (dspi->take_snapshot_pre)
+		ptp_read_system_prets(dspi->ptp_sts);
+
 	if (dspi->devtype_data->trans_mode == DSPI_EOQ_MODE)
 		dspi_eoq_write(dspi);
 	else
@@ -764,6 +781,10 @@ static int dspi_transfer_one_message(struct spi_controller *ctlr,
 			dspi->bytes_per_word = 2;
 		else
 			dspi->bytes_per_word = 4;
+		dspi->ptp_sts = transfer->ptp_sts;
+		dspi->ptp_sts_word_pre = spi_xfer_ptp_sts_word(transfer, true);
+		dspi->ptp_sts_word_post = spi_xfer_ptp_sts_word(transfer,
+								false);
 
 		regmap_update_bits(dspi->regmap, SPI_MCR,
 				   SPI_MCR_CLR_TXF | SPI_MCR_CLR_RXF,
@@ -776,6 +797,11 @@ static int dspi_transfer_one_message(struct spi_controller *ctlr,
 				     SPI_FRAME_EBITS(transfer->bits_per_word) |
 				     SPI_CTARE_DTCP(1));
 
+		dspi->take_snapshot_pre = (dspi->tx == dspi->ptp_sts_word_pre);
+
+		if (dspi->take_snapshot_pre)
+			ptp_read_system_prets(dspi->ptp_sts);
+
 		trans_mode = dspi->devtype_data->trans_mode;
 		switch (trans_mode) {
 		case DSPI_EOQ_MODE:
@@ -1140,6 +1166,8 @@ static int dspi_probe(struct platform_device *pdev)
 	ctlr->max_speed_hz =
 		clk_get_rate(dspi->clk) / dspi->devtype_data->max_clock_factor;
 
+	ctlr->ptp_sts_supported = dspi->devtype_data->ptp_sts_supported;
+
 	platform_set_drvdata(pdev, ctlr);
 
 	ret = spi_register_controller(ctlr);
-- 
2.17.1


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH spi for-5.4 5/5] spi: spi-fsl-dspi: Disable interrupts and preemption during poll mode transfer
  2019-08-18 18:25 [PATCH spi for-5.4 0/5] Deterministic SPI latency with NXP DSPI driver Vladimir Oltean
                   ` (3 preceding siblings ...)
  2019-08-18 18:25 ` [PATCH spi for-5.4 4/5] spi: spi-fsl-dspi: Implement the PTP system timestamping for TCFQ mode Vladimir Oltean
@ 2019-08-18 18:26 ` Vladimir Oltean
  2019-08-20 15:57 ` [PATCH spi for-5.4 0/5] Deterministic SPI latency with NXP DSPI driver Vladimir Oltean
  5 siblings, 0 replies; 27+ messages in thread
From: Vladimir Oltean @ 2019-08-18 18:26 UTC (permalink / raw)
  To: broonie, h.feurstein, mlichvar, richardcochran, andrew, f.fainelli
  Cc: linux-spi, netdev, Vladimir Oltean

While the poll mode helps reduce the overall latency in transmitting a
SPI message in the EOQ and TCFQ modes, the transmission can still have
jitter due to the CPU needing to service interrupts.

The transmission latency does not matter except in situations where the
SPI transfer represents the readout of a POSIX clock. In that case,
even with the byte-level PTP system timestamping in place, a pending IRQ
might find its way to be processed on the local CPU exactly during the
window when the transfer is snapshotted.

Disabling interrupts ensures the above situation never happens. When it
does, it manifests itself as random delay spikes, which throw off the
servo loop of phc2sys and make it lose lock.

Short phc2sys summary after 58 minutes of running without this patch:

  offset: min -26251 max 16416 mean -21.8672 std dev 863.416
  delay: min 4720 max 57280 mean 5182.49 std dev 1607.19
  lost servo lock 3 times

Summary of the same phc2sys service running for 120 minutes with the
patch:

  offset: min -378 max 381 mean -0.0083089 std dev 101.495
  delay: min 4720 max 5920 mean 5129.38 std dev 154.899
  lost servo lock 0 times

Disable interrupts unconditionally if running in poll mode.
Two aspects:
- If the DSPI driver is in IRQ mode, then disabling interrupts becomes a
  contradiction in terms. Poll mode is recommendable for predictable
  latency.
- In theory it should be possible to disable interrupts only for SPI
  transfers that represent an interaction with a POSIX clock. The driver
  can sense this by looking at transfer->ptp_sts. However enabling this
  unconditionally makes issues much more visible (and not just in fringe
  cases), were they to ever appear.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
 drivers/spi/spi-fsl-dspi.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c
index ea7169d18e09..c94574a20c8a 100644
--- a/drivers/spi/spi-fsl-dspi.c
+++ b/drivers/spi/spi-fsl-dspi.c
@@ -182,6 +182,8 @@ struct fsl_dspi {
 	int					irq;
 	struct clk				*clk;
 
+	/* Used to disable IRQs and preemption */
+	spinlock_t				lock;
 	struct ptp_system_timestamp		*ptp_sts;
 	const void				*ptp_sts_word_pre;
 	const void				*ptp_sts_word_post;
@@ -739,6 +741,7 @@ static int dspi_transfer_one_message(struct spi_controller *ctlr,
 	struct spi_device *spi = message->spi;
 	enum dspi_trans_mode trans_mode;
 	struct spi_transfer *transfer;
+	unsigned long flags = 0;
 	int status = 0;
 
 	message->actual_length = 0;
@@ -797,6 +800,9 @@ static int dspi_transfer_one_message(struct spi_controller *ctlr,
 				     SPI_FRAME_EBITS(transfer->bits_per_word) |
 				     SPI_CTARE_DTCP(1));
 
+		if (!dspi->irq)
+			spin_lock_irqsave(&dspi->lock, flags);
+
 		dspi->take_snapshot_pre = (dspi->tx == dspi->ptp_sts_word_pre);
 
 		if (dspi->take_snapshot_pre)
@@ -829,6 +835,9 @@ static int dspi_transfer_one_message(struct spi_controller *ctlr,
 			do {
 				status = dspi_poll(dspi);
 			} while (status == -EINPROGRESS);
+
+			spin_unlock_irqrestore(&dspi->lock, flags);
+
 		} else if (trans_mode != DSPI_DMA_MODE) {
 			status = wait_event_interruptible(dspi->waitq,
 							  dspi->waitflags);
@@ -1153,6 +1162,7 @@ static int dspi_probe(struct platform_device *pdev)
 	}
 
 	init_waitqueue_head(&dspi->waitq);
+	spin_lock_init(&dspi->lock);
 
 poll_mode:
 	if (dspi->devtype_data->trans_mode == DSPI_DMA_MODE) {
-- 
2.17.1


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH spi for-5.4 0/5] Deterministic SPI latency with NXP DSPI driver
  2019-08-18 18:25 [PATCH spi for-5.4 0/5] Deterministic SPI latency with NXP DSPI driver Vladimir Oltean
                   ` (4 preceding siblings ...)
  2019-08-18 18:26 ` [PATCH spi for-5.4 5/5] spi: spi-fsl-dspi: Disable interrupts and preemption during poll mode transfer Vladimir Oltean
@ 2019-08-20 15:57 ` Vladimir Oltean
  2019-08-20 16:57   ` Andrew Lunn
                     ` (2 more replies)
  5 siblings, 3 replies; 27+ messages in thread
From: Vladimir Oltean @ 2019-08-20 15:57 UTC (permalink / raw)
  To: Mark Brown, Hubert Feurstein, mlichvar, Richard Cochran,
	Andrew Lunn, Florian Fainelli
  Cc: linux-spi, netdev

On Sun, 18 Aug 2019 at 21:26, Vladimir Oltean <olteanv@gmail.com> wrote:
>
> This patchset proposes an interface from the SPI subsystem for
> software timestamping SPI transfers. There is a default implementation
> provided in the core, as well as a mechanism for SPI slave drivers to
> check which byte was in fact timestamped post-facto. The patchset also
> adds the first user of this interface (the NXP DSPI driver in TCFQ mode).
>
> The interface is somewhat similar to Hubert Feurstein's proposal for the
> MDIO subsystem: https://lkml.org/lkml/2019/8/16/638
>
> Original cover letter below. Also provided at the end some results with
> an extra test (J - phc2sys using the timestamps taken by the SPI core).
>
> ===========================================================
>
> Continuing the discussion created by Hubert Feurstein around the
> mv88e6xxx driver for MDIO-controlled switches
> (https://lkml.org/lkml/2019/8/2/1364), this patchset takes a similar
> approach for the NXP LS1021A-TSN board, which has a SPI-controlled DSA
> switch (SJA1105).
>
> The patchset is motivated by some experiments done with a logic
> analyzer, trying to understand the source of latency (and especially of
> the jitter). SJA1105 SPI messages for reading the PTP clock are 12 bytes
> in length: 4 for the SPI header and 8 for the timestamp. When looking at
> the messages with a scope, there's jitter basically everywhere: between
> bits of a frame and between frames in a transfer. The inter-bit jitter
> is hardware and impacts us to a lesser extend (is smaller and caused by
> the PVT stability of the oscillators, PLLs, etc). We will focus on the
> latency between consecutive SPI frames within a 12-byte transfer.
>
> As a preface, revisions of the DSPI controller IP are integrated in many
> Freescale/NXP devices. As a result, the driver has 3 modes of operation:
> - TCFQ (Transfer Complete Flag mode): The controller signals software
>   that data has been sent/received after each individual word.
> - EOQ (End of Queue mode): The driver can implement batching by making
>   use of the controller's 4-word deep FIFO.
> - DMA (Direct Memory Access mode): The SPI controller's FIFO is no
>   longer in direct interaction with the driver, but is used to trigger
>   the RX and TX channels of the eDMA module on the SoC.
>
> In LS1021A, the driver works in the least efficient mode of the 3
> (TCFQ). There is a well-known errata that the DSPI controller is broken
> in conjunction with the eDMA module. As for the EOQ mode, I have tried
> unsuccessfully for a few days to make use of the 4 entry FIFO, and the
> hardware simply fails to reliably acknowledge the transmission when the
> FIFO gets full. So it looks like we're stuck with the TCFQ mode.
>
> The problem with phc2sys on the LS1021A-TSN board is that in order for
> the gettime64() call to complete on the sja1105, the system has to
> service 12 IRQs. Intuitively that is excessive and is the main source of
> jitter, but let's not get ahead of ourselves.
>
> An outline of the experiments that were done (unless otherwise
> mentioned, all of these ran for 120 seconds):
>
> A. First I have measured the (poor) performance of phc2sys under current
>    conditions. (DSPI driver in IRQ mode, no PTP system timestamping)
>
>    offset: min -53310 max 16107 mean -1737.18 std dev 11444.3
>    delay: min 163680 max 237360 mean 201149 std dev 22446.6
>    lost servo lock 1 times
>
> B. I switched the .gettime64 callback to .gettimex64, snapshotting the
>    PTP system timestamp within the sja1105 driver.
>
>    offset: min -48923 max 64217 mean -904.137 std dev 17358.1
>    delay: min 149600 max 203840 mean 169045 std dev 17993.3
>    lost servo lock 8 times
>
> C. I patched "struct spi_transfer" to contain the PTP system timestamp,
>    and from the sja1105 driver, I passed this structure to be
>    snapshotted by the SPI controller's driver (spi-fsl-dspi). This is
>    the "transfer-level" snapshot.
>
>    offset: min -64979 max 38979 mean -416.197 std dev 15367.9
>    delay: min 125120 max 168320 mean 150286 std dev 17675.3
>    lost servo lock 10 times
>
> D. I changed the placement of the transfer snapshotting within the DSPI
>    driver, from "transfer-level" to "byte-level".
>
>    offset: min -9021 max 7149 mean -0.418803 std dev 3529.81
>    delay: min 7840 max 23920 mean 14493.7 std dev 5982.17
>    lost servo lock 0 times
>
> E. I moved the DSPI driver to poll mode. I went back to collecting the
>    PTP system timestamps from the sja1105 driver (same as B).
>
>    offset: min -4199 max 46643 mean 418.214 std dev 4554.01
>    delay: min 84000 max 194000 mean 99463.2 std dev 12936.5
>    lost servo lock 1 times
>
> F. Transfer-level snapshotting in the DSPI driver (same as C), but in
>    poll mode.
>
>    offset: min -24244 max 1115 mean -230.478 std dev 2297.28
>    delay: min 69440 max 119040 mean 70312.9 std dev 8065.34
>    lost servo lock 1 times
>
> G. Byte-level snapshotting (same as D) but in poll mode.
>
>    offset: min -314 max 288 mean -2.48718 std dev 118.045
>    delay: min 4880 max 6000 mean 5118.63 std dev 507.258
>    lost servo lock 0 times
>
>    This seemed suspiciously good to me, so I let it run for longer
>    (58 minutes):
>
>    offset: min -26251 max 16416 mean -21.8672 std dev 863.416
>    delay: min 4720 max 57280 mean 5182.49 std dev 1607.19
>    lost servo lock 3 times
>
> H. Transfer-level snapshotting (same as F), but with IRQs disabled.
>    This ran for 86 minutes.
>
>    offset: min -1927 max 1843 mean -0.209203 std dev 529.398
>    delay: min 85440 max 93680 mean 88245 std dev 1454.71
>    lost servo lock 0 times
>
> I. Byte-level snapshotting (same as G), but with IRQs disabled.
>    This ran for 102 minutes.
>
>    offset: min -378 max 381 mean -0.0083089 std dev 101.495
>    delay: min 4720 max 5920 mean 5129.38 std dev 154.899
>    lost servo lock 0 times
>
> J. Default snapshotting taken by the SPI core, with the DSPI driver
>    running in poll mode, IRQs enabled. This ran for 274 minutes.
>
>    offset: min -42568 max 44576 mean 2.91646 std dev 947.467
>    delay: min 58480 max 171040 mean 80750.7 std dev 2001.61
>    lost servo lock 3 times
>
> As a result, this patchset proposes the implementation of scenario I.
> The others were done through temporary patches which are not presented
> here due to the difficulty of presenting a coherent git history without
> resorting to reverts etc. The gist of each experiment should be clear
> though.
>
> The raw data is available for dissection at
> https://drive.google.com/open?id=1r9raU9ZeqOqkqts6Lb-ISf5ubLDLP3wk.
> The logic analyzer captures can be opened with a free-as-in-beer program
> provided by Saleae: https://www.saleae.com/downloads/.
>
> In the capture data one can find the MOSI, SCK SPI signals, as well as a
> debug GPIO which was toggled at the same time as the PTP system
> timestamp was taken, to give the viewer an impression of what the
> software is capturing compared to the actual timing of the SPI transfer.
>
> Attached are also some close-up screenshots of transfers where there is
> a clear and huge delay in-between frames of the same 12-byte SPI
> transfer. As it turns out, these were all caused by the CPU getting
> interrupted by some other IRQ. Approaches H and I are the only ones that
> get rid of these glitches. In theory, the byte-level snapshotting should
> be less vulnerable to an IRQ interrupting the SPI transfer (because the
> time window is much smaller) but as the 58 minutes experiment shows, it
> is not immune.
>
> Vladimir Oltean (5):
>   spi: Use an abbreviated pointer to ctlr->cur_msg in
>     __spi_pump_messages
>   spi: Add a PTP system timestamp to the transfer structure
>   spi: spi-fsl-dspi: Use poll mode in case the platform IRQ is missing
>   spi: spi-fsl-dspi: Implement the PTP system timestamping for TCFQ mode
>   spi: spi-fsl-dspi: Disable interrupts and preemption during poll mode
>     transfer
>
>  drivers/spi/spi-fsl-dspi.c | 117 +++++++++++++++++++++++++++++++------
>  drivers/spi/spi.c          |  85 +++++++++++++++++++++++----
>  include/linux/spi/spi.h    |  38 ++++++++++++
>  3 files changed, 210 insertions(+), 30 deletions(-)
>
> --
> 2.17.1
>

To the PTP/DSA people copied,

It's possible that I'm going to get a lot of hate for saying this, but
I think we're all missing the forest for the trees with these
ptp_system_timestamp patches.
I'm going to start off with 4 truisms:
- The best software timestamp is worse than a hardware timestamp
- DSA switches are switches that are connected to their host over Ethernet
- Ethernet has support for hardware timestamping
- The mess of taking precise hardware timestamps is well hidden from the kernel

You might see where I'm going.
Maybe this is all really a DSA-specific problem, and should be treated
as such (kept contained).
The claimed goal is to synchronize the DSA switches' time with phc2sys
to/from something else. The problem is that there is latency for
reading the PHC on a DSA device that is a discrete chip.
What if all we need is just a mini-"phc2sys-over-Ethernet" that runs
on a kernel thread internally to DSA? We say that DSA switches are
"slave" to the "master" netdevice - their PTP clock can be the same.
I am fairly confident that the sja1105 at least can be configured in
hardware to work in this mode. One just needs to enable the CPU port
in its own reachability matrix. None of the switch ports is really a
"CPU port" hardware speaking.
- TX timestamps are taken by installing a management route with a
specified port destination. That destination can be the port the frame
came from (the CPU port) if the above condition is met.
- RX timestamps are taken by the switch for frames matching one of 2
MAC filters. Then a short Ethernet frame containing metadata (RX
timestamp) is created and sent to the CPU port. If I enable RX
timestamping on the CPU port, then every management frame sent from
Linux will also generate an RX timestamp (as well as a TX timestamp,
but that is irrelevant).
I believe something similar should be possible on other hardware as well.

The kernel thread can loop back an Ethernet frame and use the 4
collected timestamps to calculate offset and delay.
The only question is how to manage the sync direction (DSA switch to
master, or vice-versa).
It is assumed that the master netdevice supports hardware timestamping
and has a PHC with lower access jitter. That might be a more common
thing than an MDIO or SPI controller with polled I/O and software
timestamping implemented.

Looking forward to comments. If I'm wrong and we do need to extend the
SPI and MDIO subsystems, then I'd better be wrong in any way we look
at the problem, because the alternative is rather intrusive and
tedious to do right (not to mention, not very reusable).

Thanks,
-Vladimir

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH spi for-5.4 0/5] Deterministic SPI latency with NXP DSPI driver
  2019-08-20 15:57 ` [PATCH spi for-5.4 0/5] Deterministic SPI latency with NXP DSPI driver Vladimir Oltean
@ 2019-08-20 16:57   ` Andrew Lunn
  2019-08-21  4:38   ` Richard Cochran
  2019-08-21  4:42   ` Richard Cochran
  2 siblings, 0 replies; 27+ messages in thread
From: Andrew Lunn @ 2019-08-20 16:57 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Mark Brown, Hubert Feurstein, mlichvar, Richard Cochran,
	Florian Fainelli, linux-spi, netdev

> - Ethernet has support for hardware timestamping

True, but not all drivers support it.

Marvell switches are often combined with Marvell MACs. None of the
Marvell MAC drivers, mv643xx, mvneta, mvpp2 or octeontx2 support
hardware timestamping. My guess is, the hardware probably supports it,
but nobody has taken the time to writing the driver code. FEC does
have PTP support, but what does it look like in general? Are Marvell
drives the exception, or the norm?

What we have been talking about in this thread, adding timestamp calls
at various places, is simple, compared to adding driver code to a
number of MAC drivers. We can get a lot of 'bang for our buck' with
time stamps, it is easy to copy to other drivers, you don't need a
good knowledge of the hardware, datasheet, etc.

So if we can get this accepted, we should try to do it. You can always
come back later and add your full hardware solution.

   Andrew

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH spi for-5.4 1/5] spi: Use an abbreviated pointer to ctlr->cur_msg in __spi_pump_messages
  2019-08-18 18:25 ` [PATCH spi for-5.4 1/5] spi: Use an abbreviated pointer to ctlr->cur_msg in __spi_pump_messages Vladimir Oltean
@ 2019-08-20 18:21   ` Mark Brown
  2019-08-20 19:36     ` Vladimir Oltean
  0 siblings, 1 reply; 27+ messages in thread
From: Mark Brown @ 2019-08-20 18:21 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: h.feurstein, mlichvar, richardcochran, andrew, f.fainelli,
	linux-spi, netdev

[-- Attachment #1: Type: text/plain, Size: 345 bytes --]

On Sun, Aug 18, 2019 at 09:25:56PM +0300, Vladimir Oltean wrote:

>  	/* Extract head of queue */
> -	ctlr->cur_msg =
> -		list_first_entry(&ctlr->queue, struct spi_message, queue);
> +	mesg = list_first_entry(&ctlr->queue, struct spi_message, queue);
> +	ctlr->cur_msg = mesg;

Why mesg when the existing code uses msg as an abbreviation here?

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH spi for-5.4 1/5] spi: Use an abbreviated pointer to ctlr->cur_msg in __spi_pump_messages
  2019-08-20 18:21   ` Mark Brown
@ 2019-08-20 19:36     ` Vladimir Oltean
  2019-08-21 11:01       ` Mark Brown
  0 siblings, 1 reply; 27+ messages in thread
From: Vladimir Oltean @ 2019-08-20 19:36 UTC (permalink / raw)
  To: Mark Brown
  Cc: Hubert Feurstein, mlichvar, Richard Cochran, Andrew Lunn,
	Florian Fainelli, linux-spi, netdev

Hi Mark,

On Tue, 20 Aug 2019 at 21:21, Mark Brown <broonie@kernel.org> wrote:
>
> On Sun, Aug 18, 2019 at 09:25:56PM +0300, Vladimir Oltean wrote:
>
> >       /* Extract head of queue */
> > -     ctlr->cur_msg =
> > -             list_first_entry(&ctlr->queue, struct spi_message, queue);
> > +     mesg = list_first_entry(&ctlr->queue, struct spi_message, queue);
> > +     ctlr->cur_msg = mesg;
>
> Why mesg when the existing code uses msg as an abbreviation here?

Does it matter? I took from spi_finalize_current_message which also uses mesg.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH spi for-5.4 0/5] Deterministic SPI latency with NXP DSPI driver
  2019-08-20 15:57 ` [PATCH spi for-5.4 0/5] Deterministic SPI latency with NXP DSPI driver Vladimir Oltean
  2019-08-20 16:57   ` Andrew Lunn
@ 2019-08-21  4:38   ` Richard Cochran
  2019-08-21 14:08     ` Richard Cochran
  2019-08-21  4:42   ` Richard Cochran
  2 siblings, 1 reply; 27+ messages in thread
From: Richard Cochran @ 2019-08-21  4:38 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Mark Brown, Hubert Feurstein, mlichvar, Andrew Lunn,
	Florian Fainelli, linux-spi, netdev

On Tue, Aug 20, 2019 at 06:57:10PM +0300, Vladimir Oltean wrote:
> What if all we need is just a mini-"phc2sys-over-Ethernet" that runs
> on a kernel thread internally to DSA? We say that DSA switches are
> "slave" to the "master" netdevice - their PTP clock can be the same.
> I am fairly confident that the sja1105 at least can be configured in
> hardware to work in this mode. One just needs to enable the CPU port
> in its own reachability matrix. None of the switch ports is really a
> "CPU port" hardware speaking.

I did consider this method when working on an early version of the
marvell driver.  At least for that chip, there was no way to get the
time stamps on the port attached to the CPU port.

The DSA system (as I understand it) does not allow using the Linux
interface acting as the CPU port in a way that would allow taking time
stamps with the existing code base.  You might find a way, but I guess
it won't be easy.

Overall, the PTP switch use case is well supported by Linux.  The
synchronization of the management CPU to the PTP, while nice to have,
is not required to implement a Transparent Clock.  Your specific
application might require it, but honestly, if the management CPU
needs good synchronization, then you really aught to feed a PPS from
the switch into a gpio (for example) on the CPU.

Using SPI or MDIO or I2C or UART as a synchronization bus is not a
wise solution.

Having said that, I don't oppose improving the situation for these
slow, non-deterministic serial buses, but you will have to sell your
solution to the maintainers of said buses.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH spi for-5.4 0/5] Deterministic SPI latency with NXP DSPI driver
  2019-08-20 15:57 ` [PATCH spi for-5.4 0/5] Deterministic SPI latency with NXP DSPI driver Vladimir Oltean
  2019-08-20 16:57   ` Andrew Lunn
  2019-08-21  4:38   ` Richard Cochran
@ 2019-08-21  4:42   ` Richard Cochran
  2 siblings, 0 replies; 27+ messages in thread
From: Richard Cochran @ 2019-08-21  4:42 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Mark Brown, Hubert Feurstein, mlichvar, Andrew Lunn,
	Florian Fainelli, linux-spi, netdev

On Tue, Aug 20, 2019 at 06:57:10PM +0300, Vladimir Oltean wrote:
> I believe something similar should be possible on other hardware as well.

Not for the marvell link street devices, AFAICT.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH spi for-5.4 1/5] spi: Use an abbreviated pointer to ctlr->cur_msg in __spi_pump_messages
  2019-08-20 19:36     ` Vladimir Oltean
@ 2019-08-21 11:01       ` Mark Brown
  0 siblings, 0 replies; 27+ messages in thread
From: Mark Brown @ 2019-08-21 11:01 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Hubert Feurstein, mlichvar, Richard Cochran, Andrew Lunn,
	Florian Fainelli, linux-spi, netdev

[-- Attachment #1: Type: text/plain, Size: 795 bytes --]

On Tue, Aug 20, 2019 at 10:36:43PM +0300, Vladimir Oltean wrote:
> On Tue, 20 Aug 2019 at 21:21, Mark Brown <broonie@kernel.org> wrote:

> > On Sun, Aug 18, 2019 at 09:25:56PM +0300, Vladimir Oltean wrote:

> > >       /* Extract head of queue */
> > > -     ctlr->cur_msg =
> > > -             list_first_entry(&ctlr->queue, struct spi_message, queue);
> > > +     mesg = list_first_entry(&ctlr->queue, struct spi_message, queue);
> > > +     ctlr->cur_msg = mesg;

> > Why mesg when the existing code uses msg as an abbreviation here?

> Does it matter? I took from spi_finalize_current_message which also uses mesg.

It's particularly visible when it's on the same line, flags up a
question about if things are the same.  Other things not being great
doesn't preclude making this one better.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH spi for-5.4 0/5] Deterministic SPI latency with NXP DSPI driver
  2019-08-21  4:38   ` Richard Cochran
@ 2019-08-21 14:08     ` Richard Cochran
  2019-08-21 20:17       ` Vladimir Oltean
  0 siblings, 1 reply; 27+ messages in thread
From: Richard Cochran @ 2019-08-21 14:08 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Mark Brown, Hubert Feurstein, mlichvar, Andrew Lunn,
	Florian Fainelli, linux-spi, netdev

On Tue, Aug 20, 2019 at 09:38:45PM -0700, Richard Cochran wrote:
> Overall, the PTP switch use case is well supported by Linux.  The
> synchronization of the management CPU to the PTP, while nice to have,
> is not required to implement a Transparent Clock.  Your specific
> application might require it, but honestly, if the management CPU
> needs good synchronization, then you really aught to feed a PPS from
> the switch into a gpio (for example) on the CPU.

Another way to achieve this is to have a second MAC interface on the
management CPU connected to a spare port on the switch.  Then time
stamping, PHC, ptp4l, and phc2sys work as expected.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH spi for-5.4 0/5] Deterministic SPI latency with NXP DSPI driver
  2019-08-21 14:08     ` Richard Cochran
@ 2019-08-21 20:17       ` Vladimir Oltean
  2019-08-22 14:16         ` Richard Cochran
  0 siblings, 1 reply; 27+ messages in thread
From: Vladimir Oltean @ 2019-08-21 20:17 UTC (permalink / raw)
  To: Richard Cochran
  Cc: Mark Brown, Hubert Feurstein, Miroslav Lichvar, Andrew Lunn,
	Florian Fainelli, linux-spi, netdev

Hi Richard,

On Wed, 21 Aug 2019 at 17:08, Richard Cochran <richardcochran@gmail.com> wrote:
>
> On Tue, Aug 20, 2019 at 09:38:45PM -0700, Richard Cochran wrote:
> > Overall, the PTP switch use case is well supported by Linux.  The
> > synchronization of the management CPU to the PTP, while nice to have,
> > is not required to implement a Transparent Clock.  Your specific
> > application might require it, but honestly, if the management CPU
> > needs good synchronization, then you really aught to feed a PPS from
> > the switch into a gpio (for example) on the CPU.
>
> Another way to achieve this is to have a second MAC interface on the
> management CPU connected to a spare port on the switch.  Then time
> stamping, PHC, ptp4l, and phc2sys work as expected.
>
> Thanks,
> Richard

Of course PPS with a dedicated hardware receiver that can take input
compare timestamps is always preferable. However non-Ethernet
synchronization in the field looks to me like "make do with whatever
you can". I'm not sure a plain GPIO that raises an interrupt is better
than an interrupt-driven serial protocol controller - it's (mostly)
the interrupts that throw off the precision of the software timestamp.
And use Miroslav's pps-gpio-poll module and you're back from where you
started (try to make a sw timestamp as precise as possible).
As for dedicating a second interface pair in (basically) loopback just
for sync, that's how I'm testing PTP when I don't have a second board
and hence how the idea occurred to me. I can imagine this even getting
deployed and I can also probably name an example, but it certainly
wouldn't be my first choice. But DSA could have that built-in, and
with the added latency benefit of a MAC-to-MAC connection.
Too bad the mv88e6xxx driver can't do loopback timestamping, that's
already 50% of the DSA drivers that support PTP at all. An embedded
solution for this is less compelling now.

Regards,
-Vladimir

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH spi for-5.4 0/5] Deterministic SPI latency with NXP DSPI driver
  2019-08-21 20:17       ` Vladimir Oltean
@ 2019-08-22 14:16         ` Richard Cochran
  2019-08-22 14:56           ` Andrew Lunn
  2019-08-22 14:58           ` Vladimir Oltean
  0 siblings, 2 replies; 27+ messages in thread
From: Richard Cochran @ 2019-08-22 14:16 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Mark Brown, Hubert Feurstein, Miroslav Lichvar, Andrew Lunn,
	Florian Fainelli, linux-spi, netdev

On Wed, Aug 21, 2019 at 11:17:23PM +0300, Vladimir Oltean wrote:
> Of course PPS with a dedicated hardware receiver that can take input
> compare timestamps is always preferable. However non-Ethernet
> synchronization in the field looks to me like "make do with whatever
> you can". I'm not sure a plain GPIO that raises an interrupt is better
> than an interrupt-driven serial protocol controller - it's (mostly)
> the interrupts that throw off the precision of the software timestamp.
> And use Miroslav's pps-gpio-poll module and you're back from where you
> started (try to make a sw timestamp as precise as possible).

Right, it might be better, might not.  You can consider hacking a
local time stamp into the ISR.  Also, if one of your MACs has a input
event pin, you can feed the switch's PPS output in there.

> wouldn't be my first choice. But DSA could have that built-in, and
> with the added latency benefit of a MAC-to-MAC connection.
> Too bad the mv88e6xxx driver can't do loopback timestamping, that's
> already 50% of the DSA drivers that support PTP at all. An embedded
> solution for this is less compelling now.

Let me back track on my statement about mv88e6xxx.  At the time, I
didn't see any practical way to use the CPU port for synchronization,
but I forget exactly the details.  Maybe it is indeed possible,
somehow.  If you can find a way that will work on your switch and on
the Marvell, then I'd like to hear about it.

Thinking back...

One problem is this.  PTP requires a delay measurement.  You can send
a delay request from the host, but there will never be a reply.

Another problem is this.  A Sync message arriving on an external port
is time stamped there, but then it is encapsulated as a tagged DSA
management message and delivered out the CPU port.  At this point, it
is no longer a PTP frame and will not be time stamped at the CPU port
on egress.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH spi for-5.4 0/5] Deterministic SPI latency with NXP DSPI driver
  2019-08-22 14:16         ` Richard Cochran
@ 2019-08-22 14:56           ` Andrew Lunn
  2019-08-22 14:58           ` Vladimir Oltean
  1 sibling, 0 replies; 27+ messages in thread
From: Andrew Lunn @ 2019-08-22 14:56 UTC (permalink / raw)
  To: Richard Cochran
  Cc: Vladimir Oltean, Mark Brown, Hubert Feurstein, Miroslav Lichvar,
	Florian Fainelli, linux-spi, netdev

> Thinking back...
> 
> One problem is this.  PTP requires a delay measurement.  You can send
> a delay request from the host, but there will never be a reply.
> 
> Another problem is this.  A Sync message arriving on an external port
> is time stamped there, but then it is encapsulated as a tagged DSA
> management message and delivered out the CPU port.  At this point, it
> is no longer a PTP frame and will not be time stamped at the CPU port
> on egress.

I think so that both the host interface and the CPU port recognize the
frame and time stamp it, it needs to be untagged. Otherwise, as you
said, the hardware does not recognise it. I've never tried sending
untagged frames to the CPU port. I expect they are just dropped.

However, somebody might want to play with the TCAM. The TCAM can
redirect a packet out any port. I've no idea what the pipeline
ordering is, but it might be possible for the TCAM to redirect a frame
back to the host interface, before it gets dropped because it does not
have DSA tags?  But is the TCAM before or after PTP in the pipeline?
Could you then get 4 timestamps for the same frame?  Host egress,
switch ingress, switch egress, host ingress?

But how do you make this generic? Can other switches also loop a frame
back like this and do the same time stamping? How do you actually get
access to these time stamps split over two blocks of hardware?

So in theory, this might be possible, but in practice?

     Andrew

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH spi for-5.4 0/5] Deterministic SPI latency with NXP DSPI driver
  2019-08-22 14:16         ` Richard Cochran
  2019-08-22 14:56           ` Andrew Lunn
@ 2019-08-22 14:58           ` Vladimir Oltean
  2019-08-22 16:05             ` Richard Cochran
  2019-08-22 16:10             ` Richard Cochran
  1 sibling, 2 replies; 27+ messages in thread
From: Vladimir Oltean @ 2019-08-22 14:58 UTC (permalink / raw)
  To: Richard Cochran
  Cc: Mark Brown, Hubert Feurstein, Miroslav Lichvar, Andrew Lunn,
	Florian Fainelli, linux-spi, netdev

On Thu, 22 Aug 2019 at 17:16, Richard Cochran <richardcochran@gmail.com> wrote:
>
> On Wed, Aug 21, 2019 at 11:17:23PM +0300, Vladimir Oltean wrote:
> > Of course PPS with a dedicated hardware receiver that can take input
> > compare timestamps is always preferable. However non-Ethernet
> > synchronization in the field looks to me like "make do with whatever
> > you can". I'm not sure a plain GPIO that raises an interrupt is better
> > than an interrupt-driven serial protocol controller - it's (mostly)
> > the interrupts that throw off the precision of the software timestamp.
> > And use Miroslav's pps-gpio-poll module and you're back from where you
> > started (try to make a sw timestamp as precise as possible).
>
> Right, it might be better, might not.  You can consider hacking a
> local time stamp into the ISR.  Also, if one of your MACs has a input
> event pin, you can feed the switch's PPS output in there.
>
> > wouldn't be my first choice. But DSA could have that built-in, and
> > with the added latency benefit of a MAC-to-MAC connection.
> > Too bad the mv88e6xxx driver can't do loopback timestamping, that's
> > already 50% of the DSA drivers that support PTP at all. An embedded
> > solution for this is less compelling now.
>
> Let me back track on my statement about mv88e6xxx.  At the time, I
> didn't see any practical way to use the CPU port for synchronization,
> but I forget exactly the details.  Maybe it is indeed possible,
> somehow.  If you can find a way that will work on your switch and on
> the Marvell, then I'd like to hear about it.
>
> Thinking back...
>
> One problem is this.  PTP requires a delay measurement.  You can send
> a delay request from the host, but there will never be a reply.
>

I don't think I understand the problem here.
You need to think about this as a sort of degenerate PTP where the
master and the slave are under the same device's management, not the
full stack. I never actually said I want to make ptp4l work over the
CPU port.
So instead of the typical:

Master (device A)                Slave (device B)

    |                            |         Tstamps known
 t1 |------\      Sync           |         to slave
    |       \-----\              |
    |              \-----\       |
    |                     \----->| t2      t2
    |------\    Follow_up        |
    |       \-----\              |
    |              \-----\       |
    |                     \----->| t1      t1, t2
    |                            |
    |          Delay_req  /------| t3      t1, t2, t3
    |              /-----/       |
    |       /-----/              |
 t4 |<-----/                     |
    |                            |
    |------\    Follow_up        |
    |       \-----\              |
    |              \-----\       |
    |                     \----->|         t1, t2, t3, t4
    |                            |
    |                            |
    |                            |
    |                            |
    v           time             v

You'd have something like this:

Master (DSA master port)         Slave (switch CPU port)

    |                            |         Tstamps known
    |                            |         to slave
    |       Local_sync_req       |
 t1 |------\                     |         t1
    |       \-----\              |
    |              \-----\       |
    |                     \----->| t2      t1, t2
    |                            |
    |     Local_sync_resp /------| t3      t1, t2, t3
    |              /-----/       |
    |       /-----/              |
 t4 |<-----/                     |         t1, t2, t3, t4
    |                            |
    |                            |
    v           time             v

As far as I understand PTP, the other messages are just protocol
blah-blah because the slave doesn't know what the master knows, which
is clearly not applicable here.
t1, t2, t3, t4 still keep the same definitions though (master TX
timestamp, slave RX timestamp, slave TX timestamp, master RX
timestamp).
I'm 90% certain the sja1105 can take an RX timestamp for a management
frame (e.g. one for which a TX timestamp was requested) sent in
loopback.

> Another problem is this.  A Sync message arriving on an external port
> is time stamped there, but then it is encapsulated as a tagged DSA
> management message and delivered out the CPU port.  At this point, it
> is no longer a PTP frame and will not be time stamped at the CPU port
> on egress.
>

But you don't mean a TX timestamp at the egress of swp4 here, do you?

 +---------------------------------+
 |  Management CPU                 |
 |                                 |
 |           DSA master            |
 |             +-----+             |
 |             |     |             |
 |             |     |             |
 +-------------+-----+-------------+
           eth0   ^ RX tstamp
                  |
                  | TX tstamp
        (swp4) CPU port
 +-------------+-----+-------------+
 |  Switch     |     |             |
 |             |     |             |
 |             +-----+             |
 |                ^ T              |
 |                |                |
 |    +-----------+                |
 |    |                            |
 |    |                            |
 | +-----+ +-----+ +-----+ +-----+ |
 | |     | |     | |     | |     | |
 | |     | |     | |     | |     | |
 +-+-----+-+-----+-+-----+-+-----+-+
   swp0    swp1    swp2    swp3
      ^
      | RX tstamp

Why would that matter?
Or just the RX timestamp at eth0?
If you mean the latter, then yes, having HWTSTAMP_FILTER_ALL in the
rx_filter of the DSA master is a hard requirement.

> Thanks,
> Richard

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH spi for-5.4 0/5] Deterministic SPI latency with NXP DSPI driver
  2019-08-22 14:58           ` Vladimir Oltean
@ 2019-08-22 16:05             ` Richard Cochran
  2019-08-22 16:13               ` Vladimir Oltean
  2019-08-22 16:10             ` Richard Cochran
  1 sibling, 1 reply; 27+ messages in thread
From: Richard Cochran @ 2019-08-22 16:05 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Mark Brown, Hubert Feurstein, Miroslav Lichvar, Andrew Lunn,
	Florian Fainelli, linux-spi, netdev

On Thu, Aug 22, 2019 at 05:58:49PM +0300, Vladimir Oltean wrote:
> I don't think I understand the problem here.

On the contrary, I do.

> You'd have something like this:
> 
> Master (DSA master port)         Slave (switch CPU port)
> 
>     |                            |         Tstamps known
>     |                            |         to slave
>     |       Local_sync_req       |
>  t1 |------\                     |         t1
>     |       \-----\              |
>     |              \-----\       |
>     |                     \----->| t2      t1, t2
>     |                            |
>     |     Local_sync_resp /------| t3      t1, t2, t3
>     |              /-----/       |
>     |       /-----/              |
>  t4 |<-----/                     |         t1, t2, t3, t4
>     |                            |
>     |                            |
>     v           time             v

And who generates Local_sync_resp?

Also, what sort of frame is it?  PTP has no Sync request or response.

> But you don't mean a TX timestamp at the egress of swp4 here, do you?

Yes, I do.
 
> Why would that matter?

Because in order to synchronize to an external GM, you need to measure
two things:

1. the (unchanging) delay from MAC to MAC
2. the (per-packet) switch residence time

Thanks,
Richard

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH spi for-5.4 0/5] Deterministic SPI latency with NXP DSPI driver
  2019-08-22 14:58           ` Vladimir Oltean
  2019-08-22 16:05             ` Richard Cochran
@ 2019-08-22 16:10             ` Richard Cochran
  1 sibling, 0 replies; 27+ messages in thread
From: Richard Cochran @ 2019-08-22 16:10 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Mark Brown, Hubert Feurstein, Miroslav Lichvar, Andrew Lunn,
	Florian Fainelli, linux-spi, netdev

On Thu, Aug 22, 2019 at 05:58:49PM +0300, Vladimir Oltean wrote:
> If you mean the latter, then yes, having HWTSTAMP_FILTER_ALL in the
> rx_filter of the DSA master is a hard requirement.

And to clear, the marvell only recognizes PTP frames.  That means that
DSA frames cannot be time stamped.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH spi for-5.4 0/5] Deterministic SPI latency with NXP DSPI driver
  2019-08-22 16:05             ` Richard Cochran
@ 2019-08-22 16:13               ` Vladimir Oltean
  2019-08-23  5:22                 ` Richard Cochran
  0 siblings, 1 reply; 27+ messages in thread
From: Vladimir Oltean @ 2019-08-22 16:13 UTC (permalink / raw)
  To: Richard Cochran
  Cc: Mark Brown, Hubert Feurstein, Miroslav Lichvar, Andrew Lunn,
	Florian Fainelli, linux-spi, netdev

On Thu, 22 Aug 2019 at 19:05, Richard Cochran <richardcochran@gmail.com> wrote:
>
> On Thu, Aug 22, 2019 at 05:58:49PM +0300, Vladimir Oltean wrote:
> > I don't think I understand the problem here.
>
> On the contrary, I do.
>

You do think that I understand the problem? But I don't!

> > You'd have something like this:
> >
> > Master (DSA master port)         Slave (switch CPU port)
> >
> >     |                            |         Tstamps known
> >     |                            |         to slave
> >     |       Local_sync_req       |
> >  t1 |------\                     |         t1
> >     |       \-----\              |
> >     |              \-----\       |
> >     |                     \----->| t2      t1, t2
> >     |                            |
> >     |     Local_sync_resp /------| t3      t1, t2, t3
> >     |              /-----/       |
> >     |       /-----/              |
> >  t4 |<-----/                     |         t1, t2, t3, t4
> >     |                            |
> >     |                            |
> >     v           time             v
>
> And who generates Local_sync_resp?
>

Local_sync_resp is the same as Local_sync_req except maybe with a
custom tag added by the switch. Irrelevant as long as the DSA master
can timestamp it.

> Also, what sort of frame is it?  PTP has no Sync request or response.
>

A frame that can be timestamped on RX and TX by the DSA switch and
master, that is not a PTP frame.

> > But you don't mean a TX timestamp at the egress of swp4 here, do you?
>
> Yes, I do.
>
> > Why would that matter?
>
> Because in order to synchronize to an external GM, you need to measure
> two things:
>
> 1. the (unchanging) delay from MAC to MAC
> 2. the (per-packet) switch residence time
>

But since when are we discussing the synchronization to an external
grandmaster? I don't see the connection.

> Thanks,
> Richard

Regards,
-Vladimir

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH spi for-5.4 2/5] spi: Add a PTP system timestamp to the transfer structure
  2019-08-18 18:25 ` [PATCH spi for-5.4 2/5] spi: Add a PTP system timestamp to the transfer structure Vladimir Oltean
@ 2019-08-22 17:11   ` Mark Brown
  2019-08-24 12:38     ` Vladimir Oltean
  0 siblings, 1 reply; 27+ messages in thread
From: Mark Brown @ 2019-08-22 17:11 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: h.feurstein, mlichvar, richardcochran, andrew, f.fainelli,
	linux-spi, netdev

[-- Attachment #1: Type: text/plain, Size: 1695 bytes --]

On Sun, Aug 18, 2019 at 09:25:57PM +0300, Vladimir Oltean wrote:

> @@ -1391,6 +1402,13 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread)
>  		goto out;
>  	}
>  
> +	if (!ctlr->ptp_sts_supported) {
> +		list_for_each_entry(xfer, &mesg->transfers, transfer_list) {
> +			xfer->ptp_sts_word_pre = 0;
> +			ptp_read_system_prets(xfer->ptp_sts);
> +		}
> +	}
> +

We can do better than this for controllers which use transfer_one().

> +const void *spi_xfer_ptp_sts_word(struct spi_transfer *xfer, bool pre)
> +{

xfer can be const here too.

> + * @ptp_sts_supported: If the driver sets this to true, it must provide a
> + *	time snapshot in @spi_transfer->ptp_sts as close as possible to the
> + *	moment in time when @spi_transfer->ptp_sts_word_pre and
> + *	@spi_transfer->ptp_sts_word_post were transmitted.
> + *	If the driver does not set this, the SPI core takes the snapshot as
> + *	close to the driver hand-over as possible.

A couple of issues here.  The big one is that for PIO transfers
this is going to either complicate the code or introduce overhead
in individual drivers for an extremely niche use case.  I guess
most drivers won't implement it which makes this a bit moot but
then this is a concern that pushes back against the idea of
implementing the feature.

The other is that it's not 100% clear what you're looking to
timestamp here - is it when the data goes on the wire, is it when
the data goes on the FIFO (which could be relatively large)?  I'm
guessing you're looking for the physical transfer here, if that's
the case should there be some effort to compensate for the delays
in the controller?

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH spi for-5.4 3/5] spi: spi-fsl-dspi: Use poll mode in case the platform IRQ is missing
  2019-08-18 18:25 ` [PATCH spi for-5.4 3/5] spi: spi-fsl-dspi: Use poll mode in case the platform IRQ is missing Vladimir Oltean
@ 2019-08-22 17:38   ` Mark Brown
  0 siblings, 0 replies; 27+ messages in thread
From: Mark Brown @ 2019-08-22 17:38 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: h.feurstein, mlichvar, richardcochran, andrew, f.fainelli,
	linux-spi, netdev

[-- Attachment #1: Type: text/plain, Size: 435 bytes --]

On Sun, Aug 18, 2019 at 09:25:58PM +0300, Vladimir Oltean wrote:
> On platforms like LS1021A which use TCFQ mode, an interrupt needs to be
> processed after each byte is TXed/RXed. I tried to make the DSPI
> implementation on this SoC operate in other, more efficient modes (EOQ,
> DMA) but it looks like it simply isn't possible.

This doesn't apply against current code (I guess due to your cleanup
series), please check and resend.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH spi for-5.4 0/5] Deterministic SPI latency with NXP DSPI driver
  2019-08-22 16:13               ` Vladimir Oltean
@ 2019-08-23  5:22                 ` Richard Cochran
  2019-08-24 12:13                   ` Vladimir Oltean
  0 siblings, 1 reply; 27+ messages in thread
From: Richard Cochran @ 2019-08-23  5:22 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Mark Brown, Hubert Feurstein, Miroslav Lichvar, Andrew Lunn,
	Florian Fainelli, linux-spi, netdev

On Thu, Aug 22, 2019 at 07:13:12PM +0300, Vladimir Oltean wrote:
> You do think that I understand the problem? But I don't!

;^)

> > And who generates Local_sync_resp?
> >
> 
> Local_sync_resp is the same as Local_sync_req except maybe with a
> custom tag added by the switch. Irrelevant as long as the DSA master
> can timestamp it.

So this is point why it won't work.  The time stamping logic in the
switch only recognizes PTP frames.
 
Thanks,
Richard

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH spi for-5.4 0/5] Deterministic SPI latency with NXP DSPI driver
  2019-08-23  5:22                 ` Richard Cochran
@ 2019-08-24 12:13                   ` Vladimir Oltean
  0 siblings, 0 replies; 27+ messages in thread
From: Vladimir Oltean @ 2019-08-24 12:13 UTC (permalink / raw)
  To: Richard Cochran
  Cc: Mark Brown, Hubert Feurstein, Miroslav Lichvar, Andrew Lunn,
	Florian Fainelli, linux-spi, netdev

On Fri, 23 Aug 2019 at 08:22, Richard Cochran <richardcochran@gmail.com> wrote:
>
> On Thu, Aug 22, 2019 at 07:13:12PM +0300, Vladimir Oltean wrote:
> > You do think that I understand the problem? But I don't!
>
> ;^)
>
> > > And who generates Local_sync_resp?
> > >
> >
> > Local_sync_resp is the same as Local_sync_req except maybe with a
> > custom tag added by the switch. Irrelevant as long as the DSA master
> > can timestamp it.
>
> So this is point why it won't work.  The time stamping logic in the
> switch only recognizes PTP frames.
>
> Thanks,
> Richard

So to summarize the pros and cons of a PHC-to-PHC loopback sync:
Pros:
- At least two orders of magnitude improvement in offset compared to a
software timestamping solution, even in the situation where the
software timestamp is fully optimized
- Does not depend on the availability of PPS hardware
- In the case where both MACs support this, the synchronization can
simply reuse the DSA link with no dedicated circuitry
Cons:
- DSA framework for retrieving timestamps would need to be reworked
- The solution would have to be implemented in the kernel
- A separate protocol from PTP would have to be devised
- Not all DSA masters support hardware timestamping. Of those that do,
not all may support timestamping generic frames
- Not all PTP-capable DSA switches support timestamping generic frames
- Not all DSA switches may be able to loop back traffic from their CPU
port. I think this is called "hairpinning".
- The solution only covers the sync of the top-most switch in the DSA
tree. The hairpinning described above would need to be selective as
well, not just possible.

So at this point, the solution is not generic enough for me to be
compelled to prototype it. Taking system clock timestamps in the SPI
driver is "good enough". We'll just need to work out a way with Mark
that this can be added to the SPI subsystem, given the valid
objections already expressed.

Thanks,
-Vladimir

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH spi for-5.4 2/5] spi: Add a PTP system timestamp to the transfer structure
  2019-08-22 17:11   ` Mark Brown
@ 2019-08-24 12:38     ` Vladimir Oltean
  2019-08-27 19:01       ` Mark Brown
  0 siblings, 1 reply; 27+ messages in thread
From: Vladimir Oltean @ 2019-08-24 12:38 UTC (permalink / raw)
  To: Mark Brown
  Cc: Hubert Feurstein, Miroslav Lichvar, Richard Cochran, Andrew Lunn,
	Florian Fainelli, linux-spi, netdev

On Thu, 22 Aug 2019 at 21:19, Mark Brown <broonie@kernel.org> wrote:
>
> On Sun, Aug 18, 2019 at 09:25:57PM +0300, Vladimir Oltean wrote:
>
> > @@ -1391,6 +1402,13 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread)
> >               goto out;
> >       }
> >
> > +     if (!ctlr->ptp_sts_supported) {
> > +             list_for_each_entry(xfer, &mesg->transfers, transfer_list) {
> > +                     xfer->ptp_sts_word_pre = 0;
> > +                     ptp_read_system_prets(xfer->ptp_sts);
> > +             }
> > +     }
> > +
>
> We can do better than this for controllers which use transfer_one().
>

You mean I should guard this "if", and the one below, with "&&
!ctlr->transfer_one"?

> > +const void *spi_xfer_ptp_sts_word(struct spi_transfer *xfer, bool pre)
> > +{
>
> xfer can be const here too.
>
> > + * @ptp_sts_supported: If the driver sets this to true, it must provide a
> > + *   time snapshot in @spi_transfer->ptp_sts as close as possible to the
> > + *   moment in time when @spi_transfer->ptp_sts_word_pre and
> > + *   @spi_transfer->ptp_sts_word_post were transmitted.
> > + *   If the driver does not set this, the SPI core takes the snapshot as
> > + *   close to the driver hand-over as possible.
>
> A couple of issues here.  The big one is that for PIO transfers
> this is going to either complicate the code or introduce overhead
> in individual drivers for an extremely niche use case.  I guess
> most drivers won't implement it which makes this a bit moot but
> then this is a concern that pushes back against the idea of
> implementing the feature.
>

The concern is the overhead in terms of code, or runtime performance?
Arguably the applications that require deterministic latency are
actually going to push for overall less overhead at runtime, even if
that comes at a cost in terms of code size. The spi-fsl-dspi driver
does not perform worse by any metric after this rework.

> The other is that it's not 100% clear what you're looking to
> timestamp here - is it when the data goes on the wire, is it when
> the data goes on the FIFO (which could be relatively large)?  I'm
> guessing you're looking for the physical transfer here, if that's
> the case should there be some effort to compensate for the delays
> in the controller?

The goal is to timestamp the moment when the SPI slave sees word N of
the data. Luckily the DSPI driver raises the TCF (Transfer Complete
Flag) once that word has been transmitted, which I used to my
advantage. The EOQ mode behaves similarly, but has a granularity of 4
words. The controller delays are hence implicitly included in the
software timestamp.
But the question is valid and I expect that such compensation might be
needed for some hardware, provided that it can be measured and
guaranteed. In fact Hubert did add such logic to the v3 of his MDIO
patch: https://lkml.org/lkml/2019/8/20/195 There were some objections
mainly related to the certainty of those offset corrections. I don't
want to "future-proof" the API now with features I have no use of, but
such compensation logic might come in the future.

Regards,
-Vladimir

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH spi for-5.4 2/5] spi: Add a PTP system timestamp to the transfer structure
  2019-08-24 12:38     ` Vladimir Oltean
@ 2019-08-27 19:01       ` Mark Brown
  0 siblings, 0 replies; 27+ messages in thread
From: Mark Brown @ 2019-08-27 19:01 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Hubert Feurstein, Miroslav Lichvar, Richard Cochran, Andrew Lunn,
	Florian Fainelli, linux-spi, netdev

[-- Attachment #1: Type: text/plain, Size: 3542 bytes --]

On Sat, Aug 24, 2019 at 03:38:16PM +0300, Vladimir Oltean wrote:
> On Thu, 22 Aug 2019 at 21:19, Mark Brown <broonie@kernel.org> wrote:
> > On Sun, Aug 18, 2019 at 09:25:57PM +0300, Vladimir Oltean wrote:

> > > +     if (!ctlr->ptp_sts_supported) {
> > > +             list_for_each_entry(xfer, &mesg->transfers, transfer_list) {
> > > +                     xfer->ptp_sts_word_pre = 0;
> > > +                     ptp_read_system_prets(xfer->ptp_sts);
> > > +             }
> > > +     }

> > We can do better than this for controllers which use transfer_one().

> You mean I should guard this "if", and the one below, with "&&
> !ctlr->transfer_one"?

Yes, that'd make it a bit more obvious that the better handling
is there.

> > > + * @ptp_sts_supported: If the driver sets this to true, it must provide a
> > > + *   time snapshot in @spi_transfer->ptp_sts as close as possible to the
> > > + *   moment in time when @spi_transfer->ptp_sts_word_pre and
> > > + *   @spi_transfer->ptp_sts_word_post were transmitted.
> > > + *   If the driver does not set this, the SPI core takes the snapshot as
> > > + *   close to the driver hand-over as possible.

> > A couple of issues here.  The big one is that for PIO transfers
> > this is going to either complicate the code or introduce overhead
> > in individual drivers for an extremely niche use case.  I guess
> > most drivers won't implement it which makes this a bit moot but
> > then this is a concern that pushes back against the idea of
> > implementing the feature.

> The concern is the overhead in terms of code, or runtime performance?

Both, yes.

> Arguably the applications that require deterministic latency are
> actually going to push for overall less overhead at runtime, even if
> that comes at a cost in terms of code size. The spi-fsl-dspi driver
> does not perform worse by any metric after this rework.

Determinalistic and fast are often note the same thing here,
sometimes it's better not to optimize if the optimization only
works some of the time for example.

> > The other is that it's not 100% clear what you're looking to
> > timestamp here - is it when the data goes on the wire, is it when
> > the data goes on the FIFO (which could be relatively large)?  I'm
> > guessing you're looking for the physical transfer here, if that's
> > the case should there be some effort to compensate for the delays
> > in the controller?

> The goal is to timestamp the moment when the SPI slave sees word N of
> the data. Luckily the DSPI driver raises the TCF (Transfer Complete
> Flag) once that word has been transmitted, which I used to my
> advantage. The EOQ mode behaves similarly, but has a granularity of 4
> words. The controller delays are hence implicitly included in the
> software timestamp.

The documentation should be clear on that, it'd be very natural
for someone to timestamp on entry to the FIFO.

> But the question is valid and I expect that such compensation might be
> needed for some hardware, provided that it can be measured and
> guaranteed. In fact Hubert did add such logic to the v3 of his MDIO
> patch: https://lkml.org/lkml/2019/8/20/195 There were some objections
> mainly related to the certainty of those offset corrections. I don't
> want to "future-proof" the API now with features I have no use of, but
> such compensation logic might come in the future.

I think it's mainly important that people know what the
expectations are so different drivers are consistent in how they
work, as you say the API can always be extended later.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, back to index

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-18 18:25 [PATCH spi for-5.4 0/5] Deterministic SPI latency with NXP DSPI driver Vladimir Oltean
2019-08-18 18:25 ` [PATCH spi for-5.4 1/5] spi: Use an abbreviated pointer to ctlr->cur_msg in __spi_pump_messages Vladimir Oltean
2019-08-20 18:21   ` Mark Brown
2019-08-20 19:36     ` Vladimir Oltean
2019-08-21 11:01       ` Mark Brown
2019-08-18 18:25 ` [PATCH spi for-5.4 2/5] spi: Add a PTP system timestamp to the transfer structure Vladimir Oltean
2019-08-22 17:11   ` Mark Brown
2019-08-24 12:38     ` Vladimir Oltean
2019-08-27 19:01       ` Mark Brown
2019-08-18 18:25 ` [PATCH spi for-5.4 3/5] spi: spi-fsl-dspi: Use poll mode in case the platform IRQ is missing Vladimir Oltean
2019-08-22 17:38   ` Mark Brown
2019-08-18 18:25 ` [PATCH spi for-5.4 4/5] spi: spi-fsl-dspi: Implement the PTP system timestamping for TCFQ mode Vladimir Oltean
2019-08-18 18:26 ` [PATCH spi for-5.4 5/5] spi: spi-fsl-dspi: Disable interrupts and preemption during poll mode transfer Vladimir Oltean
2019-08-20 15:57 ` [PATCH spi for-5.4 0/5] Deterministic SPI latency with NXP DSPI driver Vladimir Oltean
2019-08-20 16:57   ` Andrew Lunn
2019-08-21  4:38   ` Richard Cochran
2019-08-21 14:08     ` Richard Cochran
2019-08-21 20:17       ` Vladimir Oltean
2019-08-22 14:16         ` Richard Cochran
2019-08-22 14:56           ` Andrew Lunn
2019-08-22 14:58           ` Vladimir Oltean
2019-08-22 16:05             ` Richard Cochran
2019-08-22 16:13               ` Vladimir Oltean
2019-08-23  5:22                 ` Richard Cochran
2019-08-24 12:13                   ` Vladimir Oltean
2019-08-22 16:10             ` Richard Cochran
2019-08-21  4:42   ` Richard Cochran

Netdev Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/netdev/0 netdev/git/0.git
	git clone --mirror https://lore.kernel.org/netdev/1 netdev/git/1.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 netdev netdev/ https://lore.kernel.org/netdev \
		netdev@vger.kernel.org netdev@archiver.kernel.org
	public-inbox-index netdev

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.netdev


AGPL code for this site: git clone https://public-inbox.org/ public-inbox