* [PATCH v2 0/4] Deterministic SPI latency with NXP DSPI driver @ 2019-09-05 1:01 Vladimir Oltean 2019-09-05 1:01 ` [PATCH v2 1/4] spi: Use an abbreviated pointer to ctlr->cur_msg in __spi_pump_messages Vladimir Oltean ` (4 more replies) 0 siblings, 5 replies; 21+ messages in thread From: Vladimir Oltean @ 2019-09-05 1:01 UTC (permalink / raw) To: broonie, h.feurstein, mlichvar, richardcochran, andrew, f.fainelli Cc: linux-spi, netdev, Vladimir Oltean Since v1, I noticed that under CPU load, the pre-to-post delay gets reduced to half (i.e. the code runs twice as fast). After a bit of debugging, turns out this is the effect of the 'ondemand' governor. Disabling dynamic frequency scaling (either by switching to the 'performance' governor or by making sure that /sys/devices/system/cpu/cpu{0,1}/cpufreq/scaling_min_freq is equal to /sys/devices/system/cpu/cpu{0,1}/cpufreq/scaling_max_freq) is enough for all readouts to take roughly the same time now. It is an open question to me: is there any API available to kernel drivers for controlling this, or is it only a policy for user space to configure? So here is another test (K) which ran for 36 minutes, with interrupts disabled during the critical section, poll mode, and with the CPU pegged at the maximum frequency (1.2 GHz) and SPI controller at 12 MHz. offset: min -72 max 64 mean 0.00729594 std dev 21.6964 delay: min 4400 max 4480 mean 4402.66 std dev 95.8882 lost servo lock 0 times While it is clear that the delay is now finally constant, I can't seem to figure out why it is what it is - 4400 ns, when the bit time on 12 MHz SPI is 83.33 ns, and hence a frame time is 666 ns. Puzzled, I added a print in dspi_poll and it turns out the status register is always set on the first access. While this is yet another argument to operate in poll mode, it is also an indication that the transmission is CPU-bound and so is the timestamp measurement. Hence it probably makes sense to revert to an implementation similar to Hubert's - add a fixed timestamp correction based on frame size and frequency (hardware latencies are smaller than can be measured from the CPU), and just stop including the time to poll the TX confirmation status into the measurement. I haven't done this last change yet, however, mostly to collect comments and make sure I'm not operating on false assumptions. Having the CPU cap the measurement at 4400 ns makes it difficult to know when does the 666 ns transfer actually happen, and thus what an appropriate offset correction for the hardware transfer would be. Cover letter from v1 below. =========================================================== This patchset proposes an interface from the SPI subsystem for software timestamping SPI transfers. There is a default implementation provided in the core, as well as a mechanism for SPI slave drivers to check which byte was in fact timestamped post-facto. The patchset also adds the first user of this interface (the NXP DSPI driver in TCFQ mode). The interface is somewhat similar to Hubert Feurstein's proposal for the MDIO subsystem: https://lkml.org/lkml/2019/8/16/638 Original cover letter below. Also provided at the end some results with an extra test (J - phc2sys using the timestamps taken by the SPI core). =========================================================== Continuing the discussion created by Hubert Feurstein around the mv88e6xxx driver for MDIO-controlled switches (https://lkml.org/lkml/2019/8/2/1364), this patchset takes a similar approach for the NXP LS1021A-TSN board, which has a SPI-controlled DSA switch (SJA1105). The patchset is motivated by some experiments done with a logic analyzer, trying to understand the source of latency (and especially of the jitter). SJA1105 SPI messages for reading the PTP clock are 12 bytes in length: 4 for the SPI header and 8 for the timestamp. When looking at the messages with a scope, there's jitter basically everywhere: between bits of a frame and between frames in a transfer. The inter-bit jitter is hardware and impacts us to a lesser extend (is smaller and caused by the PVT stability of the oscillators, PLLs, etc). We will focus on the latency between consecutive SPI frames within a 12-byte transfer. As a preface, revisions of the DSPI controller IP are integrated in many Freescale/NXP devices. As a result, the driver has 3 modes of operation: - TCFQ (Transfer Complete Flag mode): The controller signals software that data has been sent/received after each individual word. - EOQ (End of Queue mode): The driver can implement batching by making use of the controller's 4-word deep FIFO. - DMA (Direct Memory Access mode): The SPI controller's FIFO is no longer in direct interaction with the driver, but is used to trigger the RX and TX channels of the eDMA module on the SoC. In LS1021A, the driver works in the least efficient mode of the 3 (TCFQ). There is a well-known errata that the DSPI controller is broken in conjunction with the eDMA module. As for the EOQ mode, I have tried unsuccessfully for a few days to make use of the 4 entry FIFO, and the hardware simply fails to reliably acknowledge the transmission when the FIFO gets full. So it looks like we're stuck with the TCFQ mode. The problem with phc2sys on the LS1021A-TSN board is that in order for the gettime64() call to complete on the sja1105, the system has to service 12 IRQs. Intuitively that is excessive and is the main source of jitter, but let's not get ahead of ourselves. An outline of the experiments that were done (unless otherwise mentioned, all of these ran for 120 seconds): A. First I have measured the (poor) performance of phc2sys under current conditions. (DSPI driver in IRQ mode, no PTP system timestamping) offset: min -53310 max 16107 mean -1737.18 std dev 11444.3 delay: min 163680 max 237360 mean 201149 std dev 22446.6 lost servo lock 1 times B. I switched the .gettime64 callback to .gettimex64, snapshotting the PTP system timestamp within the sja1105 driver. offset: min -48923 max 64217 mean -904.137 std dev 17358.1 delay: min 149600 max 203840 mean 169045 std dev 17993.3 lost servo lock 8 times C. I patched "struct spi_transfer" to contain the PTP system timestamp, and from the sja1105 driver, I passed this structure to be snapshotted by the SPI controller's driver (spi-fsl-dspi). This is the "transfer-level" snapshot. offset: min -64979 max 38979 mean -416.197 std dev 15367.9 delay: min 125120 max 168320 mean 150286 std dev 17675.3 lost servo lock 10 times D. I changed the placement of the transfer snapshotting within the DSPI driver, from "transfer-level" to "byte-level". offset: min -9021 max 7149 mean -0.418803 std dev 3529.81 delay: min 7840 max 23920 mean 14493.7 std dev 5982.17 lost servo lock 0 times E. I moved the DSPI driver to poll mode. I went back to collecting the PTP system timestamps from the sja1105 driver (same as B). offset: min -4199 max 46643 mean 418.214 std dev 4554.01 delay: min 84000 max 194000 mean 99463.2 std dev 12936.5 lost servo lock 1 times F. Transfer-level snapshotting in the DSPI driver (same as C), but in poll mode. offset: min -24244 max 1115 mean -230.478 std dev 2297.28 delay: min 69440 max 119040 mean 70312.9 std dev 8065.34 lost servo lock 1 times G. Byte-level snapshotting (same as D) but in poll mode. offset: min -314 max 288 mean -2.48718 std dev 118.045 delay: min 4880 max 6000 mean 5118.63 std dev 507.258 lost servo lock 0 times This seemed suspiciously good to me, so I let it run for longer (58 minutes): offset: min -26251 max 16416 mean -21.8672 std dev 863.416 delay: min 4720 max 57280 mean 5182.49 std dev 1607.19 lost servo lock 3 times H. Transfer-level snapshotting (same as F), but with IRQs disabled. This ran for 86 minutes. offset: min -1927 max 1843 mean -0.209203 std dev 529.398 delay: min 85440 max 93680 mean 88245 std dev 1454.71 lost servo lock 0 times I. Byte-level snapshotting (same as G), but with IRQs disabled. This ran for 102 minutes. offset: min -378 max 381 mean -0.0083089 std dev 101.495 delay: min 4720 max 5920 mean 5129.38 std dev 154.899 lost servo lock 0 times J. Default snapshotting taken by the SPI core, with the DSPI driver running in poll mode, IRQs enabled. This ran for 274 minutes. offset: min -42568 max 44576 mean 2.91646 std dev 947.467 delay: min 58480 max 171040 mean 80750.7 std dev 2001.61 lost servo lock 3 times As a result, this patchset proposes the implementation of scenario I. The others were done through temporary patches which are not presented here due to the difficulty of presenting a coherent git history without resorting to reverts etc. The gist of each experiment should be clear though. The raw data is available for dissection at https://drive.google.com/open?id=1r9raU9ZeqOqkqts6Lb-ISf5ubLDLP3wk. The logic analyzer captures can be opened with a free-as-in-beer program provided by Saleae: https://www.saleae.com/downloads/. In the capture data one can find the MOSI, SCK SPI signals, as well as a debug GPIO which was toggled at the same time as the PTP system timestamp was taken, to give the viewer an impression of what the software is capturing compared to the actual timing of the SPI transfer. Attached are also some close-up screenshots of transfers where there is a clear and huge delay in-between frames of the same 12-byte SPI transfer. As it turns out, these were all caused by the CPU getting interrupted by some other IRQ. Approaches H and I are the only ones that get rid of these glitches. In theory, the byte-level snapshotting should be less vulnerable to an IRQ interrupting the SPI transfer (because the time window is much smaller) but as the 58 minutes experiment shows, it is not immune. Vladimir Oltean (4): spi: Use an abbreviated pointer to ctlr->cur_msg in __spi_pump_messages spi: Add a PTP system timestamp to the transfer structure spi: spi-fsl-dspi: Implement the PTP system timestamping for TCFQ mode spi: spi-fsl-dspi: Always use the TCFQ devices in poll mode drivers/spi/spi-fsl-dspi.c | 20 ++++- drivers/spi/spi.c | 150 ++++++++++++++++++++++++++++++++++--- include/linux/spi/spi.h | 61 +++++++++++++++ 3 files changed, 219 insertions(+), 12 deletions(-) -- 2.17.1 ^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v2 1/4] spi: Use an abbreviated pointer to ctlr->cur_msg in __spi_pump_messages 2019-09-05 1:01 [PATCH v2 0/4] Deterministic SPI latency with NXP DSPI driver Vladimir Oltean @ 2019-09-05 1:01 ` Vladimir Oltean 2019-09-05 17:39 ` Mark Brown 2019-09-05 1:01 ` [PATCH v2 2/4] spi: Add a PTP system timestamp to the transfer structure Vladimir Oltean ` (3 subsequent siblings) 4 siblings, 1 reply; 21+ messages in thread From: Vladimir Oltean @ 2019-09-05 1:01 UTC (permalink / raw) To: broonie, h.feurstein, mlichvar, richardcochran, andrew, f.fainelli Cc: linux-spi, netdev, Vladimir Oltean This helps a bit with line fitting now (the list_first_entry call) as well as during the next patch which needs to iterate through all transfers of ctlr->cur_msg so it timestamps them. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> --- Changes in v2: - Renamed mesg to msg. drivers/spi/spi.c | 23 ++++++++++++----------- 1 file changed, 12 insertions(+), 11 deletions(-) diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c index 75ac046cae52..9ce86f761763 100644 --- a/drivers/spi/spi.c +++ b/drivers/spi/spi.c @@ -1265,8 +1265,9 @@ EXPORT_SYMBOL_GPL(spi_finalize_current_transfer); */ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread) { - unsigned long flags; + struct spi_message *msg; bool was_busy = false; + unsigned long flags; int ret; /* Lock queue */ @@ -1325,10 +1326,10 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread) } /* Extract head of queue */ - ctlr->cur_msg = - list_first_entry(&ctlr->queue, struct spi_message, queue); + msg = list_first_entry(&ctlr->queue, struct spi_message, queue); + ctlr->cur_msg = msg; - list_del_init(&ctlr->cur_msg->queue); + list_del_init(&msg->queue); if (ctlr->busy) was_busy = true; else @@ -1361,7 +1362,7 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread) if (ctlr->auto_runtime_pm) pm_runtime_put(ctlr->dev.parent); - ctlr->cur_msg->status = ret; + msg->status = ret; spi_finalize_current_message(ctlr); mutex_unlock(&ctlr->io_mutex); @@ -1369,28 +1370,28 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread) } } - trace_spi_message_start(ctlr->cur_msg); + trace_spi_message_start(msg); if (ctlr->prepare_message) { - ret = ctlr->prepare_message(ctlr, ctlr->cur_msg); + ret = ctlr->prepare_message(ctlr, msg); if (ret) { dev_err(&ctlr->dev, "failed to prepare message: %d\n", ret); - ctlr->cur_msg->status = ret; + msg->status = ret; spi_finalize_current_message(ctlr); goto out; } ctlr->cur_msg_prepared = true; } - ret = spi_map_msg(ctlr, ctlr->cur_msg); + ret = spi_map_msg(ctlr, msg); if (ret) { - ctlr->cur_msg->status = ret; + msg->status = ret; spi_finalize_current_message(ctlr); goto out; } - ret = ctlr->transfer_one_message(ctlr, ctlr->cur_msg); + ret = ctlr->transfer_one_message(ctlr, msg); if (ret) { dev_err(&ctlr->dev, "failed to transfer one message from queue\n"); -- 2.17.1 ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Applied "spi: Use an abbreviated pointer to ctlr->cur_msg in __spi_pump_messages" to the spi tree 2019-09-05 1:01 ` [PATCH v2 1/4] spi: Use an abbreviated pointer to ctlr->cur_msg in __spi_pump_messages Vladimir Oltean @ 2019-09-05 17:39 ` Mark Brown 0 siblings, 0 replies; 21+ messages in thread From: Mark Brown @ 2019-09-05 17:39 UTC (permalink / raw) To: Vladimir Oltean Cc: andrew, broonie, f.fainelli, h.feurstein, linux-spi, Mark Brown, mlichvar, netdev, richardcochran The patch spi: Use an abbreviated pointer to ctlr->cur_msg in __spi_pump_messages has been applied to the spi tree at https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git for-5.4 All being well this means that it will be integrated into the linux-next tree (usually sometime in the next 24 hours) and sent to Linus during the next merge window (or sooner if it is a bug fix), however if problems are discovered then the patch may be dropped or reverted. You may get further e-mails resulting from automated or manual testing and review of the tree, please engage with people reporting problems and send followup patches addressing any issues that are reported if needed. If any updates are required or you are submitting further changes they should be sent as incremental updates against current git, existing patches will not be replaced. Please add any relevant lists and maintainers to the CCs when replying to this mail. Thanks, Mark From d1c44c9342c17e3314371325d9272684a075b65c Mon Sep 17 00:00:00 2001 From: Vladimir Oltean <olteanv@gmail.com> Date: Thu, 5 Sep 2019 04:01:11 +0300 Subject: [PATCH] spi: Use an abbreviated pointer to ctlr->cur_msg in __spi_pump_messages This helps a bit with line fitting now (the list_first_entry call) as well as during the next patch which needs to iterate through all transfers of ctlr->cur_msg so it timestamps them. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Link: https://lore.kernel.org/r/20190905010114.26718-2-olteanv@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org> --- drivers/spi/spi.c | 23 ++++++++++++----------- 1 file changed, 12 insertions(+), 11 deletions(-) diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c index aef55acb5ccd..b2890923d256 100644 --- a/drivers/spi/spi.c +++ b/drivers/spi/spi.c @@ -1265,8 +1265,9 @@ EXPORT_SYMBOL_GPL(spi_finalize_current_transfer); */ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread) { - unsigned long flags; + struct spi_message *msg; bool was_busy = false; + unsigned long flags; int ret; /* Lock queue */ @@ -1325,10 +1326,10 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread) } /* Extract head of queue */ - ctlr->cur_msg = - list_first_entry(&ctlr->queue, struct spi_message, queue); + msg = list_first_entry(&ctlr->queue, struct spi_message, queue); + ctlr->cur_msg = msg; - list_del_init(&ctlr->cur_msg->queue); + list_del_init(&msg->queue); if (ctlr->busy) was_busy = true; else @@ -1361,7 +1362,7 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread) if (ctlr->auto_runtime_pm) pm_runtime_put(ctlr->dev.parent); - ctlr->cur_msg->status = ret; + msg->status = ret; spi_finalize_current_message(ctlr); mutex_unlock(&ctlr->io_mutex); @@ -1369,28 +1370,28 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread) } } - trace_spi_message_start(ctlr->cur_msg); + trace_spi_message_start(msg); if (ctlr->prepare_message) { - ret = ctlr->prepare_message(ctlr, ctlr->cur_msg); + ret = ctlr->prepare_message(ctlr, msg); if (ret) { dev_err(&ctlr->dev, "failed to prepare message: %d\n", ret); - ctlr->cur_msg->status = ret; + msg->status = ret; spi_finalize_current_message(ctlr); goto out; } ctlr->cur_msg_prepared = true; } - ret = spi_map_msg(ctlr, ctlr->cur_msg); + ret = spi_map_msg(ctlr, msg); if (ret) { - ctlr->cur_msg->status = ret; + msg->status = ret; spi_finalize_current_message(ctlr); goto out; } - ret = ctlr->transfer_one_message(ctlr, ctlr->cur_msg); + ret = ctlr->transfer_one_message(ctlr, msg); if (ret) { dev_err(&ctlr->dev, "failed to transfer one message from queue\n"); -- 2.20.1 ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Applied "spi: Use an abbreviated pointer to ctlr->cur_msg in __spi_pump_messages" to the spi tree @ 2019-09-05 17:39 ` Mark Brown 0 siblings, 0 replies; 21+ messages in thread From: Mark Brown @ 2019-09-05 17:39 UTC (permalink / raw) To: Vladimir Oltean Cc: andrew, broonie, f.fainelli, h.feurstein, linux-spi, Mark Brown, mlichvar, netdev, richardcochran The patch spi: Use an abbreviated pointer to ctlr->cur_msg in __spi_pump_messages has been applied to the spi tree at https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git for-5.4 All being well this means that it will be integrated into the linux-next tree (usually sometime in the next 24 hours) and sent to Linus during the next merge window (or sooner if it is a bug fix), however if problems are discovered then the patch may be dropped or reverted. You may get further e-mails resulting from automated or manual testing and review of the tree, please engage with people reporting problems and send followup patches addressing any issues that are reported if needed. If any updates are required or you are submitting further changes they should be sent as incremental updates against current git, existing patches will not be replaced. Please add any relevant lists and maintainers to the CCs when replying to this mail. Thanks, Mark >From d1c44c9342c17e3314371325d9272684a075b65c Mon Sep 17 00:00:00 2001 From: Vladimir Oltean <olteanv@gmail.com> Date: Thu, 5 Sep 2019 04:01:11 +0300 Subject: [PATCH] spi: Use an abbreviated pointer to ctlr->cur_msg in __spi_pump_messages This helps a bit with line fitting now (the list_first_entry call) as well as during the next patch which needs to iterate through all transfers of ctlr->cur_msg so it timestamps them. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Link: https://lore.kernel.org/r/20190905010114.26718-2-olteanv@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org> --- drivers/spi/spi.c | 23 ++++++++++++----------- 1 file changed, 12 insertions(+), 11 deletions(-) diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c index aef55acb5ccd..b2890923d256 100644 --- a/drivers/spi/spi.c +++ b/drivers/spi/spi.c @@ -1265,8 +1265,9 @@ EXPORT_SYMBOL_GPL(spi_finalize_current_transfer); */ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread) { - unsigned long flags; + struct spi_message *msg; bool was_busy = false; + unsigned long flags; int ret; /* Lock queue */ @@ -1325,10 +1326,10 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread) } /* Extract head of queue */ - ctlr->cur_msg = - list_first_entry(&ctlr->queue, struct spi_message, queue); + msg = list_first_entry(&ctlr->queue, struct spi_message, queue); + ctlr->cur_msg = msg; - list_del_init(&ctlr->cur_msg->queue); + list_del_init(&msg->queue); if (ctlr->busy) was_busy = true; else @@ -1361,7 +1362,7 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread) if (ctlr->auto_runtime_pm) pm_runtime_put(ctlr->dev.parent); - ctlr->cur_msg->status = ret; + msg->status = ret; spi_finalize_current_message(ctlr); mutex_unlock(&ctlr->io_mutex); @@ -1369,28 +1370,28 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread) } } - trace_spi_message_start(ctlr->cur_msg); + trace_spi_message_start(msg); if (ctlr->prepare_message) { - ret = ctlr->prepare_message(ctlr, ctlr->cur_msg); + ret = ctlr->prepare_message(ctlr, msg); if (ret) { dev_err(&ctlr->dev, "failed to prepare message: %d\n", ret); - ctlr->cur_msg->status = ret; + msg->status = ret; spi_finalize_current_message(ctlr); goto out; } ctlr->cur_msg_prepared = true; } - ret = spi_map_msg(ctlr, ctlr->cur_msg); + ret = spi_map_msg(ctlr, msg); if (ret) { - ctlr->cur_msg->status = ret; + msg->status = ret; spi_finalize_current_message(ctlr); goto out; } - ret = ctlr->transfer_one_message(ctlr, ctlr->cur_msg); + ret = ctlr->transfer_one_message(ctlr, msg); if (ret) { dev_err(&ctlr->dev, "failed to transfer one message from queue\n"); -- 2.20.1 ^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v2 2/4] spi: Add a PTP system timestamp to the transfer structure 2019-09-05 1:01 [PATCH v2 0/4] Deterministic SPI latency with NXP DSPI driver Vladimir Oltean 2019-09-05 1:01 ` [PATCH v2 1/4] spi: Use an abbreviated pointer to ctlr->cur_msg in __spi_pump_messages Vladimir Oltean @ 2019-09-05 1:01 ` Vladimir Oltean 2019-10-08 10:52 ` Mark Brown 2019-10-08 18:09 ` Mark Brown 2019-09-05 1:01 ` [PATCH v2 3/4] spi: spi-fsl-dspi: Implement the PTP system timestamping for TCFQ mode Vladimir Oltean ` (2 subsequent siblings) 4 siblings, 2 replies; 21+ messages in thread From: Vladimir Oltean @ 2019-09-05 1:01 UTC (permalink / raw) To: broonie, h.feurstein, mlichvar, richardcochran, andrew, f.fainelli Cc: linux-spi, netdev, Vladimir Oltean SPI is one of the interfaces used to access devices which have a POSIX clock driver (real time clocks, 1588 timers etc). The fact that the SPI bus is slow is not what the main problem is, but rather the fact that drivers don't take a constant amount of time in transferring data over SPI. When there is a high delay in the readout of time, there will be uncertainty in the value that has been read out of the peripheral. When that delay is constant, the uncertainty can at least be approximated with a certain accuracy which is fine more often than not. Timing jitter occurs all over in the kernel code, and is mainly caused by having to let go of the CPU for various reasons such as preemption, servicing interrupts, going to sleep, etc. Another major reason is CPU dynamic frequency scaling. It turns out that the problem of retrieving time from a SPI peripheral with high accuracy can be solved by the use of "PTP system timestamping" - a mechanism to correlate the time when the device has snapshotted its internal time counter with the Linux system time at that same moment. This is sufficient for having a precise time measurement - it is not necessary for the whole SPI transfer to be transmitted "as fast as possible", or "as low-jitter as possible". The system has to be low-jitter for a very short amount of time to be effective. This patch introduces a PTP system timestamping mechanism in struct spi_transfer. This is to be used by SPI device drivers when they need to know the exact time at which the underlying device's time was snapshotted. More often than not, SPI peripherals have a very exact timing for when their SPI-to-interconnect bridge issues a transaction for snapshotting and reading the time register, and that will be dependent on when the SPI-to-interconnect bridge figures out that this is what it should do, aka as soon as it sees byte N of the SPI transfer. Since spi_device drivers are the ones who'd know best how the peripheral behaves in this regard, expose a mechanism in spi_transfer which allows them to specify which word (or word range) from the transfer should be timestamped. Add a default implementation of the PTP system timestamping in the SPI core. This is not going to be satisfactory performance-wise, but should at least increase the likelihood that SPI device drivers will use PTP system timestamping in the future. There are 3 entry points from the core towards the SPI controller drivers: - transfer_one: The driver is passed individual spi_transfers to execute. This is the easiest to timestamp. - transfer_one_message: The core passes the driver an entire spi_message (a potential batch of spi_transfers). The core puts the same pre and post timestamp to all transfers within a message. This is not ideal, but nothing better can be done by default anyway, since the core has no insight into how the driver batches the transfers. - transfer: Like transfer_one_message, but for unqueued drivers (i.e. the driver implements its own queue scheduling). Signed-off-by: Vladimir Oltean <olteanv@gmail.com> --- Changes in v2: - Added even more helpful helper functions: spi_take_timestamp_pre and spi_take_timestamp_pre, that drivers can simply call without keeping any state themselves. - Updated comments and commit message. drivers/spi/spi.c | 127 ++++++++++++++++++++++++++++++++++++++++ include/linux/spi/spi.h | 61 +++++++++++++++++++ 2 files changed, 188 insertions(+) diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c index 9ce86f761763..bfc1f1e7ae12 100644 --- a/drivers/spi/spi.c +++ b/drivers/spi/spi.c @@ -1171,6 +1171,11 @@ static int spi_transfer_one_message(struct spi_controller *ctlr, spi_statistics_add_transfer_stats(statm, xfer, ctlr); spi_statistics_add_transfer_stats(stats, xfer, ctlr); + if (!ctlr->ptp_sts_supported) { + xfer->ptp_sts_word_pre = 0; + ptp_read_system_prets(xfer->ptp_sts); + } + if (xfer->tx_buf || xfer->rx_buf) { reinit_completion(&ctlr->xfer_completion); @@ -1197,6 +1202,11 @@ static int spi_transfer_one_message(struct spi_controller *ctlr, xfer->len); } + if (!ctlr->ptp_sts_supported) { + ptp_read_system_postts(xfer->ptp_sts); + xfer->ptp_sts_word_post = xfer->len; + } + trace_spi_transfer_stop(msg, xfer); if (msg->status != -EINPROGRESS) @@ -1265,6 +1275,7 @@ EXPORT_SYMBOL_GPL(spi_finalize_current_transfer); */ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread) { + struct spi_transfer *xfer; struct spi_message *msg; bool was_busy = false; unsigned long flags; @@ -1391,6 +1402,13 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread) goto out; } + if (!ctlr->ptp_sts_supported && !ctlr->transfer_one) { + list_for_each_entry(xfer, &msg->transfers, transfer_list) { + xfer->ptp_sts_word_pre = 0; + ptp_read_system_prets(xfer->ptp_sts); + } + } + ret = ctlr->transfer_one_message(ctlr, msg); if (ret) { dev_err(&ctlr->dev, @@ -1418,6 +1436,99 @@ static void spi_pump_messages(struct kthread_work *work) __spi_pump_messages(ctlr, true); } +/** + * spi_take_timestamp_pre - helper for drivers to collect the beginning of the + * TX timestamp for the requested byte from the SPI + * transfer. The frequency with which this function + * must be called (once per word, once for the whole + * transfer, once per batch of words etc) is arbitrary + * as long as the @tx buffer offset is greater than or + * equal to the requested byte at the time of the + * call. The timestamp is only taken once, at the + * first such call. It is assumed that the driver + * advances its @tx buffer pointer monotonically. + * @ctlr: Pointer to the spi_controller structure of the driver + * @xfer: Pointer to the transfer being timestamped + * @tx: Pointer to the current word within the xfer->tx_buf that the driver is + * preparing to transmit right now. + * @irqs_off: If true, will disable IRQs and preemption for the duration of the + * transfer, for less jitter in time measurement. Only compatible + * with PIO drivers. If true, must follow up with + * spi_take_timestamp_post or otherwise system will crash. + * WARNING: for fully predictable results, the CPU frequency must + * also be under control (governor). + */ +void spi_take_timestamp_pre(struct spi_controller *ctlr, + struct spi_transfer *xfer, + const void *tx, bool irqs_off) +{ + u8 bytes_per_word = DIV_ROUND_UP(xfer->bits_per_word, 8); + + if (!xfer->ptp_sts) + return; + + if (xfer->timestamped_pre) + return; + + if (tx < (xfer->tx_buf + xfer->ptp_sts_word_pre * bytes_per_word)) + return; + + /* Capture the resolution of the timestamp */ + xfer->ptp_sts_word_pre = (tx - xfer->tx_buf) / bytes_per_word; + + xfer->timestamped_pre = true; + + if (irqs_off) { + local_irq_save(ctlr->irq_flags); + preempt_disable(); + } + + ptp_read_system_prets(xfer->ptp_sts); +} +EXPORT_SYMBOL_GPL(spi_take_timestamp_pre); + +/** + * spi_take_timestamp_post - helper for drivers to collect the end of the + * TX timestamp for the requested byte from the SPI + * transfer. Can be called with an arbitrary + * frequency: only the first call where @tx exceeds + * or is equal to the requested word will be + * timestamped. + * @ctlr: Pointer to the spi_controller structure of the driver + * @xfer: Pointer to the transfer being timestamped + * @tx: Pointer to the current word within the xfer->tx_buf that the driver has + * just transmitted. + * @irqs_off: If true, will re-enable IRQs and preemption for the local CPU. + */ +void spi_take_timestamp_post(struct spi_controller *ctlr, + struct spi_transfer *xfer, + const void *tx, bool irqs_off) +{ + u8 bytes_per_word = DIV_ROUND_UP(xfer->bits_per_word, 8); + + if (!xfer->ptp_sts) + return; + + if (xfer->timestamped_post) + return; + + if (tx < (xfer->tx_buf + xfer->ptp_sts_word_post * bytes_per_word)) + return; + + ptp_read_system_postts(xfer->ptp_sts); + + if (irqs_off) { + local_irq_restore(ctlr->irq_flags); + preempt_enable(); + } + + /* Capture the resolution of the timestamp */ + xfer->ptp_sts_word_post = (tx - xfer->tx_buf) / bytes_per_word; + + xfer->timestamped_post = true; +} +EXPORT_SYMBOL_GPL(spi_take_timestamp_post); + /** * spi_set_thread_rt - set the controller to pump at realtime priority * @ctlr: controller to boost priority of @@ -1503,6 +1614,7 @@ EXPORT_SYMBOL_GPL(spi_get_next_queued_message); */ void spi_finalize_current_message(struct spi_controller *ctlr) { + struct spi_transfer *xfer; struct spi_message *mesg; unsigned long flags; int ret; @@ -1511,6 +1623,13 @@ void spi_finalize_current_message(struct spi_controller *ctlr) mesg = ctlr->cur_msg; spin_unlock_irqrestore(&ctlr->queue_lock, flags); + if (!ctlr->ptp_sts_supported && !ctlr->transfer_one) { + list_for_each_entry(xfer, &mesg->transfers, transfer_list) { + ptp_read_system_postts(xfer->ptp_sts); + xfer->ptp_sts_word_post = xfer->len; + } + } + spi_unmap_msg(ctlr, mesg); if (ctlr->cur_msg_prepared && ctlr->unprepare_message) { @@ -3271,6 +3390,7 @@ static int __spi_validate(struct spi_device *spi, struct spi_message *message) static int __spi_async(struct spi_device *spi, struct spi_message *message) { struct spi_controller *ctlr = spi->controller; + struct spi_transfer *xfer; /* * Some controllers do not support doing regular SPI transfers. Return @@ -3286,6 +3406,13 @@ static int __spi_async(struct spi_device *spi, struct spi_message *message) trace_spi_message_submit(message); + if (!ctlr->ptp_sts_supported) { + list_for_each_entry(xfer, &message->transfers, transfer_list) { + xfer->ptp_sts_word_pre = 0; + ptp_read_system_prets(xfer->ptp_sts); + } + } + return ctlr->transfer(spi, message); } diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h index af4f265d0f67..27f6b046cf92 100644 --- a/include/linux/spi/spi.h +++ b/include/linux/spi/spi.h @@ -13,6 +13,7 @@ #include <linux/completion.h> #include <linux/scatterlist.h> #include <linux/gpio/consumer.h> +#include <linux/ptp_clock_kernel.h> struct dma_chan; struct property_entry; @@ -409,6 +410,12 @@ static inline void spi_unregister_driver(struct spi_driver *sdrv) * @fw_translate_cs: If the boot firmware uses different numbering scheme * what Linux expects, this optional hook can be used to translate * between the two. + * @ptp_sts_supported: If the driver sets this to true, it must provide a + * time snapshot in @spi_transfer->ptp_sts as close as possible to the + * moment in time when @spi_transfer->ptp_sts_word_pre and + * @spi_transfer->ptp_sts_word_post were transmitted. + * If the driver does not set this, the SPI core takes the snapshot as + * close to the driver hand-over as possible. * * Each SPI controller can communicate with one or more @spi_device * children. These make a small bus, sharing MOSI, MISO and SCK signals @@ -604,6 +611,15 @@ struct spi_controller { void *dummy_tx; int (*fw_translate_cs)(struct spi_controller *ctlr, unsigned cs); + + /* + * Driver sets this field to indicate it is able to snapshot SPI + * transfers (needed e.g. for reading the time of POSIX clocks) + */ + bool ptp_sts_supported; + + /* Interrupt enable state during PTP system timestamping */ + unsigned long irq_flags; }; static inline void *spi_controller_get_devdata(struct spi_controller *ctlr) @@ -644,6 +660,14 @@ extern struct spi_message *spi_get_next_queued_message(struct spi_controller *ct extern void spi_finalize_current_message(struct spi_controller *ctlr); extern void spi_finalize_current_transfer(struct spi_controller *ctlr); +/* Helper calls for driver to timestamp transfer */ +void spi_take_timestamp_pre(struct spi_controller *ctlr, + struct spi_transfer *xfer, + const void *tx, bool irqs_off); +void spi_take_timestamp_post(struct spi_controller *ctlr, + struct spi_transfer *xfer, + const void *tx, bool irqs_off); + /* the spi driver core manages memory for the spi_controller classdev */ extern struct spi_controller *__spi_alloc_controller(struct device *host, unsigned int size, bool slave); @@ -753,6 +777,35 @@ extern void spi_res_release(struct spi_controller *ctlr, * @transfer_list: transfers are sequenced through @spi_message.transfers * @tx_sg: Scatterlist for transmit, currently not for client use * @rx_sg: Scatterlist for receive, currently not for client use + * @ptp_sts_word_pre: The word (subject to bits_per_word semantics) offset + * within @tx_buf for which the SPI device is requesting that the time + * snapshot for this transfer begins. Upon completing the SPI transfer, + * this value may have changed compared to what was requested, depending + * on the available snapshotting resolution (DMA transfer, + * @ptp_sts_supported is false, etc). + * @ptp_sts_word_post: See @ptp_sts_word_post. The two can be equal (meaning + * that a single byte should be snapshotted). + * If the core takes care of the timestamp (if @ptp_sts_supported is false + * for this controller), it will set @ptp_sts_word_pre to 0, and + * @ptp_sts_word_post to the length of the transfer. This is done + * purposefully (instead of setting to spi_transfer->len - 1) to denote + * that a transfer-level snapshot taken from within the driver may still + * be of higher quality. + * @ptp_sts: Pointer to a memory location held by the SPI slave device where a + * PTP system timestamp structure may lie. If drivers use PIO or their + * hardware has some sort of assist for retrieving exact transfer timing, + * they can (and should) assert @ptp_sts_supported and populate this + * structure using the ptp_read_system_*ts helper functions. + * The timestamp must represent the time at which the SPI slave device has + * processed the word, i.e. the "pre" timestamp should be taken before + * transmitting the "pre" word, and the "post" timestamp after receiving + * transmit confirmation from the controller for the "post" word. + * @timestamped_pre: Set by the SPI controller driver to denote it has acted + * upon the @ptp_sts request. Not set when the SPI core has taken care of + * the task. SPI device drivers are free to print a warning if this comes + * back unset and they need the better resolution. + * @timestamped_post: See above. The reason why both exist is that these + * booleans are also used to keep state in the core SPI logic. * * SPI transfers always write the same number of bytes as they read. * Protocol drivers should always provide @rx_buf and/or @tx_buf. @@ -842,6 +895,14 @@ struct spi_transfer { u32 effective_speed_hz; + unsigned int ptp_sts_word_pre; + unsigned int ptp_sts_word_post; + + struct ptp_system_timestamp *ptp_sts; + + bool timestamped_pre; + bool timestamped_post; + struct list_head transfer_list; }; -- 2.17.1 ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Applied "spi: Add a PTP system timestamp to the transfer structure" to the spi tree 2019-09-05 1:01 ` [PATCH v2 2/4] spi: Add a PTP system timestamp to the transfer structure Vladimir Oltean @ 2019-10-08 10:52 ` Mark Brown 2019-10-08 18:09 ` Mark Brown 1 sibling, 0 replies; 21+ messages in thread From: Mark Brown @ 2019-10-08 10:52 UTC (permalink / raw) To: Vladimir Oltean Cc: andrew, broonie, f.fainelli, h.feurstein, linux-spi, Mark Brown, mlichvar, netdev, richardcochran The patch spi: Add a PTP system timestamp to the transfer structure has been applied to the spi tree at https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git for-5.5 All being well this means that it will be integrated into the linux-next tree (usually sometime in the next 24 hours) and sent to Linus during the next merge window (or sooner if it is a bug fix), however if problems are discovered then the patch may be dropped or reverted. You may get further e-mails resulting from automated or manual testing and review of the tree, please engage with people reporting problems and send followup patches addressing any issues that are reported if needed. If any updates are required or you are submitting further changes they should be sent as incremental updates against current git, existing patches will not be replaced. Please add any relevant lists and maintainers to the CCs when replying to this mail. Thanks, Mark From b42faeee718ce13ef6eb99c24880b58deb54c8fa Mon Sep 17 00:00:00 2001 From: Vladimir Oltean <olteanv@gmail.com> Date: Thu, 5 Sep 2019 04:01:12 +0300 Subject: [PATCH] spi: Add a PTP system timestamp to the transfer structure SPI is one of the interfaces used to access devices which have a POSIX clock driver (real time clocks, 1588 timers etc). The fact that the SPI bus is slow is not what the main problem is, but rather the fact that drivers don't take a constant amount of time in transferring data over SPI. When there is a high delay in the readout of time, there will be uncertainty in the value that has been read out of the peripheral. When that delay is constant, the uncertainty can at least be approximated with a certain accuracy which is fine more often than not. Timing jitter occurs all over in the kernel code, and is mainly caused by having to let go of the CPU for various reasons such as preemption, servicing interrupts, going to sleep, etc. Another major reason is CPU dynamic frequency scaling. It turns out that the problem of retrieving time from a SPI peripheral with high accuracy can be solved by the use of "PTP system timestamping" - a mechanism to correlate the time when the device has snapshotted its internal time counter with the Linux system time at that same moment. This is sufficient for having a precise time measurement - it is not necessary for the whole SPI transfer to be transmitted "as fast as possible", or "as low-jitter as possible". The system has to be low-jitter for a very short amount of time to be effective. This patch introduces a PTP system timestamping mechanism in struct spi_transfer. This is to be used by SPI device drivers when they need to know the exact time at which the underlying device's time was snapshotted. More often than not, SPI peripherals have a very exact timing for when their SPI-to-interconnect bridge issues a transaction for snapshotting and reading the time register, and that will be dependent on when the SPI-to-interconnect bridge figures out that this is what it should do, aka as soon as it sees byte N of the SPI transfer. Since spi_device drivers are the ones who'd know best how the peripheral behaves in this regard, expose a mechanism in spi_transfer which allows them to specify which word (or word range) from the transfer should be timestamped. Add a default implementation of the PTP system timestamping in the SPI core. This is not going to be satisfactory performance-wise, but should at least increase the likelihood that SPI device drivers will use PTP system timestamping in the future. There are 3 entry points from the core towards the SPI controller drivers: - transfer_one: The driver is passed individual spi_transfers to execute. This is the easiest to timestamp. - transfer_one_message: The core passes the driver an entire spi_message (a potential batch of spi_transfers). The core puts the same pre and post timestamp to all transfers within a message. This is not ideal, but nothing better can be done by default anyway, since the core has no insight into how the driver batches the transfers. - transfer: Like transfer_one_message, but for unqueued drivers (i.e. the driver implements its own queue scheduling). Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Link: https://lore.kernel.org/r/20190905010114.26718-3-olteanv@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org> --- drivers/spi/spi.c | 127 ++++++++++++++++++++++++++++++++++++++++ include/linux/spi/spi.h | 61 +++++++++++++++++++ 2 files changed, 188 insertions(+) diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c index f9502dbbb5c1..9bb36c32cbf9 100644 --- a/drivers/spi/spi.c +++ b/drivers/spi/spi.c @@ -1171,6 +1171,11 @@ static int spi_transfer_one_message(struct spi_controller *ctlr, spi_statistics_add_transfer_stats(statm, xfer, ctlr); spi_statistics_add_transfer_stats(stats, xfer, ctlr); + if (!ctlr->ptp_sts_supported) { + xfer->ptp_sts_word_pre = 0; + ptp_read_system_prets(xfer->ptp_sts); + } + if (xfer->tx_buf || xfer->rx_buf) { reinit_completion(&ctlr->xfer_completion); @@ -1197,6 +1202,11 @@ static int spi_transfer_one_message(struct spi_controller *ctlr, xfer->len); } + if (!ctlr->ptp_sts_supported) { + ptp_read_system_postts(xfer->ptp_sts); + xfer->ptp_sts_word_post = xfer->len; + } + trace_spi_transfer_stop(msg, xfer); if (msg->status != -EINPROGRESS) @@ -1265,6 +1275,7 @@ EXPORT_SYMBOL_GPL(spi_finalize_current_transfer); */ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread) { + struct spi_transfer *xfer; struct spi_message *msg; bool was_busy = false; unsigned long flags; @@ -1391,6 +1402,13 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread) goto out; } + if (!ctlr->ptp_sts_supported && !ctlr->transfer_one) { + list_for_each_entry(xfer, &msg->transfers, transfer_list) { + xfer->ptp_sts_word_pre = 0; + ptp_read_system_prets(xfer->ptp_sts); + } + } + ret = ctlr->transfer_one_message(ctlr, msg); if (ret) { dev_err(&ctlr->dev, @@ -1418,6 +1436,99 @@ static void spi_pump_messages(struct kthread_work *work) __spi_pump_messages(ctlr, true); } +/** + * spi_take_timestamp_pre - helper for drivers to collect the beginning of the + * TX timestamp for the requested byte from the SPI + * transfer. The frequency with which this function + * must be called (once per word, once for the whole + * transfer, once per batch of words etc) is arbitrary + * as long as the @tx buffer offset is greater than or + * equal to the requested byte at the time of the + * call. The timestamp is only taken once, at the + * first such call. It is assumed that the driver + * advances its @tx buffer pointer monotonically. + * @ctlr: Pointer to the spi_controller structure of the driver + * @xfer: Pointer to the transfer being timestamped + * @tx: Pointer to the current word within the xfer->tx_buf that the driver is + * preparing to transmit right now. + * @irqs_off: If true, will disable IRQs and preemption for the duration of the + * transfer, for less jitter in time measurement. Only compatible + * with PIO drivers. If true, must follow up with + * spi_take_timestamp_post or otherwise system will crash. + * WARNING: for fully predictable results, the CPU frequency must + * also be under control (governor). + */ +void spi_take_timestamp_pre(struct spi_controller *ctlr, + struct spi_transfer *xfer, + const void *tx, bool irqs_off) +{ + u8 bytes_per_word = DIV_ROUND_UP(xfer->bits_per_word, 8); + + if (!xfer->ptp_sts) + return; + + if (xfer->timestamped_pre) + return; + + if (tx < (xfer->tx_buf + xfer->ptp_sts_word_pre * bytes_per_word)) + return; + + /* Capture the resolution of the timestamp */ + xfer->ptp_sts_word_pre = (tx - xfer->tx_buf) / bytes_per_word; + + xfer->timestamped_pre = true; + + if (irqs_off) { + local_irq_save(ctlr->irq_flags); + preempt_disable(); + } + + ptp_read_system_prets(xfer->ptp_sts); +} +EXPORT_SYMBOL_GPL(spi_take_timestamp_pre); + +/** + * spi_take_timestamp_post - helper for drivers to collect the end of the + * TX timestamp for the requested byte from the SPI + * transfer. Can be called with an arbitrary + * frequency: only the first call where @tx exceeds + * or is equal to the requested word will be + * timestamped. + * @ctlr: Pointer to the spi_controller structure of the driver + * @xfer: Pointer to the transfer being timestamped + * @tx: Pointer to the current word within the xfer->tx_buf that the driver has + * just transmitted. + * @irqs_off: If true, will re-enable IRQs and preemption for the local CPU. + */ +void spi_take_timestamp_post(struct spi_controller *ctlr, + struct spi_transfer *xfer, + const void *tx, bool irqs_off) +{ + u8 bytes_per_word = DIV_ROUND_UP(xfer->bits_per_word, 8); + + if (!xfer->ptp_sts) + return; + + if (xfer->timestamped_post) + return; + + if (tx < (xfer->tx_buf + xfer->ptp_sts_word_post * bytes_per_word)) + return; + + ptp_read_system_postts(xfer->ptp_sts); + + if (irqs_off) { + local_irq_restore(ctlr->irq_flags); + preempt_enable(); + } + + /* Capture the resolution of the timestamp */ + xfer->ptp_sts_word_post = (tx - xfer->tx_buf) / bytes_per_word; + + xfer->timestamped_post = true; +} +EXPORT_SYMBOL_GPL(spi_take_timestamp_post); + /** * spi_set_thread_rt - set the controller to pump at realtime priority * @ctlr: controller to boost priority of @@ -1503,6 +1614,7 @@ EXPORT_SYMBOL_GPL(spi_get_next_queued_message); */ void spi_finalize_current_message(struct spi_controller *ctlr) { + struct spi_transfer *xfer; struct spi_message *mesg; unsigned long flags; int ret; @@ -1511,6 +1623,13 @@ void spi_finalize_current_message(struct spi_controller *ctlr) mesg = ctlr->cur_msg; spin_unlock_irqrestore(&ctlr->queue_lock, flags); + if (!ctlr->ptp_sts_supported && !ctlr->transfer_one) { + list_for_each_entry(xfer, &mesg->transfers, transfer_list) { + ptp_read_system_postts(xfer->ptp_sts); + xfer->ptp_sts_word_post = xfer->len; + } + } + spi_unmap_msg(ctlr, mesg); if (ctlr->cur_msg_prepared && ctlr->unprepare_message) { @@ -3273,6 +3392,7 @@ static int __spi_validate(struct spi_device *spi, struct spi_message *message) static int __spi_async(struct spi_device *spi, struct spi_message *message) { struct spi_controller *ctlr = spi->controller; + struct spi_transfer *xfer; /* * Some controllers do not support doing regular SPI transfers. Return @@ -3288,6 +3408,13 @@ static int __spi_async(struct spi_device *spi, struct spi_message *message) trace_spi_message_submit(message); + if (!ctlr->ptp_sts_supported) { + list_for_each_entry(xfer, &message->transfers, transfer_list) { + xfer->ptp_sts_word_pre = 0; + ptp_read_system_prets(xfer->ptp_sts); + } + } + return ctlr->transfer(spi, message); } diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h index af4f265d0f67..27f6b046cf92 100644 --- a/include/linux/spi/spi.h +++ b/include/linux/spi/spi.h @@ -13,6 +13,7 @@ #include <linux/completion.h> #include <linux/scatterlist.h> #include <linux/gpio/consumer.h> +#include <linux/ptp_clock_kernel.h> struct dma_chan; struct property_entry; @@ -409,6 +410,12 @@ static inline void spi_unregister_driver(struct spi_driver *sdrv) * @fw_translate_cs: If the boot firmware uses different numbering scheme * what Linux expects, this optional hook can be used to translate * between the two. + * @ptp_sts_supported: If the driver sets this to true, it must provide a + * time snapshot in @spi_transfer->ptp_sts as close as possible to the + * moment in time when @spi_transfer->ptp_sts_word_pre and + * @spi_transfer->ptp_sts_word_post were transmitted. + * If the driver does not set this, the SPI core takes the snapshot as + * close to the driver hand-over as possible. * * Each SPI controller can communicate with one or more @spi_device * children. These make a small bus, sharing MOSI, MISO and SCK signals @@ -604,6 +611,15 @@ struct spi_controller { void *dummy_tx; int (*fw_translate_cs)(struct spi_controller *ctlr, unsigned cs); + + /* + * Driver sets this field to indicate it is able to snapshot SPI + * transfers (needed e.g. for reading the time of POSIX clocks) + */ + bool ptp_sts_supported; + + /* Interrupt enable state during PTP system timestamping */ + unsigned long irq_flags; }; static inline void *spi_controller_get_devdata(struct spi_controller *ctlr) @@ -644,6 +660,14 @@ extern struct spi_message *spi_get_next_queued_message(struct spi_controller *ct extern void spi_finalize_current_message(struct spi_controller *ctlr); extern void spi_finalize_current_transfer(struct spi_controller *ctlr); +/* Helper calls for driver to timestamp transfer */ +void spi_take_timestamp_pre(struct spi_controller *ctlr, + struct spi_transfer *xfer, + const void *tx, bool irqs_off); +void spi_take_timestamp_post(struct spi_controller *ctlr, + struct spi_transfer *xfer, + const void *tx, bool irqs_off); + /* the spi driver core manages memory for the spi_controller classdev */ extern struct spi_controller *__spi_alloc_controller(struct device *host, unsigned int size, bool slave); @@ -753,6 +777,35 @@ extern void spi_res_release(struct spi_controller *ctlr, * @transfer_list: transfers are sequenced through @spi_message.transfers * @tx_sg: Scatterlist for transmit, currently not for client use * @rx_sg: Scatterlist for receive, currently not for client use + * @ptp_sts_word_pre: The word (subject to bits_per_word semantics) offset + * within @tx_buf for which the SPI device is requesting that the time + * snapshot for this transfer begins. Upon completing the SPI transfer, + * this value may have changed compared to what was requested, depending + * on the available snapshotting resolution (DMA transfer, + * @ptp_sts_supported is false, etc). + * @ptp_sts_word_post: See @ptp_sts_word_post. The two can be equal (meaning + * that a single byte should be snapshotted). + * If the core takes care of the timestamp (if @ptp_sts_supported is false + * for this controller), it will set @ptp_sts_word_pre to 0, and + * @ptp_sts_word_post to the length of the transfer. This is done + * purposefully (instead of setting to spi_transfer->len - 1) to denote + * that a transfer-level snapshot taken from within the driver may still + * be of higher quality. + * @ptp_sts: Pointer to a memory location held by the SPI slave device where a + * PTP system timestamp structure may lie. If drivers use PIO or their + * hardware has some sort of assist for retrieving exact transfer timing, + * they can (and should) assert @ptp_sts_supported and populate this + * structure using the ptp_read_system_*ts helper functions. + * The timestamp must represent the time at which the SPI slave device has + * processed the word, i.e. the "pre" timestamp should be taken before + * transmitting the "pre" word, and the "post" timestamp after receiving + * transmit confirmation from the controller for the "post" word. + * @timestamped_pre: Set by the SPI controller driver to denote it has acted + * upon the @ptp_sts request. Not set when the SPI core has taken care of + * the task. SPI device drivers are free to print a warning if this comes + * back unset and they need the better resolution. + * @timestamped_post: See above. The reason why both exist is that these + * booleans are also used to keep state in the core SPI logic. * * SPI transfers always write the same number of bytes as they read. * Protocol drivers should always provide @rx_buf and/or @tx_buf. @@ -842,6 +895,14 @@ struct spi_transfer { u32 effective_speed_hz; + unsigned int ptp_sts_word_pre; + unsigned int ptp_sts_word_post; + + struct ptp_system_timestamp *ptp_sts; + + bool timestamped_pre; + bool timestamped_post; + struct list_head transfer_list; }; -- 2.20.1 ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Applied "spi: Add a PTP system timestamp to the transfer structure" to the spi tree @ 2019-10-08 10:52 ` Mark Brown 0 siblings, 0 replies; 21+ messages in thread From: Mark Brown @ 2019-10-08 10:52 UTC (permalink / raw) To: Vladimir Oltean Cc: andrew, broonie, f.fainelli, h.feurstein, linux-spi, Mark Brown, mlichvar, netdev, richardcochran The patch spi: Add a PTP system timestamp to the transfer structure has been applied to the spi tree at https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git for-5.5 All being well this means that it will be integrated into the linux-next tree (usually sometime in the next 24 hours) and sent to Linus during the next merge window (or sooner if it is a bug fix), however if problems are discovered then the patch may be dropped or reverted. You may get further e-mails resulting from automated or manual testing and review of the tree, please engage with people reporting problems and send followup patches addressing any issues that are reported if needed. If any updates are required or you are submitting further changes they should be sent as incremental updates against current git, existing patches will not be replaced. Please add any relevant lists and maintainers to the CCs when replying to this mail. Thanks, Mark >From b42faeee718ce13ef6eb99c24880b58deb54c8fa Mon Sep 17 00:00:00 2001 From: Vladimir Oltean <olteanv@gmail.com> Date: Thu, 5 Sep 2019 04:01:12 +0300 Subject: [PATCH] spi: Add a PTP system timestamp to the transfer structure SPI is one of the interfaces used to access devices which have a POSIX clock driver (real time clocks, 1588 timers etc). The fact that the SPI bus is slow is not what the main problem is, but rather the fact that drivers don't take a constant amount of time in transferring data over SPI. When there is a high delay in the readout of time, there will be uncertainty in the value that has been read out of the peripheral. When that delay is constant, the uncertainty can at least be approximated with a certain accuracy which is fine more often than not. Timing jitter occurs all over in the kernel code, and is mainly caused by having to let go of the CPU for various reasons such as preemption, servicing interrupts, going to sleep, etc. Another major reason is CPU dynamic frequency scaling. It turns out that the problem of retrieving time from a SPI peripheral with high accuracy can be solved by the use of "PTP system timestamping" - a mechanism to correlate the time when the device has snapshotted its internal time counter with the Linux system time at that same moment. This is sufficient for having a precise time measurement - it is not necessary for the whole SPI transfer to be transmitted "as fast as possible", or "as low-jitter as possible". The system has to be low-jitter for a very short amount of time to be effective. This patch introduces a PTP system timestamping mechanism in struct spi_transfer. This is to be used by SPI device drivers when they need to know the exact time at which the underlying device's time was snapshotted. More often than not, SPI peripherals have a very exact timing for when their SPI-to-interconnect bridge issues a transaction for snapshotting and reading the time register, and that will be dependent on when the SPI-to-interconnect bridge figures out that this is what it should do, aka as soon as it sees byte N of the SPI transfer. Since spi_device drivers are the ones who'd know best how the peripheral behaves in this regard, expose a mechanism in spi_transfer which allows them to specify which word (or word range) from the transfer should be timestamped. Add a default implementation of the PTP system timestamping in the SPI core. This is not going to be satisfactory performance-wise, but should at least increase the likelihood that SPI device drivers will use PTP system timestamping in the future. There are 3 entry points from the core towards the SPI controller drivers: - transfer_one: The driver is passed individual spi_transfers to execute. This is the easiest to timestamp. - transfer_one_message: The core passes the driver an entire spi_message (a potential batch of spi_transfers). The core puts the same pre and post timestamp to all transfers within a message. This is not ideal, but nothing better can be done by default anyway, since the core has no insight into how the driver batches the transfers. - transfer: Like transfer_one_message, but for unqueued drivers (i.e. the driver implements its own queue scheduling). Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Link: https://lore.kernel.org/r/20190905010114.26718-3-olteanv@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org> --- drivers/spi/spi.c | 127 ++++++++++++++++++++++++++++++++++++++++ include/linux/spi/spi.h | 61 +++++++++++++++++++ 2 files changed, 188 insertions(+) diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c index f9502dbbb5c1..9bb36c32cbf9 100644 --- a/drivers/spi/spi.c +++ b/drivers/spi/spi.c @@ -1171,6 +1171,11 @@ static int spi_transfer_one_message(struct spi_controller *ctlr, spi_statistics_add_transfer_stats(statm, xfer, ctlr); spi_statistics_add_transfer_stats(stats, xfer, ctlr); + if (!ctlr->ptp_sts_supported) { + xfer->ptp_sts_word_pre = 0; + ptp_read_system_prets(xfer->ptp_sts); + } + if (xfer->tx_buf || xfer->rx_buf) { reinit_completion(&ctlr->xfer_completion); @@ -1197,6 +1202,11 @@ static int spi_transfer_one_message(struct spi_controller *ctlr, xfer->len); } + if (!ctlr->ptp_sts_supported) { + ptp_read_system_postts(xfer->ptp_sts); + xfer->ptp_sts_word_post = xfer->len; + } + trace_spi_transfer_stop(msg, xfer); if (msg->status != -EINPROGRESS) @@ -1265,6 +1275,7 @@ EXPORT_SYMBOL_GPL(spi_finalize_current_transfer); */ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread) { + struct spi_transfer *xfer; struct spi_message *msg; bool was_busy = false; unsigned long flags; @@ -1391,6 +1402,13 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread) goto out; } + if (!ctlr->ptp_sts_supported && !ctlr->transfer_one) { + list_for_each_entry(xfer, &msg->transfers, transfer_list) { + xfer->ptp_sts_word_pre = 0; + ptp_read_system_prets(xfer->ptp_sts); + } + } + ret = ctlr->transfer_one_message(ctlr, msg); if (ret) { dev_err(&ctlr->dev, @@ -1418,6 +1436,99 @@ static void spi_pump_messages(struct kthread_work *work) __spi_pump_messages(ctlr, true); } +/** + * spi_take_timestamp_pre - helper for drivers to collect the beginning of the + * TX timestamp for the requested byte from the SPI + * transfer. The frequency with which this function + * must be called (once per word, once for the whole + * transfer, once per batch of words etc) is arbitrary + * as long as the @tx buffer offset is greater than or + * equal to the requested byte at the time of the + * call. The timestamp is only taken once, at the + * first such call. It is assumed that the driver + * advances its @tx buffer pointer monotonically. + * @ctlr: Pointer to the spi_controller structure of the driver + * @xfer: Pointer to the transfer being timestamped + * @tx: Pointer to the current word within the xfer->tx_buf that the driver is + * preparing to transmit right now. + * @irqs_off: If true, will disable IRQs and preemption for the duration of the + * transfer, for less jitter in time measurement. Only compatible + * with PIO drivers. If true, must follow up with + * spi_take_timestamp_post or otherwise system will crash. + * WARNING: for fully predictable results, the CPU frequency must + * also be under control (governor). + */ +void spi_take_timestamp_pre(struct spi_controller *ctlr, + struct spi_transfer *xfer, + const void *tx, bool irqs_off) +{ + u8 bytes_per_word = DIV_ROUND_UP(xfer->bits_per_word, 8); + + if (!xfer->ptp_sts) + return; + + if (xfer->timestamped_pre) + return; + + if (tx < (xfer->tx_buf + xfer->ptp_sts_word_pre * bytes_per_word)) + return; + + /* Capture the resolution of the timestamp */ + xfer->ptp_sts_word_pre = (tx - xfer->tx_buf) / bytes_per_word; + + xfer->timestamped_pre = true; + + if (irqs_off) { + local_irq_save(ctlr->irq_flags); + preempt_disable(); + } + + ptp_read_system_prets(xfer->ptp_sts); +} +EXPORT_SYMBOL_GPL(spi_take_timestamp_pre); + +/** + * spi_take_timestamp_post - helper for drivers to collect the end of the + * TX timestamp for the requested byte from the SPI + * transfer. Can be called with an arbitrary + * frequency: only the first call where @tx exceeds + * or is equal to the requested word will be + * timestamped. + * @ctlr: Pointer to the spi_controller structure of the driver + * @xfer: Pointer to the transfer being timestamped + * @tx: Pointer to the current word within the xfer->tx_buf that the driver has + * just transmitted. + * @irqs_off: If true, will re-enable IRQs and preemption for the local CPU. + */ +void spi_take_timestamp_post(struct spi_controller *ctlr, + struct spi_transfer *xfer, + const void *tx, bool irqs_off) +{ + u8 bytes_per_word = DIV_ROUND_UP(xfer->bits_per_word, 8); + + if (!xfer->ptp_sts) + return; + + if (xfer->timestamped_post) + return; + + if (tx < (xfer->tx_buf + xfer->ptp_sts_word_post * bytes_per_word)) + return; + + ptp_read_system_postts(xfer->ptp_sts); + + if (irqs_off) { + local_irq_restore(ctlr->irq_flags); + preempt_enable(); + } + + /* Capture the resolution of the timestamp */ + xfer->ptp_sts_word_post = (tx - xfer->tx_buf) / bytes_per_word; + + xfer->timestamped_post = true; +} +EXPORT_SYMBOL_GPL(spi_take_timestamp_post); + /** * spi_set_thread_rt - set the controller to pump at realtime priority * @ctlr: controller to boost priority of @@ -1503,6 +1614,7 @@ EXPORT_SYMBOL_GPL(spi_get_next_queued_message); */ void spi_finalize_current_message(struct spi_controller *ctlr) { + struct spi_transfer *xfer; struct spi_message *mesg; unsigned long flags; int ret; @@ -1511,6 +1623,13 @@ void spi_finalize_current_message(struct spi_controller *ctlr) mesg = ctlr->cur_msg; spin_unlock_irqrestore(&ctlr->queue_lock, flags); + if (!ctlr->ptp_sts_supported && !ctlr->transfer_one) { + list_for_each_entry(xfer, &mesg->transfers, transfer_list) { + ptp_read_system_postts(xfer->ptp_sts); + xfer->ptp_sts_word_post = xfer->len; + } + } + spi_unmap_msg(ctlr, mesg); if (ctlr->cur_msg_prepared && ctlr->unprepare_message) { @@ -3273,6 +3392,7 @@ static int __spi_validate(struct spi_device *spi, struct spi_message *message) static int __spi_async(struct spi_device *spi, struct spi_message *message) { struct spi_controller *ctlr = spi->controller; + struct spi_transfer *xfer; /* * Some controllers do not support doing regular SPI transfers. Return @@ -3288,6 +3408,13 @@ static int __spi_async(struct spi_device *spi, struct spi_message *message) trace_spi_message_submit(message); + if (!ctlr->ptp_sts_supported) { + list_for_each_entry(xfer, &message->transfers, transfer_list) { + xfer->ptp_sts_word_pre = 0; + ptp_read_system_prets(xfer->ptp_sts); + } + } + return ctlr->transfer(spi, message); } diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h index af4f265d0f67..27f6b046cf92 100644 --- a/include/linux/spi/spi.h +++ b/include/linux/spi/spi.h @@ -13,6 +13,7 @@ #include <linux/completion.h> #include <linux/scatterlist.h> #include <linux/gpio/consumer.h> +#include <linux/ptp_clock_kernel.h> struct dma_chan; struct property_entry; @@ -409,6 +410,12 @@ static inline void spi_unregister_driver(struct spi_driver *sdrv) * @fw_translate_cs: If the boot firmware uses different numbering scheme * what Linux expects, this optional hook can be used to translate * between the two. + * @ptp_sts_supported: If the driver sets this to true, it must provide a + * time snapshot in @spi_transfer->ptp_sts as close as possible to the + * moment in time when @spi_transfer->ptp_sts_word_pre and + * @spi_transfer->ptp_sts_word_post were transmitted. + * If the driver does not set this, the SPI core takes the snapshot as + * close to the driver hand-over as possible. * * Each SPI controller can communicate with one or more @spi_device * children. These make a small bus, sharing MOSI, MISO and SCK signals @@ -604,6 +611,15 @@ struct spi_controller { void *dummy_tx; int (*fw_translate_cs)(struct spi_controller *ctlr, unsigned cs); + + /* + * Driver sets this field to indicate it is able to snapshot SPI + * transfers (needed e.g. for reading the time of POSIX clocks) + */ + bool ptp_sts_supported; + + /* Interrupt enable state during PTP system timestamping */ + unsigned long irq_flags; }; static inline void *spi_controller_get_devdata(struct spi_controller *ctlr) @@ -644,6 +660,14 @@ extern struct spi_message *spi_get_next_queued_message(struct spi_controller *ct extern void spi_finalize_current_message(struct spi_controller *ctlr); extern void spi_finalize_current_transfer(struct spi_controller *ctlr); +/* Helper calls for driver to timestamp transfer */ +void spi_take_timestamp_pre(struct spi_controller *ctlr, + struct spi_transfer *xfer, + const void *tx, bool irqs_off); +void spi_take_timestamp_post(struct spi_controller *ctlr, + struct spi_transfer *xfer, + const void *tx, bool irqs_off); + /* the spi driver core manages memory for the spi_controller classdev */ extern struct spi_controller *__spi_alloc_controller(struct device *host, unsigned int size, bool slave); @@ -753,6 +777,35 @@ extern void spi_res_release(struct spi_controller *ctlr, * @transfer_list: transfers are sequenced through @spi_message.transfers * @tx_sg: Scatterlist for transmit, currently not for client use * @rx_sg: Scatterlist for receive, currently not for client use + * @ptp_sts_word_pre: The word (subject to bits_per_word semantics) offset + * within @tx_buf for which the SPI device is requesting that the time + * snapshot for this transfer begins. Upon completing the SPI transfer, + * this value may have changed compared to what was requested, depending + * on the available snapshotting resolution (DMA transfer, + * @ptp_sts_supported is false, etc). + * @ptp_sts_word_post: See @ptp_sts_word_post. The two can be equal (meaning + * that a single byte should be snapshotted). + * If the core takes care of the timestamp (if @ptp_sts_supported is false + * for this controller), it will set @ptp_sts_word_pre to 0, and + * @ptp_sts_word_post to the length of the transfer. This is done + * purposefully (instead of setting to spi_transfer->len - 1) to denote + * that a transfer-level snapshot taken from within the driver may still + * be of higher quality. + * @ptp_sts: Pointer to a memory location held by the SPI slave device where a + * PTP system timestamp structure may lie. If drivers use PIO or their + * hardware has some sort of assist for retrieving exact transfer timing, + * they can (and should) assert @ptp_sts_supported and populate this + * structure using the ptp_read_system_*ts helper functions. + * The timestamp must represent the time at which the SPI slave device has + * processed the word, i.e. the "pre" timestamp should be taken before + * transmitting the "pre" word, and the "post" timestamp after receiving + * transmit confirmation from the controller for the "post" word. + * @timestamped_pre: Set by the SPI controller driver to denote it has acted + * upon the @ptp_sts request. Not set when the SPI core has taken care of + * the task. SPI device drivers are free to print a warning if this comes + * back unset and they need the better resolution. + * @timestamped_post: See above. The reason why both exist is that these + * booleans are also used to keep state in the core SPI logic. * * SPI transfers always write the same number of bytes as they read. * Protocol drivers should always provide @rx_buf and/or @tx_buf. @@ -842,6 +895,14 @@ struct spi_transfer { u32 effective_speed_hz; + unsigned int ptp_sts_word_pre; + unsigned int ptp_sts_word_post; + + struct ptp_system_timestamp *ptp_sts; + + bool timestamped_pre; + bool timestamped_post; + struct list_head transfer_list; }; -- 2.20.1 ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: Applied "spi: Add a PTP system timestamp to the transfer structure" to the spi tree 2019-10-08 10:52 ` Mark Brown (?) @ 2019-10-08 12:58 ` Vladimir Oltean 2019-10-08 16:42 ` Mark Brown -1 siblings, 1 reply; 21+ messages in thread From: Vladimir Oltean @ 2019-10-08 12:58 UTC (permalink / raw) To: Mark Brown, David S. Miller Cc: Andrew Lunn, Florian Fainelli, Hubert Feurstein, linux-spi, Miroslav Lichvar, netdev, Richard Cochran On Tue, 8 Oct 2019 at 13:52, Mark Brown <broonie@kernel.org> wrote: > > The patch > > spi: Add a PTP system timestamp to the transfer structure > > has been applied to the spi tree at > > https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git for-5.5 > > All being well this means that it will be integrated into the linux-next > tree (usually sometime in the next 24 hours) and sent to Linus during > the next merge window (or sooner if it is a bug fix), however if > problems are discovered then the patch may be dropped or reverted. > > You may get further e-mails resulting from automated or manual testing > and review of the tree, please engage with people reporting problems and > send followup patches addressing any issues that are reported if needed. > > If any updates are required or you are submitting further changes they > should be sent as incremental updates against current git, existing > patches will not be replaced. > > Please add any relevant lists and maintainers to the CCs when replying > to this mail. > > Thanks, > Mark > Thanks Mark! Dave, do you think you can somehow integrate this patch into net-next as well, so that I can send some further patches that depend on the newly introduced ptp_sts member of struct spi_transfer without waiting for another kernel release? Regards, -Vladimir > From b42faeee718ce13ef6eb99c24880b58deb54c8fa Mon Sep 17 00:00:00 2001 > From: Vladimir Oltean <olteanv@gmail.com> > Date: Thu, 5 Sep 2019 04:01:12 +0300 > Subject: [PATCH] spi: Add a PTP system timestamp to the transfer structure > > SPI is one of the interfaces used to access devices which have a POSIX > clock driver (real time clocks, 1588 timers etc). The fact that the SPI > bus is slow is not what the main problem is, but rather the fact that > drivers don't take a constant amount of time in transferring data over > SPI. When there is a high delay in the readout of time, there will be > uncertainty in the value that has been read out of the peripheral. > When that delay is constant, the uncertainty can at least be > approximated with a certain accuracy which is fine more often than not. > > Timing jitter occurs all over in the kernel code, and is mainly caused > by having to let go of the CPU for various reasons such as preemption, > servicing interrupts, going to sleep, etc. Another major reason is CPU > dynamic frequency scaling. > > It turns out that the problem of retrieving time from a SPI peripheral > with high accuracy can be solved by the use of "PTP system > timestamping" - a mechanism to correlate the time when the device has > snapshotted its internal time counter with the Linux system time at that > same moment. This is sufficient for having a precise time measurement - > it is not necessary for the whole SPI transfer to be transmitted "as > fast as possible", or "as low-jitter as possible". The system has to be > low-jitter for a very short amount of time to be effective. > > This patch introduces a PTP system timestamping mechanism in struct > spi_transfer. This is to be used by SPI device drivers when they need to > know the exact time at which the underlying device's time was > snapshotted. More often than not, SPI peripherals have a very exact > timing for when their SPI-to-interconnect bridge issues a transaction > for snapshotting and reading the time register, and that will be > dependent on when the SPI-to-interconnect bridge figures out that this > is what it should do, aka as soon as it sees byte N of the SPI transfer. > Since spi_device drivers are the ones who'd know best how the peripheral > behaves in this regard, expose a mechanism in spi_transfer which allows > them to specify which word (or word range) from the transfer should be > timestamped. > > Add a default implementation of the PTP system timestamping in the SPI > core. This is not going to be satisfactory performance-wise, but should > at least increase the likelihood that SPI device drivers will use PTP > system timestamping in the future. > There are 3 entry points from the core towards the SPI controller > drivers: > > - transfer_one: The driver is passed individual spi_transfers to > execute. This is the easiest to timestamp. > > - transfer_one_message: The core passes the driver an entire spi_message > (a potential batch of spi_transfers). The core puts the same pre and > post timestamp to all transfers within a message. This is not ideal, > but nothing better can be done by default anyway, since the core has > no insight into how the driver batches the transfers. > > - transfer: Like transfer_one_message, but for unqueued drivers (i.e. > the driver implements its own queue scheduling). > > Signed-off-by: Vladimir Oltean <olteanv@gmail.com> > Link: https://lore.kernel.org/r/20190905010114.26718-3-olteanv@gmail.com > Signed-off-by: Mark Brown <broonie@kernel.org> > --- > drivers/spi/spi.c | 127 ++++++++++++++++++++++++++++++++++++++++ > include/linux/spi/spi.h | 61 +++++++++++++++++++ > 2 files changed, 188 insertions(+) > > diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c > index f9502dbbb5c1..9bb36c32cbf9 100644 > --- a/drivers/spi/spi.c > +++ b/drivers/spi/spi.c > @@ -1171,6 +1171,11 @@ static int spi_transfer_one_message(struct spi_controller *ctlr, > spi_statistics_add_transfer_stats(statm, xfer, ctlr); > spi_statistics_add_transfer_stats(stats, xfer, ctlr); > > + if (!ctlr->ptp_sts_supported) { > + xfer->ptp_sts_word_pre = 0; > + ptp_read_system_prets(xfer->ptp_sts); > + } > + > if (xfer->tx_buf || xfer->rx_buf) { > reinit_completion(&ctlr->xfer_completion); > > @@ -1197,6 +1202,11 @@ static int spi_transfer_one_message(struct spi_controller *ctlr, > xfer->len); > } > > + if (!ctlr->ptp_sts_supported) { > + ptp_read_system_postts(xfer->ptp_sts); > + xfer->ptp_sts_word_post = xfer->len; > + } > + > trace_spi_transfer_stop(msg, xfer); > > if (msg->status != -EINPROGRESS) > @@ -1265,6 +1275,7 @@ EXPORT_SYMBOL_GPL(spi_finalize_current_transfer); > */ > static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread) > { > + struct spi_transfer *xfer; > struct spi_message *msg; > bool was_busy = false; > unsigned long flags; > @@ -1391,6 +1402,13 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread) > goto out; > } > > + if (!ctlr->ptp_sts_supported && !ctlr->transfer_one) { > + list_for_each_entry(xfer, &msg->transfers, transfer_list) { > + xfer->ptp_sts_word_pre = 0; > + ptp_read_system_prets(xfer->ptp_sts); > + } > + } > + > ret = ctlr->transfer_one_message(ctlr, msg); > if (ret) { > dev_err(&ctlr->dev, > @@ -1418,6 +1436,99 @@ static void spi_pump_messages(struct kthread_work *work) > __spi_pump_messages(ctlr, true); > } > > +/** > + * spi_take_timestamp_pre - helper for drivers to collect the beginning of the > + * TX timestamp for the requested byte from the SPI > + * transfer. The frequency with which this function > + * must be called (once per word, once for the whole > + * transfer, once per batch of words etc) is arbitrary > + * as long as the @tx buffer offset is greater than or > + * equal to the requested byte at the time of the > + * call. The timestamp is only taken once, at the > + * first such call. It is assumed that the driver > + * advances its @tx buffer pointer monotonically. > + * @ctlr: Pointer to the spi_controller structure of the driver > + * @xfer: Pointer to the transfer being timestamped > + * @tx: Pointer to the current word within the xfer->tx_buf that the driver is > + * preparing to transmit right now. > + * @irqs_off: If true, will disable IRQs and preemption for the duration of the > + * transfer, for less jitter in time measurement. Only compatible > + * with PIO drivers. If true, must follow up with > + * spi_take_timestamp_post or otherwise system will crash. > + * WARNING: for fully predictable results, the CPU frequency must > + * also be under control (governor). > + */ > +void spi_take_timestamp_pre(struct spi_controller *ctlr, > + struct spi_transfer *xfer, > + const void *tx, bool irqs_off) > +{ > + u8 bytes_per_word = DIV_ROUND_UP(xfer->bits_per_word, 8); > + > + if (!xfer->ptp_sts) > + return; > + > + if (xfer->timestamped_pre) > + return; > + > + if (tx < (xfer->tx_buf + xfer->ptp_sts_word_pre * bytes_per_word)) > + return; > + > + /* Capture the resolution of the timestamp */ > + xfer->ptp_sts_word_pre = (tx - xfer->tx_buf) / bytes_per_word; > + > + xfer->timestamped_pre = true; > + > + if (irqs_off) { > + local_irq_save(ctlr->irq_flags); > + preempt_disable(); > + } > + > + ptp_read_system_prets(xfer->ptp_sts); > +} > +EXPORT_SYMBOL_GPL(spi_take_timestamp_pre); > + > +/** > + * spi_take_timestamp_post - helper for drivers to collect the end of the > + * TX timestamp for the requested byte from the SPI > + * transfer. Can be called with an arbitrary > + * frequency: only the first call where @tx exceeds > + * or is equal to the requested word will be > + * timestamped. > + * @ctlr: Pointer to the spi_controller structure of the driver > + * @xfer: Pointer to the transfer being timestamped > + * @tx: Pointer to the current word within the xfer->tx_buf that the driver has > + * just transmitted. > + * @irqs_off: If true, will re-enable IRQs and preemption for the local CPU. > + */ > +void spi_take_timestamp_post(struct spi_controller *ctlr, > + struct spi_transfer *xfer, > + const void *tx, bool irqs_off) > +{ > + u8 bytes_per_word = DIV_ROUND_UP(xfer->bits_per_word, 8); > + > + if (!xfer->ptp_sts) > + return; > + > + if (xfer->timestamped_post) > + return; > + > + if (tx < (xfer->tx_buf + xfer->ptp_sts_word_post * bytes_per_word)) > + return; > + > + ptp_read_system_postts(xfer->ptp_sts); > + > + if (irqs_off) { > + local_irq_restore(ctlr->irq_flags); > + preempt_enable(); > + } > + > + /* Capture the resolution of the timestamp */ > + xfer->ptp_sts_word_post = (tx - xfer->tx_buf) / bytes_per_word; > + > + xfer->timestamped_post = true; > +} > +EXPORT_SYMBOL_GPL(spi_take_timestamp_post); > + > /** > * spi_set_thread_rt - set the controller to pump at realtime priority > * @ctlr: controller to boost priority of > @@ -1503,6 +1614,7 @@ EXPORT_SYMBOL_GPL(spi_get_next_queued_message); > */ > void spi_finalize_current_message(struct spi_controller *ctlr) > { > + struct spi_transfer *xfer; > struct spi_message *mesg; > unsigned long flags; > int ret; > @@ -1511,6 +1623,13 @@ void spi_finalize_current_message(struct spi_controller *ctlr) > mesg = ctlr->cur_msg; > spin_unlock_irqrestore(&ctlr->queue_lock, flags); > > + if (!ctlr->ptp_sts_supported && !ctlr->transfer_one) { > + list_for_each_entry(xfer, &mesg->transfers, transfer_list) { > + ptp_read_system_postts(xfer->ptp_sts); > + xfer->ptp_sts_word_post = xfer->len; > + } > + } > + > spi_unmap_msg(ctlr, mesg); > > if (ctlr->cur_msg_prepared && ctlr->unprepare_message) { > @@ -3273,6 +3392,7 @@ static int __spi_validate(struct spi_device *spi, struct spi_message *message) > static int __spi_async(struct spi_device *spi, struct spi_message *message) > { > struct spi_controller *ctlr = spi->controller; > + struct spi_transfer *xfer; > > /* > * Some controllers do not support doing regular SPI transfers. Return > @@ -3288,6 +3408,13 @@ static int __spi_async(struct spi_device *spi, struct spi_message *message) > > trace_spi_message_submit(message); > > + if (!ctlr->ptp_sts_supported) { > + list_for_each_entry(xfer, &message->transfers, transfer_list) { > + xfer->ptp_sts_word_pre = 0; > + ptp_read_system_prets(xfer->ptp_sts); > + } > + } > + > return ctlr->transfer(spi, message); > } > > diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h > index af4f265d0f67..27f6b046cf92 100644 > --- a/include/linux/spi/spi.h > +++ b/include/linux/spi/spi.h > @@ -13,6 +13,7 @@ > #include <linux/completion.h> > #include <linux/scatterlist.h> > #include <linux/gpio/consumer.h> > +#include <linux/ptp_clock_kernel.h> > > struct dma_chan; > struct property_entry; > @@ -409,6 +410,12 @@ static inline void spi_unregister_driver(struct spi_driver *sdrv) > * @fw_translate_cs: If the boot firmware uses different numbering scheme > * what Linux expects, this optional hook can be used to translate > * between the two. > + * @ptp_sts_supported: If the driver sets this to true, it must provide a > + * time snapshot in @spi_transfer->ptp_sts as close as possible to the > + * moment in time when @spi_transfer->ptp_sts_word_pre and > + * @spi_transfer->ptp_sts_word_post were transmitted. > + * If the driver does not set this, the SPI core takes the snapshot as > + * close to the driver hand-over as possible. > * > * Each SPI controller can communicate with one or more @spi_device > * children. These make a small bus, sharing MOSI, MISO and SCK signals > @@ -604,6 +611,15 @@ struct spi_controller { > void *dummy_tx; > > int (*fw_translate_cs)(struct spi_controller *ctlr, unsigned cs); > + > + /* > + * Driver sets this field to indicate it is able to snapshot SPI > + * transfers (needed e.g. for reading the time of POSIX clocks) > + */ > + bool ptp_sts_supported; > + > + /* Interrupt enable state during PTP system timestamping */ > + unsigned long irq_flags; > }; > > static inline void *spi_controller_get_devdata(struct spi_controller *ctlr) > @@ -644,6 +660,14 @@ extern struct spi_message *spi_get_next_queued_message(struct spi_controller *ct > extern void spi_finalize_current_message(struct spi_controller *ctlr); > extern void spi_finalize_current_transfer(struct spi_controller *ctlr); > > +/* Helper calls for driver to timestamp transfer */ > +void spi_take_timestamp_pre(struct spi_controller *ctlr, > + struct spi_transfer *xfer, > + const void *tx, bool irqs_off); > +void spi_take_timestamp_post(struct spi_controller *ctlr, > + struct spi_transfer *xfer, > + const void *tx, bool irqs_off); > + > /* the spi driver core manages memory for the spi_controller classdev */ > extern struct spi_controller *__spi_alloc_controller(struct device *host, > unsigned int size, bool slave); > @@ -753,6 +777,35 @@ extern void spi_res_release(struct spi_controller *ctlr, > * @transfer_list: transfers are sequenced through @spi_message.transfers > * @tx_sg: Scatterlist for transmit, currently not for client use > * @rx_sg: Scatterlist for receive, currently not for client use > + * @ptp_sts_word_pre: The word (subject to bits_per_word semantics) offset > + * within @tx_buf for which the SPI device is requesting that the time > + * snapshot for this transfer begins. Upon completing the SPI transfer, > + * this value may have changed compared to what was requested, depending > + * on the available snapshotting resolution (DMA transfer, > + * @ptp_sts_supported is false, etc). > + * @ptp_sts_word_post: See @ptp_sts_word_post. The two can be equal (meaning > + * that a single byte should be snapshotted). > + * If the core takes care of the timestamp (if @ptp_sts_supported is false > + * for this controller), it will set @ptp_sts_word_pre to 0, and > + * @ptp_sts_word_post to the length of the transfer. This is done > + * purposefully (instead of setting to spi_transfer->len - 1) to denote > + * that a transfer-level snapshot taken from within the driver may still > + * be of higher quality. > + * @ptp_sts: Pointer to a memory location held by the SPI slave device where a > + * PTP system timestamp structure may lie. If drivers use PIO or their > + * hardware has some sort of assist for retrieving exact transfer timing, > + * they can (and should) assert @ptp_sts_supported and populate this > + * structure using the ptp_read_system_*ts helper functions. > + * The timestamp must represent the time at which the SPI slave device has > + * processed the word, i.e. the "pre" timestamp should be taken before > + * transmitting the "pre" word, and the "post" timestamp after receiving > + * transmit confirmation from the controller for the "post" word. > + * @timestamped_pre: Set by the SPI controller driver to denote it has acted > + * upon the @ptp_sts request. Not set when the SPI core has taken care of > + * the task. SPI device drivers are free to print a warning if this comes > + * back unset and they need the better resolution. > + * @timestamped_post: See above. The reason why both exist is that these > + * booleans are also used to keep state in the core SPI logic. > * > * SPI transfers always write the same number of bytes as they read. > * Protocol drivers should always provide @rx_buf and/or @tx_buf. > @@ -842,6 +895,14 @@ struct spi_transfer { > > u32 effective_speed_hz; > > + unsigned int ptp_sts_word_pre; > + unsigned int ptp_sts_word_post; > + > + struct ptp_system_timestamp *ptp_sts; > + > + bool timestamped_pre; > + bool timestamped_post; > + > struct list_head transfer_list; > }; > > -- > 2.20.1 > ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Applied "spi: Add a PTP system timestamp to the transfer structure" to the spi tree 2019-10-08 12:58 ` Vladimir Oltean @ 2019-10-08 16:42 ` Mark Brown 2019-10-09 22:13 ` Jakub Kicinski 0 siblings, 1 reply; 21+ messages in thread From: Mark Brown @ 2019-10-08 16:42 UTC (permalink / raw) To: Vladimir Oltean Cc: David S. Miller, Andrew Lunn, Florian Fainelli, Hubert Feurstein, linux-spi, Miroslav Lichvar, netdev, Richard Cochran [-- Attachment #1: Type: text/plain, Size: 1452 bytes --] On Tue, Oct 08, 2019 at 03:58:51PM +0300, Vladimir Oltean wrote: > Dave, do you think you can somehow integrate this patch into net-next > as well, so that I can send some further patches that depend on the > newly introduced ptp_sts member of struct spi_transfer without waiting > for another kernel release? Ugh, it'd have been good to have been more aware of this before applying things since I put them on the one development branch (I used to make more topic branches but Linus doesn't like them). I've pulled things out into a branch with a signed tag for merging into other trees: The following changes since commit 54ecb8f7028c5eb3d740bb82b0f1d90f2df63c5c: Linux 5.4-rc1 (2019-09-30 10:35:40 -0700) are available in the Git repository at: https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git tags/spi-ptp-api for you to fetch changes up to 79591b7db21d255db158afaa48c557dcab631a1c: spi: Add a PTP system timestamp to the transfer structure (2019-10-08 17:38:15 +0100) ---------------------------------------------------------------- spi: Add a PTP API For detailed timestamping of operations. ---------------------------------------------------------------- Vladimir Oltean (1): spi: Add a PTP system timestamp to the transfer structure drivers/spi/spi.c | 127 ++++++++++++++++++++++++++++++++++++++++++++++++ include/linux/spi/spi.h | 61 +++++++++++++++++++++++ 2 files changed, 188 insertions(+) [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Applied "spi: Add a PTP system timestamp to the transfer structure" to the spi tree 2019-10-08 16:42 ` Mark Brown @ 2019-10-09 22:13 ` Jakub Kicinski 2019-10-09 22:57 ` Vladimir Oltean 0 siblings, 1 reply; 21+ messages in thread From: Jakub Kicinski @ 2019-10-09 22:13 UTC (permalink / raw) To: Mark Brown, Vladimir Oltean Cc: David S. Miller, Andrew Lunn, Florian Fainelli, Hubert Feurstein, linux-spi, Miroslav Lichvar, netdev, Richard Cochran On Tue, 8 Oct 2019 17:42:59 +0100, Mark Brown wrote: > On Tue, Oct 08, 2019 at 03:58:51PM +0300, Vladimir Oltean wrote: > > > Dave, do you think you can somehow integrate this patch into net-next > > as well, so that I can send some further patches that depend on the > > newly introduced ptp_sts member of struct spi_transfer without waiting > > for another kernel release? > > Ugh, it'd have been good to have been more aware of this before applying > things since I put them on the one development branch (I used to make > more topic branches but Linus doesn't like them). I've pulled things > out into a branch with a signed tag for merging into other trees: > > The following changes since commit 54ecb8f7028c5eb3d740bb82b0f1d90f2df63c5c: > > Linux 5.4-rc1 (2019-09-30 10:35:40 -0700) > > are available in the Git repository at: > > https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git tags/spi-ptp-api > > for you to fetch changes up to 79591b7db21d255db158afaa48c557dcab631a1c: > > spi: Add a PTP system timestamp to the transfer structure (2019-10-08 17:38:15 +0100) > > ---------------------------------------------------------------- > spi: Add a PTP API > > For detailed timestamping of operations. > > ---------------------------------------------------------------- > Vladimir Oltean (1): > spi: Add a PTP system timestamp to the transfer structure > > drivers/spi/spi.c | 127 ++++++++++++++++++++++++++++++++++++++++++++++++ > include/linux/spi/spi.h | 61 +++++++++++++++++++++++ > 2 files changed, 188 insertions(+) Thanks for the branch, I pulled it into net-next, it should show up once build testing is done. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Applied "spi: Add a PTP system timestamp to the transfer structure" to the spi tree 2019-10-09 22:13 ` Jakub Kicinski @ 2019-10-09 22:57 ` Vladimir Oltean 0 siblings, 0 replies; 21+ messages in thread From: Vladimir Oltean @ 2019-10-09 22:57 UTC (permalink / raw) To: Jakub Kicinski Cc: Mark Brown, David S. Miller, Andrew Lunn, Florian Fainelli, Hubert Feurstein, linux-spi, Miroslav Lichvar, netdev, Richard Cochran On Thu, 10 Oct 2019 at 01:14, Jakub Kicinski <jakub.kicinski@netronome.com> wrote: > > On Tue, 8 Oct 2019 17:42:59 +0100, Mark Brown wrote: > > On Tue, Oct 08, 2019 at 03:58:51PM +0300, Vladimir Oltean wrote: > > > > > Dave, do you think you can somehow integrate this patch into net-next > > > as well, so that I can send some further patches that depend on the > > > newly introduced ptp_sts member of struct spi_transfer without waiting > > > for another kernel release? > > > > Ugh, it'd have been good to have been more aware of this before applying > > things since I put them on the one development branch (I used to make > > more topic branches but Linus doesn't like them). I've pulled things > > out into a branch with a signed tag for merging into other trees: > > > > The following changes since commit 54ecb8f7028c5eb3d740bb82b0f1d90f2df63c5c: > > > > Linux 5.4-rc1 (2019-09-30 10:35:40 -0700) > > > > are available in the Git repository at: > > > > https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git tags/spi-ptp-api > > > > for you to fetch changes up to 79591b7db21d255db158afaa48c557dcab631a1c: > > > > spi: Add a PTP system timestamp to the transfer structure (2019-10-08 17:38:15 +0100) > > > > ---------------------------------------------------------------- > > spi: Add a PTP API > > > > For detailed timestamping of operations. > > > > ---------------------------------------------------------------- > > Vladimir Oltean (1): > > spi: Add a PTP system timestamp to the transfer structure > > > > drivers/spi/spi.c | 127 ++++++++++++++++++++++++++++++++++++++++++++++++ > > include/linux/spi/spi.h | 61 +++++++++++++++++++++++ > > 2 files changed, 188 insertions(+) > > Thanks for the branch, I pulled it into net-next, it should show up once > build testing is done. Thanks to both of you, Jakub and Mark! -Vladimir ^ permalink raw reply [flat|nested] 21+ messages in thread
* Applied "spi: Add a PTP system timestamp to the transfer structure" to the spi tree 2019-09-05 1:01 ` [PATCH v2 2/4] spi: Add a PTP system timestamp to the transfer structure Vladimir Oltean @ 2019-10-08 18:09 ` Mark Brown 2019-10-08 18:09 ` Mark Brown 1 sibling, 0 replies; 21+ messages in thread From: Mark Brown @ 2019-10-08 18:09 UTC (permalink / raw) To: Vladimir Oltean Cc: andrew, broonie, f.fainelli, h.feurstein, linux-spi, Mark Brown, mlichvar, netdev, richardcochran The patch spi: Add a PTP system timestamp to the transfer structure has been applied to the spi tree at https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git All being well this means that it will be integrated into the linux-next tree (usually sometime in the next 24 hours) and sent to Linus during the next merge window (or sooner if it is a bug fix), however if problems are discovered then the patch may be dropped or reverted. You may get further e-mails resulting from automated or manual testing and review of the tree, please engage with people reporting problems and send followup patches addressing any issues that are reported if needed. If any updates are required or you are submitting further changes they should be sent as incremental updates against current git, existing patches will not be replaced. Please add any relevant lists and maintainers to the CCs when replying to this mail. Thanks, Mark From 79591b7db21d255db158afaa48c557dcab631a1c Mon Sep 17 00:00:00 2001 From: Vladimir Oltean <olteanv@gmail.com> Date: Thu, 5 Sep 2019 04:01:12 +0300 Subject: [PATCH] spi: Add a PTP system timestamp to the transfer structure SPI is one of the interfaces used to access devices which have a POSIX clock driver (real time clocks, 1588 timers etc). The fact that the SPI bus is slow is not what the main problem is, but rather the fact that drivers don't take a constant amount of time in transferring data over SPI. When there is a high delay in the readout of time, there will be uncertainty in the value that has been read out of the peripheral. When that delay is constant, the uncertainty can at least be approximated with a certain accuracy which is fine more often than not. Timing jitter occurs all over in the kernel code, and is mainly caused by having to let go of the CPU for various reasons such as preemption, servicing interrupts, going to sleep, etc. Another major reason is CPU dynamic frequency scaling. It turns out that the problem of retrieving time from a SPI peripheral with high accuracy can be solved by the use of "PTP system timestamping" - a mechanism to correlate the time when the device has snapshotted its internal time counter with the Linux system time at that same moment. This is sufficient for having a precise time measurement - it is not necessary for the whole SPI transfer to be transmitted "as fast as possible", or "as low-jitter as possible". The system has to be low-jitter for a very short amount of time to be effective. This patch introduces a PTP system timestamping mechanism in struct spi_transfer. This is to be used by SPI device drivers when they need to know the exact time at which the underlying device's time was snapshotted. More often than not, SPI peripherals have a very exact timing for when their SPI-to-interconnect bridge issues a transaction for snapshotting and reading the time register, and that will be dependent on when the SPI-to-interconnect bridge figures out that this is what it should do, aka as soon as it sees byte N of the SPI transfer. Since spi_device drivers are the ones who'd know best how the peripheral behaves in this regard, expose a mechanism in spi_transfer which allows them to specify which word (or word range) from the transfer should be timestamped. Add a default implementation of the PTP system timestamping in the SPI core. This is not going to be satisfactory performance-wise, but should at least increase the likelihood that SPI device drivers will use PTP system timestamping in the future. There are 3 entry points from the core towards the SPI controller drivers: - transfer_one: The driver is passed individual spi_transfers to execute. This is the easiest to timestamp. - transfer_one_message: The core passes the driver an entire spi_message (a potential batch of spi_transfers). The core puts the same pre and post timestamp to all transfers within a message. This is not ideal, but nothing better can be done by default anyway, since the core has no insight into how the driver batches the transfers. - transfer: Like transfer_one_message, but for unqueued drivers (i.e. the driver implements its own queue scheduling). Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Link: https://lore.kernel.org/r/20190905010114.26718-3-olteanv@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org> --- drivers/spi/spi.c | 127 ++++++++++++++++++++++++++++++++++++++++ include/linux/spi/spi.h | 61 +++++++++++++++++++ 2 files changed, 188 insertions(+) diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c index f9502dbbb5c1..9bb36c32cbf9 100644 --- a/drivers/spi/spi.c +++ b/drivers/spi/spi.c @@ -1171,6 +1171,11 @@ static int spi_transfer_one_message(struct spi_controller *ctlr, spi_statistics_add_transfer_stats(statm, xfer, ctlr); spi_statistics_add_transfer_stats(stats, xfer, ctlr); + if (!ctlr->ptp_sts_supported) { + xfer->ptp_sts_word_pre = 0; + ptp_read_system_prets(xfer->ptp_sts); + } + if (xfer->tx_buf || xfer->rx_buf) { reinit_completion(&ctlr->xfer_completion); @@ -1197,6 +1202,11 @@ static int spi_transfer_one_message(struct spi_controller *ctlr, xfer->len); } + if (!ctlr->ptp_sts_supported) { + ptp_read_system_postts(xfer->ptp_sts); + xfer->ptp_sts_word_post = xfer->len; + } + trace_spi_transfer_stop(msg, xfer); if (msg->status != -EINPROGRESS) @@ -1265,6 +1275,7 @@ EXPORT_SYMBOL_GPL(spi_finalize_current_transfer); */ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread) { + struct spi_transfer *xfer; struct spi_message *msg; bool was_busy = false; unsigned long flags; @@ -1391,6 +1402,13 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread) goto out; } + if (!ctlr->ptp_sts_supported && !ctlr->transfer_one) { + list_for_each_entry(xfer, &msg->transfers, transfer_list) { + xfer->ptp_sts_word_pre = 0; + ptp_read_system_prets(xfer->ptp_sts); + } + } + ret = ctlr->transfer_one_message(ctlr, msg); if (ret) { dev_err(&ctlr->dev, @@ -1418,6 +1436,99 @@ static void spi_pump_messages(struct kthread_work *work) __spi_pump_messages(ctlr, true); } +/** + * spi_take_timestamp_pre - helper for drivers to collect the beginning of the + * TX timestamp for the requested byte from the SPI + * transfer. The frequency with which this function + * must be called (once per word, once for the whole + * transfer, once per batch of words etc) is arbitrary + * as long as the @tx buffer offset is greater than or + * equal to the requested byte at the time of the + * call. The timestamp is only taken once, at the + * first such call. It is assumed that the driver + * advances its @tx buffer pointer monotonically. + * @ctlr: Pointer to the spi_controller structure of the driver + * @xfer: Pointer to the transfer being timestamped + * @tx: Pointer to the current word within the xfer->tx_buf that the driver is + * preparing to transmit right now. + * @irqs_off: If true, will disable IRQs and preemption for the duration of the + * transfer, for less jitter in time measurement. Only compatible + * with PIO drivers. If true, must follow up with + * spi_take_timestamp_post or otherwise system will crash. + * WARNING: for fully predictable results, the CPU frequency must + * also be under control (governor). + */ +void spi_take_timestamp_pre(struct spi_controller *ctlr, + struct spi_transfer *xfer, + const void *tx, bool irqs_off) +{ + u8 bytes_per_word = DIV_ROUND_UP(xfer->bits_per_word, 8); + + if (!xfer->ptp_sts) + return; + + if (xfer->timestamped_pre) + return; + + if (tx < (xfer->tx_buf + xfer->ptp_sts_word_pre * bytes_per_word)) + return; + + /* Capture the resolution of the timestamp */ + xfer->ptp_sts_word_pre = (tx - xfer->tx_buf) / bytes_per_word; + + xfer->timestamped_pre = true; + + if (irqs_off) { + local_irq_save(ctlr->irq_flags); + preempt_disable(); + } + + ptp_read_system_prets(xfer->ptp_sts); +} +EXPORT_SYMBOL_GPL(spi_take_timestamp_pre); + +/** + * spi_take_timestamp_post - helper for drivers to collect the end of the + * TX timestamp for the requested byte from the SPI + * transfer. Can be called with an arbitrary + * frequency: only the first call where @tx exceeds + * or is equal to the requested word will be + * timestamped. + * @ctlr: Pointer to the spi_controller structure of the driver + * @xfer: Pointer to the transfer being timestamped + * @tx: Pointer to the current word within the xfer->tx_buf that the driver has + * just transmitted. + * @irqs_off: If true, will re-enable IRQs and preemption for the local CPU. + */ +void spi_take_timestamp_post(struct spi_controller *ctlr, + struct spi_transfer *xfer, + const void *tx, bool irqs_off) +{ + u8 bytes_per_word = DIV_ROUND_UP(xfer->bits_per_word, 8); + + if (!xfer->ptp_sts) + return; + + if (xfer->timestamped_post) + return; + + if (tx < (xfer->tx_buf + xfer->ptp_sts_word_post * bytes_per_word)) + return; + + ptp_read_system_postts(xfer->ptp_sts); + + if (irqs_off) { + local_irq_restore(ctlr->irq_flags); + preempt_enable(); + } + + /* Capture the resolution of the timestamp */ + xfer->ptp_sts_word_post = (tx - xfer->tx_buf) / bytes_per_word; + + xfer->timestamped_post = true; +} +EXPORT_SYMBOL_GPL(spi_take_timestamp_post); + /** * spi_set_thread_rt - set the controller to pump at realtime priority * @ctlr: controller to boost priority of @@ -1503,6 +1614,7 @@ EXPORT_SYMBOL_GPL(spi_get_next_queued_message); */ void spi_finalize_current_message(struct spi_controller *ctlr) { + struct spi_transfer *xfer; struct spi_message *mesg; unsigned long flags; int ret; @@ -1511,6 +1623,13 @@ void spi_finalize_current_message(struct spi_controller *ctlr) mesg = ctlr->cur_msg; spin_unlock_irqrestore(&ctlr->queue_lock, flags); + if (!ctlr->ptp_sts_supported && !ctlr->transfer_one) { + list_for_each_entry(xfer, &mesg->transfers, transfer_list) { + ptp_read_system_postts(xfer->ptp_sts); + xfer->ptp_sts_word_post = xfer->len; + } + } + spi_unmap_msg(ctlr, mesg); if (ctlr->cur_msg_prepared && ctlr->unprepare_message) { @@ -3273,6 +3392,7 @@ static int __spi_validate(struct spi_device *spi, struct spi_message *message) static int __spi_async(struct spi_device *spi, struct spi_message *message) { struct spi_controller *ctlr = spi->controller; + struct spi_transfer *xfer; /* * Some controllers do not support doing regular SPI transfers. Return @@ -3288,6 +3408,13 @@ static int __spi_async(struct spi_device *spi, struct spi_message *message) trace_spi_message_submit(message); + if (!ctlr->ptp_sts_supported) { + list_for_each_entry(xfer, &message->transfers, transfer_list) { + xfer->ptp_sts_word_pre = 0; + ptp_read_system_prets(xfer->ptp_sts); + } + } + return ctlr->transfer(spi, message); } diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h index af4f265d0f67..27f6b046cf92 100644 --- a/include/linux/spi/spi.h +++ b/include/linux/spi/spi.h @@ -13,6 +13,7 @@ #include <linux/completion.h> #include <linux/scatterlist.h> #include <linux/gpio/consumer.h> +#include <linux/ptp_clock_kernel.h> struct dma_chan; struct property_entry; @@ -409,6 +410,12 @@ static inline void spi_unregister_driver(struct spi_driver *sdrv) * @fw_translate_cs: If the boot firmware uses different numbering scheme * what Linux expects, this optional hook can be used to translate * between the two. + * @ptp_sts_supported: If the driver sets this to true, it must provide a + * time snapshot in @spi_transfer->ptp_sts as close as possible to the + * moment in time when @spi_transfer->ptp_sts_word_pre and + * @spi_transfer->ptp_sts_word_post were transmitted. + * If the driver does not set this, the SPI core takes the snapshot as + * close to the driver hand-over as possible. * * Each SPI controller can communicate with one or more @spi_device * children. These make a small bus, sharing MOSI, MISO and SCK signals @@ -604,6 +611,15 @@ struct spi_controller { void *dummy_tx; int (*fw_translate_cs)(struct spi_controller *ctlr, unsigned cs); + + /* + * Driver sets this field to indicate it is able to snapshot SPI + * transfers (needed e.g. for reading the time of POSIX clocks) + */ + bool ptp_sts_supported; + + /* Interrupt enable state during PTP system timestamping */ + unsigned long irq_flags; }; static inline void *spi_controller_get_devdata(struct spi_controller *ctlr) @@ -644,6 +660,14 @@ extern struct spi_message *spi_get_next_queued_message(struct spi_controller *ct extern void spi_finalize_current_message(struct spi_controller *ctlr); extern void spi_finalize_current_transfer(struct spi_controller *ctlr); +/* Helper calls for driver to timestamp transfer */ +void spi_take_timestamp_pre(struct spi_controller *ctlr, + struct spi_transfer *xfer, + const void *tx, bool irqs_off); +void spi_take_timestamp_post(struct spi_controller *ctlr, + struct spi_transfer *xfer, + const void *tx, bool irqs_off); + /* the spi driver core manages memory for the spi_controller classdev */ extern struct spi_controller *__spi_alloc_controller(struct device *host, unsigned int size, bool slave); @@ -753,6 +777,35 @@ extern void spi_res_release(struct spi_controller *ctlr, * @transfer_list: transfers are sequenced through @spi_message.transfers * @tx_sg: Scatterlist for transmit, currently not for client use * @rx_sg: Scatterlist for receive, currently not for client use + * @ptp_sts_word_pre: The word (subject to bits_per_word semantics) offset + * within @tx_buf for which the SPI device is requesting that the time + * snapshot for this transfer begins. Upon completing the SPI transfer, + * this value may have changed compared to what was requested, depending + * on the available snapshotting resolution (DMA transfer, + * @ptp_sts_supported is false, etc). + * @ptp_sts_word_post: See @ptp_sts_word_post. The two can be equal (meaning + * that a single byte should be snapshotted). + * If the core takes care of the timestamp (if @ptp_sts_supported is false + * for this controller), it will set @ptp_sts_word_pre to 0, and + * @ptp_sts_word_post to the length of the transfer. This is done + * purposefully (instead of setting to spi_transfer->len - 1) to denote + * that a transfer-level snapshot taken from within the driver may still + * be of higher quality. + * @ptp_sts: Pointer to a memory location held by the SPI slave device where a + * PTP system timestamp structure may lie. If drivers use PIO or their + * hardware has some sort of assist for retrieving exact transfer timing, + * they can (and should) assert @ptp_sts_supported and populate this + * structure using the ptp_read_system_*ts helper functions. + * The timestamp must represent the time at which the SPI slave device has + * processed the word, i.e. the "pre" timestamp should be taken before + * transmitting the "pre" word, and the "post" timestamp after receiving + * transmit confirmation from the controller for the "post" word. + * @timestamped_pre: Set by the SPI controller driver to denote it has acted + * upon the @ptp_sts request. Not set when the SPI core has taken care of + * the task. SPI device drivers are free to print a warning if this comes + * back unset and they need the better resolution. + * @timestamped_post: See above. The reason why both exist is that these + * booleans are also used to keep state in the core SPI logic. * * SPI transfers always write the same number of bytes as they read. * Protocol drivers should always provide @rx_buf and/or @tx_buf. @@ -842,6 +895,14 @@ struct spi_transfer { u32 effective_speed_hz; + unsigned int ptp_sts_word_pre; + unsigned int ptp_sts_word_post; + + struct ptp_system_timestamp *ptp_sts; + + bool timestamped_pre; + bool timestamped_post; + struct list_head transfer_list; }; -- 2.20.1 ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Applied "spi: Add a PTP system timestamp to the transfer structure" to the spi tree @ 2019-10-08 18:09 ` Mark Brown 0 siblings, 0 replies; 21+ messages in thread From: Mark Brown @ 2019-10-08 18:09 UTC (permalink / raw) To: Vladimir Oltean Cc: andrew, broonie, f.fainelli, h.feurstein, linux-spi, Mark Brown, mlichvar, netdev, richardcochran The patch spi: Add a PTP system timestamp to the transfer structure has been applied to the spi tree at https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git All being well this means that it will be integrated into the linux-next tree (usually sometime in the next 24 hours) and sent to Linus during the next merge window (or sooner if it is a bug fix), however if problems are discovered then the patch may be dropped or reverted. You may get further e-mails resulting from automated or manual testing and review of the tree, please engage with people reporting problems and send followup patches addressing any issues that are reported if needed. If any updates are required or you are submitting further changes they should be sent as incremental updates against current git, existing patches will not be replaced. Please add any relevant lists and maintainers to the CCs when replying to this mail. Thanks, Mark >From 79591b7db21d255db158afaa48c557dcab631a1c Mon Sep 17 00:00:00 2001 From: Vladimir Oltean <olteanv@gmail.com> Date: Thu, 5 Sep 2019 04:01:12 +0300 Subject: [PATCH] spi: Add a PTP system timestamp to the transfer structure SPI is one of the interfaces used to access devices which have a POSIX clock driver (real time clocks, 1588 timers etc). The fact that the SPI bus is slow is not what the main problem is, but rather the fact that drivers don't take a constant amount of time in transferring data over SPI. When there is a high delay in the readout of time, there will be uncertainty in the value that has been read out of the peripheral. When that delay is constant, the uncertainty can at least be approximated with a certain accuracy which is fine more often than not. Timing jitter occurs all over in the kernel code, and is mainly caused by having to let go of the CPU for various reasons such as preemption, servicing interrupts, going to sleep, etc. Another major reason is CPU dynamic frequency scaling. It turns out that the problem of retrieving time from a SPI peripheral with high accuracy can be solved by the use of "PTP system timestamping" - a mechanism to correlate the time when the device has snapshotted its internal time counter with the Linux system time at that same moment. This is sufficient for having a precise time measurement - it is not necessary for the whole SPI transfer to be transmitted "as fast as possible", or "as low-jitter as possible". The system has to be low-jitter for a very short amount of time to be effective. This patch introduces a PTP system timestamping mechanism in struct spi_transfer. This is to be used by SPI device drivers when they need to know the exact time at which the underlying device's time was snapshotted. More often than not, SPI peripherals have a very exact timing for when their SPI-to-interconnect bridge issues a transaction for snapshotting and reading the time register, and that will be dependent on when the SPI-to-interconnect bridge figures out that this is what it should do, aka as soon as it sees byte N of the SPI transfer. Since spi_device drivers are the ones who'd know best how the peripheral behaves in this regard, expose a mechanism in spi_transfer which allows them to specify which word (or word range) from the transfer should be timestamped. Add a default implementation of the PTP system timestamping in the SPI core. This is not going to be satisfactory performance-wise, but should at least increase the likelihood that SPI device drivers will use PTP system timestamping in the future. There are 3 entry points from the core towards the SPI controller drivers: - transfer_one: The driver is passed individual spi_transfers to execute. This is the easiest to timestamp. - transfer_one_message: The core passes the driver an entire spi_message (a potential batch of spi_transfers). The core puts the same pre and post timestamp to all transfers within a message. This is not ideal, but nothing better can be done by default anyway, since the core has no insight into how the driver batches the transfers. - transfer: Like transfer_one_message, but for unqueued drivers (i.e. the driver implements its own queue scheduling). Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Link: https://lore.kernel.org/r/20190905010114.26718-3-olteanv@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org> --- drivers/spi/spi.c | 127 ++++++++++++++++++++++++++++++++++++++++ include/linux/spi/spi.h | 61 +++++++++++++++++++ 2 files changed, 188 insertions(+) diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c index f9502dbbb5c1..9bb36c32cbf9 100644 --- a/drivers/spi/spi.c +++ b/drivers/spi/spi.c @@ -1171,6 +1171,11 @@ static int spi_transfer_one_message(struct spi_controller *ctlr, spi_statistics_add_transfer_stats(statm, xfer, ctlr); spi_statistics_add_transfer_stats(stats, xfer, ctlr); + if (!ctlr->ptp_sts_supported) { + xfer->ptp_sts_word_pre = 0; + ptp_read_system_prets(xfer->ptp_sts); + } + if (xfer->tx_buf || xfer->rx_buf) { reinit_completion(&ctlr->xfer_completion); @@ -1197,6 +1202,11 @@ static int spi_transfer_one_message(struct spi_controller *ctlr, xfer->len); } + if (!ctlr->ptp_sts_supported) { + ptp_read_system_postts(xfer->ptp_sts); + xfer->ptp_sts_word_post = xfer->len; + } + trace_spi_transfer_stop(msg, xfer); if (msg->status != -EINPROGRESS) @@ -1265,6 +1275,7 @@ EXPORT_SYMBOL_GPL(spi_finalize_current_transfer); */ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread) { + struct spi_transfer *xfer; struct spi_message *msg; bool was_busy = false; unsigned long flags; @@ -1391,6 +1402,13 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread) goto out; } + if (!ctlr->ptp_sts_supported && !ctlr->transfer_one) { + list_for_each_entry(xfer, &msg->transfers, transfer_list) { + xfer->ptp_sts_word_pre = 0; + ptp_read_system_prets(xfer->ptp_sts); + } + } + ret = ctlr->transfer_one_message(ctlr, msg); if (ret) { dev_err(&ctlr->dev, @@ -1418,6 +1436,99 @@ static void spi_pump_messages(struct kthread_work *work) __spi_pump_messages(ctlr, true); } +/** + * spi_take_timestamp_pre - helper for drivers to collect the beginning of the + * TX timestamp for the requested byte from the SPI + * transfer. The frequency with which this function + * must be called (once per word, once for the whole + * transfer, once per batch of words etc) is arbitrary + * as long as the @tx buffer offset is greater than or + * equal to the requested byte at the time of the + * call. The timestamp is only taken once, at the + * first such call. It is assumed that the driver + * advances its @tx buffer pointer monotonically. + * @ctlr: Pointer to the spi_controller structure of the driver + * @xfer: Pointer to the transfer being timestamped + * @tx: Pointer to the current word within the xfer->tx_buf that the driver is + * preparing to transmit right now. + * @irqs_off: If true, will disable IRQs and preemption for the duration of the + * transfer, for less jitter in time measurement. Only compatible + * with PIO drivers. If true, must follow up with + * spi_take_timestamp_post or otherwise system will crash. + * WARNING: for fully predictable results, the CPU frequency must + * also be under control (governor). + */ +void spi_take_timestamp_pre(struct spi_controller *ctlr, + struct spi_transfer *xfer, + const void *tx, bool irqs_off) +{ + u8 bytes_per_word = DIV_ROUND_UP(xfer->bits_per_word, 8); + + if (!xfer->ptp_sts) + return; + + if (xfer->timestamped_pre) + return; + + if (tx < (xfer->tx_buf + xfer->ptp_sts_word_pre * bytes_per_word)) + return; + + /* Capture the resolution of the timestamp */ + xfer->ptp_sts_word_pre = (tx - xfer->tx_buf) / bytes_per_word; + + xfer->timestamped_pre = true; + + if (irqs_off) { + local_irq_save(ctlr->irq_flags); + preempt_disable(); + } + + ptp_read_system_prets(xfer->ptp_sts); +} +EXPORT_SYMBOL_GPL(spi_take_timestamp_pre); + +/** + * spi_take_timestamp_post - helper for drivers to collect the end of the + * TX timestamp for the requested byte from the SPI + * transfer. Can be called with an arbitrary + * frequency: only the first call where @tx exceeds + * or is equal to the requested word will be + * timestamped. + * @ctlr: Pointer to the spi_controller structure of the driver + * @xfer: Pointer to the transfer being timestamped + * @tx: Pointer to the current word within the xfer->tx_buf that the driver has + * just transmitted. + * @irqs_off: If true, will re-enable IRQs and preemption for the local CPU. + */ +void spi_take_timestamp_post(struct spi_controller *ctlr, + struct spi_transfer *xfer, + const void *tx, bool irqs_off) +{ + u8 bytes_per_word = DIV_ROUND_UP(xfer->bits_per_word, 8); + + if (!xfer->ptp_sts) + return; + + if (xfer->timestamped_post) + return; + + if (tx < (xfer->tx_buf + xfer->ptp_sts_word_post * bytes_per_word)) + return; + + ptp_read_system_postts(xfer->ptp_sts); + + if (irqs_off) { + local_irq_restore(ctlr->irq_flags); + preempt_enable(); + } + + /* Capture the resolution of the timestamp */ + xfer->ptp_sts_word_post = (tx - xfer->tx_buf) / bytes_per_word; + + xfer->timestamped_post = true; +} +EXPORT_SYMBOL_GPL(spi_take_timestamp_post); + /** * spi_set_thread_rt - set the controller to pump at realtime priority * @ctlr: controller to boost priority of @@ -1503,6 +1614,7 @@ EXPORT_SYMBOL_GPL(spi_get_next_queued_message); */ void spi_finalize_current_message(struct spi_controller *ctlr) { + struct spi_transfer *xfer; struct spi_message *mesg; unsigned long flags; int ret; @@ -1511,6 +1623,13 @@ void spi_finalize_current_message(struct spi_controller *ctlr) mesg = ctlr->cur_msg; spin_unlock_irqrestore(&ctlr->queue_lock, flags); + if (!ctlr->ptp_sts_supported && !ctlr->transfer_one) { + list_for_each_entry(xfer, &mesg->transfers, transfer_list) { + ptp_read_system_postts(xfer->ptp_sts); + xfer->ptp_sts_word_post = xfer->len; + } + } + spi_unmap_msg(ctlr, mesg); if (ctlr->cur_msg_prepared && ctlr->unprepare_message) { @@ -3273,6 +3392,7 @@ static int __spi_validate(struct spi_device *spi, struct spi_message *message) static int __spi_async(struct spi_device *spi, struct spi_message *message) { struct spi_controller *ctlr = spi->controller; + struct spi_transfer *xfer; /* * Some controllers do not support doing regular SPI transfers. Return @@ -3288,6 +3408,13 @@ static int __spi_async(struct spi_device *spi, struct spi_message *message) trace_spi_message_submit(message); + if (!ctlr->ptp_sts_supported) { + list_for_each_entry(xfer, &message->transfers, transfer_list) { + xfer->ptp_sts_word_pre = 0; + ptp_read_system_prets(xfer->ptp_sts); + } + } + return ctlr->transfer(spi, message); } diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h index af4f265d0f67..27f6b046cf92 100644 --- a/include/linux/spi/spi.h +++ b/include/linux/spi/spi.h @@ -13,6 +13,7 @@ #include <linux/completion.h> #include <linux/scatterlist.h> #include <linux/gpio/consumer.h> +#include <linux/ptp_clock_kernel.h> struct dma_chan; struct property_entry; @@ -409,6 +410,12 @@ static inline void spi_unregister_driver(struct spi_driver *sdrv) * @fw_translate_cs: If the boot firmware uses different numbering scheme * what Linux expects, this optional hook can be used to translate * between the two. + * @ptp_sts_supported: If the driver sets this to true, it must provide a + * time snapshot in @spi_transfer->ptp_sts as close as possible to the + * moment in time when @spi_transfer->ptp_sts_word_pre and + * @spi_transfer->ptp_sts_word_post were transmitted. + * If the driver does not set this, the SPI core takes the snapshot as + * close to the driver hand-over as possible. * * Each SPI controller can communicate with one or more @spi_device * children. These make a small bus, sharing MOSI, MISO and SCK signals @@ -604,6 +611,15 @@ struct spi_controller { void *dummy_tx; int (*fw_translate_cs)(struct spi_controller *ctlr, unsigned cs); + + /* + * Driver sets this field to indicate it is able to snapshot SPI + * transfers (needed e.g. for reading the time of POSIX clocks) + */ + bool ptp_sts_supported; + + /* Interrupt enable state during PTP system timestamping */ + unsigned long irq_flags; }; static inline void *spi_controller_get_devdata(struct spi_controller *ctlr) @@ -644,6 +660,14 @@ extern struct spi_message *spi_get_next_queued_message(struct spi_controller *ct extern void spi_finalize_current_message(struct spi_controller *ctlr); extern void spi_finalize_current_transfer(struct spi_controller *ctlr); +/* Helper calls for driver to timestamp transfer */ +void spi_take_timestamp_pre(struct spi_controller *ctlr, + struct spi_transfer *xfer, + const void *tx, bool irqs_off); +void spi_take_timestamp_post(struct spi_controller *ctlr, + struct spi_transfer *xfer, + const void *tx, bool irqs_off); + /* the spi driver core manages memory for the spi_controller classdev */ extern struct spi_controller *__spi_alloc_controller(struct device *host, unsigned int size, bool slave); @@ -753,6 +777,35 @@ extern void spi_res_release(struct spi_controller *ctlr, * @transfer_list: transfers are sequenced through @spi_message.transfers * @tx_sg: Scatterlist for transmit, currently not for client use * @rx_sg: Scatterlist for receive, currently not for client use + * @ptp_sts_word_pre: The word (subject to bits_per_word semantics) offset + * within @tx_buf for which the SPI device is requesting that the time + * snapshot for this transfer begins. Upon completing the SPI transfer, + * this value may have changed compared to what was requested, depending + * on the available snapshotting resolution (DMA transfer, + * @ptp_sts_supported is false, etc). + * @ptp_sts_word_post: See @ptp_sts_word_post. The two can be equal (meaning + * that a single byte should be snapshotted). + * If the core takes care of the timestamp (if @ptp_sts_supported is false + * for this controller), it will set @ptp_sts_word_pre to 0, and + * @ptp_sts_word_post to the length of the transfer. This is done + * purposefully (instead of setting to spi_transfer->len - 1) to denote + * that a transfer-level snapshot taken from within the driver may still + * be of higher quality. + * @ptp_sts: Pointer to a memory location held by the SPI slave device where a + * PTP system timestamp structure may lie. If drivers use PIO or their + * hardware has some sort of assist for retrieving exact transfer timing, + * they can (and should) assert @ptp_sts_supported and populate this + * structure using the ptp_read_system_*ts helper functions. + * The timestamp must represent the time at which the SPI slave device has + * processed the word, i.e. the "pre" timestamp should be taken before + * transmitting the "pre" word, and the "post" timestamp after receiving + * transmit confirmation from the controller for the "post" word. + * @timestamped_pre: Set by the SPI controller driver to denote it has acted + * upon the @ptp_sts request. Not set when the SPI core has taken care of + * the task. SPI device drivers are free to print a warning if this comes + * back unset and they need the better resolution. + * @timestamped_post: See above. The reason why both exist is that these + * booleans are also used to keep state in the core SPI logic. * * SPI transfers always write the same number of bytes as they read. * Protocol drivers should always provide @rx_buf and/or @tx_buf. @@ -842,6 +895,14 @@ struct spi_transfer { u32 effective_speed_hz; + unsigned int ptp_sts_word_pre; + unsigned int ptp_sts_word_post; + + struct ptp_system_timestamp *ptp_sts; + + bool timestamped_pre; + bool timestamped_post; + struct list_head transfer_list; }; -- 2.20.1 ^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v2 3/4] spi: spi-fsl-dspi: Implement the PTP system timestamping for TCFQ mode 2019-09-05 1:01 [PATCH v2 0/4] Deterministic SPI latency with NXP DSPI driver Vladimir Oltean 2019-09-05 1:01 ` [PATCH v2 1/4] spi: Use an abbreviated pointer to ctlr->cur_msg in __spi_pump_messages Vladimir Oltean 2019-09-05 1:01 ` [PATCH v2 2/4] spi: Add a PTP system timestamp to the transfer structure Vladimir Oltean @ 2019-09-05 1:01 ` Vladimir Oltean 2019-10-08 10:52 ` Mark Brown 2019-09-05 1:01 ` [PATCH v2 4/4] spi: spi-fsl-dspi: Always use the TCFQ devices in poll mode Vladimir Oltean 2019-09-09 10:06 ` [PATCH v2 0/4] Deterministic SPI latency with NXP DSPI driver Mark Brown 4 siblings, 1 reply; 21+ messages in thread From: Vladimir Oltean @ 2019-09-05 1:01 UTC (permalink / raw) To: broonie, h.feurstein, mlichvar, richardcochran, andrew, f.fainelli Cc: linux-spi, netdev, Vladimir Oltean In this mode, the DSPI controller uses PIO to transfer word by word. In comparison, in EOQ mode the 4-word deep FIFO is being used, hence the current logic will need some adaptation for which I do not have the hardware (Coldfire) to test. It is not clear what is the timing of DMA transfers and whether timestamping in the driver brings any overall performance increase compared to regular timestamping done in the core. Short phc2sys summary after 58 minutes of running on LS1021A-TSN with interrupts disabled during the critical section: offset: min -26251 max 16416 mean -21.8672 std dev 863.416 delay: min 4720 max 57280 mean 5182.49 std dev 1607.19 lost servo lock 3 times Summary of the same phc2sys service running for 120 minutes with interrupts disabled: offset: min -378 max 381 mean -0.0083089 std dev 101.495 delay: min 4720 max 5920 mean 5129.38 std dev 154.899 lost servo lock 0 times The minimum delay (pre to post time) in nanoseconds is the same, but the maximum delay is quite a bit higher, due to interrupts getting sometimes executed and interfering with the measurement. Hence set disable_irqs whenever possible (aka when the driver runs in poll mode - otherwise it would be a contradiction in terms). Signed-off-by: Vladimir Oltean <olteanv@gmail.com> --- Changes in v2: - Adapted to the newly introduced SPI core API from 02/04. drivers/spi/spi-fsl-dspi.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c index bec758e978fb..7caea2da4397 100644 --- a/drivers/spi/spi-fsl-dspi.c +++ b/drivers/spi/spi-fsl-dspi.c @@ -129,6 +129,7 @@ enum dspi_trans_mode { struct fsl_dspi_devtype_data { enum dspi_trans_mode trans_mode; u8 max_clock_factor; + bool ptp_sts_supported; bool xspi_mode; }; @@ -140,12 +141,14 @@ static const struct fsl_dspi_devtype_data vf610_data = { static const struct fsl_dspi_devtype_data ls1021a_v1_data = { .trans_mode = DSPI_TCFQ_MODE, .max_clock_factor = 8, + .ptp_sts_supported = true, .xspi_mode = true, }; static const struct fsl_dspi_devtype_data ls2085a_data = { .trans_mode = DSPI_TCFQ_MODE, .max_clock_factor = 8, + .ptp_sts_supported = true, }; static const struct fsl_dspi_devtype_data coldfire_data = { @@ -654,6 +657,9 @@ static int dspi_rxtx(struct fsl_dspi *dspi) u16 spi_tcnt; u32 spi_tcr; + spi_take_timestamp_post(dspi->ctlr, dspi->cur_transfer, + dspi->tx - dspi->bytes_per_word, !dspi->irq); + /* Get transfer counter (in number of SPI transfers). It was * reset to 0 when transfer(s) were started. */ @@ -672,6 +678,9 @@ static int dspi_rxtx(struct fsl_dspi *dspi) /* Success! */ return 0; + spi_take_timestamp_pre(dspi->ctlr, dspi->cur_transfer, + dspi->tx, !dspi->irq); + if (trans_mode == DSPI_EOQ_MODE) dspi_eoq_write(dspi); else if (trans_mode == DSPI_TCFQ_MODE) @@ -779,6 +788,9 @@ static int dspi_transfer_one_message(struct spi_controller *ctlr, SPI_FRAME_EBITS(transfer->bits_per_word) | SPI_CTARE_DTCP(1)); + spi_take_timestamp_pre(dspi->ctlr, dspi->cur_transfer, + dspi->tx, !dspi->irq); + trans_mode = dspi->devtype_data->trans_mode; switch (trans_mode) { case DSPI_EOQ_MODE: @@ -1132,6 +1144,7 @@ static int dspi_probe(struct platform_device *pdev) init_waitqueue_head(&dspi->waitq); poll_mode: + if (dspi->devtype_data->trans_mode == DSPI_DMA_MODE) { ret = dspi_request_dma(dspi, res->start); if (ret < 0) { @@ -1143,6 +1156,8 @@ static int dspi_probe(struct platform_device *pdev) ctlr->max_speed_hz = clk_get_rate(dspi->clk) / dspi->devtype_data->max_clock_factor; + ctlr->ptp_sts_supported = dspi->devtype_data->ptp_sts_supported; + platform_set_drvdata(pdev, ctlr); ret = spi_register_controller(ctlr); -- 2.17.1 ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Applied "spi: spi-fsl-dspi: Implement the PTP system timestamping for TCFQ mode" to the spi tree 2019-09-05 1:01 ` [PATCH v2 3/4] spi: spi-fsl-dspi: Implement the PTP system timestamping for TCFQ mode Vladimir Oltean @ 2019-10-08 10:52 ` Mark Brown 0 siblings, 0 replies; 21+ messages in thread From: Mark Brown @ 2019-10-08 10:52 UTC (permalink / raw) To: Vladimir Oltean Cc: andrew, broonie, f.fainelli, h.feurstein, linux-spi, Mark Brown, mlichvar, netdev, richardcochran The patch spi: spi-fsl-dspi: Implement the PTP system timestamping for TCFQ mode has been applied to the spi tree at https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git for-5.5 All being well this means that it will be integrated into the linux-next tree (usually sometime in the next 24 hours) and sent to Linus during the next merge window (or sooner if it is a bug fix), however if problems are discovered then the patch may be dropped or reverted. You may get further e-mails resulting from automated or manual testing and review of the tree, please engage with people reporting problems and send followup patches addressing any issues that are reported if needed. If any updates are required or you are submitting further changes they should be sent as incremental updates against current git, existing patches will not be replaced. Please add any relevant lists and maintainers to the CCs when replying to this mail. Thanks, Mark From d6b71dfaeeba115dd61a7f367cf04c2d0ca77ebb Mon Sep 17 00:00:00 2001 From: Vladimir Oltean <olteanv@gmail.com> Date: Thu, 5 Sep 2019 04:01:13 +0300 Subject: [PATCH] spi: spi-fsl-dspi: Implement the PTP system timestamping for TCFQ mode In this mode, the DSPI controller uses PIO to transfer word by word. In comparison, in EOQ mode the 4-word deep FIFO is being used, hence the current logic will need some adaptation for which I do not have the hardware (Coldfire) to test. It is not clear what is the timing of DMA transfers and whether timestamping in the driver brings any overall performance increase compared to regular timestamping done in the core. Short phc2sys summary after 58 minutes of running on LS1021A-TSN with interrupts disabled during the critical section: offset: min -26251 max 16416 mean -21.8672 std dev 863.416 delay: min 4720 max 57280 mean 5182.49 std dev 1607.19 lost servo lock 3 times Summary of the same phc2sys service running for 120 minutes with interrupts disabled: offset: min -378 max 381 mean -0.0083089 std dev 101.495 delay: min 4720 max 5920 mean 5129.38 std dev 154.899 lost servo lock 0 times The minimum delay (pre to post time) in nanoseconds is the same, but the maximum delay is quite a bit higher, due to interrupts getting sometimes executed and interfering with the measurement. Hence set disable_irqs whenever possible (aka when the driver runs in poll mode - otherwise it would be a contradiction in terms). Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Link: https://lore.kernel.org/r/20190905010114.26718-4-olteanv@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org> --- drivers/spi/spi-fsl-dspi.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c index c61074502145..c0e96cc7fc51 100644 --- a/drivers/spi/spi-fsl-dspi.c +++ b/drivers/spi/spi-fsl-dspi.c @@ -129,6 +129,7 @@ enum dspi_trans_mode { struct fsl_dspi_devtype_data { enum dspi_trans_mode trans_mode; u8 max_clock_factor; + bool ptp_sts_supported; bool xspi_mode; }; @@ -140,12 +141,14 @@ static const struct fsl_dspi_devtype_data vf610_data = { static const struct fsl_dspi_devtype_data ls1021a_v1_data = { .trans_mode = DSPI_TCFQ_MODE, .max_clock_factor = 8, + .ptp_sts_supported = true, .xspi_mode = true, }; static const struct fsl_dspi_devtype_data ls2085a_data = { .trans_mode = DSPI_TCFQ_MODE, .max_clock_factor = 8, + .ptp_sts_supported = true, }; static const struct fsl_dspi_devtype_data coldfire_data = { @@ -654,6 +657,9 @@ static int dspi_rxtx(struct fsl_dspi *dspi) u16 spi_tcnt; u32 spi_tcr; + spi_take_timestamp_post(dspi->ctlr, dspi->cur_transfer, + dspi->tx - dspi->bytes_per_word, !dspi->irq); + /* Get transfer counter (in number of SPI transfers). It was * reset to 0 when transfer(s) were started. */ @@ -672,6 +678,9 @@ static int dspi_rxtx(struct fsl_dspi *dspi) /* Success! */ return 0; + spi_take_timestamp_pre(dspi->ctlr, dspi->cur_transfer, + dspi->tx, !dspi->irq); + if (trans_mode == DSPI_EOQ_MODE) dspi_eoq_write(dspi); else if (trans_mode == DSPI_TCFQ_MODE) @@ -779,6 +788,9 @@ static int dspi_transfer_one_message(struct spi_controller *ctlr, SPI_FRAME_EBITS(transfer->bits_per_word) | SPI_CTARE_DTCP(1)); + spi_take_timestamp_pre(dspi->ctlr, dspi->cur_transfer, + dspi->tx, !dspi->irq); + trans_mode = dspi->devtype_data->trans_mode; switch (trans_mode) { case DSPI_EOQ_MODE: @@ -1155,6 +1167,7 @@ static int dspi_probe(struct platform_device *pdev) init_waitqueue_head(&dspi->waitq); poll_mode: + if (dspi->devtype_data->trans_mode == DSPI_DMA_MODE) { ret = dspi_request_dma(dspi, res->start); if (ret < 0) { @@ -1166,6 +1179,8 @@ static int dspi_probe(struct platform_device *pdev) ctlr->max_speed_hz = clk_get_rate(dspi->clk) / dspi->devtype_data->max_clock_factor; + ctlr->ptp_sts_supported = dspi->devtype_data->ptp_sts_supported; + platform_set_drvdata(pdev, ctlr); ret = spi_register_controller(ctlr); -- 2.20.1 ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Applied "spi: spi-fsl-dspi: Implement the PTP system timestamping for TCFQ mode" to the spi tree @ 2019-10-08 10:52 ` Mark Brown 0 siblings, 0 replies; 21+ messages in thread From: Mark Brown @ 2019-10-08 10:52 UTC (permalink / raw) To: Vladimir Oltean Cc: andrew, broonie, f.fainelli, h.feurstein, linux-spi, Mark Brown, mlichvar, netdev, richardcochran The patch spi: spi-fsl-dspi: Implement the PTP system timestamping for TCFQ mode has been applied to the spi tree at https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git for-5.5 All being well this means that it will be integrated into the linux-next tree (usually sometime in the next 24 hours) and sent to Linus during the next merge window (or sooner if it is a bug fix), however if problems are discovered then the patch may be dropped or reverted. You may get further e-mails resulting from automated or manual testing and review of the tree, please engage with people reporting problems and send followup patches addressing any issues that are reported if needed. If any updates are required or you are submitting further changes they should be sent as incremental updates against current git, existing patches will not be replaced. Please add any relevant lists and maintainers to the CCs when replying to this mail. Thanks, Mark >From d6b71dfaeeba115dd61a7f367cf04c2d0ca77ebb Mon Sep 17 00:00:00 2001 From: Vladimir Oltean <olteanv@gmail.com> Date: Thu, 5 Sep 2019 04:01:13 +0300 Subject: [PATCH] spi: spi-fsl-dspi: Implement the PTP system timestamping for TCFQ mode In this mode, the DSPI controller uses PIO to transfer word by word. In comparison, in EOQ mode the 4-word deep FIFO is being used, hence the current logic will need some adaptation for which I do not have the hardware (Coldfire) to test. It is not clear what is the timing of DMA transfers and whether timestamping in the driver brings any overall performance increase compared to regular timestamping done in the core. Short phc2sys summary after 58 minutes of running on LS1021A-TSN with interrupts disabled during the critical section: offset: min -26251 max 16416 mean -21.8672 std dev 863.416 delay: min 4720 max 57280 mean 5182.49 std dev 1607.19 lost servo lock 3 times Summary of the same phc2sys service running for 120 minutes with interrupts disabled: offset: min -378 max 381 mean -0.0083089 std dev 101.495 delay: min 4720 max 5920 mean 5129.38 std dev 154.899 lost servo lock 0 times The minimum delay (pre to post time) in nanoseconds is the same, but the maximum delay is quite a bit higher, due to interrupts getting sometimes executed and interfering with the measurement. Hence set disable_irqs whenever possible (aka when the driver runs in poll mode - otherwise it would be a contradiction in terms). Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Link: https://lore.kernel.org/r/20190905010114.26718-4-olteanv@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org> --- drivers/spi/spi-fsl-dspi.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c index c61074502145..c0e96cc7fc51 100644 --- a/drivers/spi/spi-fsl-dspi.c +++ b/drivers/spi/spi-fsl-dspi.c @@ -129,6 +129,7 @@ enum dspi_trans_mode { struct fsl_dspi_devtype_data { enum dspi_trans_mode trans_mode; u8 max_clock_factor; + bool ptp_sts_supported; bool xspi_mode; }; @@ -140,12 +141,14 @@ static const struct fsl_dspi_devtype_data vf610_data = { static const struct fsl_dspi_devtype_data ls1021a_v1_data = { .trans_mode = DSPI_TCFQ_MODE, .max_clock_factor = 8, + .ptp_sts_supported = true, .xspi_mode = true, }; static const struct fsl_dspi_devtype_data ls2085a_data = { .trans_mode = DSPI_TCFQ_MODE, .max_clock_factor = 8, + .ptp_sts_supported = true, }; static const struct fsl_dspi_devtype_data coldfire_data = { @@ -654,6 +657,9 @@ static int dspi_rxtx(struct fsl_dspi *dspi) u16 spi_tcnt; u32 spi_tcr; + spi_take_timestamp_post(dspi->ctlr, dspi->cur_transfer, + dspi->tx - dspi->bytes_per_word, !dspi->irq); + /* Get transfer counter (in number of SPI transfers). It was * reset to 0 when transfer(s) were started. */ @@ -672,6 +678,9 @@ static int dspi_rxtx(struct fsl_dspi *dspi) /* Success! */ return 0; + spi_take_timestamp_pre(dspi->ctlr, dspi->cur_transfer, + dspi->tx, !dspi->irq); + if (trans_mode == DSPI_EOQ_MODE) dspi_eoq_write(dspi); else if (trans_mode == DSPI_TCFQ_MODE) @@ -779,6 +788,9 @@ static int dspi_transfer_one_message(struct spi_controller *ctlr, SPI_FRAME_EBITS(transfer->bits_per_word) | SPI_CTARE_DTCP(1)); + spi_take_timestamp_pre(dspi->ctlr, dspi->cur_transfer, + dspi->tx, !dspi->irq); + trans_mode = dspi->devtype_data->trans_mode; switch (trans_mode) { case DSPI_EOQ_MODE: @@ -1155,6 +1167,7 @@ static int dspi_probe(struct platform_device *pdev) init_waitqueue_head(&dspi->waitq); poll_mode: + if (dspi->devtype_data->trans_mode == DSPI_DMA_MODE) { ret = dspi_request_dma(dspi, res->start); if (ret < 0) { @@ -1166,6 +1179,8 @@ static int dspi_probe(struct platform_device *pdev) ctlr->max_speed_hz = clk_get_rate(dspi->clk) / dspi->devtype_data->max_clock_factor; + ctlr->ptp_sts_supported = dspi->devtype_data->ptp_sts_supported; + platform_set_drvdata(pdev, ctlr); ret = spi_register_controller(ctlr); -- 2.20.1 ^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v2 4/4] spi: spi-fsl-dspi: Always use the TCFQ devices in poll mode 2019-09-05 1:01 [PATCH v2 0/4] Deterministic SPI latency with NXP DSPI driver Vladimir Oltean ` (2 preceding siblings ...) 2019-09-05 1:01 ` [PATCH v2 3/4] spi: spi-fsl-dspi: Implement the PTP system timestamping for TCFQ mode Vladimir Oltean @ 2019-09-05 1:01 ` Vladimir Oltean 2019-10-08 10:52 ` Mark Brown 2019-09-09 10:06 ` [PATCH v2 0/4] Deterministic SPI latency with NXP DSPI driver Mark Brown 4 siblings, 1 reply; 21+ messages in thread From: Vladimir Oltean @ 2019-09-05 1:01 UTC (permalink / raw) To: broonie, h.feurstein, mlichvar, richardcochran, andrew, f.fainelli Cc: linux-spi, netdev, Vladimir Oltean With this patch, the "interrupts" property from the device tree bindings is ignored, even if present, if the driver runs in TCFQ mode. Switching to using the DSPI in poll mode has several distinct benefits: - With interrupts, the DSPI driver in TCFQ mode raises an IRQ after each transmitted word. There is more time wasted for the "waitq" event than for actual I/O. And the DSPI IRQ count can easily get the largest in /proc/interrupts on Freescale boards with attached SPI devices. - The SPI I/O time is both lower, and more consistently so. Attached to some Freescale devices are either PTP switches, or SPI RTCs. For reading time off of a SPI slave device, it is important that all SPI transfers take a deterministic time to complete. - In poll mode there is much less time spent by the CPU in hardirq context, which helps with the response latency of the system, and at the same time there is more control over when interrupts must be disabled (to get a precise timestamp measurement): win-win. On the LS1021A-TSN board, where the SPI device is a SJA1105 PTP switch (with a bits_per_word=8 driver), I created a "benchmark" where I read its PTP time once per second, for 120 seconds. Each "read PTP time" is a 12-byte SPI transfer. I then recorded the time before putting the first byte in the TX FIFO, and the time after reading the last byte from the RX FIFO. That is the transfer delay in nanoseconds. Interrupt mode: delay: min 125120 max 168320 mean 150286 std dev 17675.3 Poll mode: delay: min 69440 max 119040 mean 70312.9 std dev 8065.34 Both the mean latency and the standard deviation are more than 50% lower in poll mode than in interrupt mode. This is with an 'ondemand' governor on an otherwise idle system - therefore running mostly at 600 MHz out of a max of 1200 MHz. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> --- Changes in v2: - None. drivers/spi/spi-fsl-dspi.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c index 7caea2da4397..c30325faa050 100644 --- a/drivers/spi/spi-fsl-dspi.c +++ b/drivers/spi/spi-fsl-dspi.c @@ -716,7 +716,7 @@ static irqreturn_t dspi_interrupt(int irq, void *dev_id) regmap_read(dspi->regmap, SPI_SR, &spi_sr); regmap_write(dspi->regmap, SPI_SR, spi_sr); - if (!(spi_sr & (SPI_SR_EOQF | SPI_SR_TCFQF))) + if (!(spi_sr & SPI_SR_EOQF)) return IRQ_NONE; if (dspi_rxtx(dspi) == 0) { @@ -1126,6 +1126,9 @@ static int dspi_probe(struct platform_device *pdev) dspi_init(dspi); + if (dspi->devtype_data->trans_mode == DSPI_TCFQ_MODE) + goto poll_mode; + dspi->irq = platform_get_irq(pdev, 0); if (dspi->irq <= 0) { dev_info(&pdev->dev, -- 2.17.1 ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Applied "spi: spi-fsl-dspi: Always use the TCFQ devices in poll mode" to the spi tree 2019-09-05 1:01 ` [PATCH v2 4/4] spi: spi-fsl-dspi: Always use the TCFQ devices in poll mode Vladimir Oltean @ 2019-10-08 10:52 ` Mark Brown 0 siblings, 0 replies; 21+ messages in thread From: Mark Brown @ 2019-10-08 10:52 UTC (permalink / raw) To: Vladimir Oltean Cc: andrew, broonie, f.fainelli, h.feurstein, linux-spi, Mark Brown, mlichvar, netdev, richardcochran The patch spi: spi-fsl-dspi: Always use the TCFQ devices in poll mode has been applied to the spi tree at https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git for-5.4 All being well this means that it will be integrated into the linux-next tree (usually sometime in the next 24 hours) and sent to Linus during the next merge window (or sooner if it is a bug fix), however if problems are discovered then the patch may be dropped or reverted. You may get further e-mails resulting from automated or manual testing and review of the tree, please engage with people reporting problems and send followup patches addressing any issues that are reported if needed. If any updates are required or you are submitting further changes they should be sent as incremental updates against current git, existing patches will not be replaced. Please add any relevant lists and maintainers to the CCs when replying to this mail. Thanks, Mark From 5d2af8bcd4939d0f3d5061cc3b7783fd26311828 Mon Sep 17 00:00:00 2001 From: Vladimir Oltean <olteanv@gmail.com> Date: Thu, 5 Sep 2019 04:01:14 +0300 Subject: [PATCH] spi: spi-fsl-dspi: Always use the TCFQ devices in poll mode With this patch, the "interrupts" property from the device tree bindings is ignored, even if present, if the driver runs in TCFQ mode. Switching to using the DSPI in poll mode has several distinct benefits: - With interrupts, the DSPI driver in TCFQ mode raises an IRQ after each transmitted word. There is more time wasted for the "waitq" event than for actual I/O. And the DSPI IRQ count can easily get the largest in /proc/interrupts on Freescale boards with attached SPI devices. - The SPI I/O time is both lower, and more consistently so. Attached to some Freescale devices are either PTP switches, or SPI RTCs. For reading time off of a SPI slave device, it is important that all SPI transfers take a deterministic time to complete. - In poll mode there is much less time spent by the CPU in hardirq context, which helps with the response latency of the system, and at the same time there is more control over when interrupts must be disabled (to get a precise timestamp measurement): win-win. On the LS1021A-TSN board, where the SPI device is a SJA1105 PTP switch (with a bits_per_word=8 driver), I created a "benchmark" where I read its PTP time once per second, for 120 seconds. Each "read PTP time" is a 12-byte SPI transfer. I then recorded the time before putting the first byte in the TX FIFO, and the time after reading the last byte from the RX FIFO. That is the transfer delay in nanoseconds. Interrupt mode: delay: min 125120 max 168320 mean 150286 std dev 17675.3 Poll mode: delay: min 69440 max 119040 mean 70312.9 std dev 8065.34 Both the mean latency and the standard deviation are more than 50% lower in poll mode than in interrupt mode. This is with an 'ondemand' governor on an otherwise idle system - therefore running mostly at 600 MHz out of a max of 1200 MHz. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Link: https://lore.kernel.org/r/20190905010114.26718-5-olteanv@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org> --- drivers/spi/spi-fsl-dspi.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c index bec758e978fb..7bb018eb67d0 100644 --- a/drivers/spi/spi-fsl-dspi.c +++ b/drivers/spi/spi-fsl-dspi.c @@ -707,7 +707,7 @@ static irqreturn_t dspi_interrupt(int irq, void *dev_id) regmap_read(dspi->regmap, SPI_SR, &spi_sr); regmap_write(dspi->regmap, SPI_SR, spi_sr); - if (!(spi_sr & (SPI_SR_EOQF | SPI_SR_TCFQF))) + if (!(spi_sr & SPI_SR_EOQF)) return IRQ_NONE; if (dspi_rxtx(dspi) == 0) { @@ -1114,6 +1114,9 @@ static int dspi_probe(struct platform_device *pdev) dspi_init(dspi); + if (dspi->devtype_data->trans_mode == DSPI_TCFQ_MODE) + goto poll_mode; + dspi->irq = platform_get_irq(pdev, 0); if (dspi->irq <= 0) { dev_info(&pdev->dev, -- 2.20.1 ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Applied "spi: spi-fsl-dspi: Always use the TCFQ devices in poll mode" to the spi tree @ 2019-10-08 10:52 ` Mark Brown 0 siblings, 0 replies; 21+ messages in thread From: Mark Brown @ 2019-10-08 10:52 UTC (permalink / raw) To: Vladimir Oltean Cc: andrew, broonie, f.fainelli, h.feurstein, linux-spi, Mark Brown, mlichvar, netdev, richardcochran The patch spi: spi-fsl-dspi: Always use the TCFQ devices in poll mode has been applied to the spi tree at https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git for-5.4 All being well this means that it will be integrated into the linux-next tree (usually sometime in the next 24 hours) and sent to Linus during the next merge window (or sooner if it is a bug fix), however if problems are discovered then the patch may be dropped or reverted. You may get further e-mails resulting from automated or manual testing and review of the tree, please engage with people reporting problems and send followup patches addressing any issues that are reported if needed. If any updates are required or you are submitting further changes they should be sent as incremental updates against current git, existing patches will not be replaced. Please add any relevant lists and maintainers to the CCs when replying to this mail. Thanks, Mark >From 5d2af8bcd4939d0f3d5061cc3b7783fd26311828 Mon Sep 17 00:00:00 2001 From: Vladimir Oltean <olteanv@gmail.com> Date: Thu, 5 Sep 2019 04:01:14 +0300 Subject: [PATCH] spi: spi-fsl-dspi: Always use the TCFQ devices in poll mode With this patch, the "interrupts" property from the device tree bindings is ignored, even if present, if the driver runs in TCFQ mode. Switching to using the DSPI in poll mode has several distinct benefits: - With interrupts, the DSPI driver in TCFQ mode raises an IRQ after each transmitted word. There is more time wasted for the "waitq" event than for actual I/O. And the DSPI IRQ count can easily get the largest in /proc/interrupts on Freescale boards with attached SPI devices. - The SPI I/O time is both lower, and more consistently so. Attached to some Freescale devices are either PTP switches, or SPI RTCs. For reading time off of a SPI slave device, it is important that all SPI transfers take a deterministic time to complete. - In poll mode there is much less time spent by the CPU in hardirq context, which helps with the response latency of the system, and at the same time there is more control over when interrupts must be disabled (to get a precise timestamp measurement): win-win. On the LS1021A-TSN board, where the SPI device is a SJA1105 PTP switch (with a bits_per_word=8 driver), I created a "benchmark" where I read its PTP time once per second, for 120 seconds. Each "read PTP time" is a 12-byte SPI transfer. I then recorded the time before putting the first byte in the TX FIFO, and the time after reading the last byte from the RX FIFO. That is the transfer delay in nanoseconds. Interrupt mode: delay: min 125120 max 168320 mean 150286 std dev 17675.3 Poll mode: delay: min 69440 max 119040 mean 70312.9 std dev 8065.34 Both the mean latency and the standard deviation are more than 50% lower in poll mode than in interrupt mode. This is with an 'ondemand' governor on an otherwise idle system - therefore running mostly at 600 MHz out of a max of 1200 MHz. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Link: https://lore.kernel.org/r/20190905010114.26718-5-olteanv@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org> --- drivers/spi/spi-fsl-dspi.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c index bec758e978fb..7bb018eb67d0 100644 --- a/drivers/spi/spi-fsl-dspi.c +++ b/drivers/spi/spi-fsl-dspi.c @@ -707,7 +707,7 @@ static irqreturn_t dspi_interrupt(int irq, void *dev_id) regmap_read(dspi->regmap, SPI_SR, &spi_sr); regmap_write(dspi->regmap, SPI_SR, spi_sr); - if (!(spi_sr & (SPI_SR_EOQF | SPI_SR_TCFQF))) + if (!(spi_sr & SPI_SR_EOQF)) return IRQ_NONE; if (dspi_rxtx(dspi) == 0) { @@ -1114,6 +1114,9 @@ static int dspi_probe(struct platform_device *pdev) dspi_init(dspi); + if (dspi->devtype_data->trans_mode == DSPI_TCFQ_MODE) + goto poll_mode; + dspi->irq = platform_get_irq(pdev, 0); if (dspi->irq <= 0) { dev_info(&pdev->dev, -- 2.20.1 ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH v2 0/4] Deterministic SPI latency with NXP DSPI driver 2019-09-05 1:01 [PATCH v2 0/4] Deterministic SPI latency with NXP DSPI driver Vladimir Oltean ` (3 preceding siblings ...) 2019-09-05 1:01 ` [PATCH v2 4/4] spi: spi-fsl-dspi: Always use the TCFQ devices in poll mode Vladimir Oltean @ 2019-09-09 10:06 ` Mark Brown 2019-09-09 14:08 ` Vladimir Oltean 4 siblings, 1 reply; 21+ messages in thread From: Mark Brown @ 2019-09-09 10:06 UTC (permalink / raw) To: Vladimir Oltean Cc: h.feurstein, mlichvar, richardcochran, andrew, f.fainelli, linux-spi, netdev [-- Attachment #1: Type: text/plain, Size: 773 bytes --] On Thu, Sep 05, 2019 at 04:01:10AM +0300, Vladimir Oltean wrote: > This patchset proposes an interface from the SPI subsystem for > software timestamping SPI transfers. There is a default implementation > provided in the core, as well as a mechanism for SPI slave drivers to > check which byte was in fact timestamped post-facto. The patchset also > adds the first user of this interface (the NXP DSPI driver in TCFQ mode). I think this is about as good as we're going to get but we're very near the merge window now so I'll leave this until after the merge window is done in case there's more review comments before applying. I need to reread the implementation code a bit as well, it looked fine on a first scan through but it's possible I might spot something later. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 0/4] Deterministic SPI latency with NXP DSPI driver 2019-09-09 10:06 ` [PATCH v2 0/4] Deterministic SPI latency with NXP DSPI driver Mark Brown @ 2019-09-09 14:08 ` Vladimir Oltean 0 siblings, 0 replies; 21+ messages in thread From: Vladimir Oltean @ 2019-09-09 14:08 UTC (permalink / raw) To: Mark Brown Cc: h.feurstein, mlichvar, richardcochran, andrew, f.fainelli, linux-spi, netdev Hi Mark, On 09/09/2019, Mark Brown <broonie@kernel.org> wrote: > On Thu, Sep 05, 2019 at 04:01:10AM +0300, Vladimir Oltean wrote: > >> This patchset proposes an interface from the SPI subsystem for >> software timestamping SPI transfers. There is a default implementation >> provided in the core, as well as a mechanism for SPI slave drivers to >> check which byte was in fact timestamped post-facto. The patchset also >> adds the first user of this interface (the NXP DSPI driver in TCFQ mode). > > I think this is about as good as we're going to get but we're > very near the merge window now so I'll leave this until after the > merge window is done in case there's more review comments before > applying. I need to reread the implementation code a bit as > well, it looked fine on a first scan through but it's possible I > might spot something later. > There's one thing I still don't like, and that is the fact that the delay for sending one SPI word is so low, I can't actually capture it precisely with a "pre" and a "post" system clock timestamp. What actually happens is that I'm actually measuring the timing of the (too loose) CPU loop. Normally that's not bad, because the guarantee that the transfer happened between "pre" and "post" is still kept. But I'm introducing a false jitter in the delays I'm reporting ("post" - "pre") that does not actually depend upon the hardware phenomenon, but on the CPU frequency :) At maximum CPU frequency (performance governor) the reported latency is always constant, but still larger than the SPI transfer time. In fact it's constant exactly _because_ the CPU frequency is constant. When the CPU goes at lower frequencies, user space gets confused about the varying delay and my control loop doesn't keep lock as well. So in fact I wonder whether I'm using the PTP system timestamping API properly. One idea I had was to just timestamp the write to the TX FIFO, and add a constant delay based on bytes_per_word * (NSEC_PER_SEC / speed_hz). IMHO that correction should logically be applied to both "pre" and "post". Then say that the "post" is equal to the "pre". But that would mean I'm reporting a delay of zero, and losing the guarantee that the transfer actually happens between the reported "pre" and "post". On the other hand, introducing a static correction option could potentially help with the drivers that just get notified of a DMA completion. The other idea was to just push the PTP system timestamping all the way into regmap_write, and just minimize the governor effect by making sure the timestamped area of code is really short. I don't really know. ^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2019-10-09 22:57 UTC | newest] Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-09-05 1:01 [PATCH v2 0/4] Deterministic SPI latency with NXP DSPI driver Vladimir Oltean 2019-09-05 1:01 ` [PATCH v2 1/4] spi: Use an abbreviated pointer to ctlr->cur_msg in __spi_pump_messages Vladimir Oltean 2019-09-05 17:39 ` Applied "spi: Use an abbreviated pointer to ctlr->cur_msg in __spi_pump_messages" to the spi tree Mark Brown 2019-09-05 17:39 ` Mark Brown 2019-09-05 1:01 ` [PATCH v2 2/4] spi: Add a PTP system timestamp to the transfer structure Vladimir Oltean 2019-10-08 10:52 ` Applied "spi: Add a PTP system timestamp to the transfer structure" to the spi tree Mark Brown 2019-10-08 10:52 ` Mark Brown 2019-10-08 12:58 ` Vladimir Oltean 2019-10-08 16:42 ` Mark Brown 2019-10-09 22:13 ` Jakub Kicinski 2019-10-09 22:57 ` Vladimir Oltean 2019-10-08 18:09 ` Mark Brown 2019-10-08 18:09 ` Mark Brown 2019-09-05 1:01 ` [PATCH v2 3/4] spi: spi-fsl-dspi: Implement the PTP system timestamping for TCFQ mode Vladimir Oltean 2019-10-08 10:52 ` Applied "spi: spi-fsl-dspi: Implement the PTP system timestamping for TCFQ mode" to the spi tree Mark Brown 2019-10-08 10:52 ` Mark Brown 2019-09-05 1:01 ` [PATCH v2 4/4] spi: spi-fsl-dspi: Always use the TCFQ devices in poll mode Vladimir Oltean 2019-10-08 10:52 ` Applied "spi: spi-fsl-dspi: Always use the TCFQ devices in poll mode" to the spi tree Mark Brown 2019-10-08 10:52 ` Mark Brown 2019-09-09 10:06 ` [PATCH v2 0/4] Deterministic SPI latency with NXP DSPI driver Mark Brown 2019-09-09 14:08 ` Vladimir Oltean
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.