linux-spi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Linux SPI Slave fails to respond under high load on BeagleBone Black
@ 2020-11-06 20:11 Glenn Schmottlach
  2020-11-10  7:05 ` Vignesh Raghavendra
  0 siblings, 1 reply; 5+ messages in thread
From: Glenn Schmottlach @ 2020-11-06 20:11 UTC (permalink / raw)
  To: linux-spi; +Cc: broonie, vigneshr

I have encountered a problem utilizing the Linux SPI slave capability
between two BeagleBone Blacks (BBBs) where one is SPI master and the
other is configured as the SPI slave. The two devices are directly
wired together with short leads (2-3 inches long).

Both boards are using the same linux-ti-staging kernel:

Linux beaglebone 5.4.74-g227e67c99f #1 PREEMPT Fri Nov 6 14:47:05 UTC
2020 armv7l armv7l armv7l GNU/Linux

I believe the real issue exists in the McSPI driver
(spi-omap2-mcspi.c) and its implementation of SPI slave support. Let
me explain what I am seeing.

One the SPI master I run the following command:

spi-pipe -d /dev/spidev0.0 -s 1000000 -b 32 -n -1 < /dev/random | hexdump -vC

On the BBB configured as the SPI slave I run the same:

spi-pipe -d /dev/spidev0.0 -s 1000000 -b 32 -n -1 < /dev/random | hexdump -vC

I would expect 32 bytes of random data to be passed back and forth
between the two devices continuously. Please note, I do *not* expect
the SPI master to receive consecutive 32 byte blocks of purely random
data since there will be a "gap" in time when the SPI slave sends a 32
byte chunk/message and reloads the driver with the next chunk to send.
So the expectation is that the SPI slave will clock out 32 chunks of
random data followed by some zeros while the SPI master continues to
clock data and the next chunk is loaded on the SPI slave.

This does work briefly (2-3 transfers) until the SPI slave stops
clocking out random data and seems to get stuck waiting for the clock
(and receive data) from the SPI master. I can run the same experiment
using single invocations of the transfers indefinitely,

E.g. On each device running the same command...

spi-pipe -d /dev/spidev0.0 -s 1000000 -b 32 -n 1 < /dev/random | hexdump -vC

Where "-n 1" says to send one chunk then exit the program. The problem
occurs when I allow both to run indefinitely. I have verified with a
logic analyzer that the SPI master is properly clocking out the random
data with clocks and chip selects toggling. I do not believe this is
an issue with the SPI master.

I modified the 'spi-pipe' program so that it will *only* transmit data
(e.g. set the rx_buf for the SPI transfer structure is set to NULL) in
the ioctl(fd, SPI_IOC_MESSAGE(1), &transfer) call. When configured, I
would expect the spi-pipe program would block until I invoked spi-pipe
on the SPI master since it provides the chip select and clocks for the
bi-directional data flow. Unfortunately, the spi-pipe program on the
slave device immediately returns. I instrumented the kernel driver and
see the following sequence:

spi-pipe -d /dev/spidev0.0 -s 1000000 -b 32 -n 1 -i 2 < /dev/random
[  593.951328] omap2_mcspi_setup_transfer(931): enter
[  593.956168] omap2_mcspi_setup_transfer(1012): exit
[  593.965574] omap2_mcspi_transfer_one(1186): enter
[  593.972971] omap2_mcspi_setup_transfer(931): enter
[  593.979186] omap2_mcspi_setup_transfer(1012): exit
[  593.984008] omap2_mcspi_txrx_dma(620): enter
[  593.989569] omap2_mcspi_tx_dma(420): enter
[  593.993709] omap2_mcspi_set_dma_req(204): enter
[  594.000076] omap2_mcspi_set_dma_req(220): exit
[  594.000103] omap2_mcspi_tx_callback(405): enter
[  594.009181] omap2_mcspi_set_dma_req(204): enter
[  594.013727] omap2_mcspi_set_dma_req(220): exit
[  594.018208] omap2_mcspi_tx_callback(410): exit
[  594.026941] omap2_mcspi_tx_dma(443): exit
[  594.032559] omap2_mcspi_txrx_dma(667): start wait for dma tx completion
[  594.040999] mcspi_wait_for_completion(371): enter
[  594.045742] omap2_mcspi_set_dma_req(204): enter
[  594.052341] omap2_mcspi_set_dma_req(220): exit
[  594.056814] omap2_mcspi_setup_transfer(931): enter
[  594.063635] omap2_mcspi_setup_transfer(1012): exit
[  594.069399] omap2_mcspi_transfer_one(1299): exit
[  594.074665] omap2_mcspi_setup_transfer(931): enter
[  594.080932] omap2_mcspi_setup_transfer(1012): exit

It's clear that the driver (or ioctl() call) isn't blocking but seems
to just queue up the data and exits. I believe this might be a symptom
of a larger problem with the SPI slave implementation. Below is the
sequence I see from the SPI slave when both sides (master/slave) are
run indefinitely and the SPI slave stops responding.

[ 2383.618465] omap2_mcspi_rx_dma(541): start wait for RX completion
[ 2383.625394] mcspi_wait_for_completion(371): enter
[ 2396.370655] omap2_mcspi_irq_handler(1141): irqstat = 0x30001
[ 2396.376506] omap2_mcspi_rx_callback(391): enter
[ 2396.381069] omap2_mcspi_set_dma_req(204): enter
[ 2396.385678] omap2_mcspi_set_dma_req(220): exit
[ 2396.390160] omap2_mcspi_rx_callback(396): exit
[ 2396.403603] omap2_mcspi_set_dma_req(204): enter
[ 2396.408194] omap2_mcspi_set_dma_req(220): exit
[ 2396.412670] omap2_mcspi_txrx_dma(667): start wait for dma tx completion
[ 2396.424886] mcspi_wait_for_completion(371): enter
[ 2396.429628] omap2_mcspi_set_dma_req(204): enter
[ 2396.435543] omap2_mcspi_set_dma_req(220): exit
[ 2396.440015] omap2_mcspi_setup_transfer(931): enter
[ 2396.446107] omap2_mcspi_setup_transfer(1012): exit
[ 2396.450925] omap2_mcspi_transfer_one(1299): exit
[ 2396.457608] omap2_mcspi_transfer_one(1186): enter
00000000  52 aa d3 90 95 23 c0 aa  8f 49 e3 0b 86 a1 95 de  |R....#...I......|
00000010  85 3f 32 b4 c9 fd 57 c7  31 7d ea c6 05 bd cd 18  |.?2...W.1}......|
[ 2396.468439] omap2_mcspi_setup_transfer(931): enter
[ 2396.478826] omap2_mcspi_setup_transfer(1012): exit
[ 2396.486384] omap2_mcspi_txrx_dma(620): enter
[ 2396.490687] omap2_mcspi_irq_handler(1141): irqstat = 0x3000F
[ 2396.497536] omap2_mcspi_tx_dma(420): enter
[ 2396.501679] omap2_mcspi_set_dma_req(204): enter
[ 2396.507560] omap2_mcspi_set_dma_req(220): exit
[ 2396.512029] omap2_mcspi_tx_dma(443): exit
[ 2396.517290] omap2_mcspi_rx_dma(464): enter
[ 2396.521432] omap2_mcspi_set_dma_req(204): enter
[ 2396.527294] omap2_mcspi_set_dma_req(220): exit
[ 2396.527322] omap2_mcspi_rx_callback(391): enter
[ 2396.536380] omap2_mcspi_set_dma_req(204): enter
[ 2396.540933] omap2_mcspi_set_dma_req(220): exit
[ 2396.545405] omap2_mcspi_rx_callback(396): exit
[ 2396.552488] omap2_mcspi_rx_dma(541): start wait for RX completion
[ 2396.559512] mcspi_wait_for_completion(371): enter
[ 2396.565084] omap2_mcspi_set_dma_req(204): enter
[ 2396.569639] omap2_mcspi_set_dma_req(220): exit
[ 2396.575395] omap2_mcspi_txrx_dma(667): start wait for dma tx completion
[ 2396.582045] mcspi_wait_for_completion(371): enter

At this point spi-pipe on the slave no longer responds to the master
device and I have to ctrl-c to exit it.

I am hoping someone can explain what is going on or confirm that there
is indeed a bug in the slave implementation of spi-omap2-mcspi.c. I
hope someone might be able to suggest a fix for this issue that I
could try out. I am open to testing any suggestions.

Thanks . . .

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Linux SPI Slave fails to respond under high load on BeagleBone Black
  2020-11-06 20:11 Linux SPI Slave fails to respond under high load on BeagleBone Black Glenn Schmottlach
@ 2020-11-10  7:05 ` Vignesh Raghavendra
  2020-11-12 19:48   ` Glenn Schmottlach
  0 siblings, 1 reply; 5+ messages in thread
From: Vignesh Raghavendra @ 2020-11-10  7:05 UTC (permalink / raw)
  To: Glenn Schmottlach, linux-spi; +Cc: broonie

Hi,

On 11/7/20 1:41 AM, Glenn Schmottlach wrote:
> I have encountered a problem utilizing the Linux SPI slave capability
> between two BeagleBone Blacks (BBBs) where one is SPI master and the
> other is configured as the SPI slave. The two devices are directly
> wired together with short leads (2-3 inches long).
> 
> Both boards are using the same linux-ti-staging kernel:
> 
> Linux beaglebone 5.4.74-g227e67c99f #1 PREEMPT Fri Nov 6 14:47:05 UTC
> 2020 armv7l armv7l armv7l GNU/Linux
> 
> I believe the real issue exists in the McSPI driver
> (spi-omap2-mcspi.c) and its implementation of SPI slave support. Let
> me explain what I am seeing.
> 
> One the SPI master I run the following command:
> 
> spi-pipe -d /dev/spidev0.0 -s 1000000 -b 32 -n -1 < /dev/random | hexdump -vC
> 
> On the BBB configured as the SPI slave I run the same:
> 
> spi-pipe -d /dev/spidev0.0 -s 1000000 -b 32 -n -1 < /dev/random | hexdump -vC
> 
> I would expect 32 bytes of random data to be passed back and forth
> between the two devices continuously. Please note, I do *not* expect
> the SPI master to receive consecutive 32 byte blocks of purely random
> data since there will be a "gap" in time when the SPI slave sends a 32
> byte chunk/message and reloads the driver with the next chunk to send.
> So the expectation is that the SPI slave will clock out 32 chunks of
> random data followed by some zeros while the SPI master continues to
> clock data and the next chunk is loaded on the SPI slave.
> 

This would mean Slave TX FIFO underflows. I am not sure if McSPI IP is
robust enough to handle this case. TRM isn't particularly clear.


> This does work briefly (2-3 transfers) until the SPI slave stops
> clocking out random data and seems to get stuck waiting for the clock
> (and receive data) from the SPI master. I can run the same experiment
> using single invocations of the transfers indefinitely,
> 
> E.g. On each device running the same command...
> 
> spi-pipe -d /dev/spidev0.0 -s 1000000 -b 32 -n 1 < /dev/random | hexdump -vC
> 
> Where "-n 1" says to send one chunk then exit the program. The problem
> occurs when I allow both to run indefinitely. I have verified with a
> logic analyzer that the SPI master is properly clocking out the random
> data with clocks and chip selects toggling. I do not believe this is
> an issue with the SPI master.
> 
> I modified the 'spi-pipe' program so that it will *only* transmit data
> (e.g. set the rx_buf for the SPI transfer structure is set to NULL) in
> the ioctl(fd, SPI_IOC_MESSAGE(1), &transfer) call. When configured, I
> would expect the spi-pipe program would block until I invoked spi-pipe
> on the SPI master since it provides the chip select and clocks for the
> bi-directional data flow. Unfortunately, the spi-pipe program on the
> slave device immediately returns. I instrumented the kernel driver and
> see the following sequence:
> 
> spi-pipe -d /dev/spidev0.0 -s 1000000 -b 32 -n 1 -i 2 < /dev/random
> [  593.951328] omap2_mcspi_setup_transfer(931): enter
> [  593.956168] omap2_mcspi_setup_transfer(1012): exit
> [  593.965574] omap2_mcspi_transfer_one(1186): enter
> [  593.972971] omap2_mcspi_setup_transfer(931): enter
> [  593.979186] omap2_mcspi_setup_transfer(1012): exit
> [  593.984008] omap2_mcspi_txrx_dma(620): enter
> [  593.989569] omap2_mcspi_tx_dma(420): enter
> [  593.993709] omap2_mcspi_set_dma_req(204): enter
> [  594.000076] omap2_mcspi_set_dma_req(220): exit
> [  594.000103] omap2_mcspi_tx_callback(405): enter
> [  594.009181] omap2_mcspi_set_dma_req(204): enter
> [  594.013727] omap2_mcspi_set_dma_req(220): exit
> [  594.018208] omap2_mcspi_tx_callback(410): exit
> [  594.026941] omap2_mcspi_tx_dma(443): exit
> [  594.032559] omap2_mcspi_txrx_dma(667): start wait for dma tx completion
> [  594.040999] mcspi_wait_for_completion(371): enter
> [  594.045742] omap2_mcspi_set_dma_req(204): enter
> [  594.052341] omap2_mcspi_set_dma_req(220): exit
> [  594.056814] omap2_mcspi_setup_transfer(931): enter
> [  594.063635] omap2_mcspi_setup_transfer(1012): exit
> [  594.069399] omap2_mcspi_transfer_one(1299): exit
> [  594.074665] omap2_mcspi_setup_transfer(931): enter
> [  594.080932] omap2_mcspi_setup_transfer(1012): exit
> 

This is expected as McSPI has FIFO of 64 bytes and 32 bytes would easily
fit into it and DMA transfer would be complete but bytes would still be
in FIFO and not actually sent out.

You would see that this thread would block if you choose a transfer size
of say 128 bytes as there isn't enough room in FIFO.

> It's clear that the driver (or ioctl() call) isn't blocking but seems
> to just queue up the data and exits. I believe this might be a symptom
> of a larger problem with the SPI slave implementation. Below is the
> sequence I see from the SPI slave when both sides (master/slave) are
> run indefinitely and the SPI slave stops responding.
> 
> [ 2383.618465] omap2_mcspi_rx_dma(541): start wait for RX completion
> [ 2383.625394] mcspi_wait_for_completion(371): enter
> [ 2396.370655] omap2_mcspi_irq_handler(1141): irqstat = 0x30001
> [ 2396.376506] omap2_mcspi_rx_callback(391): enter
> [ 2396.381069] omap2_mcspi_set_dma_req(204): enter
> [ 2396.385678] omap2_mcspi_set_dma_req(220): exit
> [ 2396.390160] omap2_mcspi_rx_callback(396): exit
> [ 2396.403603] omap2_mcspi_set_dma_req(204): enter
> [ 2396.408194] omap2_mcspi_set_dma_req(220): exit
> [ 2396.412670] omap2_mcspi_txrx_dma(667): start wait for dma tx completion
> [ 2396.424886] mcspi_wait_for_completion(371): enter
> [ 2396.429628] omap2_mcspi_set_dma_req(204): enter
> [ 2396.435543] omap2_mcspi_set_dma_req(220): exit
> [ 2396.440015] omap2_mcspi_setup_transfer(931): enter
> [ 2396.446107] omap2_mcspi_setup_transfer(1012): exit
> [ 2396.450925] omap2_mcspi_transfer_one(1299): exit
> [ 2396.457608] omap2_mcspi_transfer_one(1186): enter
> 00000000  52 aa d3 90 95 23 c0 aa  8f 49 e3 0b 86 a1 95 de  |R....#...I......|
> 00000010  85 3f 32 b4 c9 fd 57 c7  31 7d ea c6 05 bd cd 18  |.?2...W.1}......|
> [ 2396.468439] omap2_mcspi_setup_transfer(931): enter
> [ 2396.478826] omap2_mcspi_setup_transfer(1012): exit
> [ 2396.486384] omap2_mcspi_txrx_dma(620): enter
> [ 2396.490687] omap2_mcspi_irq_handler(1141): irqstat = 0x3000F
> [ 2396.497536] omap2_mcspi_tx_dma(420): enter
> [ 2396.501679] omap2_mcspi_set_dma_req(204): enter
> [ 2396.507560] omap2_mcspi_set_dma_req(220): exit
> [ 2396.512029] omap2_mcspi_tx_dma(443): exit
> [ 2396.517290] omap2_mcspi_rx_dma(464): enter
> [ 2396.521432] omap2_mcspi_set_dma_req(204): enter
> [ 2396.527294] omap2_mcspi_set_dma_req(220): exit
> [ 2396.527322] omap2_mcspi_rx_callback(391): enter
> [ 2396.536380] omap2_mcspi_set_dma_req(204): enter
> [ 2396.540933] omap2_mcspi_set_dma_req(220): exit
> [ 2396.545405] omap2_mcspi_rx_callback(396): exit
> [ 2396.552488] omap2_mcspi_rx_dma(541): start wait for RX completion
> [ 2396.559512] mcspi_wait_for_completion(371): enter
> [ 2396.565084] omap2_mcspi_set_dma_req(204): enter
> [ 2396.569639] omap2_mcspi_set_dma_req(220): exit
> [ 2396.575395] omap2_mcspi_txrx_dma(667): start wait for dma tx completion
> [ 2396.582045] mcspi_wait_for_completion(371): enter
> 

At this point it looks TX DMA on slave side is still in progress and
probably is waiting for data from slave FIFO to be sent out to master.

Could you trace how much of data is left in TX FIFO using MCSPI_CHSTAT_0
and MCSPI_XFERLEVEL registers?

One possible reason may be that driver sets up  MCSPI_XFERLEVEL WCNT and
enables channel before setting up TX DMA. So if there a master that is
continuously requesting data, then there is a possibility that WCNT goes
down to 0 (due to shifting of 0s) before DMA had chance to put actual
the data into the FIFO.

Does it help if you move the programming of WCNT field to
omap2_mcspi_tx_dma()?

> At this point spi-pipe on the slave no longer responds to the master
> device and I have to ctrl-c to exit it.
> 
> I am hoping someone can explain what is going on or confirm that there
> is indeed a bug in the slave implementation of spi-omap2-mcspi.c. I
> hope someone might be able to suggest a fix for this issue that I
> could try out. I am open to testing any suggestions.
> 

Main intention of adding SPI slave support was to support limited
usecase where master and slave exchange very small known messages of
predetermined length such as to support at sparse interval (such as dead
man's switch). So tests like above would probably need more work to
McSPI driver and maybe to the core as well.

Regards
Vignesh



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Linux SPI Slave fails to respond under high load on BeagleBone Black
  2020-11-10  7:05 ` Vignesh Raghavendra
@ 2020-11-12 19:48   ` Glenn Schmottlach
  2020-11-13 15:08     ` Vignesh Raghavendra
  0 siblings, 1 reply; 5+ messages in thread
From: Glenn Schmottlach @ 2020-11-12 19:48 UTC (permalink / raw)
  To: Vignesh Raghavendra; +Cc: linux-spi, broonie

On Tue, Nov 10, 2020 at 2:06 AM Vignesh Raghavendra <vigneshr@ti.com> wrote:

> Could you trace how much of data is left in TX FIFO using MCSPI_CHSTAT_0
> and MCSPI_XFERLEVEL registers?
>

When the SPI slave is no longer responding to the SPI master the
following was observed on the SPI slave:

devmem2 0x48030130 w    (CHSTAT_0)
Read at address  0x48030130 (0xb6f5c130): 0x0000004F

So  0x4F means:
      RXFFF=1 (FIFO RX buffer is full)
      TXFFE=1 (FIFO transmit buffer is empty)
      EOT=1 (Set to 1 at end of SPI transfer)
      TXS=1 (TX register status: register is empty)
      RXS=1 (RX register status: register is full)

So it appears there is data ready to be read from the SPI master and
there is no data scheduled to be sent.

devmem2 0x4803017C w  (XFERLEVEL)
Read at address  0x4803017C (0xb6fd717c): 0x00000000

So WCNT = 0 and it appears no words are scheduled to be transferred on
the channel.

Again, the SPI slave appears to be "stuck"

Here is a function trace on the SPI slave leading up to it being "stuck"

root@beaglebone:~# spi-pipe -d /dev/spidev0.0 -s 24000000 -b 32 -n -1
< /dev/random | hexdump -vC
[ 9895.479249] omap2_mcspi_setup(1136): enter
[ 9895.483597] omap_mcspi_runtime_resume(1480): enter
[ 9895.488569] omap_mcspi_runtime_resume(1503): exit
[ 9895.499704] omap2_mcspi_setup_transfer(987): enter
[ 9895.505606] omap2_mcspi_setup_transfer(1068): exit
[ 9895.510429] omap2_mcspi_setup(1175): exit
[ 9895.522659] omap2_mcspi_set_cs(273): enter -> enable = 1
[ 9895.528052] omap2_mcspi_set_cs(304): exit
[ 9895.532392] omap2_mcspi_prepare_message(1378): enter
[ 9895.542695] omap2_mcspi_prepare_message(1396): exit
[ 9895.547606] omap2_mcspi_can_dma(1404): enter
[ 9895.551889] omap2_mcspi_can_dma(1415): exit
[ 9895.560061] omap2_mcspi_set_cs(273): enter -> enable = 0
[ 9895.566406] omap2_mcspi_set_cs(304): exit
[ 9895.570578] omap2_mcspi_transfer_one(1254): enter
[ 9895.581994] omap2_mcspi_set_enable(255): enter -> enable = 0
[ 9895.588708] omap2_mcspi_set_enable(265): exit
[ 9895.596666] omap2_mcspi_setup_transfer(987): enter
[ 9895.601536] omap2_mcspi_setup_transfer(1068): exit
[ 9895.612715] omap2_mcspi_can_dma(1404): enter
[ 9895.617007] omap2_mcspi_can_dma(1415): exit
[ 9895.621202] omap2_mcspi_set_fifo(343): enter -> enable = 1
[ 9895.635896] mcspi_bytes_per_word(214): enter
[ 9895.640190] mcspi_bytes_per_word(223): exit
[ 9895.647441] omap2_mcspi_set_fifo(376): exit
[ 9895.651646] omap2_mcspi_set_enable(255): enter -> enable = 1
[ 9895.662705] omap2_mcspi_set_enable(265): exit
[ 9895.667085] omap2_mcspi_can_dma(1404): enter
[ 9895.671368] omap2_mcspi_can_dma(1415): exit
[ 9895.681567] omap2_mcspi_txrx_dma(675): enter
[ 9895.686832] omap2_mcspi_irq_handler(1203): enter
[ 9895.691495] omap2_mcspi_irq_handler(1215): exit
[ 9895.701131] omap2_mcspi_tx_dma(472): enter
[ 9895.706728] omap2_mcspi_set_dma_req(232): enter
[ 9895.711344] omap2_mcspi_tx_callback(454): enter
[ 9895.715943] omap2_mcspi_set_dma_req(232): enter
[ 9895.720528] omap2_mcspi_set_dma_req(247): exit
[ 9895.725001] omap2_mcspi_tx_callback(459): exit
[ 9895.743431] omap2_mcspi_set_dma_req(247): exit
[ 9895.747902] omap2_mcspi_tx_dma(492): exit
[ 9895.751924] omap2_mcspi_rx_dma(512): enter
[ 9895.758134] omap2_mcspi_set_dma_req(232): enter
[ 9895.765133] omap2_mcspi_set_dma_req(247): exit
[ 9895.769599] mcspi_wait_for_completion(418): enter
[ 9906.203277] omap2_mcspi_rx_callback(440): enter
[ 9906.207876] omap2_mcspi_set_dma_req(232): enter
[ 9906.212568] omap2_mcspi_set_dma_req(247): exit
[ 9906.217114] omap2_mcspi_rx_callback(445): exit
[ 9906.232642] omap2_mcspi_set_dma_req(232): enter
[ 9906.238512] omap2_mcspi_set_dma_req(247): exit
[ 9906.253163] omap2_mcspi_rx_dma(594): exit
[ 9906.257206] mcspi_wait_for_completion(418): enter
[ 9906.261934] omap2_mcspi_set_dma_req(232): enter
[ 9906.275168] omap2_mcspi_set_dma_req(247): exit
[ 9906.279637] omap2_mcspi_txrx_dma(723): exit
[ 9906.290659] omap2_mcspi_setup_transfer(987): enter
[ 9906.296541] omap2_mcspi_setup_transfer(1068): exit
[ 9906.301357] omap2_mcspi_set_enable(255): enter -> enable = 0
[ 9906.313084] omap2_mcspi_set_enable(265): exit
[ 9906.317466] omap2_mcspi_set_fifo(343): enter -> enable = 0
[ 9906.329195] omap2_mcspi_set_fifo(389): exit
[ 9906.339782] omap2_mcspi_transfer_one(1367): exit
[ 9906.345438] omap2_mcspi_set_cs(273): enter -> enable = 1
[ 9906.350821] omap2_mcspi_set_cs(304): exit
[ 9906.361383] omap2_mcspi_can_dma(1404): enter
[ 9906.366773] omap2_mcspi_can_dma(1415): exit
[ 9906.371820] omap2_mcspi_prepare_message(1378): enter

>>> The SPI slave **does** receive/transmit the 1st 32 byte buffer from/to the SPI master here
00000000  7c 25 73 69 f2 f7 67 7a  2e 49 89 1d 15 a3 82 14  ||%si..gz.I......|
00000010  b9 6a 9d 61 84 40 fb 65  92 93 95 47 74 eb 86 70  |.j.a.@.e...Gt..p|
>>>

[ 9906.383655] omap2_mcspi_prepare_message(1396): exit
[ 9906.392478] omap2_mcspi_can_dma(1404): enter
[ 9906.400989] omap2_mcspi_can_dma(1415): exit
[ 9906.406039] omap2_mcspi_set_cs(273): enter -> enable = 0
[ 9906.411665] omap2_mcspi_set_cs(304): exit
[ 9906.422574] omap2_mcspi_transfer_one(1254): enter
[ 9906.428205] omap2_mcspi_set_enable(255): enter -> enable = 0
[ 9906.434843] omap2_mcspi_set_enable(265): exit
[ 9906.439219] omap2_mcspi_setup_transfer(987): enter
[ 9906.448176] omap2_mcspi_setup_transfer(1068): exit
[ 9906.460740] omap2_mcspi_can_dma(1404): enter
[ 9906.471370] omap2_mcspi_can_dma(1415): exit
[ 9906.475873] omap2_mcspi_set_fifo(343): enter -> enable = 1
[ 9906.481411] mcspi_bytes_per_word(214): enter
[ 9906.488991] mcspi_bytes_per_word(223): exit
[ 9906.494172] omap2_mcspi_set_fifo(376): exit
[ 9906.498376] omap2_mcspi_set_enable(255): enter -> enable = 1
[ 9906.510844] omap2_mcspi_set_enable(265): exit
[ 9906.518939] omap2_mcspi_can_dma(1404): enter
[ 9906.524126] omap2_mcspi_can_dma(1415): exit
[ 9906.528332] omap2_mcspi_txrx_dma(675): enter
[ 9906.532636] omap2_mcspi_irq_handler(1203): enter
[ 9906.537291] omap2_mcspi_irq_handler(1215): exit
[ 9906.548202] omap2_mcspi_tx_dma(472): enter
[ 9906.552342] omap2_mcspi_set_dma_req(232): enter
[ 9906.558436] omap2_mcspi_set_dma_req(247): exit
[ 9906.562903] omap2_mcspi_tx_dma(492): exit
[ 9906.573147] omap2_mcspi_rx_dma(512): enter
[ 9906.578230] omap2_mcspi_set_dma_req(232): enter
[ 9906.582834] omap2_mcspi_rx_callback(440): enter
[ 9906.587432] omap2_mcspi_set_dma_req(232): enter
[ 9906.592017] omap2_mcspi_set_dma_req(247): exit
[ 9906.596498] omap2_mcspi_rx_callback(445): exit
[ 9906.615645] omap2_mcspi_set_dma_req(247): exit
[ 9906.620115] mcspi_wait_for_completion(418): enter
[ 9906.627340] omap2_mcspi_set_dma_req(232): enter
[ 9906.631940] omap2_mcspi_set_dma_req(247): exit
[ 9906.643142] omap2_mcspi_rx_dma(594): exit
[ 9906.647177] mcspi_wait_for_completion(418): enter

At this point it's waiting for a completion that appears to never come.

> One possible reason may be that driver sets up  MCSPI_XFERLEVEL WCNT and
> enables channel before setting up TX DMA. So if there a master that is
> continuously requesting data, then there is a possibility that WCNT goes
> down to 0 (due to shifting of 0s) before DMA had chance to put actual
> the data into the FIFO.
>
> Does it help if you move the programming of WCNT field to
> omap2_mcspi_tx_dma()?
>

That sounds plausible and I *attempted* to move this code but I
must've implemented it incorrectly because the driver stopped
functioning. It's likely my implementation and/or interpretation of
your suggestion is wrong. If you could provide a more explicit
description of the changes or a patch I would certainly be willing to
attempt it again.

> Main intention of adding SPI slave support was to support limited
> usecase where master and slave exchange very small known messages of
> predetermined length such as to support at sparse interval (such as dead

I was hoping to pass high-speed/bandwidth telemetry data from the SPI
slave to the SPI master. Apparently my use case may not match the one
described. With that said, I think it may still be achievable with
tweaking by someone with more experience with the McSPI driver. I
introduced a "usleep()" inside spi-pipe() just after the SPI transfer
on the SPI master application. With a long enough delay between
transfers (~2-10 msec) the SPI master/slave succeeds indefinitely and
the SPI slave never gets "stuck". Unfortunately, this kills my
telemetry throughput. So, there is likely some kind of race condition
in the SPI slave controller code.

Any additional suggestions or patches to try would be very
appreciated. I think the potential to work reliably under load is
there but it still needs a few changes. Hopefully these are small
changes, localized to a single file, and don't require a big refactor.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Linux SPI Slave fails to respond under high load on BeagleBone Black
  2020-11-12 19:48   ` Glenn Schmottlach
@ 2020-11-13 15:08     ` Vignesh Raghavendra
  2020-11-13 20:34       ` Glenn Schmottlach
  0 siblings, 1 reply; 5+ messages in thread
From: Vignesh Raghavendra @ 2020-11-13 15:08 UTC (permalink / raw)
  To: Glenn Schmottlach; +Cc: linux-spi, broonie



On 11/13/20 1:18 AM, Glenn Schmottlach wrote:
> On Tue, Nov 10, 2020 at 2:06 AM Vignesh Raghavendra <vigneshr@ti.com> wrote:
> 
>> Could you trace how much of data is left in TX FIFO using MCSPI_CHSTAT_0
>> and MCSPI_XFERLEVEL registers?
>>
> 
> When the SPI slave is no longer responding to the SPI master the
> following was observed on the SPI slave:
> 
> devmem2 0x48030130 w    (CHSTAT_0)
> Read at address  0x48030130 (0xb6f5c130): 0x0000004F
> 
> So  0x4F means:
>       RXFFF=1 (FIFO RX buffer is full)
>       TXFFE=1 (FIFO transmit buffer is empty)
>       EOT=1 (Set to 1 at end of SPI transfer)
>       TXS=1 (TX register status: register is empty)
>       RXS=1 (RX register status: register is full)
> 
> So it appears there is data ready to be read from the SPI master and
> there is no data scheduled to be sent.
> 
> devmem2 0x4803017C w  (XFERLEVEL)
> Read at address  0x4803017C (0xb6fd717c): 0x00000000
> 
> So WCNT = 0 and it appears no words are scheduled to be transferred on
> the channel.
> 

Does the transfer resume if you manually updated WCNT field to a 
value > 32 using devmem2 when slave appears to be "stuck"? 

> Again, the SPI slave appears to be "stuck"
> 
[...]
> 
> At this point it's waiting for a completion that appears to never come.
> 
>> One possible reason may be that driver sets up  MCSPI_XFERLEVEL WCNT and
>> enables channel before setting up TX DMA. So if there a master that is
>> continuously requesting data, then there is a possibility that WCNT goes
>> down to 0 (due to shifting of 0s) before DMA had chance to put actual
>> the data into the FIFO.
>>
>> Does it help if you move the programming of WCNT field to
>> omap2_mcspi_tx_dma()?
>>
> 
> That sounds plausible and I *attempted* to move this code but I
> must've implemented it incorrectly because the driver stopped
> functioning. It's likely my implementation and/or interpretation of
> your suggestion is wrong. If you could provide a more explicit
> description of the changes or a patch I would certainly be willing to
> attempt it again.
> 
>> Main intention of adding SPI slave support was to support limited
>> usecase where master and slave exchange very small known messages of
>> predetermined length such as to support at sparse interval (such as dead
> 
> I was hoping to pass high-speed/bandwidth telemetry data from the SPI
> slave to the SPI master. Apparently my use case may not match the one
> described. With that said, I think it may still be achievable with
> tweaking by someone with more experience with the McSPI driver. I
> introduced a "usleep()" inside spi-pipe() just after the SPI transfer
> on the SPI master application. With a long enough delay between
> transfers (~2-10 msec) the SPI master/slave succeeds indefinitely and
> the SPI slave never gets "stuck". Unfortunately, this kills my
> telemetry throughput. So, there is likely some kind of race condition
> in the SPI slave controller code.
> 
> Any additional suggestions or patches to try would be very
> appreciated. I think the potential to work reliably under load is
> there but it still needs a few changes. Hopefully these are small
> changes, localized to a single file, and don't require a big refactor.
> 


Could you see if below diff helps? This delays enabling of channel
until TX DMA is queued so that WCNT does not decrement


diff --git a/drivers/spi/spi-omap2-mcspi.c b/drivers/spi/spi-omap2-mcspi.c
index d4c9510af393..bf8c6526bcd7 100644
--- a/drivers/spi/spi-omap2-mcspi.c
+++ b/drivers/spi/spi-omap2-mcspi.c
@@ -426,6 +426,8 @@ static void omap2_mcspi_tx_dma(struct spi_device *spi,
        }
        dma_async_issue_pending(mcspi_dma->dma_tx);
        omap2_mcspi_set_dma_req(spi, 0, 1);
+       if (spi_controller_is_slave(master))
+               omap2_mcspi_set_enable(spi, 1);
 }
 
 static unsigned
@@ -1194,7 +1196,9 @@ static int omap2_mcspi_transfer_one(struct spi_master *master,
                    master->can_dma(master, spi, t))
                        omap2_mcspi_set_fifo(spi, t, 1);
 
-               omap2_mcspi_set_enable(spi, 1);
+               /* For slave TX, enable after DMA is queued */
+               if (!spi_controller_is_slave(master) || !t->tx_buf)
+                       omap2_mcspi_set_enable(spi, 1);
 
                /* RX_ONLY mode needs dummy data in TX reg */
                if (t->tx_buf == NULL)

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: Linux SPI Slave fails to respond under high load on BeagleBone Black
  2020-11-13 15:08     ` Vignesh Raghavendra
@ 2020-11-13 20:34       ` Glenn Schmottlach
  0 siblings, 0 replies; 5+ messages in thread
From: Glenn Schmottlach @ 2020-11-13 20:34 UTC (permalink / raw)
  To: Vignesh Raghavendra; +Cc: linux-spi, broonie

On Fri, Nov 13, 2020 at 10:08 AM Vignesh Raghavendra <vigneshr@ti.com> wrote:

>
> Does the transfer resume if you manually updated WCNT field to a
> value > 32 using devmem2 when slave appears to be "stuck"?
>

Unfortunately, no, the slave does not become unstuck. I can see that
the WCNT field typically drops back to zero but nothing appears to
happen.

>
>
> Could you see if below diff helps? This delays enabling of channel
> until TX DMA is queued so that WCNT does not decrement
>
>
> diff --git a/drivers/spi/spi-omap2-mcspi.c b/drivers/spi/spi-omap2-mcspi.c
> index d4c9510af393..bf8c6526bcd7 100644
> --- a/drivers/spi/spi-omap2-mcspi.c
> +++ b/drivers/spi/spi-omap2-mcspi.c
> @@ -426,6 +426,8 @@ static void omap2_mcspi_tx_dma(struct spi_device *spi,
>         }
>         dma_async_issue_pending(mcspi_dma->dma_tx);
>         omap2_mcspi_set_dma_req(spi, 0, 1);
> +       if (spi_controller_is_slave(master))
> +               omap2_mcspi_set_enable(spi, 1);
>  }
>
>  static unsigned
> @@ -1194,7 +1196,9 @@ static int omap2_mcspi_transfer_one(struct spi_master *master,
>                     master->can_dma(master, spi, t))
>                         omap2_mcspi_set_fifo(spi, t, 1);
>
> -               omap2_mcspi_set_enable(spi, 1);
> +               /* For slave TX, enable after DMA is queued */
> +               if (!spi_controller_is_slave(master) || !t->tx_buf)
> +                       omap2_mcspi_set_enable(spi, 1);
>
>                 /* RX_ONLY mode needs dummy data in TX reg */
>                 if (t->tx_buf == NULL)

I made this change and initially thought it improved things but as the
clock speed is increased (> 10000 Hz) it reverts to the prior
behavior. I can almost see the SPI slave stutter/stammer along until
it finally stops responding to the SPI master. At this point WCNT == 0
and poking in a value > 32 has no effect. From my testing I can
definitely see a dramatic decline in performance (or susceptibility
for the SPI slave to get "stuck") as the clock rate increases. For my
use case, the SPI slave generates telemetry and thus discards all
incoming data from the MOSI pin. Likewise my SPI master is only
interested in the MISO data. I hacked up the spi-pipe application to
provide no TX buffer for the SPI master (thus clocking out zeros to
the SPI slave). Likewise for the SPI slave the spi-pipe program does
not provide an RX buffer since the clock is merely being used to clock
"out" telemetry data to the SPI master. Making this change offers some
improvement but it usually doesn't last long.

It certainly seems there is one (or more) race conditions. Very
rarely, a test will run indefinitely but in general it's not
repeatable. It seems that the SPI slave must be able to atomically
submit a TX buffer in order for this to work. Given your patch, it
appears it must be difficult to thread in successive TX buffers since
the DMA must be scheduled and WCNT set appropriately in order to clock
out data from the slave. Of course this all has to happen while the
SPI master randomly clocks data to/from the slave. Perhaps if I had a
better understanding of the normal program flow I could see this more
clearly. Do you have any additional suggestions I could investigate?
Certainly, this problem is easy to recreate with two development
boards (or BeagleBone Blacks in my case). Have you ever encountered it
in your testing?

Thanks,

Glenn

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2020-11-13 20:35 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-06 20:11 Linux SPI Slave fails to respond under high load on BeagleBone Black Glenn Schmottlach
2020-11-10  7:05 ` Vignesh Raghavendra
2020-11-12 19:48   ` Glenn Schmottlach
2020-11-13 15:08     ` Vignesh Raghavendra
2020-11-13 20:34       ` Glenn Schmottlach

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).