On Thu, 7 Jan 2021 15:35:46 +0000, Mark Brown wrote: > Copying in a bunch of people for that driver. Thanks. I added Tudor Ambarus as well as my debugging points in the direction of: commit 9326e4f1e5dd1a4410c429638d3c412b6fc17040 (spi/for-5.10) Author: Tudor Ambarus Date: Wed Dec 9 19:35:14 2020 +0200 spi: Limit the spi device max speed to controller's max speed Make sure the max_speed_hz of spi_device does not override the max_speed_hz of controller. > I'd not expect this to be required with a polled path, it's only needed > if the transfer function returns a positive value indicating that the > transfer is still in progress which shouldn't be the case when things > are polled. Oh, indeed. And the driver is indeed returning 0 in this case. > > I'll continue poking around later (especially checking computed timeout values), > > should I submit patches for s/complete/spi_finalize_current_transfer/ ? Reverting spi: Limit the spi device max speed to controller's max speed fixes the timeout. Which brings be back to devicetree: - the controller node does not define the max clock rate - the device I add in my overlay does define one So spi->max_speed_hz is initially non-zero, but spi->controller->max_speed_hz is zero, then: - if (!spi->max_speed_hz) + if (!spi->max_speed_hz || + spi->max_speed_hz > spi->controller->max_speed_hz) spi->max_speed_hz = spi->controller->max_speed_hz; it becomes zero in 5.11. Then, because I do not specify a transfer speed in my userland code: if (!xfer->speed_hz) xfer->speed_hz = spi->max_speed_hz; it gets zero instead of the (as of 5.10) 4MHz, which led to my original division-by-zero issue. What would be the correct fix ? - declare controller max clock ? If so, I think spi.c should warn about this missing value. - do not cap transfer speed to controller speed when the latter is zero ? - both ? My working out, for whatever it's worth: I ran my userland code under "strace -tt -T" and get: 00:31:21.449837 write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4000) = -1 ETIMEDOUT (Connection timed out) <0.885552> Given the previous divide-by-zero issue, spi.c is now assuming 100kHz, which for this 4000 bytes buffer means: >>> ((8*1000*4000)/100000) * 2 + 200 844.96 milliseconds, which roughly matches the call duration measured by strace. But this device is declared in my DT as supporting transfer rates up to 4MHz (so period of 250ns), which would mean an actual transfer duration of: >>> 4000*8*250e-9 0.008 assuming bytes being sent back-to-back (which I believe is the case for SPI, but worst case I do not expect this to multiply the transfer time by 100...). So there is something wrong with the transfer speed. > Probably send the completion patch, yes. Done. Regards, -- Vincent Pelletier