* mcp2517fd: transmit errors
@ 2020-10-09 14:16 Kurt Van Dijck
  2020-10-09 16:24 ` Marc Kleine-Budde
  0 siblings, 1 reply; 6+ messages in thread
From: Kurt Van Dijck @ 2020-10-09 14:16 UTC
  To: linux-can

Hello,

I'm now using a v5.4 kernel with a backport of 'can: mcp25xxfd: initial commit'.
Up to now I have focused on CAN receive performance, but now I face another
issue: I get errors when transmitting to CAN. It's unstable.
I still need to collect more details, but this is now my number one priority.

I managed to reduce the urgency for my project by inserting a delay
in the busiest transmitter.

Any ideas what to look for?

Kind regards,
Kurt

* Re: mcp2517fd: transmit errors
  2020-10-09 14:16 mcp2517fd: transmit errors Kurt Van Dijck
@ 2020-10-09 16:24 ` Marc Kleine-Budde
  2020-10-09 17:40   ` Kurt Van Dijck
  0 siblings, 1 reply; 6+ messages in thread
From: Marc Kleine-Budde @ 2020-10-09 16:24 UTC
  To: linux-can, Kurt Van Dijck


On 10/9/20 4:16 PM, Kurt Van Dijck wrote:
> I'm now using a v5.4 kernel with a backport of 'can: mcp25xxfd: initial commit'.
> Up to now I have focused on CAN receive performance, but now I face another
> issue: I get errors when transmitting to CAN.

What kind of errors?

> It's unstable.

What does that mean?

> I still need to collect more details, but this is now my number one priority.
>
> I managed to reduce the urgency for my project by inserting a delay
> in the busiest transmitter.
> 
> Any ideas what to look for?

The mcp2517fd suffers from the MAB TX underflow errata: See 1. in
http://ww1.microchip.com/downloads/en/DeviceDoc/MCP2517FD-External-CAN-FD-Controller-with-SPI-Interface-20005688B.pdf

Compile the driver with "#define DEBUG" or remove the
"MCP251XFD_QUIRK_MAB_NO_WARN" from the mcp251xfd_devtype_data_mcp2517fd. Then
you should see an error message when the chip switches modes due to the MAB
underrun.
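
To illustrate the second option, the sketch below models in plain userspace C
how a quirk bit can gate such a message. Only the names
MCP251XFD_QUIRK_MAB_NO_WARN and mcp251xfd_devtype_data_mcp2517fd come from the
driver; the struct layout, the bit values, the placeholder quirk and the helper
function are assumptions made for this example and are not the driver's code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions are assumptions for this sketch, not the driver's values. */
#define MCP251XFD_QUIRK_MAB_NO_WARN (1u << 0)
#define SKETCH_QUIRK_OTHER          (1u << 1) /* hypothetical placeholder */

/* Hypothetical, stripped-down stand-in for the driver's devtype data. */
struct sketch_devtype_data {
	uint32_t quirks;
};

/* Returns true if a MAB underrun should be reported as a visible message. */
bool sketch_mab_warn_enabled(const struct sketch_devtype_data *d)
{
	return !(d->quirks & MCP251XFD_QUIRK_MAB_NO_WARN);
}

/* Quirk set: the message is suppressed for this device type. */
const struct sketch_devtype_data sketch_mcp2517fd_default = {
	.quirks = MCP251XFD_QUIRK_MAB_NO_WARN | SKETCH_QUIRK_OTHER,
};

/* Quirk removed, as suggested above: the message becomes visible. */
const struct sketch_devtype_data sketch_mcp2517fd_debug = {
	.quirks = SKETCH_QUIRK_OTHER,
};
```

In the real driver the equivalent change would be dropping the flag from the
.quirks initializer of mcp251xfd_devtype_data_mcp2517fd and rebuilding.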

If it is that error, there's not much you can do; maybe optimize the SPI host
driver (or use a mcp2518fd). Which SoC are you on?

regards,
Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |


* Re: mcp2517fd: transmit errors
  2020-10-09 16:24 ` Marc Kleine-Budde
@ 2020-10-09 17:40   ` Kurt Van Dijck
  2020-10-12 12:22     ` Kurt Van Dijck
  2020-10-12 12:39     ` Marc Kleine-Budde
  0 siblings, 2 replies; 6+ messages in thread
From: Kurt Van Dijck @ 2020-10-09 17:40 UTC
  To: Marc Kleine-Budde; +Cc: linux-can

On Fri, 09 Oct 2020 18:24:02 +0200, Marc Kleine-Budde wrote:
> On 10/9/20 4:16 PM, Kurt Van Dijck wrote:
> > I'm now using a v5.4 kernel with a backport of 'can: mcp25xxfd: initial commit'.
> > Up to now I have focused on CAN receive performance, but now I face another
> > issue: I get errors when transmitting to CAN.
> 
> What kind of errors?
My first observation is that no response is received for some requests.
This is very high level; I still need to investigate whether the request is
really sent. It's like looking for a needle in a haystack.
Judging by the transmit error counter in `ip -s link show can0`, I guess
it's not sent.
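
The counter mentioned above can also be read programmatically. This is a
minimal sketch, assuming the interface is named can0 and that the kernel
exposes the usual per-interface statistics under
/sys/class/net/<ifname>/statistics/; the function names are made up for this
example.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse one decimal counter such as "42\n". Returns 0 on success, -1 on error. */
int parse_counter(const char *text, unsigned long long *out)
{
	char *end;
	unsigned long long value;

	/* Reject empty input, signs and leading whitespace up front. */
	if (text[0] < '0' || text[0] > '9')
		return -1;

	errno = 0;
	value = strtoull(text, &end, 10);
	if (errno != 0)
		return -1;
	/* Only a trailing newline is accepted after the digits. */
	if (*end != '\0' && *end != '\n')
		return -1;

	*out = value;
	return 0;
}

/* Read the TX error counter of a network interface from sysfs. */
int read_tx_errors(const char *ifname, unsigned long long *out)
{
	char path[128];
	char buf[32];
	FILE *f;
	int n;

	n = snprintf(path, sizeof(path),
		     "/sys/class/net/%s/statistics/tx_errors", ifname);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	return parse_counter(buf, out);
}
```

Calling read_tx_errors("can0", &n) before and after a burst and comparing the
two values shows whether that burst produced transmit errors.
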
> 
> > It's unstable.
> 
> What does that mean?

Each burst of more than x CAN frames produces the problem.
I still need to figure out the value of x.

I'm bringing the problem to my desk to reproduce it.

>
> > I still need to collect more details, but this is now my number one priority.
> >
> > I managed to reduce the urgency for my project by inserting a delay
> > in the busiest transmitter.
> > 
> > Any ideas what to look for?
> 
> The mcp2517fd suffers from the MAB TX underflow errata: See 1. in
> http://ww1.microchip.com/downloads/en/DeviceDoc/MCP2517FD-External-CAN-FD-Controller-with-SPI-Interface-20005688B.pdf
> 
> Compile the driver with "#define DEBUG" or remove the
> "MCP251XFD_QUIRK_MAB_NO_WARN" from the mcp251xfd_devtype_data_mcp2517fd. Then
> you should see an error message when the chip switches modes due to the MAB
> underrun.
I'll do this.

>
> If it is that error, there's not much you can do; maybe optimize the SPI host
> driver (or use a mcp2518fd). Which SoC are you on?
Still the i.MX8MM, on a Variscite board, with a suboptimal 20 MHz
oscillator, 8.5 MHz SPI speed, and 1 Mbit/s CAN.

Kurt

* Re: mcp2517fd: transmit errors
  2020-10-09 17:40   ` Kurt Van Dijck
@ 2020-10-12 12:22     ` Kurt Van Dijck
  2020-10-12 12:25       ` Marc Kleine-Budde
  2020-10-12 12:39     ` Marc Kleine-Budde
  1 sibling, 1 reply; 6+ messages in thread
From: Kurt Van Dijck @ 2020-10-12 12:22 UTC
  To: Marc Kleine-Budde, linux-can

On Fri, 09 Oct 2020 19:40:57 +0200, Kurt Van Dijck wrote:
> On Fri, 09 Oct 2020 18:24:02 +0200, Marc Kleine-Budde wrote:
> > On 10/9/20 4:16 PM, Kurt Van Dijck wrote:
> > > Any ideas what to look for?
> > 
> > The mcp2517fd suffers from the MAB TX underflow errata: See 1. in
> > http://ww1.microchip.com/downloads/en/DeviceDoc/MCP2517FD-External-CAN-FD-Controller-with-SPI-Interface-20005688B.pdf
> > 
> > Compile the driver with "#define DEBUG" or remove the
> > "MCP251XFD_QUIRK_MAB_NO_WARN" from the mcp251xfd_devtype_data_mcp2517fd. Then
> > you should see an error message when the chip switches modes due to the MAB
> > underrun.
> I'll do this.

Yep, TX MAB underflow it is.
Thanks for the suggestion.

Kind regards,
Kurt

* Re: mcp2517fd: transmit errors
  2020-10-12 12:22     ` Kurt Van Dijck
@ 2020-10-12 12:25       ` Marc Kleine-Budde
  0 siblings, 0 replies; 6+ messages in thread
From: Marc Kleine-Budde @ 2020-10-12 12:25 UTC
  To: linux-can, Kurt Van Dijck


On 10/12/20 2:22 PM, Kurt Van Dijck wrote:
> On Fri, 09 Oct 2020 19:40:57 +0200, Kurt Van Dijck wrote:
>> On Fri, 09 Oct 2020 18:24:02 +0200, Marc Kleine-Budde wrote:
>>> On 10/9/20 4:16 PM, Kurt Van Dijck wrote:
>>>> Any ideas what to look for?
>>>
>>> The mcp2517fd suffers from the MAB TX underflow errata: See 1. in
>>> http://ww1.microchip.com/downloads/en/DeviceDoc/MCP2517FD-External-CAN-FD-Controller-with-SPI-Interface-20005688B.pdf
>>>
>>> Compile the driver with "#define DEBUG" or remove the
>>> "MCP251XFD_QUIRK_MAB_NO_WARN" from the mcp251xfd_devtype_data_mcp2517fd. Then
>>> you should see an error message when the chip switches modes due to the MAB
>>> underrun.
>> I'll do this.
> 
> Yep, TX MAB underflow it is.

There's not much you can do about it, other than using a mcp2518fd. Maybe
implementing busy polling for SPI transaction completion would bring a slight
improvement here, but that imx SPI driver is a bit messy.
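
As a rough illustration of what busy polling means here, the sketch below
spins on a completion flag instead of sleeping until an interrupt arrives. It
is a userspace model with a simulated transfer; all names are hypothetical and
none of this is code from the imx SPI driver.

```c
#include <stdbool.h>

/* Simulated transfer: reports completion after a fixed number of polls. */
struct sketch_xfer {
	unsigned int polls_left;
};

/* Stand-in for reading a "transfer complete" status bit. */
bool sketch_xfer_done(struct sketch_xfer *x)
{
	if (x->polls_left == 0)
		return true;
	x->polls_left--;
	return false;
}

/*
 * Spin on the completion flag rather than waiting for an interrupt.
 * Returns the number of polls used, or -1 if max_polls was exhausted.
 */
int sketch_busy_wait(struct sketch_xfer *x, int max_polls)
{
	for (int i = 0; i < max_polls; i++) {
		if (sketch_xfer_done(x))
			return i + 1;
	}
	return -1;
}
```

The bound on the number of polls matters: without it, a transfer that never
completes would spin forever. The trade-off is CPU time spent spinning in
exchange for not depending on interrupt latency.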

Marc




* Re: mcp2517fd: transmit errors
  2020-10-09 17:40   ` Kurt Van Dijck
  2020-10-12 12:22     ` Kurt Van Dijck
@ 2020-10-12 12:39     ` Marc Kleine-Budde
  1 sibling, 0 replies; 6+ messages in thread
From: Marc Kleine-Budde @ 2020-10-12 12:39 UTC
  To: linux-can, Kurt Van Dijck


On 10/9/20 7:40 PM, Kurt Van Dijck wrote:
> On Fri, 09 Oct 2020 18:24:02 +0200, Marc Kleine-Budde wrote:
>> On 10/9/20 4:16 PM, Kurt Van Dijck wrote:
>>> I'm now using a v5.4 kernel with a backport of 'can: mcp25xxfd: initial commit'.
>>> Up to now I have focused on CAN receive performance, but now I face another
>>> issue: I get errors when transmitting to CAN.
>>
>> What kind of errors?
> My first observation is that no response is received for some requests.
> This is very high level; I still need to investigate whether the request is
> really sent. It's like looking for a needle in a haystack.
> Judging by the transmit error counter in `ip -s link show can0`, I guess
> it's not sent.

If the HW TX process of a CAN frame runs into the MAB underflow, there will be
an error on the CAN bus (stuffing, etc.).

The driver switches the HW back into normal mode and the TX is retried. So there
should be no packet loss, but some error frames.

>>> It's unstable.
>>
>> What does that mean?
>
> Each burst of more than x CAN frames produces the problem.
> I still need to figure out the value of x.

The TX routine of the driver doesn't do any aggregation of CAN frames, although
the network stack offers support for that and the hardware supports it as well.

The TX CAN frames are sent as individual SPI messages.

What the driver does aggregate, on the other hand, is the readout of the TEF (=
TX-completion) messages.

It's probably the overall system load that delays the step from the
SPI-complete interrupt to the SPI-CS deselect, so that CS stays active for too
long after the transfer and the erratum is triggered.
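
For a feeling of the time scales involved, the sketch below compares the raw
wire time of one SPI message with the duration of one CAN frame on the bus. The
8.5 MHz SPI clock and the 1 Mbit/s CAN bitrate are taken from this thread. The
18 bytes per TX SPI message are a hypothetical figure chosen for illustration,
and 111 bits is the length of a classic CAN frame with an 11-bit identifier and
8 data bytes (108 bits plus 3 bits of intermission), not counting stuff bits.
It does not model the erratum itself.

```c
/* Duration of an SPI transfer of `bytes` bytes, in microseconds. */
double spi_transfer_us(unsigned int bytes, double spi_hz)
{
	return bytes * 8.0 / spi_hz * 1e6;
}

/* Duration of `bits` bits on the CAN bus, in microseconds. */
double can_bits_us(unsigned int bits, double bitrate)
{
	return bits / bitrate * 1e6;
}
```

With these assumed numbers, spi_transfer_us(18, 8.5e6) is roughly 17 us and
can_bits_us(111, 1e6) is roughly 111 us. The raw transfer is short compared to
a frame time, which fits the guess that the extra CS-active time comes from
interrupt and scheduling latency rather than from the SPI clock alone.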

Marc


