Re: Testing two MCP2518FD's on i.MX8MM

* Re: Testing two MCP2518FD's on i.MX8MM
       [not found] <CAOMZO5CwS-cO3W148YHVYFwcL3QC8oFJfeQBb+WN=QgEPU7AsQ@mail.gmail.com>
@ 2021-06-12 15:10 ` Fabio Estevam
  2021-06-15  7:15   ` Marc Kleine-Budde
  0 siblings, 1 reply; 5+ messages in thread
From: Fabio Estevam @ 2021-06-12 15:10 UTC
  To: Marc Kleine-Budde, dev.kurt; +Cc: linux-can, kernel

[Sorry, resending. Sent HTML content by mistake]

On Sat, Jun 12, 2021 at 12:07 PM Fabio Estevam <festevam@gmail.com> wrote:
>
> Hi,
>
> I am trying to run CAN stress tests on an i.MX8MM-based board that has two mcp2518fd chips.
> I am using linux-next 20210607 and this is the ecspi dts:
> https://pastebin.com/raw/YVvuqAAc
>
> Then I launch the test script:
> ./cantest start
>
> This is the script content:
> https://pastebin.com/raw/hc8gKgUf
>
> The problem is that RX FIFO overflow happens:
>
> [  128.559485] mcp251xfd spi0.0 can0: RX-0: FIFO overflow.
> [  128.573478] mcp251xfd spi0.0 can0: RX-0: FIFO overflow.
> [  128.584787] mcp251xfd spi0.0 can0: RX-0: FIFO overflow.
>
> and also cansequence errors:
> # 2020-02-09 01:41:15:368 sequence CNT: 2779938, RX:      8    expected:  34    missing:  230    skt overfl d:    0 a:    0    delta: 230    incident: 6    seq_wrap RX: 10860     sequ_wrap_expected: 10860   overall lost: 136
> 2020-02-09 01:41:15:368 sequence CNT:      9, RX:     34    expected:   9    missing:   25    skt overfl d:    0 a:    0    delta:  25    incident: 7    seq_wrap RX: 10860     sequ_wrap_expected: 10860   overall lost: 161
> [  899.794388] mcp251xfd spi0.0 can0: RX-0: FIFO overflow.
> [  899.804780] mcp251xfd spi0.0 can0: RX-0: FIFO overflow.
> 2020-02-09 01:41:15:370 sequence CNT:     40, RX:      9    expected:  40    missing:  225    skt overfl d:    0 a:    0    delta: 225    incident: 8    seq_wrap RX: 10860     sequ_wrap_expected: 10860   overall lost: 130
> 2020-02-09 01:41:15:392 sequence CNT:    137, RX:    105    expected: 137    missing:  224    skt overfl d:    0 a:    0    delta: 224    incident: 9    seq_wrap RX: 10860     sequ_wrap_expected: 10860   overall lost: 98
> 2020-02-09 01:41:15:396 sequence CNT:    137, RX:    145    expected: 137    missing:    8    skt overfl d:    0 a:    0    delta:   8    incident: 10    seq_wrap RX: 10860     sequ_wrap_expected: 10860   overall lost: 106
> 2020-02-09 01:41:15:403 sequence CNT:    160, ERRORFRAME 20000004   00 01 00 00 00 00 00 00
> 2020-02-09 01:41:15:414 sequence CNT:    192, ERRORFRAME 20000004   00 01 00 00 00 00 00 00
> 2020-02-09 01:41:15:414 sequence CNT:    192, RX:    210    expected: 192    missing:   18    skt overfl d:    0 a:    0    delta:  18    incident: 11    seq_wrap RX: 10860     sequ_wrap_expected: 10860   overall lost: 124
> 2020-02-09 01:41:15:416 sequence CNT:    220, RX:    222    expected: 220    missing:    2    skt overfl d:    0 a:    0    delta:   2    incident: 12    seq_wrap RX: 10860     sequ_wrap_expected: 10860   overall lost: 126
>
> I have applied this series to get SPI DMA to work on i.MX8MM:
> https://patches.linaro.org/cover/417924/
>
> I have also tried SPI PIO mode instead of DMA, but it does not help.
>
> Any ideas of what can be done to improve this?
>
> Thanks!
>
> Fabio Estevam

* Re: Testing two MCP2518FD's on i.MX8MM
  2021-06-12 15:10 ` Testing two MCP2518FD's on i.MX8MM Fabio Estevam
@ 2021-06-15  7:15   ` Marc Kleine-Budde
  2021-06-21 12:24     ` Fabio Estevam
  0 siblings, 1 reply; 5+ messages in thread
From: Marc Kleine-Budde @ 2021-06-15  7:15 UTC
  To: Fabio Estevam; +Cc: linux-can

On 12.06.2021 12:10:19, Fabio Estevam wrote:
> I am trying to run CAN stress tests on an i.MX8MM-based board that
> has two mcp2518fd chips. I am using linux-next 20210607 and this is
> the ecspi dts:
> 
> https://pastebin.com/raw/YVvuqAAc
>
> Then I launch the test script:
> ./cantest start
>
> This is the script content:
> https://pastebin.com/raw/hc8gKgUf
>
> The problem is that RX FIFO overflow happens:
>
> [  128.559485] mcp251xfd spi0.0 can0: RX-0: FIFO overflow.
> [  128.573478] mcp251xfd spi0.0 can0: RX-0: FIFO overflow.
> [  128.584787] mcp251xfd spi0.0 can0: RX-0: FIFO overflow.
>
> and also cansequence errors:
> # 2020-02-09 01:41:15:368 sequence CNT: 2779938, RX:      8    expected:  34    missing:  230    skt overfl d:    0 a:    0    delta: 230    incident: 6    seq_wrap RX: 10860     sequ_wrap_expected: 10860   overall lost: 136
> 2020-02-09 01:41:15:368 sequence CNT:      9, RX:     34    expected:   9    missing:   25    skt overfl d:    0 a:    0    delta:  25    incident: 7    seq_wrap RX: 10860     sequ_wrap_expected: 10860   overall lost: 161
> [  899.794388] mcp251xfd spi0.0 can0: RX-0: FIFO overflow.
> [  899.804780] mcp251xfd spi0.0 can0: RX-0: FIFO overflow.
> 2020-02-09 01:41:15:370 sequence CNT:     40, RX:      9    expected:  40    missing:  225    skt overfl d:    0 a:    0    delta: 225    incident: 8    seq_wrap RX: 10860     sequ_wrap_expected: 10860   overall lost: 130
> 2020-02-09 01:41:15:392 sequence CNT:    137, RX:    105    expected: 137    missing:  224    skt overfl d:    0 a:    0    delta: 224    incident: 9    seq_wrap RX: 10860     sequ_wrap_expected: 10860   overall lost: 98
> 2020-02-09 01:41:15:396 sequence CNT:    137, RX:    145    expected: 137    missing:    8    skt overfl d:    0 a:    0    delta:   8    incident: 10    seq_wrap RX: 10860     sequ_wrap_expected: 10860   overall lost: 106
> 2020-02-09 01:41:15:403 sequence CNT:    160, ERRORFRAME 20000004   00 01 00 00 00 00 00 00
> 2020-02-09 01:41:15:414 sequence CNT:    192, ERRORFRAME 20000004   00 01 00 00 00 00 00 00
> 2020-02-09 01:41:15:414 sequence CNT:    192, RX:    210    expected: 192    missing:   18    skt overfl d:    0 a:    0    delta:  18    incident: 11    seq_wrap RX: 10860     sequ_wrap_expected: 10860   overall lost: 124
> 2020-02-09 01:41:15:416 sequence CNT:    220, RX:    222    expected: 220    missing:    2    skt overfl d:    0 a:    0    delta:   2    incident: 12    seq_wrap RX: 10860     sequ_wrap_expected: 10860   overall lost: 126
>
> I have applied this series to get SPI DMA to work on i.MX8MM:
> https://patches.linaro.org/cover/417924/
>
> I have also tried SPI PIO mode instead of DMA, but it does not help.
>
> Any ideas of what can be done to improve this?

The imx SPI driver has quite some overhead when it comes to small SPI
transfers. The mcp251xfd driver performs much better with the SPI IP
cores on the raspi, which have quite well optimized drivers.

Hook up a scope to the SPI clock and chip select lines of the imx:
you'll see that the time between the end of a transfer and the chip
select going inactive is longer than the SPI transfer itself.

I expect the most bang for the buck can be achieved by adding an
IRQ-less busy-polling transfer mode, which kicks in below a certain SPI
transfer length.
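
To make the idea concrete, here is a minimal Python sketch (a model, not
driver code) of the decision such a mode would make: estimate how long a
transfer spends on the wire and busy-poll when that is shorter than a
cutoff. The function names, the 20 MHz clock in the usage note and the
50 us default cutoff are illustrative assumptions, not values taken from
the imx SPI driver.

```python
def transfer_time_us(num_bytes: int, spi_hz: int) -> float:
    """Time on the wire for num_bytes at spi_hz, ignoring driver overhead."""
    return num_bytes * 8 / spi_hz * 1e6


def use_polling(num_bytes: int, spi_hz: int, cutoff_us: float = 50.0) -> bool:
    """Busy-poll short transfers; leave longer ones to the IRQ/DMA path.

    cutoff_us is a hypothetical tuning knob, not a value from any driver.
    """
    return transfer_time_us(num_bytes, spi_hz) < cutoff_us
```

With an assumed 20 MHz SPI clock, a 6-byte register access spends about
2.4 us on the wire and would be polled, while a 4096-byte transfer would
stay on the interrupt or DMA path.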

On the mcp251xfd driver side, there is some room for optimization. The
basic idea is to reduce the number of SPI transfers by combining several
reads into one transfer. This can be done in some places.
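
As an illustration of combining reads, the sketch below merges register
reads whose address ranges touch (or nearly touch) into one burst, so
that each burst costs a single SPI transfer. It is a generic Python
model, not mcp251xfd code; the addresses in the usage note and the
max_gap parameter are made up for the example.

```python
from typing import List, Tuple

Read = Tuple[int, int]  # (start address, length in bytes)


def coalesce_reads(reads: List[Read], max_gap: int = 0) -> List[Read]:
    """Merge reads that overlap or lie within max_gap bytes of each other.

    A larger max_gap trades a few wasted bytes for fewer transfers.
    """
    merged: List[Read] = []
    for start, length in sorted(reads):
        if merged:
            prev_start, prev_len = merged[-1]
            prev_end = prev_start + prev_len
            if start <= prev_end + max_gap:
                # Extend the previous burst to cover this read as well.
                new_end = max(prev_end, start + length)
                merged[-1] = (prev_start, new_end - prev_start)
                continue
        merged.append((start, length))
    return merged
```

For example, coalesce_reads([(0x10, 4), (0x14, 4), (0x40, 4)]) returns
[(0x10, 8), (0x40, 4)]: three transfers become two. Whether bridging a
gap pays off depends on the per-transfer overhead versus the cost of
clocking out the extra bytes.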

For peak loads in CAN-2.0 mode it would be interesting to make use of
the remaining RAM for a 2nd FIFO.
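
A back-of-the-envelope check of how much RAM would be left can be
sketched as below. All figures are assumptions to verify against the
MCP2518FD datasheet and the driver's actual ring layout: 2048 bytes of
message RAM, 8-byte object headers, 4-byte timestamps on TEF and RX
objects, and a layout of 8 TEF, 8 TX and 32 RX objects with 8-byte
payloads.

```python
RAM_SIZE = 2048  # bytes of message RAM (assumed)
HEADER = 8       # bytes of ID + flags per object (assumed)
TIMESTAMP = 4    # bytes per timestamp (assumed)


def ram_left(tef: int, tx: int, rx: int, payload: int = 8) -> int:
    """Bytes left after laying out the TEF, TX and RX objects."""
    tef_size = HEADER + TIMESTAMP            # TEF objects carry no payload
    tx_size = HEADER + payload
    rx_size = HEADER + TIMESTAMP + payload
    return RAM_SIZE - (tef * tef_size + tx * tx_size + rx * rx_size)


def extra_rx_objects(tef: int, tx: int, rx: int, payload: int = 8) -> int:
    """How many additional RX objects would fit in the remaining RAM."""
    return ram_left(tef, tx, rx, payload) // (HEADER + TIMESTAMP + payload)
```

Under these assumptions ram_left(8, 8, 32) is 1184 bytes, room for 59
more 20-byte RX objects, so a second RX FIFO would fit in CAN-2.0 mode.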

regards,
Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |

* Re: Testing two MCP2518FD's on i.MX8MM
  2021-06-15  7:15   ` Marc Kleine-Budde
@ 2021-06-21 12:24     ` Fabio Estevam
  2021-06-21 12:37       ` Marc Kleine-Budde
  0 siblings, 1 reply; 5+ messages in thread
From: Fabio Estevam @ 2021-06-21 12:24 UTC
  To: Marc Kleine-Budde; +Cc: linux-can, Oliver Hartkopp, Paul E . McKenney

Hi Marc,

On Tue, Jun 15, 2021 at 4:15 AM Marc Kleine-Budde <mkl@pengutronix.de> wrote:

> The imx SPI driver has quite some overhead when it comes to small SPI
> transfers. The mcp251xfd driver performs much better with the SPI IP
> cores on the raspi, which have quite well optimized drivers.
>
> Hook up a scope to the SPI clock and chip select lines of the imx:
> you'll see that the time between the end of a transfer and the chip
> select going inactive is longer than the SPI transfer itself.
>
> I expect the most bang for the buck can be achieved by adding an
> IRQ-less busy-polling transfer mode, which kicks in below a certain SPI
> transfer length.
>
> On the mcp251xfd driver side, there is some room for optimization. The
> basic idea is to reduce the number of SPI transfers by combining several
> reads into one transfer. This can be done in some places.
>
> For peak loads in CAN-2.0 mode it would be interesting to make use of
> the remaining RAM for a 2nd FIFO.

Thanks for your reply.

I do see some RCU-related errors every time the application is launched:

# ./cantest.sh start
root@verdin-imx8mm:~# interface = can1, family = 29, type = 3, proto = 1
interface = can0, family = 29, type = 3, proto = 1
interface = can1, family = 29, type = 3, proto = 1
interface = can0, family = 29, type = 3, proto = 1
[   17.484220] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
[   17.484240] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
[   17.502870] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
[   17.502912] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
[   17.524457] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
[   17.524476] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
[   17.535223] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
[   17.557284] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
[   17.557284] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
[   17.574035] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
[   17.574037] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
[   18.435652] sched: RT throttling activated

After some time:

[  292.197058] rcu: 0-....: (1 GPs behind) idle=6db/1/0x4000000000000002 softirq=2974/2975 fqs=7882
[  292.206039] (t=21003 jiffies g=1249 q=1317)
[  292.210316] Task dump for CPU 0:
[  292.213549] task:cansequence     state:R  running task     stack:    0 pid:  374 ppid:     1 flags:0x00000202
[  292.223485] Call trace:
[  292.225932]  dump_backtrace+0x0/0x1a8
[  292.229613]  show_stack+0x18/0x28
[  292.232936]  sched_show_task+0x150/0x170
[  292.236869]  dump_cpu_task+0x44/0x54
[  292.240453]  rcu_dump_cpu_stacks+0xf4/0x13c
[  292.244648]  rcu_sched_clock_irq+0x844/0xdc0
[  292.248929]  update_process_times+0x98/0xe8
[  292.253125]  tick_nohz_handler+0xac/0x110
[  292.257142]  arch_timer_handler_phys+0x34/0x48
[  292.261598]  handle_percpu_devid_irq+0x84/0x148
[  292.266138]  handle_domain_irq+0x60/0x90
[  292.270071]  gic_handle_irq+0x54/0x120
[  292.273833]  call_on_irq_stack+0x28/0x50
[  292.277767]  do_interrupt_handler+0x54/0x60
[  292.281964]  el1_interrupt+0x30/0x78
[  292.285550]  el1h_64_irq_handler+0x18/0x28
[  292.289653]  el1h_64_irq+0x78/0x7c
[  292.293061]  __audit_syscall_exit+0x8/0x238
[  292.297256]  el0_svc_common+0x60/0xd8
[  292.300927]  do_el0_svc+0x28/0x90
[  292.304249]  el0_svc+0x24/0x38
[  292.307312]  el0t_64_sync_handler+0xb0/0xb8
[  292.311502]  el0t_64_sync+0x198/0x19c
2020-02-12 19:28:40:388 sequence CNT: 1586297, RX:     93    expected: 121    missing:  228    skt overfl d:    0 a:    0    delta: 228    incident: 1    seq_wrap RX: 6196     sequ_wrap_expected: 6196   overall lost: 4294967268
2020-02-12 19:28:40:389 sequence CNT:     95, RX:    121    expected:  95    missing:   26    skt overfl d:    0 a:    0    delta:  26    incident: 2    seq_wrap RX: 6196     sequ_wrap_expected: 6196   overall lost: 4294967294
2020-02-12 19:28:40:389 sequence CNT:    125, RX:    127    expected: 125    missing:    2    skt overfl d:    0 a:    0    delta:   2    incident: 3    seq_wrap RX: 6196     sequ_wrap_expected: 6196   overall lost: 0
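
A note on reading these numbers: the three log entries above are
consistent with an 8-bit sequence counter. "missing" matches
(RX - expected) mod 256, and "overall lost" behaves like a running sum
of that delta read as a signed 8-bit value and printed as an unsigned
32-bit number, which would explain 4294967268 (2^32 - 28). The Python
sketch below reproduces the three entries under that reading. It is
inferred from the output only, not taken from the cansequence source,
and the function names are made up.

```python
def missing(rx: int, expected: int) -> int:
    """Gap between received and expected sequence number, modulo 256."""
    return (rx - expected) % 256


def as_int8(value: int) -> int:
    """Reinterpret an 8-bit value as signed."""
    return value - 256 if value >= 128 else value


def overall_lost(pairs, start: int = 0) -> list:
    """Running total of signed deltas, shown as an unsigned 32-bit value."""
    total, shown = start, []
    for rx, expected in pairs:
        total += as_int8(missing(rx, expected))
        shown.append(total % 2**32)
    return shown
```

Feeding in the (RX, expected) pairs (93, 121), (121, 95) and (127, 125)
gives missing values of 228, 26 and 2 and running totals of 4294967268,
4294967294 and 0, matching the log. Read this way, a large "missing"
value such as 228 would mean a frame arriving 28 counts early (for
example out of order) rather than 228 lost frames.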

Any ideas how these RCU errors could be fixed?

Thanks,

Fabio Estevam

* Re: Testing two MCP2518FD's on i.MX8MM
  2021-06-21 12:24     ` Fabio Estevam
@ 2021-06-21 12:37       ` Marc Kleine-Budde
  2021-06-21 13:07         ` Fabio Estevam
  0 siblings, 1 reply; 5+ messages in thread
From: Marc Kleine-Budde @ 2021-06-21 12:37 UTC
  To: Fabio Estevam; +Cc: linux-can, Oliver Hartkopp, Paul E . McKenney

On 21.06.2021 09:24:31, Fabio Estevam wrote:
> > The imx SPI driver has quite some overhead when it comes to small SPI
> > transfers. The mcp251xfd driver performs much better with the SPI IP
> > cores on the raspi, which have quite well optimized drivers.
> >
> > Hook up a scope to the SPI clock and chip select lines of the imx:
> > you'll see that the time between the end of a transfer and the chip
> > select going inactive is longer than the SPI transfer itself.
> >
> > I expect the most bang for the buck can be achieved by adding an
> > IRQ-less busy-polling transfer mode, which kicks in below a certain SPI
> > transfer length.
> >
> > On the mcp251xfd driver side, there is some room for optimization. The
> > basic idea is to reduce the number of SPI transfers by combining several
> > reads into one transfer. This can be done in some places.
> >
> > For peak loads in CAN-2.0 mode it would be interesting to make use of
> > the remaining RAM for a 2nd FIFO.
> 
> Thanks for your reply.
> 
> I do see some RCU related errors every time the application is launched:

[...]

> Any ideas how these RCU errors could be fixed?

Can you test if
https://lore.kernel.org/r/20210621123436.2897023-1-mkl@pengutronix.de
fixes your problem? We still have to check if lockdep complains...

regards,
Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |

* Re: Testing two MCP2518FD's on i.MX8MM
  2021-06-21 12:37       ` Marc Kleine-Budde
@ 2021-06-21 13:07         ` Fabio Estevam
  0 siblings, 0 replies; 5+ messages in thread
From: Fabio Estevam @ 2021-06-21 13:07 UTC
  To: Marc Kleine-Budde; +Cc: linux-can, Oliver Hartkopp, Paul E . McKenney

Hi Marc,

On Mon, Jun 21, 2021 at 9:37 AM Marc Kleine-Budde <mkl@pengutronix.de> wrote:

> Can you test if
> https://lore.kernel.org/r/20210621123436.2897023-1-mkl@pengutronix.de
> fixes your problem? We still have to check if lockdep complains...

I tested your series and no longer see the initial RCU errors after
launching the application, but it now causes a storm of cansequence
errors:

root@verdin-imx8mm:~# ./cantest.sh start
root@verdin-imx8mm:~# interface = can1, family = 29, type = 3, proto = 1
interface = can1, family = 29, type = 3, proto = 1
interface = can0, family = 29, type = 3, proto = 1
interface = can0, family = 29, type = 3, proto = 1
2020-02-12 19:36:05:161 sequence CNT:   1304, RX:     39    expected:  24    missing:   15    skt overfl d:    0 a:    0    delta:  15    incident: 1    seq_wrap RX: 5     sequ_wrap_expected: 5   overall lost: 15
2020-02-12 19:36:05:406 sequence CNT:   3015, RX:    230    expected: 199    missing:   31    skt overfl d:    0 a:    0    delta:  31    incident: 1    seq_wrap RX: 11     sequ_wrap_expected: 11   overall lost: 31
2020-02-12 19:36:05:455 sequence CNT:    742, RX:    238    expected: 230    missing:    8    skt overfl d:    0 a:    0    delta:   8    incident: 2    seq_wrap RX: 7     sequ_wrap_expected: 7   overall lost: 23
2020-02-12 19:36:05:730 sequence CNT:   1287, RX:      8    expected:   7    missing:    1    skt overfl d:    0 a:    0    delta:   1    incident: 3    seq_wrap RX: 12     sequ_wrap_expected: 12   overall lost: 24
2020-02-12 19:36:05:746 sequence CNT:    991, RX:    228    expected: 223    missing:    5    skt overfl d:    0 a:    0    delta:   5    incident: 4    seq_wrap RX: 15     sequ_wrap_expected: 15   overall lost: 29
....

For mcp2518fd usage with imx8mm: would you recommend SPI in PIO or DMA mode?
Looking at your imx6dl devicetree it seems you use DMA.

Thanks
