netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Marc Kleine-Budde <mkl@pengutronix.de>
To: Patrick Menschel <menschel.p@posteo.de>
Cc: Drew Fustini <drew@beagleboard.org>,
	netdev@vger.kernel.org, linux-can@vger.kernel.org,
	Will C <will@macchina.cc>
Subject: Re: [net-next 6/6] can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register
Date: Mon, 10 May 2021 09:43:34 +0200	[thread overview]
Message-ID: <20210510074334.el2yxp3oy2pmbs7d@pengutronix.de> (raw)
In-Reply-To: <7cb69acc-ee56-900b-0320-a893f687d850@posteo.de>

[-- Attachment #1: Type: text/plain, Size: 6214 bytes --]

On 08.05.2021 18:36:56, Patrick Menschel wrote:
> ### Test conditions ###
> 
> Since I lacked a true stress test, I wrote one for regular tox with
> pytest collection.
> 
> https://gitlab.com/Menschel/socketcan/-/blob/master/tests/test_socketcan.py#L872
> 
> It uses mcp0 and mcp1 which are directly connected.
> No CAN FD, just 500k with regular frames, random id and random data.
> 
> I basically mimic cangen but enhanced with a queue that handles to the
> rx thread what should be compared next.
> 
> ### Extract from dmesg shows no CRC Errors ###
> 
> [   30.930608] CAN device driver interface
> [   30.967349] spi_master spi0: will run message pump with realtime priority
> [   31.054202] mcp251xfd spi0.1 can0: MCP2518FD rev0.0 (-RX_INT
> -MAB_NO_WARN +CRC_REG +CRC_RX +CRC_TX +ECC -HD c:40.00MHz m:20.00MHz
> r:17.00MHz e:16.66MHz) successfully initialized.
> [   31.076906] mcp251xfd spi0.0 can1: MCP2518FD rev0.0 (-RX_INT
> -MAB_NO_WARN +CRC_REG +CRC_RX +CRC_TX +ECC -HD c:40.00MHz m:20.00MHz
> r:17.00MHz e:16.66MHz) successfully initialized.
> [   31.298969] mcp251xfd spi0.0 mcp0: renamed from can1
> [   31.339864] mcp251xfd spi0.1 mcp1: renamed from can0
> [   33.471889] IPv6: ADDRCONF(NETDEV_CHANGE): mcp0: link becomes ready
> [   34.482260] IPv6: ADDRCONF(NETDEV_CHANGE): mcp1: link becomes ready
> [  215.218979] can: controller area network core
> [  215.219146] NET: Registered protocol family 29
> [  215.261599] can: raw protocol
> [  218.745376] can: isotp protocol
> [  220.931150] NOHZ tick-stop error: Non-RCU local softirq work is
> pending, handler #08!!!
> [  220.931274] NOHZ tick-stop error: Non-RCU local softirq work is
> pending, handler #08!!!
> [  220.931395] NOHZ tick-stop error: Non-RCU local softirq work is
> pending, handler #08!!!
> [  220.931518] NOHZ tick-stop error: Non-RCU local softirq work is
> pending, handler #08!!!
> [  220.931643] NOHZ tick-stop error: Non-RCU local softirq work is
> pending, handler #08!!!
> [  220.931768] NOHZ tick-stop error: Non-RCU local softirq work is
> pending, handler #08!!!
> [  220.931893] NOHZ tick-stop error: Non-RCU local softirq work is
> pending, handler #08!!!
> [  222.099822] NOHZ tick-stop error: Non-RCU local softirq work is
> pending, handler #08!!!
> [  222.099901] NOHZ tick-stop error: Non-RCU local softirq work is
> pending, handler #08!!!
> [  222.100022] NOHZ tick-stop error: Non-RCU local softirq work is
> pending, handler #08!!!
> [  222.330438] can: broadcast manager protocol
> 
> That softirq error has something to do with IsoTp. I was not able to
> trace it back but I have it on multiple boards: pi0w, pi3b, pi3b+.

The softirq error is known and shows up as the mcp251xfd driver raises a
softirq from threaded IRQ context. We're working on fixing this.

> ### Performance ###
> 
> ## v5.10-rpi/backport-performance-improvements ##
> 
> I get about 20000 frames in 2 minutes.
> 
> 2021-05-08 19:00:36 [    INFO] 20336 frames in 0:02:00
> (test_socketcan.py:890)
> 
> 2021-05-08 19:49:34 [    INFO] 20001 frames in 0:02:00
> (test_socketcan.py:890)
> 
> 
> ## regular v5.10 ##
> 
> 2021-05-08 20:19:55 [    INFO] 20000 frames in 0:02:00
> (test_socketcan.py:890)
> 
> 2021-05-08 20:22:40 [    INFO] 19995 frames in 0:02:00
> (test_socketcan.py:890)
> 
> 2021-05-08 20:25:22 [    INFO] 19931 frames in 0:02:00
> (test_socketcan.py:890)
> 
> 
> The numbers are slightly better but I count that as tolerance.

Makes sense. But you have only measured number of frames in a given
time. The raspi SPI driver is highly optimized so the changes in the
driver don't show up in those numbers.

Thanks for testing, I'll send a pull request to the raspi kernel.

If you are interested if there are performance benefits on your raspi,
consider measuring the spent CPU time and the number of SPI interrupts.

Measure CPU time by putting the command "time" in front of your test.
Measure SPI Interrupts by looking at /proc/interrupts before and after
the test. Note: there are SPI host controller interrupts and Interrupts
from the mcp251xfd.

On a raspi you probably only have a hand full of SPI host controller
interrupts, as the raspi driver only uses interrupts for long transfers.
There will be a mcp251xfd interrupt per TX-complete and RX CAN message,
maybe a few less if they overlap.

> I also found that there are cross effects. If I run the same test on
> vcan0 before, the frame count goes down to 13000 instead.

The changes only touch the mcp251xfd driver, if you see a difference
with the vcan driver, it's either a change in the kernel somewhere else
or your test setup is sensitive to something you changed without
noticing (starting condition, ...)

> I also have to admit, that I didn't get any crc errors with regular
> v5.10 during that tests.

The CRC errors the patch works around are CRC errors introduced by a
chip erratum, not by electromagnetic interference. In my observation
these CRC errors show up if the register contents changes while the
register is read. The register that changes most is the timer base
counter register. That register is only read if a CAN bus error is
signaled to user space (and this is maximized by enabling bus error
reporting). If it happens to be a CRC error while reading the TBC
register and the CRC can be "corrected" by flipping the upper most bit,
there will be no error message about any CRC errors.

Long story short. You only notice that this patch works, if in a
situation you had CRC errors on the TBC register (that is CAN errors are
reported to user space), you now have an order of magnitude less CRC
errors than before.

> Do I have to change my test?

No need to.

> I can still update that pi3b+ that runs my micro-hil at work. That was
> the one that occasionally had CRC errors.

Thanks again for testing!

regards,
Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

  parent reply	other threads:[~2021-05-10  9:55 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-07  8:01 pull-request: can-next 2021-04-07 Marc Kleine-Budde
2021-04-07  8:01 ` [net-next 1/6] can: skb: alloc_can{,fd}_skb(): set "cf" to NULL if skb allocation fails Marc Kleine-Budde
2021-04-07  8:01 ` [net-next 2/6] can: m_can: m_can_receive_skb(): add missing error handling to can_rx_offload_queue_sorted() call Marc Kleine-Budde
2021-04-07  8:01 ` [net-next 3/6] can: c_can: remove unused enum BOSCH_C_CAN_PLATFORM Marc Kleine-Budde
2021-04-07  8:01 ` [net-next 4/6] can: mcp251xfd: add BQL support Marc Kleine-Budde
2021-04-07  8:01 ` [net-next 5/6] can: mcp251xfd: mcp251xfd_regmap_crc_read_one(): Factor out crc check into separate function Marc Kleine-Budde
2021-04-07  8:01 ` [net-next 6/6] can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register Marc Kleine-Budde
2021-04-21 19:58   ` Drew Fustini
2021-04-22  7:18     ` Marc Kleine-Budde
2021-04-22 16:46       ` Patrick Menschel
2021-05-07  7:25       ` Marc Kleine-Budde
2021-05-07  8:21         ` Patrick Menschel
2021-05-07  8:25           ` Marc Kleine-Budde
2021-05-08 18:36             ` Patrick Menschel
2021-05-09  7:46               ` Patrick Menschel
2021-05-10  7:45                 ` Marc Kleine-Budde
2021-05-20 10:29                   ` Patrick Menschel
2021-05-10  7:43               ` Marc Kleine-Budde [this message]
2021-12-07 16:53                 ` Modilaynen, Pavel
2021-12-08  8:54                   ` Marc Kleine-Budde
2021-12-09 10:22                   ` Thomas.Kopp
2021-12-09 11:17                     ` AW: " Sven Schuchmann
2021-12-09 11:27                       ` Marc Kleine-Budde
2021-12-09 12:53                         ` AW: " Sven Schuchmann
2021-12-13 22:12                     ` Modilaynen, Pavel
2021-12-21 22:24                       ` Thomas.Kopp
2022-06-21 14:25                         ` Marc Kleine-Budde
2022-06-22  8:19                           ` Marc Kleine-Budde
2022-06-22 13:47                           ` Thomas.Kopp
2022-06-26 19:14                         ` Marc Kleine-Budde
2021-05-07 22:36         ` Drew Fustini
2021-05-08 12:30           ` Marc Kleine-Budde
2021-04-07 22:10 ` pull-request: can-next 2021-04-07 patchwork-bot+netdevbpf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210510074334.el2yxp3oy2pmbs7d@pengutronix.de \
    --to=mkl@pengutronix.de \
    --cc=drew@beagleboard.org \
    --cc=linux-can@vger.kernel.org \
    --cc=menschel.p@posteo.de \
    --cc=netdev@vger.kernel.org \
    --cc=will@macchina.cc \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).