* Re: MCP2518FD Drivers Rarely Working with Custom Kernel 5.10.Y
       [not found] ` <20210622212818.enfx5fzgghfxfznb@pengutronix.de>
@ 2021-06-23  2:59   ` Joshua Quesenberry
  2021-06-23  5:24     ` Patrick Menschel
  0 siblings, 1 reply; 15+ messages in thread
From: Joshua Quesenberry @ 2021-06-23  2:59 UTC (permalink / raw)
  To: Marc Kleine-Budde; +Cc: kernel, linux-can, Joshua Quesenberry

Thank you Marc, I had tried finding a Linux CAN forum, but
unfortunately searching for "CAN" in Google is about the most
unhelpful search term one could use... so thanks for replying and
getting me to a more appropriate audience.

Reverting my system back to where CAN was working will probably be
challenging. Our main goal was to get Boot from USB on the RPi
enabled, but this unfortunately meant upgrading every piece of
software and firmware available... previously we were still on Buster,
but the OS snapshot was from Spring 2020 (if not Fall/Winter 2019), if
not earlier, the firmware was much older, and the kernel was 4.19.73,
wherein the MCP251XFD driver didn't exist yet. So getting back there
will mean throwing a saved SD Card image on from Spring 2020 and then
trying to figure out how to force downgrade the firmware. A colleague
started this upgrade process for another project and was seeing these
same results on two separate RPis. He did the OS and firmware upgrades,
but I built the 5.10.17 kernel. So including those two RPis and mine,
that's three systems in total with mostly non-working CAN where it had
been working fine. My system now has slightly newer RPi firmware and
the 5.10.44 kernel; the hope was that I'd pick up a patch somewhere,
but no such luck. If you still think it would be
beneficial to go through the effort of downgrading everything to
verify the hardware I can do that, but just want to make sure before I
start that since it'll take a while.

I updated spi.c to include printing the error number as you requested
and that's all baking now. When I get into work in the morning (US
EST) I'll get the changes deployed and try it out. Since the failure
rate is very high, getting a log shouldn't be an issue.

Some background on the custom kernel... when I switched to the 5.10.Y
branch, I used arch/arm/configs/bcm2711_defconfig as my base config
and then switched on preempt, switched to 1000Hz kernel timer,
switched the default governor from powersave to ondemand, switched on
debug flag (CONFIG_DEBUG_USER=y), enabled a few different CAN drivers
we may encounter, and enabled some stuff for the WM8782 I2S chip. I
probably should have recreated my config after 5.10.44, but I hadn't
considered it until writing this. Looking at this diff, there are a few
bits that are new that I could probably benefit from including, but I
don't see anything that I'd be concerned about.

`diff bcm2711_defconfig hel_bcm2711_lowlatency_defconfig`
15d14
< CONFIG_ATA=m
43d41
< CONFIG_BH1750=m
53c51
< CONFIG_BLK_DEV_NVME=y
---
> CONFIG_BLK_DEV_NVME=m
120c118
< CONFIG_CAN_J1939=m
---
> CONFIG_CAN_KVASER_USB=m
123a122,123
> CONFIG_CAN_MCP25XXFD=m
> CONFIG_CAN_PEAK_USB=m
127d126
< CONFIG_CCS811=m
155c154
< CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE=y
---
> CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND=y
158,159c157
< CONFIG_CPU_FREQ_GOV_ONDEMAND=y
< CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
---
> CONFIG_CPU_FREQ_GOV_POWERSAVE=y
184a183
> CONFIG_DEBUG_USER=y
209d207
< CONFIG_DRM_PANEL_JDI_LT070ME05000=m
319a318
> CONFIG_GENERIC_PHY=y
325d323
< CONFIG_GPIO_PCA953X_IRQ=y
395a394
> CONFIG_HZ_1000=y
561d559
< CONFIG_IR_TOY=m
826d823
< CONFIG_NF_LOG_ARP=m
828d824
< CONFIG_NF_LOG_NETDEV=m
950c946
< CONFIG_PREEMPT_VOLUNTARY=y
---
> CONFIG_PREEMPT=y
957d952
< CONFIG_QCA7000_UART=m
994d988
< CONFIG_RPI_POE_POWER=m
1040a1035
> # CONFIG_RTC_HCTOSYS is not set
1044,1045d1038
< CONFIG_SATA_AHCI=m
< CONFIG_SATA_MV=m
1054d1046
< CONFIG_SENSIRION_SGP30=m
1134a1127
> CONFIG_SND_RPI_I2S_AUDIO_WM8782=m
1149a1143
> CONFIG_SND_SOC_WM8782=m

The /boot/config.txt I included in the forum posts mentioned is
tweaking the 40-pin header quite a bit from the default setup, we're
using many of the pins for our HAT and planned for possibly adding
more in the future.

Lastly, I've implemented UDEV rules so that SPI0.0 = can0 and SPI0.1 =
can1, I've done this in the past without it causing any issues and
don't think it's causing issues here, in fact I think I've tested with
the UDEV rule removed and things still didn't work, but felt it worth
mentioning just in case.

Thanks!

Josh Q

On Tue, Jun 22, 2021 at 5:28 PM Marc Kleine-Budde <mkl@pengutronix.de> wrote:
>
> Hey Joshua,
>
> On 22.06.2021 13:29:34, Joshua Quesenberry wrote:
> > I see you are a maintainer of the MCP251XFD driver? I am running into
> > issues where the driver doesn't work on an RPI4 with custom kernel
> > 5.10.17 or 5.10.44 and am wondering if you could take a look at one of
> > my tickets (cross posted in RPi and Linux forums) to see if anything
> > I'm running into is familiar and you may have some idea of a direction
> > to point me in?
>
> Feel free to use the linux-can mailing list (linux-can@vger.kernel.org -
> no HTML mail though) or open an issue on my github
> (https://github.com/marckleinebudde/linux/issues). Using the mailing
> list is preferred.
>
> You've written that your system was working with the old software. Can
> you restore the old kernel and check if it's still working, just to be
> sure that the hardware is OK?
>
> | mcp251xfd spi0.1 can0: MCP2518FD rev0.0 (-RX_INT -MAB_NO_WARN +CRC_REG +CRC_RX +CRC_TX +ECC -HD c:40.00MHz m:20.00MHz r:17.00MHz e:16.66MHz) successfully initialized.
> | mcp251xfd spi0.0 can1: MCP2518FD rev0.0 (-RX_INT -MAB_NO_WARN +CRC_REG +CRC_RX +CRC_TX +ECC -HD c:40.00MHz m:20.00MHz r:17.00MHz e:16.66MHz) successfully initialized.
>
> These messages show that the mcp2518fd was properly detected and that
> SPI communication with the chip is working.
>
> But then some SPI transfers fail:
> | spi_master spi0: failed to transfer one message from queue
>
> But neither the SPI framework nor the raspi SPI driver prints the error
> number. Can you add the error number to the error message:
>
> https://elixir.bootlin.com/linux/v5.10.45/source/drivers/spi/spi.c#L1534
>
> e.g.:
>
> |               dev_err(&ctlr->dev,
> |                       "failed to transfer one message from queue (ret=%d)\n", ret);
>
> The following error messages mean that the mcp251xfd driver read nothing
> (data is always 00) from the chip. This might be caused by the failed SPI
> transfer.
>
> | mcp251xfd spi0.0 can0: CRC read error at address 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000) retrying.
> | mcp251xfd spi0.0 can0: CRC read error at address 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000) retrying.
> | mcp251xfd spi0.0 can0: CRC read error at address 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000) retrying.
> | mcp251xfd spi0.0 can0: CRC read error at address 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000).
>
> I think the easiest way is to add the error number as mentioned above
> and reproduce the error. Reply to this mail and add the mailing list on
> Cc; don't forget to send a non-HTML mail. :D
>
> regards,
> Marc
>
> --
> Pengutronix e.K.                 | Marc Kleine-Budde           |
> Embedded Linux                   | https://www.pengutronix.de  |
> Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
> Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |

* Re: MCP2518FD Drivers Rarely Working with Custom Kernel 5.10.Y
  2021-06-23  2:59   ` MCP2518FD Drivers Rarely Working with Custom Kernel 5.10.Y Joshua Quesenberry
@ 2021-06-23  5:24     ` Patrick Menschel
  2021-06-23 17:34       ` Joshua Quesenberry
  0 siblings, 1 reply; 15+ messages in thread
From: Patrick Menschel @ 2021-06-23  5:24 UTC (permalink / raw)
  To: Joshua Quesenberry, Marc Kleine-Budde; +Cc: kernel, linux-can

On 23.06.21 at 04:59, Joshua Quesenberry wrote:
> The /boot/config.txt I included in the forum posts mentioned is
> tweaking the 40-pin header quite a bit from the default setup, we're
> using many of the pins for our HAT and planned for possibly adding
> more in the future.

Hi,

It would help to have a reference to that config.txt.

Regarding the changed Kconfig flags, I would suspect everything that is
set to =y to be the culprit, especially everything that has connections
to a clock.
Ever since the first rpi3, clocks have been unreliable in general due to
the frequency governor. The rpi guys did their best to get rid of most of
the initial problems, but the root cause remains.

The interesting question is, does a stock raspbian buster work with your
hardware and that config.txt?

I've been running a stock raspbian buster on a rpi3b+ with seeed can fd hat v2
24/7 for a couple of months now and have not experienced any problems.

Regards,
Patrick

* RE: MCP2518FD Drivers Rarely Working with Custom Kernel 5.10.Y
  2021-06-23  5:24     ` Patrick Menschel
@ 2021-06-23 17:34       ` Joshua Quesenberry
  2021-06-23 20:07         ` Patrick Menschel
  2021-06-25  6:56         ` Marc Kleine-Budde
  0 siblings, 2 replies; 15+ messages in thread
From: Joshua Quesenberry @ 2021-06-23 17:34 UTC (permalink / raw)
  To: 'Patrick Menschel', 'Marc Kleine-Budde'
  Cc: kernel, linux-can, engnfrc

[-- Attachment #1: Type: text/plain, Size: 8761 bytes --]

Hey!

I have attached config.txt so you all can see what I'm doing.

I added printing the error number as Marc suggested and the number appears to be -110 every time.

[   25.660006] CAN device driver interface
[   25.668720] spi_master spi0: will run message pump with realtime priority
[   25.676697] mcp251xfd spi0.1 can0: MCP2518FD rev0.0 (-RX_INT -MAB_NO_WARN +CRC_REG +CRC_RX +CRC_TX +ECC -HD c:40.00MHz m:20.00MHz r:17.00MHz e:16.66MHz) successfully initialized.
[   25.684900] mcp251xfd spi0.0 can1: MCP2518FD rev0.0 (-RX_INT -MAB_NO_WARN +CRC_REG +CRC_RX +CRC_TX +ECC -HD c:40.00MHz m:20.00MHz r:17.00MHz e:16.66MHz) successfully initialized.
[   28.098033] mcp251xfd spi0.1 rename4: renamed from can0
[   28.175644] mcp251xfd spi0.0 can0: renamed from can1
[   28.225891] mcp251xfd spi0.1 can1: renamed from rename4
[  146.964971] mcp251xfd spi0.0: SPI transfer timed out
[  146.965023] spi_master spi0: failed to transfer one message from queue (ret=-110)
[  146.965216] mcp251xfd spi0.0 can0: CRC read error at address 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000) retrying.
[  146.965247] mcp251xfd spi0.0 can0: CRC read error at address 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000) retrying.
[  146.965277] mcp251xfd spi0.0 can0: CRC read error at address 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000) retrying.
[  146.965286] mcp251xfd spi0.0 can0: CRC read error at address 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000).
[  146.965331] mcp251xfd spi0.0 can0: CRC read error at address 0x0000 (length=4, data=00 00 00 00, CRC=0x0000) retrying.
[  146.965360] mcp251xfd spi0.0 can0: CRC read error at address 0x0000 (length=4, data=00 00 00 00, CRC=0x0000) retrying.
[  146.965389] mcp251xfd spi0.0 can0: CRC read error at address 0x0000 (length=4, data=00 00 00 00, CRC=0x0000) retrying.
[  146.965397] mcp251xfd spi0.0 can0: CRC read error at address 0x0000 (length=4, data=00 00 00 00, CRC=0x0000).
[  146.965413] A link change request failed with some changes committed already. Interface can0 may have been left with an inconsistent configuration, please check.

Regarding the discussion about Kconfig flags, I went ahead and rebuilt kernel 5.10.44 using a config that was essentially arch/arm/configs/bcm2711_defconfig plus these additions needed to get our I2S working. This should have undone the switch to the ONDEMAND governor and the 1000 Hz timer.

1030a1031
> CONFIG_SND_RPI_I2S_AUDIO_WM8782=m
1040a1042
> CONFIG_SND_SOC_WM8782=m

My RPi and HAT have worked very reliably with the older buster image and a customized kernel 4.19.73 (same tweaks as mentioned in my last email). In that kernel I'm using the MCP25XXFD driver from msperl, which is having issues under the 5.10.Y kernel too. I only upgraded everything on my system at the end of last week, so the hardware has been OK very recently.

Keep in mind I'm not seeing a total failure. I do occasionally see everything work correctly and can run the ip link setup command without issue; it's just not common. Fully removing power from the system and reapplying it seems to help, but not every time, so maybe it's a coincidence. It could be an issue with subsequent configurations of the controller after the initial setup on power application, but then I'd expect it to work after every power yank, I think.

I wouldn't feel comfortable reverting my /boot/config.txt to a stock one and a default setup of the 40-pin header, at least not with my HAT attached which includes the CAN controllers AND circuitry to supply power to RPi from a 12V rail.

Thanks,

Josh Q

[-- Attachment #2: config.txt --]
[-- Type: text/plain, Size: 2927 bytes --]

# For more options and information see
# http://rpf.io/configtxt
# Some settings may impact device functionality. See link above for details

dtdebug=1

disable_splash=1
boot_delay=0

# uncomment if you get no picture on HDMI for a default "safe" mode
#hdmi_safe=1

# uncomment this if your display has a black border of unused pixels visible
# and your display can output without overscan
#disable_overscan=1

# uncomment the following to adjust overscan. Use positive numbers if console
# goes off screen, and negative if there is too much border
#overscan_left=16
#overscan_right=16
#overscan_top=16
#overscan_bottom=16

# uncomment to force a console size. By default it will be display's size minus
# overscan.
#framebuffer_width=1280
#framebuffer_height=720

# uncomment if hdmi display is not detected and composite is being output
#hdmi_force_hotplug=1

hdmi_force_mode=1

# uncomment to force a specific HDMI mode (this will force VGA)
#hdmi_group=1
#hdmi_mode=1
hdmi_group=1
hdmi_mode=3 # 480p 60Hz H
#hdmi_mode=9 # 240p 60Hz H

# uncomment to force a HDMI mode rather than DVI. This can make audio work in
# DMT (computer monitor) modes
#hdmi_drive=2

# uncomment to increase signal to HDMI, if you have interference, blanking, or
# no display
#config_hdmi_boost=4

# uncomment for composite PAL
#sdtv_mode=2

#uncomment to overclock the arm. 700 MHz is the default.
#arm_freq=800

#arm_freq=1500
#gpu_freq=600
#core_freq=600
#h264_freq=600
#isp_freq=600
#v3d_freq=600

# Uncomment some or all of these to enable the optional hardware interfaces
dtparam=i2c_arm=on
dtparam=i2s=on
dtparam=spi=off
enable_uart=0

# Uncomment this to enable the lirc-rpi module
#dtoverlay=lirc-rpi

# Additional overlays and parameters are documented /boot/overlays/README

# Enable audio (loads snd_bcm2835)
dtparam=audio=on

[pi4]
# Enable DRM VC4 V3D driver on top of the dispmanx display stack
dtoverlay=vc4-fkms-v3d
max_framebuffers=2

[all]
#dtoverlay=vc4-fkms-v3d
start_x=1
gpu_mem=512

# GPS
dtoverlay=uart3

# Reserved for HAT IDs
dtoverlay=i2c0
dtparam=pins_0_1=on
dtparam=combine=off

# Power Supervisor, IMU, RPi I2C Bus
dtoverlay=i2c1

# CAN 0/1
dtoverlay=spi0-cs
dtparam=cs0_pin=8
dtparam=cs1_pin=7

# Reserved
dtoverlay=spi5-2cs
dtparam=cs0_pin=12
dtparam=cs0_spidev=disabled
dtparam=cs1_pin=26
dtparam=cs1_spidev=disabled

# # CAN 0
# dtoverlay=mcp2517fd-spi0_0-can0
# dtparam=oscillator=40000000
# dtparam=spimaxfrequency=20000000
# dtparam=interrupt=24
# 
# # CAN 1
# dtoverlay=mcp2517fd-spi0_1-can1
# dtparam=oscillator=40000000
# dtparam=spimaxfrequency=20000000
# dtparam=interrupt=25

# CAN 0
dtoverlay=mcp251xfd,spi0-0,interrupt=24,oscillator=40000000,speed=20000000

# CAN 1
dtoverlay=mcp251xfd,spi0-1,interrupt=25,oscillator=40000000,speed=20000000

dtoverlay=hel-wm8782,alsaname=mic

* Re: MCP2518FD Drivers Rarely Working with Custom Kernel 5.10.Y
  2021-06-23 17:34       ` Joshua Quesenberry
@ 2021-06-23 20:07         ` Patrick Menschel
  2021-06-24 18:24           ` Joshua Quesenberry
  2021-06-25  6:56         ` Marc Kleine-Budde
  1 sibling, 1 reply; 15+ messages in thread
From: Patrick Menschel @ 2021-06-23 20:07 UTC (permalink / raw)
  To: Joshua Quesenberry, 'Marc Kleine-Budde'; +Cc: kernel, linux-can

On 23.06.21 at 19:34, Joshua Quesenberry wrote:
> I wouldn't feel comfortable reverting my /boot/config.txt to a stock one and a default setup of the 40-pin header, at least not with my HAT attached which includes the CAN controllers AND circuitry to supply power to RPi from a 12V rail.

OK,

one piece of general advice:

Check if you can merge your HAT into a complete overlay with the ovmerge
tool, like I did in
https://github.com/raspberrypi/linux/issues/4032
https://github.com/raspberrypi/linux/pull/4034

This should clean up your config.txt quite a bit.

I'm comparing against the seeed can fd hat v2, which also has i2c1 and
both CAN controllers on spi0.

I'm not sure about the PI4, but these 3 items usually all go on i2c1,
which may be problematic.

# typical combination out of raspi-config
dtparam=i2c_arm=on
dtparam=i2s=on

# some manual entry, check if it can be removed
dtoverlay=i2c1


The CAN related stuff looks ok, but you can omit
oscillator=40000000,speed=20000000
since those are the default values afaik.

# CAN 0/1
dtoverlay=spi0-cs
dtparam=cs0_pin=8
dtparam=cs1_pin=7

# CAN 0
dtoverlay=mcp251xfd,spi0-0,interrupt=24

# CAN 1
dtoverlay=mcp251xfd,spi0-1,interrupt=25


Concerning the naming, I also used the generic names can0, can1 at first
but was advised to follow best practice and rename to mcp0, mcp1 instead.
I believe Marc mentioned that reusing the generic canX names causes some
kind of problem.

This is my udev rule content.
SUBSYSTEM=="net", ACTION=="add", DEVPATH=="/devices/platform/soc/*/*/*/spi0.0/net/can?", NAME="mcp0"
SUBSYSTEM=="net", ACTION=="add", DEVPATH=="/devices/platform/soc/*/*/*/spi0.1/net/can?", NAME="mcp1"


Regards,
Patrick

* RE: MCP2518FD Drivers Rarely Working with Custom Kernel 5.10.Y
  2021-06-23 20:07         ` Patrick Menschel
@ 2021-06-24 18:24           ` Joshua Quesenberry
  2021-06-24 20:41             ` Patrick Menschel
  0 siblings, 1 reply; 15+ messages in thread
From: Joshua Quesenberry @ 2021-06-24 18:24 UTC (permalink / raw)
  To: 'Patrick Menschel', 'Marc Kleine-Budde'
  Cc: kernel, linux-can, engnfrc

Thanks for the tips on config.txt cleanup, Patrick. I'll take a closer look at that.

Not sure I'm following what you're asking for with regard to i2c. i2c1 is currently in use to communicate with the accel, gyro, mag, and a power supervisor MCU that helps us accomplish wake/sleep based on accel or vehicle ignition and some other low-level hardware tasks.

I removed the UDEV rules I had that rename can0 and can1, just in case that was causing a race condition or something else odd, but the system is behaving the same.

Not sure what else to try at this point, any ideas? What does the error number of -110 mean to you all?

-----Original Message-----
From: Patrick Menschel <menschel.p@posteo.de> 
Sent: Wednesday, June 23, 2021 4:07 PM
To: Joshua Quesenberry <engnfrc@gmail.com>; 'Marc Kleine-Budde' <mkl@pengutronix.de>
Cc: kernel@pengutronix.de; linux-can@vger.kernel.org
Subject: Re: MCP2518FD Drivers Rarely Working with Custom Kernel 5.10.Y

On 23.06.21 at 19:34, Joshua Quesenberry wrote:
> Hey!
> 
> I have attached config.txt so you all can see what I'm doing.
> 
> I added printing the error number as Marc suggested and the number appears to be -110 every time.
> 
> [   25.660006] CAN device driver interface
> [   25.668720] spi_master spi0: will run message pump with realtime priority
> [   25.676697] mcp251xfd spi0.1 can0: MCP2518FD rev0.0 (-RX_INT -MAB_NO_WARN +CRC_REG +CRC_RX +CRC_TX +ECC -HD c:40.00MHz m:20.00MHz r:17.00MHz e:16.66MHz) successfully initialized.
> [   25.684900] mcp251xfd spi0.0 can1: MCP2518FD rev0.0 (-RX_INT -MAB_NO_WARN +CRC_REG +CRC_RX +CRC_TX +ECC -HD c:40.00MHz m:20.00MHz r:17.00MHz e:16.66MHz) successfully initialized.
> [   28.098033] mcp251xfd spi0.1 rename4: renamed from can0
> [   28.175644] mcp251xfd spi0.0 can0: renamed from can1
> [   28.225891] mcp251xfd spi0.1 can1: renamed from rename4
> [  146.964971] mcp251xfd spi0.0: SPI transfer timed out
> [  146.965023] spi_master spi0: failed to transfer one message from queue (ret=-110)
> [  146.965216] mcp251xfd spi0.0 can0: CRC read error at address 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000) retrying.
> [  146.965247] mcp251xfd spi0.0 can0: CRC read error at address 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000) retrying.
> [  146.965277] mcp251xfd spi0.0 can0: CRC read error at address 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000) retrying.
> [  146.965286] mcp251xfd spi0.0 can0: CRC read error at address 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000).
> [  146.965331] mcp251xfd spi0.0 can0: CRC read error at address 0x0000 (length=4, data=00 00 00 00, CRC=0x0000) retrying.
> [  146.965360] mcp251xfd spi0.0 can0: CRC read error at address 0x0000 (length=4, data=00 00 00 00, CRC=0x0000) retrying.
> [  146.965389] mcp251xfd spi0.0 can0: CRC read error at address 0x0000 (length=4, data=00 00 00 00, CRC=0x0000) retrying.
> [  146.965397] mcp251xfd spi0.0 can0: CRC read error at address 0x0000 (length=4, data=00 00 00 00, CRC=0x0000).
> [  146.965413] A link change request failed with some changes committed already. Interface can0 may have been left with an inconsistent configuration, please check.
> 
> Regarding the discussion about Kconfig flags, I went ahead and rebuilt kernel 5.10.44 using a config that was essentially arch/arm/configs/bcm2711_defconfig with these additions needed to get our I2S working. This should have undone the switch to ONDEMAND governor and enabling 1000 Hz clock.
> 
> 1030a1031
>> CONFIG_SND_RPI_I2S_AUDIO_WM8782=m
> 1040a1042
>> CONFIG_SND_SOC_WM8782=m
> 
> My RPi and HAT have worked very reliably with the older buster image and customized (same tweaks as mentioned in last email) kernel 4.19.73, in that kernel I'm using MCP25XXFD driver from msperl which under 5.10.Y kernel is having issues too. I only upgraded everything on my system at the end of last week, so hardware has been OK very recently.
> 
> Keep in mind I'm not seeing a total failure. I do occasionally see everything work correctly and can run the ip link setup command without issue; it's just not common. Fully removing power from the system and reapplying it seems to help, but not every time, so maybe it's a coincidence. It could be an issue with subsequent configurations of the controller after the initial setup on power application, but then I'd expect it to work after every power yank.
> 
> I wouldn't feel comfortable reverting my /boot/config.txt to a stock one and a default setup of the 40-pin header, at least not with my HAT attached which includes the CAN controllers AND circuitry to supply power to RPi from a 12V rail.

OK,

One general piece of advice:

Check if you can merge your HAT into a complete overlay with ovmerge tool like I did in
https://github.com/raspberrypi/linux/issues/4032
https://github.com/raspberrypi/linux/pull/4034

This should clean up your config.txt quite a bit.

I compare against the seeed can fd hat v2 which also has i2c1 and both can on spi0.

I'm not sure about the PI4 but these 3 items usually all go on i2c1 which may be problematic.

# typical combination out of raspi-config
dtparam=i2c_arm=on
dtparam=i2s=on

# some manual entry, check if it can be removed
dtoverlay=i2c1


The CAN related stuff looks ok, but you can omit the
oscillator=40000000,speed=20000000
Those are the standard values afaik.

# CAN 0/1
dtoverlay=spi0-cs
dtparam=cs0_pin=8
dtparam=cs1_pin=7

# CAN 0
dtoverlay=mcp251xfd,spi0-0,interrupt=24

# CAN 1
dtoverlay=mcp251xfd,spi0-1,interrupt=25


Concerning the naming, I also used generic names can0, can1 at first but was advised to use best practice and to rename to mcp0, mcp1 instead. I believe Marc mentioned that this causes some kind of problem.

This is my udev rule content.
SUBSYSTEM=="net", ACTION=="add",
DEVPATH=="/devices/platform/soc/*/*/*/spi0.0/net/can?", NAME="mcp0"
SUBSYSTEM=="net", ACTION=="add",
DEVPATH=="/devices/platform/soc/*/*/*/spi0.1/net/can?", NAME="mcp1"


Regards,
Patrick



* Re: MCP2518FD Drivers Rarely Working with Custom Kernel 5.10.Y
  2021-06-24 18:24           ` Joshua Quesenberry
@ 2021-06-24 20:41             ` Patrick Menschel
  0 siblings, 0 replies; 15+ messages in thread
From: Patrick Menschel @ 2021-06-24 20:41 UTC (permalink / raw)
  To: Joshua Quesenberry, 'Marc Kleine-Budde'; +Cc: kernel, linux-can


Am 24.06.21 um 20:24 schrieb Joshua Quesenberry:
> Thanks for the tips Patrick on config.txt cleanup, I'll take a closer look into that.
> 
> Not sure I'm following what you're asking for with regard to i2c. i2c1 is currently in use to communicate with the accel, gyro, mag, and a power supervisor MCU that helps us accomplish wake/sleep based on accel or vehicle ignition and some other low-level hardware tasks.

Concerning i2c1 afaik it is already set up by the other two lines but I
may be wrong about that.
>> # typical combination out of raspi-config
>> dtparam=i2c_arm=on
>> dtparam=i2s=on

If you want to wake/sleep according to automotive standards, select a
CAN transceiver with a wakeup pin.

If I remember right there is even a dt overlay for on/off via pin event.

> 
> I removed the UDEV rules I had that rename can0 and can1, just in case that was causing a race condition or something else odd, but the system is behaving the same.
> 
> Not sure what else to try at this point, any ideas? What does the error number of -110 mean to you all?
...
>> [  146.964971] mcp251xfd spi0.0: SPI transfer timed out
>> [  146.965023] spi_master spi0: failed to transfer one message from queue (ret=-110)
>> [  146.965216] mcp251xfd spi0.0 can0: CRC read error at address 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000) retrying.
>> [  146.965247] mcp251xfd spi0.0 can0: CRC read error at address 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000) retrying.
>> [  146.965277] mcp251xfd spi0.0 can0: CRC read error at address 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000) retrying.
>> [  146.965286] mcp251xfd spi0.0 can0: CRC read error at address 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000).
>> [  146.965331] mcp251xfd spi0.0 can0: CRC read error at address 0x0000 (length=4, data=00 00 00 00, CRC=0x0000) retrying.
>> [  146.965360] mcp251xfd spi0.0 can0: CRC read error at address 0x0000 (length=4, data=00 00 00 00, CRC=0x0000) retrying.
>> [  146.965389] mcp251xfd spi0.0 can0: CRC read error at address 0x0000 (length=4, data=00 00 00 00, CRC=0x0000) retrying.
>> [  146.965397] mcp251xfd spi0.0 can0: CRC read error at address 0x0000 (length=4, data=00 00 00 00, CRC=0x0000).

110 is ETIMEDOUT.
According to your dmesg output, it is raised in a send context from the SPI
master, and afterwards only zeros are received, so it may be an issue with
the chip select pin. Scope it to be sure.
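
For reference, a minimal userspace sketch of how the ret=-110 in the log
maps to an errno name. It assumes a Linux system where ETIMEDOUT is 110 (as
in asm-generic/errno.h); describe_kernel_ret() and is_timeout() are
hypothetical helpers for illustration, not something from the driver:

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Kernel functions report failures as negative errno values.
 * Flip the sign and let the C library describe the code.
 * Returns the positive errno number. */
int describe_kernel_ret(int ret, char *buf, size_t len)
{
	assert(ret < 0 && buf != NULL && len > 0);
	snprintf(buf, len, "ret=%d -> errno %d: %s", ret, -ret, strerror(-ret));
	return -ret;
}

/* True if a kernel-style return value means "timed out". */
int is_timeout(int ret)
{
	return ret == -ETIMEDOUT;
}
```

Calling describe_kernel_ret(-110, buf, sizeof(buf)) returns 110 and fills
buf with the C library's description of that errno.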

Regards,
Patrick


* Re: MCP2518FD Drivers Rarely Working with Custom Kernel 5.10.Y
  2021-06-23 17:34       ` Joshua Quesenberry
  2021-06-23 20:07         ` Patrick Menschel
@ 2021-06-25  6:56         ` Marc Kleine-Budde
  2021-06-25 12:16           ` Marc Kleine-Budde
  1 sibling, 1 reply; 15+ messages in thread
From: Marc Kleine-Budde @ 2021-06-25  6:56 UTC (permalink / raw)
  To: Joshua Quesenberry; +Cc: 'Patrick Menschel', kernel, linux-can

On 23.06.2021 13:34:10, Joshua Quesenberry wrote:
> I added printing the error number as Marc suggested and the number
> appears to be -110 every time.

#define	ETIMEDOUT	110	/* Connection timed out */
https://elixir.bootlin.com/linux/latest/source/include/uapi/asm-generic/errno.h#L93

That means something has timed out, we see this in the previous log
message, too:

> [   25.660006] CAN device driver interface
> [   25.668720] spi_master spi0: will run message pump with realtime priority
> [   25.676697] mcp251xfd spi0.1 can0: MCP2518FD rev0.0 (-RX_INT -MAB_NO_WARN +CRC_REG +CRC_RX +CRC_TX +ECC -HD c:40.00MHz m:20.00MHz r:17.00MHz e:16.66MHz) successfully initialized.
> [   25.684900] mcp251xfd spi0.0 can1: MCP2518FD rev0.0 (-RX_INT -MAB_NO_WARN +CRC_REG +CRC_RX +CRC_TX +ECC -HD c:40.00MHz m:20.00MHz r:17.00MHz e:16.66MHz) successfully initialized.
> [   28.098033] mcp251xfd spi0.1 rename4: renamed from can0
> [   28.175644] mcp251xfd spi0.0 can0: renamed from can1
> [   28.225891] mcp251xfd spi0.1 can1: renamed from rename4
                                   VVVVVVVVVVVVVVVVVVVVVV
> [  146.964971] mcp251xfd spi0.0: SPI transfer timed out
> [  146.965023] spi_master spi0: failed to transfer one message from queue (ret=-110)

> [  146.965216] mcp251xfd spi0.0 can0: CRC read error at address 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000) retrying.
> [  146.965247] mcp251xfd spi0.0 can0: CRC read error at address 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000) retrying.
> [  146.965277] mcp251xfd spi0.0 can0: CRC read error at address 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000) retrying.
> [  146.965286] mcp251xfd spi0.0 can0: CRC read error at address 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000).
> [  146.965331] mcp251xfd spi0.0 can0: CRC read error at address 0x0000 (length=4, data=00 00 00 00, CRC=0x0000) retrying.
> [  146.965360] mcp251xfd spi0.0 can0: CRC read error at address 0x0000 (length=4, data=00 00 00 00, CRC=0x0000) retrying.
> [  146.965389] mcp251xfd spi0.0 can0: CRC read error at address 0x0000 (length=4, data=00 00 00 00, CRC=0x0000) retrying.
> [  146.965397] mcp251xfd spi0.0 can0: CRC read error at address 0x0000 (length=4, data=00 00 00 00, CRC=0x0000).
> [  146.965413] A link change request failed with some changes committed already. Interface can0 may have been left with an inconsistent configuration, please check.
> 
> Regarding the discussion about Kconfig flags, I went ahead and rebuilt
> kernel 5.10.44 using a config that was essentially
> arch/arm/configs/bcm2711_defconfig with these additions needed to get
> our I2S working. This should have undone the switch to ONDEMAND
> governor and enabling 1000 Hz clock.

Please switch back the clock to the standard HZ setting.

BTW: why did you change the clock setting in the first place? All
timer (not timeout) code in Linux makes use of hrtimers, which run
independently of the clock HZ setting.

Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |


* Re: MCP2518FD Drivers Rarely Working with Custom Kernel 5.10.Y
  2021-06-25  6:56         ` Marc Kleine-Budde
@ 2021-06-25 12:16           ` Marc Kleine-Budde
       [not found]             ` <020f01d769da$9fac86b0$df059410$@gmail.com>
  0 siblings, 1 reply; 15+ messages in thread
From: Marc Kleine-Budde @ 2021-06-25 12:16 UTC (permalink / raw)
  To: Joshua Quesenberry; +Cc: 'Patrick Menschel', kernel, linux-can

On 25.06.2021 08:56:26, Marc Kleine-Budde wrote:
> On 23.06.2021 13:34:10, Joshua Quesenberry wrote:
> > I added printing the error number as Marc suggested and the number
> > appears to be -110 every time.
> 
> #define	ETIMEDOUT	110	/* Connection timed out */
> https://elixir.bootlin.com/linux/latest/source/include/uapi/asm-generic/errno.h#L93
> 
> That means something has timed out, we see this in the previous log
> message, too:
> 
> > [   25.660006] CAN device driver interface
> > [   25.668720] spi_master spi0: will run message pump with realtime priority
> > [   25.676697] mcp251xfd spi0.1 can0: MCP2518FD rev0.0 (-RX_INT -MAB_NO_WARN +CRC_REG +CRC_RX +CRC_TX +ECC -HD c:40.00MHz m:20.00MHz r:17.00MHz e:16.66MHz) successfully initialized.
> > [   25.684900] mcp251xfd spi0.0 can1: MCP2518FD rev0.0 (-RX_INT -MAB_NO_WARN +CRC_REG +CRC_RX +CRC_TX +ECC -HD c:40.00MHz m:20.00MHz r:17.00MHz e:16.66MHz) successfully initialized.
> > [   28.098033] mcp251xfd spi0.1 rename4: renamed from can0
> > [   28.175644] mcp251xfd spi0.0 can0: renamed from can1
> > [   28.225891] mcp251xfd spi0.1 can1: renamed from rename4
>                                    VVVVVVVVVVVVVVVVVVVVVV
> > [  146.964971] mcp251xfd spi0.0: SPI transfer timed out
> > [  146.965023] spi_master spi0: failed to transfer one message from queue (ret=-110)
> 
> > [  146.965216] mcp251xfd spi0.0 can0: CRC read error at address 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000) retrying.
> > [  146.965247] mcp251xfd spi0.0 can0: CRC read error at address 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000) retrying.
> > [  146.965277] mcp251xfd spi0.0 can0: CRC read error at address 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000) retrying.
> > [  146.965286] mcp251xfd spi0.0 can0: CRC read error at address 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000).
> > [  146.965331] mcp251xfd spi0.0 can0: CRC read error at address 0x0000 (length=4, data=00 00 00 00, CRC=0x0000) retrying.
> > [  146.965360] mcp251xfd spi0.0 can0: CRC read error at address 0x0000 (length=4, data=00 00 00 00, CRC=0x0000) retrying.
> > [  146.965389] mcp251xfd spi0.0 can0: CRC read error at address 0x0000 (length=4, data=00 00 00 00, CRC=0x0000) retrying.
> > [  146.965397] mcp251xfd spi0.0 can0: CRC read error at address 0x0000 (length=4, data=00 00 00 00, CRC=0x0000).
> > [  146.965413] A link change request failed with some changes committed already. Interface can0 may have been left with an inconsistent configuration, please check.
> > 
> > Regarding the discussion about Kconfig flags, I went ahead and rebuilt
> > kernel 5.10.44 using a config that was essentially
> > arch/arm/configs/bcm2711_defconfig with these additions needed to get
> > our I2S working. This should have undone the switch to ONDEMAND
> > governor and enabling 1000 Hz clock.
> 
> Please switch back the clock to the standard HZ setting.

I compiled my 64 bit raspi kernel with HZ=1000 and my mcp2518fd board
works without problem on my raspi4b.

| static int bcm2835_spi_transfer_one_poll(struct spi_controller *ctlr,
| 					 struct spi_device *spi,
| 					 struct spi_transfer *tfr,
| 					 u32 cs)
| {
[...]
| 	/* set the timeout to at least 2 jiffies */
| 	timeout = jiffies + 2 + HZ * polling_limit_us / 1000000;

The timeout is calculated in jiffies. The jiffies variable is
incremented once per timer tick (which depends on the clock HZ
configuration). There are "HZ" jiffies per second. This means the above
"2" equals 8ms (HZ=250), but with HZ=1000 only 2ms.

To keep the timeout constant, you can change this into:

        timeout = jiffies + (HZ * 8) / 1000 + HZ * polling_limit_us / 1000000;

However, the polling mode will only be used for transfers that should
finish in 30 µs. So even 2ms is far off...
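
To make the arithmetic concrete, here is a small userspace sketch of the
two timeout formulas. HZ is passed in as a parameter so both settings can be
compared; the function names are made up for illustration, and the 30 µs
polling limit is simply the figure mentioned above:

```c
#include <assert.h>

/* Poll timeout as in the quoted driver line: a fixed floor of
 * 2 jiffies plus the polling limit converted to jiffies. */
unsigned long poll_timeout_fixed(unsigned long hz, unsigned long polling_limit_us)
{
	assert(hz > 0);
	return 2 + hz * polling_limit_us / 1000000;
}

/* Suggested variant: scale the floor with HZ so it stays at
 * roughly 8 ms regardless of the tick rate. */
unsigned long poll_timeout_scaled(unsigned long hz, unsigned long polling_limit_us)
{
	assert(hz > 0);
	return (hz * 8) / 1000 + hz * polling_limit_us / 1000000;
}

/* Convert a jiffies count to microseconds for a given HZ. */
unsigned long jiffies_to_us(unsigned long hz, unsigned long j)
{
	assert(hz > 0);
	return j * 1000000UL / hz;
}
```

With a 30 µs limit, poll_timeout_fixed() yields 2 jiffies for both HZ=250
and HZ=1000, i.e. 8000 µs and 2000 µs respectively, while
poll_timeout_scaled() yields 2 and 8 jiffies, i.e. 8000 µs in both cases.
Note that with integer division the scaled floor becomes 0 jiffies for HZ
below 125, so a real patch would probably want a lower bound.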

| 
| 	/* loop until finished the transfer */
| 	while (bs->rx_len) {
| 		/* fill in tx fifo with remaining data */
| 		bcm2835_wr_fifo(bs);
| 
| 		/* read from fifo as much as possible */
| 		bcm2835_rd_fifo(bs);
| 
| 		/* if there is still data pending to read
| 		 * then check the timeout
| 		 */
| 		if (bs->rx_len && time_after(jiffies, timeout)) {

If there is a timeout, the driver will fall back to IRQ mode.
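
The time_after() check in the loop compares jiffies counters in a way that
survives counter wraparound. A userspace sketch of the same idea, modeled on
the kernel macro but written as plain functions; time_after_ul(),
poll_timed_out() and time_after_selfcheck() are illustrative names, not
kernel API:

```c
#include <assert.h>
#include <limits.h>

/* True if a is later than b, treating both as free-running counters.
 * The unsigned subtraction wraps, and the signed cast turns
 * "b is behind a" into a negative value. */
int time_after_ul(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Mirrors the quoted loop condition: data is still pending
 * and the deadline has passed. */
int poll_timed_out(int rx_len, unsigned long now, unsigned long timeout)
{
	return rx_len && time_after_ul(now, timeout);
}

/* Small self-check covering the ordinary and the wrapped case. */
void time_after_selfcheck(void)
{
	assert(time_after_ul(5, 3));
	assert(!time_after_ul(3, 5));
	assert(!time_after_ul(3, 3));
	/* The counter wrapped: 1 comes after ULONG_MAX. */
	assert(time_after_ul(1, ULONG_MAX));
}
```

A plain "now > timeout" comparison would get the wrapped case wrong, which
is why the driver uses time_after() rather than comparing jiffies directly.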


Can you add a "#define DEBUG" at the top of spi-bcm2835.c, before any of the
"#include" directives? That should give you this debug message:

| 			dev_dbg_ratelimited(&spi->dev,
| 					    "timeout period reached: jiffies: %lu remaining tx/rx: %d/%d - falling back to interrupt mode\n",
| 					    jiffies - timeout,
| 					    bs->tx_len, bs->rx_len);
| 			/* fall back to interrupt mode */
| 
| 			/* update usage statistics */
| 			bs->count_transfer_irq_after_polling++;
| 
| 			return bcm2835_spi_transfer_one_irq(ctlr, spi,
| 							    tfr, cs, false);

here it activates the IRQ. But I'm not sure if the fallback works
correctly....

| 		}
| 	}
| 

Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |


* Re: MCP2518FD Drivers Rarely Working with Custom Kernel 5.10.Y
       [not found]               ` <022d01d769e2$e623cbf0$b26b63d0$@gmail.com>
@ 2021-07-02  4:33                 ` Joshua Quesenberry
  2021-07-02  9:31                 ` Marc Kleine-Budde
  1 sibling, 0 replies; 15+ messages in thread
From: Joshua Quesenberry @ 2021-07-02  4:33 UTC (permalink / raw)
  To: Marc Kleine-Budde; +Cc: Patrick Menschel, kernel, linux-can, Joshua Quesenberry

Marc,

I tried adding "#define DEBUG" to the top of spi-bcm2835.c, but I
don't see any additional logging in the output of dmesg. Any other
ideas? Anything you all notice in the Saleae data worth mentioning?

Thanks!

Josh Q

On Fri, Jun 25, 2021 at 12:55 PM Joshua Quesenberry <engnfrc@gmail.com> wrote:
>
> Forgive me, I forgot can0 = spi0.1 and can1 = spi0.0 right now because I killed my UDEV rule so I was tapped onto the wrong CS line. Attached is a snapshot of what I'm seeing AND an export of the data from Saleae which may prove more useful than snapshots.
>
> Thanks,
>
> Josh Q
>
> -----Original Message-----
> From: Joshua Quesenberry <engnfrc@gmail.com>
> Sent: Friday, June 25, 2021 11:56 AM
> To: 'Marc Kleine-Budde' <mkl@pengutronix.de>
> Cc: 'Patrick Menschel' <menschel.p@posteo.de>; kernel@pengutronix.de; linux-can@vger.kernel.org; engnfrc@gmail.com
> Subject: RE: MCP2518FD Drivers Rarely Working with Custom Kernel 5.10.Y
>
> Marc,
>
> I had already switched back to the normal clock speed in a previous email, so we should be good there. As for why we changed it: it's been a while, but we're pushing the RPi pretty hard, and the change either reduced overall resource usage or resulted in fewer dropped video frames (which is something we strive to minimize for the research we do) from the three USB cameras. At the moment I've got all the add-ons turned off and the load is minimal.
>
> I was able to get my hands on a Saleae and am attaching a snapshot of what happens when I run the ip link command. I am noticing that the chip select is never being toggled right now with the ip link command failing... so that just might be our root issue. So what can we do to figure out WHY the chip select isn't acting as expected?
>
> I have probes attached to the following for CAN0:
>
> Pin Function
> 19  SPI0_MOSI
> 21  SPI0_MISO
> 23  SPI0_SCLK
> 24  GPIO8 / CS0
>
> Thanks,
>
> Josh Q
>
> -----Original Message-----
> From: Marc Kleine-Budde <mkl@pengutronix.de>
> Sent: Friday, June 25, 2021 8:17 AM
> To: Joshua Quesenberry <engnfrc@gmail.com>
> Cc: 'Patrick Menschel' <menschel.p@posteo.de>; kernel@pengutronix.de; linux-can@vger.kernel.org
> Subject: Re: MCP2518FD Drivers Rarely Working with Custom Kernel 5.10.Y
>
> On 25.06.2021 08:56:26, Marc Kleine-Budde wrote:
> > On 23.06.2021 13:34:10, Joshua Quesenberry wrote:
> > > I added printing the error number as Marc suggested and the number
> > > appears to be -110 every time.
> >
> > #define       ETIMEDOUT       110     /* Connection timed out */
> > https://elixir.bootlin.com/linux/latest/source/include/uapi/asm-generic/errno.h#L93
> >
> > That means something has timed out, we see this in the previous log
> > message, too:
> >
> > > [   25.660006] CAN device driver interface
> > > [   25.668720] spi_master spi0: will run message pump with realtime
> > > priority
> > > [   25.676697] mcp251xfd spi0.1 can0: MCP2518FD rev0.0
> > > (-RX_INT -MAB_NO_WARN +CRC_REG +CRC_RX +CRC_TX +ECC -HD c:40.00MHz
> > > m:20.00MHz r:17.00MHz e:16.66MHz) successfully initialized.
> > > [   25.684900] mcp251xfd spi0.0 can1: MCP2518FD rev0.0
> > > (-RX_INT -MAB_NO_WARN +CRC_REG +CRC_RX +CRC_TX +ECC -HD c:40.00MHz
> > > m:20.00MHz r:17.00MHz e:16.66MHz) successfully initialized.
> > > [   28.098033] mcp251xfd spi0.1 rename4: renamed from can0
> > > [   28.175644] mcp251xfd spi0.0 can0: renamed from can1
> > > [   28.225891] mcp251xfd spi0.1 can1: renamed from rename4
> >                                    VVVVVVVVVVVVVVVVVVVVVV
> > > [  146.964971] mcp251xfd spi0.0: SPI transfer timed out
> > > [  146.965023] spi_master spi0: failed to transfer one message from queue (ret=-110)
> >
> > > [  146.965216] mcp251xfd spi0.0 can0: CRC read error at address
> > > 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000) retrying.
> > > [  146.965247] mcp251xfd spi0.0 can0: CRC read error at address
> > > 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000) retrying.
> > > [  146.965277] mcp251xfd spi0.0 can0: CRC read error at address
> > > 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000) retrying.
> > > [  146.965286] mcp251xfd spi0.0 can0: CRC read error at address
> > > 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000).
> > > [  146.965331] mcp251xfd spi0.0 can0: CRC read error at address
> > > 0x0000 (length=4, data=00 00 00 00, CRC=0x0000) retrying.
> > > [  146.965360] mcp251xfd spi0.0 can0: CRC read error at address
> > > 0x0000 (length=4, data=00 00 00 00, CRC=0x0000) retrying.
> > > [  146.965389] mcp251xfd spi0.0 can0: CRC read error at address
> > > 0x0000 (length=4, data=00 00 00 00, CRC=0x0000) retrying.
> > > [  146.965397] mcp251xfd spi0.0 can0: CRC read error at address
> > > 0x0000 (length=4, data=00 00 00 00, CRC=0x0000).
> > > [  146.965413] A link change request failed with some changes
> > > committed already. Interface can0 may have been left with an
> > > inconsistent configuration, please check.
> > >
> > > Regarding the discussion about Kconfig flags, I went ahead and
> > > rebuilt kernel 5.10.44 using a config that was essentially
> > > arch/arm/configs/bcm2711_defconfig with these additions needed to
> > > get our I2S working. This should have undone the switch to ONDEMAND
> > > governor and enabling 1000 Hz clock.
> >
> > Please switch back the clock to the standard HZ setting.
>
> I compiled my 64 bit raspi kernel with HZ=1000 and my mcp2518fd board works without problem on my raspi4b.
>
> | static int bcm2835_spi_transfer_one_poll(struct spi_controller *ctlr,
> |                                        struct spi_device *spi,
> |                                        struct spi_transfer *tfr,
> |                                        u32 cs)
> | {
> [...]
> |       /* set the timeout to at least 2 jiffies */
> |       timeout = jiffies + 2 + HZ * polling_limit_us / 1000000;
>
> The timeout is calculated in jiffies. The jiffies variable is incremented once per timer tick (which depends on the clock HZ configuration). There are "HZ" jiffies per second. This means the above "2" equals 8ms (HZ=250), but with HZ=1000 only 2ms.
>
> To keep the timeout constant, you can change this into:
>
>         timeout = jiffies + (HZ * 8) / 1000 + HZ * polling_limit_us / 1000000;
>
> However, the polling mode will only be used for transfers that should finish in 30 µs. So even 2ms is far off...
>
> |
> |       /* loop until finished the transfer */
> |       while (bs->rx_len) {
> |               /* fill in tx fifo with remaining data */
> |               bcm2835_wr_fifo(bs);
> |
> |               /* read from fifo as much as possible */
> |               bcm2835_rd_fifo(bs);
> |
> |               /* if there is still data pending to read
> |                * then check the timeout
> |                */
> |               if (bs->rx_len && time_after(jiffies, timeout)) {
>
> If there is a timeout, the driver will fall back to IRQ mode.
>
>
> Can you add a "#define DEBUG" in spi-bcm2835.c, even before the other "#include" directives. That should give you this debug message:
>
> |                       dev_dbg_ratelimited(&spi->dev,
> |                                           "timeout period reached: jiffies: %lu remaining tx/rx: %d/%d - falling back to interrupt mode\n",
> |                                           jiffies - timeout,
> |                                           bs->tx_len, bs->rx_len);
> |                       /* fall back to interrupt mode */
> |
> |                       /* update usage statistics */
> |                       bs->count_transfer_irq_after_polling++;
> |
> |                       return bcm2835_spi_transfer_one_irq(ctlr, spi,
> |                                                           tfr, cs, false);
>
> here it activates the IRQ. But I'm not sure if the fallback works correctly....
>
> |               }
> |       }
> |
>
> Marc
>
> --
> Pengutronix e.K.                 | Marc Kleine-Budde           |
> Embedded Linux                   | https://www.pengutronix.de  |
> Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
> Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |


* Re: MCP2518FD Drivers Rarely Working with Custom Kernel 5.10.Y
       [not found]               ` <022d01d769e2$e623cbf0$b26b63d0$@gmail.com>
  2021-07-02  4:33                 ` Joshua Quesenberry
@ 2021-07-02  9:31                 ` Marc Kleine-Budde
  2021-07-02 14:26                   ` Joshua Quesenberry
  1 sibling, 1 reply; 15+ messages in thread
From: Marc Kleine-Budde @ 2021-07-02  9:31 UTC (permalink / raw)
  To: Joshua Quesenberry; +Cc: 'Patrick Menschel', kernel, linux-can

On 25.06.2021 12:55:26, Joshua Quesenberry wrote:
> Forgive me, I forgot can0 = spi0.1 and can1 = spi0.0 right now because
> I killed my UDEV rule so I was tapped onto the wrong CS line. Attached
> is a snapshot of what I'm seeing AND an export of the data from Saleae
> which may prove more useful than snapshots.

Pulseview cannot parse the csv file correctly (see [1]). Can you save it
in a different format?

Marc

[1] https://www.mail-archive.com/sigrok-devel@lists.sourceforge.net/msg03751.html


-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |


* Re: MCP2518FD Drivers Rarely Working with Custom Kernel 5.10.Y
  2021-07-02  9:31                 ` Marc Kleine-Budde
@ 2021-07-02 14:26                   ` Joshua Quesenberry
  2021-07-06 18:40                     ` Joshua Quesenberry
  0 siblings, 1 reply; 15+ messages in thread
From: Joshua Quesenberry @ 2021-07-02 14:26 UTC (permalink / raw)
  To: Marc Kleine-Budde; +Cc: Patrick Menschel, kernel, linux-can, Joshua Quesenberry

[-- Attachment #1: Type: text/plain, Size: 1272 bytes --]

The only other format I have is Saleae's trace format, if you're
willing to install their software, the attached trace should work. I
double checked that the application will load even without the device
attached, and it does, so it should work for you. I was using Logic 1,
but switched to their Logic 2 app and recollected a trace for you.
https://www.saleae.com/downloads/

Thanks,

Josh Q

On Fri, Jul 2, 2021 at 5:31 AM Marc Kleine-Budde <mkl@pengutronix.de> wrote:
>
> On 25.06.2021 12:55:26, Joshua Quesenberry wrote:
> > Forgive me, I forgot can0 = spi0.1 and can1 = spi0.0 right now because
> > I killed my UDEV rule so I was tapped onto the wrong CS line. Attached
> > is a snapshot of what I'm seeing AND an export of the data from Saleae
> > which may prove more useful than snapshots.
>
> Pulseview cannot parse the csv file correctly (see [1]). Can you save it
> in a different format?
>
> Marc
>
> [1] https://www.mail-archive.com/sigrok-devel@lists.sourceforge.net/msg03751.html
>
>
> --
> Pengutronix e.K.                 | Marc Kleine-Budde           |
> Embedded Linux                   | https://www.pengutronix.de  |
> Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
> Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |

[-- Attachment #2: MCP251xFD - Not Working - 50 MHz, 500 M Samples - 20210702 1018.sal --]
[-- Type: application/octet-stream, Size: 8355 bytes --]


* RE: MCP2518FD Drivers Rarely Working with Custom Kernel 5.10.Y
  2021-07-02 14:26                   ` Joshua Quesenberry
@ 2021-07-06 18:40                     ` Joshua Quesenberry
  2021-07-12 18:57                       ` Joshua Quesenberry
  0 siblings, 1 reply; 15+ messages in thread
From: Joshua Quesenberry @ 2021-07-06 18:40 UTC (permalink / raw)
  To: 'Marc Kleine-Budde'
  Cc: 'Patrick Menschel', kernel, linux-can, engnfrc

Good Afternoon,

Today I was planning to attack this problem from two different angles: first, detach the HAT and narrow my config.txt down to just what's needed for CAN (with the HAT wired up by jumper wires), and second, completely rebuild my OS from scratch in case something went awry during the upgrades. Luckily, while narrowing down my config.txt I found the issue: loading the I2S subsystem appears to conflict with SPI0. Since I2S is a lesser-used feature, I'm guessing none of you are testing with it loaded? Any ideas on how to begin troubleshooting this? I2S is something we need.

Current config.txt, with the two I2S lines commented out (double hash), which results in working CAN on each reboot:

------------------------------------------------------------------------------------------------------------------------------------

dtdebug=1

disable_splash=1
boot_delay=0

hdmi_force_mode=1

hdmi_group=1
hdmi_mode=3 # 480p 60Hz H

# Uncomment some or all of these to enable the optional hardware interfaces
dtparam=i2c_arm=on
## dtparam=i2s=on
dtparam=spi=off
enable_uart=0

# Enable audio (loads snd_bcm2835)
dtparam=audio=on

# I2S WM8782 Driver
## dtoverlay=hel-wm8782,alsaname=mic

[pi4]
# Enable DRM VC4 V3D driver on top of the dispmanx display stack
dtoverlay=vc4-fkms-v3d
max_framebuffers=2

[all]
start_x=1
gpu_mem=512

# GPS
dtoverlay=uart3

# Reserved for HAT IDs
dtoverlay=i2c0
dtparam=pins_0_1=on
dtparam=combine=off

# Power Supervisor, IMU, RPi I2C Bus
dtoverlay=i2c1

# CAN 0/1
dtoverlay=spi0-cs
dtparam=cs0_pin=8
dtparam=cs1_pin=7

# Reserved
dtoverlay=spi5-2cs
dtparam=cs0_pin=12
dtparam=cs0_spidev=disabled
dtparam=cs1_pin=26
dtparam=cs1_spidev=disabled

# CAN 0
dtoverlay=mcp251xfd,spi0-0,interrupt=24,oscillator=40000000,speed=20000000

# CAN 1
dtoverlay=mcp251xfd,spi0-1,interrupt=25,oscillator=40000000,speed=20000000

------------------------------------------------------------------------------------------------------------------------------------

Thanks,

Josh Q

-----Original Message-----
From: Joshua Quesenberry <engnfrc@gmail.com> 
Sent: Friday, July 2, 2021 10:27 AM
To: Marc Kleine-Budde <mkl@pengutronix.de>
Cc: Patrick Menschel <menschel.p@posteo.de>; kernel@pengutronix.de; linux-can@vger.kernel.org; Joshua Quesenberry <EngnFrc@gmail.com>
Subject: Re: MCP2518FD Drivers Rarely Working with Custom Kernel 5.10.Y

The only other format I have is Saleae's trace format, if you're willing to install their software, the attached trace should work. I double checked that the application will load even without the device attached, and it does, so it should work for you. I was using Logic 1, but switched to their Logic 2 app and recollected a trace for you.
https://www.saleae.com/downloads/

Thanks,

Josh Q

On Fri, Jul 2, 2021 at 5:31 AM Marc Kleine-Budde <mkl@pengutronix.de> wrote:
>
> On 25.06.2021 12:55:26, Joshua Quesenberry wrote:
> > Forgive me, I forgot can0 = spi0.1 and can1 = spi0.0 right now 
> > because I killed my UDEV rule so I was tapped onto the wrong CS 
> > line. Attached is a snapshot of what I'm seeing AND an export of the 
> > data from Saleae which may prove more useful than snapshots.
>
> Pulseview cannot parse the csv file correctly (see [1]). Can you save 
> it in a different format?
>
> Marc
>
> [1] https://www.mail-archive.com/sigrok-devel@lists.sourceforge.net/msg03751.html
>
>
> --
> Pengutronix e.K.                 | Marc Kleine-Budde           |
> Embedded Linux                   | https://www.pengutronix.de  |
> Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
> Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |


^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: MCP2518FD Drivers Rarely Working with Custom Kernel 5.10.Y
  2021-07-06 18:40                     ` Joshua Quesenberry
@ 2021-07-12 18:57                       ` Joshua Quesenberry
  2021-07-12 19:56                         ` Patrick Menschel
  0 siblings, 1 reply; 15+ messages in thread
From: Joshua Quesenberry @ 2021-07-12 18:57 UTC (permalink / raw)
  To: 'Marc Kleine-Budde'
  Cc: 'Patrick Menschel', kernel, linux-can, engnfrc

Any thoughts on my recent findings? So far the raspberrypi.org forums haven't proved fruitful. I'm not sure whether there's a more appropriate place to take this conversation, now that the issue doesn't seem to be related to the CAN drivers themselves but to a conflict between underlying subsystems.

Thanks!

Josh Q

-----Original Message-----
From: Joshua Quesenberry <engnfrc@gmail.com> 
Sent: Tuesday, July 6, 2021 2:41 PM
To: 'Marc Kleine-Budde' <mkl@pengutronix.de>
Cc: 'Patrick Menschel' <menschel.p@posteo.de>; kernel@pengutronix.de; linux-can@vger.kernel.org; engnfrc@gmail.com
Subject: RE: MCP2518FD Drivers Rarely Working with Custom Kernel 5.10.Y

Good Afternoon,

Today I planned to attack this problem from two angles: first, detach the HAT and narrow my config.txt down to just what's needed for CAN (with the HAT wired up by jumper wires); second, rebuild my OS from scratch in case something went awry during the upgrades. Luckily, while narrowing down my config.txt, I found the issue. It appears that loading the I2S subsystem conflicts with SPI0. Since I2S is a lesser-used feature, I'm guessing none of you are testing with it loaded? Any ideas on how to begin troubleshooting this? We need I2S.
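
One possible first step is to rule out a plain GPIO clash. The sketch below is only a rough check under stated assumptions: the chip-select and interrupt pins come from the config.txt in this thread, while the SPI0 data/clock pins (9, 10, 11) and the I2S pins (18 to 21) are assumed to be the standard 40-pin header assignments. A custom overlay such as hel-wm8782 may claim different pins, so its source would need checking. `find_conflicts` and the `PIN_CLAIMS` table are hypothetical names for illustration.

```python
# Sketch: look for GPIO numbers claimed by more than one function.
# Chip-select and interrupt pins are taken from the config.txt in this thread.
# ASSUMPTION: SPI0 MISO/MOSI/SCLK are GPIO 9/10/11 and I2S uses GPIO 18-21
# (the standard header assignments); a custom overlay may differ.

from collections import defaultdict

PIN_CLAIMS = {
    "spi0 (CAN 0/1)": [9, 10, 11, 8, 7],
    "mcp251xfd interrupts": [24, 25],
    "spi5 chip selects": [12, 26],
    "i2s (assumed default pins)": [18, 19, 20, 21],
}


def find_conflicts(claims):
    """Map each GPIO claimed by two or more entries to the names claiming it."""
    owners = defaultdict(list)
    for name, pins in claims.items():
        for pin in pins:
            owners[pin].append(name)
    return {pin: names for pin, names in owners.items() if len(names) > 1}


conflicts = find_conflicts(PIN_CLAIMS)
print(conflicts or "no overlapping GPIO claims in this table")
```

An empty result only rules out a simple pin clash under these assumptions; it says nothing about other shared resources, so it is a starting point rather than a diagnosis.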

Current config.txt, with the two I2S lines commented out (marked with a double hash), which results in working CAN on every reboot:

------------------------------------------------------------------------------------------------------------------------------------

dtdebug=1

disable_splash=1
boot_delay=0

hdmi_force_mode=1

hdmi_group=1
hdmi_mode=3 # 480p 60Hz H

# Uncomment some or all of these to enable the optional hardware interfaces
dtparam=i2c_arm=on
## dtparam=i2s=on
dtparam=spi=off
enable_uart=0

# Enable audio (loads snd_bcm2835)
dtparam=audio=on

# I2S WM8782 Driver
## dtoverlay=hel-wm8782,alsaname=mic

[pi4]
# Enable DRM VC4 V3D driver on top of the dispmanx display stack
dtoverlay=vc4-fkms-v3d
max_framebuffers=2

[all]
start_x=1
gpu_mem=512

# GPS
dtoverlay=uart3

# Reserved for HAT IDs
dtoverlay=i2c0
dtparam=pins_0_1=on
dtparam=combine=off

# Power Supervisor, IMU, RPi I2C Bus
dtoverlay=i2c1

# CAN 0/1
dtoverlay=spi0-cs
dtparam=cs0_pin=8
dtparam=cs1_pin=7

# Reserved
dtoverlay=spi5-2cs
dtparam=cs0_pin=12
dtparam=cs0_spidev=disabled
dtparam=cs1_pin=26
dtparam=cs1_spidev=disabled

# CAN 0
dtoverlay=mcp251xfd,spi0-0,interrupt=24,oscillator=40000000,speed=20000000

# CAN 1
dtoverlay=mcp251xfd,spi0-1,interrupt=25,oscillator=40000000,speed=20000000

------------------------------------------------------------------------------------------------------------------------------------

Thanks,

Josh Q

-----Original Message-----
From: Joshua Quesenberry <engnfrc@gmail.com>
Sent: Friday, July 2, 2021 10:27 AM
To: Marc Kleine-Budde <mkl@pengutronix.de>
Cc: Patrick Menschel <menschel.p@posteo.de>; kernel@pengutronix.de; linux-can@vger.kernel.org; Joshua Quesenberry <EngnFrc@gmail.com>
Subject: Re: MCP2518FD Drivers Rarely Working with Custom Kernel 5.10.Y

The only other format I have is Saleae's trace format. If you're willing to install their software, the attached trace should work. I double-checked that the application loads even without the device attached, so it should work for you. I was using Logic 1, but switched to their Logic 2 app and re-collected a trace for you.
https://www.saleae.com/downloads/

Thanks,

Josh Q

On Fri, Jul 2, 2021 at 5:31 AM Marc Kleine-Budde <mkl@pengutronix.de> wrote:
>
> On 25.06.2021 12:55:26, Joshua Quesenberry wrote:
> > Forgive me, I forgot can0 = spi0.1 and can1 = spi0.0 right now 
> > because I killed my UDEV rule so I was tapped onto the wrong CS 
> > line. Attached is a snapshot of what I'm seeing AND an export of the 
> > data from Saleae which may prove more useful than snapshots.
>
> Pulseview cannot parse the csv file correctly (see [1]). Can you save 
> it in a different format?
>
> Marc
>
> [1] https://www.mail-archive.com/sigrok-devel@lists.sourceforge.net/msg03751.html
>
>
> --
> Pengutronix e.K.                 | Marc Kleine-Budde           |
> Embedded Linux                   | https://www.pengutronix.de  |
> Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
> Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: MCP2518FD Drivers Rarely Working with Custom Kernel 5.10.Y
  2021-07-12 18:57                       ` Joshua Quesenberry
@ 2021-07-12 19:56                         ` Patrick Menschel
  2021-07-12 20:01                           ` Joshua Quesenberry
  0 siblings, 1 reply; 15+ messages in thread
From: Patrick Menschel @ 2021-07-12 19:56 UTC (permalink / raw)
  To: Joshua Quesenberry, 'Marc Kleine-Budde'; +Cc: kernel, linux-can

Am 12.07.21 um 20:57 schrieb Joshua Quesenberry:
> Any thoughts on my recent findings? So far the raspberrypi.org forums haven't proved fruitful. I'm not sure whether there's a more appropriate place to take this conversation, now that the issue doesn't seem to be related to the CAN drivers themselves but to a conflict between underlying subsystems.


Technically you could open an issue on
https://github.com/raspberrypi/linux

This is usually the most straightforward way to get in touch with the
platform experts.

Cheers,
Patrick

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: MCP2518FD Drivers Rarely Working with Custom Kernel 5.10.Y
  2021-07-12 19:56                         ` Patrick Menschel
@ 2021-07-12 20:01                           ` Joshua Quesenberry
  0 siblings, 0 replies; 15+ messages in thread
From: Joshua Quesenberry @ 2021-07-12 20:01 UTC (permalink / raw)
  To: 'Patrick Menschel', 'Marc Kleine-Budde'
  Cc: kernel, linux-can, engnfrc

Thanks for the suggestion Patrick! I will do that shortly.

Josh Q

-----Original Message-----
From: Patrick Menschel <menschel.p@posteo.de> 
Sent: Monday, July 12, 2021 3:57 PM
To: Joshua Quesenberry <engnfrc@gmail.com>; 'Marc Kleine-Budde' <mkl@pengutronix.de>
Cc: kernel@pengutronix.de; linux-can@vger.kernel.org
Subject: Re: MCP2518FD Drivers Rarely Working with Custom Kernel 5.10.Y

Am 12.07.21 um 20:57 schrieb Joshua Quesenberry:
> Any thoughts on my recent findings? So far the raspberrypi.org forums haven't proved fruitful. I'm not sure whether there's a more appropriate place to take this conversation, now that the issue doesn't seem to be related to the CAN drivers themselves but to a conflict between underlying subsystems.


Technically you could open an issue on
https://github.com/raspberrypi/linux

This is usually the most straightforward way to get in touch with the platform experts.

Cheers,
Patrick


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2021-07-12 20:01 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <016701d7678c$2b3d50c0$81b7f240$@gmail.com>
     [not found] ` <20210622212818.enfx5fzgghfxfznb@pengutronix.de>
2021-06-23  2:59   ` MCP2518FD Drivers Rarely Working with Custom Kernel 5.10.Y Joshua Quesenberry
2021-06-23  5:24     ` Patrick Menschel
2021-06-23 17:34       ` Joshua Quesenberry
2021-06-23 20:07         ` Patrick Menschel
2021-06-24 18:24           ` Joshua Quesenberry
2021-06-24 20:41             ` Patrick Menschel
2021-06-25  6:56         ` Marc Kleine-Budde
2021-06-25 12:16           ` Marc Kleine-Budde
     [not found]             ` <020f01d769da$9fac86b0$df059410$@gmail.com>
     [not found]               ` <022d01d769e2$e623cbf0$b26b63d0$@gmail.com>
2021-07-02  4:33                 ` Joshua Quesenberry
2021-07-02  9:31                 ` Marc Kleine-Budde
2021-07-02 14:26                   ` Joshua Quesenberry
2021-07-06 18:40                     ` Joshua Quesenberry
2021-07-12 18:57                       ` Joshua Quesenberry
2021-07-12 19:56                         ` Patrick Menschel
2021-07-12 20:01                           ` Joshua Quesenberry
