All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: ram initialization on mcp2518fd
       [not found]   ` <441514ec-7b43-e11c-09b5-bdaf7fca0077@prevas.dk>
@ 2022-06-16 11:21     ` Rasmus Villemoes
  2022-06-16 12:40       ` Marc Kleine-Budde
  0 siblings, 1 reply; 9+ messages in thread
From: Rasmus Villemoes @ 2022-06-16 11:21 UTC (permalink / raw)
  To: Marc Kleine-Budde; +Cc: info, linux-can

[resending without image attached so that it makes it to the linux-can list]

On 16/06/2022 12.11, Marc Kleine-Budde wrote:
> Hello Rasmus,
>
> can we move this discussion to the Linux-CAN mailing list
> (linux-can@vger.kernel.org)?

Absolutely. Cc'ed.

> On 16.06.2022 11:13:41, Rasmus Villemoes wrote:
>> I have a board based on imx8mp with two mcp2518fd devices (sitting on
>> two different spi busses). I'm having some trouble getting them to
>> work.
>
> The ecspi or the qspi?
>

ecspi


>> A logic analyzer shows that the spi communication works (mostly) fine
>> [1], up until mcp251xfd_chip_ecc_init(). In there, it seems that the
>> regmap_raw_write() ends up trying to do byte-by-byte transfers - i.e.,
>> the chip select is only asserted for 8 clock cycles, then deasserted,
>> and asserted again for the next byte.
>
> What about the other transfers done before the
> mcp251xfd_chip_ecc_init()? Do the look correct?

Yes. For example the very first write done from mcp251xfd_chip_start()
-> mcp251xfd_chip_softreset{,_do}() -> mcp251xfd_chip_clock_enable()
looks like https://ibb.co/xMbdsSq , and the MOSI signal looks right (0xa
= crc
write to 0xe00 aka OSC, 4 value bytes, the 0x00000060 value in little
endian, and the crc partly off-screen).

>> Unfortunately, I can't physically put more than two probes on at a
>> time, but I have captured (clk, mosi) at one point and (clk, cs) at
>> another, and in both cases one sees these 8-clock bursts, which should
>> never appear with this device.
>
> ACK
>
>> Obviously, that means that the RAM doesn't get initialized, because
>> all the device sees is a bunch of aborted spi transactions. I can't
>> really figure out why this happens, maybe I'm missing some setting in
>> DT or elsewhere; currently I have
>
> Can you show me your SPI host controller node, too?

Yes, it's essentially the one from imx8mp.dtsi, i.e.

	ecspi1: spi@30820000 {
		#address-cells = <1>;
		#size-cells = <0>;
		compatible = "fsl,imx8mp-ecspi", "fsl,imx6ul-ecspi";
		reg = <0x30820000 0x10000>;
		interrupts = <GIC_SPI 31 IRQ_TYPE_LEVEL_HIGH>;
		clocks = <&clk IMX8MP_CLK_ECSPI1_ROOT>,
			 <&clk IMX8MP_CLK_ECSPI1_ROOT>;
		clock-names = "ipg", "per";
		assigned-clock-rates = <80000000>;
		assigned-clocks = <&clk IMX8MP_CLK_ECSPI1>;
		assigned-clock-parents = <&clk IMX8MP_SYS_PLL1_800M>;
		dmas = <&sdma1 0 7 1>, <&sdma1 1 7 2>;
		dma-names = "rx", "tx";
		status = "disabled";
	};

with this in the board .dts:

&ecspi1 {
	status = "okay";
	pinctrl-names = "default";
	pinctrl-0 = <&pinctrl_ecspi1>;
	cs-gpios = <&gpio5 9 GPIO_ACTIVE_LOW>;

	CAN_C: spi@0 {
		reg = <0>;
		compatible = "microchip,mcp2518fd";
		/* clocks = <&cdce937 6>; */
		clocks = <&clk_40MHz>;
		interrupts-extended = <&gpio1 6 IRQ_TYPE_LEVEL_LOW>;
		spi-max-frequency = <10000000>;
		pinctrl-names = "default";
		pinctrl-0 = <&pinctrl_CAN_C>;
	};
};

It's currently (for bad reasons) the nxp kernel, hence the
fsl,imx6ul-ecspi compatible, but changing that to the one in mainline,
fsl,imx51-ecspi, doesn't change the symptoms.


>> 	CAN_C: spi@0 {
>> 		reg = <0>;
>> 		compatible = "microchip,mcp2518fd";
>> 		clocks = <&clk_40MHz>;
>> 		interrupts-extended = <&gpio1 6 IRQ_TYPE_LEVEL_LOW>;
>> 		spi-max-frequency = <17000000>;
>> 		pinctrl-names = "default";
>> 		pinctrl-0 = <&pinctrl_CAN_C>;
>> 	};
>>
>> with the pinctrl being for the irq gpio. I have also tried lowering
>> spi-max-frequency to 10MHz (currently it ends up effectively using
>> 16MHz),
>
> The driver uses 85% of clock/2 or spi-max-frequency, whatever is lower.

Yes; using 10MHz was just an attempt to see if using an even lower
frequency than the maximum implied by the errata (and implemented in the
driver) would make a difference.

>> [1] One thing I have noticed is that we probably want to do
>>
>> -       xfer[1].len = sizeof(dev_id);
>> +       xfer[1].len = sizeof(*dev_id);
>
> Fixed. Do you want to appear as Reported-by in the patch?

Yes please.

>> But: the reading doesn't seem to work; it's as if the device doesn't
>> drive MISO at all at this point, because when I configure the pin with
>> a weak internal pull-up, I get 0xffffffff, while if I use pinmux
>> settings with a weak pull-down I get 0 - which is the "correct"
>> answer, but probably not for the right reason.
>
> The device id is read after detecting the chip and the
> mcp251xfd_chip_ecc_init(). Does the chip detection work properly?

It appears so, adding a print of the osc variable after the
regmap_read() in mcp251xfd_register_chip_detect() gives

mcp251xfd_register_chip_detect: osc = 0x00000468

so the LPMEN bit seems to have been set (and the other bits show
expected values). And it's true that the dev_id is read after this,
through mcp251xfd_register_done(), which is why I'm puzzled that it
doesn't seem to work.

Are you sure about the ecc_init() part? AFAICT that's only called from
chip_start, which in turn is only called when the device is brought up.

>> I'm on a 5.10.y kernel, but I don't see any commits related to this
>> since then (other than the error handling in
>> mcp251xfd_register_get_dev_id()).
>
> Which kernel are you using exactly? Mainline or freescale/nxp
> downstream?

Currently NXP,  lf-5.10.72-2.2.0 aka a68e31b63f86. I'll see if I can
manage to make a mainline kernel boot.

> Have you enabled DMA on SPI?

Not explicitly, but nor have I done anything to disable/not enable it,
so I'm not really sure what the right answer is. Is that a CONFIG_ knob
or module/kernel parameter?

Thanks,
Rasmus

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: ram initialization on mcp2518fd
  2022-06-16 11:21     ` ram initialization on mcp2518fd Rasmus Villemoes
@ 2022-06-16 12:40       ` Marc Kleine-Budde
  2022-06-17 14:01         ` Rasmus Villemoes
  0 siblings, 1 reply; 9+ messages in thread
From: Marc Kleine-Budde @ 2022-06-16 12:40 UTC (permalink / raw)
  To: Rasmus Villemoes; +Cc: linux-can

[-- Attachment #1: Type: text/plain, Size: 6463 bytes --]

On 16.06.2022 13:21:52, Rasmus Villemoes wrote:
> > The ecspi or the qspi?
> 
> ecspi

OK.

> >> A logic analyzer shows that the spi communication works (mostly) fine
> >> [1], up until mcp251xfd_chip_ecc_init(). In there, it seems that the
> >> regmap_raw_write() ends up trying to do byte-by-byte transfers - i.e.,
> >> the chip select is only asserted for 8 clock cycles, then deasserted,
> >> and asserted again for the next byte.
> >
> > What about the other transfers done before the
> > mcp251xfd_chip_ecc_init()? Do the look correct?
> 
> Yes. For example the very first write done from mcp251xfd_chip_start()
> -> mcp251xfd_chip_softreset{,_do}() -> mcp251xfd_chip_clock_enable()
> looks like https://ibb.co/xMbdsSq , and the MOSI signal looks right
> (0xa = crc write to 0xe00 aka OSC, 4 value bytes, the 0x00000060 value
> in little endian, and the crc partly off-screen).

That looks perfect.

[...]

> >> Obviously, that means that the RAM doesn't get initialized, because
> >> all the device sees is a bunch of aborted spi transactions. I can't
> >> really figure out why this happens, maybe I'm missing some setting in
> >> DT or elsewhere; currently I have
> >
> > Can you show me your SPI host controller node, too?
> 
> Yes, it's essentially the one from imx8mp.dtsi, i.e.
> 
> 	ecspi1: spi@30820000 {
> 		#address-cells = <1>;
> 		#size-cells = <0>;
> 		compatible = "fsl,imx8mp-ecspi", "fsl,imx6ul-ecspi";
> 		reg = <0x30820000 0x10000>;
> 		interrupts = <GIC_SPI 31 IRQ_TYPE_LEVEL_HIGH>;
> 		clocks = <&clk IMX8MP_CLK_ECSPI1_ROOT>,
> 			 <&clk IMX8MP_CLK_ECSPI1_ROOT>;
> 		clock-names = "ipg", "per";
> 		assigned-clock-rates = <80000000>;
> 		assigned-clocks = <&clk IMX8MP_CLK_ECSPI1>;
> 		assigned-clock-parents = <&clk IMX8MP_SYS_PLL1_800M>;
> 		dmas = <&sdma1 0 7 1>, <&sdma1 1 7 2>;
> 		dma-names = "rx", "tx";
> 		status = "disabled";
> 	};
> 
> with this in the board .dts:
> 
> &ecspi1 {
> 	status = "okay";
> 	pinctrl-names = "default";
> 	pinctrl-0 = <&pinctrl_ecspi1>;
> 	cs-gpios = <&gpio5 9 GPIO_ACTIVE_LOW>;
> 
> 	CAN_C: spi@0 {
> 		reg = <0>;
> 		compatible = "microchip,mcp2518fd";
> 		/* clocks = <&cdce937 6>; */
> 		clocks = <&clk_40MHz>;
> 		interrupts-extended = <&gpio1 6 IRQ_TYPE_LEVEL_LOW>;
> 		spi-max-frequency = <10000000>;
> 		pinctrl-names = "default";
> 		pinctrl-0 = <&pinctrl_CAN_C>;
> 	};
> };
> 
> It's currently (for bad reasons) the nxp kernel, hence the
> fsl,imx6ul-ecspi compatible, but changing that to the one in mainline,
> fsl,imx51-ecspi, doesn't change the symptoms.

The imx8mp.dtsi enables DMA for the SPI host controller.

> >> 	CAN_C: spi@0 {
> >> 		reg = <0>;
> >> 		compatible = "microchip,mcp2518fd";
> >> 		clocks = <&clk_40MHz>;
> >> 		interrupts-extended = <&gpio1 6 IRQ_TYPE_LEVEL_LOW>;
> >> 		spi-max-frequency = <17000000>;
> >> 		pinctrl-names = "default";
> >> 		pinctrl-0 = <&pinctrl_CAN_C>;
> >> 	};
> >>
> >> with the pinctrl being for the irq gpio. I have also tried lowering
> >> spi-max-frequency to 10MHz (currently it ends up effectively using
> >> 16MHz),
> >
> > The driver uses 85% of clock/2 or spi-max-frequency, whatever is lower.
> 
> Yes; using 10MHz was just an attempt to see if using an even lower
> frequency than the maximum implied by the errata (and implemented in the
> driver) would make a difference.
> 
> >> [1] One thing I have noticed is that we probably want to do
> >>
> >> -       xfer[1].len = sizeof(dev_id);
> >> +       xfer[1].len = sizeof(*dev_id);
> >
> > Fixed. Do you want to appear as Reported-by in the patch?
> 
> Yes please.

Updated.

> >> But: the reading doesn't seem to work; it's as if the device doesn't
> >> drive MISO at all at this point, because when I configure the pin with
> >> a weak internal pull-up, I get 0xffffffff, while if I use pinmux
> >> settings with a weak pull-down I get 0 - which is the "correct"
> >> answer, but probably not for the right reason.
> >
> > The device id is read after detecting the chip and the
> > mcp251xfd_chip_ecc_init(). Does the chip detection work properly?
> 
> It appears so, adding a print of the osc variable after the
> regmap_read() in mcp251xfd_register_chip_detect() gives
> 
> mcp251xfd_register_chip_detect: osc = 0x00000468
> 
> so the LPMEN bit seems to have been set (and the other bits show
> expected values). And it's true that the dev_id is read after this,
> through mcp251xfd_register_done(), which is why I'm puzzled that it
> doesn't seem to work.
> 
> Are you sure about the ecc_init() part? AFAICT that's only called from
> chip_start, which in turn is only called when the device is brought
> up.

Yes, you're right!

> >> I'm on a 5.10.y kernel, but I don't see any commits related to this
> >> since then (other than the error handling in
> >> mcp251xfd_register_get_dev_id()).
> >
> > Which kernel are you using exactly? Mainline or freescale/nxp
> > downstream?
> 
> Currently NXP,  lf-5.10.72-2.2.0 aka a68e31b63f86. I'll see if I can
> manage to make a mainline kernel boot.

v5.10 is old, you will not get the best mcp251xfd performance out of it.
There are several performance improvements on the SPI framework, the
spi-imx driver and the mcp251xfd driver with a recent kernel.

> > Have you enabled DMA on SPI?
> 
> Not explicitly, but nor have I done anything to disable/not enable it,
> so I'm not really sure what the right answer is. Is that a CONFIG_ knob
> or module/kernel parameter?

You can use "/delete-property/ dmas;" and "/delete-property/ dma-names;"
in your board dts on the &ecspi1 node, the module parameter "use_dma=0",
or the kernel command line "spi-imx.use_dma=0" to disable DMA.

The SPI host driver uses PIO for smaller transfers, but switches to DMA
mode for bigger ones (IIRC > 64 bytes). The clearing of the memory
definitely falls into the big transfer category. The SPI DMA mode is
broken on various kernel variants (downstream, mainline), kernel
versions, SoC variants and used SDMA firmware versions. For the
mcp251xfd driver DMA is also slower. So switch of DMA and try again.

Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: ram initialization on mcp2518fd
  2022-06-16 12:40       ` Marc Kleine-Budde
@ 2022-06-17 14:01         ` Rasmus Villemoes
  2022-06-17 14:17           ` Marc Kleine-Budde
  0 siblings, 1 reply; 9+ messages in thread
From: Rasmus Villemoes @ 2022-06-17 14:01 UTC (permalink / raw)
  To: Marc Kleine-Budde; +Cc: linux-can

On 16/06/2022 14.40, Marc Kleine-Budde wrote:
> On 16.06.2022 13:21:52, Rasmus Villemoes wrote:

>>> Have you enabled DMA on SPI?
>>
>> Not explicitly, but nor have I done anything to disable/not enable it,
>> so I'm not really sure what the right answer is. Is that a CONFIG_ knob
>> or module/kernel parameter?
> 
> You can use "/delete-property/ dmas;" and "/delete-property/ dma-names;"
> in your board dts on the &ecspi1 node, the module parameter "use_dma=0",
> or the kernel command line "spi-imx.use_dma=0" to disable DMA.
> 
> The SPI host driver uses PIO for smaller transfers, but switches to DMA
> mode for bigger ones (IIRC > 64 bytes). The clearing of the memory
> definitely falls into the big transfer category. The SPI DMA mode is
> broken on various kernel variants (downstream, mainline), kernel
> versions, SoC variants and used SDMA firmware versions. For the
> mcp251xfd driver DMA is also slower. So switch of DMA and try again.

Thanks! Disabling DMA did the trick, and I can now send traffic back and
forth between the two chips, and since the RAM now gets properly
initialized, I also don't get the ecc error interrupts.

There is still something to be looked at, since I see

NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!

when I start the traffic test. And the dev_id reading still doesn't work
(though it's not really used for anything other than an informative
printk). But I'll have to get our hardware guys to help me do some
soldering to capture all four channels at once to see just exactly what
is going on there.

Rasmus

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: ram initialization on mcp2518fd
  2022-06-17 14:01         ` Rasmus Villemoes
@ 2022-06-17 14:17           ` Marc Kleine-Budde
  2022-06-20  7:09             ` Rasmus Villemoes
  0 siblings, 1 reply; 9+ messages in thread
From: Marc Kleine-Budde @ 2022-06-17 14:17 UTC (permalink / raw)
  To: Rasmus Villemoes; +Cc: linux-can

[-- Attachment #1: Type: text/plain, Size: 2109 bytes --]

On 17.06.2022 16:01:56, Rasmus Villemoes wrote:
> >>> Have you enabled DMA on SPI?
> >>
> >> Not explicitly, but nor have I done anything to disable/not enable it,
> >> so I'm not really sure what the right answer is. Is that a CONFIG_ knob
> >> or module/kernel parameter?
> > 
> > You can use "/delete-property/ dmas;" and "/delete-property/ dma-names;"
> > in your board dts on the &ecspi1 node, the module parameter "use_dma=0",
> > or the kernel command line "spi-imx.use_dma=0" to disable DMA.
> > 
> > The SPI host driver uses PIO for smaller transfers, but switches to DMA
> > mode for bigger ones (IIRC > 64 bytes). The clearing of the memory
> > definitely falls into the big transfer category. The SPI DMA mode is
> > broken on various kernel variants (downstream, mainline), kernel
> > versions, SoC variants and used SDMA firmware versions. For the
> > mcp251xfd driver DMA is also slower. So switch of DMA and try again.
> 
> Thanks! Disabling DMA did the trick, and I can now send traffic back and
> forth between the two chips, and since the RAM now gets properly
> initialized, I also don't get the ecc error interrupts.

\o/

> There is still something to be looked at, since I see
> 
> NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!

Update to a newer kernel. In v5.15 this problem is fixed.

> when I start the traffic test. And the dev_id reading still doesn't
> work (though it's not really used for anything other than an
> informative printk).

What does it read on your board? But still that transfer should work.

> But I'll have to get our hardware guys to help me do some soldering to
> capture all four channels at once to see just exactly what is going on
> there.

Have a look at these: https://sensepeek.com/pcbite_20

regards,
Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: ram initialization on mcp2518fd
  2022-06-17 14:17           ` Marc Kleine-Budde
@ 2022-06-20  7:09             ` Rasmus Villemoes
  2022-06-20 10:27               ` Marc Kleine-Budde
  2022-06-23 13:50               ` Marc Kleine-Budde
  0 siblings, 2 replies; 9+ messages in thread
From: Rasmus Villemoes @ 2022-06-20  7:09 UTC (permalink / raw)
  To: Marc Kleine-Budde; +Cc: linux-can

On 17/06/2022 16.17, Marc Kleine-Budde wrote:
> On 17.06.2022 16:01:56, Rasmus Villemoes wrote:

>> There is still something to be looked at, since I see
>>
>> NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
> 
> Update to a newer kernel. In v5.15 this problem is fixed.

Oh, great, thanks. I'll try to give 5.15 a quick spin, we're mostly on
5.10 because this project started when that was the latest LTS.

>> when I start the traffic test. And the dev_id reading still doesn't
>> work (though it's not really used for anything other than an
>> informative printk).
> 
> What does it read on your board? But still that transfer should work.

Depending on whether the miso pin has been configured with a weak
internal pull-up or pull-down, it reads either 0xff or 0x00 - and that's
also the case when I expand the read to all six e00 through e14
registers. So the chip isn't really driving miso, and I think that's
because CS is released between the two elements of the xfer array.

>> But I'll have to get our hardware guys to help me do some soldering to
>> capture all four channels at once to see just exactly what is going on
>> there.
> 
> Have a look at these: https://sensepeek.com/pcbite_20

Yup, they're excellent :) My desk: https://ibb.co/LRdtt7Q . However I
can't really manage to get more than two probes directly on the mcp2518fd.

Thanks,
Rasmus

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: ram initialization on mcp2518fd
  2022-06-20  7:09             ` Rasmus Villemoes
@ 2022-06-20 10:27               ` Marc Kleine-Budde
  2022-06-20 13:36                 ` Rasmus Villemoes
  2022-06-23 13:50               ` Marc Kleine-Budde
  1 sibling, 1 reply; 9+ messages in thread
From: Marc Kleine-Budde @ 2022-06-20 10:27 UTC (permalink / raw)
  To: Rasmus Villemoes; +Cc: linux-can

[-- Attachment #1: Type: text/plain, Size: 3411 bytes --]

On 20.06.2022 09:09:09, Rasmus Villemoes wrote:
> >> NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
> > 
> > Update to a newer kernel. In v5.15 this problem is fixed.
> 
> Oh, great, thanks. I'll try to give 5.15 a quick spin, we're mostly on
> 5.10 because this project started when that was the latest LTS.

We usually start with the latest and greatest kernel when starting a new
project, keep updating and settle for the newest LTS as late as
possible. You don't have anything from LTS if you don't update :) But
that's a different story...

> >> when I start the traffic test. And the dev_id reading still doesn't
> >> work (though it's not really used for anything other than an
> >> informative printk).
> > 
> > What does it read on your board? But still that transfer should work.
> 
> Depending on whether the miso pin has been configured with a weak
> internal pull-up or pull-down, it reads either 0xff or 0x00 - and
> that's also the case when I expand the read to all six e00 through e14
> registers. So the chip isn't really driving miso, and I think that's
> because CS is released between the two elements of the xfer array.

Can you do a measurement of that MCP251XFD_REG_DEVID transfer?

For testing I read the OSC register instead of the DEVID and print the
value:

diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c b/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c
index 537335d42d1d..79de59f9c145 100644
--- a/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c
+++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c
@@ -1796,13 +1796,14 @@ mcp251xfd_register_get_dev_id(const struct mcp251xfd_priv *priv, u32 *dev_id,
        xfer[1].len = sizeof(dev_id);
        xfer[1].speed_hz = priv->spi_max_speed_hz_fast;
 
-       mcp251xfd_spi_cmd_read_nocrc(&buf_tx->cmd, MCP251XFD_REG_DEVID);
+       mcp251xfd_spi_cmd_read_nocrc(&buf_tx->cmd, MCP251XFD_REG_OSC);
 
        err = spi_sync_transfer(priv->spi, xfer, ARRAY_SIZE(xfer));
        if (err)
                goto out_kfree_buf_tx;
 
        *dev_id = be32_to_cpup((__be32 *)buf_rx->data);
+       pr_info("%s: reg_osc=0x%08x\n", __func__, *dev_id);
        *effective_speed_hz_slow = xfer[0].effective_speed_hz;
        *effective_speed_hz_fast = xfer[1].effective_speed_hz;

And it gives me:

| [  171.051636] mcp251xfd_register_get_dev_id: reg_osc=0x68040000

I just noticed the register contents isn't big endian, it's little
endian. I'll send a patch. (The address information is big endian, and
mcp251xfd_spi_cmd_read_nocrc is taking care of this.)

> >> But I'll have to get our hardware guys to help me do some soldering to
> >> capture all four channels at once to see just exactly what is going on
> >> there.
> > 
> > Have a look at these: https://sensepeek.com/pcbite_20
> 
> Yup, they're excellent :) My desk: https://ibb.co/LRdtt7Q . However I
> can't really manage to get more than two probes directly on the mcp2518fd.

Nice, you already have them! With some balancing I usually get all 4
signals connected, until someone touched my desk :)

Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: ram initialization on mcp2518fd
  2022-06-20 10:27               ` Marc Kleine-Budde
@ 2022-06-20 13:36                 ` Rasmus Villemoes
  2022-06-20 14:05                   ` Marc Kleine-Budde
  0 siblings, 1 reply; 9+ messages in thread
From: Rasmus Villemoes @ 2022-06-20 13:36 UTC (permalink / raw)
  To: Marc Kleine-Budde; +Cc: linux-can

On 20/06/2022 12.27, Marc Kleine-Budde wrote:
> On 20.06.2022 09:09:09, Rasmus Villemoes wrote:

>>>> when I start the traffic test. And the dev_id reading still doesn't
>>>> work (though it's not really used for anything other than an
>>>> informative printk).
>>>
>>> What does it read on your board? But still that transfer should work.
>>
>> Depending on whether the miso pin has been configured with a weak
>> internal pull-up or pull-down, it reads either 0xff or 0x00 - and
>> that's also the case when I expand the read to all six e00 through e14
>> registers. So the chip isn't really driving miso, and I think that's
>> because CS is released between the two elements of the xfer array.
> 
> Can you do a measurement of that MCP251XFD_REG_DEVID transfer?
> 
> For testing I read the OSC register instead of the DEVID and print the
> value:
> 
> diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c b/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c
> index 537335d42d1d..79de59f9c145 100644
> --- a/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c
> +++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c
> @@ -1796,13 +1796,14 @@ mcp251xfd_register_get_dev_id(const struct mcp251xfd_priv *priv, u32 *dev_id,
>         xfer[1].len = sizeof(dev_id);
>         xfer[1].speed_hz = priv->spi_max_speed_hz_fast;
>  
> -       mcp251xfd_spi_cmd_read_nocrc(&buf_tx->cmd, MCP251XFD_REG_DEVID);
> +       mcp251xfd_spi_cmd_read_nocrc(&buf_tx->cmd, MCP251XFD_REG_OSC);
>  
>         err = spi_sync_transfer(priv->spi, xfer, ARRAY_SIZE(xfer));
>         if (err)
>                 goto out_kfree_buf_tx;
>  
>         *dev_id = be32_to_cpup((__be32 *)buf_rx->data);
> +       pr_info("%s: reg_osc=0x%08x\n", __func__, *dev_id);
>         *effective_speed_hz_slow = xfer[0].effective_speed_hz;
>         *effective_speed_hz_fast = xfer[1].effective_speed_hz;
> 
> And it gives me:
> 
> | [  171.051636] mcp251xfd_register_get_dev_id: reg_osc=0x68040000
> 
> I just noticed the register contents isn't big endian, it's little
> endian. I'll send a patch. (The address information is big endian, and
> mcp251xfd_spi_cmd_read_nocrc is taking care of this.)

So the problem was that I was using native chip select, i.e. my pinmux
setting was

MX8MP_IOMUXC_ECSPI1_SS0__ECSPI1_SS0	0x00000144

and that means the controller only asserts the CS line for the duration
of one burst; so when the spi message contains two transfers, it
obviously breaks as the device saw the release of CS after the two
command/address bytes as the end of the transaction [then, from the
device's POV another begins, but there MOSI is all 0, so that may get
interpreted as a reset command, or perhaps it's ignored because it's not
precisely 16 0 bits - I wonder how the hardware designers thought
all-zeros was a good idea for a reset command].

Switching to gpio, i.e.

MX8MP_IOMUXC_ECSPI1_SS0__GPIO5_IO09	0x00000144

means CS is held low for the whole message, and I now read the expected
contents.

====

It's probably not anything to do with the mcp2518fd driver and this is
straying somewhat from the original problem, but I see that CS is held
low for a very long time after the last byte has been shifted in/out, on
the order of 0.5ms. https://ibb.co/n1Hq3YR shows the first write to OSC,
and zooming out shows https://ibb.co/12Z1YkM (or just looking at the
mouse-over info in the first one).

Similarly, there's a rather large gap between the two spi_transfers in
the case of the reading of dev_id (but, per the above, with CS correctly
held throughout): https://ibb.co/pyG1y96 .

I'm not sure if this is to be expected.

Rasmus

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: ram initialization on mcp2518fd
  2022-06-20 13:36                 ` Rasmus Villemoes
@ 2022-06-20 14:05                   ` Marc Kleine-Budde
  0 siblings, 0 replies; 9+ messages in thread
From: Marc Kleine-Budde @ 2022-06-20 14:05 UTC (permalink / raw)
  To: Rasmus Villemoes; +Cc: linux-can

[-- Attachment #1: Type: text/plain, Size: 2328 bytes --]

On 20.06.2022 15:36:35, Rasmus Villemoes wrote:
> >>> What does it read on your board? But still that transfer should work.

[...]

> So the problem was that I was using native chip select, i.e. my pinmux
> setting was
> 
> MX8MP_IOMUXC_ECSPI1_SS0__ECSPI1_SS0	0x00000144

HW chip selects are broken in the spi-imx driver/hardware, at least how
to the Linux SPI stack uses them.

> and that means the controller only asserts the CS line for the duration
> of one burst; so when the spi message contains two transfers, it
> obviously breaks as the device saw the release of CS after the two
> command/address bytes as the end of the transaction [then, from the
> device's POV another begins, but there MOSI is all 0, so that may get
> interpreted as a reset command, or perhaps it's ignored because it's not
> precisely 16 0 bits - I wonder how the hardware designers thought
> all-zeros was a good idea for a reset command].
> 
> Switching to gpio, i.e.
> 
> MX8MP_IOMUXC_ECSPI1_SS0__GPIO5_IO09	0x00000144
> 
> means CS is held low for the whole message, and I now read the expected
> contents.

\o/

> It's probably not anything to do with the mcp2518fd driver and this is
> straying somewhat from the original problem, but I see that CS is held
> low for a very long time after the last byte has been shifted in/out, on
> the order of 0.5ms. https://ibb.co/n1Hq3YR shows the first write to OSC,
> and zooming out shows https://ibb.co/12Z1YkM (or just looking at the
> mouse-over info in the first one).
> 
> Similarly, there's a rather large gap between the two spi_transfers in
> the case of the reading of dev_id (but, per the above, with CS correctly
> held throughout): https://ibb.co/pyG1y96 .
> 
> I'm not sure if this is to be expected.

It's the IRQ latency from transfer complete until the CS is deasserted
in software. In kernel v5.19 you will find some patches the improve the
imx-spi driver to use polling for short transfers, which greatly reduces
the latency you've measured.

regards,
Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: ram initialization on mcp2518fd
  2022-06-20  7:09             ` Rasmus Villemoes
  2022-06-20 10:27               ` Marc Kleine-Budde
@ 2022-06-23 13:50               ` Marc Kleine-Budde
  1 sibling, 0 replies; 9+ messages in thread
From: Marc Kleine-Budde @ 2022-06-23 13:50 UTC (permalink / raw)
  To: Rasmus Villemoes; +Cc: linux-can, Thomas Kopp

[-- Attachment #1: Type: text/plain, Size: 1127 bytes --]

On 20.06.2022 09:09:09, Rasmus Villemoes wrote:
> >> There is still something to be looked at, since I see
> >>
> >> NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
> > 
> > Update to a newer kernel. In v5.15 this problem is fixed.
> 
> Oh, great, thanks. I'll try to give 5.15 a quick spin, we're mostly on
> 5.10 because this project started when that was the latest LTS.

Here you can find the mcp251xfd driver and many enhancements of the CAN
drivers and infrastructure back ported to v5.10 and v5.15. Feel free to
test:

| https://git.pengutronix.de/cgit/mkl/linux/

The "-flat" tags contains a linear series against the latest stable,
while the tags without "-flat" are merges of different topics of the
back port. As the name of the tags suggest this work is sponsored by
microchip.

regards,
Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2022-06-23 13:50 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <e3f73c56-1b46-4ee9-357e-40400c746e09@prevas.dk>
     [not found] ` <87wndgyl2l.fsf@hardanger.blackshift.org>
     [not found]   ` <441514ec-7b43-e11c-09b5-bdaf7fca0077@prevas.dk>
2022-06-16 11:21     ` ram initialization on mcp2518fd Rasmus Villemoes
2022-06-16 12:40       ` Marc Kleine-Budde
2022-06-17 14:01         ` Rasmus Villemoes
2022-06-17 14:17           ` Marc Kleine-Budde
2022-06-20  7:09             ` Rasmus Villemoes
2022-06-20 10:27               ` Marc Kleine-Budde
2022-06-20 13:36                 ` Rasmus Villemoes
2022-06-20 14:05                   ` Marc Kleine-Budde
2022-06-23 13:50               ` Marc Kleine-Budde

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.