All of lore.kernel.org
 help / color / mirror / Atom feed
* MCP251xFD runs interrupt handler twice per message.
@ 2022-01-14 11:05 Markus Mirevik
  2022-01-14 12:41 ` Marc Kleine-Budde
  0 siblings, 1 reply; 11+ messages in thread
From: Markus Mirevik @ 2022-01-14 11:05 UTC (permalink / raw)
  To: linux-can

Hello

I have performance problems with mcp2518fd. I have custom board based of beagleboard black. (Sitara am335x) using two mcp2518fd. The can communication uses a lot of CPU.  irq/64-spi1.1 uses around  ~20% at 1000 can messages/s.

I have several questions but I'll start with the most confusing.  I have found that every can messages fires two interrupts on my board. I have tested this in several ways. If I send one normal can message the counter in /proc/interrupts is increased twice. I have also put some printouts in mcp251xfd-core.c that confirms that mcp251xfd_irq() is run twice and the second time intf_pending is 0.

And for testing purposes I changed the interrupt config to trigger on falling edge instead of level and with this configured there is only one interrupt fired. But I guess this will risk locking the interrupt line low if an edge isn't detected. 

I have tried both with rx-int active and inactive. My scope shows really nice signals and that  rx-int and int is deactivated simultaneous on incoming frames. 

I'm testing by sending messages from my computer through a can dongle. 

The board is currently running Linux 5.10.79-yocto-tiny #1 SMP Tue Nov 16 03:57:43 UTC 2021 armv7l armv7l armv7l GNU/Linux
I've also tested updating the driver to the one from https://github.com/marckleinebudde/linux.git from 2021-12-29 with the same result (IRQ handler run twice). 

I am profoundly confused by this issue and I can not figure out why it's happening. My understanding is that since the IRQ handler is loaded with IRQF_ONESHOT the irq line should be masked until mcp251xfd_irq() returns. Since mcp251xfd_irq() checks that !rx_pending before exiting the int signal must be high. And the interrupt should not fire again... 

Result from init:
[    4.003029] mcp251xfd spi1.0 can0: MCP2518FD rev0.0 (+RX_INT -MAB_NO_WARN +CRC_REG +CRC_RX +CRC_TX +ECC -HD c:40.00MHz m:20.00MHz r:17.00MHz e:0.00MHz) successfully initialized.
[    4.028220] mcp251xfd spi1.1 can1: MCP2518FD rev0.0 (-RX_INT -MAB_NO_WARN +CRC_REG +CRC_RX +CRC_TX +ECC -HD c:40.00MHz m:20.00MHz r:17.00MHz e:0.00MHz) successfully initialized.

This is the current dtsi part for the canFD chips. (With rx-pin removed on one of the devices and trigger on edge on the other for debugging purposes). 

#include <dt-bindings/gpio/gpio.h>
#include <dt-bindings/interrupt-controller/irq.h>

&am33xx_pinmux{
        pinctrl_spi1_pins: pinctrl_spi1_pins {
                pinctrl-single,pins = <
                        AM33XX_IOPAD(0x990, PIN_INPUT | MUX_MODE3) /* (A13) mcasp0_aclkx.spi1_sclk */
                        AM33XX_IOPAD(0x994, PIN_INPUT | MUX_MODE3) /* (B13) mcasp0_fsx.spi1_d0 */
                        AM33XX_IOPAD(0x998, PIN_INPUT | MUX_MODE3) /* (D12) mcasp0_axr0.spi1_d1 */
                        AM33XX_IOPAD(0x96c, PIN_OUTPUT_PULLUP | MUX_MODE5) /* (E17) uart0_rtsn.spi1_cs0         CleANopen       LEFT*/
                        AM33XX_IOPAD(0x9b0, PIN_OUTPUT_PULLUP | MUX_MODE4) /* (A15) xdma_event_intr0.spi1_cs1   SAWM CAN        RIGHT*/
                >;
        };

        can0_int_pins: can0_int_pins {
                pinctrl-single,pins = <
                /*CleANopen*/
                AM33XX_IOPAD(0x89c, PIN_INPUT_PULLUP | MUX_MODE7) /* (T6) gpmc_be0n_cle.gpio2[5]        nINT            */
                AM33XX_IOPAD(0x968, PIN_INPUT_PULLUP | MUX_MODE7) /* (E18) uart0_ctsn.gpio1[8]          nINT1           */
                >;
        };

        can1_int_pins: can1_int_pins {
                pinctrl-single,pins = <
                /*SAWM CAN*/
                AM33XX_IOPAD(0x820, PIN_INPUT_PULLUP | MUX_MODE7) /* (U10) gpmc_ad8.gpio0[22]           nINT            */
                AM33XX_IOPAD(0x8c8, PIN_INPUT_PULLUP | MUX_MODE7) /* (U3) lcd_data10.gpio2[16]  nINT1           */
                AM33XX_IOPAD(0x8cc, PIN_INPUT_PULLUP | MUX_MODE7) /* (U4) lcd_data11.gpio2[17]  nINT0 NOT USED  */
                >;
        };
};



/{
        /* external 40M oscillator of mcp2518fd on SPI1.0 */
        mcp2518fd_can0_osc: mcp2518fd_can0_osc {
                compatible = "fixed-clock";
                #clock-cells = <0>;
                clock-frequency = <40000000>;
        };
};


/{
        /* external 40M oscillator of mcp2518fd on SPI1.1 */
        mcp2518fd_can1_osc: mcp2518fd_can1_osc {
                compatible = "fixed-clock";
                #clock-cells = <0>;
                clock-frequency = <40000000>;
        };
};

/* the spi config of the can-controller itself binding everything together */
&spi1{
    #address-cells = <1>;
    #size-cells = <0>;

    status = "okay";
    pinctrl-names = "default";
    pinctrl-0 = <&pinctrl_spi1_pins>;

    /*CleANopen*/
    can@0 {
        compatible = "microchip,mcp2518fd";
        reg = <0>;
        clocks = <&mcp2518fd_can0_osc>;
        pinctrl-names = "default";
        pinctrl-0 = <&can0_int_pins>;
        spi-max-frequency = <20000000>;
        interrupts-extended = <&gpio2 5 IRQ_TYPE_EDGE_FALLING>;
        microchip,rx-int-gpios = <&gpio1 8 GPIO_ACTIVE_LOW>;
    };

    can@1 {
        compatible = "microchip,mcp2518fd";
        reg = <1>;
        clocks = <&mcp2518fd_can1_osc>;
        pinctrl-names = "default";
        pinctrl-0 = <&can1_int_pins>;
        spi-max-frequency = <20000000>;
        interrupts-extended = <&gpio0 22 IRQ_TYPE_LEVEL_LOW>;
//        microchip,rx-int-gpios = <&gpio2 16 GPIO_ACTIVE_LOW>;
    };
};

 
BR
Markus Mirevik


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: MCP251xFD runs interrupt handler twice per message.
  2022-01-14 11:05 MCP251xFD runs interrupt handler twice per message Markus Mirevik
@ 2022-01-14 12:41 ` Marc Kleine-Budde
  2022-01-14 13:43   ` Sv: " Markus Mirevik
  0 siblings, 1 reply; 11+ messages in thread
From: Marc Kleine-Budde @ 2022-01-14 12:41 UTC (permalink / raw)
  To: Markus Mirevik; +Cc: linux-can

[-- Attachment #1: Type: text/plain, Size: 8974 bytes --]

On 14.01.2022 11:05:51, Markus Mirevik wrote:
> I have performance problems with mcp2518fd. I have custom board based
> of beagleboard black. (Sitara am335x) using two mcp2518fd. The can
> communication uses a lot of CPU. irq/64-spi1.1 uses around ~20% at
> 1000 can messages/s.
> 
> I have several questions but I'll start with the most confusing. I
> have found that every can messages fires two interrupts on my board.

Do you mean every RX'ed CAN frame?
Which interrupts does increase twice?

> I have tested this in several ways. If I send one normal can message
> the counter in /proc/interrupts is increased twice. I have also put
> some printouts in mcp251xfd-core.c that confirms that mcp251xfd_irq()
> is run twice and the second time intf_pending is 0.

You mean mcp251xfd_irq() is started twice?

The big loop
(https://elixir.bootlin.com/linux/v5.16/source/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c#L2182)
in mcp251xfd_irq() is run twice, and the IRQ handler is left only if
there are no pending IRQs.

The main IRQ handler loop (omitting the rx-int) is straight forward, and
not optimized:
- read pending IRQs [*]
- exit if there are no pending IRQs
- handle pending IRQs
  for RX:
  - read RX-FIFO level [*]
  - read RX'ed CAN frames [*]
  - mark RX'ed CAN frames as read [*]
- loop

All actions marked with a [*] require a SPI transfer, which results in 5
SPI transfers (read pending IRQs is done twice) per RX'ed CAN frame.
(Assuming there is only 1 RX'ed frame pending and no pending IRQs after
 the first loop.)

I have some ideas how to optimize this:
- do a single SPI transfer instead of several small ones:
  e.g. pending IRQs + RX-FIFO level, or read RX'ed frames + mark as read
- opportunistically assume RX'ed frame on interrupt and get rid of 1st
  read pending IRQs
- assume TX done IRQ, if CAN frames have been TX'ed but not marked as
  sent hardware
- coalesce RX IRQs
- coalesce TX done IRQs
- combine several TX frames into single SPI transfer

> And for testing purposes I changed the interrupt config to trigger on
> falling edge instead of level and with this configured there is only
> one interrupt fired. But I guess this will risk locking the interrupt
> line low if an edge isn't detected.

ACK - Please don't use edge triggered IRQs, sooner or later the driver
will miss an IRQ resulting in a handing driver. Always use level
triggered with the mcp251xfd.

> I have tried both with rx-int active and inactive. My scope shows
> really nice signals and that rx-int and int is deactivated
> simultaneous on incoming frames.

rx-int is an optimization to skip the first get IRQ pending transfer, as
it indicates RX'ed CAN frames.

> I'm testing by sending messages from my computer through a can dongle.
> 
> The board is currently running Linux 5.10.79-yocto-tiny #1 SMP Tue Nov
> 16 03:57:43 UTC 2021 armv7l armv7l armv7l GNU/Linux

Newer kernels include some optimizations....

> I've also tested updating the driver to the one from
> https://github.com/marckleinebudde/linux.git from 2021-12-29 with the
> same result (IRQ handler run twice).

The newest production ready code is always net-next/master. But you can
use the latest official kernel release (v5.16) too.

> I am profoundly confused by this issue and I can not figure out why
> it's happening. My understanding is that since the IRQ handler is
> loaded with IRQF_ONESHOT the irq line should be masked until
> mcp251xfd_irq() returns.

ACK.

> Since mcp251xfd_irq() checks that !rx_pending before exiting the int
> signal must be high. And the interrupt should not fire again...

The rx-int GPIO:

| microchip,rx-int-gpios = <&gpio1 8 GPIO_ACTIVE_LOW>;

is not used as an interrupt. The only interrupt line is:

| interrupts-extended = <&gpio2 5 IRQ_TYPE_EDGE_FALLING>;

If there is an interrupt the gpio2_5 will get active, if this is an RX
IRQ the gpio1_8 will get active, too. But only gpio2_5 will trigger the
interrupt handler.

In the interrupt handler, as rx-int is available and active the RX
handling is done first. If there are no more RX IRQs pending, the rx-int
goes inactive and the big loop in the IRQ handler will be handled: read
rx-pending IRQs, probably none pending, then leave IRQ handler.

> Result from init:
> [    4.003029] mcp251xfd spi1.0 can0: MCP2518FD rev0.0 (+RX_INT -MAB_NO_WARN +CRC_REG +CRC_RX +CRC_TX +ECC -HD c:40.00MHz m:20.00MHz r:17.00MHz e:0.00MHz) successfully initialized.
> [    4.028220] mcp251xfd spi1.1 can1: MCP2518FD rev0.0 (-RX_INT -MAB_NO_WARN +CRC_REG +CRC_RX +CRC_TX +ECC -HD c:40.00MHz m:20.00MHz r:17.00MHz e:0.00MHz) successfully initialized.
> 
> This is the current dtsi part for the canFD chips. (With rx-pin removed on one of the devices and trigger on edge on the other for debugging purposes). 
> 
> #include <dt-bindings/gpio/gpio.h>
> #include <dt-bindings/interrupt-controller/irq.h>
> 
> &am33xx_pinmux{
>         pinctrl_spi1_pins: pinctrl_spi1_pins {
>                 pinctrl-single,pins = <
>                         AM33XX_IOPAD(0x990, PIN_INPUT | MUX_MODE3) /* (A13) mcasp0_aclkx.spi1_sclk */
>                         AM33XX_IOPAD(0x994, PIN_INPUT | MUX_MODE3) /* (B13) mcasp0_fsx.spi1_d0 */
>                         AM33XX_IOPAD(0x998, PIN_INPUT | MUX_MODE3) /* (D12) mcasp0_axr0.spi1_d1 */
>                         AM33XX_IOPAD(0x96c, PIN_OUTPUT_PULLUP | MUX_MODE5) /* (E17) uart0_rtsn.spi1_cs0         CleANopen       LEFT*/
>                         AM33XX_IOPAD(0x9b0, PIN_OUTPUT_PULLUP | MUX_MODE4) /* (A15) xdma_event_intr0.spi1_cs1   SAWM CAN        RIGHT*/
>                 >;
>         };
> 
>         can0_int_pins: can0_int_pins {
>                 pinctrl-single,pins = <
>                 /*CleANopen*/
>                 AM33XX_IOPAD(0x89c, PIN_INPUT_PULLUP | MUX_MODE7) /* (T6) gpmc_be0n_cle.gpio2[5]        nINT            */
>                 AM33XX_IOPAD(0x968, PIN_INPUT_PULLUP | MUX_MODE7) /* (E18) uart0_ctsn.gpio1[8]          nINT1           */
>                 >;
>         };
> 
>         can1_int_pins: can1_int_pins {
>                 pinctrl-single,pins = <
>                 /*SAWM CAN*/
>                 AM33XX_IOPAD(0x820, PIN_INPUT_PULLUP | MUX_MODE7) /* (U10) gpmc_ad8.gpio0[22]           nINT            */
>                 AM33XX_IOPAD(0x8c8, PIN_INPUT_PULLUP | MUX_MODE7) /* (U3) lcd_data10.gpio2[16]  nINT1           */
>                 AM33XX_IOPAD(0x8cc, PIN_INPUT_PULLUP | MUX_MODE7) /* (U4) lcd_data11.gpio2[17]  nINT0 NOT USED  */
>                 >;
>         };
> };
> 
> 
> 
> /{
>         /* external 40M oscillator of mcp2518fd on SPI1.0 */
>         mcp2518fd_can0_osc: mcp2518fd_can0_osc {
>                 compatible = "fixed-clock";
>                 #clock-cells = <0>;
>                 clock-frequency = <40000000>;
>         };
> };
> 
> 
> /{
>         /* external 40M oscillator of mcp2518fd on SPI1.1 */
>         mcp2518fd_can1_osc: mcp2518fd_can1_osc {
>                 compatible = "fixed-clock";
>                 #clock-cells = <0>;
>                 clock-frequency = <40000000>;
>         };
> };
> 
> /* the spi config of the can-controller itself binding everything together */
> &spi1{
>     #address-cells = <1>;
>     #size-cells = <0>;
> 
>     status = "okay";
>     pinctrl-names = "default";
>     pinctrl-0 = <&pinctrl_spi1_pins>;
> 
>     /*CleANopen*/
>     can@0 {
>         compatible = "microchip,mcp2518fd";
>         reg = <0>;
>         clocks = <&mcp2518fd_can0_osc>;
>         pinctrl-names = "default";
>         pinctrl-0 = <&can0_int_pins>;
>         spi-max-frequency = <20000000>;
>         interrupts-extended = <&gpio2 5 IRQ_TYPE_EDGE_FALLING>;

See comment about edge IRQs above!

>         microchip,rx-int-gpios = <&gpio1 8 GPIO_ACTIVE_LOW>;
>     };
> 
>     can@1 {
>         compatible = "microchip,mcp2518fd";
>         reg = <1>;
>         clocks = <&mcp2518fd_can1_osc>;
>         pinctrl-names = "default";
>         pinctrl-0 = <&can1_int_pins>;
>         spi-max-frequency = <20000000>;
>         interrupts-extended = <&gpio0 22 IRQ_TYPE_LEVEL_LOW>;
> //        microchip,rx-int-gpios = <&gpio2 16 GPIO_ACTIVE_LOW>;
>     };
> };

Can you test again with IRQ_TYPE_LEVEL_LOW and no rx-int-gpios. Please
instrument the beginning and the returns of the mcp251xfd_irq() function
to check if it's really started twice. Please print the
priv->regs_status.intf in every loop. Can you record the gpio2_5 with a
scope.

Make sure that you don't send any CAN frames as reaction of reception.

regards,
Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Sv: MCP251xFD runs interrupt handler twice per message.
  2022-01-14 12:41 ` Marc Kleine-Budde
@ 2022-01-14 13:43   ` Markus Mirevik
  2022-01-16 11:09     ` Marc Kleine-Budde
  0 siblings, 1 reply; 11+ messages in thread
From: Markus Mirevik @ 2022-01-14 13:43 UTC (permalink / raw)
  To: Marc Kleine-Budde; +Cc: linux-can

Hi

> On 14.01.2022 11:05:51, Markus Mirevik wrote:
> > I have performance problems with mcp2518fd. I have custom board based
> > of beagleboard black. (Sitara am335x) using two mcp2518fd. The can
> > communication uses a lot of CPU. irq/64-spi1.1 uses around ~20% at
> > 1000 can messages/s.
> >
> > I have several questions but I'll start with the most confusing. I
> > have found that every can messages fires two interrupts on my board.
> 
> Do you mean every RX'ed CAN frame?
Yes, every received CAN frame.

> Which interrupts does increase twice?

This one: 
64:  4  44e07000.gpio  22 Level     spi1.1

 
> > I have tested this in several ways. If I send one normal can message
> > the counter in /proc/interrupts is increased twice. I have also put
> > some printouts in mcp251xfd-core.c that confirms that mcp251xfd_irq()
> > is run twice and the second time intf_pending is 0.
> 
> You mean mcp251xfd_irq() is started twice?

Yes.

> 
> The big loop
> (https://elixir.bootlin.com/linux/v5.16/source/drivers/net/can/spi/mcp251xf
> d/mcp251xfd-core.c#L2182)
> in mcp251xfd_irq() is run twice, and the IRQ handler is left only if there are
> no pending IRQs.
> 
> The main IRQ handler loop (omitting the rx-int) is straight forward, and not
> optimized:
> - read pending IRQs [*]
> - exit if there are no pending IRQs
> - handle pending IRQs
>   for RX:
>   - read RX-FIFO level [*]
>   - read RX'ed CAN frames [*]
>   - mark RX'ed CAN frames as read [*]
> - loop
> 
> All actions marked with a [*] require a SPI transfer, which results in 5 SPI
> transfers (read pending IRQs is done twice) per RX'ed CAN frame.
> (Assuming there is only 1 RX'ed frame pending and no pending IRQs after
> the first loop.)
> 
> I have some ideas how to optimize this:
> - do a single SPI transfer instead of several small ones:
>   e.g. pending IRQs + RX-FIFO level, or read RX'ed frames + mark as read
> - opportunistically assume RX'ed frame on interrupt and get rid of 1st
>   read pending IRQs
> - assume TX done IRQ, if CAN frames have been TX'ed but not marked as
>   sent hardware
> - coalesce RX IRQs
> - coalesce TX done IRQs
> - combine several TX frames into single SPI transfer
> 

Indeed there is a lot to work on, I'm a bit out of my depth here and as I said the first question is about why it interrupt is fired twice. 
It's not only the big loop that is run twice. 
With rx-int active what I can observe is this:

- read RX-FIFO level [*]
- read RX'ed CAN frames [*]
- mark RX'ed CAN frames as read [*]
- read pending IRQs [*]
- exit if there are no pending IRQs
 -> No Pending IRQ's Exiting. 
 - read pending IRQs [*]
 - exit if there are no pending IRQs
-> No Pending IRQ's Exiting.

> > And for testing purposes I changed the interrupt config to trigger on
> > falling edge instead of level and with this configured there is only
> > one interrupt fired. But I guess this will risk locking the interrupt
> > line low if an edge isn't detected.
> 
> ACK - Please don't use edge triggered IRQs, sooner or later the driver will
> miss an IRQ resulting in a handing driver. Always use level triggered with the
> mcp251xfd.
> 

Yes, I figured that. 

> > I have tried both with rx-int active and inactive. My scope shows
> > really nice signals and that rx-int and int is deactivated
> > simultaneous on incoming frames.
> 
> rx-int is an optimization to skip the first get IRQ pending transfer, as it
> indicates RX'ed CAN frames.

OFFTOPIC:
Shouldn’t a return after the rx-int part be ok to skip the next get IRQ pending transfer as well? 

if (priv->rx_int) {
  do {
    int rx_pending;
    rx_pending = gpiod_get_value_cansleep(priv->rx_int);
    
    if (!rx_pending)
      break;

    err = mcp251xfd_handle(priv, rxif);
    if (err)
      goto out_fail;

    handled = IRQ_HANDLED;
  } while (1);
  
  if (handled == IRQ_HANDLED) {
    return handled;
  }
}
...

Even better would be to check if  (IRQ.VALUE == INACTIVE) but I don know how to that.. 


> 
> > I'm testing by sending messages from my computer through a can dongle.
> >
> > The board is currently running Linux 5.10.79-yocto-tiny #1 SMP Tue Nov
> > 16 03:57:43 UTC 2021 armv7l armv7l armv7l GNU/Linux
> 
> Newer kernels include some optimizations....
> 
> > I've also tested updating the driver to the one from
> > https://github.com/marckleinebudde/linux.git from 2021-12-29 with the
> > same result (IRQ handler run twice).
> 
> The newest production ready code is always net-next/master. But you can
> use the latest official kernel release (v5.16) too.
> 
> > I am profoundly confused by this issue and I can not figure out why
> > it's happening. My understanding is that since the IRQ handler is
> > loaded with IRQF_ONESHOT the irq line should be masked until
> > mcp251xfd_irq() returns.
> 
> ACK.
> 
> > Since mcp251xfd_irq() checks that !rx_pending before exiting the int
> > signal must be high. And the interrupt should not fire again...
> 
> The rx-int GPIO:
> 
> | microchip,rx-int-gpios = <&gpio1 8 GPIO_ACTIVE_LOW>;
> 
> is not used as an interrupt. The only interrupt line is:
> 
> | interrupts-extended = <&gpio2 5 IRQ_TYPE_EDGE_FALLING>;
> 
> If there is an interrupt the gpio2_5 will get active, if this is an RX IRQ the
> gpio1_8 will get active, too. But only gpio2_5 will trigger the interrupt
> handler.
> 
> In the interrupt handler, as rx-int is available and active the RX handling is
> done first. If there are no more RX IRQs pending, the rx-int goes inactive and
> the big loop in the IRQ handler will be handled: read rx-pending IRQs,
> probably none pending, then leave IRQ handler.
> 

Ok.

> > Result from init:
> > [    4.003029] mcp251xfd spi1.0 can0: MCP2518FD rev0.0 (+RX_INT -
> MAB_NO_WARN +CRC_REG +CRC_RX +CRC_TX +ECC -HD c:40.00MHz
> m:20.00MHz r:17.00MHz e:0.00MHz) successfully initialized.
> > [    4.028220] mcp251xfd spi1.1 can1: MCP2518FD rev0.0 (-RX_INT -
> MAB_NO_WARN +CRC_REG +CRC_RX +CRC_TX +ECC -HD c:40.00MHz
> m:20.00MHz r:17.00MHz e:0.00MHz) successfully initialized.
> >
> > This is the current dtsi part for the canFD chips. (With rx-pin removed on
> one of the devices and trigger on edge on the other for debugging
> purposes).
> >
> > #include <dt-bindings/gpio/gpio.h>
> > #include <dt-bindings/interrupt-controller/irq.h>
> >
> > &am33xx_pinmux{
> >         pinctrl_spi1_pins: pinctrl_spi1_pins {
> >                 pinctrl-single,pins = <
> >                         AM33XX_IOPAD(0x990, PIN_INPUT | MUX_MODE3) /* (A13)
> mcasp0_aclkx.spi1_sclk */
> >                         AM33XX_IOPAD(0x994, PIN_INPUT | MUX_MODE3) /* (B13)
> mcasp0_fsx.spi1_d0 */
> >                         AM33XX_IOPAD(0x998, PIN_INPUT | MUX_MODE3) /* (D12)
> mcasp0_axr0.spi1_d1 */
> >                         AM33XX_IOPAD(0x96c, PIN_OUTPUT_PULLUP | MUX_MODE5)
> /* (E17) uart0_rtsn.spi1_cs0         CleANopen       LEFT*/
> >                         AM33XX_IOPAD(0x9b0, PIN_OUTPUT_PULLUP |
> MUX_MODE4) /* (A15) xdma_event_intr0.spi1_cs1   SAWM CAN
> RIGHT*/
> >                 >;
> >         };
> >
> >         can0_int_pins: can0_int_pins {
> >                 pinctrl-single,pins = <
> >                 /*CleANopen*/
> >                 AM33XX_IOPAD(0x89c, PIN_INPUT_PULLUP | MUX_MODE7) /*
> (T6) gpmc_be0n_cle.gpio2[5]        nINT            */
> >                 AM33XX_IOPAD(0x968, PIN_INPUT_PULLUP | MUX_MODE7) /*
> (E18) uart0_ctsn.gpio1[8]          nINT1           */
> >                 >;
> >         };
> >
> >         can1_int_pins: can1_int_pins {
> >                 pinctrl-single,pins = <
> >                 /*SAWM CAN*/
> >                 AM33XX_IOPAD(0x820, PIN_INPUT_PULLUP | MUX_MODE7) /*
> (U10) gpmc_ad8.gpio0[22]           nINT            */
> >                 AM33XX_IOPAD(0x8c8, PIN_INPUT_PULLUP | MUX_MODE7) /*
> (U3) lcd_data10.gpio2[16]  nINT1           */
> >                 AM33XX_IOPAD(0x8cc, PIN_INPUT_PULLUP | MUX_MODE7) /*
> (U4) lcd_data11.gpio2[17]  nINT0 NOT USED  */
> >                 >;
> >         };
> > };
> >
> >
> >
> > /{
> >         /* external 40M oscillator of mcp2518fd on SPI1.0 */
> >         mcp2518fd_can0_osc: mcp2518fd_can0_osc {
> >                 compatible = "fixed-clock";
> >                 #clock-cells = <0>;
> >                 clock-frequency = <40000000>;
> >         };
> > };
> >
> >
> > /{
> >         /* external 40M oscillator of mcp2518fd on SPI1.1 */
> >         mcp2518fd_can1_osc: mcp2518fd_can1_osc {
> >                 compatible = "fixed-clock";
> >                 #clock-cells = <0>;
> >                 clock-frequency = <40000000>;
> >         };
> > };
> >
> > /* the spi config of the can-controller itself binding everything
> > together */ &spi1{
> >     #address-cells = <1>;
> >     #size-cells = <0>;
> >
> >     status = "okay";
> >     pinctrl-names = "default";
> >     pinctrl-0 = <&pinctrl_spi1_pins>;
> >
> >     /*CleANopen*/
> >     can@0 {
> >         compatible = "microchip,mcp2518fd";
> >         reg = <0>;
> >         clocks = <&mcp2518fd_can0_osc>;
> >         pinctrl-names = "default";
> >         pinctrl-0 = <&can0_int_pins>;
> >         spi-max-frequency = <20000000>;
> >         interrupts-extended = <&gpio2 5 IRQ_TYPE_EDGE_FALLING>;
> 
> See comment about edge IRQs above!

ACK.

> 
> >         microchip,rx-int-gpios = <&gpio1 8 GPIO_ACTIVE_LOW>;
> >     };
> >
> >     can@1 {
> >         compatible = "microchip,mcp2518fd";
> >         reg = <1>;
> >         clocks = <&mcp2518fd_can1_osc>;
> >         pinctrl-names = "default";
> >         pinctrl-0 = <&can1_int_pins>;
> >         spi-max-frequency = <20000000>;
> >         interrupts-extended = <&gpio0 22 IRQ_TYPE_LEVEL_LOW>;
> > //        microchip,rx-int-gpios = <&gpio2 16 GPIO_ACTIVE_LOW>;
> >     };
> > };
> 
> Can you test again with IRQ_TYPE_LEVEL_LOW and no rx-int-gpios. Please
> instrument the beginning and the returns of the mcp251xfd_irq() function to
> check if it's really started twice. Please print the
> priv->regs_status.intf in every loop. Can you record the gpio2_5 with a
> scope.

Yes I can but what do you mean with Instrument? 
And no scope until Monday. 


> Make sure that you don't send any CAN frames as reaction of reception.

How do I make sure of that? :/ 


> 
> regards,
> Marc
> 
> --
> Pengutronix e.K.                 | Marc Kleine-Budde           |
> Embedded Linux                   | https://www.pengutronix.de  |
> Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
> Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |

Thank you for your time
Markus 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: MCP251xFD runs interrupt handler twice per message.
  2022-01-14 13:43   ` Sv: " Markus Mirevik
@ 2022-01-16 11:09     ` Marc Kleine-Budde
  2022-01-17 11:55       ` Sv: " Markus Mirevik
  0 siblings, 1 reply; 11+ messages in thread
From: Marc Kleine-Budde @ 2022-01-16 11:09 UTC (permalink / raw)
  To: Markus Mirevik; +Cc: linux-can

[-- Attachment #1: Type: text/plain, Size: 6036 bytes --]

On 14.01.2022 13:43:42, Markus Mirevik wrote:
> > > I have performance problems with mcp2518fd. I have custom board based
> > > of beagleboard black. (Sitara am335x) using two mcp2518fd. The can
> > > communication uses a lot of CPU. irq/64-spi1.1 uses around ~20% at
> > > 1000 can messages/s.
> > >
> > > I have several questions but I'll start with the most confusing. I
> > > have found that every can messages fires two interrupts on my board.
> > 
> > Do you mean every RX'ed CAN frame?
> Yes, every received CAN frame.
> 
> > Which interrupts does increase twice?
> 
> This one: 
> 64:  4  44e07000.gpio  22 Level     spi1.1

Ok, that's the IRQ line from the mcp251xfd.

> > > I have tested this in several ways. If I send one normal can message
> > > the counter in /proc/interrupts is increased twice. I have also put
> > > some printouts in mcp251xfd-core.c that confirms that mcp251xfd_irq()
> > > is run twice and the second time intf_pending is 0.
> > 
> > You mean mcp251xfd_irq() is started twice?
> 
> Yes.

That's not the way it's supposed to be. Maybe it's a limitation of the
am335x IRQ controller.

> > The big loop
> > (https://elixir.bootlin.com/linux/v5.16/source/drivers/net/can/spi/mcp251xf
> > d/mcp251xfd-core.c#L2182)
> > in mcp251xfd_irq() is run twice, and the IRQ handler is left only if there are
> > no pending IRQs.
> > 
> > The main IRQ handler loop (omitting the rx-int) is straight forward, and not
> > optimized:
> > - read pending IRQs [*]
> > - exit if there are no pending IRQs
> > - handle pending IRQs
> >   for RX:
> >   - read RX-FIFO level [*]
> >   - read RX'ed CAN frames [*]
> >   - mark RX'ed CAN frames as read [*]
> > - loop
> > 
> > All actions marked with a [*] require a SPI transfer, which results in 5 SPI
> > transfers (read pending IRQs is done twice) per RX'ed CAN frame.
> > (Assuming there is only 1 RX'ed frame pending and no pending IRQs after
> > the first loop.)
> > 
> > I have some ideas how to optimize this:
> > - do a single SPI transfer instead of several small ones:
> >   e.g. pending IRQs + RX-FIFO level, or read RX'ed frames + mark as read
> > - opportunistically assume RX'ed frame on interrupt and get rid of 1st
> >   read pending IRQs
> > - assume TX done IRQ, if CAN frames have been TX'ed but not marked as
> >   sent hardware
> > - coalesce RX IRQs
> > - coalesce TX done IRQs
> > - combine several TX frames into single SPI transfer
> > 
> 
> Indeed there is a lot to work on, I'm a bit out of my depth here and
> as I said the first question is about why it interrupt is fired twice.
>
> It's not only the big loop that is run twice. With rx-int active what
> I can observe is this:
> 
> - read RX-FIFO level [*]
> - read RX'ed CAN frames [*]
> - mark RX'ed CAN frames as read [*]
> - read pending IRQs [*]
> - exit if there are no pending IRQs
>  -> No Pending IRQ's Exiting.

So far that look correct, but the extra call to mcp251xfd_irq() that
follow is not correct.

> - read pending IRQs [*]
> - exit if there are no pending IRQs
> -> No Pending IRQ's Exiting.

> > > And for testing purposes I changed the interrupt config to trigger on
> > > falling edge instead of level and with this configured there is only
> > > one interrupt fired. But I guess this will risk locking the interrupt
> > > line low if an edge isn't detected.
> > 
> > ACK - Please don't use edge triggered IRQs, sooner or later the driver will
> > miss an IRQ resulting in a handing driver. Always use level triggered with the
> > mcp251xfd.
> > 
> 
> Yes, I figured that. 
> 
> > > I have tried both with rx-int active and inactive. My scope shows
> > > really nice signals and that rx-int and int is deactivated
> > > simultaneous on incoming frames.
> > 
> > rx-int is an optimization to skip the first get IRQ pending transfer, as it
> > indicates RX'ed CAN frames.
> 
> OFFTOPIC:
> Shouldn’t a return after the rx-int part be ok to skip the next get IRQ pending transfer as well? 

That's a possible optimization.

I would look at the number of pending TX CAN frames. The higher the
number, the higher the chance that some have been send.

> if (priv->rx_int) {
>   do {
>     int rx_pending;
>     rx_pending = gpiod_get_value_cansleep(priv->rx_int);
>     
>     if (!rx_pending)
>       break;
> 
>     err = mcp251xfd_handle(priv, rxif);
>     if (err)
>       goto out_fail;
> 
>     handled = IRQ_HANDLED;
>   } while (1);
>   
>   if (handled == IRQ_HANDLED) {
>     return handled;
>   }
> }
> ...
> 
> Even better would be to check if (IRQ.VALUE == INACTIVE) but I don
> know how to that..

I think this is not possible in the Linux kernel anymore. There was a
function to get the GPIO associated with an IRQ, but that's not
available anymore. In general IRQs are not associated to GPIO, that's
probably only the case for SoCs, but not for other busses like PCI.

[...]

> > Can you test again with IRQ_TYPE_LEVEL_LOW and no rx-int-gpios. Please
> > instrument the beginning and the returns of the mcp251xfd_irq() function to
> > check if it's really started twice. Please print the
> > priv->regs_status.intf in every loop. Can you record the gpio2_5 with a
> > scope.
> 
> Yes I can but what do you mean with Instrument?

That's a fancy way of saying: please add printk() :)

> And no scope until Monday. 

no problem

> > Make sure that you don't send any CAN frames as reaction of reception.
> 
> How do I make sure of that? :/ 

The Linux kernel doesn't send any CAN frames on its own. You need a user
space program to do so. Make sure you don't have any user space program
running that's sending CAN frames.

Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Sv: MCP251xFD runs interrupt handler twice per message.
  2022-01-16 11:09     ` Marc Kleine-Budde
@ 2022-01-17 11:55       ` Markus Mirevik
  2022-01-17 13:27         ` Marc Kleine-Budde
  0 siblings, 1 reply; 11+ messages in thread
From: Markus Mirevik @ 2022-01-17 11:55 UTC (permalink / raw)
  To: Marc Kleine-Budde; +Cc: linux-can

[-- Attachment #1: Type: text/plain, Size: 9165 bytes --]

> On 14.01.2022 13:43:42, Markus Mirevik wrote:
> > > > I have performance problems with mcp2518fd. I have custom board
> > > > based of beagleboard black. (Sitara am335x) using two mcp2518fd.
> > > > The can communication uses a lot of CPU. irq/64-spi1.1 uses around
> > > > ~20% at
> > > > 1000 can messages/s.
> > > >
> > > > I have several questions but I'll start with the most confusing. I
> > > > have found that every can messages fires two interrupts on my board.
> > >
> > > Do you mean every RX'ed CAN frame?
> > Yes, every received CAN frame.
> >
> > > Which interrupts does increase twice?
> >
> > This one:
> > 64:  4  44e07000.gpio  22 Level     spi1.1
> 
> Ok, that's the IRQ line from the mcp251xfd.

ACK.

> 
> > > > I have tested this in several ways. If I send one normal can
> > > > message the counter in /proc/interrupts is increased twice. I have
> > > > also put some printouts in mcp251xfd-core.c that confirms that
> > > > mcp251xfd_irq() is run twice and the second time intf_pending is 0.
> > >
> > > You mean mcp251xfd_irq() is started twice?
> >
> > Yes.
> 
> That's not the way it's supposed to be. Maybe it's a limitation of the am335x
> IRQ controller.

What kind of limitation could that be? 

> 
> > > The big loop
> > > (https://elixir.bootlin.com/linux/v5.16/source/drivers/net/can/spi/m
> > > cp251xf
> > > d/mcp251xfd-core.c#L2182)
> > > in mcp251xfd_irq() is run twice, and the IRQ handler is left only if
> > > there are no pending IRQs.
> > >
> > > The main IRQ handler loop (omitting the rx-int) is straight forward,
> > > and not
> > > optimized:
> > > - read pending IRQs [*]
> > > - exit if there are no pending IRQs
> > > - handle pending IRQs
> > >   for RX:
> > >   - read RX-FIFO level [*]
> > >   - read RX'ed CAN frames [*]
> > >   - mark RX'ed CAN frames as read [*]
> > > - loop
> > >
> > > All actions marked with a [*] require a SPI transfer, which results
> > > in 5 SPI transfers (read pending IRQs is done twice) per RX'ed CAN frame.
> > > (Assuming there is only 1 RX'ed frame pending and no pending IRQs
> > > after the first loop.)
> > >
> > > I have some ideas how to optimize this:
> > > - do a single SPI transfer instead of several small ones:
> > >   e.g. pending IRQs + RX-FIFO level, or read RX'ed frames + mark as
> > > read
> > > - opportunistically assume RX'ed frame on interrupt and get rid of 1st
> > >   read pending IRQs
> > > - assume TX done IRQ, if CAN frames have been TX'ed but not marked as
> > >   sent hardware
> > > - coalesce RX IRQs
> > > - coalesce TX done IRQs
> > > - combine several TX frames into single SPI transfer
> > >
> >
> > Indeed there is a lot to work on, I'm a bit out of my depth here and
> > as I said the first question is about why it interrupt is fired twice.
> >
> > It's not only the big loop that is run twice. With rx-int active what
> > I can observe is this:
> >
> > - read RX-FIFO level [*]
> > - read RX'ed CAN frames [*]
> > - mark RX'ed CAN frames as read [*]
> > - read pending IRQs [*]
> > - exit if there are no pending IRQs
> >  -> No Pending IRQ's Exiting.
> 
> So far that look correct, but the extra call to mcp251xfd_irq() that follow is
> not correct.
> 
> > - read pending IRQs [*]
> > - exit if there are no pending IRQs
> > -> No Pending IRQ's Exiting.
> 
> > > > And for testing purposes I changed the interrupt config to trigger
> > > > on falling edge instead of level and with this configured there is
> > > > only one interrupt fired. But I guess this will risk locking the
> > > > interrupt line low if an edge isn't detected.
> > >
> > > ACK - Please don't use edge triggered IRQs, sooner or later the
> > > driver will miss an IRQ resulting in a handing driver. Always use
> > > level triggered with the mcp251xfd.
> > >
> >
> > Yes, I figured that.
> >
> > > > I have tried both with rx-int active and inactive. My scope shows
> > > > really nice signals and that rx-int and int is deactivated
> > > > simultaneous on incoming frames.
> > >
> > > rx-int is an optimization to skip the first get IRQ pending
> > > transfer, as it indicates RX'ed CAN frames.
> >
> > OFFTOPIC:
> > Shouldn’t a return after the rx-int part be ok to skip the next get IRQ
> pending transfer as well?
> 
> That's a possible optimization.
> 
> I would look at the number of pending TX CAN frames. The higher the
> number, the higher the chance that some have been send.
> 
> > if (priv->rx_int) {
> >   do {
> >     int rx_pending;
> >     rx_pending = gpiod_get_value_cansleep(priv->rx_int);
> >
> >     if (!rx_pending)
> >       break;
> >
> >     err = mcp251xfd_handle(priv, rxif);
> >     if (err)
> >       goto out_fail;
> >
> >     handled = IRQ_HANDLED;
> >   } while (1);
> >
> >   if (handled == IRQ_HANDLED) {
> >     return handled;
> >   }
> > }
> > ...
> >
> > Even better would be to check if (IRQ.VALUE == INACTIVE) but I don
> > know how to that..
> 
> I think this is not possible in the Linux kernel anymore. There was a function
> to get the GPIO associated with an IRQ, but that's not available anymore. In
> general IRQs are not associated to GPIO, that's probably only the case for
> SoCs, but not for other busses like PCI.
> 

OK.


> [...]
> 
> > > Can you test again with IRQ_TYPE_LEVEL_LOW and no rx-int-gpios.
> > > Please instrument the beginning and the returns of the
> > > mcp251xfd_irq() function to check if it's really started twice.
> > > Please print the
> > > priv->regs_status.intf in every loop. Can you record the gpio2_5
> > > priv->with a
> > > scope.
> >
> > Yes I can but what do you mean with Instrument?
> 
> That's a fancy way of saying: please add printk() :)
> 
> > And no scope until Monday.
> 
> no problem
> 
> > > Make sure that you don't send any CAN frames as reaction of reception.
> >
> > How do I make sure of that? :/
> 
> The Linux kernel doesn't send any CAN frames on its own. You need a user
> space program to do so. Make sure you don't have any user space program
> running that's sending CAN frames.
> 

No user space program running that would send can frames. And no can frames received on the other side.  

With IRQ_TYPE_LEVEL_LOW and the rx-int-gpios disabled I have added a call to netdev_info at the top of the function: 
...
        netdev_info(priv->ndev, "mcp251xfd_irq mcp251xfd_regs_status.intf: %d", priv->regs_status.intf );
        if (priv->rx_int) {
...

and here:
... 
               if (!(intf_pending)) {
                        netdev_info(priv->ndev, "!intf_pending mcp251xfd_regs_status.intf: %d", priv->regs_status.intf );
                        return handled;
                }
                netdev_info(priv->ndev, "intf_pending mcp251xfd_regs_status.intf: %d", priv->regs_status.intf );
...

Resulting in this when receiving one can message: 

[  987.486691] mcp251xfd spi1.1 can1: mcp251xfd_irq mcp251xfd_regs_status.intf: -1088815100
[  987.494987] mcp251xfd spi1.1 can1: intf_pending mcp251xfd_regs_status.intf: -1088815098
[  987.503239] mcp251xfd spi1.1 can1: !intf_pending mcp251xfd_regs_status.intf: -1088815100
[  987.511423] mcp251xfd spi1.1 can1: mcp251xfd_irq mcp251xfd_regs_status.intf: -1088815100
[  987.519603] mcp251xfd spi1.1 can1: !intf_pending mcp251xfd_regs_status.intf: -1088815100

mcp251xfd_regs_status.intf translated to binary 
1	1111 1111 1111 1111 1111 1111 1111 1111 1011 1111 0001 1010 0000 0000 0000 0100
2	1111 1111 1111 1111 1111 1111 1111 1111 1011 1111 0001 1010 0000 0000 0000 0110
3	1111 1111 1111 1111 1111 1111 1111 1111 1011 1111 0001 1010 0000 0000 0000 0100
4	1111 1111 1111 1111 1111 1111 1111 1111 1011 1111 0001 1010 0000 0000 0000 0100
5	1111 1111 1111 1111 1111 1111 1111 1111 1011 1111 0001 1010 0000 0000 0000 0100


I have also attached a  scope picture, this is without the printouts. 

And if I'm reading it correctly it confirms that it runs twice. 

1 read pending IRQs [*]
-RXIF-
2 read RX-FIFO level [*]
3 read RX'ed CAN frames [*]
4 and 5 mark RX'ed CAN frames as read [*] (This is the 5.10.79 version that uses regmap_update_bits which I assume uses one read and one write instruction.) 
-loop-
6 read pending IRQs [*]
-Exit-
7 read pending IRQs [*]

Confirmd with printouts:
[   26.990324] mcp251xfd spi1.1 can1: mcp251xfd_irq()
[   26.995173] mcp251xfd spi1.1 can1: * Bulk read MCP251XFD_REG_INT
[   27.001371] mcp251xfd spi1.1 can1: Interrupts pending, handling.
[   27.007435] mcp251xfd spi1.1 can1: * Read RX MCP251XFD_REG_FIFOSTA
[   27.013676] mcp251xfd spi1.1 can1: * Bulk read. Register: 1248 count: 5
[   27.020226] mcp251xfd spi1.1 can1: * Update bits MCP251XFD_REG_FIFOCON_UINC
[   27.027318] mcp251xfd spi1.1 can1: * Bulk read MCP251XFD_REG_INT
[   27.033378] mcp251xfd spi1.1 can1: No interrupts pending, returning.
[   27.039805] mcp251xfd spi1.1 can1: mcp251xfd_irq()
[   27.044617] mcp251xfd spi1.1 can1: * Bulk read MCP251XFD_REG_INT
[   27.050693] mcp251xfd spi1.1 can1: No interrupts pending, returning.

Regards
Markus 


[-- Attachment #2: scope.png --]
[-- Type: image/png, Size: 19673 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: MCP251xFD runs interrupt handler twice per message.
  2022-01-17 11:55       ` Sv: " Markus Mirevik
@ 2022-01-17 13:27         ` Marc Kleine-Budde
  2022-01-17 14:53           ` Sv: " Markus Mirevik
  0 siblings, 1 reply; 11+ messages in thread
From: Marc Kleine-Budde @ 2022-01-17 13:27 UTC (permalink / raw)
  To: Markus Mirevik; +Cc: linux-can

[-- Attachment #1: Type: text/plain, Size: 5566 bytes --]

On 17.01.2022 11:55:31, Markus Mirevik wrote:
> > > > You mean mcp251xfd_irq() is started twice?
> > >
> > > Yes.
> > 
> > That's not the way it's supposed to be. Maybe it's a limitation of the am335x
> > IRQ controller.
> 
> What kind of limitation could that be?

Some kind of limitation of the hardware. Don't know, SoC may have all
kinds of bugs :)

[...]

> > > > Can you test again with IRQ_TYPE_LEVEL_LOW and no rx-int-gpios.
> > > > Please instrument the beginning and the returns of the
> > > > mcp251xfd_irq() function to check if it's really started twice.
> > > > Please print the
> > > > priv->regs_status.intf in every loop. Can you record the gpio2_5
> > > > priv->with a
> > > > scope.
> > >
> > > Yes I can but what do you mean with Instrument?
> > 
> > That's a fancy way of saying: please add printk() :)
> > 
> > > And no scope until Monday.
> > 
> > no problem
> > 
> > > > Make sure that you don't send any CAN frames as reaction of reception.
> > >
> > > How do I make sure of that? :/
> > 
> > The Linux kernel doesn't send any CAN frames on its own. You need a user
> > space program to do so. Make sure you don't have any user space program
> > running that's sending CAN frames.
> > 
> 
> No user space program running that would send can frames. And no can frames received on the other side.  
> 
> With IRQ_TYPE_LEVEL_LOW and the rx-int-gpios disabled I have added a
> call to netdev_info at the top of the function:

For better readability register values are usually printed in hex
(0x%08x).

> ...
>         netdev_info(priv->ndev, "mcp251xfd_irq mcp251xfd_regs_status.intf: %d", priv->regs_status.intf );

Note: the "intf" contains old information here, because in this run of
the interrupt handler the intf has not been read so far.

>         if (priv->rx_int) {
> ...
> 
> and here:
> ... 
>                if (!(intf_pending)) {
>                         netdev_info(priv->ndev, "!intf_pending mcp251xfd_regs_status.intf: %d", priv->regs_status.intf );
>                         return handled;
>                 }
>                 netdev_info(priv->ndev, "intf_pending mcp251xfd_regs_status.intf: %d", priv->regs_status.intf );
> ...
> 
> Resulting in this when receiving one can message: 
> 
> [  987.486691] mcp251xfd spi1.1 can1: mcp251xfd_irq mcp251xfd_regs_status.intf: -1088815100
> [  987.494987] mcp251xfd spi1.1 can1: intf_pending mcp251xfd_regs_status.intf:  -1088815098
> [  987.503239] mcp251xfd spi1.1 can1: !intf_pending mcp251xfd_regs_status.intf: -1088815100

2nd loop starts here:

> [  987.511423] mcp251xfd spi1.1 can1: mcp251xfd_irq mcp251xfd_regs_status.intf: -1088815100
> [  987.519603] mcp251xfd spi1.1 can1: !intf_pending mcp251xfd_regs_status.intf: -1088815100
> 
> mcp251xfd_regs_status.intf translated to binary 
> 1	1111 1111 1111 1111 1111 1111 1111 1111 1011 1111 0001 1010 0000 0000 0000 0100
> 2	1111 1111 1111 1111 1111 1111 1111 1111 1011 1111 0001 1010 0000 0000 0000 0110
> 3	1111 1111 1111 1111 1111 1111 1111 1111 1011 1111 0001 1010 0000 0000 0000 0100

2nd loop starts here:

> 4	1111 1111 1111 1111 1111 1111 1111 1111 1011 1111 0001 1010 0000 0000 0000 0100
> 5	1111 1111 1111 1111 1111 1111 1111 1111 1011 1111 0001 1010 0000 0000 0000 0100
> 
> I have also attached a scope picture, this is without the printouts.

good idea, the printk usually takes much time.

That scope picture looks good! Can you include both runs of the IRQ
handler in one scope picture? I'm interested if there is activity or a
glich on IRQ line.

After you've that picture, you can try to catch glitches on the IRQ
line. If your scope allows this, setup a trigger on the INT channel that
triggers if the channel is low for considerably less then it takes to
clear the IRQ.

Currently it takes about 160µs to clear the IRQ, so setup a trigger for
less than 50µs.

> And if I'm reading it correctly it confirms that it runs twice. 

ACK

> 1 read pending IRQs [*]
> -RXIF-
> 2 read RX-FIFO level [*]
> 3 read RX'ed CAN frames [*]
> 4 and 5 mark RX'ed CAN frames as read [*]
> (This is the 5.10.79 version that uses regmap_update_bits which I
>  assume uses one read and one write instruction.)

ACK - in the CRC mode (which is default) the regmap_update_bits has not
been optimized, it does a read/modify/write.

> -loop-
> 6 read pending IRQs [*]
> -Exit-
> 7 read pending IRQs [*]
> 
> Confirmd with printouts:
> [   26.990324] mcp251xfd spi1.1 can1: mcp251xfd_irq()
> [   26.995173] mcp251xfd spi1.1 can1: * Bulk read MCP251XFD_REG_INT
> [   27.001371] mcp251xfd spi1.1 can1: Interrupts pending, handling.
> [   27.007435] mcp251xfd spi1.1 can1: * Read RX MCP251XFD_REG_FIFOSTA
> [   27.013676] mcp251xfd spi1.1 can1: * Bulk read. Register: 1248 count: 5
> [   27.020226] mcp251xfd spi1.1 can1: * Update bits MCP251XFD_REG_FIFOCON_UINC
> [   27.027318] mcp251xfd spi1.1 can1: * Bulk read MCP251XFD_REG_INT
> [   27.033378] mcp251xfd spi1.1 can1: No interrupts pending, returning.
> [   27.039805] mcp251xfd spi1.1 can1: mcp251xfd_irq()
> [   27.044617] mcp251xfd spi1.1 can1: * Bulk read MCP251XFD_REG_INT
> [   27.050693] mcp251xfd spi1.1 can1: No interrupts pending, returning.

regards,
Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Sv: MCP251xFD runs interrupt handler twice per message.
  2022-01-17 13:27         ` Marc Kleine-Budde
@ 2022-01-17 14:53           ` Markus Mirevik
  2022-01-17 15:08             ` Marc Kleine-Budde
  0 siblings, 1 reply; 11+ messages in thread
From: Markus Mirevik @ 2022-01-17 14:53 UTC (permalink / raw)
  To: Marc Kleine-Budde; +Cc: linux-can

> > > > > You mean mcp251xfd_irq() is started twice?
> > > >
> > > > Yes.
> > >
> > > That's not the way it's supposed to be. Maybe it's a limitation of
> > > the am335x IRQ controller.
> >
> > What kind of limitation could that be?
> 
> Some kind of limitation of the hardware. Don't know, SoC may have all kinds
> of bugs :)
>



 
> [...]
> 
> > > > > Can you test again with IRQ_TYPE_LEVEL_LOW and no rx-int-gpios.
> > > > > Please instrument the beginning and the returns of the
> > > > > mcp251xfd_irq() function to check if it's really started twice.
> > > > > Please print the
> > > > > priv->regs_status.intf in every loop. Can you record the gpio2_5
> > > > > priv->with a
> > > > > scope.
> > > >
> > > > Yes I can but what do you mean with Instrument?
> > >
> > > That's a fancy way of saying: please add printk() :)
> > >
> > > > And no scope until Monday.
> > >
> > > no problem
> > >
> > > > > Make sure that you don't send any CAN frames as reaction of
> reception.
> > > >
> > > > How do I make sure of that? :/
> > >
> > > The Linux kernel doesn't send any CAN frames on its own. You need a
> > > user space program to do so. Make sure you don't have any user space
> > > program running that's sending CAN frames.
> > >
> >
> > No user space program running that would send can frames. And no can
> frames received on the other side.
> >
> > With IRQ_TYPE_LEVEL_LOW and the rx-int-gpios disabled I have added a
> > call to netdev_info at the top of the function:
> 
> For better readability register values are usually printed in hex (0x%08x).
> 

Noted. 

> > ...
> >         netdev_info(priv->ndev, "mcp251xfd_irq
> > mcp251xfd_regs_status.intf: %d", priv->regs_status.intf );
> 
> Note: the "intf" contains old information here, because in this run of the
> interrupt handler the intf has not been read so far.
> 

Noted. 

> >         if (priv->rx_int) {
> > ...
> >
> > and here:
> > ...
> >                if (!(intf_pending)) {
> >                         netdev_info(priv->ndev, "!intf_pending
> mcp251xfd_regs_status.intf: %d", priv->regs_status.intf );
> >                         return handled;
> >                 }
> >                 netdev_info(priv->ndev, "intf_pending
> > mcp251xfd_regs_status.intf: %d", priv->regs_status.intf ); ...
> >
> > Resulting in this when receiving one can message:
> >
> > [  987.486691] mcp251xfd spi1.1 can1: mcp251xfd_irq
> > mcp251xfd_regs_status.intf: -1088815100 [  987.494987] mcp251xfd
> > spi1.1 can1: intf_pending mcp251xfd_regs_status.intf:  -1088815098 [
> > 987.503239] mcp251xfd spi1.1 can1: !intf_pending
> > mcp251xfd_regs_status.intf: -1088815100
> 
> 2nd loop starts here:
> 
> > [  987.511423] mcp251xfd spi1.1 can1: mcp251xfd_irq
> > mcp251xfd_regs_status.intf: -1088815100 [  987.519603] mcp251xfd
> > spi1.1 can1: !intf_pending mcp251xfd_regs_status.intf: -1088815100
> >
> > mcp251xfd_regs_status.intf translated to binary
> > 1	1111 1111 1111 1111 1111 1111 1111 1111 1011 1111 0001 1010
> 0000 0000 0000 0100
> > 2	1111 1111 1111 1111 1111 1111 1111 1111 1011 1111 0001 1010
> 0000 0000 0000 0110
> > 3	1111 1111 1111 1111 1111 1111 1111 1111 1011 1111 0001 1010
> 0000 0000 0000 0100
> 
> 2nd loop starts here:
> 
> > 4	1111 1111 1111 1111 1111 1111 1111 1111 1011 1111 0001 1010
> 0000 0000 0000 0100
> > 5	1111 1111 1111 1111 1111 1111 1111 1111 1011 1111 0001 1010
> 0000 0000 0000 0100
> >
> > I have also attached a scope picture, this is without the printouts.
> 
> good idea, the printk usually takes much time.
> 
> That scope picture looks good! Can you include both runs of the IRQ handler
> in one scope picture? I'm interested if there is activity or a glich on IRQ line.
 
Both are included. The last (number 7) is the one extra reading of the register!?

> After you've that picture, you can try to catch glitches on the IRQ line. If your
> scope allows this, setup a trigger on the INT channel that triggers if the
> channel is low for considerably less then it takes to clear the IRQ.
> 
> Currently it takes about 160µs to clear the IRQ, so setup a trigger for less than
> 50µs.
> 

No glitches. The scope can trigger on glitch but no one is to be found. 

> > And if I'm reading it correctly it confirms that it runs twice.
> 
> ACK
> 
> > 1 read pending IRQs [*]
> > -RXIF-
> > 2 read RX-FIFO level [*]
> > 3 read RX'ed CAN frames [*]
> > 4 and 5 mark RX'ed CAN frames as read [*] (This is the 5.10.79 version
> > that uses regmap_update_bits which I  assume uses one read and one
> > write instruction.)
> 
> ACK - in the CRC mode (which is default) the regmap_update_bits has not
> been optimized, it does a read/modify/write.
> 
> > -loop-
> > 6 read pending IRQs [*]
> > -Exit-
> > 7 read pending IRQs [*]
> >
> > Confirmd with printouts:
> > [   26.990324] mcp251xfd spi1.1 can1: mcp251xfd_irq()
> > [   26.995173] mcp251xfd spi1.1 can1: * Bulk read MCP251XFD_REG_INT
> > [   27.001371] mcp251xfd spi1.1 can1: Interrupts pending, handling.
> > [   27.007435] mcp251xfd spi1.1 can1: * Read RX MCP251XFD_REG_FIFOSTA
> > [   27.013676] mcp251xfd spi1.1 can1: * Bulk read. Register: 1248 count: 5
> > [   27.020226] mcp251xfd spi1.1 can1: * Update bits MCP251XFD_REG_FIFOCON_UINC
> > [   27.027318] mcp251xfd spi1.1 can1: * Bulk read MCP251XFD_REG_INT
> > [   27.033378] mcp251xfd spi1.1 can1: No interrupts pending, returning.
> > [   27.039805] mcp251xfd spi1.1 can1: mcp251xfd_irq()
> > [   27.044617] mcp251xfd spi1.1 can1: * Bulk read MCP251XFD_REG_INT
> > [   27.050693] mcp251xfd spi1.1 can1: No interrupts pending, returning.

Regards
Markus

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: MCP251xFD runs interrupt handler twice per message.
  2022-01-17 14:53           ` Sv: " Markus Mirevik
@ 2022-01-17 15:08             ` Marc Kleine-Budde
  2022-01-17 15:56               ` Marc Kleine-Budde
  2022-01-18 10:16               ` Marc Kleine-Budde
  0 siblings, 2 replies; 11+ messages in thread
From: Marc Kleine-Budde @ 2022-01-17 15:08 UTC (permalink / raw)
  To: Markus Mirevik; +Cc: linux-can

[-- Attachment #1: Type: text/plain, Size: 1244 bytes --]

On 17.01.2022 14:53:46, Markus Mirevik wrote:
> > That scope picture looks good! Can you include both runs of the IRQ
> > handler in one scope picture? I'm interested if there is activity or
> > a glich on IRQ line.
>  
> Both are included. The last (number 7) is the one extra reading of the
> register!?

Doh! Indeed, you'r right, missed that.

> > After you've that picture, you can try to catch glitches on the IRQ
> > line. If your scope allows this, setup a trigger on the INT channel
> > that triggers if the channel is low for considerably less then it
> > takes to clear the IRQ.
> > 
> > Currently it takes about 160µs to clear the IRQ, so setup a trigger
> > for less than 50µs.
>
> No glitches. The scope can trigger on glitch but no one is to be
> found.

Good.

It looks like this is not a problem of the mcp251xfd driver. I think now
it's time to ask an am335x expert. Maybe one of my coworker knows this
issue.

regards,
Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: MCP251xFD runs interrupt handler twice per message.
  2022-01-17 15:08             ` Marc Kleine-Budde
@ 2022-01-17 15:56               ` Marc Kleine-Budde
  2022-01-18 10:16               ` Marc Kleine-Budde
  1 sibling, 0 replies; 11+ messages in thread
From: Marc Kleine-Budde @ 2022-01-17 15:56 UTC (permalink / raw)
  To: Markus Mirevik; +Cc: linux-can

[-- Attachment #1: Type: text/plain, Size: 515 bytes --]

On 17.01.2022 16:08:33, Marc Kleine-Budde wrote:
> It looks like this is not a problem of the mcp251xfd driver. I think now
> it's time to ask an am335x expert. Maybe one of my coworker knows this
> issue.

Doesn't ring any bell.

Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: MCP251xFD runs interrupt handler twice per message.
  2022-01-17 15:08             ` Marc Kleine-Budde
  2022-01-17 15:56               ` Marc Kleine-Budde
@ 2022-01-18 10:16               ` Marc Kleine-Budde
  2022-01-18 10:22                 ` Sv: " Markus Mirevik
  1 sibling, 1 reply; 11+ messages in thread
From: Marc Kleine-Budde @ 2022-01-18 10:16 UTC (permalink / raw)
  To: Markus Mirevik; +Cc: linux-can

[-- Attachment #1: Type: text/plain, Size: 892 bytes --]

On 17.01.2022 16:08:33, Marc Kleine-Budde wrote:
> On 17.01.2022 14:53:46, Markus Mirevik wrote:
> > > That scope picture looks good! Can you include both runs of the IRQ
> > > handler in one scope picture? I'm interested if there is activity or
> > > a glich on IRQ line.
> >  
> > Both are included. The last (number 7) is the one extra reading of the
> > register!?
> 
> Doh! Indeed, you'r right, missed that.

Can you take a scope picture with falling edge triggered IRQ?

(Note for the out of context reader: _Always_ use level triggered IRQs
 with the mcp251xfd in production use.)

regards,
Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Sv: MCP251xFD runs interrupt handler twice per message.
  2022-01-18 10:16               ` Marc Kleine-Budde
@ 2022-01-18 10:22                 ` Markus Mirevik
  0 siblings, 0 replies; 11+ messages in thread
From: Markus Mirevik @ 2022-01-18 10:22 UTC (permalink / raw)
  To: Marc Kleine-Budde; +Cc: linux-can

> On 17.01.2022 16:08:33, Marc Kleine-Budde wrote:
> > On 17.01.2022 14:53:46, Markus Mirevik wrote:
> > > > That scope picture looks good! Can you include both runs of the
> > > > IRQ handler in one scope picture? I'm interested if there is
> > > > activity or a glich on IRQ line.
> > >
> > > Both are included. The last (number 7) is the one extra reading of
> > > the register!?
> >
> > Doh! Indeed, you'r right, missed that.
> 
> Can you take a scope picture with falling edge triggered IRQ?

Yes, but I don’t have the scope today. Get back to you when I have it.
 
> (Note for the out of context reader: _Always_ use level triggered IRQs  with
> the mcp251xfd in production use.)


Regards
Markus


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2022-01-18 10:22 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-14 11:05 MCP251xFD runs interrupt handler twice per message Markus Mirevik
2022-01-14 12:41 ` Marc Kleine-Budde
2022-01-14 13:43   ` Sv: " Markus Mirevik
2022-01-16 11:09     ` Marc Kleine-Budde
2022-01-17 11:55       ` Sv: " Markus Mirevik
2022-01-17 13:27         ` Marc Kleine-Budde
2022-01-17 14:53           ` Sv: " Markus Mirevik
2022-01-17 15:08             ` Marc Kleine-Budde
2022-01-17 15:56               ` Marc Kleine-Budde
2022-01-18 10:16               ` Marc Kleine-Budde
2022-01-18 10:22                 ` Sv: " Markus Mirevik

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.