On 14.01.2022 13:43:42, Markus Mirevik wrote:
> > > I have performance problems with mcp2518fd. I have custom board based
> > > of beagleboard black. (Sitara am335x) using two mcp2518fd. The can
> > > communication uses a lot of CPU. irq/64-spi1.1 uses around ~20% at
> > > 1000 can messages/s.
> > >
> > > I have several questions but I'll start with the most confusing. I
> > > have found that every can messages fires two interrupts on my board.
> > 
> > Do you mean every RX'ed CAN frame?
> Yes, every received CAN frame.
> 
> > Which interrupts does increase twice?
> 
> This one: 
> 64:  4  44e07000.gpio  22 Level     spi1.1

Ok, that's the IRQ line from the mcp251xfd.

> > > I have tested this in several ways. If I send one normal can message
> > > the counter in /proc/interrupts is increased twice. I have also put
> > > some printouts in mcp251xfd-core.c that confirms that mcp251xfd_irq()
> > > is run twice and the second time intf_pending is 0.
> > 
> > You mean mcp251xfd_irq() is started twice?
> 
> Yes.

That's not the way it's supposed to be. Maybe it's a limitation of the
am335x IRQ controller.

> > The big loop
> > (https://elixir.bootlin.com/linux/v5.16/source/drivers/net/can/spi/mcp251xf
> > d/mcp251xfd-core.c#L2182)
> > in mcp251xfd_irq() is run twice, and the IRQ handler is left only if there are
> > no pending IRQs.
> > 
> > The main IRQ handler loop (omitting the rx-int) is straight forward, and not
> > optimized:
> > - read pending IRQs [*]
> > - exit if there are no pending IRQs
> > - handle pending IRQs
> >   for RX:
> >   - read RX-FIFO level [*]
> >   - read RX'ed CAN frames [*]
> >   - mark RX'ed CAN frames as read [*]
> > - loop
> > 
> > All actions marked with a [*] require a SPI transfer, which results in 5 SPI
> > transfers (read pending IRQs is done twice) per RX'ed CAN frame.
> > (Assuming there is only 1 RX'ed frame pending and no pending IRQs after
> > the first loop.)
> > 
> > I have some ideas how to optimize this:
> > - do a single SPI transfer instead of several small ones:
> >   e.g. pending IRQs + RX-FIFO level, or read RX'ed frames + mark as read
> > - opportunistically assume RX'ed frame on interrupt and get rid of 1st
> >   read pending IRQs
> > - assume TX done IRQ, if CAN frames have been TX'ed but not marked as
> >   sent hardware
> > - coalesce RX IRQs
> > - coalesce TX done IRQs
> > - combine several TX frames into single SPI transfer
> > 
> 
> Indeed there is a lot to work on, I'm a bit out of my depth here and
> as I said the first question is about why it interrupt is fired twice.
>
> It's not only the big loop that is run twice. With rx-int active what
> I can observe is this:
> 
> - read RX-FIFO level [*]
> - read RX'ed CAN frames [*]
> - mark RX'ed CAN frames as read [*]
> - read pending IRQs [*]
> - exit if there are no pending IRQs
>  -> No Pending IRQ's Exiting.

So far that look correct, but the extra call to mcp251xfd_irq() that
follow is not correct.

> - read pending IRQs [*]
> - exit if there are no pending IRQs
> -> No Pending IRQ's Exiting.

> > > And for testing purposes I changed the interrupt config to trigger on
> > > falling edge instead of level and with this configured there is only
> > > one interrupt fired. But I guess this will risk locking the interrupt
> > > line low if an edge isn't detected.
> > 
> > ACK - Please don't use edge triggered IRQs, sooner or later the driver will
> > miss an IRQ resulting in a handing driver. Always use level triggered with the
> > mcp251xfd.
> > 
> 
> Yes, I figured that. 
> 
> > > I have tried both with rx-int active and inactive. My scope shows
> > > really nice signals and that rx-int and int is deactivated
> > > simultaneous on incoming frames.
> > 
> > rx-int is an optimization to skip the first get IRQ pending transfer, as it
> > indicates RX'ed CAN frames.
> 
> OFFTOPIC:
> Shouldn’t a return after the rx-int part be ok to skip the next get IRQ pending transfer as well? 

That's a possible optimization.

I would look at the number of pending TX CAN frames. The higher the
number, the higher the chance that some have been send.

> if (priv->rx_int) {
>   do {
>     int rx_pending;
>     rx_pending = gpiod_get_value_cansleep(priv->rx_int);
>     
>     if (!rx_pending)
>       break;
> 
>     err = mcp251xfd_handle(priv, rxif);
>     if (err)
>       goto out_fail;
> 
>     handled = IRQ_HANDLED;
>   } while (1);
>   
>   if (handled == IRQ_HANDLED) {
>     return handled;
>   }
> }
> ...
> 
> Even better would be to check if (IRQ.VALUE == INACTIVE) but I don
> know how to that..

I think this is not possible in the Linux kernel anymore. There was a
function to get the GPIO associated with an IRQ, but that's not
available anymore. In general IRQs are not associated to GPIO, that's
probably only the case for SoCs, but not for other busses like PCI.

[...]

> > Can you test again with IRQ_TYPE_LEVEL_LOW and no rx-int-gpios. Please
> > instrument the beginning and the returns of the mcp251xfd_irq() function to
> > check if it's really started twice. Please print the
> > priv->regs_status.intf in every loop. Can you record the gpio2_5 with a
> > scope.
> 
> Yes I can but what do you mean with Instrument?

That's a fancy way of saying: please add printk() :)

> And no scope until Monday. 

no problem

> > Make sure that you don't send any CAN frames as reaction of reception.
> 
> How do I make sure of that? :/ 

The Linux kernel doesn't send any CAN frames on its own. You need a user
space program to do so. Make sure you don't have any user space program
running that's sending CAN frames.

Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |