m_can: a lot of 'Rx FIFO 0 Message Lost' in dmesg

From: Mariusz Madej @ 2021-02-24 14:27 UTC
  To: linux-can; +Cc: dmurphy

Hi,

I have a problem with the m_can controller in my SAMA5D2 processor.
Under heavy CAN traffic, my device starts to report the following in dmesg:

[   77.610000] m_can_platform f8054000.can can0: Rx FIFO 0 Message Lost
[   77.620000] m_can_platform f8054000.can can0: Rx FIFO 0 Message Lost
[   77.630000] m_can_platform f8054000.can can0: Rx FIFO 0 Message Lost
[   77.630000] m_can_platform f8054000.can can0: Rx FIFO 0 Message Lost
[   77.640000] m_can_platform f8054000.can can0: Rx FIFO 0 Message Lost
[   77.640000] m_can_platform f8054000.can can0: Rx FIFO 0 Message Lost
[   77.650000] m_can_platform f8054000.can can0: Rx FIFO 0 Message Lost
[   77.660000] m_can_platform f8054000.can can0: Rx FIFO 0 Message Lost
[   77.660000] m_can_platform f8054000.can can0: Rx FIFO 0 Message Lost

which puts a heavy load on my system.

I think I have a clue about what is going on, but my kernel knowledge is
limited, so I would like you to tell me whether I am right. So:

The only place in m_can.c where the interrupt register is cleared is the
function called when an interrupt arrives:

static irqreturn_t m_can_isr(int irq, void *dev_id)
{
.
.
        /* ACK all irqs */
        if (ir & IR_ALL_INT)
                m_can_write(cdev, M_CAN_IR, ir);
.
.
}

But when we enter 'NAPI mode' under heavy load, we never get to this function
until the load drops and interrupts are enabled again. In this situation,
this code:

static int m_can_do_rx_poll(struct net_device *dev, int quota)
{
        struct m_can_classdev *cdev = netdev_priv(dev);
        u32 pkts = 0;
        u32 rxfs;

        rxfs = m_can_read(cdev, M_CAN_RXF0S);
        if (!(rxfs & RXFS_FFL_MASK)) {
                netdev_dbg(dev, "no messages in fifo0\n");
                return 0;
        }

        while ((rxfs & RXFS_FFL_MASK) && (quota > 0)) {
                if (rxfs & RXFS_RFL)
                        netdev_warn(dev, "Rx FIFO 0 Message Lost\n");

                m_can_read_fifo(dev, rxfs);

                quota--;
                pkts++;
                rxfs = m_can_read(cdev, M_CAN_RXF0S);
        }

        if (pkts)
                can_led_event(dev, CAN_LED_EVENT_RX);

        return pkts;
}

will always see (rxfs & RXFS_RFL) set until interrupts are enabled again.
That is why we get so many messages in a row for such a long time. So clearing
the RXFS_RFL bit after the warning is issued could be a solution.
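
The reasoning above can be checked with a small user-space model. This is only
a sketch, not driver code: struct fake_mcan, fake_read_rxf0s(),
fake_write_ir(), poll_warnings() and count_warnings() are hypothetical names
made up for illustration. It assumes that the message-lost bit in RXF0S simply
mirrors IR.RF0L until that interrupt flag is acknowledged, and the bit
positions are assumed as well, so they should be checked against m_can.c and
the M_CAN manual.

```c
#include <stdint.h>

/* Bit layout assumed for this model; verify against m_can.c. */
#define IR_RF0L        (1u << 3)   /* IR: Rx FIFO 0 message lost */
#define RXFS_RFL       (1u << 25)  /* RXF0S: message-lost status */
#define RXFS_FFL_MASK  0x7fu       /* RXF0S: FIFO fill level */

/* Hypothetical stand-in for the controller; not a kernel structure. */
struct fake_mcan {
        uint32_t ir;    /* interrupt register, write 1 to clear */
        uint32_t fill;  /* frames waiting in Rx FIFO 0 */
};

static uint32_t fake_read_rxf0s(const struct fake_mcan *c)
{
        /* Assumption: RXF0S.RF0L stays set for as long as IR.RF0L is set. */
        uint32_t rxfs = c->fill & RXFS_FFL_MASK;

        if (c->ir & IR_RF0L)
                rxfs |= RXFS_RFL;
        return rxfs;
}

static void fake_write_ir(struct fake_mcan *c, uint32_t val)
{
        c->ir &= ~val;  /* write-1-to-clear semantics */
}

/* Same loop shape as m_can_do_rx_poll(); returns the number of warnings. */
static int poll_warnings(struct fake_mcan *c, int quota, int ack_lost_flag)
{
        int warnings = 0;
        uint32_t rxfs = fake_read_rxf0s(c);

        while ((rxfs & RXFS_FFL_MASK) && quota > 0) {
                if (rxfs & RXFS_RFL) {
                        warnings++;     /* stands in for netdev_warn() */
                        if (ack_lost_flag)
                                fake_write_ir(c, IR_RF0L);
                }
                c->fill--;              /* stands in for m_can_read_fifo() */
                quota--;
                rxfs = fake_read_rxf0s(c);
        }
        return warnings;
}

/* One overflow event, then 'frames' frames drained while the ISR never runs. */
int count_warnings(int frames, int ack_lost_flag)
{
        struct fake_mcan c = { .ir = IR_RF0L, .fill = (uint32_t)frames };

        return poll_warnings(&c, frames, ack_lost_flag);
}
```

In this model, count_warnings(8, 0) returns 8, one warning per frame drained,
while count_warnings(8, 1) returns 1, because the flag is acknowledged right
after the first warning. Whether real hardware behaves the same way depends on
the mirroring assumption stated above.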

Can you tell me if I am right?

Regards
Mariusz
