* can: m_can: tcan4x5x m_can_isr do not handle tx if rx fails
@ 2022-02-22 11:14 Michael Anochin
  2022-02-22 13:20 ` Marc Kleine-Budde
  0 siblings, 1 reply; 19+ messages in thread
From: Michael Anochin @ 2022-02-22 11:14 UTC (permalink / raw)
  To: linux-can

Regarding the ENOBUFS problem when using CAN interfaces under higher
load:

In the m_can_isr handler, if rx fails (m_can_rx_peripheral), then
netif_wake_queue(dev) is not called. Can this lead to ENOBUFS?
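
The suspected control flow can be sketched as a small stand-alone model.
This is only an illustration of the hypothesis, not the driver code: all
names (model_dev, model_rx, model_isr) are hypothetical, and the wake-up
is represented by clearing a flag rather than calling netif_wake_queue().

```c
#include <stdbool.h>

/* Hypothetical model of the interrupt handler's shape:
 * "RX step fails -> leave early -> TX-completion step is skipped". */
struct model_dev {
	bool rx_pending;      /* an RX interrupt flag is set */
	bool tx_done_pending; /* a TX-completion flag is set */
	bool rx_will_fail;    /* force the RX step to report an error */
	bool queue_stopped;   /* stand-in for the stopped TX queue */
};

/* Stand-in for the RX step; reports an error on request. */
static int model_rx(struct model_dev *d)
{
	d->rx_pending = false;
	return d->rx_will_fail ? -1 : 0;
}

/* Returns true if the TX-completion step ran and woke the queue. */
bool model_isr(struct model_dev *d)
{
	if (d->rx_pending && model_rx(d) < 0)
		return false; /* early exit: TX handling is skipped */

	if (d->tx_done_pending) {
		d->tx_done_pending = false;
		d->queue_stopped = false; /* stand-in for netif_wake_queue() */
		return true;
	}
	return false;
}
```

In this model a failing RX step leaves queue_stopped set even though a TX
completion was pending, which is the situation the question describes.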


* Re: can: m_can: tcan4x5x m_can_isr do not handle tx if rx fails
  2022-02-22 11:14 can: m_can: tcan4x5x m_can_isr do not handle tx if rx fails Michael Anochin
@ 2022-02-22 13:20 ` Marc Kleine-Budde
       [not found]   ` <c2651e9c-d3e7-815a-6e18-8ddffc04d3d7@photo-meter.com>
  0 siblings, 1 reply; 19+ messages in thread
From: Marc Kleine-Budde @ 2022-02-22 13:20 UTC (permalink / raw)
  To: Michael Anochin; +Cc: linux-can

On 22.02.2022 12:14:23, Michael Anochin wrote:
> In the context of the ENOBUFS problem by using can interfaces under higher
> load:
> 
> In m_can_isr handler, if rx fails (m_can_rx_peripheral), then no
> netif_wake_queue(dev) will be called. Can this lead to ENOBUFS?

Yes - Can you send a fix?

Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |

* Re: can: m_can: tcan4x5x m_can_isr do not handle tx if rx fails
       [not found]   ` <c2651e9c-d3e7-815a-6e18-8ddffc04d3d7@photo-meter.com>
@ 2022-02-22 13:44     ` Marc Kleine-Budde
       [not found]       ` <e3504807-06fc-b6d9-3fb1-bf8d94e2b444@photo-meter.com>
  0 siblings, 1 reply; 19+ messages in thread
From: Marc Kleine-Budde @ 2022-02-22 13:44 UTC (permalink / raw)
  To: Michael Anochin; +Cc: linux-can

Please keep the Mailing List on Cc.

On 22.02.2022 14:26:22, Michael Anochin wrote:
> I try it. But I sink in rx-offload.c with __skb_queue_add_sort and
> napi scheduler. That blocks work_queue for tx, but I don't understand
> how. Need help.

Your idea that m_can_rx_peripheral() in m_can_isr() may fail is valid.
You can add a netdev_err() to report the error if
m_can_rx_peripheral() fails. Then investigate further.

> May be I should increase the quota for napi polling?

Try to increase and check if that helps.

regards,
Marc

* Re: can: m_can: tcan4x5x m_can_isr do not handle tx if rx fails
       [not found]       ` <e3504807-06fc-b6d9-3fb1-bf8d94e2b444@photo-meter.com>
@ 2022-02-22 14:45         ` Marc Kleine-Budde
  2022-02-22 14:54           ` Michael Anochin
  2022-02-22 15:11           ` Michael Anochin
  0 siblings, 2 replies; 19+ messages in thread
From: Marc Kleine-Budde @ 2022-02-22 14:45 UTC (permalink / raw)
  To: Michael Anochin; +Cc: linux-can

Please don't forget to keep the mailing list on Cc!

On 22.02.2022 15:30:33, Michael Anochin wrote:
> 
> > You can add a netdev_error() to report the error if
> Done, m_can_rx_peripheral(dev) returns each time normally with 0.
> I added netdev_err also after out_fail in m_can_isr, but it fires no error
> in dmesg after NOBUFS.
> 
> The curious thing is that it fails in the other place.
> 
> Sometimes I see
> [ 9945.908861] tcan4x5x spi4.0 can1: can_put_echo_skb: BUG! echo_skb 11 is
> occupied!
> 
> But I think, it is not my problem.

This should not happen. Especially with the tcan driver. In a previous
mail you stated that you are using the following mram config:

| bosch,mram-cfg =  <0x0 0 0 16 0 0 1 1>;

is this still the case? This is inconsistent with the above error
message.
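
To make the inconsistency concrete: the eight cells of bosch,mram-cfg are,
per the DT bindings, <offset sidf_elems xidf_elems rxf0_elems rxf1_elems
rxb_elems txe_elems txb_elems>. The sketch below is a hypothetical helper
(not driver code) and assumes there is one echo skb slot per TX buffer
element, so that valid echo indices are 0..txb_elems-1.

```c
#include <stdbool.h>
#include <stdint.h>

/* Field order as documented for bosch,mram-cfg. */
struct mram_cfg {
	uint32_t offset, sidf_elems, xidf_elems;
	uint32_t rxf0_elems, rxf1_elems, rxb_elems;
	uint32_t txe_elems, txb_elems;
};

/* Map the eight device tree cells onto named fields. */
struct mram_cfg mram_cfg_from_cells(const uint32_t c[8])
{
	struct mram_cfg cfg = {
		.offset = c[0], .sidf_elems = c[1], .xidf_elems = c[2],
		.rxf0_elems = c[3], .rxf1_elems = c[4], .rxb_elems = c[5],
		.txe_elems = c[6], .txb_elems = c[7],
	};
	return cfg;
}

/* Assumption: one echo slot per TX buffer, indices 0..txb_elems-1. */
bool echo_index_possible(const struct mram_cfg *cfg, uint32_t idx)
{
	return idx < cfg->txb_elems;
}
```

Under that assumption, <0x0 0 0 16 0 0 1 1> has txb_elems = 1, so only echo
index 0 could occur and a message about echo_skb 11 would not fit.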

regards,
Marc

* Re: can: m_can: tcan4x5x m_can_isr do not handle tx if rx fails
  2022-02-22 14:45         ` Marc Kleine-Budde
@ 2022-02-22 14:54           ` Michael Anochin
  2022-02-22 15:06             ` Marc Kleine-Budde
  2022-02-22 15:11           ` Michael Anochin
  1 sibling, 1 reply; 19+ messages in thread
From: Michael Anochin @ 2022-02-22 14:54 UTC (permalink / raw)
  To: Marc Kleine-Budde; +Cc: linux-can

> 
> This should not happen. Especially with the tcan driver. In a previous
> mail you stated that you are using the following mram config:
> 
> | bosch,mram-cfg =  <0x0 0 0 16 0 0 1 1>;
> 
> is this still the case? This is inconsistent with the above error
> message.

I have tried many bosch,mram-cfg settings. It makes almost no difference:

bosch,mram-cfg =  <0x0 0 0 16 0 0 1 1> is from Mainstream
bosch,mram-cfg =  <0x0 0 0 10 0 0 16 16> is from Mainstream
bosch,mram-cfg =  <0x0 0 0 16 0 0 8 8> is from Mainstream

I noticed that RXFIFO_1 is not used, only RXFIFO_0. For
TXFIFO/TXEFIFO, maybe only one element is used by the driver. I am not sure.

* Re: can: m_can: tcan4x5x m_can_isr do not handle tx if rx fails
  2022-02-22 14:54           ` Michael Anochin
@ 2022-02-22 15:06             ` Marc Kleine-Budde
  2022-02-22 15:40               ` Michael Anochin
  0 siblings, 1 reply; 19+ messages in thread
From: Marc Kleine-Budde @ 2022-02-22 15:06 UTC (permalink / raw)
  To: Michael Anochin; +Cc: linux-can

On 22.02.2022 15:54:32, Michael Anochin wrote:
> > 
> > This should not happen. Especially with the tcan driver. In a previous
> > mail you stated that you are using the following mram config:
> > 
> > | bosch,mram-cfg =  <0x0 0 0 16 0 0 1 1>;
> > 
> > is this still the case? This is inconsistent with the above error
> > message.

My question is, which mram-cfg were you using when the above error
message hit.

> I have tried many bosch,mram-cfg. This makes almost no difference
> 
> bosch,mram-cfg =  <0x0 0 0 16 0 0 1 1> is from Mainstream
> bosch,mram-cfg =  <0x0 0 0 10 0 0 16 16> is from Mainstream
> bosch,mram-cfg =  <0x0 0 0 16 0 0 8 8> is from Mainstream

What is Mainstream?

> I recognized, that no RXFIFO_1 is used, only RXFIFO_0.  On TXFIFO/TXEFIFO
> may be only one element is used by driver. I am not sure.

ACK, as documented in the DT bindings:

| bosch,mram-cfg = <0x0 0 0 16 0 0 1 1>;

https://elixir.bootlin.com/linux/latest/source/Documentation/devicetree/bindings/net/can/tcan4x5x.txt#L34

regards,
Marc

* Re: can: m_can: tcan4x5x m_can_isr do not handle tx if rx fails
  2022-02-22 14:45         ` Marc Kleine-Budde
  2022-02-22 14:54           ` Michael Anochin
@ 2022-02-22 15:11           ` Michael Anochin
  2022-02-22 15:32             ` Marc Kleine-Budde
  1 sibling, 1 reply; 19+ messages in thread
From: Michael Anochin @ 2022-02-22 15:11 UTC (permalink / raw)
  To: Marc Kleine-Budde; +Cc: linux-can

With netdev_warn() in m_can_tx_handler I found that, before
"BUG! echo_skb N" appears, m_can_next_echo_skb_occupied(dev, putidx)
is true with putidx=N-1:

[11676.933800] tcan4x5x spi4.0 can1: m_can_tx_handler m_can_tx_fifo_full or m_can_next_echo_skb_occupied, putidx=12
[11676.934735] tcan4x5x spi4.0 can1: m_can_start_xmit: enter
[11676.934744] tcan4x5x spi4.0 can1: m_can_start_xmit netif_stop_queue done
[11676.934911] tcan4x5x spi4.0 can1: can_put_echo_skb: BUG! echo_skb 13 is occupied!

* Re: can: m_can: tcan4x5x m_can_isr do not handle tx if rx fails
  2022-02-22 15:11           ` Michael Anochin
@ 2022-02-22 15:32             ` Marc Kleine-Budde
  0 siblings, 0 replies; 19+ messages in thread
From: Marc Kleine-Budde @ 2022-02-22 15:32 UTC (permalink / raw)
  To: Michael Anochin; +Cc: linux-can

On 22.02.2022 16:11:52, Michael Anochin wrote:
> With netdev_warn() in m_can_tx_handler I found,
> that before "BUG! echo_skb N" appears,
> 
> m_can_next_echo_skb_occupied(dev, putidx) is true with putidx=N-1
> 
> [11676.933800] tcan4x5x spi4.0 can1: m_can_tx_handler m_can_tx_fifo_full or
> m_can_next_echo_skb_occupied, putidx=12
> 
> [11676.934735] tcan4x5x spi4.0 can1: m_can_start_xmit: enter
> 
> [11676.934744] tcan4x5x spi4.0 can1: m_can_start_xmit netif_stop_queue done
> 
> [11676.934911] tcan4x5x spi4.0 can1: can_put_echo_skb: BUG! echo_skb 13 is
> occupied!

The tcan driver is probably not tested with more than 1 TX element.
Please use the following mram config for now:

| bosch,mram-cfg = <0x0 0 0 16 0 0 1 1>;

Marc

* Re: can: m_can: tcan4x5x m_can_isr do not handle tx if rx fails
  2022-02-22 15:06             ` Marc Kleine-Budde
@ 2022-02-22 15:40               ` Michael Anochin
  2022-02-22 15:43                 ` Marc Kleine-Budde
  0 siblings, 1 reply; 19+ messages in thread
From: Michael Anochin @ 2022-02-22 15:40 UTC (permalink / raw)
  To: Marc Kleine-Budde; +Cc: linux-can

The "BUG! echo_skb" message occurred with mram-cfg=<0x0 0 0 10 0 0 16 16>.
Sorry for the copy-paste error.

I have now changed to <0x0 0 0 16 0 0 1 1>. Thank you for the information.

* Re: can: m_can: tcan4x5x m_can_isr do not handle tx if rx fails
  2022-02-22 15:40               ` Michael Anochin
@ 2022-02-22 15:43                 ` Marc Kleine-Budde
  2022-02-22 15:48                   ` Michael Anochin
  0 siblings, 1 reply; 19+ messages in thread
From: Marc Kleine-Budde @ 2022-02-22 15:43 UTC (permalink / raw)
  To: Michael Anochin; +Cc: linux-can

On 22.02.2022 16:40:55, Michael Anochin wrote:
> The "BUG! echo_skb " Message was with mram-cfg=<0x0 0 0 10 0 0 16 16>.
> Sorry for copypaste error.
> 
> I changed now to <0x0 0 0 16 0 0 1 1>. Thank you for information.

Keep us informed if that helps.

Marc

* Re: can: m_can: tcan4x5x m_can_isr do not handle tx if rx fails
  2022-02-22 15:43                 ` Marc Kleine-Budde
@ 2022-02-22 15:48                   ` Michael Anochin
  2022-02-22 15:51                     ` Marc Kleine-Budde
  0 siblings, 1 reply; 19+ messages in thread
From: Michael Anochin @ 2022-02-22 15:48 UTC (permalink / raw)
  To: Marc Kleine-Budde; +Cc: linux-can

> Keep us informed if that helps.
No, this does not help. <0x0 0 0 16 0 0 1 1> was my starting point.
I will continue digging with debug prints.

* Re: can: m_can: tcan4x5x m_can_isr do not handle tx if rx fails
  2022-02-22 15:48                   ` Michael Anochin
@ 2022-02-22 15:51                     ` Marc Kleine-Budde
  2022-02-22 16:02                       ` Michael Anochin
  2022-02-22 16:20                       ` Michael Anochin
  0 siblings, 2 replies; 19+ messages in thread
From: Marc Kleine-Budde @ 2022-02-22 15:51 UTC (permalink / raw)
  To: Michael Anochin; +Cc: linux-can

On 22.02.2022 16:48:35, Michael Anochin wrote:
> > Keep us informed if that helps.
> No, this does not help. It was my start-point with <0x0 0 0 16 0 0 1 1>
> I continue to dive in with debug-printing.

Still any error messages?

Marc

* Re: can: m_can: tcan4x5x m_can_isr do not handle tx if rx fails
  2022-02-22 15:51                     ` Marc Kleine-Budde
@ 2022-02-22 16:02                       ` Michael Anochin
  2022-02-22 16:09                         ` Marc Kleine-Budde
  2022-02-22 16:20                       ` Michael Anochin
  1 sibling, 1 reply; 19+ messages in thread
From: Michael Anochin @ 2022-02-22 16:02 UTC (permalink / raw)
  To: Marc Kleine-Budde; +Cc: linux-can

> Still any error messages?
No BUG message with <0x0 0 0 16 0 0 1 1>. At least that is positive.

But there are no other messages in the kernel log. netif_wake_queue simply
never fires. After that, no TX is possible, but RX still works.

See the log:
https://pastebin.com/ZJKqTVvs

* Re: can: m_can: tcan4x5x m_can_isr do not handle tx if rx fails
  2022-02-22 16:02                       ` Michael Anochin
@ 2022-02-22 16:09                         ` Marc Kleine-Budde
  2022-02-22 16:41                           ` Michael Anochin
  2022-02-22 16:58                           ` Michael Anochin
  0 siblings, 2 replies; 19+ messages in thread
From: Marc Kleine-Budde @ 2022-02-22 16:09 UTC (permalink / raw)
  To: Michael Anochin; +Cc: linux-can

On 22.02.2022 17:02:14, Michael Anochin wrote:
> > Still any error messages?
> No BUG-Message with  <0x0 0 0 16 0 0 1 1>. At least that is positive.
> 
> But no other Messages in in kbuf. Simply no netif_wake_queue fires. After
> that no TX possible. But RX is working.

This is good. \o/

| [  763.651277] tcan4x5x spi6.0 can2: m_can_isr: netif_wake_queue done
| [  763.651295] tcan4x5x spi6.0 can2: m_can_start_xmit netif_stop_queue done
| [  763.651462] tcan4x5x spi6.0 can2: m_can_tx_handler m_can_tx_fifo_full
| [  763.652163] tcan4x5x spi6.0 can2: m_can_isr: netif_wake_queue done
| [  763.652182] tcan4x5x spi6.0 can2: m_can_start_xmit netif_stop_queue done
| [  763.652352] tcan4x5x spi6.0 can2: m_can_tx_handler m_can_tx_fifo_full

You're missing a "netif_wake_queue done" here. There's probably an
interrupt associated with this event. Add a print if that IRQ is active
right after reading the IRQ status register.
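
Spotting the missing wake-up in a long dmesg capture can be automated.
The following is a minimal sketch, assuming the debug strings are exactly
those shown in the log above; find_unmatched_stop is a hypothetical helper
name, not part of any kernel or can-utils API.

```c
#include <stddef.h>
#include <string.h>

/* Scan log lines in order. Returns the index of the last line that
 * contains "netif_stop_queue done" with no "netif_wake_queue done"
 * after it, or -1 if every stop was followed by a wake. */
int find_unmatched_stop(const char *const lines[], size_t n)
{
	int last_stop = -1;

	for (size_t i = 0; i < n; i++) {
		if (strstr(lines[i], "netif_stop_queue done"))
			last_stop = (int)i;
		else if (strstr(lines[i], "netif_wake_queue done"))
			last_stop = -1;
	}
	return last_stop;
}
```

Applied to the six lines quoted above, the function would point at the
second "netif_stop_queue done" line, after which no wake-up is logged.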

Marc

* Re: can: m_can: tcan4x5x m_can_isr do not handle tx if rx fails
  2022-02-22 15:51                     ` Marc Kleine-Budde
  2022-02-22 16:02                       ` Michael Anochin
@ 2022-02-22 16:20                       ` Michael Anochin
  1 sibling, 0 replies; 19+ messages in thread
From: Michael Anochin @ 2022-02-22 16:20 UTC (permalink / raw)
  To: Marc Kleine-Budde; +Cc: linux-can

>> 
>> Still any error messages?
>> 

I can reproduce this issue relatively easily. After the socket is opened,
I write a group of six CAN FD frames (len=64) to the socket every 10 ms.
After 1-2 minutes TX stops and the latch-up appears.
Bitrates are 500000/1000000.
In the latch-up condition, a write to the socket returns errno 11 (EAGAIN)
or errno 105 (ENOBUFS) permanently until ifdown.

Here is my can-status

9: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 72 qdisc pfifo_fast state UP mode DEFAULT group default qlen 2000
     link/can  promiscuity 0 minmtu 0 maxmtu 0
     can <BERR-REPORTING,FD> state ERROR-ACTIVE (berr-counter tx 0 rx 0) restart-ms 0
           bitrate 500000 sample-point 0.800
           tq 50 prop-seg 0 phase-seg1 31 phase-seg2 8 sjw 8
           m_can: tseg1 2..256 tseg2 2..128 sjw 1..128 brp 1..512 brp-inc 1
           dbitrate 1000000 dsample-point 0.700
           dtq 50 dprop-seg 0 dphase-seg1 13 dphase-seg2 6 dsjw 6
           m_can: dtseg1 1..32 dtseg2 1..16 dsjw 1..16 dbrp 1..32 dbrp-inc 1
           clock 40000000numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
10: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 72 qdisc pfifo_fast state UP mode DEFAULT group default qlen 2000
     link/can  promiscuity 0 minmtu 0 maxmtu 0
     can <BERR-REPORTING,FD> state ERROR-ACTIVE (berr-counter tx 0 rx 0) restart-ms 0
           bitrate 500000 sample-point 0.800
           tq 50 prop-seg 0 phase-seg1 31 phase-seg2 8 sjw 8
           m_can: tseg1 2..256 tseg2 2..128 sjw 1..128 brp 1..512 brp-inc 1
           dbitrate 1000000 dsample-point 0.700
           dtq 50 dprop-seg 0 dphase-seg1 13 dphase-seg2 6 dsjw 6
           m_can: dtseg1 1..32 dtseg2 1..16 dsjw 1..16 dbrp 1..32 dbrp-inc 1
           clock 40000000numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
11: can2: <NOARP,UP,LOWER_UP,ECHO> mtu 72 qdisc pfifo_fast state UP mode DEFAULT group default qlen 2000
     link/can  promiscuity 0 minmtu 0 maxmtu 0
     can <BERR-REPORTING,FD> state ERROR-ACTIVE (berr-counter tx 0 rx 0) restart-ms 0
           bitrate 500000 sample-point 0.800
           tq 50 prop-seg 0 phase-seg1 31 phase-seg2 8 sjw 8
           m_can: tseg1 2..256 tseg2 2..128 sjw 1..128 brp 1..512 brp-inc 1
           dbitrate 1000000 dsample-point 0.700
           dtq 50 dprop-seg 0 dphase-seg1 13 dphase-seg2 6 dsjw 6
           m_can: dtseg1 1..32 dtseg2 1..16 dsjw 1..16 dbrp 1..32 dbrp-inc 1
           clock 40000000numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535

* Re: can: m_can: tcan4x5x m_can_isr do not handle tx if rx fails
  2022-02-22 16:09                         ` Marc Kleine-Budde
@ 2022-02-22 16:41                           ` Michael Anochin
  2022-02-22 20:10                             ` Marc Kleine-Budde
  2022-02-22 16:58                           ` Michael Anochin
  1 sibling, 1 reply; 19+ messages in thread
From: Michael Anochin @ 2022-02-22 16:41 UTC (permalink / raw)
  To: Marc Kleine-Budde; +Cc: linux-can


> You're missing a "netif_wake_queue done" here. There's probably an
> interrupt associated with this event. Add a print if that IRQ is active
> right after reading the IRQ status register

Done. The print "m_can_isr: ir=" is only on entry to m_can_isr, not on exit.

If there is RX traffic on the bus, the latch-up happens very quickly.

Last dmesg lines from https://pastebin.com/yBv9xcWg:

[  396.390714] tcan4x5x spi6.0 can2: m_can_tx_handler m_can_tx_fifo_full
[  396.390955] tcan4x5x spi6.0 can2: m_can_isr: ir=0x5800
[  396.391091] tcan4x5x spi6.0 can2: m_can_isr: netif_wake_queue done
[  396.391109] tcan4x5x spi6.0 can2: m_can_start_xmit netif_stop_queue done
[  396.391282] tcan4x5x spi6.0 can2: m_can_tx_handler m_can_tx_fifo_full
[  396.391534] tcan4x5x spi6.0 can2: m_can_isr: ir=0x5800
[  396.391670] tcan4x5x spi6.0 can2: m_can_isr: netif_wake_queue done
[  396.391689] tcan4x5x spi6.0 can2: m_can_start_xmit netif_stop_queue done
[  396.391865] tcan4x5x spi6.0 can2: m_can_tx_handler m_can_tx_fifo_full
[  396.392134] tcan4x5x spi6.0 can2: m_can_isr: ir=0x1
[  396.392537] tcan4x5x spi6.0 can2: m_can_isr: ir=0x5800
[  396.392673] tcan4x5x spi6.0 can2: m_can_isr: netif_wake_queue done
[  396.392692] tcan4x5x spi6.0 can2: m_can_start_xmit netif_stop_queue done
[  396.392862] tcan4x5x spi6.0 can2: m_can_tx_handler m_can_tx_fifo_full



* Re: can: m_can: tcan4x5x m_can_isr do not handle tx if rx fails
  2022-02-22 16:09                         ` Marc Kleine-Budde
  2022-02-22 16:41                           ` Michael Anochin
@ 2022-02-22 16:58                           ` Michael Anochin
  1 sibling, 0 replies; 19+ messages in thread
From: Michael Anochin @ 2022-02-22 16:58 UTC (permalink / raw)
  To: Marc Kleine-Budde; +Cc: linux-can


The IR value is almost always 0x5800 for TX and 0x01 for RX.


ir=0x01 means RF0N set (Rx FIFO 0 New Message)
ir=0x5800 means TEFF|TEFN|TFE
(Tx Event FIFO Full, Tx Event FIFO New Entry, Tx FIFO Empty)
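
The decoding above can be reproduced with a few lines of C. This is a
sketch covering only the four flags named here; the bit positions
(RF0N = bit 0, TFE = bit 11, TEFN = bit 12, TEFF = bit 14) are taken to
match the values 0x01 and 0x5800 seen in the log, and ir_decode is a
hypothetical helper name.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define IR_RF0N (1u << 0)  /* Rx FIFO 0 New Message */
#define IR_TFE  (1u << 11) /* Tx FIFO Empty */
#define IR_TEFN (1u << 12) /* Tx Event FIFO New Entry */
#define IR_TEFF (1u << 14) /* Tx Event FIFO Full */

/* Write a '|'-separated list of the known flags set in ir into buf.
 * Unknown bits are ignored. Returns buf. */
const char *ir_decode(uint32_t ir, char *buf, size_t len)
{
	static const struct {
		uint32_t bit;
		const char *name;
	} flags[] = {
		{ IR_TEFF, "TEFF" }, { IR_TEFN, "TEFN" },
		{ IR_TFE, "TFE" }, { IR_RF0N, "RF0N" },
	};
	size_t used = 0;

	if (len == 0)
		return buf;
	buf[0] = '\0';
	for (size_t i = 0; i < sizeof(flags) / sizeof(flags[0]); i++) {
		int n;

		if (!(ir & flags[i].bit))
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
			     used ? "|" : "", flags[i].name);
		if (n < 0 || (size_t)n >= len - used)
			break; /* output truncated, stop here */
		used += (size_t)n;
	}
	return buf;
}
```

With a sufficiently large buffer, ir_decode(0x5800, ...) yields
"TEFF|TEFN|TFE" and ir_decode(0x1, ...) yields "RF0N".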

It seems that m_can_isr is called too late: I catch the FIFO full and
FIFO empty flags together.

According to the tcan4550 datasheet, the M_CAN revision is 3.2.1.1, thus > 3.0.


I am slightly confused by the discrepancy in bits > 29 between m_can.c and
mcan_users_manual_v330.pdf, section 2.3.16 Interrupt Register (IR), page
20. But that is not my focus.


* Re: can: m_can: tcan4x5x m_can_isr do not handle tx if rx fails
  2022-02-22 16:41                           ` Michael Anochin
@ 2022-02-22 20:10                             ` Marc Kleine-Budde
  2022-02-23  8:55                               ` Michael Anochin
  0 siblings, 1 reply; 19+ messages in thread
From: Marc Kleine-Budde @ 2022-02-22 20:10 UTC (permalink / raw)
  To: Michael Anochin; +Cc: linux-can

On 22.02.2022 17:41:54, Michael Anochin wrote:
> 
> > You're missing a "netif_wake_queue done" here. There's probably an
> > interrupt associated with this event. Add a print if that IRQ is active
> > right after reading the IRQ status register
> 
> Done, only on enter in m_can_isr "m_can_isr: ir=", not exit.
> 
> If there are a RX traffic on but, the latchup happens very quickly.

I just looked again at your overlay. Please change the IRQ type to
IRQ_TYPE_LEVEL_LOW. With edge falling you'll miss interrupts sooner or
later.

regards,
Marc

BTW: it's documented as level low in the bindings documentation:

|		interrupts = <14 IRQ_TYPE_LEVEL_LOW>;
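
Why a falling-edge trigger can lose interrupts may be easier to see in a
toy simulation. The sketch below is a simplified model, not the behaviour
of any specific interrupt controller: it assumes the device holds the line
low while any event bit is pending, and that a new event can arrive after
the handler has read the status but before it has acknowledged it. All
names are hypothetical.

```c
#include <stdbool.h>

/* Toy device: the IRQ line is low while any event bit is pending. */
struct sim {
	unsigned pending;
};

static bool line_low(const struct sim *s)
{
	return s->pending != 0;
}

/* The handler services a snapshot of the status. `late` models an
 * event that arrives after the status read but before the ack. */
static void handler(struct sim *s, unsigned late, unsigned *serviced)
{
	unsigned snap = s->pending;

	s->pending |= late;  /* new event arrives; line stays low */
	s->pending &= ~snap; /* acknowledge only what was read */
	*serviced |= snap;
}

/* Returns the set of event bits that were serviced in total. */
unsigned run(bool level_triggered, unsigned first, unsigned late)
{
	struct sim s = { 0 };
	unsigned serviced = 0;
	bool prev_low = line_low(&s);
	bool fire;

	s.pending = first;                /* line goes high -> low */
	fire = line_low(&s) && !prev_low; /* both trigger types see this */
	while (fire) {
		handler(&s, late, &serviced);
		late = 0;
		/* Level: run again while the line is still low.
		 * Edge: the line never went high, so no new edge. */
		fire = level_triggered && line_low(&s);
	}
	return serviced;
}
```

In this model, run(false, 0x1000, 0x1) services only 0x1000 and the late
event is never handled, while run(true, 0x1000, 0x1) services both.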

* Re: can: m_can: tcan4x5x m_can_isr do not handle tx if rx fails
  2022-02-22 20:10                             ` Marc Kleine-Budde
@ 2022-02-23  8:55                               ` Michael Anochin
  0 siblings, 0 replies; 19+ messages in thread
From: Michael Anochin @ 2022-02-23  8:55 UTC (permalink / raw)
  To: Marc Kleine-Budde; +Cc: linux-can

> 
> BTW: it's documented as level low in the bindings documentation:
> 
> |		interrupts = <14 IRQ_TYPE_LEVEL_LOW>;
> 


Thank you very much. Unfortunately, that was a false hope. I changed the
device tree to use IRQ_TYPE_LEVEL_LOW (= 8):

interrupts = <25 8>;
interrupts = <27 8>;
interrupts = <16 8>;
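
For reference, the numeric second cell maps to the trigger types defined
in dt-bindings/interrupt-controller/irq.h. The lookup below is a small
illustrative helper (irq_type_name is a hypothetical name) for checking a
raw value such as 8 or 2 when reading an overlay.

```c
/* Values as in dt-bindings/interrupt-controller/irq.h. */
const char *irq_type_name(unsigned int flags)
{
	switch (flags) {
	case 0:
		return "IRQ_TYPE_NONE";
	case 1:
		return "IRQ_TYPE_EDGE_RISING";
	case 2:
		return "IRQ_TYPE_EDGE_FALLING";
	case 3:
		return "IRQ_TYPE_EDGE_BOTH";
	case 4:
		return "IRQ_TYPE_LEVEL_HIGH";
	case 8:
		return "IRQ_TYPE_LEVEL_LOW";
	default:
		return "unknown";
	}
}
```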

But without success: the failure still occurs, though maybe not as quickly
as with IRQ_TYPE_EDGE_FALLING (subjective). Interestingly, the first run
after reboot may last longer than subsequent ones before it breaks. When it
breaks, the affected interface is random: can0, can1 or can2.


Here is an example where can1 breaks. No more ISR fires after [ 1682.748485].
I added netdev_dbg before return, and netdev_err for out_fail.


[ 1682.747310] tcan4x5x spi4.0 can1: m_can_isr: enter ir=0x5800
[ 1682.747468] tcan4x5x spi4.0 can1: m_can_isr: netif_wake_queue done
[ 1682.747475] tcan4x5x spi4.0 can1: m_can_isr: return IRQ_HANDLED

[ 1682.747494] tcan4x5x spi4.0 can1: m_can_start_xmit netif_stop_queue done

//Last TX ISR (IR_TEFN was true)
[ 1682.747912] tcan4x5x spi4.0 can1: m_can_isr: enter ir=0x5800
//FIFO not full and queue is stopped -> wake queue
[ 1682.748053] tcan4x5x spi4.0 can1: m_can_isr: netif_wake_queue done
[ 1682.748061] tcan4x5x spi4.0 can1: m_can_isr: return IRQ_HANDLED

//Last RX-ISR
[ 1682.748199] tcan4x5x spi4.0 can1: m_can_isr: enter ir=0x1
done
[ 1682.748433] tcan4x5x spi4.0 can1: m_can_isr: return IRQ_HANDLED

//At the end of m_can_tx_handler, after the FIFO write: m_can_tx_fifo_full -> netif_stop_queue(dev);
[ 1682.748485] tcan4x5x spi4.0 can1: m_can_tx_handler m_can_tx_fifo_full

After that I expect m_can_isr with the IR_TEFN flag in order to wake the
queue, but nothing follows. Writes to the socket permanently return 105 (ENOBUFS).

Full dmesg output: https://pastebin.com/G0xikf3P

Maybe I should add some printk to skb.c and dev.c?
