On 10.05.2021 15:23:02, Torin Cooper-Bennun wrote: > Scenario: I open a raw CAN socket, queue a bunch of frames for TX, then > close the socket as soon as possible. For the duration of the test, I > have another socket open listening for all frames (candump). > > After the sending socket has been closed, and there are still frames in > the queue yet to be transmitted, I find candump reporting the remainder > of my sent frames as RX rather than TX. > > For example, I send 1,000 8-byte classical CAN frames, immediately close > the socket and log the time at which I did so. Can you provide the program to reproduce the issue? Have you increased the CAN interface's txqueuelen? > My application reports the socket closed: > > | Socket closed at 15:02:45.987278 > > My candump log shows: > > | (2021-05-10 15:02:45.327724) can0 TX - - 000 [08] EE EE EE EE EE EE EE EE > | (2021-05-10 15:02:45.329578) can0 TX - - 001 [08] EE EE EE EE EE EE EE EE > | (2021-05-10 15:02:45.330493) can0 TX - - 002 [08] EE EE EE EE EE EE EE EE > | (2021-05-10 15:02:45.331341) can0 TX - - 003 [08] EE EE EE EE EE EE EE EE > | (2021-05-10 15:02:45.332264) can0 TX - - 004 [08] EE EE EE EE EE EE EE EE > | (2021-05-10 15:02:45.333148) can0 TX - - 005 [08] EE EE EE EE EE EE EE EE > | (2021-05-10 15:02:45.334115) can0 TX - - 006 [08] EE EE EE EE EE EE EE EE > | (2021-05-10 15:02:45.335061) can0 TX - - 007 [08] EE EE EE EE EE EE EE EE > | (2021-05-10 15:02:45.336021) can0 TX - - 008 [08] EE EE EE EE EE EE EE EE > | (2021-05-10 15:02:45.336951) can0 TX - - 009 [08] EE EE EE EE EE EE EE EE > | > | .... snip .... > | > | (2021-05-10 15:02:46.089177) can0 TX - - 399 [08] EE EE EE EE EE EE EE EE > | (2021-05-10 15:02:46.090001) can0 TX - - 39A [08] EE EE EE EE EE EE EE EE > | (2021-05-10 15:02:46.090852) can0 TX - - 39B [08] EE EE EE EE EE EE EE EE > | (2021-05-10 15:02:46.091735) can0 TX - - 39C [08] EE EE EE EE EE EE EE EE > | (2021-05-10 15:02:46.092483) can0 TX - - 39D [08] EE EE EE EE EE EE EE EE > | (2021-05-10 15:02:46.093313) can0 RX - - 39E [08] EE EE EE EE EE EE EE EE <----- !!!!! > | (2021-05-10 15:02:46.094091) can0 RX - - 39F [08] EE EE EE EE EE EE EE EE > | (2021-05-10 15:02:46.094931) can0 RX - - 3A0 [08] EE EE EE EE EE EE EE EE > | (2021-05-10 15:02:46.095774) can0 RX - - 3A1 [08] EE EE EE EE EE EE EE EE > | (2021-05-10 15:02:46.096513) can0 RX - - 3A2 [08] EE EE EE EE EE EE EE EE > | > | .... snip .... > | > | (2021-05-10 15:02:46.143287) can0 RX - - 3DE [08] EE EE EE EE EE EE EE EE > | (2021-05-10 15:02:46.144046) can0 RX - - 3DF [08] EE EE EE EE EE EE EE EE > | (2021-05-10 15:02:46.144808) can0 RX - - 3E0 [08] EE EE EE EE EE EE EE EE > | (2021-05-10 15:02:46.145570) can0 RX - - 3E1 [08] EE EE EE EE EE EE EE EE > | (2021-05-10 15:02:46.146357) can0 RX - - 3E2 [08] EE EE EE EE EE EE EE EE > | (2021-05-10 15:02:46.147117) can0 RX - - 3E3 [08] EE EE EE EE EE EE EE EE > | (2021-05-10 15:02:46.147876) can0 RX - - 3E4 [08] EE EE EE EE EE EE EE EE > | (2021-05-10 15:02:46.148635) can0 RX - - 3E5 [08] EE EE EE EE EE EE EE EE > | (2021-05-10 15:02:46.149395) can0 RX - - 3E6 [08] EE EE EE EE EE EE EE EE > | (2021-05-10 15:02:46.150161) can0 RX - - 3E7 [08] EE EE EE EE EE EE EE EE > > Why? > > candump.c prints 'RX' if the received frame has no MSG_DONTROUTE flag. > > | if (msg.msg_flags & MSG_DONTROUTE) > | printf (" TX %s", extra_m_info[frame.flags & 3]); > | else > | printf (" RX %s", extra_m_info[frame.flags & 3]); > > In turn, MSG_DONTROUTE is set in net/can/raw.c: raw_rcv(): > > | /* add CAN specific message flags for raw_recvmsg() */ > | pflags = raw_flags(skb); > | *pflags = 0; > | if (oskb->sk) > | *pflags |= MSG_DONTROUTE; > | if (oskb->sk == sk) > | *pflags |= MSG_CONFIRM; > > So, I'm guessing, some 100 ms after my application begins to request > that the socket be closed, the socket's pointer becomes NULL in further > TX skbs in the queue, so the raw CAN layer can no longer differentiate > these skbs as TX. (Sorry if my pathways are a bit mixed up.) > > Seems broken to me - is this known behaviour? Looks like a unknown bug to me! > Test setup: > - kernel: v5.13-rc1 with some RPi-specific patches > - hardware: RPi CM4 with TCAN4550 (so, m_can driver in peripheral mode) regards, Marc -- Pengutronix e.K. | Marc Kleine-Budde | Embedded Linux | https://www.pengutronix.de | Vertretung West/Dortmund | Phone: +49-231-2826-924 | Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |