linux-can.vger.kernel.org archive mirror
* can: isotp: epoll breaks isotp_sendmsg
@ 2023-06-30 16:44 Maxime Jayat
  2023-06-30 22:35 ` Michal Sojka
  0 siblings, 1 reply; 9+ messages in thread
From: Maxime Jayat @ 2023-06-30 16:44 UTC (permalink / raw)
  To: Oliver Hartkopp, Marc Kleine-Budde, Michal Sojka
  Cc: linux-can, netdev, linux-kernel, Dae R. Jeong, Hillf Danton

Hi,

There is something not clear happening with the non-blocking behavior
of ISO-TP sockets in the TX path, but more importantly, using epoll now
completely breaks isotp_sendmsg.
I believe it is related to
79e19fa79c ("can: isotp: isotp_ops: fix poll() to not report false
EPOLLOUT events"), but the problem probably runs deeper than that.

I don't completely understand what is exactly going on, so I am sharing
the problem I face:

With an ISO-TP socket in non-blocking mode, using epoll seems to make
isotp_sendmsg always return -EAGAIN.

I have a non-blocking socket + epoll version of can-utils isotpsend
available for testing at
https://gist.github.com/MJayat/4857da43ab154e4ba644d2446b5fa46d
With this version I do the following test:

isotprecv -l -m 0x80 -s 7e8 -d 7e0 vcan0 &
echo "01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f" | strace ./isotpsend -l10 -s 7E0 -d 7E8 vcan0

I get:
...
15:37:02.456849 epoll_ctl(4, EPOLL_CTL_ADD, 3, {events=EPOLLIN|EPOLLOUT|EPOLLRDHUP, data={u32=0, u64=0}}) = 0 <0.000249>
15:37:02.457839 epoll_wait(4, [{events=EPOLLOUT, data={u32=0, u64=0}}], 1, 2000) = 1 <0.000200>
15:37:02.458838 write(3, "\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f", 15) = -1 EAGAIN (Resource temporarily unavailable) <0.000278>
15:37:02.459908 epoll_wait(4, [{events=EPOLLOUT, data={u32=0, u64=0}}], 1, 2000) = 1 <0.000250>
15:37:02.460879 write(3, "\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f", 15) = -1 EAGAIN (Resource temporarily unavailable) <0.000272>
15:37:02.461831 epoll_wait(4, [{events=EPOLLOUT, data={u32=0, u64=0}}], 1, 2000) = 1 <0.000199>
...
It is impossible to write to the socket, and the program busy-loops
between epoll_wait and write.

With a change to epoll_ctl flags, now including EPOLLET, I get:
...
15:36:22.443689 epoll_ctl(4, EPOLL_CTL_ADD, 3, {events=EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, data={u32=0, u64=0}}) = 0 <0.000171>
15:36:22.444514 epoll_wait(4, [{events=EPOLLOUT, data={u32=0, u64=0}}], 1, 2000) = 1 <0.000188>
15:36:22.445413 write(3, "\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f", 15) = -1 EAGAIN (Resource temporarily unavailable) <0.000175>
15:36:22.446335 epoll_wait(4, [], 1, 2000) = 0 <2.026006>
...
epoll_wait now blocks for its full 2-second timeout and returns no
events; EPOLLOUT is never reported again.

By reverting 79e19fa79c, I get better results but still incorrect:
...
15:41:43.890880 epoll_ctl(4, EPOLL_CTL_ADD, 3, {events=EPOLLIN|EPOLLOUT|EPOLLRDHUP, data={u32=0, u64=0}}) = 0 <0.000200>
15:41:43.892011 epoll_wait(4, [{events=EPOLLOUT, data={u32=0, u64=0}}], 1, 2000) = 1 <0.000408>
15:41:43.893148 write(3, "\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f", 15) = 15 <0.000458>
15:41:43.894405 epoll_wait(4, [{events=EPOLLOUT, data={u32=0, u64=0}}], 1, 2000) = 1 <0.000257>
15:41:43.895385 write(3, "\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f", 15) = 15 <0.128429>
15:41:44.028757 epoll_wait(4, [{events=EPOLLOUT, data={u32=0, u64=0}}], 1, 2000) = 1 <0.001886>
15:41:44.040858 write(3, "\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f", 15) = 15 <0.108069>
...
It is then possible to write on the socket but the write is blocking,
which is not the expected behavior for a non-blocking socket.

I don't know how to solve the problem. To me, using wq_has_sleeper seems
weird. The implementation of isotp_poll feels weird too (calling both
datagram_poll and poll_wait?). But I am not sure what would be the
correct implementation.

Note that this behavior is present on all current linux-stable branches.
I am currently testing on v6.1.36 but I know it fails on v6.3 too.

My actual use-case is in Async Rust using tokio. Not using epoll is not an
option, and a non-blocking socket that sometimes blocks can be terrible for
the performance of the whole application. Our embedded target runs with the
out-of-tree module on an older Linux version, but my colleagues with
up-to-date linux-stable kernels have recently been prevented from running
tests on their PCs.

Is there someone with a good idea of how to proceed to fix this?
I'll probably keep digging but I don't think I can spend much more time on
this, so any help is appreciated.

Thanks,
Maxime



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: can: isotp: epoll breaks isotp_sendmsg
  2023-06-30 16:44 can: isotp: epoll breaks isotp_sendmsg Maxime Jayat
@ 2023-06-30 22:35 ` Michal Sojka
  2023-07-17  7:49   ` Marc Kleine-Budde
  2023-08-13 11:23   ` Lukas Magel
  0 siblings, 2 replies; 9+ messages in thread
From: Michal Sojka @ 2023-06-30 22:35 UTC (permalink / raw)
  To: Maxime Jayat, Oliver Hartkopp, Marc Kleine-Budde
  Cc: linux-can, netdev, linux-kernel, Dae R. Jeong, Hillf Danton

Hi Maxime,

On Fri, Jun 30 2023, Maxime Jayat wrote:
> Hi,
>
> There is something not clear happening with the non-blocking behavior
> of ISO-TP sockets in the TX path, but more importantly, using epoll now
> completely breaks isotp_sendmsg.
> I believe it is related to
> 79e19fa79c ("can: isotp: isotp_ops: fix poll() to not report false 
> EPOLLOUT events"),
> but actually is probably deeper than that.
>
> I don't completely understand what is exactly going on, so I am sharing
> the problem I face:
>
> With an ISO-TP socket in non-blocking mode, using epoll seems to make
> isotp_sendmsg always return -EAGAIN.

That's definitely not expected behavior. I tested the patch only with
poll, hoping that epoll would behave the same.

[...]

>
> By reverting 79e19fa79c, I get better results but still incorrect:

[...]

> It is then possible to write on the socket but the write is blocking,
> which is not the expected behavior for a non-blocking socket.

Yes, incorrect behavior was why we made the commit in question; what we
saw, however, was write() returning -EAGAIN when it shouldn't.

> I don't know how to solve the problem. To me, using wq_has_sleeper seems 
> weird.

Agreed. I've never tried to understand how synchronization works here.
Hopefully, Oliver knows more.

> The implementation of isotp_poll feels weird too (calling both 
> datagram_poll and
> poll_wait?). But I am not sure what would be the correct
> implementation.

I understand it as follows (which might be wrong - someone, please
correct me): isotp_poll() should register the file with all waitqueues
it can wait on. so->wait is one and sock->sq.wait (used by
datagram_poll) is another. The former is definitely used for TX, the
latter is probably used because skb_recv_datagram() is called for RX.
But so->wait is also used for RX and there might probably be some
inconsistency between those.
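For readers without the source at hand, the function being discussed is roughly the following (a paraphrase of net/can/isotp.c after 79e19fa79c, written from memory and not authoritative):

```c
/* rough paraphrase of isotp_poll() after 79e19fa79c -- not the exact source */
static __poll_t isotp_poll(struct file *file, struct socket *sock,
			   poll_table *wait)
{
	struct sock *sk = sock->sk;
	struct isotp_sock *so = isotp_sk(sk);

	/* registers on the socket's generic waitqueue, computes RX/TX bits */
	__poll_t mask = datagram_poll(file, sock, wait);

	/* additionally register on the ISO-TP waitqueue used by the TX path */
	poll_wait(file, &so->wait, wait);

	/* suppress false EPOLLOUT while a PDU transmission is in flight */
	if (so->tx.state != ISOTP_IDLE)
		mask &= ~(EPOLLOUT | EPOLLWRNORM);

	return mask;
}
```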

> My actual use-case is in Async Rust using tokio.

Our initial motivation was also Rust and tokio; back then, however, I did
testing only with simple C programs. I'm definitely interested in having
this working.

I'll try to look at this in more detail during the weekend. It's too
late for me today.

Best regards,
-Michal


* Re: can: isotp: epoll breaks isotp_sendmsg
  2023-06-30 22:35 ` Michal Sojka
@ 2023-07-17  7:49   ` Marc Kleine-Budde
  2023-07-17 13:05     ` Michal Sojka
  2023-08-13 11:23   ` Lukas Magel
  1 sibling, 1 reply; 9+ messages in thread
From: Marc Kleine-Budde @ 2023-07-17  7:49 UTC (permalink / raw)
  To: Michal Sojka
  Cc: Maxime Jayat, Oliver Hartkopp, linux-can, netdev, linux-kernel,
	Dae R. Jeong, Hillf Danton


On 01.07.2023 00:35:18, Michal Sojka wrote:
> Hi Maxime,
> 
> On Fri, Jun 30 2023, Maxime Jayat wrote:
> > Hi,
> >
> > There is something not clear happening with the non-blocking behavior
> > of ISO-TP sockets in the TX path, but more importantly, using epoll now
> > completely breaks isotp_sendmsg.
> > I believe it is related to
> > 79e19fa79c ("can: isotp: isotp_ops: fix poll() to not report false 
> > EPOLLOUT events"),
> > but actually is probably deeper than that.
> >
> > I don't completely understand what is exactly going on, so I am sharing
> > the problem I face:
> >
> > With an ISO-TP socket in non-blocking mode, using epoll seems to make
> > isotp_sendmsg always return -EAGAIN.
> 
> That's definitely not expected behavior. I tested the patch only with
> poll, hoping that epoll would behave the same.
> 
> [...]
> 
> >
> > By reverting 79e19fa79c, I get better results but still incorrect:
> 
> [...]
> 
> > It is then possible to write on the socket but the write is blocking,
> > which is not the expected behavior for a non-blocking socket.
> 
> Yes, incorrect behavior was why we made the commit in question, however
> we saw write() returning -EAGAIN when it shouldn't.
> 
> > I don't know how to solve the problem. To me, using wq_has_sleeper seems 
> > weird.
> 
> Agreed. I've never tried to understand how synchronization works here.
> Hopefully, Oliver knows more.
> 
> > The implementation of isotp_poll feels weird too (calling both 
> > datagram_poll and
> > poll_wait?). But I am not sure what would be the correct
> > implementation.
> 
> I understand it as follows (which might be wrong - someone, please
> correct me), isotp_poll() should register the file with all waitqueues
> it can wait on. so->wait is one and sock->sq.wait (used by
> datagram_poll) is another. The former is definitely used for TX, the
> latter is probably used because skb_recv_datagram() is called for RX.
> But so->wait is also used for RX and there might probably be some
> inconsistency between those.
> 
> > My actual use-case is in Async Rust using tokio.
> 
> Our initial motivation was also Rust and tokio; back then, however, I did
> testing only with simple C programs. I'm definitely interested in having
> this working.
> 
> I'll try to look at this in more detail during the weekend. It's too
> late for me today.

Any progress on this issue?

Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde          |
Embedded Linux                   | https://www.pengutronix.de |
Vertretung Nürnberg              | Phone: +49-5121-206917-129 |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-9   |



* Re: can: isotp: epoll breaks isotp_sendmsg
  2023-07-17  7:49   ` Marc Kleine-Budde
@ 2023-07-17 13:05     ` Michal Sojka
  0 siblings, 0 replies; 9+ messages in thread
From: Michal Sojka @ 2023-07-17 13:05 UTC (permalink / raw)
  To: Marc Kleine-Budde
  Cc: Maxime Jayat, Oliver Hartkopp, linux-can, netdev, linux-kernel,
	Dae R. Jeong, Hillf Danton

On Mon, Jul 17 2023, Marc Kleine-Budde wrote:
> Any progress on this issue?

I was on vacation until today. I hope to look at it soon.

-Michal


* Re: can: isotp: epoll breaks isotp_sendmsg
  2023-06-30 22:35 ` Michal Sojka
  2023-07-17  7:49   ` Marc Kleine-Budde
@ 2023-08-13 11:23   ` Lukas Magel
  2023-08-18 11:53     ` Lukas Magel
  1 sibling, 1 reply; 9+ messages in thread
From: Lukas Magel @ 2023-08-13 11:23 UTC (permalink / raw)
  To: Michal Sojka, Maxime Jayat, Oliver Hartkopp, Marc Kleine-Budde
  Cc: linux-can, netdev, linux-kernel, Dae R. Jeong, Hillf Danton

Hi Maxime, hi Michal,

On 01.07.23 00:35, Michal Sojka wrote:
> Hi Maxime,
>
> On Fri, Jun 30 2023, Maxime Jayat wrote:
>> Hi,
>>
>> There is something not clear happening with the non-blocking behavior
>> of ISO-TP sockets in the TX path, but more importantly, using epoll now
>> completely breaks isotp_sendmsg.
>> I believe it is related to
>> 79e19fa79c ("can: isotp: isotp_ops: fix poll() to not report false 
>> EPOLLOUT events"),
>> but actually is probably deeper than that.
>>
>> I don't completely understand what is exactly going on, so I am sharing
>> the problem I face:
>>
>> With an ISO-TP socket in non-blocking mode, using epoll seems to make
>> isotp_sendmsg always return -EAGAIN.
> That's definitely not expected behavior. I tested the patch only with
> poll, hoping that epoll would behave the same.
>
> [...]


I am writing to report that we have been witnessing a behavior very similar
to what you describe. ISO-TP send breaks with EAGAIN if a poll (for a read)
occurs at the same time.

Our Python stack uses two threads to do concurrent, blocking reads & writes
from and to the ISO-TP socket. The socket has a timeout of 0.1s to facilitate
shutdown when requested by the application. Notably, the blocking semantics
are handled by CPython, i.e. the underlying kernel socket is non-blocking.
CPython polls until the requested operation (read or write) can be executed
or the timeout occurs.

What happens during execution is that the socket is continuously being
polled by the read thread, i.e. so->wait is always filled with one task.
This process repeats until the socket receives a frame from the bus and the
poll returns successfully. The app reads the data from the socket and
sends a response. Since the send occurs in a different thread, the reader
thread will have already returned to its poll loop and to populating
so->wait. When the send occurs, isotp_sendmsg checks so->wait for sleepers
and returns EAGAIN because the socket is non-blocking although there is no
concurrent send operation. This dance continues until the timeout occurs for
either the read or the write operation. If the write times out first, a
timeout error causes the Python app to break. If the read times out first,
there is a race between the write going through and the reader putting in
another poll.

This behavior can be seen rather nicely in strace:
110580 poll([{fd=5, events=POLLIN}], 1, 100) = 0 (Timeout)
110580 poll([{fd=5, events=POLLIN}], 1, 100) = 0 (Timeout)
110580 poll([{fd=5, events=POLLIN}], 1, 100) = 1 ([{fd=5, revents=POLLIN}])
110580 recvfrom(5, ">\0", 4095, 0, NULL, NULL) = 2
110580 poll([{fd=5, events=POLLIN}], 1, 100 <unfinished ...>
110569 poll([{fd=5, events=POLLOUT}], 1, 100) = 1 ([{fd=5, revents=POLLOUT}])
110569 sendto(5, "~\0", 2, 0, NULL, 0)  = -1 EAGAIN (Resource temporarily unavailable)
110569 poll([{fd=5, events=POLLOUT}], 1, 100) = 1 ([{fd=5, revents=POLLOUT}])
110569 sendto(5, "~\0", 2, 0, NULL, 0)  = -1 EAGAIN (Resource temporarily unavailable)
110569 poll([{fd=5, events=POLLOUT}], 1, 100) = 1 ([{fd=5, revents=POLLOUT}])
110569 sendto(5, "~\0", 2, 0, NULL, 0)  = -1 EAGAIN (Resource temporarily unavailable)
[....]
110569 poll([{fd=5, events=POLLOUT}], 1, 6) = 1 ([{fd=5, revents=POLLOUT}])
110569 sendto(5, "~\0", 2, 0, NULL, 0)  = -1 EAGAIN (Resource temporarily unavailable)
110569 poll([{fd=5, events=POLLOUT}], 1, 6 <unfinished ...>
110580 <... poll resumed>)              = 0 (Timeout)
110569 <... poll resumed>)              = 1 ([{fd=5, revents=POLLOUT}])
110580 poll([{fd=5, events=POLLIN}], 1, 100 <unfinished ...>
110569 sendto(5, "~\0", 2, 0, NULL, 0)  = -1 EAGAIN (Resource temporarily unavailable)
110569 poll([{fd=5, events=POLLOUT}], 1, 5) = 1 ([{fd=5, revents=POLLOUT}])
110569 sendto(5, "~\0", 2, 0, NULL, 0)  = -1 EAGAIN (Resource temporarily unavailable)

I believe this is consistent with the behavior you're witnessing with epoll
because epoll also places a sleeper in so->wait that is left in the queue
until the epoll descriptor is closed.


>
>> By reverting 79e19fa79c, I get better results but still incorrect:
> [...]
>
>> It is then possible to write on the socket but the write is blocking,
>> which is not the expected behavior for a non-blocking socket.
> Yes, incorrect behavior was why we made the commit in question, however
> we saw write() returning -EAGAIN when it shouldn't.
>
>> I don't know how to solve the problem. To me, using wq_has_sleeper seems 
>> weird.
> Agreed. I've never tried to understand how synchronization works here.
> Hopefully, Oliver knows more.
>
>> The implementation of isotp_poll feels weird too (calling both 
>> datagram_poll and
>> poll_wait?). But I am not sure what would be the correct
>> implementation.
> I understand it as follows (which might be wrong - someone, please
> correct me), isotp_poll() should register the file with all waitqueues
> it can wait on. so->wait is one and sock->sq.wait (used by
> datagram_poll) is another. The former is definitely used for TX, the
> latter is probably used because skb_recv_datagram() is called for RX.
> But so->wait is also used for RX and there might proabbly be be some
> inconsistency between those.


AFAIK, it is correct behavior for isotp_poll to register all wait queues
with the poller.

Before 79e19fa79c, I assume datagram_poll always returned
EPOLLOUT because its corresponding send buffer is unused and empty. This
return value can be incorrect if so->tx.state is not IDLE and a send would
block. With the patch, this behavior is now suppressed. I believe that
the inconsistency could rather have been introduced with
0517374 ("can: isotp: fix race between isotp_sendsmg() and isotp_release()").

With this patch, the behavior of isotp_sendmsg was changed to only check
so->wait for sleepers instead of consulting so->tx.state to see if the
socket is busy. Since the wait queue can also have sleepers only interested
in read operations, I believe the return value is not a valid indicator
of send readiness. Additionally, in this state, the behavior in isotp_sendmsg
is inconsistent with isotp_poll. I will try to test next week if reverting this
part of the patch could fix the race condition.


>
>> My actual use-case is in Async Rust using tokio.
> Our initial motivation was also Rust and tokio; back then, however, I did
> testing only with simple C programs. I'm definitely interested in having
> this working.
>
> I'll try to look at this in more detail during the weekend. It's too
> late for me today.
>
> Best regards,
> -Michal
>
Regards,
Lukas



* Re: can: isotp: epoll breaks isotp_sendmsg
  2023-08-13 11:23   ` Lukas Magel
@ 2023-08-18 11:53     ` Lukas Magel
  2023-08-20 19:51       ` Oliver Hartkopp
  0 siblings, 1 reply; 9+ messages in thread
From: Lukas Magel @ 2023-08-18 11:53 UTC (permalink / raw)
  To: Michal Sojka, Maxime Jayat, Oliver Hartkopp, Marc Kleine-Budde
  Cc: linux-can, netdev, linux-kernel, Dae R. Jeong, Hillf Danton

On 13.08.23 13:23, Lukas Magel wrote:
> Hi Maxime, hi Michal,
>
> On 01.07.23 00:35, Michal Sojka wrote:
>> Hi Maxime,
>>
>> On Fri, Jun 30 2023, Maxime Jayat wrote:
>>> Hi,
>>>
>>> There is something not clear happening with the non-blocking behavior
>>> of ISO-TP sockets in the TX path, but more importantly, using epoll now
>>> completely breaks isotp_sendmsg.
>>> I believe it is related to
>>> 79e19fa79c ("can: isotp: isotp_ops: fix poll() to not report false 
>>> EPOLLOUT events"),
>>> but actually is probably deeper than that.
>>>
>>> I don't completely understand what is exactly going on, so I am sharing
>>> the problem I face:
>>>
>>> With an ISO-TP socket in non-blocking mode, using epoll seems to make
>>> isotp_sendmsg always return -EAGAIN.
>> That's definitely not expected behavior. I tested the patch only with
>> poll, hoping that epoll would behave the same.
>>
>> [...]
>
> I am writing to report that we have been witnessing a behavior very similar
> to what you describe. ISO-TP send breaks with EAGAIN if a poll (for a read)
> occurs at the same time.
>
> Our Python stack uses two threads to do concurrent, blocking reads & writes
> from and to the ISO-TP socket. The socket has a timeout of 0.1s to facilitate
> shutdown when requested by the application. Notably, the blocking semantics
> are handled by CPython, i.e. the underlying kernel socket is non-blocking.
> CPython polls until the requested operation (read or write) can be executed
> or the timeout occurs.
>
> What happens during execution is that the socket is continuously being
> polled by the read thread, i.e. so->wait is always filled with one task.
> This process repeats until the socket receives a frame from the bus and the
> poll returns successfully. The app reads the data from the socket and
> sends a response. Since the send occurs in a different thread, the reader
> thread will have already returned to its poll loop and to populating
> so->wait. When the send occurs, isotp_sendmsg checks so->wait for sleepers
> and returns EAGAIN because the socket is non-blocking although there is no
> concurrent send operation. This dance continues until the timeout occurs for
> either the read or the write operation. If the write times out first, a
> timeout error causes the Python app to break. If the read times out first,
> there is a race that the write goes through or the reader puts in
> another poll.
>
> This behavior can be seen rather nicely in strace:
> 110580 poll([{fd=5, events=POLLIN}], 1, 100) = 0 (Timeout)
> 110580 poll([{fd=5, events=POLLIN}], 1, 100) = 0 (Timeout)
> 110580 poll([{fd=5, events=POLLIN}], 1, 100) = 1 ([{fd=5, revents=POLLIN}])
> 110580 recvfrom(5, ">\0", 4095, 0, NULL, NULL) = 2
> 110580 poll([{fd=5, events=POLLIN}], 1, 100 <unfinished ...>
> 110569 poll([{fd=5, events=POLLOUT}], 1, 100) = 1 ([{fd=5, revents=POLLOUT}])
> 110569 sendto(5, "~\0", 2, 0, NULL, 0)  = -1 EAGAIN (Resource temporarily unavailable)
> 110569 poll([{fd=5, events=POLLOUT}], 1, 100) = 1 ([{fd=5, revents=POLLOUT}])
> 110569 sendto(5, "~\0", 2, 0, NULL, 0)  = -1 EAGAIN (Resource temporarily unavailable)
> 110569 poll([{fd=5, events=POLLOUT}], 1, 100) = 1 ([{fd=5, revents=POLLOUT}])
> 110569 sendto(5, "~\0", 2, 0, NULL, 0)  = -1 EAGAIN (Resource temporarily unavailable)
> [....]
> 110569 poll([{fd=5, events=POLLOUT}], 1, 6) = 1 ([{fd=5, revents=POLLOUT}])
> 110569 sendto(5, "~\0", 2, 0, NULL, 0)  = -1 EAGAIN (Resource temporarily unavailable)
> 110569 poll([{fd=5, events=POLLOUT}], 1, 6 <unfinished ...>
> 110580 <... poll resumed>)              = 0 (Timeout)
> 110569 <... poll resumed>)              = 1 ([{fd=5, revents=POLLOUT}])
> 110580 poll([{fd=5, events=POLLIN}], 1, 100 <unfinished ...>
> 110569 sendto(5, "~\0", 2, 0, NULL, 0)  = -1 EAGAIN (Resource temporarily unavailable)
> 110569 poll([{fd=5, events=POLLOUT}], 1, 5) = 1 ([{fd=5, revents=POLLOUT}])
> 110569 sendto(5, "~\0", 2, 0, NULL, 0)  = -1 EAGAIN (Resource temporarily unavailable)
>
> I believe this is consistent with the behavior you're witnessing with epoll
> because epoll also places a sleeper in so->wait that is left in the queue
> until the epoll descriptor is closed.
>
>
>>> By reverting 79e19fa79c, I get better results but still incorrect:
>> [...]
>>
>>> It is then possible to write on the socket but the write is blocking,
>>> which is not the expected behavior for a non-blocking socket.
>> Yes, incorrect behavior was why we made the commit in question, however
>> we saw write() returning -EAGAIN when it shouldn't.
>>
>>> I don't know how to solve the problem. To me, using wq_has_sleeper seems 
>>> weird.
>> Agreed. I've never tried to understand how synchronization works here.
>> Hopefully, Oliver knows more.
>>
>>> The implementation of isotp_poll feels weird too (calling both 
>>> datagram_poll and
>>> poll_wait?). But I am not sure what would be the correct
>>> implementation.
>> I understand it as follows (which might be wrong - someone, please
>> correct me), isotp_poll() should register the file with all waitqueues
>> it can wait on. so->wait is one and sock->sq.wait (used by
>> datagram_poll) is another. The former is definitely used for TX, the
>> latter is probably used because skb_recv_datagram() is called for RX.
>> But so->wait is also used for RX and there might probably be some
>> inconsistency between those.
>
> AFAIK, it is correct behavior for isotp_poll to register all wait queues
> with the poller.
>
> Before 79e19fa79c, I assume datagram_poll always returned
> EPOLLOUT because its corresponding send buffer is unused and empty. This
> return value can be incorrect if so->tx.state is not IDLE and a send would
> block. With the patch, this behavior is now suppressed. I believe that
> the inconsistency could have rather been introduced with:
> can: isotp: fix race between isotp_sendsmg() and isotp_release() 0517374
>
> With this patch, the behavior of isotp_sendmsg was changed to only check
> so->wait for sleepers instead of consulting so->tx.state to see if the
> socket is busy. Since the wait queue can also have sleepers only interested
> in read operations, I believe the return value is not a valid indicator
> of send readiness. Additionally, in this state, the behavior in isotp_sendmsg
> is inconsistent with isotp_poll. I will try to test next week if reverting this
> part of the patch could fix the race condition.
>

I submitted a patch to the mailing list and would very much
appreciate your feedback:
https://lore.kernel.org/linux-can/20230818114345.142983-1-lukas.magel@posteo.net/

@Maxime, I tried your isotprecv / isotpsend snippet and the issue did
not occur with the supplied patch. The send operation went through right
away.

@Oliver I adjusted the exit path for the case where the initial wait is
interrupted to return immediately instead of jumping to err_event_drop.
Could you please check if you would agree with this change?

Regards,
Lukas



* Re: can: isotp: epoll breaks isotp_sendmsg
  2023-08-18 11:53     ` Lukas Magel
@ 2023-08-20 19:51       ` Oliver Hartkopp
  2023-08-22  6:51         ` Lukas Magel
  0 siblings, 1 reply; 9+ messages in thread
From: Oliver Hartkopp @ 2023-08-20 19:51 UTC (permalink / raw)
  To: Lukas Magel, Michal Sojka, Maxime Jayat, Marc Kleine-Budde
  Cc: linux-can, netdev, linux-kernel, Dae R. Jeong, Hillf Danton

Hi Lukas,

On 18.08.23 13:53, Lukas Magel wrote:
> On 13.08.23 13:23, Lukas Magel wrote:
>> Hi Maxime, hi Michal,
>>
>> On 01.07.23 00:35, Michal Sojka wrote:
>>> Hi Maxime,
>>>
>>> On Fri, Jun 30 2023, Maxime Jayat wrote:
>>>> Hi,
>>>>
>>>> There is something not clear happening with the non-blocking behavior
>>>> of ISO-TP sockets in the TX path, but more importantly, using epoll now
>>>> completely breaks isotp_sendmsg.
>>>> I believe it is related to
>>>> 79e19fa79c ("can: isotp: isotp_ops: fix poll() to not report false
>>>> EPOLLOUT events"),
>>>> but actually is probably deeper than that.
>>>>
>>>> I don't completely understand what is exactly going on, so I am sharing
>>>> the problem I face:
>>>>
>>>> With an ISO-TP socket in non-blocking mode, using epoll seems to make
>>>> isotp_sendmsg always return -EAGAIN.
>>> That's definitely not expected behavior. I tested the patch only with
>>> poll, hoping that epoll would behave the same.
>>>
>>> [...]
>>
>> I am writing to report that we have been witnessing a behavior very similar
>> to what you describe. ISO-TP send breaks with EAGAIN if a poll (for a read)
>> occurs at the same time.
>>
>> Our Python stack uses two threads to do concurrent, blocking reads & writes
>> from and to the ISO-TP socket. The socket has a timeout of 0.1s to facilitate
>> shutdown when requested by the application. Notably, the blocking semantics
>> are handled by CPython, i.e. the underlying kernel socket is non-blocking.
>> CPython polls until the requested operation (read or write) can be executed
>> or the timeout occurs.
>>
>> What happens during execution is that the socket is continuously being
>> polled by the read thread, i.e. so->wait is always filled with one task.
>> This process repeats until the socket receives a frame from the bus and the
>> poll returns successfully. The app reads the data from the socket and
>> sends a response. Since the send occurs in a different thread, the reader
>> thread will have already returned to its poll loop and to populating
>> so->wait. When the send occurs, isotp_sendmsg checks so->wait for sleepers
>> and returns EAGAIN because the socket is non-blocking although there is no
>> concurrent send operation. This dance continues until the timeout occurs for
>> either the read or the write operation. If the write times out first, a
>> timeout error causes the Python app to break. If the read times out first,
>> there is a race that the write goes through or the reader puts in
>> another poll.
>>
>> This behavior can be seen rather nicely in strace:
>> 110580 poll([{fd=5, events=POLLIN}], 1, 100) = 0 (Timeout)
>> 110580 poll([{fd=5, events=POLLIN}], 1, 100) = 0 (Timeout)
>> 110580 poll([{fd=5, events=POLLIN}], 1, 100) = 1 ([{fd=5, revents=POLLIN}])
>> 110580 recvfrom(5, ">\0", 4095, 0, NULL, NULL) = 2
>> 110580 poll([{fd=5, events=POLLIN}], 1, 100 <unfinished ...>
>> 110569 poll([{fd=5, events=POLLOUT}], 1, 100) = 1 ([{fd=5, revents=POLLOUT}])
>> 110569 sendto(5, "~\0", 2, 0, NULL, 0)  = -1 EAGAIN (Resource temporarily unavailable)
>> 110569 poll([{fd=5, events=POLLOUT}], 1, 100) = 1 ([{fd=5, revents=POLLOUT}])
>> 110569 sendto(5, "~\0", 2, 0, NULL, 0)  = -1 EAGAIN (Resource temporarily unavailable)
>> 110569 poll([{fd=5, events=POLLOUT}], 1, 100) = 1 ([{fd=5, revents=POLLOUT}])
>> 110569 sendto(5, "~\0", 2, 0, NULL, 0)  = -1 EAGAIN (Resource temporarily unavailable)
>> [....]
>> 110569 poll([{fd=5, events=POLLOUT}], 1, 6) = 1 ([{fd=5, revents=POLLOUT}])
>> 110569 sendto(5, "~\0", 2, 0, NULL, 0)  = -1 EAGAIN (Resource temporarily unavailable)
>> 110569 poll([{fd=5, events=POLLOUT}], 1, 6 <unfinished ...>
>> 110580 <... poll resumed>)              = 0 (Timeout)
>> 110569 <... poll resumed>)              = 1 ([{fd=5, revents=POLLOUT}])
>> 110580 poll([{fd=5, events=POLLIN}], 1, 100 <unfinished ...>
>> 110569 sendto(5, "~\0", 2, 0, NULL, 0)  = -1 EAGAIN (Resource temporarily unavailable)
>> 110569 poll([{fd=5, events=POLLOUT}], 1, 5) = 1 ([{fd=5, revents=POLLOUT}])
>> 110569 sendto(5, "~\0", 2, 0, NULL, 0)  = -1 EAGAIN (Resource temporarily unavailable)
>>
>> I believe this is consistent with the behavior you're witnessing with epoll
>> because epoll also places a sleeper in so->wait that is left in the queue
>> until the epoll descriptor is closed.
>>
>>
>>>> By reverting 79e19fa79c, I get better results but still incorrect:
>>> [...]
>>>
>>>> It is then possible to write on the socket but the write is blocking,
>>>> which is not the expected behavior for a non-blocking socket.
>>> Yes, incorrect behavior was why we made the commit in question, however
>>> we saw write() returning -EAGAIN when it shouldn't.
>>>
>>>> I don't know how to solve the problem. To me, using wq_has_sleeper seems
>>>> weird.
>>> Agreed. I've never tried to understand how synchronization works here.
>>> Hopefully, Oliver knows more.
>>>
>>>> The implementation of isotp_poll feels weird too (calling both
>>>> datagram_poll and
>>>> poll_wait?). But I am not sure what would be the correct
>>>> implementation.
>>> I understand it as follows (which might be wrong - someone, please
>>> correct me), isotp_poll() should register the file with all waitqueues
>>> it can wait on. so->wait is one and sock->sq.wait (used by
>>> datagram_poll) is another. The former is definitely used for TX, the
>>> latter is probably used because skb_recv_datagram() is called for RX.
>>> But so->wait is also used for RX and there might probably be some
>>> inconsistency between those.
>>
>> AFAIK, it is correct behavior for isotp_poll to register all wait queues
>> with the poller.
>>
>> Before 79e19fa79c, I assume datagram_poll always returned
>> EPOLLOUT because its corresponding send buffer is unused and empty. This
>> return value can be incorrect if so->tx.state is not IDLE and a send would
>> block. With the patch, this behavior is now suppressed. I believe that
>> the inconsistency could have rather been introduced with:
>> can: isotp: fix race between isotp_sendsmg() and isotp_release() 0517374
>>
>> With this patch, the behavior of isotp_sendmsg was changed to only check
>> so->wait for sleepers instead of consulting so->tx.state to see if the
>> socket is busy. Since the wait queue can also have sleepers only interested
>> in read operations, I believe the return value is not a valid indicator
>> of send readiness. Additionally, in this state, the behavior in isotp_sendmsg
>> is inconsistent with isotp_poll. I will try to test next week if reverting this
>> part of the patch could fix the race condition.
>>
> 
> I submitted a patch to the mailing list and would very much
> appreciate your feedback:
> https://lore.kernel.org/linux-can/20230818114345.142983-1-lukas.magel@posteo.net/
> 
> @Maxime, I tried your isotprecv / isotpsend snippet and the issue did
> not occur with the supplied patch. The send operation went through right
> away.
> 
> @Oliver I adjusted the exit path for the case where the initial wait is
> interrupted to return immediately instead of jumping to err_event_drop.
> Could you please check if you would agree with this change?

The code has really improved with your change! Thanks!

But as you already assumed I have a problem with the handling of the 
cleanup when a signal interrupts the wait_event_interruptible() statement.

I think it should still be:

/* wait for complete transmission of current pdu */
err = wait_event_interruptible(so->wait, so->tx.state == ISOTP_IDLE);
if (err)
         goto err_event_drop;

as we need to make sure that the state machine is set to defined values 
and states for the next isotp_sendmsg() attempt.
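Oliver's point can be illustrated with a small userspace analogy. This is Python, not the kernel code; the names ISOTP_IDLE/ISOTP_SENDING and the err_event_drop comparison are only borrowed from the thread, and the "interruption" is simulated by a timeout. The point it shows: even when the wait for tx completion is interrupted, the state machine must be forced back to a defined value so the next isotp_sendmsg() attempt starts clean.

```python
import threading

IDLE, SENDING = "ISOTP_IDLE", "ISOTP_SENDING"

class TxState:
    """Userspace analogy of so->tx.state plus so->wait (not kernel code)."""
    def __init__(self):
        self.cond = threading.Condition()
        self.state = IDLE

    def wait_tx_done(self, timeout=None):
        """Analogy of wait_event_interruptible(so->wait, tx.state == ISOTP_IDLE).

        On normal completion return 0. If the wait is "interrupted"
        (simulated here by a timeout), do what err_event_drop does:
        force the state machine back to a defined value so the next
        send attempt starts clean, then report the error.
        """
        with self.cond:
            done = self.cond.wait_for(lambda: self.state == IDLE, timeout)
            if not done:
                self.state = IDLE        # err_event_drop analogy: defined state
                self.cond.notify_all()   # wake any other waiters
                return -1                # stand-in for -EINTR
            return 0

tx = TxState()
tx.state = SENDING                           # a PDU is "in flight" and never completes
assert tx.wait_tx_done(timeout=0.05) == -1   # interrupted wait ...
assert tx.state == IDLE                      # ... still leaves a defined state
```

Without the cleanup branch, the state would stay SENDING forever and every later send attempt on the socket would block or fail.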

Best regards,
Oliver

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: can: isotp: epoll breaks isotp_sendmsg
  2023-08-20 19:51       ` Oliver Hartkopp
@ 2023-08-22  6:51         ` Lukas Magel
  2023-08-22 16:37           ` Oliver Hartkopp
  0 siblings, 1 reply; 9+ messages in thread
From: Lukas Magel @ 2023-08-22  6:51 UTC (permalink / raw)
  To: Oliver Hartkopp, Michal Sojka, Maxime Jayat, Marc Kleine-Budde
  Cc: linux-can, netdev, linux-kernel, Dae R. Jeong, Hillf Danton

Hi Oliver,

On 20.08.23 21:51, Oliver Hartkopp wrote:
> Hi Lukas,
>
> On 18.08.23 13:53, Lukas Magel wrote:
>> On 13.08.23 13:23, Lukas Magel wrote:
>>> Hi Maxime, hi Michal,
>>>
>>> On 01.07.23 00:35, Michal Sojka wrote:
>>>> Hi Maxime,
>>>>
>>>> On Fri, Jun 30 2023, Maxime Jayat wrote:
>>>>> Hi,
>>>>>
>>>>> There is something not clear happening with the non-blocking behavior
>>>>> of ISO-TP sockets in the TX path, but more importantly, using epoll now
>>>>> completely breaks isotp_sendmsg.
>>>>> I believe it is related to
>>>>> 79e19fa79c ("can: isotp: isotp_ops: fix poll() to not report false
>>>>> EPOLLOUT events"),
>>>>> but actually is probably deeper than that.
>>>>>
>>>>> I don't completely understand what is exactly going on, so I am sharing
>>>>> the problem I face:
>>>>>
>>>>> With an ISO-TP socket in non-blocking mode, using epoll seems to make
>>>>> isotp_sendmsg always return -EAGAIN.
>>>> That's definitely not expected behavior. I tested the patch only with
>>>> poll, hoping that epoll would behave the same.
>>>>
>>>> [...]
>>> I am writing to report that we have been witnessing a behavior very similar
>>> to what you describe. ISO-TP send breaks with EAGAIN if a poll (for a read)
>>> occurs at the same time.
>>>
>>> Our Python stack uses two threads to do concurrent, blocking reads & writes
>>> from and to the ISO-TP socket. The socket has a timeout of 0.1s to facilitate
>>> shutdown when requested by the application. Notably, the blocking semantics
>>> are handled by CPython, i.e. the underlying kernel socket is non-blocking.
>>> CPython polls until the requested operation (read or write) can be executed
>>> or the timeout occurs.
>>>
>>> What happens during execution is that the socket is continuously being
>>> polled by the read thread, i.e. so->wait is always filled with one task.
>>> This process repeats until the socket receives a frame from the bus and the
>>> poll returns successfully. The app reads the data from the socket and
>>> sends a response. Since the send occurs in a different thread, the reader
>>> thread will have already returned to its poll loop and to populating
>>> so->wait. When the send occurs, isotp_sendmsg checks so->wait for sleepers
>>> and, because the socket is non-blocking, returns EAGAIN although there is no
>>> concurrent send operation. This dance continues until the timeout occurs for
>>> either the read or the write operation. If the write times out first, a
>>> timeout error causes the Python app to break. If the read times out first,
>>> there is a race between the write going through and the reader putting in
>>> another poll.
>>>
>>> This behavior can be seen rather nicely in strace:
>>> 110580 poll([{fd=5, events=POLLIN}], 1, 100) = 0 (Timeout)
>>> 110580 poll([{fd=5, events=POLLIN}], 1, 100) = 0 (Timeout)
>>> 110580 poll([{fd=5, events=POLLIN}], 1, 100) = 1 ([{fd=5, revents=POLLIN}])
>>> 110580 recvfrom(5, ">\0", 4095, 0, NULL, NULL) = 2
>>> 110580 poll([{fd=5, events=POLLIN}], 1, 100 <unfinished ...>
>>> 110569 poll([{fd=5, events=POLLOUT}], 1, 100) = 1 ([{fd=5, revents=POLLOUT}])
>>> 110569 sendto(5, "~\0", 2, 0, NULL, 0)  = -1 EAGAIN (Resource temporarily unavailable)
>>> 110569 poll([{fd=5, events=POLLOUT}], 1, 100) = 1 ([{fd=5, revents=POLLOUT}])
>>> 110569 sendto(5, "~\0", 2, 0, NULL, 0)  = -1 EAGAIN (Resource temporarily unavailable)
>>> 110569 poll([{fd=5, events=POLLOUT}], 1, 100) = 1 ([{fd=5, revents=POLLOUT}])
>>> 110569 sendto(5, "~\0", 2, 0, NULL, 0)  = -1 EAGAIN (Resource temporarily unavailable)
>>> [....]
>>> 110569 poll([{fd=5, events=POLLOUT}], 1, 6) = 1 ([{fd=5, revents=POLLOUT}])
>>> 110569 sendto(5, "~\0", 2, 0, NULL, 0)  = -1 EAGAIN (Resource temporarily unavailable)
>>> 110569 poll([{fd=5, events=POLLOUT}], 1, 6 <unfinished ...>
>>> 110580 <... poll resumed>)              = 0 (Timeout)
>>> 110569 <... poll resumed>)              = 1 ([{fd=5, revents=POLLOUT}])
>>> 110580 poll([{fd=5, events=POLLIN}], 1, 100 <unfinished ...>
>>> 110569 sendto(5, "~\0", 2, 0, NULL, 0)  = -1 EAGAIN (Resource temporarily unavailable)
>>> 110569 poll([{fd=5, events=POLLOUT}], 1, 5) = 1 ([{fd=5, revents=POLLOUT}])
>>> 110569 sendto(5, "~\0", 2, 0, NULL, 0)  = -1 EAGAIN (Resource temporarily unavailable)
>>>
>>> I believe this is consistent with the behavior you're witnessing with epoll
>>> because epoll also places a sleeper in so->wait that is left in the queue
>>> until the epoll descriptor is closed.
>>>
>>>
>>>>> By reverting 79e19fa79c, I get better results but still incorrect:
>>>> [...]
>>>>
>>>>> It is then possible to write on the socket but the write is blocking,
>>>>> which is not the expected behavior for a non-blocking socket.
>>>> Yes, incorrect behavior was why we made the commit in question; however,
>>>> we saw write() returning -EAGAIN when it shouldn't.
>>>>
>>>>> I don't know how to solve the problem. To me, using wq_has_sleeper seems
>>>>> weird.
>>>> Agreed. I've never tried to understand how synchronization works here.
>>>> Hopefully, Oliver knows more.
>>>>
>>>>> The implementation of isotp_poll feels weird too (calling both
>>>>> datagram_poll and
>>>>> poll_wait?). But I am not sure what would be the correct
>>>>> implementation.
>>>> I understand it as follows (which might be wrong - someone, please
>>>> correct me), isotp_poll() should register the file with all waitqueues
>>>> it can wait on. so->wait is one and sock->wq.wait (used by
>>>> datagram_poll) is another. The former is definitely used for TX, the
>>>> latter is probably used because skb_recv_datagram() is called for RX.
>>>> But so->wait is also used for RX, and there might be some
>>>> inconsistency between the two.
>>> AFAIK, it is correct behavior for isotp_poll to register all wait queues
>>> with the poller.
>>>
>>> Before 79e19fa79c, I assume datagram_poll always returned
>>> EPOLLOUT because its corresponding send buffer is unused and empty. This
>>> return value can be incorrect if so->tx.state is not IDLE and a send would
>>> block. With the patch, this behavior is now suppressed. I believe that
>>> the inconsistency could rather have been introduced with commit
>>> 0517374 ("can: isotp: fix race between isotp_sendsmg() and isotp_release()").
>>>
>>> With this patch, the behavior of isotp_sendmsg was changed to only check
>>> so->wait for sleepers instead of consulting so->tx.state to see if the
>>> socket is busy. Since the wait queue can also have sleepers only interested
>>> in read operations, I believe the return value is not a valid indicator
>>> of send readiness. Additionally, in this state, the behavior in isotp_sendmsg
>>> is inconsistent with isotp_poll. I will try to test next week if reverting this
>>> part of the patch could fix the race condition.
>>>
>> I submitted a patch to the mailing list and would very much
>> appreciate your feedback:
>> https://lore.kernel.org/linux-can/20230818114345.142983-1-lukas.magel@posteo.net/
>>
>> @Maxime, I tried your isotprecv / isotpsend snippet and the issue did
>> not occur with the supplied patch. The send operation went through right
>> away.
>>
>> @Oliver I adjusted the exit path for the case where the initial wait is
>> interrupted to return immediately instead of jumping to err_event_drop.
>> Could you please check if you would agree with this change?
> The code has really improved with your change! Thanks!
>
> But as you already assumed I have a problem with the handling of the 
> cleanup when a signal interrupts the wait_event_interruptible() statement.
>
> I think it should still be:
>
> /* wait for complete transmission of current pdu */
> err = wait_event_interruptible(so->wait, so->tx.state == ISOTP_IDLE);
> if (err)
>          goto err_event_drop;
>
> as we need to make sure that the state machine is set to defined values 
> and states for the next isotp_sendmsg() attempt.
>
> Best regards,
> Oliver


Thank you for the feedback! Can you elaborate why the state needs to be
reset here? For me, the loop is basically a "let's wait until we win
arbitration for the tx.state", which means that the task is allowed
to send. I'm imagining an application that has two threads, both sending
at the same time (because maybe they don't care about reading). So one
would always be waiting in the loop until the send operation of the other
has concluded. My motivation for not going to err_event_drop was that if
one thread was interrupted in its wait_event_interruptible, why would we
need to change tx.state, which is currently owned by the other
thread? The thread waiting in the loop has not done any state manipulation
of the socket.
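To make the arbitration picture concrete, here is a minimal userspace sketch. This is Python, not the kernel code: the kernel wins the arbitration with an atomic operation on so->tx.state combined with wait_event_interruptible, while this condition-variable version only mirrors the logic. What it shows is Lukas's scenario: the losing sender waits in the loop and touches no socket state until it wins.

```python
import threading

IDLE, SENDING = "ISOTP_IDLE", "ISOTP_SENDING"

class Arbiter:
    """Userspace analogy of the sendmsg entry: wait until we win tx.state."""
    def __init__(self):
        self.cond = threading.Condition()
        self.state = IDLE

    def acquire_tx(self):
        # "let's wait until we win arbitration for the tx.state"
        with self.cond:
            while self.state != IDLE:    # another sender owns the socket
                self.cond.wait()
            self.state = SENDING         # we won: this task may send now

    def release_tx(self):
        with self.cond:
            self.state = IDLE
            self.cond.notify_all()       # wake the arbitration losers

arb = Arbiter()
order = []

def sender(name):
    arb.acquire_tx()
    order.append(name)                   # critical section: the PDU transmission
    arb.release_tx()

threads = [threading.Thread(target=sender, args=(n,)) for n in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert sorted(order) == [0, 1] and arb.state == IDLE
```

In this model, interrupting a thread that is still inside acquire_tx() indeed leaves the state untouched; the open question in the thread is whether the kernel's other tx.state writers make that assumption safe.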

Regards,

Lukas



* Re: can: isotp: epoll breaks isotp_sendmsg
  2023-08-22  6:51         ` Lukas Magel
@ 2023-08-22 16:37           ` Oliver Hartkopp
  0 siblings, 0 replies; 9+ messages in thread
From: Oliver Hartkopp @ 2023-08-22 16:37 UTC (permalink / raw)
  To: Lukas Magel, Michal Sojka, Maxime Jayat, Marc Kleine-Budde
  Cc: linux-can, netdev, linux-kernel, Dae R. Jeong, Hillf Danton

Hi Lukas,

On 22.08.23 08:51, Lukas Magel wrote:

>>> @Oliver I adjusted the exit path for the case where the initial wait is
>>> interrupted to return immediately instead of jumping to err_event_drop.
>>> Could you please check if you would agree with this change?
>> The code has really improved with your change! Thanks!
>>
>> But as you already assumed I have a problem with the handling of the
>> cleanup when a signal interrupts the wait_event_interruptible() statement.
>>
>> I think it should still be:
>>
>> /* wait for complete transmission of current pdu */
>> err = wait_event_interruptible(so->wait, so->tx.state == ISOTP_IDLE);
>> if (err)
>>           goto err_event_drop;
>>
>> as we need to make sure that the state machine is set to defined values
>> and states for the next isotp_sendmsg() attempt.
>>
>> Best regards,
>> Oliver
> 
> 
> Thank you for the feedback! Can you elaborate why the state needs to be
> reset here? For me, the loop is basically a "let's wait until we win
> arbitration for the tx.state", which means that the task is allowed
> to send. I'm imagining an application that has two threads, both sending
> at the same time (because maybe they don't care about reading). So one
> would always be waiting in the loop until the send operation of the other
> has concluded. My motivation for not going to err_event_drop was that if
> one thread was interrupted in its wait_event_interruptible, why would we
> need to change tx.state that is currently being occupied by the other
> thread? The thread waiting in the loop has not done any state manipulation
> of the socket.

Please don't look only at the isotp_sendmsg() function but also at the
other code paths that can change the state, e.g. timeout handling.

Look at the documentation of commit 051737439eaee. That patch was added
only recently because it was needed.
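As a rough illustration of why sendmsg cannot reason about tx.state in isolation, here is a userspace sketch (Python, not kernel code, with names borrowed from the thread) in which a timeout path, not the sender, resets the state machine and wakes the wait queue:

```python
import threading

IDLE, SENDING = "ISOTP_IDLE", "ISOTP_SENDING"

class TxPath:
    """Sketch: tx.state is written by more paths than isotp_sendmsg() alone."""
    def __init__(self):
        self.cond = threading.Condition()
        self.state = IDLE

    def start_send(self, fc_timeout):
        with self.cond:
            self.state = SENDING
        # analogy of the isotp tx timer: if the peer never answers,
        # a timeout handler (not the sender) resets the state machine
        threading.Timer(fc_timeout, self._on_timeout).start()

    def _on_timeout(self):
        with self.cond:
            if self.state == SENDING:
                self.state = IDLE        # back to a defined state
                self.cond.notify_all()   # wake anyone blocked in "sendmsg"

    def wait_idle(self, timeout=None):
        with self.cond:
            return self.cond.wait_for(lambda: self.state == IDLE, timeout)

tx = TxPath()
tx.start_send(fc_timeout=0.05)
assert tx.wait_idle(timeout=2.0)   # woken by the timeout path, not by the sender
assert tx.state == IDLE
```

Because such asynchronous writers exist, a thread leaving the wait early has to be careful about which defined state it leaves behind for them.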

Best regards,
Oliver


end of thread, other threads:[~2023-08-22 16:37 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-06-30 16:44 can: isotp: epoll breaks isotp_sendmsg Maxime Jayat
2023-06-30 22:35 ` Michal Sojka
2023-07-17  7:49   ` Marc Kleine-Budde
2023-07-17 13:05     ` Michal Sojka
2023-08-13 11:23   ` Lukas Magel
2023-08-18 11:53     ` Lukas Magel
2023-08-20 19:51       ` Oliver Hartkopp
2023-08-22  6:51         ` Lukas Magel
2023-08-22 16:37           ` Oliver Hartkopp
