* [PATCH net] tcp: make sure EPOLLOUT wont be missed
@ 2019-08-17 4:26 Eric Dumazet
2019-08-17 12:39 ` Soheil Hassas Yeganeh
` (3 more replies)
0 siblings, 4 replies; 7+ messages in thread
From: Eric Dumazet @ 2019-08-17 4:26 UTC (permalink / raw)
To: David S . Miller
Cc: netdev, Eric Dumazet, Soheil Hassas Yeganeh, Neal Cardwell,
Eric Dumazet, Jason Baron, Vladimir Rutsky

As Jason Baron explained in commit 790ba4566c1a ("tcp: set SOCK_NOSPACE
under memory pressure"), it is crucial we properly set SOCK_NOSPACE
when needed.

However, Jason's patch had a bug, because the 'nonblocking' status
as far as sk_stream_wait_memory() is concerned is governed
by the MSG_DONTWAIT flag passed at sendmsg() time:

	long timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT);

So it is very possible that tcp sendmsg() calls sk_stream_wait_memory(),
and that sk_stream_wait_memory() returns -EAGAIN with SOCK_NOSPACE
cleared, if sk->sk_sndtimeo has been set to a small (but not zero)
value.

This patch removes the 'noblock' variable since we must always
set SOCK_NOSPACE if -EAGAIN is returned.

It also renames the do_nonblock label since we might reach this
code path even if we were in blocking mode.

Fixes: 790ba4566c1a ("tcp: set SOCK_NOSPACE under memory pressure")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jason Baron <jbaron@akamai.com>
Reported-by: Vladimir Rutsky <rutsky@google.com>
---
 net/core/stream.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/net/core/stream.c b/net/core/stream.c
index e94bb02a56295ec2db34ab423a8c7c890df0a696..4f1d4aa5fb38d989a9c81f32dfce3f31bbc1fa47 100644
--- a/net/core/stream.c
+++ b/net/core/stream.c
@@ -120,7 +120,6 @@ int sk_stream_wait_memory(struct sock *sk, long *timeo_p)
 	int err = 0;
 	long vm_wait = 0;
 	long current_timeo = *timeo_p;
-	bool noblock = (*timeo_p ? false : true);
 	DEFINE_WAIT_FUNC(wait, woken_wake_function);
 
 	if (sk_stream_memory_free(sk))
@@ -133,11 +132,8 @@ int sk_stream_wait_memory(struct sock *sk, long *timeo_p)
 
 		if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN))
 			goto do_error;
-		if (!*timeo_p) {
-			if (noblock)
-				set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
-			goto do_nonblock;
-		}
+		if (!*timeo_p)
+			goto do_eagain;
 		if (signal_pending(current))
 			goto do_interrupted;
 		sk_clear_bit(SOCKWQ_ASYNC_NOSPACE, sk);
@@ -169,7 +165,13 @@ int sk_stream_wait_memory(struct sock *sk, long *timeo_p)
 do_error:
 	err = -EPIPE;
 	goto out;
-do_nonblock:
+do_eagain:
+	/* Make sure that whenever EAGAIN is returned, EPOLLOUT event can
+	 * be generated later.
+	 * When TCP receives ACK packets that make room, tcp_check_space()
+	 * only calls tcp_new_space() if SOCK_NOSPACE is set.
+	 */
+	set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
 	err = -EAGAIN;
 	goto out;
 do_interrupted:
--
2.23.0.rc1.153.gdeed80330f-goog
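
For readers following the thread, the control flow changed by this patch can be approximated with a small userspace model. This is only a sketch under stated assumptions: `struct model_sock`, `model_sndtimeo()`, `wait_memory_old()`, `wait_memory_new()` and `nospace_after_eagain()` are hypothetical names, not kernel symbols, and the model assumes send memory never becomes free, so the whole timeout always elapses before -EAGAIN is returned.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Simplified stand-in for the socket state the patch touches. */
struct model_sock {
	long sndtimeo;	/* models sk->sk_sndtimeo (SO_SNDTIMEO) */
	bool nospace;	/* models the SOCK_NOSPACE bit */
};

/* Models sock_sndtimeo(): MSG_DONTWAIT forces a zero timeout. */
static long model_sndtimeo(const struct model_sock *sk, bool dontwait)
{
	return dontwait ? 0 : sk->sndtimeo;
}

/* Pre-patch logic: 'noblock' is sampled once, before any waiting. */
static int wait_memory_old(struct model_sock *sk, long *timeo_p)
{
	bool noblock = (*timeo_p ? false : true);

	for (;;) {
		if (!*timeo_p) {
			if (noblock)
				sk->nospace = true;
			return -EAGAIN;
		}
		/* Assumption: no memory is freed, the full wait elapses. */
		*timeo_p = 0;
	}
}

/* Post-patch logic: every -EAGAIN return sets the flag. */
static int wait_memory_new(struct model_sock *sk, long *timeo_p)
{
	for (;;) {
		if (!*timeo_p) {
			sk->nospace = true;
			return -EAGAIN;
		}
		/* Same assumption as above: the timeout simply expires. */
		*timeo_p = 0;
	}
}

/*
 * Runs one send attempt against a permanently full socket and reports
 * whether the modelled SOCK_NOSPACE bit is set once -EAGAIN comes back.
 */
bool nospace_after_eagain(bool patched, long sndtimeo, bool dontwait)
{
	struct model_sock sk = { .sndtimeo = sndtimeo, .nospace = false };
	long timeo = model_sndtimeo(&sk, dontwait);
	int err = patched ? wait_memory_new(&sk, &timeo)
			  : wait_memory_old(&sk, &timeo);

	assert(err == -EAGAIN);
	return sk.nospace;
}
```

In this model, `nospace_after_eagain(false, 5, false)` reports the flag clear, which is the case described in the commit message (non-zero send timeout, no MSG_DONTWAIT). The same call with `patched` set to true, or with `dontwait` set to true, reports the flag set.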
* Re: [PATCH net] tcp: make sure EPOLLOUT wont be missed
2019-08-17 4:26 [PATCH net] tcp: make sure EPOLLOUT wont be missed Eric Dumazet
@ 2019-08-17 12:39 ` Soheil Hassas Yeganeh
2019-08-17 14:19 ` Jason Baron
` (2 subsequent siblings)
3 siblings, 0 replies; 7+ messages in thread
From: Soheil Hassas Yeganeh @ 2019-08-17 12:39 UTC (permalink / raw)
To: Eric Dumazet
Cc: David S . Miller, netdev, Neal Cardwell, Eric Dumazet,
Jason Baron, Vladimir Rutsky
On Sat, Aug 17, 2019 at 12:26 AM Eric Dumazet <edumazet@google.com> wrote:
>
> As Jason Baron explained in commit 790ba4566c1a ("tcp: set SOCK_NOSPACE
> under memory pressure"), it is crucial we properly set SOCK_NOSPACE
> when needed.
>
> However, Jason patch had a bug, because the 'nonblocking' status
> as far as sk_stream_wait_memory() is concerned is governed
> by MSG_DONTWAIT flag passed at sendmsg() time :
>
> long timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT);
>
> So it is very possible that tcp sendmsg() calls sk_stream_wait_memory(),
> and that sk_stream_wait_memory() returns -EAGAIN with SOCK_NOSPACE
> cleared, if sk->sk_sndtimeo has been set to a small (but not zero)
> value.
>
> This patch removes the 'noblock' variable since we must always
> set SOCK_NOSPACE if -EAGAIN is returned.
>
> It also renames the do_nonblock label since we might reach this
> code path even if we were in blocking mode.
>
> Fixes: 790ba4566c1a ("tcp: set SOCK_NOSPACE under memory pressure")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Jason Baron <jbaron@akamai.com>
> Reported-by: Vladimir Rutsky <rutsky@google.com>

Acked-by: Soheil Hassas Yeganeh <soheil@google.com>

Thank you for the fix!

> ---
> net/core/stream.c | 16 +++++++++-------
> 1 file changed, 9 insertions(+), 7 deletions(-)
>
> diff --git a/net/core/stream.c b/net/core/stream.c
> index e94bb02a56295ec2db34ab423a8c7c890df0a696..4f1d4aa5fb38d989a9c81f32dfce3f31bbc1fa47 100644
> --- a/net/core/stream.c
> +++ b/net/core/stream.c
> @@ -120,7 +120,6 @@ int sk_stream_wait_memory(struct sock *sk, long *timeo_p)
> int err = 0;
> long vm_wait = 0;
> long current_timeo = *timeo_p;
> - bool noblock = (*timeo_p ? false : true);
> DEFINE_WAIT_FUNC(wait, woken_wake_function);
>
> if (sk_stream_memory_free(sk))
> @@ -133,11 +132,8 @@ int sk_stream_wait_memory(struct sock *sk, long *timeo_p)
>
> if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN))
> goto do_error;
> - if (!*timeo_p) {
> - if (noblock)
> - set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
> - goto do_nonblock;
> - }
> + if (!*timeo_p)
> + goto do_eagain;
> if (signal_pending(current))
> goto do_interrupted;
> sk_clear_bit(SOCKWQ_ASYNC_NOSPACE, sk);
> @@ -169,7 +165,13 @@ int sk_stream_wait_memory(struct sock *sk, long *timeo_p)
> do_error:
> err = -EPIPE;
> goto out;
> -do_nonblock:
> +do_eagain:
> + /* Make sure that whenever EAGAIN is returned, EPOLLOUT event can
> + * be generated later.
> + * When TCP receives ACK packets that make room, tcp_check_space()
> + * only calls tcp_new_space() if SOCK_NOSPACE is set.
> + */
> + set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
> err = -EAGAIN;
> goto out;
> do_interrupted:
> --
> 2.23.0.rc1.153.gdeed80330f-goog
>
* Re: [PATCH net] tcp: make sure EPOLLOUT wont be missed
2019-08-17 4:26 [PATCH net] tcp: make sure EPOLLOUT wont be missed Eric Dumazet
2019-08-17 12:39 ` Soheil Hassas Yeganeh
@ 2019-08-17 14:19 ` Jason Baron
2019-08-17 16:26 ` Eric Dumazet
2019-08-17 17:10 ` Neal Cardwell
2019-08-19 20:08 ` David Miller
3 siblings, 1 reply; 7+ messages in thread
From: Jason Baron @ 2019-08-17 14:19 UTC (permalink / raw)
To: Eric Dumazet, David S . Miller
Cc: netdev, Soheil Hassas Yeganeh, Neal Cardwell, Eric Dumazet,
Vladimir Rutsky
On 8/17/19 12:26 AM, Eric Dumazet wrote:
> As Jason Baron explained in commit 790ba4566c1a ("tcp: set SOCK_NOSPACE
> under memory pressure"), it is crucial we properly set SOCK_NOSPACE
> when needed.
>
> However, Jason patch had a bug, because the 'nonblocking' status
> as far as sk_stream_wait_memory() is concerned is governed
> by MSG_DONTWAIT flag passed at sendmsg() time :
>
> long timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT);
>
> So it is very possible that tcp sendmsg() calls sk_stream_wait_memory(),
> and that sk_stream_wait_memory() returns -EAGAIN with SOCK_NOSPACE
> cleared, if sk->sk_sndtimeo has been set to a small (but not zero)
> value.

Is MSG_DONTWAIT not set in this case? The original patch was intended
only for the explicit non-blocking case. The epoll manpage says:
"EPOLLET flag should use nonblocking file descriptors". So the original
intention was not to impact the blocking case. This seems to me like
a different use-case.

Thanks,

-Jason

> This patch removes the 'noblock' variable since we must always
> set SOCK_NOSPACE if -EAGAIN is returned.
>
> It also renames the do_nonblock label since we might reach this
> code path even if we were in blocking mode.
>
> Fixes: 790ba4566c1a ("tcp: set SOCK_NOSPACE under memory pressure")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Jason Baron <jbaron@akamai.com>
> Reported-by: Vladimir Rutsky <rutsky@google.com>
> ---
> net/core/stream.c | 16 +++++++++-------
> 1 file changed, 9 insertions(+), 7 deletions(-)
>
> diff --git a/net/core/stream.c b/net/core/stream.c
> index e94bb02a56295ec2db34ab423a8c7c890df0a696..4f1d4aa5fb38d989a9c81f32dfce3f31bbc1fa47 100644
> --- a/net/core/stream.c
> +++ b/net/core/stream.c
> @@ -120,7 +120,6 @@ int sk_stream_wait_memory(struct sock *sk, long *timeo_p)
> int err = 0;
> long vm_wait = 0;
> long current_timeo = *timeo_p;
> - bool noblock = (*timeo_p ? false : true);
> DEFINE_WAIT_FUNC(wait, woken_wake_function);
>
> if (sk_stream_memory_free(sk))
> @@ -133,11 +132,8 @@ int sk_stream_wait_memory(struct sock *sk, long *timeo_p)
>
> if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN))
> goto do_error;
> - if (!*timeo_p) {
> - if (noblock)
> - set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
> - goto do_nonblock;
> - }
> + if (!*timeo_p)
> + goto do_eagain;
> if (signal_pending(current))
> goto do_interrupted;
> sk_clear_bit(SOCKWQ_ASYNC_NOSPACE, sk);
> @@ -169,7 +165,13 @@ int sk_stream_wait_memory(struct sock *sk, long *timeo_p)
> do_error:
> err = -EPIPE;
> goto out;
> -do_nonblock:
> +do_eagain:
> + /* Make sure that whenever EAGAIN is returned, EPOLLOUT event can
> + * be generated later.
> + * When TCP receives ACK packets that make room, tcp_check_space()
> + * only calls tcp_new_space() if SOCK_NOSPACE is set.
> + */
> + set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
> err = -EAGAIN;
> goto out;
> do_interrupted:
>
* Re: [PATCH net] tcp: make sure EPOLLOUT wont be missed
2019-08-17 14:19 ` Jason Baron
@ 2019-08-17 16:26 ` Eric Dumazet
2019-08-19 18:40 ` Jason Baron
0 siblings, 1 reply; 7+ messages in thread
From: Eric Dumazet @ 2019-08-17 16:26 UTC (permalink / raw)
To: Jason Baron, Eric Dumazet, David S . Miller
Cc: netdev, Soheil Hassas Yeganeh, Neal Cardwell, Eric Dumazet,
Vladimir Rutsky
On 8/17/19 4:19 PM, Jason Baron wrote:
>
>
> On 8/17/19 12:26 AM, Eric Dumazet wrote:
>> As Jason Baron explained in commit 790ba4566c1a ("tcp: set SOCK_NOSPACE
>> under memory pressure"), it is crucial we properly set SOCK_NOSPACE
>> when needed.
>>
>> However, Jason patch had a bug, because the 'nonblocking' status
>> as far as sk_stream_wait_memory() is concerned is governed
>> by MSG_DONTWAIT flag passed at sendmsg() time :
>>
>> long timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT);
>>
>> So it is very possible that tcp sendmsg() calls sk_stream_wait_memory(),
>> and that sk_stream_wait_memory() returns -EAGAIN with SOCK_NOSPACE
>> cleared, if sk->sk_sndtimeo has been set to a small (but not zero)
>> value.
>
> Is MSG_DONTWAIT not set in this case? The original patch was intended
> only for the explicit non-blocking case. The epoll manpage says:
> "EPOLLET flag should use nonblocking file descriptors". So the original
> intention was not to impact the blocking case. This seems to me like
> a different use-case.
>

I guess the problem is how we define 'non-blocking' ...

SO_SNDTIMEO can be used by an application to implement a variation of
non-blocking, by waiting for a socket event with a short timeout, to maybe
recover from memory pressure conditions in a more efficient way than simply
looping.

Note that the man page for epoll() only _suggests_ using nonblocking file
descriptors.

<quote>
       The suggested way to use epoll as an edge-triggered (EPOLLET)
       interface is as follows:

              i   with nonblocking file descriptors; and

              ii  by waiting for an event only after read(2) or
                  write(2) return EAGAIN.
</quote>
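
The usage pattern described in this reply (a blocking socket with a short SO_SNDTIMEO, waiting for EPOLLOUT only after a write returns EAGAIN) can be sketched in userspace as follows. This is an illustration under assumptions, not a reproducer for the TCP bug: it uses an AF_UNIX socketpair as a stand-in so it runs without a network peer, so it does not exercise sk_stream_wait_memory(); the function name `epollout_after_eagain()`, the 10 ms send timeout and the 1000 ms epoll timeout are arbitrary choices made for the example.

```c
#include <errno.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Returns 1 if EPOLLOUT was reported after the peer drained its queue,
 * 0 if epoll_wait() timed out, and -1 on any setup error.
 */
int epollout_after_eagain(void)
{
	int sv[2], ep = -1, ret = -1, n;
	char buf[4096] = { 0 };
	struct timeval tv = { .tv_sec = 0, .tv_usec = 10000 };	/* 10 ms */
	struct epoll_event ev = { .events = EPOLLOUT | EPOLLET };
	struct epoll_event out;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;

	/* Blocking socket, but each send gives up after a short timeout. */
	if (setsockopt(sv[0], SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv)) < 0)
		goto done;

	/* Write until the send queue is full and the timeout expires. */
	for (;;) {
		ssize_t sent = send(sv[0], buf, sizeof(buf), 0);

		if (sent >= 0)
			continue;
		if (errno == EINTR)
			continue;
		if (errno == EAGAIN || errno == EWOULDBLOCK)
			break;
		goto done;
	}

	/* Only after EAGAIN do we arm the edge-triggered EPOLLOUT watch. */
	ep = epoll_create1(0);
	if (ep < 0)
		goto done;
	ev.data.fd = sv[0];
	if (epoll_ctl(ep, EPOLL_CTL_ADD, sv[0], &ev) < 0)
		goto done;

	/* The peer drains everything, which makes room in the send queue. */
	while (recv(sv[1], buf, sizeof(buf), MSG_DONTWAIT) > 0)
		;

	/* The application relies on this wakeup to resume writing. */
	n = epoll_wait(ep, &out, 1, 1000);
	if (n < 0)
		goto done;
	ret = (n == 1 && (out.events & EPOLLOUT)) ? 1 : 0;
done:
	if (ep >= 0)
		close(ep);
	close(sv[0]);
	close(sv[1]);
	return ret;
}
```

An application written this way never polls before seeing EAGAIN, so it depends entirely on the kernel generating the EPOLLOUT edge once space is available again. That is the wakeup the patch makes reliable for TCP when the EAGAIN came from an expired send timeout rather than from MSG_DONTWAIT.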
* Re: [PATCH net] tcp: make sure EPOLLOUT wont be missed
2019-08-17 4:26 [PATCH net] tcp: make sure EPOLLOUT wont be missed Eric Dumazet
2019-08-17 12:39 ` Soheil Hassas Yeganeh
2019-08-17 14:19 ` Jason Baron
@ 2019-08-17 17:10 ` Neal Cardwell
2019-08-19 20:08 ` David Miller
3 siblings, 0 replies; 7+ messages in thread
From: Neal Cardwell @ 2019-08-17 17:10 UTC (permalink / raw)
To: Eric Dumazet
Cc: David S . Miller, netdev, Soheil Hassas Yeganeh, Eric Dumazet,
Jason Baron, Vladimir Rutsky
On Sat, Aug 17, 2019 at 12:26 AM Eric Dumazet <edumazet@google.com> wrote:
>
> As Jason Baron explained in commit 790ba4566c1a ("tcp: set SOCK_NOSPACE
> under memory pressure"), it is crucial we properly set SOCK_NOSPACE
> when needed.
>
> However, Jason patch had a bug, because the 'nonblocking' status
> as far as sk_stream_wait_memory() is concerned is governed
> by MSG_DONTWAIT flag passed at sendmsg() time :
>
> long timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT);
>
> So it is very possible that tcp sendmsg() calls sk_stream_wait_memory(),
> and that sk_stream_wait_memory() returns -EAGAIN with SOCK_NOSPACE
> cleared, if sk->sk_sndtimeo has been set to a small (but not zero)
> value.
>
> This patch removes the 'noblock' variable since we must always
> set SOCK_NOSPACE if -EAGAIN is returned.
>
> It also renames the do_nonblock label since we might reach this
> code path even if we were in blocking mode.
>
> Fixes: 790ba4566c1a ("tcp: set SOCK_NOSPACE under memory pressure")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Jason Baron <jbaron@akamai.com>
> Reported-by: Vladimir Rutsky <rutsky@google.com>
> ---
> net/core/stream.c | 16 +++++++++-------
> 1 file changed, 9 insertions(+), 7 deletions(-)

Acked-by: Neal Cardwell <ncardwell@google.com>

Thanks, Eric!

neal

* Re: [PATCH net] tcp: make sure EPOLLOUT wont be missed
2019-08-17 16:26 ` Eric Dumazet
@ 2019-08-19 18:40 ` Jason Baron
0 siblings, 0 replies; 7+ messages in thread
From: Jason Baron @ 2019-08-19 18:40 UTC (permalink / raw)
To: Eric Dumazet, Eric Dumazet, David S . Miller
Cc: netdev, Soheil Hassas Yeganeh, Neal Cardwell, Vladimir Rutsky
On 8/17/19 12:26 PM, Eric Dumazet wrote:
>
>
> On 8/17/19 4:19 PM, Jason Baron wrote:
>>
>>
>> On 8/17/19 12:26 AM, Eric Dumazet wrote:
>>> As Jason Baron explained in commit 790ba4566c1a ("tcp: set SOCK_NOSPACE
>>> under memory pressure"), it is crucial we properly set SOCK_NOSPACE
>>> when needed.
>>>
>>> However, Jason patch had a bug, because the 'nonblocking' status
>>> as far as sk_stream_wait_memory() is concerned is governed
>>> by MSG_DONTWAIT flag passed at sendmsg() time :
>>>
>>> long timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT);
>>>
>>> So it is very possible that tcp sendmsg() calls sk_stream_wait_memory(),
>>> and that sk_stream_wait_memory() returns -EAGAIN with SOCK_NOSPACE
>>> cleared, if sk->sk_sndtimeo has been set to a small (but not zero)
>>> value.
>>
>> Is MSG_DONTWAIT not set in this case? The original patch was intended
>> only for the explicit non-blocking case. The epoll manpage says:
>> "EPOLLET flag should use nonblocking file descriptors". So the original
>> intention was not to impact the blocking case. This seems to me like
>> a different use-case.
>>
>
> I guess the problem is how we define 'non-blocking' ...
>
> SO_SNDTIMEO can be used by application to implement a variation of non-blocking,
> by waiting for a socket event with a short timeout, to maybe recover
> from memory pressure conditions in a more efficient way than simply looping.
>
> Note that the man page for epoll() only _suggests_ to use nonblocking file descriptors.
>
> <quote>
> The suggested way to use epoll as an edge-triggered (EPOLLET)
> interface is as follows:
>
> i with nonblocking file descriptors; and
>
> ii by waiting for an event only after read(2) or
> write(2) return EAGAIN.
> </quote>
>
>

Ok, seems reasonable:

Acked-by: Jason Baron <jbaron@akamai.com>

I found a similar pattern in net/smc/smc_tx.c, which I also just sent a
patch for.

Thanks,

-Jason

* Re: [PATCH net] tcp: make sure EPOLLOUT wont be missed
2019-08-17 4:26 [PATCH net] tcp: make sure EPOLLOUT wont be missed Eric Dumazet
` (2 preceding siblings ...)
2019-08-17 17:10 ` Neal Cardwell
@ 2019-08-19 20:08 ` David Miller
3 siblings, 0 replies; 7+ messages in thread
From: David Miller @ 2019-08-19 20:08 UTC (permalink / raw)
To: edumazet; +Cc: netdev, soheil, ncardwell, eric.dumazet, jbaron, rutsky
From: Eric Dumazet <edumazet@google.com>
Date: Fri, 16 Aug 2019 21:26:22 -0700
> As Jason Baron explained in commit 790ba4566c1a ("tcp: set SOCK_NOSPACE
> under memory pressure"), it is crucial we properly set SOCK_NOSPACE
> when needed.
>
> However, Jason patch had a bug, because the 'nonblocking' status
> as far as sk_stream_wait_memory() is concerned is governed
> by MSG_DONTWAIT flag passed at sendmsg() time :
>
> long timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT);
>
> So it is very possible that tcp sendmsg() calls sk_stream_wait_memory(),
> and that sk_stream_wait_memory() returns -EAGAIN with SOCK_NOSPACE
> cleared, if sk->sk_sndtimeo has been set to a small (but not zero)
> value.
>
> This patch removes the 'noblock' variable since we must always
> set SOCK_NOSPACE if -EAGAIN is returned.
>
> It also renames the do_nonblock label since we might reach this
> code path even if we were in blocking mode.
>
> Fixes: 790ba4566c1a ("tcp: set SOCK_NOSPACE under memory pressure")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Jason Baron <jbaron@akamai.com>
> Reported-by: Vladimir Rutsky <rutsky@google.com>
Applied and queued up for -stable.