All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next 0/2] tcp: mem pressure vs SO_RCVLOWAT
@ 2021-02-12 23:22 Eric Dumazet
  2021-02-12 23:22 ` [PATCH net-next 1/2] tcp: fix SO_RCVLOWAT related hangs under mem pressure Eric Dumazet
  2021-02-12 23:22 ` [PATCH net-next 2/2] tcp: factorize logic into tcp_epollin_ready() Eric Dumazet
  0 siblings, 2 replies; 9+ messages in thread
From: Eric Dumazet @ 2021-02-12 23:22 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski; +Cc: netdev, Eric Dumazet, Eric Dumazet

From: Eric Dumazet <edumazet@google.com>

First patch fixes an issue for applications using SO_RCVLOWAT
to reduce context switches.

Second patch is a cleanup.

Eric Dumazet (2):
  tcp: fix SO_RCVLOWAT related hangs under mem pressure
  tcp: factorize logic into tcp_epollin_ready()

 include/net/tcp.h    | 21 +++++++++++++++++++--
 net/ipv4/tcp.c       | 16 ++++------------
 net/ipv4/tcp_input.c | 11 ++---------
 3 files changed, 25 insertions(+), 23 deletions(-)

-- 
2.30.0.478.g8a0d178c01-goog


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH net-next 1/2] tcp: fix SO_RCVLOWAT related hangs under mem pressure
  2021-02-12 23:22 [PATCH net-next 0/2] tcp: mem pressure vs SO_RCVLOWAT Eric Dumazet
@ 2021-02-12 23:22 ` Eric Dumazet
  2021-02-13  0:23   ` Wei Wang
  2021-02-12 23:22 ` [PATCH net-next 2/2] tcp: factorize logic into tcp_epollin_ready() Eric Dumazet
  1 sibling, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2021-02-12 23:22 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski
  Cc: netdev, Eric Dumazet, Eric Dumazet, Arjun Roy, Wei Wang

From: Eric Dumazet <edumazet@google.com>

While commit 24adbc1676af ("tcp: fix SO_RCVLOWAT hangs with fat skbs")
fixed an issue vs too small sk_rcvbuf for given sk_rcvlowat constraint,
it missed to address issue caused by memory pressure.

1) If we are under memory pressure and socket receive queue is empty.
First incoming packet is allowed to be queued, after commit
76dfa6082032 ("tcp: allow one skb to be received per socket under memory pressure")

But we do not send EPOLLIN yet, in case tcp_data_ready() sees sk_rcvlowat
is bigger than skb length.

2) Then, when next packet comes, it is dropped, and we directly
call sk->sk_data_ready().

3) If application is using poll(), tcp_poll() will then use
tcp_stream_is_readable() and decide the socket receive queue is
not yet filled, so nothing will happen.

Even when sender retransmits packets, phases 2) & 3) repeat
and flow is effectively frozen, until memory pressure is off.

Fix is to consider tcp_under_memory_pressure() to take care
of global memory pressure or memcg pressure.

Fixes: 24adbc1676af ("tcp: fix SO_RCVLOWAT hangs with fat skbs")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Arjun Roy <arjunroy@google.com>
Suggested-by: Wei Wang <weiwan@google.com>
---
 include/net/tcp.h | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 25bbada379c46add16fb7239733bd6571f10f680..244208f6f6c2ace87920b633e469421f557427a6 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1431,8 +1431,13 @@ void tcp_cleanup_rbuf(struct sock *sk, int copied);
  */
 static inline bool tcp_rmem_pressure(const struct sock *sk)
 {
-	int rcvbuf = READ_ONCE(sk->sk_rcvbuf);
-	int threshold = rcvbuf - (rcvbuf >> 3);
+	int rcvbuf, threshold;
+
+	if (tcp_under_memory_pressure(sk))
+		return true;
+
+	rcvbuf = READ_ONCE(sk->sk_rcvbuf);
+	threshold = rcvbuf - (rcvbuf >> 3);
 
 	return atomic_read(&sk->sk_rmem_alloc) > threshold;
 }
-- 
2.30.0.478.g8a0d178c01-goog


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH net-next 2/2] tcp: factorize logic into tcp_epollin_ready()
  2021-02-12 23:22 [PATCH net-next 0/2] tcp: mem pressure vs SO_RCVLOWAT Eric Dumazet
  2021-02-12 23:22 ` [PATCH net-next 1/2] tcp: fix SO_RCVLOWAT related hangs under mem pressure Eric Dumazet
@ 2021-02-12 23:22 ` Eric Dumazet
  2021-02-13  0:30   ` Wei Wang
  1 sibling, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2021-02-12 23:22 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski
  Cc: netdev, Eric Dumazet, Eric Dumazet, Arjun Roy, Wei Wang

From: Eric Dumazet <edumazet@google.com>

Both tcp_data_ready() and tcp_stream_is_readable() share the same logic.

Add tcp_epollin_ready() helper to avoid duplication.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Arjun Roy <arjunroy@google.com>
Cc: Wei Wang <weiwan@google.com>
---
 include/net/tcp.h    | 12 ++++++++++++
 net/ipv4/tcp.c       | 16 ++++------------
 net/ipv4/tcp_input.c | 11 ++---------
 3 files changed, 18 insertions(+), 21 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 244208f6f6c2ace87920b633e469421f557427a6..484eb2362645fd478f59b26b42457ecf4510eb14 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1442,6 +1442,18 @@ static inline bool tcp_rmem_pressure(const struct sock *sk)
 	return atomic_read(&sk->sk_rmem_alloc) > threshold;
 }
 
+static inline bool tcp_epollin_ready(const struct sock *sk, int target)
+{
+	const struct tcp_sock *tp = tcp_sk(sk);
+	int avail = READ_ONCE(tp->rcv_nxt) - READ_ONCE(tp->copied_seq);
+
+	if (avail <= 0)
+		return false;
+
+	return (avail >= target) || tcp_rmem_pressure(sk) ||
+	       (tcp_receive_window(tp) <= inet_csk(sk)->icsk_ack.rcv_mss);
+}
+
 extern void tcp_openreq_init_rwin(struct request_sock *req,
 				  const struct sock *sk_listener,
 				  const struct dst_entry *dst);
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 9896ca10bb340924b779cb6a7606d57fdd5c3357..7a6b58ae408d1fb1e5536ccfed8215be123f3b57 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -481,19 +481,11 @@ static void tcp_tx_timestamp(struct sock *sk, u16 tsflags)
 	}
 }
 
-static inline bool tcp_stream_is_readable(const struct tcp_sock *tp,
-					  int target, struct sock *sk)
+static bool tcp_stream_is_readable(struct sock *sk, int target)
 {
-	int avail = READ_ONCE(tp->rcv_nxt) - READ_ONCE(tp->copied_seq);
+	if (tcp_epollin_ready(sk, target))
+		return true;
 
-	if (avail > 0) {
-		if (avail >= target)
-			return true;
-		if (tcp_rmem_pressure(sk))
-			return true;
-		if (tcp_receive_window(tp) <= inet_csk(sk)->icsk_ack.rcv_mss)
-			return true;
-	}
 	if (sk->sk_prot->stream_memory_read)
 		return sk->sk_prot->stream_memory_read(sk);
 	return false;
@@ -568,7 +560,7 @@ __poll_t tcp_poll(struct file *file, struct socket *sock, poll_table *wait)
 		    tp->urg_data)
 			target++;
 
-		if (tcp_stream_is_readable(tp, target, sk))
+		if (tcp_stream_is_readable(sk, target))
 			mask |= EPOLLIN | EPOLLRDNORM;
 
 		if (!(sk->sk_shutdown & SEND_SHUTDOWN)) {
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index a8f8f98159531e5d1c80660972148986f6acd20a..e32a7056cb7640c67ef2d6a4d9484684d2602fcd 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4924,15 +4924,8 @@ int tcp_send_rcvq(struct sock *sk, struct msghdr *msg, size_t size)
 
 void tcp_data_ready(struct sock *sk)
 {
-	const struct tcp_sock *tp = tcp_sk(sk);
-	int avail = tp->rcv_nxt - tp->copied_seq;
-
-	if (avail < sk->sk_rcvlowat && !tcp_rmem_pressure(sk) &&
-	    !sock_flag(sk, SOCK_DONE) &&
-	    tcp_receive_window(tp) > inet_csk(sk)->icsk_ack.rcv_mss)
-		return;
-
-	sk->sk_data_ready(sk);
+	if (tcp_epollin_ready(sk, sk->sk_rcvlowat))
+		sk->sk_data_ready(sk);
 }
 
 static void tcp_data_queue(struct sock *sk, struct sk_buff *skb)
-- 
2.30.0.478.g8a0d178c01-goog


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next 1/2] tcp: fix SO_RCVLOWAT related hangs under mem pressure
  2021-02-12 23:22 ` [PATCH net-next 1/2] tcp: fix SO_RCVLOWAT related hangs under mem pressure Eric Dumazet
@ 2021-02-13  0:23   ` Wei Wang
  0 siblings, 0 replies; 9+ messages in thread
From: Wei Wang @ 2021-02-13  0:23 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Jakub Kicinski, netdev, Eric Dumazet, Arjun Roy

On Fri, Feb 12, 2021 at 3:22 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
> From: Eric Dumazet <edumazet@google.com>
>
> While commit 24adbc1676af ("tcp: fix SO_RCVLOWAT hangs with fat skbs")
> fixed an issue vs too small sk_rcvbuf for given sk_rcvlowat constraint,
> it missed to address issue caused by memory pressure.
>
> 1) If we are under memory pressure and socket receive queue is empty.
> First incoming packet is allowed to be queued, after commit
> 76dfa6082032 ("tcp: allow one skb to be received per socket under memory pressure")
>
> But we do not send EPOLLIN yet, in case tcp_data_ready() sees sk_rcvlowat
> is bigger than skb length.
>
> 2) Then, when next packet comes, it is dropped, and we directly
> call sk->sk_data_ready().
>
> 3) If application is using poll(), tcp_poll() will then use
> tcp_stream_is_readable() and decide the socket receive queue is
> not yet filled, so nothing will happen.
>
> Even when sender retransmits packets, phases 2) & 3) repeat
> and flow is effectively frozen, until memory pressure is off.
>
> Fix is to consider tcp_under_memory_pressure() to take care
> of global memory pressure or memcg pressure.
>
> Fixes: 24adbc1676af ("tcp: fix SO_RCVLOWAT hangs with fat skbs")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Arjun Roy <arjunroy@google.com>
> Suggested-by: Wei Wang <weiwan@google.com>
> ---
Nice description in the commit msg!

Reviewed-by: Wei Wang <weiwan@google.com>

>  include/net/tcp.h | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/include/net/tcp.h b/include/net/tcp.h
> index 25bbada379c46add16fb7239733bd6571f10f680..244208f6f6c2ace87920b633e469421f557427a6 100644
> --- a/include/net/tcp.h
> +++ b/include/net/tcp.h
> @@ -1431,8 +1431,13 @@ void tcp_cleanup_rbuf(struct sock *sk, int copied);
>   */
>  static inline bool tcp_rmem_pressure(const struct sock *sk)
>  {
> -       int rcvbuf = READ_ONCE(sk->sk_rcvbuf);
> -       int threshold = rcvbuf - (rcvbuf >> 3);
> +       int rcvbuf, threshold;
> +
> +       if (tcp_under_memory_pressure(sk))
> +               return true;
> +
> +       rcvbuf = READ_ONCE(sk->sk_rcvbuf);
> +       threshold = rcvbuf - (rcvbuf >> 3);
>
>         return atomic_read(&sk->sk_rmem_alloc) > threshold;
>  }
> --
> 2.30.0.478.g8a0d178c01-goog
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next 2/2] tcp: factorize logic into tcp_epollin_ready()
  2021-02-12 23:22 ` [PATCH net-next 2/2] tcp: factorize logic into tcp_epollin_ready() Eric Dumazet
@ 2021-02-13  0:30   ` Wei Wang
  2021-02-13  7:50     ` Eric Dumazet
  0 siblings, 1 reply; 9+ messages in thread
From: Wei Wang @ 2021-02-13  0:30 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Jakub Kicinski, netdev, Eric Dumazet, Arjun Roy

On Fri, Feb 12, 2021 at 3:22 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
> From: Eric Dumazet <edumazet@google.com>
>
> Both tcp_data_ready() and tcp_stream_is_readable() share the same logic.
>
> Add tcp_epollin_ready() helper to avoid duplication.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Arjun Roy <arjunroy@google.com>
> Cc: Wei Wang <weiwan@google.com>
> ---
>  include/net/tcp.h    | 12 ++++++++++++
>  net/ipv4/tcp.c       | 16 ++++------------
>  net/ipv4/tcp_input.c | 11 ++---------
>  3 files changed, 18 insertions(+), 21 deletions(-)
>
> diff --git a/include/net/tcp.h b/include/net/tcp.h
> index 244208f6f6c2ace87920b633e469421f557427a6..484eb2362645fd478f59b26b42457ecf4510eb14 100644
> --- a/include/net/tcp.h
> +++ b/include/net/tcp.h
> @@ -1442,6 +1442,18 @@ static inline bool tcp_rmem_pressure(const struct sock *sk)
>         return atomic_read(&sk->sk_rmem_alloc) > threshold;
>  }
>
> +static inline bool tcp_epollin_ready(const struct sock *sk, int target)
> +{
> +       const struct tcp_sock *tp = tcp_sk(sk);
> +       int avail = READ_ONCE(tp->rcv_nxt) - READ_ONCE(tp->copied_seq);
> +
> +       if (avail <= 0)
> +               return false;
> +
> +       return (avail >= target) || tcp_rmem_pressure(sk) ||
> +              (tcp_receive_window(tp) <= inet_csk(sk)->icsk_ack.rcv_mss);
> +}
> +
>  extern void tcp_openreq_init_rwin(struct request_sock *req,
>                                   const struct sock *sk_listener,
>                                   const struct dst_entry *dst);
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 9896ca10bb340924b779cb6a7606d57fdd5c3357..7a6b58ae408d1fb1e5536ccfed8215be123f3b57 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -481,19 +481,11 @@ static void tcp_tx_timestamp(struct sock *sk, u16 tsflags)
>         }
>  }
>
> -static inline bool tcp_stream_is_readable(const struct tcp_sock *tp,
> -                                         int target, struct sock *sk)
> +static bool tcp_stream_is_readable(struct sock *sk, int target)
>  {
> -       int avail = READ_ONCE(tp->rcv_nxt) - READ_ONCE(tp->copied_seq);
> +       if (tcp_epollin_ready(sk, target))
> +               return true;
>
> -       if (avail > 0) {
> -               if (avail >= target)
> -                       return true;
> -               if (tcp_rmem_pressure(sk))
> -                       return true;
> -               if (tcp_receive_window(tp) <= inet_csk(sk)->icsk_ack.rcv_mss)
> -                       return true;
> -       }
>         if (sk->sk_prot->stream_memory_read)
>                 return sk->sk_prot->stream_memory_read(sk);
>         return false;
> @@ -568,7 +560,7 @@ __poll_t tcp_poll(struct file *file, struct socket *sock, poll_table *wait)
>                     tp->urg_data)
>                         target++;
>
> -               if (tcp_stream_is_readable(tp, target, sk))
> +               if (tcp_stream_is_readable(sk, target))
>                         mask |= EPOLLIN | EPOLLRDNORM;
>
>                 if (!(sk->sk_shutdown & SEND_SHUTDOWN)) {
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index a8f8f98159531e5d1c80660972148986f6acd20a..e32a7056cb7640c67ef2d6a4d9484684d2602fcd 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -4924,15 +4924,8 @@ int tcp_send_rcvq(struct sock *sk, struct msghdr *msg, size_t size)
>
>  void tcp_data_ready(struct sock *sk)
>  {
> -       const struct tcp_sock *tp = tcp_sk(sk);
> -       int avail = tp->rcv_nxt - tp->copied_seq;
> -
> -       if (avail < sk->sk_rcvlowat && !tcp_rmem_pressure(sk) &&
> -           !sock_flag(sk, SOCK_DONE) &&

Seems "!sock_flag(sk, SOCK_DONE)" is not checked in
tcp_epollin_read(). Does it matter?

> -           tcp_receive_window(tp) > inet_csk(sk)->icsk_ack.rcv_mss)
> -               return;
> -
> -       sk->sk_data_ready(sk);
> +       if (tcp_epollin_ready(sk, sk->sk_rcvlowat))
> +               sk->sk_data_ready(sk);
>  }
>
>  static void tcp_data_queue(struct sock *sk, struct sk_buff *skb)
> --
> 2.30.0.478.g8a0d178c01-goog
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next 2/2] tcp: factorize logic into tcp_epollin_ready()
  2021-02-13  0:30   ` Wei Wang
@ 2021-02-13  7:50     ` Eric Dumazet
  2021-02-13  8:04       ` Eric Dumazet
  0 siblings, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2021-02-13  7:50 UTC (permalink / raw)
  To: Wei Wang
  Cc: Eric Dumazet, David S . Miller, Jakub Kicinski, netdev, Arjun Roy

On Sat, Feb 13, 2021 at 1:30 AM Wei Wang <weiwan@google.com> wrote:
>

> >  void tcp_data_ready(struct sock *sk)
> >  {
> > -       const struct tcp_sock *tp = tcp_sk(sk);
> > -       int avail = tp->rcv_nxt - tp->copied_seq;
> > -
> > -       if (avail < sk->sk_rcvlowat && !tcp_rmem_pressure(sk) &&
> > -           !sock_flag(sk, SOCK_DONE) &&
>
> Seems "!sock_flag(sk, SOCK_DONE)" is not checked in
> tcp_epollin_read(). Does it matter?
>


Yes, probably, good catch.

Not sure where tcp_poll() gets this, I have to double check.

Thanks !

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next 2/2] tcp: factorize logic into tcp_epollin_ready()
  2021-02-13  7:50     ` Eric Dumazet
@ 2021-02-13  8:04       ` Eric Dumazet
  2021-02-13 17:10         ` Arjun Roy
  0 siblings, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2021-02-13  8:04 UTC (permalink / raw)
  To: Wei Wang
  Cc: Eric Dumazet, David S . Miller, Jakub Kicinski, netdev, Arjun Roy

On Sat, Feb 13, 2021 at 8:50 AM Eric Dumazet <edumazet@google.com> wrote:
>
> On Sat, Feb 13, 2021 at 1:30 AM Wei Wang <weiwan@google.com> wrote:
> >
>
> > >  void tcp_data_ready(struct sock *sk)
> > >  {
> > > -       const struct tcp_sock *tp = tcp_sk(sk);
> > > -       int avail = tp->rcv_nxt - tp->copied_seq;
> > > -
> > > -       if (avail < sk->sk_rcvlowat && !tcp_rmem_pressure(sk) &&
> > > -           !sock_flag(sk, SOCK_DONE) &&
> >
> > Seems "!sock_flag(sk, SOCK_DONE)" is not checked in
> > tcp_epollin_read(). Does it matter?
> >
>
>
> Yes, probably, good catch.
>
> Not sure where tcp_poll() gets this, I have to double check.

It gets the info from sk->sk_hutdown & RCV_SHUTDOWN

tcp_find() sets both sk->sk_shutdown |= RCV_SHUTDOWN and
sock_set_flag(sk, SOCK_DONE);

This seems to suggest tcp_fin() could call sk->sk_data_ready() so that
we do not have to test for this unlikely condition in tcp_data_ready()

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next 2/2] tcp: factorize logic into tcp_epollin_ready()
  2021-02-13  8:04       ` Eric Dumazet
@ 2021-02-13 17:10         ` Arjun Roy
  2021-02-13 17:21           ` Arjun Roy
  0 siblings, 1 reply; 9+ messages in thread
From: Arjun Roy @ 2021-02-13 17:10 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Wei Wang, Eric Dumazet, David S . Miller, Jakub Kicinski, netdev

On Sat, Feb 13, 2021 at 12:05 AM Eric Dumazet <edumazet@google.com> wrote:
>
> On Sat, Feb 13, 2021 at 8:50 AM Eric Dumazet <edumazet@google.com> wrote:
> >
> > On Sat, Feb 13, 2021 at 1:30 AM Wei Wang <weiwan@google.com> wrote:
> > >
> >
> > > >  void tcp_data_ready(struct sock *sk)
> > > >  {
> > > > -       const struct tcp_sock *tp = tcp_sk(sk);
> > > > -       int avail = tp->rcv_nxt - tp->copied_seq;
> > > > -
> > > > -       if (avail < sk->sk_rcvlowat && !tcp_rmem_pressure(sk) &&
> > > > -           !sock_flag(sk, SOCK_DONE) &&
> > >
> > > Seems "!sock_flag(sk, SOCK_DONE)" is not checked in
> > > tcp_epollin_read(). Does it matter?
> > >
> >
> >
> > Yes, probably, good catch.
> >
> > Not sure where tcp_poll() gets this, I have to double check.
>
> It gets the info from sk->sk_hutdown & RCV_SHUTDOWN
>
> tcp_find() sets both sk->sk_shutdown |= RCV_SHUTDOWN and
> sock_set_flag(sk, SOCK_DONE);
>
> This seems to suggest tcp_fin() could call sk->sk_data_ready() so that
> we do not have to test for this unlikely condition in tcp_data_ready()

When a thread is subsequently then woken up due to sk_data_ready(),
and it calls tcp_stream_is_readable() but we had lowat > 1 set, is
there a chance of that thread then thinking that the stream is not
readable, despite SOCK_DONE being set? This is assuming that the check
is not added to the refactored logic.

Note that on a related note if the tcp memory pressure check (for
system-wide pressure) is added just to the original code in
tcp_data_ready() but not added to tcp_stream_is_readable() we had this
kind of issue (sk_data_ready() was called but tcp_stream_is_readable()
returned false).

-Arjun

-Arjun

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next 2/2] tcp: factorize logic into tcp_epollin_ready()
  2021-02-13 17:10         ` Arjun Roy
@ 2021-02-13 17:21           ` Arjun Roy
  0 siblings, 0 replies; 9+ messages in thread
From: Arjun Roy @ 2021-02-13 17:21 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Wei Wang, Eric Dumazet, David S . Miller, Jakub Kicinski, netdev

On Sat, Feb 13, 2021 at 9:10 AM Arjun Roy <arjunroy@google.com> wrote:
>
> On Sat, Feb 13, 2021 at 12:05 AM Eric Dumazet <edumazet@google.com> wrote:
> >
> > On Sat, Feb 13, 2021 at 8:50 AM Eric Dumazet <edumazet@google.com> wrote:
> > >
> > > On Sat, Feb 13, 2021 at 1:30 AM Wei Wang <weiwan@google.com> wrote:
> > > >
> > >
> > > > >  void tcp_data_ready(struct sock *sk)
> > > > >  {
> > > > > -       const struct tcp_sock *tp = tcp_sk(sk);
> > > > > -       int avail = tp->rcv_nxt - tp->copied_seq;
> > > > > -
> > > > > -       if (avail < sk->sk_rcvlowat && !tcp_rmem_pressure(sk) &&
> > > > > -           !sock_flag(sk, SOCK_DONE) &&
> > > >
> > > > Seems "!sock_flag(sk, SOCK_DONE)" is not checked in
> > > > tcp_epollin_read(). Does it matter?
> > > >
> > >
> > >
> > > Yes, probably, good catch.
> > >
> > > Not sure where tcp_poll() gets this, I have to double check.
> >
> > It gets the info from sk->sk_hutdown & RCV_SHUTDOWN
> >
> > tcp_find() sets both sk->sk_shutdown |= RCV_SHUTDOWN and
> > sock_set_flag(sk, SOCK_DONE);
> >
> > This seems to suggest tcp_fin() could call sk->sk_data_ready() so that
> > we do not have to test for this unlikely condition in tcp_data_ready()
>
> When a thread is subsequently then woken up due to sk_data_ready(),
> and it calls tcp_stream_is_readable() but we had lowat > 1 set, is
> there a chance of that thread then thinking that the stream is not
> readable, despite SOCK_DONE being set? This is assuming that the check
> is not added to the refactored logic.
>
> Note that on a related note if the tcp memory pressure check (for
> system-wide pressure) is added just to the original code in
> tcp_data_ready() but not added to tcp_stream_is_readable() we had this
> kind of issue (sk_data_ready() was called but tcp_stream_is_readable()
> returned false).
>

Disregard,  I just saw your followup patch. So I guess it's fine.

-Arjun

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2021-02-13 17:22 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-12 23:22 [PATCH net-next 0/2] tcp: mem pressure vs SO_RCVLOWAT Eric Dumazet
2021-02-12 23:22 ` [PATCH net-next 1/2] tcp: fix SO_RCVLOWAT related hangs under mem pressure Eric Dumazet
2021-02-13  0:23   ` Wei Wang
2021-02-12 23:22 ` [PATCH net-next 2/2] tcp: factorize logic into tcp_epollin_ready() Eric Dumazet
2021-02-13  0:30   ` Wei Wang
2021-02-13  7:50     ` Eric Dumazet
2021-02-13  8:04       ` Eric Dumazet
2021-02-13 17:10         ` Arjun Roy
2021-02-13 17:21           ` Arjun Roy

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.