linux-kernel.vger.kernel.org archive mirror
From: Eric Dumazet <edumazet@google.com>
To: menglong8.dong@gmail.com
Cc: kuba@kernel.org, davem@davemloft.net, pabeni@redhat.com,
	dsahern@kernel.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	Menglong Dong <imagedong@tencent.com>
Subject: Re: [PATCH net-next 2/3] net: tcp: send zero-window when no memory
Date: Wed, 17 May 2023 16:44:55 +0200
Message-ID: <CANn89iKGTPHK5wMyP4oRoAuv8f56VY-RrrMPBSb8jRMJSiL5Qg@mail.gmail.com>
In-Reply-To: <20230517124201.441634-3-imagedong@tencent.com>

On Wed, May 17, 2023 at 2:42 PM <menglong8.dong@gmail.com> wrote:
>
> From: Menglong Dong <imagedong@tencent.com>
>
> For now, skb will be dropped when no memory, which makes client keep
> retrans until timeout and it's not friendly to the users.

Yes, networking needs memory. Trying to deny it is a recipe for OOM.

>
> Therefore, now we force to receive one packet on current socket when
> the protocol memory is out of the limitation. Then, this socket will
> stay in 'no mem' status, until protocol memory is available.
>

I think you missed one old patch.

commit ba3bb0e76ccd464bb66665a1941fabe55dadb3ba
("tcp: fix SO_RCVLOWAT possible hangs under high mem pressure")



> When a socket is in 'no mem' status, its receive window will become
> 0, which means window shrink happens. And the sender needs to handle
> such window shrink properly, which is done in the next commit.
>
> Signed-off-by: Menglong Dong <imagedong@tencent.com>
> ---
>  include/net/sock.h    |  1 +
>  net/ipv4/tcp_input.c  | 12 ++++++++++++
>  net/ipv4/tcp_output.c |  7 +++++++
>  3 files changed, 20 insertions(+)
>
> diff --git a/include/net/sock.h b/include/net/sock.h
> index 5edf0038867c..90db8a1d7f31 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -957,6 +957,7 @@ enum sock_flags {
>         SOCK_XDP, /* XDP is attached */
>         SOCK_TSTAMP_NEW, /* Indicates 64 bit timestamps always */
>         SOCK_RCVMARK, /* Receive SO_MARK  ancillary data with packet */
> +       SOCK_NO_MEM, /* protocol memory limitation happened */
>  };
>
>  #define SK_FLAGS_TIMESTAMP ((1UL << SOCK_TIMESTAMP) | (1UL << SOCK_TIMESTAMPING_RX_SOFTWARE))
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index a057330d6f59..56e395cb4554 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -5047,10 +5047,22 @@ static void tcp_data_queue(struct sock *sk, struct sk_buff *skb)
>                 if (skb_queue_len(&sk->sk_receive_queue) == 0)
>                         sk_forced_mem_schedule(sk, skb->truesize);

I think you missed this part: we accept at least one packet, regardless of
memory pressure, if the queue is empty.

So your changelog is misleading.
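
The rule in the two quoted lines above can be sketched as a small userspace
model. This is only an illustration, not kernel code: struct sock_model, its
fields and model_data_queue() are hypothetical names, and the single
mem_limit budget is a stand-in for the real protocol memory accounting.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a socket; all names here are hypothetical. */
struct sock_model {
	size_t queued_skbs;  /* packets currently in the receive queue */
	size_t mem_charged;  /* bytes charged to this socket so far */
	size_t mem_limit;    /* budget available when memory is tight */
};

/* Returns true if the packet is queued, false if it is dropped. */
static bool model_data_queue(struct sock_model *sk, size_t truesize)
{
	if (sk->queued_skbs == 0) {
		/* Empty queue: charge unconditionally (the "forced" path). */
		sk->mem_charged += truesize;
	} else if (sk->mem_charged + truesize > sk->mem_limit) {
		/* Queue is not empty and the budget is exhausted: drop. */
		return false;
	} else {
		sk->mem_charged += truesize;
	}
	sk->queued_skbs++;
	return true;
}
```

In this model the first packet on an empty queue is always accepted, even if
it exceeds the budget, and only later packets can be refused.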

>                 else if (tcp_try_rmem_schedule(sk, skb, skb->truesize)) {
> +                       if (sysctl_tcp_wnd_shrink)

We no longer add global sysctls for TCP. All new sysctls must be per net-ns.
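
As a rough illustration of the difference, a per-netns knob is a field in a
per-namespace structure that is reached through the socket, rather than a
single global variable. The sketch below is a userspace model only:
struct netns_model, struct sock_ns_model and model_wnd_shrink_enabled() are
hypothetical names, and the field merely reuses the name from the quoted
patch. In the kernel, existing TCP knobs of this kind live in the
per-namespace IPv4 state and are looked up via sock_net(sk).

```c
/* Toy per-namespace settings; hypothetical, not the kernel's layout. */
struct netns_model {
	int sysctl_tcp_wnd_shrink;
};

/* Toy socket that remembers which namespace it belongs to. */
struct sock_ns_model {
	const struct netns_model *net;
};

/* Read the knob from the socket's own namespace, not from a global. */
static int model_wnd_shrink_enabled(const struct sock_ns_model *sk)
{
	return sk->net->sysctl_tcp_wnd_shrink;
}
```

Two sockets in different namespaces can then see different values of the
same knob.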

> +                               goto do_wnd_shrink;
> +
>                         reason = SKB_DROP_REASON_PROTO_MEM;
>                         NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPRCVQDROP);
>                         sk->sk_data_ready(sk);
>                         goto drop;
> +do_wnd_shrink:
> +                       if (sock_flag(sk, SOCK_NO_MEM)) {
> +                               NET_INC_STATS(sock_net(sk),
> +                                             LINUX_MIB_TCPRCVQDROP);
> +                               sk->sk_data_ready(sk);
> +                               goto out_of_window;
> +                       }
> +                       sk_forced_mem_schedule(sk, skb->truesize);

So now we would accept two packets per TCP socket, and yet EPOLLIN
will not be sent in time?

Packets can consume about 45*4K each, so I do not think it is wise to
double receive queue sizes.
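
To put numbers on that estimate, here is the arithmetic as a small helper. It
assumes that 4K means 4096 bytes and takes the 45 units per packet as given;
the names FRAGS_PER_PACKET, FRAG_BYTES and worst_case_queue_bytes() are made
up for this sketch.

```c
#include <stddef.h>

enum {
	FRAGS_PER_PACKET = 45,   /* units per packet, from the estimate above */
	FRAG_BYTES       = 4096  /* assumed size of one 4K unit */
};

/* Memory held by n queued packets under the estimate, in bytes. */
static size_t worst_case_queue_bytes(size_t n_packets)
{
	return n_packets * FRAGS_PER_PACKET * FRAG_BYTES;
}
```

One such packet is 184320 bytes (180 KiB), and two are 368640 bytes
(360 KiB) per socket.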

What you want instead is simply to send EPOLLIN sooner (when the first
packet is queued instead of when the second packet is dropped)
by changing sk_forced_mem_schedule() a bit.

This might matter for applications using SO_RCVLOWAT, but not for
other applications.
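
Why SO_RCVLOWAT is the interesting case can be shown with a toy wakeup rule.
This is a simplified model under stated assumptions, not the kernel's
readability check and not the code of the commit cited earlier:
struct rcv_model and model_should_wake_reader() are hypothetical, and
mem_pressure is a single boolean standing in for whatever condition the
stack uses.

```c
#include <stdbool.h>

/* Toy receive state; all names are hypothetical. */
struct rcv_model {
	int  avail;         /* bytes queued and not yet read */
	int  rcvlowat;      /* SO_RCVLOWAT threshold set by the application */
	bool mem_pressure;  /* true when no more data can be accepted */
};

/* Should the reader be woken (EPOLLIN reported)? */
static bool model_should_wake_reader(const struct rcv_model *s)
{
	if (s->avail <= 0)
		return false;           /* nothing to read */
	if (s->avail >= s->rcvlowat)
		return true;            /* threshold reached */
	/* Below the threshold: wake anyway if waiting cannot help. */
	return s->mem_pressure;
}
```

With the default threshold of 1, any queued data already wakes the reader, so
the memory pressure branch only changes behavior for applications that raised
SO_RCVLOWAT.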


Thread overview: 22+ messages
2023-05-17 12:41 [PATCH net-next 0/3] net: tcp: add support of window shrink menglong8.dong
2023-05-17 12:41 ` [PATCH net-next 1/3] net: tcp: add sysctl for controling tcp " menglong8.dong
2023-05-17 12:42 ` [PATCH net-next 2/3] net: tcp: send zero-window when no memory menglong8.dong
2023-05-17 14:44   ` Eric Dumazet [this message]
2023-05-18  2:14     ` Menglong Dong
2023-05-18 14:25     ` Menglong Dong
2023-05-17 12:42 ` [PATCH net-next 3/3] net: tcp: handle window shrink properly menglong8.dong
2023-05-17 14:47   ` Eric Dumazet
2023-05-18  2:34     ` Menglong Dong
2023-05-18 13:40       ` Neal Cardwell
2023-05-18 14:11         ` Menglong Dong
2023-05-18 16:03           ` Neal Cardwell
2023-05-20  9:07             ` Menglong Dong
2023-05-20 14:28               ` Neal Cardwell
2023-05-22  2:55                 ` Menglong Dong
2023-05-22 15:04                   ` Neal Cardwell
2023-05-23  8:59                     ` Menglong Dong
2023-05-23 13:27                       ` Neal Cardwell
2023-05-24 12:16                         ` Menglong Dong
2023-05-24 14:49                           ` Neal Cardwell
2023-05-23 10:26                     ` Eric Dumazet
2023-05-17 16:55   ` kernel test robot
