All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jakub Kicinski <kuba@kernel.org>
To: Eric Dumazet <edumazet@google.com>
Cc: Jonathan Lemon <jonathan.lemon@gmail.com>,
	Alexander Duyck <alexander.duyck@gmail.com>,
	David Miller <davem@davemloft.net>,
	netdev <netdev@vger.kernel.org>, kernel-team <kernel-team@fb.com>,
	Neil Spring <ntspring@fb.com>, Yuchung Cheng <ycheng@google.com>
Subject: Re: [PATCH net] net: tcp: don't allocate fast clones for fastopen SYN
Date: Thu, 4 Mar 2021 15:27:09 -0800	[thread overview]
Message-ID: <20210304152709.4e91bd8b@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com> (raw)
In-Reply-To: <CANn89iLBi=2VzpiUBZPHaPHCeqqoFE-JmB0KAsf-vxaPvkvcxA@mail.gmail.com>

On Thu, 4 Mar 2021 22:26:58 +0100 Eric Dumazet wrote:
> It would be nice if tun driver would have the ability to delay TX
> completions by N usecs,
> so that packetdrill tests could be used.
> 
> It is probably not too hard to add such a feature.

Add an ioctl to turn off the skb_orphan, queue the skbs in tun_do_read() 
to free them from a timer?

> I was testing this (note how I also removed the tcp_rearm_rto(sk) call)
> 
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 6f450e577975c7be9537338c8a4c0673d7d36c4c..9ef92ca55e530f76ad793d7342442c4ec06165f7
> 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -6471,11 +6471,10 @@ static bool tcp_rcv_fastopen_synack(struct
> sock *sk, struct sk_buff *synack,
>         tcp_fastopen_cache_set(sk, mss, cookie, syn_drop, try_exp);
> 
>         if (data) { /* Retransmit unacked data in SYN */
> -               skb_rbtree_walk_from(data) {
> -                       if (__tcp_retransmit_skb(sk, data, 1))
> -                               break;
> -               }
> -               tcp_rearm_rto(sk);
> +               skb_rbtree_walk_from(data)
> +                       tcp_mark_skb_lost(sk, data);
> +
> +               tcp_xmit_retransmit_queue(sk);
>                 NET_INC_STATS(sock_net(sk),
>                                 LINUX_MIB_TCPFASTOPENACTIVEFAIL);
>                 return true;

AFAICT this works great now:

==> TFO case ret:-16 (0) ca_state:0 skb:ffff8881d3513800!
  FREED swapper/5 -- skb 0xffff8881d3513800 freed after: 1us
-----
First:
        __tcp_retransmit_skb+1
        tcp_retransmit_skb+18
        tcp_xmit_retransmit_queue.part.70+339
        tcp_rcv_state_process+2491
        tcp_v6_do_rcv+405
        tcp_v6_rcv+2984

Second:
        __tcp_retransmit_skb+1
        tcp_retransmit_skb+18
        tcp_xmit_retransmit_queue.part.70+339
        tcp_tsq_write.part.71+146
        tcp_tsq_handler+53
        tcp_tasklet_func+181

 sk:0xffff8885adc16f00 skb:ffff8881d3513800 --- 61us acked:1


The other case where we hit RTO after __tcp_retransmit_skb() fails is:

==> non-TFO case ret:-11 (0) ca_state:3 skb:ffff8883d71dd400!
-----
First:
        __tcp_retransmit_skb+1
        tcp_retransmit_skb+18
        tcp_xmit_retransmit_queue.part.70+339
        tcp_ack+2270
        tcp_rcv_established+303
        tcp_v6_do_rcv+190

Second:
        __tcp_retransmit_skb+1
        tcp_retransmit_skb+18
        tcp_retransmit_timer+716
        tcp_write_timer_handler+136
        tcp_write_timer+141
        call_timer_fn+43

 sk:0xffff88801772d340 skb:ffff8883d71dd400 --- 51738us acked:47324

Which I believe is this:

	if (refcount_read(&sk->sk_wmem_alloc) >
	    min_t(u32, sk->sk_wmem_queued + (sk->sk_wmem_queued >> 2),
		  sk->sk_sndbuf))
		return -EAGAIN;

Because __tcp_retransmit_skb() seems to bail before
inet6_sk_rebuild_header gets called. Should we arm TSQ here as well?

  reply	other threads:[~2021-03-04 23:27 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-02  6:07 [PATCH net] net: tcp: don't allocate fast clones for fastopen SYN Jakub Kicinski
2021-03-02  9:38 ` Eric Dumazet
2021-03-02 17:00   ` Jakub Kicinski
2021-03-02 17:02     ` Eric Dumazet
2021-03-02 20:52   ` Yuchung Cheng
2021-03-02 22:00     ` Jakub Kicinski
2021-03-03 21:35   ` Alexander Duyck
2021-03-04  0:07     ` Jakub Kicinski
2021-03-04  2:45       ` Alexander Duyck
2021-03-04 12:51         ` Eric Dumazet
2021-03-04 19:06           ` Jakub Kicinski
2021-03-04 19:41             ` Eric Dumazet
2021-03-04 20:18               ` Eric Dumazet
2021-03-04 21:08               ` Jonathan Lemon
2021-03-04 21:20                 ` Eric Dumazet
2021-03-04 21:26                   ` Eric Dumazet
2021-03-04 23:27                     ` Jakub Kicinski [this message]
2021-03-05  5:17                       ` Eric Dumazet
2021-03-05  5:33                         ` Eric Dumazet
2021-03-05  6:38                           ` Eric Dumazet

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210304152709.4e91bd8b@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com \
    --to=kuba@kernel.org \
    --cc=alexander.duyck@gmail.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=jonathan.lemon@gmail.com \
    --cc=kernel-team@fb.com \
    --cc=netdev@vger.kernel.org \
    --cc=ntspring@fb.com \
    --cc=ycheng@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.