From: Eric Dumazet <eric.dumazet@gmail.com>
To: "David S . Miller" <davem@davemloft.net>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>
Cc: netdev <netdev@vger.kernel.org>,
Alexander Duyck <alexanderduyck@fb.com>,
Coco Li <lixiaoyan@google.com>,
Eric Dumazet <edumazet@google.com>,
Eric Dumazet <eric.dumazet@gmail.com>
Subject: [PATCH v5 net-next 00/13] tcp: BIG TCP implementation
Date: Mon, 9 May 2022 15:21:36 -0700
Message-ID: <20220509222149.1763877-1-eric.dumazet@gmail.com>
From: Eric Dumazet <edumazet@google.com>
This series implements BIG TCP as presented in netdev 0x15:
https://netdevconf.info/0x15/session.html?BIG-TCP
Jonathan Corbet made a nice summary: https://lwn.net/Articles/884104/
Standard TSO/GRO packet limit is 64KB.
With BIG TCP, we allow bigger TSO/GRO packet sizes for IPv6 traffic.
Note that this feature is not enabled by default, because it might break
some eBPF programs that assume the TCP header immediately follows the
IPv6 header.
While tcpdump recognizes the HBH/Jumbo header, standard pcap filters
are unable to skip over IPv6 extension headers.
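
To illustrate why a fixed-offset parser is affected, here is a minimal
user-space sketch (not code from this series). It assumes the 8-byte
hop-by-hop header carrying a single Jumbo Payload option as laid out in
RFC 2675, and it only handles that one case. The helper names
build_jumbo_ipv6() and l4_offset() are hypothetical.

```python
import struct

IPV6_HDR_LEN = 40       # fixed IPv6 header size
NEXTHDR_HOP = 0         # next-header value for hop-by-hop options
NEXTHDR_TCP = 6
IPV6_TLV_JUMBO = 0xC2   # Jumbo Payload option type (RFC 2675)


def build_jumbo_ipv6(jumbo_len: int) -> bytes:
    """Build an IPv6 header followed by an 8-byte HBH/Jumbo header.

    Addresses are zeroed and no TCP data is appended; this is only a
    test fixture. The 16-bit payload length is 0, as RFC 2675 requires
    when the Jumbo Payload option is present.
    """
    ipv6 = struct.pack("!IHBB16s16s",
                       6 << 28,        # version 6, no traffic class/flow
                       0,              # payload length (0 for jumbograms)
                       NEXTHDR_HOP,    # next header is HBH, not TCP
                       64,             # hop limit
                       bytes(16), bytes(16))
    hbh = struct.pack("!BBBBI",
                      NEXTHDR_TCP,     # header following the HBH header
                      0,               # hdr ext len: (0 + 1) * 8 bytes
                      IPV6_TLV_JUMBO,
                      4,               # option data length
                      jumbo_len)       # 32-bit jumbo payload length
    return ipv6 + hbh


def l4_offset(pkt: bytes):
    """Return (l4_proto, l4_offset, jumbo_len_or_None).

    Simplification: only a leading hop-by-hop header is skipped, and the
    Jumbo option is only recognized as the first option in it.
    """
    nexthdr = pkt[6]
    off = IPV6_HDR_LEN
    jumbo = None
    if nexthdr == NEXTHDR_HOP:
        nexthdr, hdrlen, tlv_type, tlv_len = struct.unpack_from("!BBBB", pkt, off)
        if tlv_type == IPV6_TLV_JUMBO and tlv_len == 4:
            (jumbo,) = struct.unpack_from("!I", pkt, off + 4)
        off += (hdrlen + 1) * 8
    return nexthdr, off, jumbo
```

A parser that reads byte 6 of such a packet sees 0 (hop-by-hop) rather
than 6 (TCP), and the TCP header starts at offset 48 instead of 40.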
Reducing the number of packets traversing the networking stack usually
improves performance, as shown in this experiment using a 100Gbit NIC
and 4K MTU.
'Standard' performance with current (64KB) limits.
for i in {1..10}; do ./netperf -t TCP_RR -H iroa23 -- -r80000,80000 -O MIN_LATENCY,P90_LATENCY,P99_LATENCY,THROUGHPUT|tail -1; done
77 138 183 8542.19
79 143 178 8215.28
70 117 164 9543.39
80 144 176 8183.71
78 126 155 9108.47
80 146 184 8115.19
71 113 165 9510.96
74 113 164 9518.74
79 137 178 8575.04
73 111 171 9561.73
Now enable BIG TCP on both hosts.
ip link set dev eth0 gro_max_size 185000 gso_max_size 185000
for i in {1..10}; do ./netperf -t TCP_RR -H iroa23 -- -r80000,80000 -O MIN_LATENCY,P90_LATENCY,P99_LATENCY,THROUGHPUT|tail -1; done
57 83 117 13871.38
64 118 155 11432.94
65 116 148 11507.62
60 105 136 12645.15
60 103 135 12760.34
60 102 134 12832.64
62 109 132 10877.68
58 82 115 14052.93
57 83 124 14212.58
57 82 119 14196.01
We see an increase in transactions per second, and lower latencies as well.
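
As a quick way to summarize the two runs, the sketch below averages the
THROUGHPUT column (the last column of each netperf line above) and
reports the ratio. The numbers are copied from the output above; the
helper name throughput_gain() is hypothetical, and a plain mean over ten
runs is only a rough summary, not a statistical claim.

```python
from statistics import mean

# THROUGHPUT column (transactions per second) from the runs above.
standard = [8542.19, 8215.28, 9543.39, 8183.71, 9108.47,
            8115.19, 9510.96, 9518.74, 8575.04, 9561.73]
big_tcp = [13871.38, 11432.94, 11507.62, 12645.15, 12760.34,
           12832.64, 10877.68, 14052.93, 14212.58, 14196.01]


def throughput_gain(before, after):
    """Return the ratio of the mean of `after` to the mean of `before`."""
    return mean(after) / mean(before)


if __name__ == "__main__":
    print(f"standard mean: {mean(standard):.2f}")
    print(f"BIG TCP mean:  {mean(big_tcp):.2f}")
    print(f"ratio:         {throughput_gain(standard, big_tcp):.2f}")
```

For these ten runs the mean throughput ratio works out to roughly 1.44.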
v5: Replaced two patches (that were adding new attributes) with patches
from Alexander Duyck. The idea is to reuse the existing
gso_max_size/gro_max_size attributes.
v4: Rebased on top of Jakub's series (Merge branch 'tso-gso-limit-split')
max_tso_size is now family independent.
v3: Fixed a typo in RFC number (Alexander)
Added Reviewed-by: tags from Tariq on mlx4/mlx5 parts.
v2: Removed the MAX_SKB_FRAGS change, this belongs to a different series.
Addressed feedback from Alexander and nvidia folks.
Alexander Duyck (2):
net: allow gso_max_size to exceed 65536
net: allow gro_max_size to exceed 65536
Coco Li (2):
ipv6: Add hop-by-hop header to jumbograms in ip6_output
mlx5: support BIG TCP packets
Eric Dumazet (9):
net: add IFLA_TSO_{MAX_SIZE|SEGS} attributes
net: limit GSO_MAX_SIZE to 524280 bytes
tcp_cubic: make hystart_ack_delay() aware of BIG TCP
ipv6: add struct hop_jumbo_hdr definition
ipv6/gso: remove temporary HBH/jumbo header
ipv6/gro: insert temporary HBH/jumbo header
net: loopback: enable BIG TCP packets
veth: enable BIG TCP packets
mlx4: support BIG TCP packets
drivers/net/ethernet/amd/xgbe/xgbe.h | 3 +-
.../net/ethernet/mellanox/mlx4/en_netdev.c | 3 +
drivers/net/ethernet/mellanox/mlx4/en_tx.c | 47 +++++++++--
.../net/ethernet/mellanox/mlx5/core/en_main.c | 1 +
.../net/ethernet/mellanox/mlx5/core/en_rx.c | 2 +-
.../net/ethernet/mellanox/mlx5/core/en_tx.c | 84 +++++++++++++++----
drivers/net/ethernet/sfc/ef100_nic.c | 3 +-
drivers/net/ethernet/sfc/falcon/tx.c | 3 +-
drivers/net/ethernet/sfc/tx_common.c | 3 +-
drivers/net/ethernet/synopsys/dwc-xlgmac.h | 3 +-
drivers/net/hyperv/rndis_filter.c | 2 +-
drivers/net/loopback.c | 2 +
drivers/net/veth.c | 1 +
drivers/scsi/fcoe/fcoe.c | 2 +-
include/linux/ipv6.h | 1 +
include/linux/netdevice.h | 16 +++-
include/net/ipv6.h | 44 ++++++++++
include/uapi/linux/if_link.h | 2 +
net/bpf/test_run.c | 2 +-
net/core/dev.c | 7 +-
net/core/gro.c | 8 ++
net/core/rtnetlink.c | 16 ++--
net/core/sock.c | 4 +
net/ipv4/tcp_bbr.c | 2 +-
net/ipv4/tcp_cubic.c | 4 +-
net/ipv4/tcp_output.c | 2 +-
net/ipv6/ip6_offload.c | 56 ++++++++++++-
net/ipv6/ip6_output.c | 22 ++++-
net/sctp/output.c | 3 +-
tools/include/uapi/linux/if_link.h | 2 +
30 files changed, 291 insertions(+), 59 deletions(-)
--
2.36.0.512.ge40c2bad7a-goog