From: Coco Li <lixiaoyan@google.com>
To: Jakub Kicinski <kuba@kernel.org>,
	Eric Dumazet <edumazet@google.com>,
	 Neal Cardwell <ncardwell@google.com>,
	Mubashir Adnan Qureshi <mubashirq@google.com>,
	 Paolo Abeni <pabeni@redhat.com>, Andrew Lunn <andrew@lunn.ch>,
	Jonathan Corbet <corbet@lwn.net>,
	 David Ahern <dsahern@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>
Cc: netdev@vger.kernel.org, Chao Wu <wwchao@google.com>,
	Wei Wang <weiwan@google.com>,
	 Pradeep Nemavat <pnemavat@google.com>,
	Coco Li <lixiaoyan@google.com>
Subject: [PATCH v8 net-next 0/5] Analyze and Reorganize core Networking Structs to optimize cacheline consumption
Date: Wed, 29 Nov 2023 07:27:51 +0000
Message-ID: <20231129072756.3684495-1-lixiaoyan@google.com>

Currently, variable-heavy structs in the networking stack are organized
chronologically, logically, and sometimes by cacheline access.

This patch series attempts to reorganize the core networking stack
variables to minimize cacheline consumption during the data transfer
phase. Specifically, we looked at the TCP/IP stack and the fast-path
definitions in TCP.
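
The mechanism is to place the fields that the fast path reads or writes
together into named groups and to assert at build time that each group
stays within its cacheline budget. A minimal sketch of the idea follows;
the names are illustrative and do not necessarily match the helpers
added in patch 2/5:

  /*
   * Zero-length array markers (a GNU C extension, as used in the
   * kernel) delimit a group of hot fields; offsetof() arithmetic in a
   * static assertion catches the group growing past its budget.
   */
  #include <stddef.h>

  #define CACHE_GROUP_BEGIN(grp)  char __cache_group_##grp##_begin[0]
  #define CACHE_GROUP_END(grp)    char __cache_group_##grp##_end[0]

  struct example_sock {
          unsigned long flags;          /* control path, rarely touched */

          /* fields read together on every transmit */
          CACHE_GROUP_BEGIN(tx_read);
          unsigned int  mss_cache;
          unsigned int  snd_wnd;
          CACHE_GROUP_END(tx_read);

          /* ... remaining cold fields ... */
  };

  /* Fail the build if the hot group outgrows one 64-byte cacheline. */
  #define CACHE_GROUP_ASSERT_SIZE(type, grp, size)                     \
          _Static_assert(offsetof(type, __cache_group_##grp##_end) -   \
                         offsetof(type, __cache_group_##grp##_begin) <= \
                         (size), "cache group " #grp " too large")

  CACHE_GROUP_ASSERT_SIZE(struct example_sock, tx_read, 64);

Keeping hot members adjacent is what reduces the number of cachelines
touched per packet, and the build-time check keeps later changes from
silently pushing a group onto an extra cacheline.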

For documentation purposes, we also added new files for each core data
structure we considered, although not all of them ended up being
modified because of the number of cachelines they already span on the
fast path. In the documentation, we recorded all variables we
identified on the fast path and the reasons for including them. We also
hope that when variables are added or modified in the future, the
documents can be consulted and updated to reflect the latest variable
organization.

Tested:
Our tests were run with neper tcp_rr using TCP traffic. Each test runs
one thread per CPU and a variable number of flows (see below).

Tests were run on 6.5-rc1
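
For reference, an invocation for one of these runs might look roughly
like the following (the flag names follow neper's tcp_rr options as we
understand them and are not part of the original report; thread count,
flow count, test length, and addresses are placeholders):

  # on the server
  ./tcp_rr -T $(nproc) -F 30000
  # on the client
  ./tcp_rr -c -H <server-ip> -T $(nproc) -F 30000 -l 60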

Efficiency is computed as CPU seconds / throughput, where throughput is
measured in tcp_rr round trips, so lower is better. The following
results show the efficiency delta before and after the patch series is
applied.

On AMD platforms with a 100Gb/s NIC and 256MB L3 cache:
IPv4
Flows   with patches    clean kernel    Percent reduction
30k     0.0001736538065 0.0002741191042 -36.65%
20k     0.0001583661752 0.0002712559158 -41.62%
10k     0.0001639148817 0.0002951800751 -44.47%
5k      0.0001859683866 0.0003320642536 -44.00%
1k      0.0002035190546 0.0003152056382 -35.43%

IPv6
Flows   with patches    clean kernel    Percent reduction
30k     0.000202535503  0.0003275329163 -38.16%
20k     0.0002020654777 0.0003411304786 -40.77%
10k     0.0002122427035 0.0003803674705 -44.20%
5k      0.0002348776729 0.0004030403953 -41.72%
1k      0.0002237384583 0.0002813646157 -20.48%

On Intel platforms with a 200Gb/s NIC and 105MB L3 cache:
IPv6
Flows   with patches    clean kernel    Percent reduction
30k     0.0006296537873 0.0006370427753 -1.16%
20k     0.0003451029365 0.0003628016076 -4.88%
10k     0.0003187646958 0.0003346835645 -4.76%
5k      0.0002954676348 0.000311807592  -5.24%
1k      0.0001909169342 0.0001848069709 3.31%
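
As a sanity check on how the last column is derived, the -36.65% in the
first AMD IPv4 row is the relative change in efficiency between the
patched and clean kernels:

  (0.0001736538065 - 0.0002741191042) / 0.0002741191042 ≈ -0.3665, i.e. -36.65%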

v8 changes:
1. Update net_device_read_txrx cache group maximum
2. Update MAINTAINERS for the new documentation
3. Skip __cache_group variables in scripts/kernel-doc

Coco Li (5):
  Documentations: Analyze heavily used Networking related structs
  cache: enforce cache groups
  netns-ipv4: reorganize netns_ipv4 fast path variables
  net-device: reorganize net_device fast path variables
  tcp: reorganize tcp_sock fast path variables

 Documentation/networking/index.rst            |   1 +
 .../networking/net_cachelines/index.rst       |  15 ++
 .../net_cachelines/inet_connection_sock.rst   |  49 ++++
 .../networking/net_cachelines/inet_sock.rst   |  43 +++
 .../networking/net_cachelines/net_device.rst  | 177 +++++++++++++
 .../net_cachelines/netns_ipv4_sysctl.rst      | 157 +++++++++++
 .../networking/net_cachelines/snmp.rst        | 134 ++++++++++
 .../networking/net_cachelines/tcp_sock.rst    | 156 +++++++++++
 MAINTAINERS                                   |   3 +
 include/linux/cache.h                         |  25 ++
 include/linux/netdevice.h                     | 117 +++++----
 include/linux/tcp.h                           | 248 ++++++++++--------
 include/net/netns/ipv4.h                      |  47 ++--
 net/core/dev.c                                |  56 ++++
 net/core/net_namespace.c                      |  45 ++++
 net/ipv4/tcp.c                                |  93 +++++++
 scripts/kernel-doc                            |   5 +
 17 files changed, 1189 insertions(+), 182 deletions(-)
 create mode 100644 Documentation/networking/net_cachelines/index.rst
 create mode 100644 Documentation/networking/net_cachelines/inet_connection_sock.rst
 create mode 100644 Documentation/networking/net_cachelines/inet_sock.rst
 create mode 100644 Documentation/networking/net_cachelines/net_device.rst
 create mode 100644 Documentation/networking/net_cachelines/netns_ipv4_sysctl.rst
 create mode 100644 Documentation/networking/net_cachelines/snmp.rst
 create mode 100644 Documentation/networking/net_cachelines/tcp_sock.rst

-- 
2.43.0.rc1.413.gea7ed67945-goog


Thread overview: 21+ messages
2023-11-29  7:27 Coco Li [this message]
2023-11-29  7:27 ` [PATCH v8 net-next 1/5] Documentations: Analyze heavily used Networking related structs Coco Li
2023-11-30 10:37   ` Eric Dumazet
2023-12-02 20:00   ` Shakeel Butt
2023-11-29  7:27 ` [PATCH v8 net-next 2/5] cache: enforce cache groups Coco Li
2023-11-30 10:40   ` Eric Dumazet
2023-12-02  4:20   ` Jakub Kicinski
2023-12-02 20:08   ` Shakeel Butt
2023-11-29  7:27 ` [PATCH v8 net-next 3/5] netns-ipv4: reorganize netns_ipv4 fast path variables Coco Li
2023-11-30 10:48   ` Eric Dumazet
2023-12-02 20:23   ` Shakeel Butt
2023-11-29  7:27 ` [PATCH v8 net-next 4/5] net-device: reorganize net_device " Coco Li
2023-11-30 10:49   ` Eric Dumazet
2023-12-02 20:28   ` Shakeel Butt
2023-11-29  7:27 ` [PATCH v8 net-next 5/5] tcp: reorganize tcp_sock " Coco Li
2023-11-30 10:52   ` Eric Dumazet
2023-12-02 20:31   ` Shakeel Butt
2023-12-02 20:34 ` [PATCH v8 net-next 0/5] Analyze and Reorganize core Networking Structs to optimize cacheline consumption Shakeel Butt
2023-12-02 22:30 ` patchwork-bot+netdevbpf
2023-12-02 22:36   ` Neal Cardwell
2023-12-04 19:06     ` Jakub Kicinski
